# BZip3 Format Documentation

BZip3 is a modern compression format designed for high compression ratios while maintaining
reasonable decompression speeds. It aims for compression ratios and performance comparable to
LZMA and BZip2, in contrast to faster Lempel-Ziv codecs such as Zstandard or LZ4, which usually
offer worse compression ratios.

This documentation covers the technical specifications of the BZip3 format.

## Format Characteristics

- Block-level compression (no streams)
- Maximum block size ranges from 65 KiB to 511 MiB
- Memory usage of roughly 6 × block size for both compression and decompression
- Little-endian encoding for integers
- Embedded CRC32 checksums for data integrity
- Combines LZP and RLE, followed by the Burrows-Wheeler transform and arithmetic coding coupled
  with a statistical predictor

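The numeric characteristics above can be made concrete with a small C sketch. It is only an illustration under stated assumptions: the helper names `block_size_is_valid`, `estimate_memory_bytes`, and `read_u32_le` are hypothetical and not part of the BZip3 API, and the memory figure is simply the "roughly 6 × block size" rule of thumb from the list, not a measured value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Block size limits as stated in this document (65 KiB to 511 MiB). */
#define BLOCK_SIZE_MIN (65u * 1024u)
#define BLOCK_SIZE_MAX (511u * 1024u * 1024u)

/* Hypothetical helper: is the block size inside the documented range? */
static bool block_size_is_valid(uint32_t block_size) {
    return block_size >= BLOCK_SIZE_MIN && block_size <= BLOCK_SIZE_MAX;
}

/* Hypothetical helper: rough working-memory estimate (about 6 x block size).
 * The result is 64-bit because 6 x 511 MiB does not fit in 32 bits. */
static uint64_t estimate_memory_bytes(uint32_t block_size) {
    return 6u * (uint64_t)block_size;
}

/* Decode a 32-bit little-endian integer from four consecutive bytes. */
static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Under these assumptions, a 16 MiB block size gives an estimate of about 96 MiB of working memory, and the byte sequence `00 00 00 01` decodes to `0x01000000`.
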
## Format Overview

BZip3 uses two main top-level formats:

1. **File Format**: The standard format used by the command-line tool
2. **Frame Format**: Used by the high-level API functions `bz3_compress` and `bz3_decompress`.

These formats are very similar: the frame format matches the file format but additionally
contains a block count field.

See [bzip3_format.md](./bzip3_format.md) for more details.
