:: commit b4e15dc7060eb8174d6548a1117862db3dbc5cdb

Kamila Szewczyk <kspalaiologos@gmail.com> — 2022-10-30 07:10

parents: 532677aaae

split the documentation into many files

diff --git a/doc/file_format.md b/doc/file_format.md
index c2e6fd1..43b66ca 100644
--- a/doc/file_format.md
+++ b/doc/file_format.md
@@ -1,8 +1,6 @@
 
 # The bzip3 file format
 
-## The file header
-
 Each bzip3-compressed file starts with the marker `BZ3v1`. After the signature, the compressor encodes a 32-bit number signifying the maximum block size in bytes in the file. As such, no block after decompression in the stream can exceed it. The maximum block size must be between 65KiB and 511MiB.
 
 The following functions are used for serialising all 32-bit numbers to the archive:
@@ -20,22 +18,4 @@ static void write_neutral_s32(u8 * data, s32 value) {
 }
 ```
 
-After the file header, the bzip3-compressed file contains a series of independent blocks.
-
-## The blocks
-
-After the header, the file may contain an unlimited amount of chunks. Each chunk starts with the _new_ size - a 32-bit integer signifying the _compressed_ size of the block, and the _old_ size - a 32-bit integer signifying the _decompressed_ size. Then, a sequence of bzip3-compressed data follows. CRC32 checking is left up to libbz3.
-
-If the chunk is smaller than 64 bytes, then compression is not attempted. Instead, the content is prepended with the 32-bit CRC32 checksum and a 0xFFFFFFFF literal.
-
-Otherwise, the chunk starts with the 32-bit CRC32 checksum value, the Burrows-Wheeler transform permutation index and the compression _model_ - a 8-bit value specifying the compression preset used. As such:
-
-- 2-s bit set in the _model_ - LZP was used and the 32-bit size is prepended to the block.
-- 4-s bit set in the _model_ - RLE was used and the 32-bit size is prepended to the block.
-- No other bit can be set in the _model_.
-
-The size of libbz3's block header can be calculated using the formula `popcnt(model) * 4 + 9`.
-
-## The frame format
-
-The bzip3 frame format is a concatenation of bzip3-compressed blocks. It's used exclusively by the `bz3_compress` and `bz3_decompress` function. Each frame start with the ASCII "BZ3v1" signature, followed by the 32-bit maximum block size in bytes and the 32-bit amount of blocks in the frame. After the 13 byte header, a sequence of independent blocks follows.
+After the file header, the bzip3-compressed file contains a series of independent blocks compressed using the low level API.
\ No newline at end of file
diff --git a/doc/high_level_format.md b/doc/high_level_format.md
new file mode 100644
index 0000000..7f05ff1
--- /dev/null
+++ b/doc/high_level_format.md
@@ -0,0 +1,4 @@
+
+# High level API bzip3 frame format.
+
+The bzip3 frame format is a concatenation of bzip3-compressed blocks. It's used exclusively by the `bz3_compress` and `bz3_decompress` functions and will not work with the command-line tool or low level functions. Each frame starts with the ASCII "BZ3v1" signature, followed by the 32-bit maximum block size in bytes and the 32-bit number of blocks in the frame. After the 13-byte header, a sequence of independent blocks encoded using the low level API follows.
diff --git a/doc/low_level_format.md b/doc/low_level_format.md
new file mode 100644
index 0000000..e84d7cb
--- /dev/null
+++ b/doc/low_level_format.md
@@ -0,0 +1,14 @@
+
+# Low level API bzip3 block format.
+
+Each chunk starts with the _new_ size - a 32-bit integer signifying the _compressed_ size of the block, and the _old_ size - a 32-bit integer signifying the _decompressed_ size. Then, a sequence of bzip3-compressed data follows. CRC32 checking is left up to libbz3.
+
+If the chunk is smaller than 64 bytes, then compression is not attempted. Instead, the content is prepended with the 32-bit CRC32 checksum and a 0xFFFFFFFF literal.
+
+Otherwise, the chunk starts with the 32-bit CRC32 checksum value, the Burrows-Wheeler transform permutation index and the compression _model_ - an 8-bit value specifying the compression preset used. As such:
+
+- 2-s bit set in the _model_ - LZP was used and the 32-bit size is prepended to the block.
+- 4-s bit set in the _model_ - RLE was used and the 32-bit size is prepended to the block.
+- No other bit can be set in the _model_.
+
+The size of libbz3's block header can be calculated using the formula `popcnt(model) * 4 + 9`.
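
The header-size formula in `doc/low_level_format.md` above can be illustrated with a minimal C sketch. This is not code from libbz3: `block_header_size`, `MODEL_LZP` and `MODEL_RLE` are hypothetical names introduced for illustration, and the reading of the constant 9 as 4 bytes of CRC32, 4 bytes of permutation index and 1 byte of model is an assumption inferred from the formula, since the text does not state the width of the permutation index.

```c
#include <stdint.h>

/* Hypothetical names for the two documented model bits. */
#define MODEL_LZP 0x02u /* "2-s bit": LZP was used, one extra 32-bit size */
#define MODEL_RLE 0x04u /* "4-s bit": RLE was used, one extra 32-bit size */

/*
 * Hypothetical helper: returns the block header size in bytes for a given
 * model byte, following popcnt(model) * 4 + 9, or -1 if the model sets any
 * bit other than the two documented ones.
 */
static int block_header_size(uint8_t model) {
    if (model & ~(MODEL_LZP | MODEL_RLE))
        return -1; /* "No other bit can be set in the model." */

    /* Only two bits are legal, so the population count is a simple sum. */
    int popcnt = ((model & MODEL_LZP) != 0) + ((model & MODEL_RLE) != 0);

    /* Assumed layout of the fixed part: CRC32 (4) + BWT index (4) + model (1). */
    return popcnt * 4 + 9;
}
```

Under these assumptions, a block that uses neither LZP nor RLE has a 9-byte header, a block that uses exactly one of them has 13 bytes, and a block that uses both has 17 bytes.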