% LZMA2 format

The LZMA2 format supports flushing as well as parallel encoding and decoding.
Chunks of data that cannot be compressed are stored uncompressed.

## Dictionary Size

LZMA2 requires information about the size of the dictionary. This is
provided by a single byte.

Bits | Mask | Description
----:|-----:|:------------------------------------------------
 0-5 | 0x3F | Dictionary Size
 6-7 | 0xC0 | Reserved for future use; must be zero

The dictionary size is encoded with a one-bit mantissa and a five-bit
exponent. The lowest bit of the raw value selects the mantissa (2 or 3) and
the remaining bits, offset by 11, give the exponent, so the size is the
mantissa multiplied by 2 to the power of the exponent. The smallest
dictionary size is 4 KiB and the largest is 4 GiB - 1 B.

| Raw Value | Mantissa | Exponent | Dictionary size |
|----------:|---------:|---------:|----------------:|
|         0 |        2 |       11 |           4 KiB |
|         1 |        3 |       11 |           6 KiB |
|         2 |        2 |       12 |           8 KiB |
|         3 |        3 |       12 |          12 KiB |
|       ... |      ... |      ... |             ... |
|        36 |        2 |       29 |        1024 MiB |
|        37 |        3 |       29 |        1536 MiB |
|        38 |        2 |       30 |        2048 MiB |
|        39 |        3 |       30 |        3072 MiB |
|        40 |        2 |       31 |  4096 MiB - 1 B |

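
The mapping in the table above can be sketched in a few lines of Python. This is an illustration only, not code from any LZMA2 implementation; the function name `dict_size` is hypothetical, and rejecting raw values above 40 is an assumption based on the table ending at 40.

```python
def dict_size(raw: int) -> int:
    """Decode the dictionary size byte into a size in bytes (sketch)."""
    if raw & 0xC0:
        # Bits 6-7 are reserved and must be zero.
        raise ValueError("reserved bits 6-7 must be zero")
    if raw > 40:
        # Assumption: the table stops at 40, so larger values are rejected.
        raise ValueError("raw value must be in the range 0..40")
    if raw == 40:
        # 2 << 31 would be exactly 4 GiB; the table gives 4 GiB - 1 B.
        return 0xFFFFFFFF
    mantissa = 2 | (raw & 1)   # lowest bit selects mantissa 2 or 3
    exponent = raw // 2 + 11   # remaining bits, offset by 11
    return mantissa << exponent


if __name__ == "__main__":
    for raw in (0, 1, 2, 3, 36, 40):
        print(raw, dict_size(raw))
```

For example, `dict_size(1)` evaluates to `3 << 11`, which is 6144 bytes or 6 KiB, matching the second row of the table.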
For test purposes we add the dictionary size byte as the first byte of an
LZMA2 stream.

## Chunks

An LZMA2 stream is a sequence of chunks. Each chunk is preceded by a
control byte and other header information.

Following the C implementation in the LZMA SDK, the chunk header, which
starts with the control byte, can be described as follows:

Chunk header          | Description
:-------------------- | :--------------------------------------------------
`00000000`            | End of LZMA2 stream
`00000001 U U`        | Uncompressed chunk, reset dictionary
`00000010 U U`        | Uncompressed chunk, no reset of dictionary
`100uuuuu U U C C`    | LZMA, no reset
`101uuuuu U U C C`    | LZMA, reset state
`110uuuuu U U C C S`  | LZMA, reset state, new properties
`111uuuuu U U C C S`  | LZMA, reset state, new properties, reset dictionary

The symbols used are described in the following table.

Symbol | Description
:----- | :----------------------
u      | uncompressed size bit
U      | uncompressed size byte
C      | compressed size byte
S      | properties byte

A dictionary reset always requires new properties. If the reset is done by
an uncompressed chunk, the properties must be provided by the next
compressed chunk. New properties require a reset of the state.

A dictionary reset sets the current position to zero. Uncompressed data
is written into the dictionary.

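
The rule that a dictionary reset must be followed by new properties can be checked on a sequence of control bytes alone. The following is a minimal sketch under stated assumptions: the function name `is_valid_control_sequence` is hypothetical, control bytes not listed in the chunk header table are treated as invalid, and only the rule described above is checked, nothing else a real decoder might enforce.

```python
from typing import Iterable


def is_valid_control_sequence(controls: Iterable[int]) -> bool:
    """Check that new properties follow a dictionary reset (sketch)."""
    need_props = False
    for c in controls:
        if c == 0x00:
            # End of LZMA2 stream.
            return True
        if c == 0x01:
            # Uncompressed chunk with dictionary reset: the next
            # compressed chunk has to carry new properties.
            need_props = True
        elif c == 0x02:
            # Uncompressed chunk without reset changes nothing here.
            pass
        elif c >= 0x80:
            has_props = c >= 0xC0  # patterns 110uuuuu and 111uuuuu
            if need_props and not has_props:
                return False
            if has_props:
                need_props = False
        else:
            # Assumption: values 0x03..0x7F are not in the table.
            return False
    return True
```

With this sketch, the sequence `[0x01, 0xC0, 0x80, 0x00]` is accepted, while `[0x01, 0x80]` is rejected because the LZMA chunk after the reset carries no properties byte.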
The uncompressed size and compressed size are given in big-endian byte order.
The stored values are one less than the actual sizes, so a chunk with 1 byte
of uncompressed data stores 0 in the uncompressed size bits and bytes.

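
To make the header layout and the size encoding concrete, here is a hedged Python sketch of a chunk header parser. The names `ChunkHeader` and `parse_chunk_header` are hypothetical and not taken from any implementation; control bytes that do not appear in the chunk header table are rejected, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChunkHeader:
    kind: str                    # "end", "uncompressed" or "lzma"
    header_len: int              # number of header bytes consumed
    uncompressed_size: int = 0
    compressed_size: int = 0     # only meaningful for LZMA chunks
    reset_dict: bool = False
    reset_state: bool = False
    props: Optional[int] = None  # properties byte S, if present


def parse_chunk_header(buf: bytes) -> ChunkHeader:
    """Parse one chunk header from the start of buf (sketch)."""
    if not buf:
        raise ValueError("empty input")
    control = buf[0]
    if control == 0x00:
        return ChunkHeader("end", 1)
    if control in (0x01, 0x02):
        if len(buf) < 3:
            raise ValueError("truncated header")
        # Two big-endian size bytes; the stored value is size - 1.
        size = int.from_bytes(buf[1:3], "big") + 1
        return ChunkHeader("uncompressed", 3, uncompressed_size=size,
                           reset_dict=(control == 0x01))
    if control < 0x80:
        # Assumption: values 0x03..0x7F are not in the table.
        raise ValueError("unknown control byte")
    mode = (control >> 5) & 0x03     # the two bits after the leading 1
    has_props = mode >= 2
    header_len = 6 if has_props else 5
    if len(buf) < header_len:
        raise ValueError("truncated header")
    # The five u bits are the most significant bits of the size.
    high = (control & 0x1F) << 16
    unc = (high | int.from_bytes(buf[1:3], "big")) + 1
    comp = int.from_bytes(buf[3:5], "big") + 1
    return ChunkHeader("lzma", header_len, unc, comp,
                       reset_dict=(mode == 3),
                       reset_state=(mode >= 1),
                       props=buf[5] if has_props else None)
```

For instance, `parse_chunk_header(bytes([0x01, 0x00, 0x00]))` describes an uncompressed chunk of 1 byte with a dictionary reset, since the stored size 0 is incremented to 1.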
The properties byte provides the parameters pb, lc and lp using the
following formula:

    S = (pb * 5 + lp) * 9 + lc

This is the same encoding as used for LZMA. For LZMA2 the following
condition has been introduced:

    lc + lp <= 4

The parameters are defined as follows:

Name  | Range  | Description
:---- | :----- | :--------------------------------
lc    | [0,8]  | number of literal context bits
lp    | [0,4]  | number of literal position bits
pb    | [0,4]  | number of position bits
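
The properties formula can be inverted with two divisions. The sketch below illustrates this in Python; the names `encode_props` and `decode_props` are hypothetical, and the range and `lc + lp <= 4` checks simply apply the limits stated above.

```python
from typing import Tuple


def encode_props(lc: int, lp: int, pb: int) -> int:
    """Compute the properties byte S = (pb * 5 + lp) * 9 + lc."""
    if not (0 <= lc <= 8 and 0 <= lp <= 4 and 0 <= pb <= 4):
        raise ValueError("parameter out of range")
    if lc + lp > 4:
        raise ValueError("LZMA2 requires lc + lp <= 4")
    return (pb * 5 + lp) * 9 + lc


def decode_props(s: int) -> Tuple[int, int, int]:
    """Split the properties byte S into (lc, lp, pb)."""
    # The largest value of the formula is (4 * 5 + 4) * 9 + 8 = 224.
    if not 0 <= s <= 224:
        raise ValueError("properties byte out of range")
    lc = s % 9
    rest = s // 9
    lp = rest % 5
    pb = rest // 5
    if lc + lp > 4:
        raise ValueError("LZMA2 requires lc + lp <= 4")
    return lc, lp, pb
```

As a worked example, `decode_props(0x5D)` splits 93 into `lc = 93 % 9 = 3`, `lp = 10 % 5 = 0` and `pb = 10 // 5 = 2`, and `encode_props(3, 0, 2)` gives back 93.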