lib/uzlib: Add memory-efficient, streaming LZ77 compression support.
The compression algorithm implemented in this commit uses much less memory
compared to the standard way of implementing it using a hash table and
large look-back window. In particular the algorithm here doesn't allocate
hash table to store indices into the history of the previously seen text.
Instead it simply does a brute-force-search of the history text to find a
match for the compressor. This is slower (linear search vs hash table
lookup) but with a small enough history (eg 512 bytes) it's not that slow.
And a small history does not impact the compression too much.
To give some more concrete numbers comparing memory use between the
approaches:
- Standard approach: inplace compression, all text to compress must be in
RAM (or at least memory addressable), and then an additional 16k bytes
RAM of hash table pointers, pointing into the text
- The approach in this commit: streaming compression, only a limited amount
of previous text must be in RAM (user selectable, defaults to 512 bytes).
To compress, say, 1k of data, the standard approach requires all that data
to be in RAM, plus an additional 16k of RAM for the hash table pointers.
With this commit, you only need the 1k of data in RAM. Or if it's
streaming from a file (or elsewhere), you could get away with only 256
bytes of RAM for the sliding history and still get very decent compression.
In summary: because compression takes such a large amount of RAM (in the
standard algorithm) and it's not really suitable for microcontrollers, the
approach taken in this commit is to minimise RAM usage as much as possible,
and still have acceptable performance (speed and compression ratio).
Signed-off-by: Damien George <damien@micropython.org>
2023-01-18 04:46:23 +00:00
|
|
|
/*
|
|
|
|
* Simple LZ77 streaming compressor.
|
|
|
|
*
|
|
|
|
* The scheme implemented here doesn't use a hash table and instead does a brute
|
|
|
|
* force search in the history for a previous string. It is relatively slow
|
|
|
|
* (but still O(N)) but gives good compression and minimal memory usage. For a
|
|
|
|
* small history window (eg 256 bytes) it's not too slow and compresses well.
|
|
|
|
*
|
|
|
|
* MIT license; Copyright (c) 2021 Damien P. George
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include "uzlib.h"
|
|
|
|
|
2023-06-26 17:56:58 +01:00
|
|
|
#include "defl_static.c"
|
|
|
|
|
lib/uzlib: Add memory-efficient, streaming LZ77 compression support.
The compression algorithm implemented in this commit uses much less memory
compared to the standard way of implementing it using a hash table and
large look-back window. In particular the algorithm here doesn't allocate
hash table to store indices into the history of the previously seen text.
Instead it simply does a brute-force-search of the history text to find a
match for the compressor. This is slower (linear search vs hash table
lookup) but with a small enough history (eg 512 bytes) it's not that slow.
And a small history does not impact the compression too much.
To give some more concrete numbers comparing memory use between the
approaches:
- Standard approach: inplace compression, all text to compress must be in
RAM (or at least memory addressable), and then an additional 16k bytes
RAM of hash table pointers, pointing into the text
- The approach in this commit: streaming compression, only a limited amount
of previous text must be in RAM (user selectable, defaults to 512 bytes).
To compress, say, 1k of data, the standard approach requires all that data
to be in RAM, plus an additional 16k of RAM for the hash table pointers.
With this commit, you only need the 1k of data in RAM. Or if it's
streaming from a file (or elsewhere), you could get away with only 256
bytes of RAM for the sliding history and still get very decent compression.
In summary: because compression takes such a large amount of RAM (in the
standard algorithm) and it's not really suitable for microcontrollers, the
approach taken in this commit is to minimise RAM usage as much as possible,
and still have acceptable performance (speed and compression ratio).
Signed-off-by: Damien George <damien@micropython.org>
2023-01-18 04:46:23 +00:00
|
|
|
#define MATCH_LEN_MIN (3)
|
|
|
|
#define MATCH_LEN_MAX (258)
|
|
|
|
|
|
|
|
// hist should be a preallocated buffer of hist_max size bytes.
|
|
|
|
// hist_max should be greater than 0 a power of 2 (ie 1, 2, 4, 8, ...).
|
|
|
|
// It's possible to pass in hist=NULL, and then the history window will be taken from the
|
|
|
|
// src passed in to uzlib_lz77_compress (this is useful when not doing streaming compression).
|
2023-06-26 17:56:58 +01:00
|
|
|
void uzlib_lz77_init(uzlib_lz77_state_t *state, uint8_t *hist, size_t hist_max) {
|
|
|
|
memset(state, 0, sizeof(uzlib_lz77_state_t));
|
lib/uzlib: Add memory-efficient, streaming LZ77 compression support.
The compression algorithm implemented in this commit uses much less memory
compared to the standard way of implementing it using a hash table and
large look-back window. In particular the algorithm here doesn't allocate
hash table to store indices into the history of the previously seen text.
Instead it simply does a brute-force-search of the history text to find a
match for the compressor. This is slower (linear search vs hash table
lookup) but with a small enough history (eg 512 bytes) it's not that slow.
And a small history does not impact the compression too much.
To give some more concrete numbers comparing memory use between the
approaches:
- Standard approach: inplace compression, all text to compress must be in
RAM (or at least memory addressable), and then an additional 16k bytes
RAM of hash table pointers, pointing into the text
- The approach in this commit: streaming compression, only a limited amount
of previous text must be in RAM (user selectable, defaults to 512 bytes).
To compress, say, 1k of data, the standard approach requires all that data
to be in RAM, plus an additional 16k of RAM for the hash table pointers.
With this commit, you only need the 1k of data in RAM. Or if it's
streaming from a file (or elsewhere), you could get away with only 256
bytes of RAM for the sliding history and still get very decent compression.
In summary: because compression takes such a large amount of RAM (in the
standard algorithm) and it's not really suitable for microcontrollers, the
approach taken in this commit is to minimise RAM usage as much as possible,
and still have acceptable performance (speed and compression ratio).
Signed-off-by: Damien George <damien@micropython.org>
2023-01-18 04:46:23 +00:00
|
|
|
state->hist_buf = hist;
|
|
|
|
state->hist_max = hist_max;
|
|
|
|
state->hist_start = 0;
|
|
|
|
state->hist_len = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Push the given byte to the history.
|
|
|
|
// Search back in the history for the maximum match of the given src data,
|
|
|
|
// with support for searching beyond the end of the history and into the src buffer
|
|
|
|
// (effectively the history and src buffer are concatenated).
|
2023-06-26 17:56:58 +01:00
|
|
|
static size_t uzlib_lz77_search_max_match(uzlib_lz77_state_t *state, const uint8_t *src, size_t len, size_t *longest_offset) {
|
lib/uzlib: Add memory-efficient, streaming LZ77 compression support.
The compression algorithm implemented in this commit uses much less memory
compared to the standard way of implementing it using a hash table and
large look-back window. In particular the algorithm here doesn't allocate
hash table to store indices into the history of the previously seen text.
Instead it simply does a brute-force-search of the history text to find a
match for the compressor. This is slower (linear search vs hash table
lookup) but with a small enough history (eg 512 bytes) it's not that slow.
And a small history does not impact the compression too much.
To give some more concrete numbers comparing memory use between the
approaches:
- Standard approach: inplace compression, all text to compress must be in
RAM (or at least memory addressable), and then an additional 16k bytes
RAM of hash table pointers, pointing into the text
- The approach in this commit: streaming compression, only a limited amount
of previous text must be in RAM (user selectable, defaults to 512 bytes).
To compress, say, 1k of data, the standard approach requires all that data
to be in RAM, plus an additional 16k of RAM for the hash table pointers.
With this commit, you only need the 1k of data in RAM. Or if it's
streaming from a file (or elsewhere), you could get away with only 256
bytes of RAM for the sliding history and still get very decent compression.
In summary: because compression takes such a large amount of RAM (in the
standard algorithm) and it's not really suitable for microcontrollers, the
approach taken in this commit is to minimise RAM usage as much as possible,
and still have acceptable performance (speed and compression ratio).
Signed-off-by: Damien George <damien@micropython.org>
2023-01-18 04:46:23 +00:00
|
|
|
size_t longest_len = 0;
|
|
|
|
for (size_t hist_search = 0; hist_search < state->hist_len; ++hist_search) {
|
|
|
|
size_t match_len;
|
|
|
|
for (match_len = 0; match_len <= MATCH_LEN_MAX && match_len < len; ++match_len) {
|
|
|
|
uint8_t hist;
|
|
|
|
if (hist_search + match_len < state->hist_len) {
|
|
|
|
hist = state->hist_buf[(state->hist_start + hist_search + match_len) & (state->hist_max - 1)];
|
|
|
|
} else {
|
|
|
|
hist = src[hist_search + match_len - state->hist_len];
|
|
|
|
}
|
|
|
|
if (src[match_len] != hist) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (match_len >= MATCH_LEN_MIN && match_len > longest_len) {
|
|
|
|
longest_len = match_len;
|
|
|
|
*longest_offset = state->hist_len - hist_search;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return longest_len;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Compress the given chunk of data.
|
2023-06-26 17:56:58 +01:00
|
|
|
void uzlib_lz77_compress(uzlib_lz77_state_t *state, const uint8_t *src, unsigned len) {
|
lib/uzlib: Add memory-efficient, streaming LZ77 compression support.
The compression algorithm implemented in this commit uses much less memory
compared to the standard way of implementing it using a hash table and
large look-back window. In particular the algorithm here doesn't allocate
hash table to store indices into the history of the previously seen text.
Instead it simply does a brute-force-search of the history text to find a
match for the compressor. This is slower (linear search vs hash table
lookup) but with a small enough history (eg 512 bytes) it's not that slow.
And a small history does not impact the compression too much.
To give some more concrete numbers comparing memory use between the
approaches:
- Standard approach: inplace compression, all text to compress must be in
RAM (or at least memory addressable), and then an additional 16k bytes
RAM of hash table pointers, pointing into the text
- The approach in this commit: streaming compression, only a limited amount
of previous text must be in RAM (user selectable, defaults to 512 bytes).
To compress, say, 1k of data, the standard approach requires all that data
to be in RAM, plus an additional 16k of RAM for the hash table pointers.
With this commit, you only need the 1k of data in RAM. Or if it's
streaming from a file (or elsewhere), you could get away with only 256
bytes of RAM for the sliding history and still get very decent compression.
In summary: because compression takes such a large amount of RAM (in the
standard algorithm) and it's not really suitable for microcontrollers, the
approach taken in this commit is to minimise RAM usage as much as possible,
and still have acceptable performance (speed and compression ratio).
Signed-off-by: Damien George <damien@micropython.org>
2023-01-18 04:46:23 +00:00
|
|
|
const uint8_t *top = src + len;
|
|
|
|
while (src < top) {
|
|
|
|
// Look for a match in the history window.
|
|
|
|
size_t match_offset = 0;
|
|
|
|
size_t match_len = uzlib_lz77_search_max_match(state, src, top - src, &match_offset);
|
|
|
|
|
|
|
|
// Encode the literal byte or the match.
|
|
|
|
if (match_len == 0) {
|
2023-06-26 17:56:58 +01:00
|
|
|
uzlib_literal(state, *src);
|
lib/uzlib: Add memory-efficient, streaming LZ77 compression support.
The compression algorithm implemented in this commit uses much less memory
compared to the standard way of implementing it using a hash table and
large look-back window. In particular the algorithm here doesn't allocate
hash table to store indices into the history of the previously seen text.
Instead it simply does a brute-force-search of the history text to find a
match for the compressor. This is slower (linear search vs hash table
lookup) but with a small enough history (eg 512 bytes) it's not that slow.
And a small history does not impact the compression too much.
To give some more concrete numbers comparing memory use between the
approaches:
- Standard approach: inplace compression, all text to compress must be in
RAM (or at least memory addressable), and then an additional 16k bytes
RAM of hash table pointers, pointing into the text
- The approach in this commit: streaming compression, only a limited amount
of previous text must be in RAM (user selectable, defaults to 512 bytes).
To compress, say, 1k of data, the standard approach requires all that data
to be in RAM, plus an additional 16k of RAM for the hash table pointers.
With this commit, you only need the 1k of data in RAM. Or if it's
streaming from a file (or elsewhere), you could get away with only 256
bytes of RAM for the sliding history and still get very decent compression.
In summary: because compression takes such a large amount of RAM (in the
standard algorithm) and it's not really suitable for microcontrollers, the
approach taken in this commit is to minimise RAM usage as much as possible,
and still have acceptable performance (speed and compression ratio).
Signed-off-by: Damien George <damien@micropython.org>
2023-01-18 04:46:23 +00:00
|
|
|
match_len = 1;
|
|
|
|
} else {
|
2023-06-26 17:56:58 +01:00
|
|
|
uzlib_match(state, match_offset, match_len);
|
lib/uzlib: Add memory-efficient, streaming LZ77 compression support.
The compression algorithm implemented in this commit uses much less memory
compared to the standard way of implementing it using a hash table and
large look-back window. In particular the algorithm here doesn't allocate
hash table to store indices into the history of the previously seen text.
Instead it simply does a brute-force-search of the history text to find a
match for the compressor. This is slower (linear search vs hash table
lookup) but with a small enough history (eg 512 bytes) it's not that slow.
And a small history does not impact the compression too much.
To give some more concrete numbers comparing memory use between the
approaches:
- Standard approach: inplace compression, all text to compress must be in
RAM (or at least memory addressable), and then an additional 16k bytes
RAM of hash table pointers, pointing into the text
- The approach in this commit: streaming compression, only a limited amount
of previous text must be in RAM (user selectable, defaults to 512 bytes).
To compress, say, 1k of data, the standard approach requires all that data
to be in RAM, plus an additional 16k of RAM for the hash table pointers.
With this commit, you only need the 1k of data in RAM. Or if it's
streaming from a file (or elsewhere), you could get away with only 256
bytes of RAM for the sliding history and still get very decent compression.
In summary: because compression takes such a large amount of RAM (in the
standard algorithm) and it's not really suitable for microcontrollers, the
approach taken in this commit is to minimise RAM usage as much as possible,
and still have acceptable performance (speed and compression ratio).
Signed-off-by: Damien George <damien@micropython.org>
2023-01-18 04:46:23 +00:00
|
|
|
}
|
|
|
|
|
2023-06-27 13:03:52 +01:00
|
|
|
// Push the bytes into the history buffer.
|
|
|
|
size_t mask = state->hist_max - 1;
|
|
|
|
while (match_len--) {
|
|
|
|
uint8_t b = *src++;
|
|
|
|
state->hist_buf[(state->hist_start + state->hist_len) & mask] = b;
|
|
|
|
if (state->hist_len == state->hist_max) {
|
|
|
|
state->hist_start = (state->hist_start + 1) & mask;
|
|
|
|
} else {
|
|
|
|
++state->hist_len;
|
lib/uzlib: Add memory-efficient, streaming LZ77 compression support.
The compression algorithm implemented in this commit uses much less memory
compared to the standard way of implementing it using a hash table and
large look-back window. In particular the algorithm here doesn't allocate
hash table to store indices into the history of the previously seen text.
Instead it simply does a brute-force-search of the history text to find a
match for the compressor. This is slower (linear search vs hash table
lookup) but with a small enough history (eg 512 bytes) it's not that slow.
And a small history does not impact the compression too much.
To give some more concrete numbers comparing memory use between the
approaches:
- Standard approach: inplace compression, all text to compress must be in
RAM (or at least memory addressable), and then an additional 16k bytes
RAM of hash table pointers, pointing into the text
- The approach in this commit: streaming compression, only a limited amount
of previous text must be in RAM (user selectable, defaults to 512 bytes).
To compress, say, 1k of data, the standard approach requires all that data
to be in RAM, plus an additional 16k of RAM for the hash table pointers.
With this commit, you only need the 1k of data in RAM. Or if it's
streaming from a file (or elsewhere), you could get away with only 256
bytes of RAM for the sliding history and still get very decent compression.
In summary: because compression takes such a large amount of RAM (in the
standard algorithm) and it's not really suitable for microcontrollers, the
approach taken in this commit is to minimise RAM usage as much as possible,
and still have acceptable performance (speed and compression ratio).
Signed-off-by: Damien George <damien@micropython.org>
2023-01-18 04:46:23 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|