This sub-package is in beta testing. Ahead of version 1.0 there may be API changes, but these are expected to be minimal, if any.
This sub-package provides the capability to compress and decompress data using the LZ4 frame specification.
The frame specification is recommended for most applications. A key benefit of using the frame specification (compared to the block specification) is interoperability with other implementations.
These functions are bindings to the LZ4 Frame API functions for compressing data into a single frame, and decompressing a full frame of data.
lz4.frame.compress(data, compression_level=0, block_size=0, content_checksum=0, block_linked=True, store_size=True, return_bytearray=False)
Compresses data, returning the compressed data as a complete frame.
The returned data includes a header and end mark, and so is suitable for writing to a file.
Parameters: | data (str, bytes or buffer-compatible object) – data to compress |
---|---|
Keyword Arguments: | |
|
|
Returns: | Compressed data |
Return type: | bytes or bytearray |
lz4.frame.decompress(data, return_bytearray=False, return_bytes_read=False)
Decompresses a frame of data and returns it as a string of bytes.
Parameters: | data (str, bytes or buffer-compatible object) – data to decompress. This should contain a complete LZ4 frame of compressed data. |
---|---|
Keyword Arguments: | |
|
|
Returns: | Uncompressed data and optionally the number of bytes read If the
|
Return type: | bytes/bytearray or tuple |
These functions are bindings to the LZ4 Frame API functions allowing piece-wise compression and decompression. Using them requires managing compression and decompression contexts manually. An alternative to using these is to use the context manager classes described in the section below.
lz4.frame.create_compression_context()
Creates a compression context object.
The compression context is required for compression operations.
Returns: | A compression context |
---|---|
Return type: | cCtx |
lz4.frame.compress_begin(context, source_size=0, compression_level=0, block_size=0, content_checksum=0, content_size=1, block_mode=0, frame_type=0, auto_flush=1)
Creates a frame header from a compression context.
Parameters: | context (cCtx) – A compression context. |
---|---|
Keyword Arguments: | |
|
|
Returns: | Frame header. |
Return type: | bytes or bytearray |
lz4.frame.compress_chunk(context, data)
Compresses blocks of data and returns the compressed data.
The returned data should be concatenated with the data returned from lz4.frame.compress_begin and any subsequent calls to lz4.frame.compress_chunk.
Parameters: |
|
---|---|
Keyword Arguments: | |
return_bytearray (bool) – If |
|
Returns: | Compressed data. |
Return type: | bytes or bytearray |
Notes
If auto flush is disabled (auto_flush=False when calling lz4.frame.compress_begin), this function may buffer and retain some or all of the compressed data for future calls to lz4.frame.compress_chunk.
lz4.frame.compress_flush(context, end_frame=True, return_bytearray=False)
Flushes any buffered data held in the compression context.
This flushes any data buffered in the compression context, returning it as compressed data. The returned data should be appended to the output of previous calls to lz4.frame.compress_chunk.
The end_frame argument specifies whether or not the frame should be ended. If this is True, an end of frame marker will be appended to the returned data. In this case, if content_checksum was True when calling lz4.frame.compress_begin, then a checksum of the uncompressed data will also be included in the returned data.
If the end_frame argument is True, the compression context will be reset and can be re-used.
Parameters: | context (cCtx) – Compression context |
---|---|
Keyword Arguments: | |
|
|
Returns: | compressed data. |
Return type: | bytes or bytearray |
Notes
If end_frame is False but the underlying LZ4 library does not support flushing without ending the frame, a RuntimeError will be raised.
lz4.frame.create_decompression_context()
Creates a decompression context object.
A decompression context is needed for decompression operations.
Returns: | A decompression context |
---|---|
Return type: | dCtx |
lz4.frame.reset_decompression_context(context)
Resets a decompression context object.
This is useful for recovering from an error or for stopping an unfinished decompression and starting a new one with the same context.
Parameters: | context (dCtx) – A decompression context |
---|
lz4.frame.decompress_chunk(context, data)
Decompresses part of a frame of compressed data.
The returned uncompressed data should be concatenated with the data returned from previous calls to lz4.frame.decompress_chunk.
Parameters: |
|
---|---|
Keyword Arguments: | |
|
|
Returns: | uncompressed data, bytes read, end of frame indicator This function returns a tuple consisting of:
|
Return type: | tuple |
The end of frame indicator is True if the end of the compressed frame has been reached, or False otherwise.
The following function can be used to retrieve information about a compressed frame.
lz4.frame.get_frame_info(frame)
Given a frame of compressed data, returns information about the frame.
Parameters: | frame (str, bytes or buffer-compatible object) – LZ4 compressed frame |
---|---|
Returns: | Dictionary with keys:
|
Return type: | dict |
These classes, which use the low-level bindings to the Frame API, are more convenient to use. They provide context management, so it is not necessary to manually create and manage compression and decompression contexts.
lz4.frame.LZ4FrameCompressor(block_size=0, block_linked=True, compression_level=0, content_checksum=False, block_checksum=False, auto_flush=False, return_bytearray=False)
Create an LZ4 frame compressor object.
This object can be used to compress data incrementally.
Parameters: |
|
---|
begin(source_size=0)
Begin a compression frame.
The returned data contains frame header information. The data returned from subsequent calls to compress() should be concatenated with this header.
Keyword Arguments: | |
---|---|
source_size (int) – Optionally specify the total size of the uncompressed data. If specified, will be stored in the compressed frame header as an 8-byte field for later use during decompression. Default is 0 (no size stored). | |
Returns: | frame header data |
Return type: | bytes or bytearray |
compress(data)
Compresses data and returns it.
This compresses data (a bytes object), returning a bytes or bytearray object containing compressed data for the input.
If auto_flush has been set to False, some of data may be buffered internally, for use in later calls to LZ4FrameCompressor.compress() and LZ4FrameCompressor.flush().
The returned data should be concatenated with the output of any previous calls to compress() and a single call to begin().
Parameters: | data (str, bytes or buffer-compatible object) – data to compress |
---|---|
Returns: | compressed data |
Return type: | bytes or bytearray |
flush()
Finish the compression process.
This returns a bytes or bytearray object containing any data stored in the compressor’s internal buffers and a frame footer.
The LZ4FrameCompressor instance may be re-used after this method has been called to create a new frame of compressed data.
Returns: | compressed data and frame footer. |
---|---|
Return type: | bytes or bytearray |
reset()
Reset the LZ4FrameCompressor instance.
This allows the LZ4FrameCompressor instance to be re-used after an error.
lz4.frame.LZ4FrameDecompressor(return_bytearray=False)
Create an LZ4 frame decompressor object.
This can be used to decompress data incrementally.
For a more convenient way of decompressing an entire compressed frame at once, see lz4.frame.decompress().
Parameters: | return_bytearray (bool) – When False a bytes object is returned from
the calls to methods of this class. When True a bytearray
object will be returned. The default is False . |
---|
eof
bool – True if the end-of-stream marker has been reached. False otherwise.
unused_data
bytes – Data found after the end of the compressed stream. Before the end of the frame is reached, this will be b''.
needs_input
bool – False if the decompress() method can provide more decompressed data before requiring new compressed input. True otherwise.
decompress(data, max_length=-1)
Decompresses part or all of an LZ4 frame of compressed data.
The returned data should be concatenated with the output of any previous calls to decompress().
If max_length is non-negative, returns at most max_length bytes of decompressed data. If this limit is reached and further output can be produced, the needs_input attribute will be set to False. In this case, the next call to decompress() may provide data as b'' to obtain more of the output. In all cases, any unconsumed data from previous calls will be prepended to the input data.
If all of the input data was decompressed and returned (either because this was less than max_length bytes, or because max_length was negative), the needs_input attribute will be set to True.
If an end of frame marker is encountered in the data during decompression, decompression will stop at the end of the frame, and any data after the end of frame is available from the unused_data attribute. In this case, the LZ4FrameDecompressor instance is reset and can be used for further decompression.
Parameters: | data (str, bytes or buffer-compatible object) – compressed data to decompress |
---|---|
Keyword Arguments: | |
max_length (int) – If this is non-negative, this method returns at
most max_length bytes of decompressed data. |
|
Returns: | Uncompressed data |
Return type: | bytes |
reset()
Reset the decompressor state.
This is useful after an error occurs, allowing re-use of the instance.
These provide the capability to read and write files using LZ4 compressed frames. They are designed to be drop-in replacements for the equivalent LZMA, BZ2 and Gzip functionality in the Python standard library.
lz4.frame.open(filename, mode='rb', encoding=None, errors=None, newline=None, block_size=0, block_linked=True, compression_level=0, content_checksum=False, block_checksum=False, auto_flush=False, return_bytearray=False, source_size=0)
Open an LZ4Frame-compressed file in binary or text mode.
filename can be either an actual file name (given as a str, bytes, or PathLike object), in which case the named file is opened, or it can be an existing file object to read from or write to.
The mode argument can be 'r', 'rb' (default), 'w', 'wb', 'x', 'xb', 'a', or 'ab' for binary mode, or 'rt', 'wt', 'xt', or 'at' for text mode.
For binary mode, this function is equivalent to the LZ4FrameFile constructor: LZ4FrameFile(filename, mode, ...).
For text mode, an LZ4FrameFile object is created, and wrapped in an io.TextIOWrapper instance with the specified encoding, error handling behavior, and line ending(s).
Parameters: | filename (str, bytes, os.PathLike) – file name or file object to open |
---|---|
Keyword Arguments: | |
|
lz4.frame.LZ4FrameFile(filename=None, mode='r', block_size=0, block_linked=True, compression_level=0, content_checksum=False, block_checksum=False, auto_flush=False, return_bytearray=False, source_size=0)
A file object providing transparent LZ4F (de)compression.
An LZ4FrameFile can act as a wrapper for an existing file object, or refer directly to a named file on disk.
Note that LZ4FrameFile provides a binary file interface: data read is returned as bytes, and data to be written must be given as bytes.
When opening a file for writing, the settings used by the compressor can be specified. The underlying compressor object is lz4.frame.LZ4FrameCompressor. See the docstrings for that class for details on compression options.
Parameters: | filename (str, bytes, PathLike, file object) – can be either an actual file name (given as a str, bytes, or PathLike object), in which case the named file is opened, or it can be an existing file object to read from or write to. |
---|---|
Keyword Arguments: | |
|
close()
Flush and close the file.
May be called more than once without error. Once the file is closed, any other operation on it will raise a ValueError.
closed
Returns True if this file is closed.
Returns: | True if the file is closed, False otherwise. |
---|---|
Return type: | bool |
fileno()
Return the file descriptor for the underlying file.
Returns: | file descriptor for file. |
---|---|
Return type: | int |
peek(size=-1)
Return buffered data without advancing the file position.
Always returns at least one byte of data, unless at EOF. The exact number of bytes returned is unspecified.
Returns: | uncompressed data |
---|---|
Return type: | bytes |
read(size=-1)
Read up to size uncompressed bytes from the file.
If size is negative or omitted, read until EOF is reached.
Returns b'' if the file is already at EOF.
Parameters: | size (int) – If non-negative, specifies the maximum number of uncompressed bytes to return. |
---|---|
Returns: | uncompressed data |
Return type: | bytes |
read1(size=-1)
Read up to size uncompressed bytes.
This method tries to avoid making multiple reads from the underlying stream.
This method reads up to a buffer’s worth of data if size is negative.
Returns b'' if the file is at EOF.
Parameters: | size (int) – If non-negative, specifies the maximum number of uncompressed bytes to return. |
---|---|
Returns: | uncompressed data |
Return type: | bytes |
readable()
Return whether the file was opened for reading.
Returns: |
|
---|---|
Return type: | bool |
readline(size=-1)
Read a line of uncompressed bytes from the file.
The terminating newline (if present) is retained. If size is non-negative, no more than size bytes will be read (in which case the line may be incomplete). Returns b'' if already at EOF.
Parameters: | size (int) – If non-negative, specifies the maximum number of uncompressed bytes to return. |
---|---|
Returns: | uncompressed data |
Return type: | bytes |
seek(offset, whence=0)
Change the file position.
The new position is specified by offset, relative to the position indicated by whence. Possible values for whence are:
io.SEEK_SET or 0: start of stream (default); offset must not be negative
io.SEEK_CUR or 1: current stream position
io.SEEK_END or 2: end of stream; offset must not be positive
Returns the new file position.
Note that seeking is emulated, so depending on the parameters, this operation may be extremely slow.
Parameters: |
|
---|---|
Returns: | new file position |
Return type: | int |
seekable()
Return whether the file supports seeking.
Returns: | True if the file supports seeking, False otherwise. |
---|---|
Return type: | bool |
tell()
Return the current file position.
Parameters: | None – |
---|---|
Returns: | file position |
Return type: | int |
writable()
Return whether the file was opened for writing.
Returns: |
|
---|---|
Return type: | bool |
write(data)
Write a bytes object to the file.
Returns the number of uncompressed bytes written, which is always len(data). Note that due to buffering, the file on disk may not reflect the data written until close() is called.
Parameters: | data (bytes) – uncompressed data to compress and write to the file |
---|---|
Returns: | the number of uncompressed bytes written to the file |
Return type: | int |
A number of module attributes are defined for convenience. These are detailed below.
The following module attributes can be used when setting the compression_level argument.
lz4.frame.COMPRESSIONLEVEL_MIN
Specifier for the minimum compression level.
Specifying compression_level=lz4.frame.COMPRESSIONLEVEL_MIN will instruct the LZ4 library to use a compression level of 0.
lz4.frame.COMPRESSIONLEVEL_MINHC
Specifier for the minimum compression level for high compression mode.
Specifying compression_level=lz4.frame.COMPRESSIONLEVEL_MINHC will instruct the LZ4 library to use a compression level of 3, the minimum for the high compression mode.
lz4.frame.COMPRESSIONLEVEL_MAX
Specifier for the maximum compression level.
Specifying compression_level=lz4.frame.COMPRESSIONLEVEL_MAX will instruct the LZ4 library to use a compression level of 16, the highest compression level available.
The following attributes can be used when setting the block_size argument.
lz4.frame.BLOCKSIZE_DEFAULT
Specifier for the default block size.
Specifying block_size=lz4.frame.BLOCKSIZE_DEFAULT will instruct the LZ4 library to use the default maximum blocksize. This is currently equivalent to lz4.frame.BLOCKSIZE_MAX64KB.
lz4.frame.BLOCKSIZE_MAX64KB
Specifier for a maximum block size of 64 kB.
Specifying block_size=lz4.frame.BLOCKSIZE_MAX64KB will instruct the LZ4 library to create blocks containing a maximum of 64 kB of uncompressed data.
lz4.frame.BLOCKSIZE_MAX256KB
Specifier for a maximum block size of 256 kB.
Specifying block_size=lz4.frame.BLOCKSIZE_MAX256KB will instruct the LZ4 library to create blocks containing a maximum of 256 kB of uncompressed data.
lz4.frame.BLOCKSIZE_MAX1MB
Specifier for a maximum block size of 1 MB.
Specifying block_size=lz4.frame.BLOCKSIZE_MAX1MB will instruct the LZ4 library to create blocks containing a maximum of 1 MB of uncompressed data.
lz4.frame.BLOCKSIZE_MAX4MB
Specifier for a maximum block size of 4 MB.
Specifying block_size=lz4.frame.BLOCKSIZE_MAX4MB will instruct the LZ4 library to create blocks containing a maximum of 4 MB of uncompressed data.