Compression

The armaio.compression module provides functions to handle the decompression of data blocks in certain binary formats. The module does not currently implement the compression methods themselves.

Exceptions

class armaio.compression.LzoError

Exception raised upon LZO decompression errors.

class armaio.compression.LzssError

Exception raised upon LZSS decompression errors.

Functions

Note

The LZO1X decompression was implemented based on the format documentation found in the Linux kernel documentation. Additional inspiration was taken from the C implementation of LZO1X decompression in the libavutil library of the FFmpeg project.

The original LZO implementations as written by Markus F.X.J. Oberhumer:

armaio.compression.lzo1x_decompress(data: bytes | bytearray | IO[bytes] | BinaryIO, expected_length: int) → tuple[int, bytes]

Decompresses data compressed with the LZO1X algorithm.

Parameters:
  • data (bytes | bytearray | IO[bytes] | BinaryIO) – Source binary data

  • expected_length (int) – Expected decompressed length

Raises:

LzoError – Could not decompress data due to an error

Returns:

Number of consumed bytes and the decompressed data

Return type:

tuple[int, bytes]

Note

The LZSS decompression was implemented based on information given on the Community Wiki LZSS page.

Additional references:

The compression method used in Arma 3 binary files is a custom flavor of LZSS, with the following general parameters:

  • Flag bits are grouped into bytes by eights (this allows per-byte reading instead of per-bit).

  • 16-bit pointers: 12-bit offset, 4-bit match length (4096 bytes sliding window).

  • Filler value: 0x20 (space character)
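Reading flag bits one byte at a time can be sketched as follows. This is only an illustration, not the module's code: the helper name iter_flags is hypothetical, and the least-significant-bit-first order is an assumption that the parameter list above does not state.

```python
from typing import Iterator


def iter_flags(flag_byte: int) -> Iterator[bool]:
    """Yield the eight flag bits packed into one flag byte.

    Assumption: bits are consumed least-significant bit first.
    """
    for bit in range(8):
        yield bool((flag_byte >> bit) & 1)


# One flag byte describes the next eight items of the stream.
flags = list(iter_flags(0b00000101))
```

Because all eight flags arrive in a single byte, a decoder never needs a bit-level reader for the rest of the stream.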

The pointer is stored in the file in 2 bytes as OOOOOOOO OOOOLLLL. Somewhat confusingly, the 12-bit offset is stored in little-endian order: the first byte holds its low 8 bits, and the upper 4 bits of the second byte (which share that byte with the length) hold its high 4 bits. To get the offset, the 4 bits from the second byte must therefore be prepended to the 8 bits of the first byte.
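The bit layout described above can be unpacked as in the following sketch. The function name split_pointer is hypothetical and is not part of armaio; the returned length is the raw 4-bit value as stored, before any adjustment.

```python
def split_pointer(first: int, second: int) -> tuple[int, int]:
    """Split a 2-byte pointer (OOOOOOOO OOOOLLLL) into (offset, raw_length)."""
    # The high nibble of the second byte supplies the top 4 bits of the
    # 12-bit offset; the first byte supplies the low 8 bits.
    offset = ((second & 0xF0) << 4) | first
    # The low nibble of the second byte is the stored length field.
    raw_length = second & 0x0F
    return offset, raw_length


# Bytes 0x34 0x12: offset bits are 0x1 (from 0x12) followed by 0x34.
offset, raw_length = split_pointer(0x34, 0x12)
```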

The main advantage of LZSS over LZ77 is that a sequence is dictionary-encoded only when the pointer actually saves bytes over a literal copy. With a 2-byte pointer, encoding a single byte this way makes no sense, so the stored length has an implicit minimum: 3 must be added to the value read to get the actual match length. The 4-bit length field therefore covers match lengths from 3 to 18.
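Putting the pieces together, a decoder loop might look like the sketch below. It is not the armaio implementation. Several details are assumptions that this page does not state: flag bits are read least-significant first, a set flag bit marks a literal byte, the offset is a distance back from the current end of the output, positions before the start of the output read as the filler value, and any trailing checksum is ignored. The name lzss_sketch is hypothetical.

```python
FILLER = 0x20  # space character, per the parameter list
MIN_MATCH = 3  # implicit minimum match length


def lzss_sketch(data: bytes, expected_length: int) -> bytes:
    """Decode an LZSS-style stream under the assumptions named above."""
    out = bytearray()
    pos = 0
    while len(out) < expected_length:
        flags = data[pos]  # one flag byte covers up to eight items
        pos += 1
        for bit in range(8):
            if len(out) >= expected_length:
                break
            if (flags >> bit) & 1:
                # Assumed convention: set bit means a literal byte follows.
                out.append(data[pos])
                pos += 1
                continue
            first, second = data[pos], data[pos + 1]
            pos += 2
            offset = ((second & 0xF0) << 4) | first
            length = (second & 0x0F) + MIN_MATCH
            if offset == 0:
                # A zero distance would read past the output; reject it here.
                raise ValueError("zero offset is not handled by this sketch")
            start = len(out) - offset
            # Copy byte by byte so that overlapping matches repeat data.
            for i in range(start, start + length):
                if len(out) >= expected_length:
                    break
                out.append(out[i] if i >= 0 else FILLER)
    return bytes(out)


# Three literals "abc", then a pointer (offset 3, stored length 3 -> 6 bytes).
sample = bytes([0b00000111, ord("a"), ord("b"), ord("c"), 0x03, 0x03])
result = lzss_sketch(sample, 9)
```

The byte-by-byte copy matters: a match may overlap the bytes it is producing, which is how a 3-byte window entry expands into 6 output bytes in the sample.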

armaio.compression.lzss_decompress(data: bytes | bytearray | IO[bytes] | BinaryIO, expected_length: int, *, signed_checksum: bool = False) → tuple[int, bytes]

Decompress data compressed with the LZSS algorithm.

Parameters:
  • data (bytes | bytearray | IO[bytes] | BinaryIO) – Source binary data

  • expected_length (int) – Expected decompressed length

  • signed_checksum (bool, optional) – Use signed checksum instead of unsigned, defaults to False

Raises:

LzssError – Could not decompress data due to an error

Returns:

Number of consumed bytes and the decompressed data

Return type:

tuple[int, bytes]
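This page does not describe how the checksum is computed. Purely as an illustration of what a signed versus unsigned variant could mean, the sketch below assumes a plain additive sum of the decompressed bytes, truncated to 32 bits. Both that algorithm and the name additive_checksum are assumptions, not documented behavior of armaio.

```python
def additive_checksum(data: bytes, signed: bool = False) -> int:
    """Sum the bytes of data, truncated to 32 bits (assumed algorithm)."""
    total = 0
    for byte in data:
        if signed and byte >= 0x80:
            # Reinterpret the byte as a signed 8-bit value (-128..127).
            byte -= 0x100
        total += byte
    return total & 0xFFFFFFFF


unsigned_sum = additive_checksum(b"\x01\xff")
signed_sum = additive_checksum(b"\x01\xff", signed=True)
```

Under this assumption the two variants agree for data made only of bytes below 0x80 and diverge as soon as a higher byte value appears.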