Skip to content

aiogzip ⚡️

An asynchronous library for reading and writing gzip-compressed files.

License: MIT PyPI version PyPI - Python Version Tests

aiogzip provides a fast, simple, and asyncio-native interface for handling .gz files, making it a useful complement to Python's built-in gzip module for asynchronous applications.

It is designed for high-performance I/O operations, especially for text-based data pipelines, and integrates seamlessly with other async libraries like aiocsv.

Features

  • Truly Asynchronous: Built with asyncio and aiofiles for non-blocking file I/O.
  • High-Performance Text Processing: Significantly faster than the standard gzip library for text and JSONL file operations.
  • Simple API: Mimics the interface of gzip.open(), making it easy to adopt.
  • Separate Binary and Text Modes: AsyncGzipBinaryFile and AsyncGzipTextFile provide clear, type-safe handling of data.
  • Excellent Compression Quality: Achieves compression ratios nearly identical to the standard gzip module.
  • aiocsv Integration: Read and write compressed CSV files effortlessly.


Quickstart

Using aiogzip is as simple as using the standard gzip module, but with async/await.

Writing to a Compressed File

import asyncio
from aiogzip import AsyncGzipFile

async def main():
    # Write binary data
    async with AsyncGzipFile("file.gz", "wb") as f:
        await f.write(b"Hello, async world!")

    # Write text data
    async with AsyncGzipFile("file.txt.gz", "wt") as f:
        await f.write("This is a text file.")

asyncio.run(main())

Reading from a Compressed File

import asyncio
from aiogzip import AsyncGzipFile

async def main():
    # Read the entire file
    async with AsyncGzipFile("file.gz", "rb") as f:
        content = await f.read()
        print(content)

    # Iterate over lines in a text file
    async with AsyncGzipFile("file.txt.gz", "rt") as f:
        async for line in f:
            print(line.strip())

asyncio.run(main())

Compatibility

aiogzip provides comprehensive compatibility with the standard gzip module's GzipFile API, including:

  • seek() and tell() methods for stream navigation (with the same performance characteristics as gzip.GzipFile)
  • peek() and readinto() for advanced reading patterns
  • ✅ Reading and writing gzip headers and metadata (e.g., mtime, original_filename)
  • ✅ Text and binary mode operations with proper encoding/decoding
  • ✅ Full compatibility with tarfile for reading .tar.gz archives
  • ✅ Seamless integration with aiocsv for CSV processing

For AsyncGzipTextFile, tell() returns an opaque cookie value for the current open stream. Use it only with seek(cookie) on the same open handle.

Backward seeks restart decompression from the beginning of the gzip stream. For non-seekable fileobj inputs, aiogzip keeps a bounded compressed-input replay cache so rewind can work without loading unbounded data; tune it with max_rewind_cache_size or set it to None for the previous unbounded behavior.

Note: aiogzip focuses on file-based operations and does not currently support in-memory compression/decompression (e.g., gzip.compress/gzip.decompress).