aiogzip ⚡️
An asynchronous library for reading and writing gzip-compressed files.
aiogzip provides a fast, simple, and asyncio-native interface for handling .gz files, making it a useful complement to Python's built-in gzip module for asynchronous applications.
It is designed for high-performance I/O operations, especially for text-based data pipelines, and integrates seamlessly with other async libraries like aiocsv.
Features
- Truly Asynchronous: Built with asyncio and aiofiles for non-blocking file I/O.
- High-Performance Text Processing: Significantly faster than the standard gzip library for text and JSONL file operations.
- Simple API: Mimics the interface of gzip.open(), making it easy to adopt.
- Separate Binary and Text Modes: AsyncGzipBinaryFile and AsyncGzipTextFile provide clear, type-safe handling of data.
- Excellent Compression Quality: Achieves compression ratios nearly identical to the standard gzip module.
- aiocsv Integration: Read and write compressed CSV files effortlessly.
Quickstart
Using aiogzip is as simple as using the standard gzip module, but with async/await.
Writing to a Compressed File
import asyncio
from aiogzip import AsyncGzipFile

async def main():
    # Write binary data
    async with AsyncGzipFile("file.gz", "wb") as f:
        await f.write(b"Hello, async world!")

    # Write text data
    async with AsyncGzipFile("file.txt.gz", "wt") as f:
        await f.write("This is a text file.")

asyncio.run(main())
Reading from a Compressed File
import asyncio
from aiogzip import AsyncGzipFile

async def main():
    # Read the entire file
    async with AsyncGzipFile("file.gz", "rb") as f:
        content = await f.read()
        print(content)

    # Iterate over lines in a text file
    async with AsyncGzipFile("file.txt.gz", "rt") as f:
        async for line in f:
            print(line.strip())

asyncio.run(main())
Compatibility
aiogzip provides comprehensive compatibility with the standard gzip module's GzipFile API, including:
- ✅ seek() and tell() methods for stream navigation (with the same performance characteristics as gzip.GzipFile)
- ✅ peek() and readinto() for advanced reading patterns
- ✅ Reading and writing gzip headers and metadata (e.g., mtime, original_filename)
- ✅ Text and binary mode operations with proper encoding/decoding
- ✅ Full compatibility with tarfile for reading .tar.gz archives
- ✅ Seamless integration with aiocsv for CSV processing
For AsyncGzipTextFile, tell() returns an opaque cookie value for the current open stream. Use it only with seek(cookie) on the same open handle.
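The cookie pattern is easiest to see with the standard library's synchronous gzip text mode, which treats tell() values the same way. The sketch below uses only the stdlib, not aiogzip. The helper names reread_from_cookie and demo are hypothetical. With AsyncGzipTextFile the equivalent calls would presumably be awaited, but that is an assumption to check against the library's own documentation.

```python
import gzip
import os
import tempfile


def reread_from_cookie(path: str) -> tuple[str, str]:
    """Read a line, rewind to a saved cookie, and read the same line again."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        f.readline()              # consume the first line
        cookie = f.tell()         # opaque value: do not do arithmetic on it
        first_pass = f.readline()
        f.seek(cookie)            # valid only on this same open handle
        second_pass = f.readline()
    return first_pass, second_pass


def demo() -> tuple[str, str]:
    """Write a small gzip text file to a temp directory and exercise the pattern."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt.gz")
        with gzip.open(path, "wt", encoding="utf-8") as f:
            f.write("first\nsecond\nthird\n")
        return reread_from_cookie(path)


if __name__ == "__main__":
    print(demo())
```

The cookie is never stored past the with block or interpreted as a byte offset. It is only handed back to seek() on the handle that produced it.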
Note: aiogzip focuses on file-based operations and does not currently support in-memory compression/decompression (e.g., gzip.compress/gzip.decompress).
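For in-memory data, one possible workaround is to call the standard library's gzip.compress and gzip.decompress in a worker thread so the event loop is not blocked. This is a minimal sketch that does not use aiogzip at all. The helper names compress_in_memory, decompress_in_memory, and roundtrip are hypothetical, and asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
import gzip


async def compress_in_memory(data: bytes) -> bytes:
    # gzip.compress is blocking and CPU-bound, so run it off the event loop.
    return await asyncio.to_thread(gzip.compress, data)


async def decompress_in_memory(blob: bytes) -> bytes:
    # Same approach for decompression.
    return await asyncio.to_thread(gzip.decompress, blob)


async def roundtrip(data: bytes) -> bytes:
    """Compress and then decompress, returning the recovered bytes."""
    blob = await compress_in_memory(data)
    return await decompress_in_memory(blob)


if __name__ == "__main__":
    payload = b"Hello, async world!" * 100
    assert asyncio.run(roundtrip(payload)) == payload
```

The thread keeps the event loop free to schedule other tasks while the compression call runs, though a single call is not made faster by it.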