[Idea] Make gzp usable as C library and/or python module #17

Open
ghuls opened this issue Sep 10, 2021 · 1 comment

Comments

@ghuls

ghuls commented Sep 10, 2021

Make gzp usable as C library and/or python module

For example, there is a Python wrapper around ISA-L which uses the `igzip` code to provide fast gzip compression and decompression (at the cost of worse compression ratios). It mimics the standard `gzip` module, so it works as a faster replacement for it.

python-isal

Faster zlib and gzip compatible compression and decompression by providing Python bindings for the ISA-L library.

This package provides Python bindings for the ISA-L library. The Intel(R) Intelligent Storage Acceleration Library (ISA-L) implements several key algorithms in assembly language. This includes a variety of functions to provide zlib/gzip-compatible compression.

python-isal provides the bindings by offering three modules:

    isal_zlib: A drop-in replacement for the zlib module that uses ISA-L to accelerate its performance.
    igzip: A drop-in replacement for the gzip module that uses isal_zlib instead of zlib to perform its compression and checksum tasks, which improves performance.
    igzip_lib: Provides compression functions which have full access to the API of ISA-L's compression functions.

isal_zlib and igzip are almost fully compatible with zlib and gzip from the Python standard library. There are some minor differences; see differences-with-zlib-and-gzip-modules.
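To make the "drop-in replacement" idea concrete, here is a minimal sketch of how such a module is typically swapped in. It assumes python-isal is installed and importable as `isal`; if it is not, the sketch falls back to the standard `gzip` module. The `fast_gzip` alias and the `roundtrip` helper are names made up for this illustration, not part of either library.

```python
import gzip  # standard-library baseline

try:
    # Assumption: python-isal is installed (pip install isal).
    from isal import igzip as fast_gzip
except ImportError:
    # Fall back to the standard library so the script still runs.
    fast_gzip = gzip


def roundtrip(payload: bytes) -> bytes:
    """Compress and decompress payload with whichever module was imported."""
    compressed = fast_gzip.compress(payload)
    # The output is ordinary gzip data, so the standard module can read it too.
    assert gzip.decompress(compressed) == payload
    return fast_gzip.decompress(compressed)


data = b"ACGT" * 10_000
result = roundtrip(data)
```

Because only the import changes, calling code written against `gzip` does not need to be touched, which is the property a Python binding for gzp would presumably want as well.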

https://github.com/pycompression/python-isal

It would be great if gzp could be exposed in a similar way, so that more bioinformatics (Python) programs can read and write gzip/bgzf files faster.

@sstadick
Owner

I would very much like for both of those things to happen! If someone else starts on that before I do I'm more than happy to accommodate that / help make it happen. Especially the python library.

If I or anyone else starts on this, it'd be great to post a comment in this ticket.
