Reduce size of the data files #11
Comments
To stay with the
@DominikBernhardt is the format of the data files documented somewhere? Anyway, in this specific example all three groups are the same. I think the socle should always be expressed in terms of the generators of the full group, perhaps via words in the generators. Doing so, I think this > 500 kB entry could be shrunk by a factor of 500. It won't be as dramatic everywhere, but I am hopeful we can reduce the size by at least an order of magnitude.
For the socle, we can in fact just store (information about) a normal generating set, to be fed into
Also, for the
@glukemorgan I just learned from @aniemeyer that you already have a "small" / "reduced size" version of the data files. Is that correct? If so, perhaps you'd be willing to share it and then we could integrate it here and finally get this package released to a wider audience...
Hi @fingolfin, sorry, I don't log in to GitHub too often, I just saw this.
Hi @glukemorgan, and sorry: I saw your message and then it got lost in the stack *sigh*. You could add them via a pull request. Or you could email them to me and I can integrate them. If you prefer we can also continue the conversation via email (reach me under
Currently there are 5 compressed data files (`QUIMP[1-5].tar.bz2`) in the repository which take up 9-69 MB each, for a total of 190 MB. The user has to extract them, for a total of 770 MB. This should be reduced. Here are several ideas for this, which can be combined.
First off, GAP can transparently access `.gz` files. This suggests storing not e.g. `lib/QUIMP_336.g` but rather `lib/QUIMP_336.g.gz` in the archives, so that disk space usage is reduced for the end user. The result is "only" 270 MB.

This would in fact allow shipping the files "directly" to the user, without a need for `.tar.bz2` files. These could then also be removed from the repository, which would be better anyway; we could instead keep the `lib/QUIMP_*.g` files in the repository directly (and compress them on the fly for releases, which we already do for multiple other packages).

Next, the content of the `lib/QUIMP_*.g` files could be optimized further.

@aniemeyer suggests that for many groups a good way to compress them is to store them via generators in a different, minimal degree representation, and then store generators of a subgroup such that the action on the cosets of that subgroup gives the actual QUIMP permutations. Indeed, take for example `QuimpGroup(4080,1)`. In the file `lib/QUIMP_4080.g` it takes up more than 0.5 MB space. But it is $A_{17}$ in disguise. So one could replace the generators by the information "this is A17" plus generators for the point stabilizer.
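
To make the coset-action idea concrete, here is a minimal sketch on a toy case. It uses $A_7$ rather than $A_{17}$ so that it runs instantly, and the choice of stabilizer (the subgroup fixing the points 1, 2 and 3) is purely an assumption for illustration; it is not taken from the actual data for `QuimpGroup(4080,1)`. The variable name `stored` is likewise hypothetical.

```gap
# Toy stand-in for the idea: A_7 in its natural degree-7 representation.
G := AlternatingGroup(7);

# Assumed point stabilizer for this illustration: fix 1, 2 and 3 pointwise.
U := Stabilizer(G, [1, 2, 3], OnTuples);

# This is all that would need to be stored: "this is A_7" plus these generators.
stored := GeneratorsOfGroup(U);

# Rebuild the large-degree permutation group from the compact data
# via the action of G on the right cosets of the stored subgroup.
hom := FactorCosetAction(G, Subgroup(G, stored));
P := Image(hom);

Index(G, U);          # 210 = 7 * 6 * 5
NrMovedPoints(P);     # 210
Size(P) = Size(G);    # true: the action is faithful because A_7 is simple
```

One caveat with this approach: the numbering of the points produced by `FactorCosetAction` depends on how the cosets are enumerated, so the rebuilt group is permutation isomorphic to the original but need not have literally the same generators. If the exact permutations in the current data files matter, some extra convention for the coset ordering would have to be fixed as well.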