Feature request: increase max block size from 1M to 16M #217
Hi,
I am using a patch to increase the block size because it gives better compression. Would it be possible to increase it in a future version?
I forget what the maximum block size is that doesn't break the binary optimization related to some marker.
Comments
Hi, I am also trying to test this feature. I modified squashfs_fs.h, which exists in both squashfs-tools and the kernel, simply changing SQUASHFS_FILE_MAX_SIZE and SQUASHFS_FILE_MAX_LOG (though I only raised the block size to 8M). This patch really does reduce the size of the squashfs image, but I worry about the impact of this feature on read performance.
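For anyone wanting to repeat the experiment, the change being described looks roughly like this. This is a sketch based on the macro names above; the stock values match the mainline squashfs_fs.h I'm aware of, but verify them against your own tree:

/* squashfs_fs.h -- present in both squashfs-tools and the kernel tree.
 * Stock definitions:
 *
 *   #define SQUASHFS_FILE_MAX_SIZE 1048576   (1 MiB == 1 << 20)
 *   #define SQUASHFS_FILE_MAX_LOG  20
 *
 * Patched for 8 MiB blocks; the tools and the kernel module must be
 * rebuilt with matching values. */
#define SQUASHFS_FILE_MAX_SIZE 8388608 /* 8 MiB == 1 << 23 */
#define SQUASHFS_FILE_MAX_LOG 23

With the patched tools, the larger block size is then selected as usual, e.g. mksquashfs src img.sqsh -b 8M.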
Assuming SquashFS readers don't discard the 3 bits between 1 << 20 (the current max block size) and 1 << 24 (the compressed bit), it could be increased to 8MiB without breaking anything. With some small changes to readers (a couple lines of code), the block size could be increased to 16MiB. Either choice should probably increment the SquashFS minor version number, since existing tooling can currently assume that blocks will be no larger than 1MiB. This code snippet shows the layout of a data block reference:

pub const DataEntry = packed struct {
    // Maximum SquashFS block size is 1MiB, which can be
    // represented by a u21
    size: u21,

    // If these 3 bits are used as well, sizes up to 8MiB can be
    // represented
    UNUSED: u3 = undefined,

    is_uncompressed: bool,
    UNUSED2: u7 = undefined,
};

Technically speaking, the upper 7 bits could even be utilized, but that would be hacky and complicated; they might also be better put to some other use. Increasing the block size might be good for future-proofing, but it really hurts random access performance on today's computers.
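For comparison, this is roughly how a C reader decodes that same word (a simplified sketch, not the exact kernel or squashfuse code; the macro name follows the kernel's squashfs_fs.h, and the helper functions are hypothetical). As long as the size is extracted by masking everything below bit 24, rather than assuming a 20- or 21-bit field, blocks up to just under 16MiB decode unchanged:

#include <stdbool.h>
#include <stdint.h>

#define SQUASHFS_COMPRESSED_BIT_BLOCK (1u << 24)

/* All bits below the compressed bit carry the on-disk size, so a
 * reader that masks this way keeps working up to 16MiB - 1. */
static inline uint32_t data_block_size(uint32_t word)
{
    return word & (SQUASHFS_COMPRESSED_BIT_BLOCK - 1);
}

/* Bit 24 set means the block is stored uncompressed. */
static inline bool data_block_is_compressed(uint32_t word)
{
    return !(word & SQUASHFS_COMPRESSED_BIT_BLOCK);
}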
For me, 16MB blocks work fine and make the squashfs image smaller. I attached my patch in case someone finds it useful. EDIT: The patch is broken if data is not compressible, so I removed it to avoid any issues with data loss. To mount images I use https://github.com/vasi/squashfuse, which does not require any patch.
It could be about more than just future-proofing. It's surely not really the goal of the project, but with a bit of compression improvement and better tooling (I guess mostly libarchive support for "transparent" I/O), squashfs could become a viable replacement for a bunch of tar use cases. Random access time in a tar file is pretty much a worst-case scenario, so even hundreds-of-MiB block sizes would be an improvement there; tar's advantages are better compression from being compressed as a single stream, and wide file manager support for the related formats (even the not-too-old tzst). Not sure whether this kind of archive use case is ever planned to be "supported", but I've definitely "abused" squashfs that way a few times: a large tar file isn't feasible to browse, and 7zip is too dumb to deal with even just symbolic links, while squashfs makes a proper archive that can be browsed (with FUSE mounting, at least); it's just not as well compressed as the other options.
@cgm999 squashfuse (as well as the Linux kernel) actually would require a patch for this. If a block doesn't compress, the compressed bit will be set, and additional logic is needed to correctly get the block size. Of course you won't run into this situation often, but once you do, it'll make for some hard-to-track bugs and corrupted reads. This code already exists in squashfuse; it would just need to be moved to the
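Here is a self-contained demonstration of that failure mode (just the arithmetic, not actual squashfuse code): with 16MiB blocks, an incompressible block's stored size is exactly 1 << 24, the same bit that flags "uncompressed", so the two alias and a naive mask decodes the size as zero.

#include <stdint.h>
#include <stdio.h>

#define SQUASHFS_COMPRESSED_BIT_BLOCK (1u << 24)

int main(void)
{
    /* An incompressible block is stored as-is with the uncompressed
     * flag set. At a 16 MiB block size, the stored size (1 << 24)
     * collides with the flag bit itself. */
    uint32_t size = 16u * 1024 * 1024;
    uint32_t word = size | SQUASHFS_COMPRESSED_BIT_BLOCK;

    uint32_t naive = word & (SQUASHFS_COMPRESSED_BIT_BLOCK - 1);
    printf("stored word %#x decodes to size %u\n", word, naive);
    /* Prints 0: the corrupted read described above. Below 1 << 24
     * (i.e. up to 8 MiB blocks) the size and the flag never overlap,
     * which is why 8 MiB is the ceiling without reader changes. */
    return 0;
}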
@voidpointertonull Yeah, I'm not gonna argue with that. It could be useful for long-term backups where random access speed isn't super important. Although once you go past 16MiB, SquashFS would need some major format changes, which obviously wouldn't be compatible with current tooling.
@mgord9518 Ah yes, I guess I never hit the case where a block is not compressed and used as-is, and I've used this patch for years (on the same type of data, which I guess explains why I never hit the issue; I also compare the source with the mounted squashfs via FUSE before removing the source).
@mgord9518 The compressed bit was set at 1 << 24 to allow for increases in the maximum block size, if required. The max block size of 1M was chosen in 2009 because the only compression algorithm in the kernel at the time was gzip, and gzip can't make good use of 1M blocks anyway because its window size is too small. xz/zstd can, and increasing the maximum block size is already planned for the next major release (4.7).
@plougher Nice, I'm excited to mess with larger block sizes. Why was 1<<24 chosen over 1<<31? |
It left the upper bits free for other uses. |