Feature request: increase max block size from 1M to 16M #217
Hi,
I am using a patch to increase the block size because it gives better compression. Would it be possible to increase it in a future version?
I forget what the maximum block size is that doesn't break the binary optimization related to some marker.
Comments
Hi, I am also trying to test this feature. I modified squashfs_fs.h, which exists in both squashfs-tools and the kernel, simply changing SQUASHFS_FILE_MAX_SIZE and SQUASHFS_FILE_MAX_LOG (though I only raised the block size to 8M). This patch really does reduce the size of the squashfs image, but I worry about the impact of this feature on read performance.
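For anyone wanting to repeat the experiment, the change being described looks roughly like this. This is a sketch based on the macro names above; the stock values match the mainline squashfs_fs.h I'm aware of, but verify them against your own tree:

/* squashfs_fs.h -- present in both squashfs-tools and the kernel tree.
 * Stock definitions:
 *
 *   #define SQUASHFS_FILE_MAX_SIZE 1048576   (1 MiB == 1 << 20)
 *   #define SQUASHFS_FILE_MAX_LOG  20
 *
 * Patched for 8 MiB blocks; the tools and the kernel module must be
 * rebuilt with matching values. */
#define SQUASHFS_FILE_MAX_SIZE 8388608 /* 8 MiB == 1 << 23 */
#define SQUASHFS_FILE_MAX_LOG 23

With the patched tools, the larger block size is then selected as usual, e.g. mksquashfs src img.sqsh -b 8M.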
Assuming SquashFS readers don't discard the 3 bits between 1 << 20 (the current max block size) and 1 << 24 (the compressed bit), it could be increased to 8MiB without breaking anything. With some small changes to readers (a couple lines of code), the block size could be increased to 16MiB. Either choice should probably increment the SquashFS minor version number, since existing tooling can currently assume that blocks will be no larger than 1MiB. This code snippet shows the layout of a data block reference:

pub const DataEntry = packed struct {
    // Maximum SquashFS block size is 1MiB, which can be
    // represented by a u21
    size: u21,

    // If these 3 bits are used as well, sizes up to 8MiB can be
    // represented
    UNUSED: u3 = undefined,

    is_uncompressed: bool,
    UNUSED2: u7 = undefined,
};

Technically speaking, the upper 7 bits could even be utilized, but that would be hacky and complicated; they might also be better put to some other use. Increasing the block size might be good for future-proofing, but it really hurts random access performance on today's computers.
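For comparison, this is roughly how a C reader decodes that same word (a simplified sketch, not the exact kernel or squashfuse code; the macro name follows the kernel's squashfs_fs.h, and the helper functions are hypothetical). As long as the size is extracted by masking everything below bit 24, rather than assuming a 20- or 21-bit field, blocks up to just under 16MiB decode unchanged:

#include <stdbool.h>
#include <stdint.h>

#define SQUASHFS_COMPRESSED_BIT_BLOCK (1u << 24)

/* All bits below the compressed bit carry the on-disk size, so a
 * reader that masks this way keeps working up to 16MiB - 1. */
static inline uint32_t data_block_size(uint32_t word)
{
    return word & (SQUASHFS_COMPRESSED_BIT_BLOCK - 1);
}

/* Bit 24 set means the block is stored uncompressed. */
static inline bool data_block_is_compressed(uint32_t word)
{
    return !(word & SQUASHFS_COMPRESSED_BIT_BLOCK);
}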
For me, 16MB blocks work fine and make the squashfs image smaller. I attached my patch in case someone finds it useful. EDIT: The patch is broken if data is not compressible, so I removed it to avoid any issues with data loss. To mount images I use https://github.com/vasi/squashfuse, which does not require any patch.
It could be about more than just future-proofing. It's surely not really the goal of the project, but with a bit of compression improvement and better tooling (I guess mostly libarchive support for "transparent" I/O), squashfs could become a viable replacement for a bunch of tar use cases. Random access time in a tar file is pretty much a worst-case scenario, so even hundreds-of-MiB block sizes would be an improvement there; tar's advantages are better compression from being compressed as a single stream, and wide file manager support for the related formats (even the not-too-old tzst). Not sure whether this kind of archive use case is ever planned to be "supported", but I've definitely "abused" squashfs that way a few times: a large tar file isn't feasible to browse, and 7zip is too dumb to deal with even just symbolic links, while squashfs makes a proper archive that can be browsed (with FUSE mounting, at least); it's just not as well compressed as the other options.
@cgm999 squashfuse (as well as the Linux kernel) actually would require a patch for this. If a block doesn't compress, the compressed bit will be set, and additional logic is needed to correctly get the block size. Of course you won't run into this situation often, but once you do, it'll make for some hard-to-track bugs and corrupted reads. This code already exists in squashfuse; it would just need to be moved to the
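Here is a self-contained demonstration of that failure mode (just the arithmetic, not actual squashfuse code): with 16MiB blocks, an incompressible block's stored size is exactly 1 << 24, the same bit that flags "uncompressed", so the two alias and a naive mask decodes the size as zero.

#include <stdint.h>
#include <stdio.h>

#define SQUASHFS_COMPRESSED_BIT_BLOCK (1u << 24)

int main(void)
{
    /* An incompressible block is stored as-is with the uncompressed
     * flag set. At a 16 MiB block size, the stored size (1 << 24)
     * collides with the flag bit itself. */
    uint32_t size = 16u * 1024 * 1024;
    uint32_t word = size | SQUASHFS_COMPRESSED_BIT_BLOCK;

    uint32_t naive = word & (SQUASHFS_COMPRESSED_BIT_BLOCK - 1);
    printf("stored word %#x decodes to size %u\n", word, naive);
    /* Prints 0: the corrupted read described above. Below 1 << 24
     * (i.e. up to 8 MiB blocks) the size and the flag never overlap,
     * which is why 8 MiB is the ceiling without reader changes. */
    return 0;
}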
@voidpointertonull Yeah, I'm not gonna argue with that. It could be useful for long-term backups where random access speed isn't super important. Although once you go past 16MiB, SquashFS would need some major format changes, which obviously wouldn't be compatible with current tooling.
@mgord9518 Ah yes, I guess I never hit the case where a block is not compressed and used as-is, and I've used this patch for years (on the same type of data, which I guess explains why I never hit the issue; I also compare the source with the mounted squashfs via FUSE before removing the source).
@mgord9518 The compressed bit was set at 1 << 24 to allow for increases in the maximum block size, if required. The max block size of 1M was chosen in 2009 because the only compression algorithm in the kernel at the time was gzip, and gzip can't make good use of 1M blocks anyway because its window size is too small. xz/zstd can, and increasing the maximum block size is already planned for the next major release (4.7).
@plougher Nice, I'm excited to mess with larger block sizes. Why was 1<<24 chosen over 1<<31? |
It left the upper bits free for other uses. |