Feature request: Parallel file reading #239

Open
nh2 opened this issue Apr 1, 2023 · 2 comments

@nh2

nh2 commented Apr 1, 2023

Currently mksquashfs seems to use a single reader thread.

Many current devices only achieve optimal throughput when files are read from them in parallel:

  • current SSDs (which require a high queue depth)
  • large RAID arrays (e.g. servers with 16 disks)
  • network file systems (where parallelism hides network latency)

Could mksquashfs add (configurable) threaded reading?

Thanks!
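
To illustrate the kind of reading requested above, here is a minimal Python sketch (not mksquashfs code, which is written in C). It keeps several reads in flight while still handing files back in their original order. The names `read_file`, `read_files_parallel` and the `reader_threads` parameter are hypothetical and do not correspond to any existing mksquashfs option.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def read_file(path: Path) -> bytes:
    """Read one file fully. Blocking I/O releases the GIL in CPython."""
    return path.read_bytes()


def read_files_parallel(
    paths: Iterable[Path], reader_threads: int = 4
) -> Iterator[Tuple[Path, bytes]]:
    """Yield (path, data) in input order while several reads are in flight.

    reader_threads is a hypothetical tuning knob for this sketch only.
    """
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=reader_threads) as pool:
        # Executor.map submits every read up front and yields results
        # in the same order as the inputs, so output order is stable.
        for path, data in zip(path_list, pool.map(read_file, path_list)):
            yield path, data
```

Because every read is submitted at once, this sketch may hold many files in memory. A real implementation would probably need to bound the number of outstanding reads.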

@plougher plougher self-assigned this Apr 4, 2023
@plougher plougher added this to the Undecided milestone Apr 4, 2023
@plougher
Owner

plougher commented Apr 5, 2023

This is an interesting request (the second in one week). Back when I first parallelised Mksquashfs, in about 2006, I did extensive experiments reading the source filesystem using one thread and multiple threads. These experiments showed that maximum performance was obtained with a single read thread (so you're right that there is only one reader thread). But this was in the days of mechanical hard drives with slow seeking, and the results were not that surprising. By and large, anything which caused seeking (including parallel reading of files) produced worse performance.
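
A comparison of this kind could be repeated on current hardware with something like the following Python sketch. It is not the original 2006 experiment. The function names and the default thread counts are assumptions chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, Sequence


def total_bytes_read(paths: Sequence[Path], threads: int) -> int:
    """Read every file using the given number of threads; return bytes read."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(lambda p: len(p.read_bytes()), paths))


def compare_thread_counts(
    paths: Sequence[Path], thread_counts: Sequence[int] = (1, 2, 4, 8)
) -> Dict[int, float]:
    """Return elapsed seconds per thread count for reading the same files."""
    timings: Dict[int, float] = {}
    for threads in thread_counts:
        start = time.perf_counter()
        total_bytes_read(paths, threads)
        timings[threads] = time.perf_counter() - start
    return timings
```

The timings only mean something if the files are not already in the page cache, so the cache would need to be dropped between runs (on Linux, for example, by writing to /proc/sys/vm/drop_caches as root).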

Modern hardware, including RAID (*) and SSDs, may have changed the situation. So I'll add this to the list of enhancements and see if priorities allow it to be looked at for the next release.

(*) RAID has been around since the late 1980s. In fact I implemented a block striping RAID system in 1991. But RAID systems have become more and more widespread in recent years.

As far as RAID is concerned, I assume these systems are using block striping rather than bit striping; otherwise there should not be an issue. Also, since readahead should kick in for large files and utilise all the disks with block striping, I assume the issue is with small files, which do not benefit from readahead.
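
The small-file reasoning can be made concrete with a little arithmetic. The Python sketch below assumes a simple RAID 0 style layout with no parity, in which stripe k lives on disk k mod N, and a file stored contiguously on the array. The 64 KiB stripe size is an assumption. The 16 disks come from the example in the request. `disks_touched` is a hypothetical helper name.

```python
from typing import Set


def disks_touched(
    offset: int, length: int, stripe_size: int = 64 * 1024, n_disks: int = 16
) -> Set[int]:
    """Return the disk indices a contiguous read covers under block striping."""
    if length <= 0:
        return set()
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    # Stripe k lives on disk k mod n_disks (assumed layout); once a read
    # spans n_disks stripes or more, every disk is involved.
    n_stripes = min(last_stripe - first_stripe + 1, n_disks)
    return {(first_stripe + i) % n_disks for i in range(n_stripes)}
```

Under these assumptions a 4 KiB file sits on a single disk, so one reader working through small files keeps only one disk busy at a time, while a 1 MiB sequential read spans all 16 disks.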

@plougher plougher modified the milestones: Undecided, 4.7 release Dec 13, 2024
@ptallada

I'm interested in this feature too :)


3 participants