genext2fs is painfully slow for multi-GB input #31

Open
josch opened this issue Feb 7, 2022 · 8 comments

Comments

@josch
Contributor

josch commented Feb 7, 2022

Hi,

I'm now using genext2fs with multi-GB tarballs as input. While this works well, it also takes several hours on my machine, so I profiled genext2fs:
gprof.txt

If I read the profiling output correctly, then most time is spent in the function allocate().

Do you have any ideas how to improve the speed by introducing better data structures?
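
For illustration only: one common way a block allocator ends up dominating a profile is by scanning its bitmap from the start on every call, which makes N allocations cost on the order of N² probes. Whether allocate() in genext2fs actually behaves this way is an assumption, not something the profile establishes. The toy model below (all names are hypothetical and unrelated to the genext2fs source) compares that pattern with an allocator that remembers where the last search ended.

```python
def allocate_scan_from_start(bitmap):
    """First-fit allocation that restarts at index 0 on every call.

    Returns (block index, number of bitmap entries probed).
    """
    for i, used in enumerate(bitmap):
        if not used:
            bitmap[i] = True
            return i, i + 1
    raise MemoryError("no free block")


class HintedAllocator:
    """Same bitmap, but the search resumes where the previous one stopped."""

    def __init__(self, nblocks):
        self.bitmap = [False] * nblocks
        self.hint = 0  # index at which the next search starts

    def allocate(self):
        n = len(self.bitmap)
        for probes in range(1, n + 1):
            i = self.hint
            self.hint = (self.hint + 1) % n  # wrap around at the end
            if not self.bitmap[i]:
                self.bitmap[i] = True
                return i, probes
        raise MemoryError("no free block")


def total_probes_naive(nblocks):
    """Allocate every block with the restart-from-zero strategy."""
    bitmap = [False] * nblocks
    return sum(allocate_scan_from_start(bitmap)[1] for _ in range(nblocks))


def total_probes_hinted(nblocks):
    """Allocate every block with the hinted strategy."""
    alloc = HintedAllocator(nblocks)
    return sum(alloc.allocate()[1] for _ in range(nblocks))


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        print(n, total_probes_naive(n), total_probes_hinted(n))
```

In this model, doubling the number of blocks roughly quadruples the work for the restart-from-zero variant, while the hinted variant stays linear. A free list or per-group free counters would be other options along the same lines.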

@bestouff
Owner

bestouff commented Feb 8, 2022

Hi @josch,

indeed I have some ideas to mitigate this; I'm currently a bit short on time but I may try something.
Do you have an easy way of reproducing the problem?

@josch
Contributor Author

josch commented Feb 8, 2022

The "easy" way is just to throw a big tarball at it. 😄

For example here is a big system image: https://mister-muffin.de/reform/target-userland-full.tar

@gelrom

gelrom commented Jan 12, 2023

Any luck looking into this?

I've hit this issue as well. For me, with a ~10 GB tar, it seems to basically never complete (on a very powerful machine), whereas e.g. virt-make-fs takes ~30 min.

@gelrom

gelrom commented Jan 12, 2023

Some quick benchmarks that I did make me think there is something highly nonlinear going on:
100 MB: ~1 s
500 MB: ~10 s
800 MB: ~27 s
900 MB: ~71 s
1 GB: ~130 s

Note: these were done with a tar containing a single file of the given size.
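
As a rough check of how nonlinear these numbers are, the slope between consecutive points on a log-log scale can be computed: a slope of 1 would mean linear scaling and 2 quadratic. This is only a sketch over the five approximate timings above; it assumes 1 GB means 1000 MB, and the function name is made up for the example.

```python
import math

# (tar size in MB, wall time in seconds) from the measurements above.
# Assumption: "1 GB" is taken as 1000 MB; the timings are approximate.
measurements = [(100, 1), (500, 10), (800, 27), (900, 71), (1000, 130)]


def local_exponents(points):
    """Return the log-log slope between each pair of consecutive points.

    If time ~ size**k between two measurements, the slope equals k.
    """
    slopes = []
    for (s0, t0), (s1, t1) in zip(points, points[1:]):
        slopes.append(math.log(t1 / t0) / math.log(s1 / s0))
    return slopes


exponents = local_exponents(measurements)
for (s0, _), (s1, _), k in zip(measurements, measurements[1:], exponents):
    print(f"{s0:>5} -> {s1:>5} MB: time ~ size^{k:.1f}")
```

The slopes come out at roughly 1.4, 2.1, 8.2 and 5.7, so the slowdown above 800 MB is much steeper than a plain quadratic would give. With only five coarse data points this is a hint, not a measurement of the actual complexity.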

@josch
Contributor Author

josch commented Jan 12, 2023

I observed the same non-linear behavior. Since this is breaking my use-case for genext2fs, I instead worked on a patch for e2fsprogs that would allow it to use a tarball as input: tytso/e2fsprogs#118

@pamolloy

I'm trying to build an 8G image using genimage, and genext2fs -d ... has been running for at least 30 minutes. I haven't managed to get it to finish yet. I also tried using -a rootfs.tar and ran into a locale issue with a downloaded tar and a segfault with a tar I created myself.

@pamolloy

I switched to mke2fs by setting use-mke2fs = "true" in my genimage.cfg, which seems to work without issue and completes in less than a minute.

@josch
Contributor Author

josch commented Mar 31, 2023

@pamolloy did the locale issue look something like this:

archive_read_next_header(): Pathname can't be converted from UTF-8 to current locale.

If yes, maybe try out #30 and tell me if that fixes your issue?

As for the slowness, I do not know how to fix genext2fs, but if you want tarball input, then maybe tytso/e2fsprogs#118 is of interest to you?
