Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lto to static builds #2147

Merged
merged 3 commits into from
Jan 31, 2025
Merged

Add lto to static builds #2147

merged 3 commits into from
Jan 31, 2025

Conversation

jneem
Copy link
Member

@jneem jneem commented Jan 29, 2025

Fixes #2116.

I timed (clean) lto vs non-lto builds, and on my system the lto builds are almost twice as slow (180-ish seconds vs 100-ish). For that reason, I didn't add lto to the release profile, but created a new one with lto. I switched the static builds to use lto, so that release binaries will get the benefit but our other CI builds won't get slowed down.

I also measured lto = "thin", but it increased the size of the main binary (I didn't run benchmarks, so I don't know the effect on runtime performance).

@jneem jneem requested a review from yannham January 29, 2025 14:57
Copy link
Member

@yannham yannham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm counting right, we have three main ways of providing Nickel binaries

  • pre-compiled binaries (handled here)
  • binary compiled from source when using the crates.io release
  • building Nickel from Nix/Nixpkgs

Do you think we could cheaply get LTO for the other ones? I haven't looked, but can we specify a profile to use when cargo installing something, which might be different from release?

The nix part might need to add some knobs to avoid doing LTO on the CI but do it for the default package/default app of the flake. Then I suppose it needs to be done differently for the Nixpkgs release, but this can be handled at the next release - there is nothing we can do from the Nickel repo for now, I reckon.

@aspiwack
Copy link
Member

I'm obviously lacking a ton of context here, so bear with me. I'm surprised that the release profile is used on CI, I would intuitively run CI with the development profile. But if you have a good reason to use a release-like profile for CI, I think the more natural solution is to create a new profile for CI with the well-tuned option, and have the release profile (which will be used by package manager) turn LTO on. You're not proposing this, so: am I missing something?

@jneem
Copy link
Member Author

jneem commented Jan 30, 2025

Do you think we could cheaply get LTO for the other ones?

Definitely the easiest route would be to add LTO to the release profile, and then everyone would get it automatically. I'm not 100% sure that it's what we want for cargo install, though, because everyone who uses that route is compiling for themselves in contrast to CI/nixpkgs which compiles once and then distributes the binary.

For nixpkgs, I guess we would either add LTO to the release profile or patch nixpkgs to use a release-lto profile.

I'm surprised that the release profile is used on CI, I would intuitively run CI with the development profile

I think this is just because the CI uses nix flake check, and nix defaults to release builds. This can be changed but it's a little painful -- AFAICT we'd need to duplicate all the packages to add a non-release variant of each one.

@aspiwack
Copy link
Member

I see. Is the flake building purely development/CI builds not an option? Is the flake a distribution channel as well? But if it is, then you probably want LTO turned on for the flake as well, don't you?

@jneem
Copy link
Member Author

jneem commented Jan 30, 2025

Personally, I use the flake only for its devShell (and indirectly for CI). Since we make the flake publicly available, I guess anyone could be using it for getting a nickel binary. I think organist was using our flake at some point, but currently they're using nickel from nixpkgs instead.

Even if the flake was a distribution channel, I'm not sure we'd want to turn on LTO by default. It's the same logic as "cargo install", because the end-user is the one doing the compiling.

One thing that makes the decision tricky is that going from debug to release builds roughly doubles the build time but delivers a huge performance benefit, while adding on LTO doubles the build time again but for a much smaller benefit.

@aspiwack
Copy link
Member

aspiwack commented Jan 30, 2025

Are we able to estimate the performance benefit of LTO?


As a rule of thumb, I think that when installing a program (as opposed to “just” compiling), you want to best build (that is the recommended build). Build time doesn't matter that much. In the case of the flake, it could also be mitigated by providing a cache.

@jneem
Copy link
Member Author

jneem commented Jan 30, 2025

Are we able to estimate the performance benefit of LTO?

I'm not sure how representative it is, but we do have a benchmark suite. On my machine, LTO delivers about a 10% improvement. Maybe that's substantial enough to just live with the increased build times?

@yannham
Copy link
Member

yannham commented Jan 30, 2025

Definitely the easiest route would be to add LTO to the release profile, and then everyone would get it automatically. I'm not 100% sure that it's what we want for cargo install, though, because everyone who uses that route is compiling for themselves in contrast to CI/nixpkgs which compiles once and then distributes the binary.

I would say this is what we want. When you cargo install, you still compile it once and use it n >> 1 times; like @aspiwack I would say build time doesn't matter as much here.

I'm obviously lacking a ton of context here, so bear with me. I'm surprised that the release profile is used on CI, I would intuitively run CI with the development profile.

This is a good point. To be honest I'm not a hundred percent sure about the answer. One might just be that it was easier to do in the flake (instead of parametrizing the profile), as we share the same sub-derivations e.g. to build the default package of the flake. It's definitely a bad reason because it's not very hard to parametrize.

Another possibility is caching. It's maybe moot now that we use buildxx for the CI, but before that, hitting the cache was quite important to get reasonable CI time, and it might be the case that building with different profiles would fill it up more quickly (it's easy to blow up a free Cachix instance). I remember we were quite careful regarding those questions when developing the flake at a time. Since we use flake checks, could it be the case that a random user installing from the flake will build everything in double, as I believe checks are run by default?

But those are suppositions. It also makes a lot of sense to build stuff in debug mode in CI. Maybe a reasonable answer is that flake checks and CI checks should be separate, or at least not necessarily perfectly equal.

@jneem
Copy link
Member Author

jneem commented Jan 30, 2025

I only ever run nix flake check when the CI fails and I want to reproduce it locally, so I think it's nice that they're the same.

@yannham
Copy link
Member

yannham commented Jan 30, 2025

Sure, but would that be a problem to run nix build .#ci-checksand have the CI run the same thing instead ? It's not the first time that I tell to myself that the common practice of combining both doesn't always make sense. It also means the flake checks are building the WASM crate for example, which you probably don't care about if you're a simpler consumer of the flake...

@jneem
Copy link
Member Author

jneem commented Jan 30, 2025

Sure, that would be fine for me. But as someone who only ever runs nix flake check to reproduce CI locally, what would be the purpose of having flake checks in addition to a build job that's intended for CI?

@yannham
Copy link
Member

yannham commented Jan 30, 2025

I think I had a wrong understanding of checks and mixed it up with the checks associated to one derivation (like the one controlled by doChecks). It does seem like checks are only run through nix flake check, in which case my proposition doesn't make sense, and you can ignore it.

@jneem
Copy link
Member Author

jneem commented Jan 31, 2025

Ok, this latest version turns on LTO for the release profile, and then hopefully modifies nix flake check to only use the debug builds.

With any luck, I also fixed the caching issues causing cargo to rebuild many of the dependencies. We now build and cache dependencies with debug and release profiles, and with/without --all-features. The cached cargo target tarballs are less than 1G each, so hopefully not a big deal for our builders.

@jneem jneem requested a review from yannham January 31, 2025 07:36
@jneem jneem added this pull request to the merge queue Jan 31, 2025
Merged via the queue into master with commit fef448b Jan 31, 2025
5 checks passed
@jneem jneem deleted the lto branch January 31, 2025 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable Link-Time Optimization (LTO)
3 participants