Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Precalculate float consts for RecursiveGaussian at build time #17

Merged
merged 5 commits into from
Feb 4, 2024

Conversation

FreezyLemon
Copy link
Contributor

@FreezyLemon FreezyLemon commented Feb 2, 2024

RecursiveGaussian::new does calculations that are always based on the same constant values defined at compile-time. My first thought was to just inline these, but that would make maintaining the algorithm a lot more difficult.

This PR adds a simple build script that does the same as RecursiveGaussian::new and writes the resulting values to const variables:

mod consts {
  pub const RADIUS: usize = 5_usize;
  pub const VERT_MUL_IN_1: f32 = 0.055295236_f32;
  pub const VERT_MUL_IN_3: f32 = -0.058836687_f32;
  pub const VERT_MUL_IN_5: f32 = 0.012955819_f32;
  pub const VERT_MUL_PREV_1: f32 = -1.9021131_f32;
  pub const VERT_MUL_PREV_3: f32 = -1.1755705_f32;
  pub const VERT_MUL_PREV_5: f32 = -0.00000000000000012246469_f32;
  pub const MUL_IN_1: f32 = 0.055295236_f32;
  pub const MUL_IN_3: f32 = -0.058836687_f32;
  pub const MUL_IN_5: f32 = 0.012955819_f32;
  pub const MUL_PREV_1: f32 = 1.9021131_f32;
  pub const MUL_PREV_3: f32 = 1.1755705_f32;
  pub const MUL_PREV_5: f32 = 0.00000000000000012246469_f32;
  pub const MUL_PREV2_1: f32 = -1_f32;
  pub const MUL_PREV2_3: f32 = -1_f32;
  pub const MUL_PREV2_5: f32 = -1_f32;
}

The RecursiveGaussian can then use these consts instead of self.asdf in the blur algorithm.

I understand that there is a reason that float calculations are not const in Rust (even though integer calculations are), but by my testing, even platforms without FMA produce the exact same variable values. Maybe more testing is needed on ARM and maybe Windows platforms (due to their x87 FPU shenanigans that might apply here). But I think that even if there are inconsistencies, they're probably smaller than the inaccuracies the algorithm itself would produce at runtime anyways.

I have no clue why I never tested this before. The results are staggering for the amount of effort needed:

blur                    time:   [9.0900 ms 9.1005 ms 9.1112 ms]
                        change: [-36.050% -35.935% -35.820%] (p = 0.00 < 0.05)
                        Performance has improved.

This also reduces the resulting library size (very roughly 20%), mostly because of nalgebra being removed from runtime.

@FreezyLemon
Copy link
Contributor Author

Oops, still some clippy lints to sort out

Also allow a clippy lint in this module that
is not necessary here
This will allow `blur.blur` instead of having
to do `RecursiveGaussian::blur` everywhere
@shssoichiro shssoichiro merged commit 88a5703 into rust-av:main Feb 4, 2024
1 check passed
@FreezyLemon FreezyLemon deleted the precalc-float-consts branch February 4, 2024 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants