add pclmulqdq crc32 variant #31

Merged
merged 19 commits into from
Feb 19, 2024

@folkertdev folkertdev commented Feb 16, 2024

We've discussed whether we really need our own implementation of crc32. It turns out that we get a massive speed boost over crc32fast in the streaming case, when input arrives in small chunks. In the benchmark below, crc32fast takes about 52% more wall time:

Benchmark 1 (84 runs): cargo run --release --example crc32_bench sse-chunked silesia-small.tar
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          59.8ms ± 3.60ms    57.6ms … 84.2ms          6 ( 7%)        0%
  peak_rss           21.9MB ±  171KB    21.5MB … 22.3MB          0 ( 0%)        0%
  cpu_cycles          159M  ± 7.31M      155M  …  206M           6 ( 7%)        0%
  instructions        351M  ± 37.4K      351M  …  351M           0 ( 0%)        0%
  cache_references   3.76M  ± 63.5K     3.65M  … 3.97M           3 ( 4%)        0%
  cache_misses       1.06M  ± 81.3K      908K  … 1.49M           8 (10%)        0%
  branch_misses       750K  ± 59.1K      730K  … 1.17M           4 ( 5%)        0%
Benchmark 2 (56 runs): cargo run --release --example crc32_bench crc32fast-chunked silesia-small.tar
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          90.8ms ± 2.10ms    88.4ms …  101ms          2 ( 4%)        💩+ 51.7% ±  1.7%
  peak_rss           21.9MB ±  170KB    21.5MB … 22.2MB          0 ( 0%)          -  0.1% ±  0.3%
  cpu_cycles          278M  ± 4.38M      273M  …  302M           2 ( 4%)        💩+ 75.4% ±  1.3%
  instructions        434M  ± 35.6K      434M  …  434M           0 ( 0%)        💩+ 23.5% ±  0.0%
  cache_references   3.75M  ± 54.5K     3.63M  … 3.90M           0 ( 0%)          -  0.1% ±  0.5%
  cache_misses       1.14M  ±  121K      992K  … 1.58M           3 ( 5%)        💩+  7.7% ±  3.2%
  branch_misses       744K  ± 15.3K      732K  …  842K           3 ( 5%)          -  0.9% ±  2.1%

When all the input is available, the current implementation is just as fast as crc32fast.
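For readers unfamiliar with the distinction, here is a minimal sketch of what "chunked" versus "all input available" means for a checksum. It uses a plain table-driven CRC-32 (reflected polynomial 0xEDB88320), not the pclmulqdq variant added in this PR and not crc32fast's API. The function names `make_table`, `crc32_update`, `crc32_oneshot` and `crc32_chunked` are made up for illustration.

```rust
/// Build the 256-entry lookup table for the reflected CRC-32 polynomial.
fn make_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    for i in 0..256u32 {
        let mut c = i;
        for _ in 0..8 {
            c = if c & 1 != 0 { 0xEDB8_8320 ^ (c >> 1) } else { c >> 1 };
        }
        table[i as usize] = c;
    }
    table
}

/// Continue a CRC-32 computation over `buf`.
/// `start` is the checksum of all earlier input (0 if there was none).
fn crc32_update(table: &[u32; 256], start: u32, buf: &[u8]) -> u32 {
    // Undo the final inversion of the previous call to recover the running state.
    let mut crc = !start;
    for &b in buf {
        crc = table[((crc ^ u32::from(b)) & 0xFF) as usize] ^ (crc >> 8);
    }
    !crc
}

/// "All input available": a single call over the whole buffer.
fn crc32_oneshot(data: &[u8]) -> u32 {
    crc32_update(&make_table(), 0, data)
}

/// "Streaming": feed the data in pieces of `chunk_size` bytes (must be non-zero),
/// threading the running checksum through each call.
fn crc32_chunked(data: &[u8], chunk_size: usize) -> u32 {
    let table = make_table();
    data.chunks(chunk_size)
        .fold(0, |crc, chunk| crc32_update(&table, crc, chunk))
}
```

Both paths produce the same checksum; they differ only in how often the per-call setup runs. With small chunks that overhead is paid many times, and the chunked benchmark above is sensitive to exactly this cost.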


codecov bot commented Feb 16, 2024

Codecov Report

Attention: 63 lines in your changes are missing coverage. Please review.

Comparison is base (92994f8) 85.99% compared to head (0297489) 86.16%.
Report is 2 commits behind head on main.

Files                              Patch %   Lines
zlib-rs/src/crc32/braid.rs         82.30%    20 Missing ⚠️
zlib-rs/src/crc32/pclmulqdq.rs     92.88%    19 Missing ⚠️
zlib-rs/src/crc32.rs               84.84%    15 Missing ⚠️
load-dynamic-libz-ng/src/lib.rs     0.00%     9 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #31      +/-   ##
==========================================
+ Coverage   85.99%   86.16%   +0.16%     
==========================================
  Files          29       31       +2     
  Lines        6591     6982     +391     
==========================================
+ Hits         5668     6016     +348     
- Misses        923      966      +43     


@@ -183,3 +183,13 @@ pub fn compress_slice<'a>(

    (&mut output[..stream.total_out], libz_ng_sys::Z_OK)
}

pub unsafe fn crc32(start: u32, buf: *const u8, len: usize) -> u32 {
    let lib = libloading::Library::new("/home/folkertdev/c/libcrc.so").unwrap();
Collaborator

Maybe add this to the repo?

Collaborator Author

will do in a separate PR

@folkertdev folkertdev merged commit 8c3a10a into main Feb 19, 2024
5 checks passed