RAM has a memory access delay of >100 ns, whereas a cache hit takes less than 60 ns. This is a bit vague in the README.
On top of that comes the hardware prefetcher, which detects linear access patterns, plus some dark magic from speculative execution with speculative cache prefetching.
Usually Linux serves memory-mapped reads of smaller files from the kernel page cache, whose hot pages naturally sit in L3 cache (when available) — which is also not exactly formulated in the README.
Since the algorithm used is a streaming one with a sliding window for vectorization: is there a specific reason why you do not use a circular buffer, the fastest data structure available for this?
Otherwise, it would be helpful to have a separate benchmark comparing allocator implementations for file reading alone.
matu3ba changed the title from *clarify RAM <-> cache behavior* to *clarify README* on Mar 4, 2021