Commit
rdrand: Avoid inlining unrolled retry loops.
The rdrand implementation contains three calls to rdrand():

1. One in the main loop, for full words of output.
2. One after the main loop, for the potential partial word of output.
3. One inside the self-test loop.

In the first case, the loop is unrolled into:

```
loop:
    ....
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
    rdrand <register>
    jb loop
```

The second case is similar, except it isn't a loop.

In the third case, the self-test loop, the same unrolling happens, but then the self-test loop is also unrolled, so the result is a sequence of 160 instructions.

With this change, the generated code for the loop looks like this:

```
loop:
    ...
    rdrand <register>
    jb loop
    call retry
    test rax, rax
    jne loop
    jmp fail
```

The generated code for the tail now looks like this:

```
    rdrand rdx
    jae call_retry
    ...
```

This is much better because we're no longer jumping over the uselessly-unrolled loops.

The loop in `retry()` still gets unrolled though, but the compiler will put it in the cold function section.

Since rdrand will basically never fail, the `jb <success>` in each call is going to be predicted as succeeding, so the number of instructions doesn't change. But, instruction cache pressure should be reduced.
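The split between an inlined first attempt and an out-of-line retry path can be sketched roughly as follows. This is a minimal illustration, not the code from this commit, and it assumes the implementation is Rust, which the message does not state. Only the name `retry` comes from the message above; `fill`, `next_word`, `RETRY_LIMIT`, and the `step` closure (standing in for a single rdrand attempt so that the sketch runs on any target) are hypothetical.

```rust
/// Hypothetical retry limit, chosen to match the ten attempts
/// visible in the unrolled listing.
const RETRY_LIMIT: usize = 10;

/// Slow path. `#[cold]` and `#[inline(never)]` ask the compiler to keep
/// this loop out of line, so that even if it is unrolled it is emitted
/// once rather than at every call site.
#[cold]
#[inline(never)]
fn retry<F: FnMut() -> Option<u64>>(step: &mut F) -> Option<u64> {
    for _ in 0..RETRY_LIMIT {
        if let Some(word) = step() {
            return Some(word);
        }
    }
    None
}

/// Fast path: a single attempt inline, falling back to `retry` only
/// when that attempt fails.
#[inline]
fn next_word<F: FnMut() -> Option<u64>>(step: &mut F) -> Option<u64> {
    match step() {
        Some(word) => Some(word),
        None => retry(step),
    }
}

/// Fills `dest` using full words in the main loop and one extra word
/// for a partial tail. `step` is a stand-in for one rdrand attempt.
fn fill<F: FnMut() -> Option<u64>>(dest: &mut [u8], mut step: F) -> Option<()> {
    let mut chunks = dest.chunks_exact_mut(8);
    for chunk in &mut chunks {
        // Main loop: one full word per iteration.
        chunk.copy_from_slice(&next_word(&mut step)?.to_ne_bytes());
    }
    let tail = chunks.into_remainder();
    if !tail.is_empty() {
        // Tail: use only as many bytes as are still needed.
        let word = next_word(&mut step)?.to_ne_bytes();
        tail.copy_from_slice(&word[..tail.len()]);
    }
    Some(())
}
```

Both call sites in `fill` share the single `retry` body, which is the property the generated code above relies on. Whether the compiler actually places `retry` in a cold section depends on the toolchain and optimization settings.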