Per ChunkArea PushConstants + VRAM Usage Reductions #345

Closed
wants to merge 17 commits into from

Conversation

thr3343
Contributor

@thr3343 thr3343 commented Dec 16, 2023

No description provided.

@thr3343 thr3343 force-pushed the BaseInstanceGPUArgs3 branch 6 times, most recently from 4931d9f to 616e6a1 Compare December 18, 2023 13:43
@thr3343 thr3343 force-pushed the BaseInstanceGPUArgs3 branch from 616e6a1 to 714834b Compare December 18, 2023 14:02
@thr3343 thr3343 force-pushed the BaseInstanceGPUArgs3 branch from 82ccd6d to 0934a4e Compare December 18, 2023 18:22
@thr3343 thr3343 force-pushed the BaseInstanceGPUArgs3 branch 2 times, most recently from 840e091 to a98c750 Compare December 21, 2023 17:24
@thr3343 thr3343 force-pushed the BaseInstanceGPUArgs3 branch from a98c750 to 17adeb9 Compare December 21, 2023 17:29
@thr3343 thr3343 force-pushed the BaseInstanceGPUArgs3 branch 2 times, most recently from ca12001 to 6599f2c Compare December 21, 2023 18:48
@thr3343 thr3343 force-pushed the BaseInstanceGPUArgs3 branch from 6599f2c to 8fe886c Compare December 22, 2023 18:45
Empty drawCalls/DrawCmds no longer seem to occur, allowing all indexCount checks to be removed

(On paper this should be completely impossible but it somehow works)
@thr3343 thr3343 force-pushed the BaseInstanceGPUArgs3 branch from b723e00 to d71c16c Compare December 22, 2023 23:02
@thr3343
Contributor Author

thr3343 commented Dec 23, 2023

Thanks to Collateral for pointing out that this PR has major issues with CPU cache misses,
which cause performance regressions in the BFS algorithm used to handle culling/uploads during chunk loads.
As a result, chunk loads become a lot slower than normal.

i.e. I had hyperfixated on reducing class sizes to improve cache line alignment, but completely overlooked cache misses.

if(!renderSection.isCompletelyEmpty()) {
    // This code has major issues with cache misses due to the EnumMap in DrawBuffers afaik
    for(var t : renderSection.getCompiledSection().renderTypes) {
        drawBuffers.addDrawCommands(t, renderSection.getDrawParameters(t));
    }
    this.drawBufferSetQueue.add(drawBuffers);
    this.nonEmptyChunks++;
}
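For illustration only, here is a minimal sketch of one common way to cut pointer chasing in a hot loop like the one above: keep per-section draw parameters in flat primitive arrays indexed by the enum ordinal, instead of looking up separately allocated objects through an EnumMap. The names `RenderKind`, `FlatDrawParams`, `slot`, and `totalIndices` are hypothetical and not part of this PR or of the project's code, and whether this layout would actually fix the regression described here is untested.

```java
// Hypothetical stand-in for the renderer's render-type enum (illustrative names only).
enum RenderKind { SOLID, CUTOUT, TRANSLUCENT }

// Assumed layout, not the project's real DrawBuffers: draw parameters for every
// section live in two contiguous int arrays, so a scan touches sequential memory
// rather than following a reference per lookup.
final class FlatDrawParams {
    private static final int KINDS = RenderKind.values().length;

    private final int[] firstIndex;
    private final int[] indexCount;

    FlatDrawParams(int sectionCount) {
        this.firstIndex = new int[sectionCount * KINDS];
        this.indexCount = new int[sectionCount * KINDS];
    }

    // Maps (section, kind) to a position in the flat arrays.
    static int slot(int section, RenderKind kind) {
        return section * KINDS + kind.ordinal();
    }

    void set(int section, RenderKind kind, int first, int count) {
        int s = slot(section, kind);
        firstIndex[s] = first;
        indexCount[s] = count;
    }

    int firstIndex(int section, RenderKind kind) {
        return firstIndex[slot(section, kind)];
    }

    int indexCount(int section, RenderKind kind) {
        return indexCount[slot(section, kind)];
    }

    // Sums the index counts of one render kind across all sections,
    // stepping through the array with a fixed stride of KINDS.
    long totalIndices(RenderKind kind) {
        long total = 0;
        for (int s = kind.ordinal(); s < indexCount.length; s += KINDS) {
            total += indexCount[s];
        }
        return total;
    }
}
```

The trade-off is that every section reserves a slot for every render kind, including unused ones, so this spends some memory to gain locality.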

Perf regressions are not ideal, as they can accumulate over time and across PRs and become hard to debug.
So for now I've decided it's best to leave this PR for a later version,
so that these cache miss issues can be ironed out first.
