Driver timeout on AMD 5700U #178

Open
toxieainc opened this issue Aug 5, 2024 · 19 comments

Comments

@toxieainc
Contributor

toxieainc commented Aug 5, 2024

Hi!

Using Windows 11, an AMD 5700U (=integrated graphics) and the latest official AMD drivers, Supermodel runs into a driver timeout pretty consistently, e.g. when selecting the race car in Daytona2PE (or early during racing in Scud Race).

This seems to have been the case since at least when supersampling was introduced (note: it also happens with supersampling = 1), but it could have started some commits earlier; I still need to do a full binary search.
It does not happen with older builds, even when playing for a long time.

@dukeeeey
Collaborator

dukeeeey commented Aug 5, 2024

Hi toxieainc,
if you could figure out at which commit this started happening, that would help a lot. Unfortunately I don't even have an AMD card to test with. I know that in Windows there is a driver timeout value that you can edit in the registry:

https://answers.microsoft.com/en-us/windows/forum/all/increase-time-out-limit/e979e2ad-e15f-450b-9818-a148cbf01078

If you increase this timeout, does it at least allow the game to run? Maybe something in the shader is causing it to recompile, eating up time.

If you are comfortable with the code, try uncommenting these lines:

//glDebugMessageCallback(DebugCallback, NULL);
//glDebugMessageControl(GL_DONT_CARE,GL_DONT_CARE,GL_DONT_CARE, 0, 0, GL_TRUE);
//glEnable(GL_DEBUG_OUTPUT);

That should give some detailed driver output.
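For readers who have not used the OpenGL debug output mechanism, here is a minimal sketch of what a callback of this kind can look like. It is not Supermodel's `DebugCallback`. The helper names `SeverityName`, `FormatDebugMessage` and `ExampleDebugCallback` are hypothetical, and the GL types and severity constants are redeclared locally only so that the sketch compiles without GL headers.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Local stand-ins so the sketch compiles without OpenGL headers. Real code
// would take these types and constants from the GL headers or loader.
using GLenum  = unsigned int;
using GLuint  = unsigned int;
using GLsizei = int;
using GLchar  = char;

constexpr GLenum kDebugSeverityHigh         = 0x9146; // GL_DEBUG_SEVERITY_HIGH
constexpr GLenum kDebugSeverityMedium       = 0x9147; // GL_DEBUG_SEVERITY_MEDIUM
constexpr GLenum kDebugSeverityLow          = 0x9148; // GL_DEBUG_SEVERITY_LOW
constexpr GLenum kDebugSeverityNotification = 0x826B; // GL_DEBUG_SEVERITY_NOTIFICATION

// Hypothetical helper: map a severity enum to a readable label.
const char* SeverityName(GLenum severity) {
    switch (severity) {
        case kDebugSeverityHigh:         return "HIGH";
        case kDebugSeverityMedium:       return "MEDIUM";
        case kDebugSeverityLow:          return "LOW";
        case kDebugSeverityNotification: return "NOTIFICATION";
        default:                         return "UNKNOWN";
    }
}

// Hypothetical helper: build one log line from the callback arguments.
std::string FormatDebugMessage(GLuint id, GLenum severity, const GLchar* message) {
    return std::string("[GL ") + SeverityName(severity) + "] id=" +
           std::to_string(id) + ": " + message;
}

// Same parameter list as the GLDEBUGPROC callback type. On Windows the real
// callback also needs the APIENTRY calling convention, which is omitted here.
void ExampleDebugCallback(GLenum /*source*/, GLenum /*type*/, GLuint id,
                          GLenum severity, GLsizei /*length*/,
                          const GLchar* message, const void* /*userParam*/) {
    std::fprintf(stderr, "%s\n", FormatDebugMessage(id, severity, message).c_str());
}
```

Printing the severity next to each message makes it easier to separate harmless notifications from real driver errors when reading the log.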

@toxieainc
Contributor Author

From my understanding, the OS should handle a shader recompile differently from the GPU not responding (the latter prevents the screen from updating, etc., while the former 'just' blocks some CPU thread(s)).

I will debug and find the commit when I'm back at that system (it will take some weeks though :/).

@dukeeeey
Collaborator

dukeeeey commented Aug 5, 2024

Well, OpenGL commands are essentially issued on a single thread (the one with the current context). If SwapBuffers takes too long (I think it's usually a second or two), Windows assumes it has died and kills it. I know this can happen if you render very large datasets and it simply takes too long, but it really shouldn't happen in normal rendering. Uniforms are essentially constants per draw call, and I know some vendors optimise the shaders and essentially recompile them based on different inputs. I don't know if this is happening here; it's just speculation.

Really the best option is to enable those debug options and the driver will hopefully tell us what the issue is.

@dukeeeey
Collaborator

dukeeeey commented Aug 5, 2024

quad rendering?

@toxieainc
Contributor Author

No, those changes/commits all worked fine, and so did builds from some months after that.

As I said, I will know more when I have access to the system again later on.

@crashGG

crashGG commented Aug 10, 2024

Using Windows 10 22H2, an AMD 7840HS (=integrated graphics) and the latest official AMD drivers (whql-amd-software-adrenalin-edition-24.7.1), I tested Daytona2PE and Scud Race with dd90d0e for about 2 hours. Everything works fine; no issues found.

My ini settings:
QuadRendering = true
WideScreen = true
Stretch = false
WideBackground = true
XResolution = 2560
YResolution = 1440
FullScreen = 1
RefreshRate = 57.524
LegacySoundDSP = false

@toxieainc
Contributor Author

Finally back at the setup, found the commit in question:
6f40953 (2023-12-04) works
33b84c8 (2023-12-22) doesn't

@dukeeeey
Collaborator

dukeeeey commented Sep 3, 2024

It might be the depth-stencil format: it's a 32-bit float depth but with an 8-bit stencil. It has 24 bits of padding, so it actually works out as a 64-bit type. I can't see what else would cause issues.
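The size arithmetic above can be sketched as follows. This assumes the client-side layout of OpenGL's `GL_FLOAT_32_UNSIGNED_INT_24_8_REV` pixel type (a float depth word followed by a word whose low 8 bits carry the stencil value). The struct and function names are hypothetical, and how a driver stores the texels internally is implementation-defined, so the numbers describe the readback layout only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of one depth-stencil texel as read back with the
// GL_FLOAT_32_UNSIGNED_INT_24_8_REV pixel type: a 32-bit float depth followed
// by a 32-bit word whose low 8 bits hold the stencil value and whose upper
// 24 bits are unused padding.
struct DepthStencilTexel {
    float         depth;
    std::uint32_t stencilAndPadding;
};

// 32 + 8 + 24 bits of padding = 64 bits per texel.
static_assert(sizeof(DepthStencilTexel) == 8, "expected a 64-bit texel");

// Extract the 8 stencil bits from the second word.
inline std::uint8_t StencilOf(const DepthStencilTexel& t) {
    return static_cast<std::uint8_t>(t.stencilAndPadding & 0xFFu);
}

// Bytes needed for a width x height buffer with the given texel size.
constexpr std::size_t BufferBytes(std::size_t width, std::size_t height,
                                  std::size_t bytesPerTexel) {
    return width * height * bytesPerTexel;
}
```

As an illustration, at 2560x1440 (the resolution from the ini settings quoted earlier in this thread) `BufferBytes(2560, 1440, sizeof(DepthStencilTexel))` is 29,491,200 bytes, twice the 14,745,600 bytes of a 32-bit format, before any supersampling is applied.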

@dukeeeey
Collaborator

dukeeeey commented Sep 3, 2024

Try halving the resolution and see if that makes any difference.

@toxieainc
Contributor Author

I will experiment a bit. But so far it makes no sense, because it's not like it's running significantly slower up until the hang/timeout.
So there is no indication of why this should happen.

@toxieainc
Contributor Author

toxieainc commented Sep 4, 2024

While staring at the code, a tiny observation: are a depth and stencil buffer still needed when creating the SDL window in Main.cpp (at least for the New3D renderer)? (EDIT: I have now filed a PR for this.)

@crashGG

crashGG commented Sep 4, 2024

Win11 is not friendly to AMD CPUs, which often causes the driver to stop responding. The currently known Win11 tweaks needed when using AMD processors include: turning off the High Precision Event Timer (formerly Multimedia Timer), turning off Fullscreen Optimizations (GameDVR_FSEBehaviorMode), and logging in with an administrator account to play games. In addition, the latest update, KB5041587, can bring about a 10% performance improvement to AMD Zen 3/4/5.

@toxieainc
Contributor Author

Oh my... :)
Thanks, I will give that a try, too.

@toxieainc
Contributor Author

I already had KB5041587 installed for my latest tests.

As for the 40-bit (so potentially 32+8 or 64-bit) depth/stencil format: changing it (and the readback) back to 32 bits did not improve anything. If anything, in my experience this makes the machine bluescreen instead of 'just' timing out.
BUT I did these tests with the current master, so I will have to do the same with the old revision.

@dukeeeey
Collaborator

dukeeeey commented Sep 4, 2024

Could you try glDebugMessageCallback(DebugCallback
Just uncomment it in the code.

@toxieainc
Contributor Author

toxieainc commented Sep 29, 2024

Some more updates here:

  1. Enabling GL debugging returned nothing suspicious whatsoever (both on the current NV driver and on the problematic AMD 5700U with the current driver), just a harmless warning.
  2. Using my own builds (latest VS2019 or VS2022 versions) shows the exact same behavior: before the reversed z-buffer change all is fine, after it there is a hang/crash.
  3. Trying different SDL2 versions (including the latest) also made no difference.
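
For context on the reversed z-buffer change named in item 2: a reversed z-buffer maps the near plane to depth 1 and the far plane to depth 0, which pairs well with a float depth buffer because floats are densest near 0. The sketch below is a generic numeric illustration, not Supermodel's code. It assumes a standard [0,1] perspective depth mapping, and the near/far values and function names are arbitrary choices for the example.

```cpp
#include <cassert>
#include <cstdio>

// Conventional [0,1] depth: near plane -> 0, far plane -> 1.
// Computed in double, then rounded to float to model a 32-bit float buffer.
float StandardDepth(double z, double nearZ, double farZ) {
    return static_cast<float>((farZ / (farZ - nearZ)) * (1.0 - nearZ / z));
}

// Reversed depth: near plane -> 1, far plane -> 0 (equal to 1 - standard).
float ReversedDepth(double z, double nearZ, double farZ) {
    return static_cast<float>((nearZ / (farZ - nearZ)) * (farZ / z - 1.0));
}

// Print both encodings for two distant surfaces that are one unit apart.
void PrintComparison() {
    const double nearZ = 0.1, farZ = 100000.0;  // arbitrary example planes
    const double a = 50000.0, b = 50001.0;
    std::printf("standard: %.9g vs %.9g\n",
                StandardDepth(a, nearZ, farZ), StandardDepth(b, nearZ, farZ));
    std::printf("reversed: %.9g vs %.9g\n",
                ReversedDepth(a, nearZ, farZ), ReversedDepth(b, nearZ, farZ));
}
```

With these example planes, the two distant surfaces round to the same float in the standard mapping (values close to 1 are spaced about 6e-8 apart) but stay distinct in the reversed mapping.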

@toxieainc
Contributor Author

...but while debugging I found a fix for another weird behavior I saw on that machine (micro stutter), so I will file a PR for that one later on.

@toxieainc
Contributor Author

Small update: I backported a lot of the newer commits onto the state from before the reversed-z change, and everything is still fine. So it's somehow really linked to that specific change.
Next step: try to split up that commit into smaller pieces and see what exactly makes the AMD driver/HW break.

@dukeeeey
Collaborator

If it's not the frame buffer, maybe it's related to the scene graph, because the culling code was rewritten. Maybe something bad is happening in the culling code that leads to an abnormally large render load. Just speculation. Honestly, I can't think what else it could be. I know that if the CPU clock is too low it can try to render incomplete frames.
