Reimplement cuDeviceGetLuid to report proper LUIDs #8

Merged 1 commit on Dec 12, 2024

Saancreed
Contributor

I was debugging why DLSS Frame Generation was crashing Indiana Jones and the Great Circle (initially this happened because of unimplemented function nvcuda.dll.cuInit called in 64-bit code). After dropping nvcuda from this repo into Proton I ran into this, where the game was unable to match devices by LUID. Unfortunately, fixing it does not appear to be enough to make DLFG work, but I decided to fix it along the way anyway.
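For readers unfamiliar with the LUID matching mentioned above, here is a minimal sketch of what such matching could look like on the application side. It assumes the application compares the 8 bytes written by cuDeviceGetLuid against the 8-byte deviceLUID reported by Vulkan in VkPhysicalDeviceIDProperties; the helper names luid_equal and find_device_by_luid are hypothetical, and this is not taken from the game or from Streamline.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LUID_SIZE 8 /* same value as VK_LUID_SIZE */

/* A LUID is an opaque 8-byte identifier, so a raw byte comparison between
 * the buffer filled by cuDeviceGetLuid and the deviceLUID array reported by
 * Vulkan is all that is needed. */
static bool luid_equal(const char cuda_luid[LUID_SIZE],
                       const uint8_t vk_luid[LUID_SIZE])
{
    return memcmp(cuda_luid, vk_luid, LUID_SIZE) == 0;
}

/* Hypothetical helper: return the index of the first Vulkan device whose
 * LUID matches the CUDA one, or -1 if no device matches. */
static int find_device_by_luid(const char cuda_luid[LUID_SIZE],
                               const uint8_t (*vk_luids)[LUID_SIZE],
                               size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (luid_equal(cuda_luid, vk_luids[i]))
            return (int)i;
    }
    return -1;
}
```

If the CUDA side reports a placeholder LUID, a lookup like this returns -1 for every device, which would be consistent with the symptom described above.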

@Saancreed
Contributor Author

As a side note, the dlssg snippet that ships with the game imports the following symbols from nvcuda:

  • cuDestroyExternalSemaphore
  • cuMipmappedArrayGetLevel
  • cuWaitExternalSemaphoresAsync
  • cuSignalExternalSemaphoresAsync
  • cuImportExternalSemaphore
  • cuStreamDestroy_v2
  • cuStreamCreate
  • cuCtxGetCurrent
  • cuCtxPopCurrent_v2
  • cuMipmappedArrayDestroy
  • cuCtxDestroy_v2
  • cuCtxCreate_v2
  • cuDeviceGetAttribute
  • cuDeviceGetLuid
  • cuDeviceGetCount
  • cuDeviceGet
  • cuDriverGetVersion
  • cuInit
  • cuGetErrorName
  • cuImportExternalMemory
  • cuExternalMemoryGetMappedMipmappedArray
  • cuMemcpy2DAsync_v2
  • cuSurfObjectDestroy
  • cuDestroyExternalMemory
  • cuSurfObjectCreate
  • cuCtxPushCurrent_v2

but all I see in the logs are a few repeated cycles of

176253.687:0120:02a8:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\nvofapi64.dll" at 00006FFFF2F70000: native
176253.689:0120:02a8:trace:nvcuda:wine_cuInit (0)
176253.689:0120:02a8:trace:nvcuda:wine_cuDriverGetVersion (0x835ed868)
176253.691:0120:02a8:trace:nvcuda:wine_cuDeviceGetCount (0x835ed858)
176253.691:0120:02a8:trace:nvcuda:wine_cuDeviceGet (0xa4098ec0, 0)
176253.691:0120:02a8:trace:nvcuda:wine_cuDeviceGetLuid (0x835ed870, 0x835ed860, 0)
176253.692:0120:02a8:trace:nvcuda:wine_cuDeviceGetLuid Found LUID: f3030000-00000000
176253.693:0120:02a8:trace:nvcuda:wine_cuDeviceGetAttribute (Device: 0, Attribute: 114) Value: (1)
176253.693:0120:02a8:trace:nvcuda:wine_cuCtxCreate_v2 (0xa4098ec8, 0, 0)
176253.823:0120:02a8:trace:nvcuda:wine_cuStreamCreate (0xa4098ed0, 0)
176253.823:0120:02a8:trace:nvcuda:wine_cuStreamCreate (0xa4098ed8, 0)
176253.823:0120:02a8:trace:nvcuda:wine_cuCtxPopCurrent_v2 (0xa4098ec8)
176253.831:0120:02a8:trace:nvcuda:wine_cuStreamDestroy_v2 (0x7980050ad890)
176253.831:0120:02a8:trace:nvcuda:wine_cuStreamDestroy_v2 (0x7980050ad8b0)
176253.831:0120:02a8:trace:nvcuda:wine_cuCtxDestroy_v2 (0x7980046fa0c0)
176253.891:0120:02a8:trace:loaddll:free_modref Unloaded module L"C:\\windows\\system32\\nvofapi64.dll" : native

before the game ultimately hangs on a black screen with the only interesting thing logged around that time being

176254.982:0120:02a8:trace:nvcuda:wine_cuCtxCreate_v2 (0xa4098ec8, 0, 0)
176255.063:0120:02a8:trace:nvcuda:wine_cuStreamCreate (0xa4098ed0, 0)
176255.063:0120:02a8:trace:nvcuda:wine_cuStreamCreate (0xa4098ed8, 0)
176255.063:0120:02a8:trace:nvcuda:wine_cuCtxPopCurrent_v2 (0xa4098ec8)
176255.063:0120:01b4:warn:debugstr:OutputDebugStringA "[19] WARNING: Streamline: Vulkan API Error VK_ERROR_OUT_OF_DATE_KHR\n"

@SveSop
Owner

SveSop commented Dec 11, 2024

Nice.. I will look at this implementation of LUID.

You could try checking the return codes from cuStreamCreate or cuCtxCreate_v2, if you have not done so already, in case one of them actually returns an error that is not logged in the game log. Most of the cu** functions in nvcuda just relay the call directly without error checking, since the overhead of those checks would likely outweigh the benefits..

e.g.

CUresult WINAPI wine_cuCtxCreate_v2(CUcontext *pctx, unsigned int flags, CUdevice dev)
{
    CUresult ret;
    TRACE("(%p, %u, %d)\n", pctx, flags, dev);  /* CUdevice is a signed int */
    ret = pcuCtxCreate_v2(pctx, flags, dev);
    TRACE("Returned: %d\n", ret);  /* log the result so failures show up in the trace */
    return ret;
}
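If the concern is the overhead of checking every call, one possible compromise is to log only when a call fails. The following is a rough, self-contained sketch of that idea, not code from this repo: it uses fprintf instead of Wine's TRACE, a plain int instead of CUresult, and the names log_if_failed and CHECKED are hypothetical.

```c
#include <stdio.h>

/* Stand-in for CUresult so the sketch compiles without cuda.h;
 * CUDA_SUCCESS is 0 in the driver API. */
typedef int sketch_result;

/* Log only when the call failed, then hand the result back unchanged.
 * On the success path the extra cost is a single comparison. */
static sketch_result log_if_failed(const char *call_text, sketch_result ret)
{
    if (ret != 0)
        fprintf(stderr, "%s returned: %d\n", call_text, ret);
    return ret;
}

/* Wraps an expression, evaluates it exactly once, and passes its source
 * text along for the log message. */
#define CHECKED(call) log_if_failed(#call, (call))
```

A wrapper could then end with something like `return CHECKED(pcuCtxCreate_v2(pctx, flags, dev));` while staying silent when everything succeeds.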

I suppose there are no other demos that use framegen like this currently.. I do have Cyberpunk 2077 though, but it might have other issues with this..

VK_ERROR_OUT_OF_DATE_KHR is somewhat odd, I think 😏

Now we just need to figure out when swap chain recreation is necessary and call our new recreateSwapChain function. Luckily, Vulkan will usually just tell us that the swap chain is no longer adequate during presentation. The vkAcquireNextImageKHR and vkQueuePresentKHR functions can return the following special values to indicate this.

VK_ERROR_OUT_OF_DATE_KHR: The swap chain has become incompatible with the surface and can no longer be used for rendering. Usually happens after a window resize.
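As a rough illustration of the handling described in the quoted text, the decision can be sketched as a small classification function. This is a sketch under stated assumptions: the result values are mirrored from the Vulkan headers so it compiles without them, the SKETCH_ names, frame_action and classify_swapchain_result are hypothetical, and it does not describe what the game or Streamline actually do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result codes mirrored from the Vulkan headers so that the sketch
 * compiles without the Vulkan SDK (the SKETCH_ names are hypothetical). */
enum {
    SKETCH_VK_SUCCESS               = 0,
    SKETCH_VK_SUBOPTIMAL_KHR        = 1000001003,
    SKETCH_VK_ERROR_OUT_OF_DATE_KHR = -1000001004
};

typedef enum {
    FRAME_CONTINUE,
    FRAME_RECREATE_SWAPCHAIN,
    FRAME_FATAL
} frame_action;

/* Decide what to do with the result of vkAcquireNextImageKHR
 * (after_present == false) or vkQueuePresentKHR (after_present == true). */
static frame_action classify_swapchain_result(int32_t result, bool after_present)
{
    switch (result) {
    case SKETCH_VK_SUCCESS:
        return FRAME_CONTINUE;
    case SKETCH_VK_ERROR_OUT_OF_DATE_KHR:
        /* The swap chain can no longer be used: recreate it and retry. */
        return FRAME_RECREATE_SWAPCHAIN;
    case SKETCH_VK_SUBOPTIMAL_KHR:
        /* Still usable: finish the frame if an image was already acquired,
         * recreate once the frame has been presented. */
        return after_present ? FRAME_RECREATE_SWAPCHAIN : FRAME_CONTINUE;
    default:
        /* Everything else is treated as fatal in this simplified sketch. */
        return FRAME_FATAL;
    }
}
```

The point is that VK_ERROR_OUT_OF_DATE_KHR on its own is recoverable: an application following this pattern recreates the swap chain and carries on.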

This could of course indicate that the CUDA context is used to present some image, and the data is somehow "incompatible".

Anything interesting logged from dxvk or vkd3d? I would not think it actually has anything to do with resizing, but rather that some image data returned from this CUDA context is iffy compared to what Vulkan expects, perhaps....

@Saancreed
Contributor Author

You could try checking the return codes from cuStreamCreate or cuCtxCreate_v2, if you have not done so already, in case one of them actually returns an error that is not logged in the game log. Most of the cu** functions in nvcuda just relay the call directly without error checking, since the overhead of those checks would likely outweigh the benefits..

I'm not actually sure if anything there is erroring out, it looks closer to not matching the game's expectations… or it does match them but for some strange reason it performs this sequence multiple times.

On the other hand, judging from jp7677/dxvk-nvapi#234 (comment), maybe it isn't nvcuda's fault at all and my setup is just cursed… Maybe it would just work with another Wayland compositor, or Gamescope, or an X11 session.

VK_ERROR_OUT_OF_DATE_KHR is somewhat odd, I think 😏

Yes, it's logged as a warning and not a fatal error. The game should be able to just recreate the swapchain and continue, but something, possibly unrelated to nvcuda at all, fails, so I just end up with a black screen.

Anything interesting logged from dxvk or vkd3d?

Not really, the game doesn't call into them on its own, so vkd3d is completely unused and dxvk is used only indirectly from dxvk-nvapi, as far as I can tell.

@SveSop
Owner

SveSop commented Dec 11, 2024

Not really, the game doesn't call into them on its own, so vkd3d is completely unused and dxvk is used only indirectly from dxvk-nvapi, as far as I can tell.

I might be misunderstanding you a bit maybe? I mean, if you even get a flicker of graphics it would be dxvk that would do that work. Just wondering how the cuda context ties together with d3d usage - i.e. dxvk/d3d gets some image data -> DLSS uses cuda to do some magic with that image data -> dxvk displays said image data as supposedly something viewable. And if the data (cuda context) that comes back from nvcuda is somewhat garbage or has a format not supported by vulkan it will break horribly.

Might be similar to the issue with War Thunder where dxvk will crash when enabling DLSS - supposedly due to DLSS doing something that vulkan does not support. (Don't have the details other than "illegal and expected to fail")

That's why I was wondering if DXVK_LOG_LEVEL=debug would produce something interesting like this: jp7677/dxvk-nvapi#171 (comment)

@Saancreed
Contributor Author

I might be misunderstanding you a bit maybe? I mean, if you even get a flicker of graphics it would be dxvk that would do that work.

Not necessarily. I can literally disable all of d3d8, d3d9, d3d10core, d3d11, dxgi (plus d3d12 and d3d12core) and wine vkcube.exe still outputs what I'd consider to be graphics on my screen. You can create a Win32 VkSurface without ever loading DXGI; only on Windows does the Vulkan loader call into it, for purposes mostly related to physical device enumeration.

Yeah okay, in this case Streamline does call into DXGI to enumerate adapters but the game never creates any D3D device.

Just wondering how the cuda context ties together with d3d usage - i.e. dxvk/d3d gets some image data -> DLSS uses cuda to do some magic with that image data -> dxvk displays said image data as supposedly something viewable.

There is no D3D usage here.

And if the data (cuda context) that comes back from nvcuda is somewhat garbage or has a format not supported by vulkan it will break horribly.

I doubt the game (or Streamline) is buggy enough to misuse NVX extensions like that.

Might be similar to the issue with War Thunder where dxvk will crash when enabling DLSS - supposedly due to DLSS doing something that vulkan does not support. (Don't have the details other than "illegal and expected to fail")

The War Thunder case sounds like a genuine game bug. Here, I have no idea what DLFG is trying to achieve, or why it would call into nvcuda instead of just using VK_NV_optical_flow like other similar games do. In that case I'd expect to see a call to NvOFAPICreateInstanceVk logged here, but if it's looking for NvOFAPICreateInstanceCuda then it won't be able to find it. Still, no idea why it would when the Vulkan flavor is right there, next to the CUDA one.

That's why I was wondering if DXVK_LOG_LEVEL=debug would produce something interesting

The game never creates any D3D device so DXVK_LOG_LEVEL=debug is going to be useless. Only WINEDEBUG=+vulkan can save you here, and that generates a ton of log lines to sift through. Or maybe Vulkan Validation Layers but I doubt they can verify proper DLSS usage.

@SveSop
Owner

SveSop commented Dec 11, 2024

So.. it's a Vulkan game then? I did not know..

@Saancreed
Contributor Author

Yup.

@Saancreed
Contributor Author

I did the rename, and also changed the error code to the more generic CUDA_ERROR_UNKNOWN. CUDA_ERROR_NOT_SUPPORTED doesn't feel too appropriate when the cause was just a failure to look it up in the registry.
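To illustrate the error-code choice, here is a simplified sketch of the mapping. It is not the code from this PR: the registry lookup is replaced by a hypothetical callback, the names get_luid_sketch, luid_lookup_fn and the stub_ functions are made up for the example, and the numeric result codes are mirrored from the CUDA driver API so the sketch compiles without cuda.h.

```c
#include <stddef.h>
#include <string.h>

/* Numeric values of the CUDA driver API result codes, mirrored here so
 * the sketch compiles without cuda.h (the SKETCH_ names are hypothetical). */
enum {
    SKETCH_CUDA_SUCCESS             = 0,
    SKETCH_CUDA_ERROR_INVALID_VALUE = 1,
    SKETCH_CUDA_ERROR_UNKNOWN       = 999
};

/* Hypothetical lookup callback standing in for the registry query:
 * fills out[8] and returns 0 on success, non-zero on failure. */
typedef int (*luid_lookup_fn)(int dev, char out[8]);

/* Two stub lookups so that both paths can be exercised. */
static int stub_lookup_ok(int dev, char out[8])
{
    (void)dev;
    memset(out, 0xab, 8);
    return 0;
}

static int stub_lookup_fail(int dev, char out[8])
{
    (void)dev;
    (void)out;
    return -1;
}

static int get_luid_sketch(char *luid, int dev, luid_lookup_fn lookup)
{
    if (!luid || !lookup)
        return SKETCH_CUDA_ERROR_INVALID_VALUE;
    /* A failed lookup says nothing about whether the feature is supported,
     * so report a generic failure rather than "not supported". */
    if (lookup(dev, luid) != 0)
        return SKETCH_CUDA_ERROR_UNKNOWN;
    return SKETCH_CUDA_SUCCESS;
}
```

With stub_lookup_fail the caller sees 999 (CUDA_ERROR_UNKNOWN), which signals "something went wrong" without claiming that LUIDs are unsupported.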

Owner

@SveSop SveSop left a comment

Will sort things later 👍

@@ -3651,23 +3652,122 @@ CUresult WINAPI wine_cuDeviceGetUuid(CUuuid *uuid, CUdevice dev)
else return CUDA_ERROR_INVALID_VALUE;
}

/* code borrowed from Wine starts here */
Owner

Probably not an overly important comment, but I would like these "non cu functions" to be moved up around

To somewhat make an attempt to keep the cu functions by themselves, and put these helper functions along with the others :) But I will do some cleanup.. There is no linting here, as you can clearly see 🤣 🤣

Contributor Author

Feel free to move this around as you see fit 🙂

I don't have any preferences myself, I just dumped them here because that was the easiest thing to do 🙈

Owner

Already did : 74a6c29
👍

@SveSop SveSop merged commit 2a21674 into SveSop:devel Dec 12, 2024
@Saancreed Saancreed deleted the cuDeviceGetLuid branch December 12, 2024 14:35