ABI-incompatible function pointers cause mayhem #19

Open
CoffeeImpliesCode opened this issue Feb 14, 2025 · 0 comments
Labels
bug Something isn't working

The following two definitions are ABI-incompatible:

zglfw/src/zglfw.zig

Lines 1175 to 1176 in e9bd486

extern fn glfwGetWaylandWindow(window: *Window) ?*anyopaque;
fn _getWaylandWindow(_: *Window) ?*anyopaque {

The extern declaration implies callconv(.c), while the plain Zig function uses Zig's default calling convention.
This rears its ugly head in the WindowProvider struct from zgpu here:
https://github.com/zig-gamedev/zgpu/blob/bc10f874cf9c93e347c1298efba87be4f001fc9d/src/zgpu.zig#L21-L30

Surprisingly, the other functions do not seem to break, apparently by pure accident: fn_getWaylandDisplay, for example, works fine. But when called through
*const fn(window: *const anyopaque) ?*anyopaque
the getWaylandWindow function silently returns an incorrect pointer, which then causes a failure in a place like vkCreateWaylandSurfaceKHR or equivalent.
Here are my test results:

Directly calling the glfw functions (which are extern and implicitly callconv(.c)):
direct win:                  zglfw.Window@10aa8d80
direct getWaylandDisplay:    fn () callconv(.c) ?*anyopaque@1165c70
direct getWaylandWindow:     fn (*zglfw.Window) callconv(.c) ?*anyopaque@1165d60
direct result (wl_display):  anyopaque@10854bb0
direct result (wl_window):   anyopaque@10a39860

Calling the glfw functions through a WindowProvider:
provider win:                anyopaque@10aa8d80
provider getWaylandDisplay:  fn () ?*anyopaque@1165c70
provider getWaylandWindow:   fn (*const anyopaque) ?*anyopaque@1165d60
provider result(wl_display): anyopaque@10854bb0
provider result(wl_window):  anyopaque@7ffe6a18b3e1 <<----- evil UB here

Notice how the provider wl_window points to garbage, probably somewhere on the stack.
Changing the definition of WindowProvider to use the correct callconv removes this bug.

Of course, this problem probably extends to all platform functions in WindowProvider.
But since this is UB, it might even happen to work on some machines...

So tl;dr: these pointers should either all be declared callconv(.c), or the extern functions need wrapper functions in Zig-land.
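
To illustrate the two options, here is a minimal sketch, assuming Zig 0.14 or newer (for the `callconv(.c)` spelling). All names in it (`fakeGetWaylandWindow`, `ProviderC`, `ProviderZig`, `getWaylandWindowWrapper`) are hypothetical stand-ins and not the actual zglfw or zgpu definitions; the fake function simply echoes its argument so the result can be checked.

```zig
const std = @import("std");

// Hypothetical stand-in for an extern C function such as glfwGetWaylandWindow.
// It uses the C calling convention, as an extern declaration would.
fn fakeGetWaylandWindow(window: *const anyopaque) callconv(.c) ?*anyopaque {
    return @constCast(window);
}

// Option 1: the function pointer field itself declares callconv(.c),
// so the C function can be stored without any cast.
const ProviderC = struct {
    window: *const anyopaque,
    fn_getWaylandWindow: *const fn (window: *const anyopaque) callconv(.c) ?*anyopaque,
};

// Option 2: the field keeps Zig's default calling convention and the
// C function is reached through a wrapper written in Zig.
const ProviderZig = struct {
    window: *const anyopaque,
    fn_getWaylandWindow: *const fn (window: *const anyopaque) ?*anyopaque,
};

fn getWaylandWindowWrapper(window: *const anyopaque) ?*anyopaque {
    return fakeGetWaylandWindow(window);
}

test "both options hand back the pointer unchanged" {
    const dummy: u8 = 0;
    const expected = @intFromPtr(&dummy);

    const c = ProviderC{ .window = &dummy, .fn_getWaylandWindow = &fakeGetWaylandWindow };
    const z = ProviderZig{ .window = &dummy, .fn_getWaylandWindow = &getWaylandWindowWrapper };

    try std.testing.expect(@intFromPtr(c.fn_getWaylandWindow(c.window).?) == expected);
    try std.testing.expect(@intFromPtr(z.fn_getWaylandWindow(z.window).?) == expected);
}
```

In both variants the pointer type matches the function it points to, so no pointer cast is needed and the compiler can reject a mismatched assignment. The sketch can be run with `zig test`.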

hazeycode added the bug label on Feb 16, 2025