vulkan: linux builds + small subgroup size fixes #11767
base: master
l_warptile = { 128, 128, 128, 16, device->subgroup_size * 2, 64, 2, tm_l, tn_l, tk_l, device->subgroup_size };
m_warptile = { 128, 64, 64, 16, device->subgroup_size, 32, 2, tm_m, tn_m, tk_m, device->subgroup_size };
s_warptile = { subgroup_size_16, 32, 32, 16, 32, 32, 2, tm_s, tn_s, tk_s, device->subgroup_size };
l_warptile = { 128, 128, 128, 16, subgroup_size_8 * 2, 64, 2, tm_l, tn_l, tk_l, subgroup_size_8 };
I imagine the coopmat path doesn't handle faking the subgroup size, maybe add an assert to that effect? Coopmat implementations probably have at least 8 invocations per subgroup, so this seems fine.
Does our coopmat shader even work with a subgroup size of 8? We should probably find the actual limit and set up the assert based on that.
Honestly I don't know exactly why the regular mul_mat shaders break down with a subgroup size less than 8, but with the Vulkan backend becoming more and more popular I'd rather have it run slowly than fail mysteriously on those devices.
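For illustration, a minimal sketch of the kind of guard being discussed. Everything here is hypothetical: the function names are made up, and the lower bound of 8 is only the guess from this thread, since the actual limit for the coopmat shader has not been determined yet.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: 8 is a placeholder taken from the discussion, not a measured limit.
constexpr uint32_t kAssumedCoopmatMinSubgroupSize = 8;

// Hypothetical helper: true when the real subgroup size is large enough
// for the coopmat path, which does not fake a bigger subgroup size.
inline bool coopmat_subgroup_size_ok(uint32_t subgroup_size,
                                     uint32_t min_size = kAssumedCoopmatMinSubgroupSize) {
    return subgroup_size >= min_size;
}

// Hypothetical guard: only the coopmat path is checked. The regular
// mul_mat path is expected to cope with small sizes by clamping instead.
inline void check_coopmat_subgroup_size(bool coopmat_enabled, uint32_t subgroup_size) {
    assert(!coopmat_enabled || coopmat_subgroup_size_ok(subgroup_size));
}
```

A guard like this would fail loudly at pipeline setup instead of letting the shader produce wrong results on a device that reports a smaller subgroup.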
The warptile parameters are not independent. There is probably a minimum there, coming from the hardcoded values.
Vulkan requires either the SDK or additional packages to build on Linux, so let's release official binaries so people can easily try it out.
Meanwhile our mat mul shaders don't work with subgroup sizes smaller than 8. With this fix all tests are passing even with device->subgroup_size forced to 1.
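The clamping idea behind the fix can be sketched roughly as below. The name subgroup_size_8 and the l_warptile layout are taken from the diff. The helper functions, and the assumption that the clamped value is simply the larger of the real subgroup size and the minimum, are illustrative and not the actual patch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: the subgroup size handed to the shader, never below `minimum`.
// ASSUMPTION: clamping is a plain max(); the real code may differ.
inline uint32_t clamped_subgroup_size(uint32_t actual, uint32_t minimum) {
    return std::max(actual, minimum);
}

// Hypothetical stand-in for building the large warptile as in the diff.
// tm_l, tn_l and tk_l are passed through unchanged.
inline std::array<uint32_t, 11> make_l_warptile(uint32_t device_subgroup_size,
                                                uint32_t tm_l, uint32_t tn_l, uint32_t tk_l) {
    assert(device_subgroup_size > 0);  // a subgroup has at least one invocation
    const uint32_t subgroup_size_8 = clamped_subgroup_size(device_subgroup_size, 8);
    return { 128, 128, 128, 16, subgroup_size_8 * 2, 64, 2, tm_l, tn_l, tk_l, subgroup_size_8 };
}
```

With a device subgroup size of 1, the shader would see 8 (and 16 for the doubled entry); devices that already report 8 or more are unaffected.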