[bug] vstore fails when the thread count is less than 32 #9

Open
oywd opened this issue Sep 19, 2023 · 0 comments

oywd commented Sep 19, 2023

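In the CTS vstore_global case, when the thread count is less than 32, zeros are written to 0x90000000, so part of the output memory does not match the golden data.
This happens because the hardware processes the store with the mask set to all Fs, so all 32 threads process data. The last few threads have no data, so their computed address offset is 0, and they repeatedly write 0 to 0x90000000.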
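
To make the failure mode concrete, here is a minimal Python sketch of it. It is a toy model, not the hardware or the CTS test itself: the `vstore` function, the 4-byte element size, the value 20 for the thread count, and the rule that a lane without data contributes offset 0 and data 0 are all assumptions for illustration. Only the base address 0x90000000 and the all-F mask come from the report.

```python
BASE = 0x90000000   # address from the report
WARP_SIZE = 32      # lanes processed together by the hardware

def vstore(values, mask):
    """Toy model of a 32-lane store. Returns a dict of address -> value."""
    mem = {}
    for lane in range(WARP_SIZE):
        if not (mask >> lane) & 1:
            continue  # lane is masked off, nothing is written
        if lane < len(values):
            offset, data = 4 * lane, values[lane]
        else:
            # Assumption: a lane with no data ends up with offset 0 and data 0.
            offset, data = 0, 0
        mem[BASE + offset] = data
    return mem

values = list(range(1, 21))  # 20 threads have real data (hypothetical count)
golden = {BASE + 4 * i: v for i, v in enumerate(values)}

buggy = vstore(values, 0xFFFFFFFF)  # mask is all Fs, so all 32 lanes store
mismatches = sorted(addr for addr in golden if buggy[addr] != golden[addr])
print([hex(addr) for addr in mismatches])
```

In this model the only corrupted location is the base address, because every extra lane lands on offset 0 and overwrites the value that lane 0 stored there.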

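Possible fixes:
Use vblt to compare the thread id against the global size and generate the correct mask.
Alternatively, at the hardware level, derive the initial mask value from global_size - num_thread * N.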
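
The two proposals can be sketched side by side in Python. This is only an illustration under stated assumptions: `mask_by_compare` stands in for the per-lane compare that the issue proposes to do with vblt, and `mask_from_remaining` reads `num_thread` as the 32 lanes of a warp and `N` as the warp index, which the issue does not spell out. Both function names are hypothetical.

```python
WARP_SIZE = 32  # assumed meaning of num_thread in the issue

def mask_by_compare(warp_index, global_size):
    """Software-style fix: enable a lane only if its thread id < global size."""
    mask = 0
    for lane in range(WARP_SIZE):
        thread_id = warp_index * WARP_SIZE + lane
        if thread_id < global_size:
            mask |= 1 << lane
    return mask

def mask_from_remaining(warp_index, global_size):
    """Hardware-style fix: derive the initial mask from the remaining threads."""
    remaining = global_size - WARP_SIZE * warp_index  # global_size - num_thread * N
    active = max(0, min(WARP_SIZE, remaining))        # clamp to 0..32 lanes
    return (1 << active) - 1                          # low `active` bits set

# With 20 threads, only the low 20 lanes of warp 0 should be enabled.
print(hex(mask_by_compare(0, 20)), hex(mask_from_remaining(0, 20)))
```

Under these assumptions both approaches yield the same mask for every warp, so the choice between them is about where the work happens (extra instructions in the kernel versus extra logic at warp launch), not about the result.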

This needs further discussion before we decide which approach to take.
