sparse gemm kernels are not supported in ACL #1084
Comments
Hi @snadampal, thanks for raising this. We will discuss the feature request with the team.
Hi @snadampal, we discussed this with the team. We are considering exploring sparse tensor support in the context of GenAI, but this is not officially on the roadmap for ACL, and there are currently no plans to implement this feature. We would be interested in any specific use cases for ACL that you could share with us. Hope this helps.
Hi @morgolock, isn't ACL targeted at GenAI use cases? My requirement is mainly to accelerate sparse LLM inference with ACL GEMM kernels. Are you planning a different GEMM library for GenAI?
Hi @snadampal, apologies if I was not clear. We're exploring ways to accelerate GenAI workloads with ACL. This means that we may consider exploring sparse tensor support in ACL to accelerate these workloads in the future, but we don't have any work planned for this feature on our roadmap.
Could you please share more details about the models you would like to accelerate? Hope this helps.
Hi @morgolock, thanks for the clarification. I will share the details with you.
Hi @morgolock, any updates on the plan for sparse GEMM kernels in ACL? This is particularly interesting because sparse LLMs are able to match the performance of their dense counterparts on task-specific applications. For reference: https://arxiv.org/pdf/2310.06927
Hi @Arnav0400, there is no work planned to implement this feature at the moment. I'll discuss it again with the team.
Output of 'strings libarm_compute.so | grep arm_compute_version':
arm_compute_version=v23.11 Build options: {'Werror': '0', 'debug': '0', 'neon': '1', 'opencl': '0', 'embed_kernels': '0', 'os': 'linux', 'arch': 'armv8a', 'build': 'native', 'multi_isa': '1', 'fixed_format_kernels': '1', 'openmp': '1', 'cppthreads': '0'} Git hash=b'add70ace1e57f65d1ae4d0cedaec6e4578cf87ff'
Platform:
AWS c7g.16xl
Operating System:
Ubuntu 22.04
Problem description:
PyTorch supports sparse tensor formats. The request is to provide aarch64 GEMM kernels that accept these sparse-formatted tensors and perform a sparse GEMM, skipping the zero entries, to achieve better performance than the dense kernels.
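To make the request concrete, here is a minimal pure-Python sketch of a sparse-times-dense matrix multiply over a CSR (compressed sparse row) layout. It is only an illustration of the kind of computation being requested, not ACL code and not a proposed kernel design. The array names `crow_indices`, `col_indices`, and `values` follow the naming PyTorch uses for its CSR layout; the helper functions `dense_to_csr` and `csr_matmul` are hypothetical names introduced for this example.

```python
def dense_to_csr(matrix):
    """Convert a dense row-major matrix (list of lists) into CSR arrays."""
    crow_indices = [0]   # crow_indices[i]..crow_indices[i+1] spans row i
    col_indices = []     # column index of each stored non-zero
    values = []          # the stored non-zero values
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                col_indices.append(col)
                values.append(value)
        crow_indices.append(len(values))
    return crow_indices, col_indices, values


def csr_matmul(crow_indices, col_indices, values, dense, n_cols):
    """Compute (CSR matrix) @ (dense matrix), touching only stored non-zeros."""
    n_rows = len(crow_indices) - 1
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Iterate over the non-zeros of row i only.
        for k in range(crow_indices[i], crow_indices[i + 1]):
            a = values[k]
            b_row = dense[col_indices[k]]
            for j in range(n_cols):
                out[i][j] += a * b_row[j]
    return out


# Small usage example: a 2x3 sparse matrix times a 3x2 dense matrix.
A = [[1, 0, 0],
     [0, 0, 2]]
B = [[1, 2],
     [3, 4],
     [5, 6]]
crow, cols, vals = dense_to_csr(A)
C = csr_matmul(crow, cols, vals, B, n_cols=2)
```

The inner loops run once per stored non-zero rather than once per matrix element, which is where a sparse kernel would save work. A real aarch64 kernel would additionally need vectorization and a blocking strategy, and none of that is sketched here.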