Could not initialize NNPACK! Reason: Unsupported hardware on supported CPU #221

Open
MatPoliquin opened this issue Sep 29, 2024 · 2 comments

Comments

@MatPoliquin

MatPoliquin commented Sep 29, 2024

I get this error when running PyTorch '2.4.1+cu121':
[W929 20:27:52.827045770 NNPACK.cpp:61] Could not initialize NNPACK! Reason: Unsupported hardware.

My specs:

  • AMD Ryzen 9 7950X
  • Ubuntu 22.04 VM running on Proxmox VE 8.2.2
  • The CPU type in Proxmox is set to "host", so the host's full CPU feature set is exposed to the VM

As you can see from the lscpu output below, the CPU supports AVX2 and has an L3 cache:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 1
Core(s) per socket: 32
Socket(s): 1
Stepping: 2
BogoMIPS: 8983.08
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw perfctr_core ssbd ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr wbnoinvd arat npt lbrv nrip_save tsc_scale vmcb_clean flushbyasid pausefilter pfthreshold v_vmsave_vmload vgif vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm flush_l1d arch_capabilities
Virtualization features:
Virtualization: AMD-V
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 2 MiB (32 instances)
L1i: 2 MiB (32 instances)
L2: 16 MiB (32 instances)
L3: 512 MiB (32 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Vulnerable: Safe RET, no microcode
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Srbds: Not affected
Tsx async abort: Not affected
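
For anyone who wants to check the flags programmatically rather than by eye, here is a minimal sketch in Python. It only parses lscpu-style text and reports which features from a given list are missing. The helper names (`parse_cpu_flags`, `missing_features`) and the default required set (`avx2`, `fma`) are assumptions for illustration, not a documented list of what NNPACK checks, and a flag being present here does not guarantee that NNPACK's own detection will succeed.

```python
def parse_cpu_flags(lscpu_text):
    """Return the set of CPU flags found on the 'Flags:' line of lscpu-style text.

    Assumes the flags sit on a single line (pipe lscpu to a file to avoid
    terminal wrapping). Returns an empty set if no 'Flags:' line is found.
    """
    for line in lscpu_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, required=("avx2", "fma")):
    """Return the required features absent from `flags`, sorted by name.

    The default `required` tuple is a hypothetical example, not NNPACK's
    actual requirement list.
    """
    return sorted(f for f in required if f not in flags)


if __name__ == "__main__":
    # Shortened sample in the same format as the lscpu output above.
    sample = (
        "Architecture: x86_64\n"
        "Flags: fpu sse sse2 fma avx avx2 avx512f\n"
    )
    flags = parse_cpu_flags(sample)
    print("missing:", missing_features(flags))
```

On a real machine, the text could come from running `lscpu` and saving its output to a file, then passing the file contents to `parse_cpu_flags`.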

@mhkarimi1383

Hi,
I'm having the same issue, and there are no docs that indicate which CPU features are required.

@kirillmeisser

Hi everyone,

We were also facing the same issue with the same setup as you, @MatPoliquin, and had tests that randomly failed on our runners. It turns out that the cause is AMD CPUs. From our observations, NNPACK does not seem to support AMD's implementation of the AVX2 instructions: it simply can't find them, even though the flags are clearly there if you run the lscpu command. Ultimately, we found that our runners with Intel CPUs (which also supported AVX2 instructions) had no problem running models that used NNPACK optimizations. It would be stellar if NNPACK could support the newer AMD CPUs, since more and more people are switching sides. Could that potentially be of interest to the project, @Maratyszcza?
