You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I get this error when running pytorch '2.4.1+cu121':
[W929 20:27:52.827045770 NNPACK.cpp:61] Could not initialize NNPACK! Reason: Unsupported hardware.
My specs:
AMD Ryzen 7950X
Ubuntu 22.04 VM running on Proxmox VE 8.2.2
the cpu type in proxmox is set to "host" so the full specs are there
As you can see from the output of lscpu below it supports AVX2 and has L3 cache
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 1
Core(s) per socket: 32
Socket(s): 1
Stepping: 2
BogoMIPS: 8983.08
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse s
se2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_
freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsa
ve avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowpre
fetch osvw perfctr_core ssbd ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase tsc_adjust bmi1 av
x2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd
sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr wbnoinv
d arat npt lbrv nrip_save tsc_scale vmcb_clean flushbyasid pausefilter pfthreshold v_vmsave_vm
load vgif vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_
bitalg avx512_vpopcntdq rdpid fsrm flush_l1d arch_capabilities
Virtualization features:
Virtualization: AMD-V
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 2 MiB (32 instances)
L1i: 2 MiB (32 instances)
L2: 16 MiB (32 instances)
L3: 512 MiB (32 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Vulnerable: Safe RET, no microcode
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP disabled; RSB filling; PBRSB-eI
BRS Not affected; BHI Not affected
Srbds: Not affected
Tsx async abort: Not affected
The text was updated successfully, but these errors were encountered:
We were also facing the same issue with the same setup as you @MatPoliquin and had tests that randomly failed on our runners. It turns out that the cause are AMD CPUs. From our observations, it seems that NNPACK does not support the AMD version of the AVX2 instructions, it simply can't find them, even though, if you run the lscpu command the flags are clearly there. Ultimately, we found out that our runners with Intel CPUs (which supported AVX2 instructions) had no problem running models that used NNPACK optimizations. It would be stellar if NNPACK could support the newer AMD CPUs since more and more people are switching sides. Could that potentially be of interest to the project @Maratyszcza?
I get this error when running pytorch '2.4.1+cu121':
[W929 20:27:52.827045770 NNPACK.cpp:61] Could not initialize NNPACK! Reason: Unsupported hardware.
My specs:
As you can see from the output of lscpu below it supports AVX2 and has L3 cache
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7950X 16-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 1
Core(s) per socket: 32
Socket(s): 1
Stepping: 2
BogoMIPS: 8983.08
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse s
se2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_
freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsa
ve avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowpre
fetch osvw perfctr_core ssbd ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase tsc_adjust bmi1 av
x2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd
sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr wbnoinv
d arat npt lbrv nrip_save tsc_scale vmcb_clean flushbyasid pausefilter pfthreshold v_vmsave_vm
load vgif vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_
bitalg avx512_vpopcntdq rdpid fsrm flush_l1d arch_capabilities
Virtualization features:
Virtualization: AMD-V
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 2 MiB (32 instances)
L1i: 2 MiB (32 instances)
L2: 16 MiB (32 instances)
L3: 512 MiB (32 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Vulnerable: Safe RET, no microcode
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP disabled; RSB filling; PBRSB-eI
BRS Not affected; BHI Not affected
Srbds: Not affected
Tsx async abort: Not affected
The text was updated successfully, but these errors were encountered: