segfault on start #447

Open
endzyme opened this issue Jan 28, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@endzyme

endzyme commented Jan 28, 2025

What is the version?

4.0.0-4.0.1

What happened?

The main instructions refer to a version that isn't released (4.0.0-4.0.0 should be 4.0.0-4.0.1), and when you start it with Docker it does not run. I was able to run the previous version fine.

What did you expect to happen?

I expect it to run without a segfault

What is the GPU model?

No response

What is the environment?

No response

How did you deploy the dcgm-exporter and what is the configuration?

No response

How to reproduce the issue?

No response

Anything else we need to know?

No response

@endzyme endzyme added the bug Something isn't working label Jan 28, 2025
@glowkey
Collaborator

glowkey commented Jan 28, 2025

Can you please include the full output when running the container?

@UnknowViewer

Same issue here.

  • The Docker image version in the docs is out of date: README.md and dcgm-exporter.yaml (commit 5f9250c) still reference 4.0.0-4.0.0.
    The correct version is 4.0.0-4.0.1.

  • Error message (docker run)

2025/02/05 03:57:07 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
2025/02/05 03:57:07 INFO Starting dcgm-exporter Version=4.0.0-4.0.1
2025/02/05 03:57:07 INFO Attempting to initialize DCGM.
2025/02/05 03:57:07 INFO Initialized DCGM Fields module.
2025/02/05 03:57:07 INFO DCGM successfully initialized!
2025/02/05 03:57:07 INFO Attempting to initialize NVML library.
2025/02/05 03:57:07 INFO NVML provider successfully initialized!
2025/02/05 03:57:07 INFO Not collecting DCP metrics: This request is serviced by a module of DCGM that is not currently loaded
2025/02/05 03:57:07 INFO Falling back to metric file '/etc/dcgm-exporter/default-counters.csv'
2025/02/05 03:57:07 WARN Skipping line 20 ('DCGM_FI_PROF_GR_ENGINE_ACTIVE'): metric not enabled
2025/02/05 03:57:07 WARN Skipping line 21 ('DCGM_FI_PROF_PIPE_TENSOR_ACTIVE'): metric not enabled
2025/02/05 03:57:07 WARN Skipping line 22 ('DCGM_FI_PROF_DRAM_ACTIVE'): metric not enabled
2025/02/05 03:57:07 WARN Skipping line 23 ('DCGM_FI_PROF_PCIE_TX_BYTES'): metric not enabled
2025/02/05 03:57:07 WARN Skipping line 24 ('DCGM_FI_PROF_PCIE_RX_BYTES'): metric not enabled
2025/02/05 03:57:07 INFO Initializing system entities of type 'GPU'
2025/02/05 03:57:07 INFO Initializing system entities of type 'NvSwitch'
2025/02/05 03:57:07 INFO Not collecting NvSwitch metrics; no switches to monitor
2025/02/05 03:57:07 INFO Initializing system entities of type 'NvLink'
2025/02/05 03:57:07 INFO Not collecting NvLink metrics; no switches to monitor
2025/02/05 03:57:07 INFO Initializing system entities of type 'CPU'
SIGSEGV: segmentation violation
PC=0x7ff3473336aa m=0 sigcode=1 addr=0x4
signal arrived during cgo execution

goroutine 1 gp=0xc0000061c0 m=0 mp=0x3378be0 [syscall]:
runtime.cgocall(0x1975150, 0xc00063e148)
        /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc00063e120 sp=0xc00063e0e8 pc=0x4195cb
github.com/NVIDIA/go-dcgm/pkg/dcgm._Cfunc_dcgmGetCpuHierarchy(0x7fffffff, 0xc0002cc000)
        _cgo_gotypes.go:1178 +0x4b fp=0xc00063e148 sp=0xc00063e120 pc=0x7f0c8b
github.com/NVIDIA/go-dcgm/pkg/dcgm.GetCpuHierarchy()
        /go/pkg/mod/github.com/!n!v!i!d!i!a/[email protected]/pkg/dcgm/cpu.go:42 +0x6b fp=0xc00063edf8 sp=0xc00063e148 pc=0x7f3beb
github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider.dcgmProvider.GetCpuHierarchy(...)
        /go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider/dcgm.go:163
github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider.(*dcgmProvider).GetCpuHierarchy(_)
        <autogenerated>:1 +0x7c fp=0xc00063f028 sp=0xc00063edf8 pc=0x17a1b7c
github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo.(*Info).initializeCPUInfo(0xc000374f08, {0x1, {0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}})
        /go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo/device_info.go:196 +0x9f fp=0xc00063f3d0 sp=0xc00063f028 pc=0x17a393f
github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo.Initialize({0x1, {0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}}, {0x1, {0x0, 0x0, ...}, ...}, ...)
        /go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo/device_info.go:108 +0x349 fp=0xc00063f440 sp=0xc00063f3d0 pc=0x17a2cc9
github.com/NVIDIA/dcgm-exporter/internal/pkg/devicewatchlistmanager.(*WatchListManager).CreateEntityWatchList(0xc000113ba0, 0x7, {0x2140538, 0x33d9600}, 0x7530)
        /go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/devicewatchlistmanager/device_watchlist_manager.go:131 +0x458 fp=0xc00063f788 sp=0xc00063f440 pc=0x17cc358
github.com/NVIDIA/dcgm-exporter/pkg/cmd.startDeviceWatchListManager(0xc000210f30, 0xc000102a80)
        /go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:412 +0x305 fp=0xc00063f878 sp=0xc00063f788 pc=0x1972bc5
github.com/NVIDIA/dcgm-exporter/pkg/cmd.startDCGMExporter(0xc00013fa40, 0xc000383d80)
        /go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:346 +0x34b fp=0xc00063fa70 sp=0xc00063f878 pc=0x1971f2b
github.com/NVIDIA/dcgm-exporter/pkg/cmd.action.func1()
        /go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:304 +0x5b fp=0xc00063fac0 sp=0xc00063fa70 pc=0x19719db
github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture({0x2157448, 0xc00010e690}, 0xc000507b78)
        /go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:76 +0x1e6 fp=0xc00063fb50 sp=0xc00063fac0 pc=0x196f406
github.com/NVIDIA/dcgm-exporter/pkg/cmd.action(0xc00013fa40)
        /go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:295 +0x67 fp=0xc00063fba8 sp=0xc00063fb50 pc=0x1971947
github.com/NVIDIA/dcgm-exporter/pkg/cmd.NewApp.func1(0xc00013fa40?)
        /go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:276 +0x13 fp=0xc00063fbc0 sp=0xc00063fba8 pc=0x1974c53
github.com/urfave/cli/v2.(*Command).Run(0xc0001a1600, 0xc00013fa40, {0xc000052090, 0x1, 0x1})
        /go/pkg/mod/github.com/urfave/cli/[email protected]/command.go:279 +0x97d fp=0xc00063fe48 sp=0xc00063fbc0 pc=0x818ffd
github.com/urfave/cli/v2.(*App).RunContext(0xc00011ec00, {0x21571e0, 0x33d9600}, {0xc000052090, 0x1, 0x1})
        /go/pkg/mod/github.com/urfave/cli/[email protected]/app.go:337 +0x58b fp=0xc00063fea8 sp=0xc00063fe48 pc=0x81588b
github.com/urfave/cli/v2.(*App).Run(0xc000507f30?, {0xc000052090?, 0x1?, 0x48453a?})
        /go/pkg/mod/github.com/urfave/cli/[email protected]/app.go:311 +0x2f fp=0xc00063fee8 sp=0xc00063fea8 pc=0x8152af
main.main()
        /go/src/github.com/NVIDIA/dcgm-exporter/cmd/dcgm-exporter/main.go:32 +0x5f fp=0xc00063ff50 sp=0xc00063fee8 pc=0x1974d7f
runtime.main()
        /usr/local/go/src/runtime/proc.go:271 +0x29d fp=0xc00063ffe0 sp=0xc00063ff50 pc=0x45185d
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00063ffe8 sp=0xc00063ffe0 pc=0x4848e1

goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000084fa8 sp=0xc000084f88 pc=0x451c8e
runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:408
runtime.forcegchelper()
        /usr/local/go/src/runtime/proc.go:326 +0xb3 fp=0xc000084fe0 sp=0xc000084fa8 pc=0x451b13
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000084fe8 sp=0xc000084fe0 pc=0x4848e1
created by runtime.init.6 in goroutine 1
        /usr/local/go/src/runtime/proc.go:314 +0x1a

goroutine 3 gp=0xc000007180 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000085780 sp=0xc000085760 pc=0x451c8e
runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:408
runtime.bgsweep(0xc000060070)
        /usr/local/go/src/runtime/mgcsweep.go:318 +0xdf fp=0xc0000857c8 sp=0xc000085780 pc=0x43c33f
runtime.gcenable.gowrap1()
        /usr/local/go/src/runtime/mgc.go:203 +0x25 fp=0xc0000857e0 sp=0xc0000857c8 pc=0x430c45
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000857e8 sp=0xc0000857e0 pc=0x4848e1
created by runtime.gcenable in goroutine 1
        /usr/local/go/src/runtime/mgc.go:203 +0x66

goroutine 4 gp=0xc000007340 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x21305b0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000085f78 sp=0xc000085f58 pc=0x451c8e
runtime.goparkunlock(...)
        /usr/local/go/src/runtime/proc.go:408
runtime.(*scavengerState).park(0x3377600)
        /usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000085fa8 sp=0xc000085f78 pc=0x439ce9
runtime.bgscavenge(0xc000060070)
        /usr/local/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000085fc8 sp=0xc000085fa8 pc=0x43a299
runtime.gcenable.gowrap2()
        /usr/local/go/src/runtime/mgc.go:204 +0x25 fp=0xc000085fe0 sp=0xc000085fc8 pc=0x430be5
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x4848e1
created by runtime.gcenable in goroutine 1
        /usr/local/go/src/runtime/mgc.go:204 +0xa5

goroutine 5 gp=0xc000007c00 m=nil [finalizer wait]:
runtime.gopark(0xc000084648?, 0x423525?, 0xa8?, 0x1?, 0xc0000061c0?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000084620 sp=0xc000084600 pc=0x451c8e
runtime.runfinq()
        /usr/local/go/src/runtime/mfinal.go:194 +0x107 fp=0xc0000847e0 sp=0xc000084620 pc=0x42fc87
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000847e8 sp=0xc0000847e0 pc=0x4848e1
created by runtime.createfing in goroutine 1
        /usr/local/go/src/runtime/mfinal.go:164 +0x3d

goroutine 9 gp=0xc0001f7a40 m=nil [GC worker (idle)]:
runtime.gopark(0xc0000867a8?, 0x41b6eb?, 0xf7?, 0xaa?, 0xc00012cc00?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000086750 sp=0xc000086730 pc=0x451c8e
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000867e0 sp=0xc000086750 pc=0x432d25
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000867e8 sp=0xc0000867e0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 36 gp=0xc00038c000 m=nil [GC worker (idle)]:
runtime.gopark(0x67282706ab3ce?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000080750 sp=0xc000080730 pc=0x451c8e
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000807e0 sp=0xc000080750 pc=0x432d25
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000807e8 sp=0xc0000807e0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 21 gp=0xc000102380 m=nil [GC worker (idle)]:
runtime.gopark(0x67282706c43eb?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000150750 sp=0xc000150730 pc=0x451c8e
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0001507e0 sp=0xc000150750 pc=0x432d25
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0001507e8 sp=0xc0001507e0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 37 gp=0xc00038c1c0 m=nil [GC worker (idle)]:
runtime.gopark(0x67282706c33e9?, 0x0?, 0x0?, 0x0?, 0x0?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000080f50 sp=0xc000080f30 pc=0x451c8e
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc000080fe0 sp=0xc000080f50 pc=0x432d25
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000080fe8 sp=0xc000080fe0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        /usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 10 gp=0xc0001028c0 m=nil [IO wait]:
runtime.gopark(0x5?, 0x0?, 0x0?, 0x0?, 0xb?)
        /usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000081c30 sp=0xc000081c10 pc=0x451c8e
runtime.netpollblock(0x4d8ad8?, 0x418d66?, 0x0?)
        /usr/local/go/src/runtime/netpoll.go:573 +0xf7 fp=0xc000081c68 sp=0xc000081c30 pc=0x44a9f7
internal/poll.runtime_pollWait(0x7ff2fe73c700, 0x72)
        /usr/local/go/src/runtime/netpoll.go:345 +0x85 fp=0xc000081c88 sp=0xc000081c68 pc=0x47f125
internal/poll.(*pollDesc).wait(0xc00012d8c0?, 0xc0002bd000?, 0x1)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000081cb0 sp=0xc000081c88 pc=0x4f5b07
internal/poll.(*pollDesc).waitRead(...)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc00012d8c0, {0xc0002bd000, 0x1000, 0x1000})
        /usr/local/go/src/internal/poll/fd_unix.go:164 +0x27a fp=0xc000081d48 sp=0xc000081cb0 pc=0x4f6dfa
os.(*File).read(...)
        /usr/local/go/src/os/file_posix.go:29
os.(*File).Read(0xc00007b598, {0xc0002bd000?, 0x0?, 0x0?})
        /usr/local/go/src/os/file.go:118 +0x52 fp=0xc000081d88 sp=0xc000081d48 pc=0x502252
bufio.(*Scanner).Scan(0xc000155500)
        /usr/local/go/src/bufio/scan.go:219 +0x81e fp=0xc000081e60 sp=0xc000081d88 pc=0x5607fe
github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture.func2()
        /go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:58 +0x50 fp=0xc000081fe0 sp=0xc000081e60 pc=0x196f510
runtime.goexit({})
        /usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000081fe8 sp=0xc000081fe0 pc=0x4848e1
created by github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture in goroutine 1
        /go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:57 +0x1d9

rax    0x0
rbx    0x7ff2e4bed740
rcx    0x7ff347366a1c
rdx    0x1
rdi    0x0
rsi    0x2416c1a0
rbp    0x4
rsp    0x7ffcce545e90
r8     0x90800
r9     0x2416c1a0
r10    0x0
r11    0x287
r12    0xffffffffffffff78
r13    0x2
r14    0x0
r15    0x7ffcce546080
rip    0x7ff3473336aa
rflags 0x10206
cs     0x33
fs     0x0
gs     0x0

@nvvfedorov
Collaborator

@UnknowViewer, what is your environment?

@awoimbee

awoimbee commented Feb 6, 2025

Same issue running on EKS with Bottlerocket (Bottlerocket OS 1.32.0 (aws-k8s-1.31-nvidia), driver pre-installed in the AMI).

@glowkey
Collaborator

glowkey commented Feb 6, 2025

We are actively trying to reproduce this issue but have not been able to in our environments. Any information you can provide about your GPUs, driver version, etc. would be useful.

@awoimbee

awoimbee commented Feb 7, 2025

Bottlerocket is very locked down, so it wouldn't surprise me if this issue doesn't appear on a more typical OS.

Using g5 instances in AWS (single A10G GPU).
Kernel: 6.1.124
NVIDIA-SMI 535.230.02 Driver Version: 535.230.02 CUDA Version: 12.2

Full crash log (version `4.0.0-4.0.1-ubuntu22.04`):
2025/02/07 10:32:14 maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined
2025/02/07 10:32:14 INFO Starting dcgm-exporter Version=4.0.0-4.0.1
2025/02/07 10:32:14 INFO Attempting to initialize DCGM.
2025/02/07 10:32:14 INFO Initialized DCGM Fields module.
2025/02/07 10:32:14 INFO DCGM successfully initialized!
2025/02/07 10:32:14 INFO Attempting to initialize NVML library.
2025/02/07 10:32:14 INFO NVML provider successfully initialized!
2025/02/07 10:32:14 INFO Not collecting DCP metrics: This request is serviced by a module of DCGM that is not currently loaded
2025/02/07 10:32:14 INFO Falling back to metric file '/etc/dcgm-exporter/default-counters.csv'
2025/02/07 10:32:14 WARN Skipping line 19 ('DCGM_FI_PROF_GR_ENGINE_ACTIVE'): metric not enabled
2025/02/07 10:32:14 WARN Skipping line 20 ('DCGM_FI_PROF_PIPE_TENSOR_ACTIVE'): metric not enabled
2025/02/07 10:32:14 WARN Skipping line 21 ('DCGM_FI_PROF_DRAM_ACTIVE'): metric not enabled
2025/02/07 10:32:14 WARN Skipping line 22 ('DCGM_FI_PROF_PCIE_TX_BYTES'): metric not enabled
2025/02/07 10:32:14 WARN Skipping line 23 ('DCGM_FI_PROF_PCIE_RX_BYTES'): metric not enabled
2025/02/07 10:32:14 INFO Initializing system entities of type 'GPU'
2025/02/07 10:32:14 INFO Initializing system entities of type 'NvSwitch'
2025/02/07 10:32:14 INFO Not collecting NvSwitch metrics; no switches to monitor
2025/02/07 10:32:14 INFO Initializing system entities of type 'NvLink'
2025/02/07 10:32:14 INFO Not collecting NvLink metrics; no switches to monitor
2025/02/07 10:32:14 INFO Initializing system entities of type 'CPU'
SIGSEGV: segmentation violation
PC=0x7f11913986aa m=5 sigcode=1 addr=0x4
signal arrived during cgo execution

goroutine 1 gp=0xc0000061c0 m=5 mp=0xc000100008 [syscall]:
runtime.cgocall(0x1975150, 0xc0006a8148)
	/usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc0006a8120 sp=0xc0006a80e8 pc=0x4195cb
github.com/NVIDIA/go-dcgm/pkg/dcgm._Cfunc_dcgmGetCpuHierarchy(0x7fffffff, 0xc000206000)
	_cgo_gotypes.go:1178 +0x4b fp=0xc0006a8148 sp=0xc0006a8120 pc=0x7f0c8b
github.com/NVIDIA/go-dcgm/pkg/dcgm.GetCpuHierarchy()
	/go/pkg/mod/github.com/!n!v!i!d!i!a/[email protected]/pkg/dcgm/cpu.go:42 +0x6b fp=0xc0006a8df8 sp=0xc0006a8148 pc=0x7f3beb
github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider.dcgmProvider.GetCpuHierarchy(...)
	/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider/dcgm.go:163
github.com/NVIDIA/dcgm-exporter/internal/pkg/dcgmprovider.(*dcgmProvider).GetCpuHierarchy(_)
	<autogenerated>:1 +0x7c fp=0xc0006a9028 sp=0xc0006a8df8 pc=0x17a1b7c
github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo.(*Info).initializeCPUInfo(0xc000091908, {0x1, {0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}})
	/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo/device_info.go:196 +0x9f fp=0xc0006a93d0 sp=0xc0006a9028 pc=0x17a393f
github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo.Initialize({0x1, {0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}}, {0x1, {0x0, 0x0, ...}, ...}, ...)
	/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/deviceinfo/device_info.go:108 +0x349 fp=0xc0006a9440 sp=0xc0006a93d0 pc=0x17a2cc9
github.com/NVIDIA/dcgm-exporter/internal/pkg/devicewatchlistmanager.(*WatchListManager).CreateEntityWatchList(0xc0000d25b0, 0x7, {0x2140538, 0x33d9600}, 0x7530)
	/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/devicewatchlistmanager/device_watchlist_manager.go:131 +0x458 fp=0xc0006a9788 sp=0xc0006a9440 pc=0x17cc358
github.com/NVIDIA/dcgm-exporter/pkg/cmd.startDeviceWatchListManager(0xc000209500, 0xc000392700)
	/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:412 +0x305 fp=0xc0006a9878 sp=0xc0006a9788 pc=0x1972bc5
github.com/NVIDIA/dcgm-exporter/pkg/cmd.startDCGMExporter(0xc0003c87c0, 0xc000605d50)
	/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:346 +0x34b fp=0xc0006a9a70 sp=0xc0006a9878 pc=0x1971f2b
github.com/NVIDIA/dcgm-exporter/pkg/cmd.action.func1()
	/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:304 +0x5b fp=0xc0006a9ac0 sp=0xc0006a9a70 pc=0x19719db
github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture({0x2157448, 0xc00003e050}, 0xc000515b78)
	/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:76 +0x1e6 fp=0xc0006a9b50 sp=0xc0006a9ac0 pc=0x196f406
github.com/NVIDIA/dcgm-exporter/pkg/cmd.action(0xc0003c87c0)
	/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:295 +0x67 fp=0xc0006a9ba8 sp=0xc0006a9b50 pc=0x1971947
github.com/NVIDIA/dcgm-exporter/pkg/cmd.NewApp.func1(0xc0003c87c0?)
	/go/src/github.com/NVIDIA/dcgm-exporter/pkg/cmd/app.go:276 +0x13 fp=0xc0006a9bc0 sp=0xc0006a9ba8 pc=0x1974c53
github.com/urfave/cli/v2.(*Command).Run(0xc000409340, 0xc0003c87c0, {0xc0001260f0, 0x3, 0x3})
	/go/pkg/mod/github.com/urfave/cli/[email protected]/command.go:279 +0x97d fp=0xc0006a9e48 sp=0xc0006a9bc0 pc=0x818ffd
github.com/urfave/cli/v2.(*App).RunContext(0xc0001bb000, {0x21571e0, 0x33d9600}, {0xc0001260f0, 0x3, 0x3})
	/go/pkg/mod/github.com/urfave/cli/[email protected]/app.go:337 +0x58b fp=0xc0006a9ea8 sp=0xc0006a9e48 pc=0x81588b
github.com/urfave/cli/v2.(*App).Run(0xc000515f30?, {0xc0001260f0?, 0x1?, 0x48453a?})
	/go/pkg/mod/github.com/urfave/cli/[email protected]/app.go:311 +0x2f fp=0xc0006a9ee8 sp=0xc0006a9ea8 pc=0x8152af
main.main()
	/go/src/github.com/NVIDIA/dcgm-exporter/cmd/dcgm-exporter/main.go:32 +0x5f fp=0xc0006a9f50 sp=0xc0006a9ee8 pc=0x1974d7f
runtime.main()
	/usr/local/go/src/runtime/proc.go:271 +0x29d fp=0xc0006a9fe0 sp=0xc0006a9f50 pc=0x45185d
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0006a9fe8 sp=0xc0006a9fe0 pc=0x4848e1

goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000098fa8 sp=0xc000098f88 pc=0x451c8e
runtime.goparkunlock(...)
	/usr/local/go/src/runtime/proc.go:408
runtime.forcegchelper()
	/usr/local/go/src/runtime/proc.go:326 +0xb3 fp=0xc000098fe0 sp=0xc000098fa8 pc=0x451b13
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000098fe8 sp=0xc000098fe0 pc=0x4848e1
created by runtime.init.6 in goroutine 1
	/usr/local/go/src/runtime/proc.go:314 +0x1a

goroutine 3 gp=0xc000007180 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000099780 sp=0xc000099760 pc=0x451c8e
runtime.goparkunlock(...)
	/usr/local/go/src/runtime/proc.go:408
runtime.bgsweep(0xc00006a070)
	/usr/local/go/src/runtime/mgcsweep.go:318 +0xdf fp=0xc0000997c8 sp=0xc000099780 pc=0x43c33f
runtime.gcenable.gowrap1()
	/usr/local/go/src/runtime/mgc.go:203 +0x25 fp=0xc0000997e0 sp=0xc0000997c8 pc=0x430c45
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000997e8 sp=0xc0000997e0 pc=0x4848e1
created by runtime.gcenable in goroutine 1
	/usr/local/go/src/runtime/mgc.go:203 +0x66

goroutine 4 gp=0xc000007340 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x21305b0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000099f78 sp=0xc000099f58 pc=0x451c8e
runtime.goparkunlock(...)
	/usr/local/go/src/runtime/proc.go:408
runtime.(*scavengerState).park(0x3377600)
	/usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000099fa8 sp=0xc000099f78 pc=0x439ce9
runtime.bgscavenge(0xc00006a070)
	/usr/local/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000099fc8 sp=0xc000099fa8 pc=0x43a299
runtime.gcenable.gowrap2()
	/usr/local/go/src/runtime/mgc.go:204 +0x25 fp=0xc000099fe0 sp=0xc000099fc8 pc=0x430be5
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000099fe8 sp=0xc000099fe0 pc=0x4848e1
created by runtime.gcenable in goroutine 1
	/usr/local/go/src/runtime/mgc.go:204 +0xa5

goroutine 18 gp=0xc000102700 m=nil [finalizer wait]:
runtime.gopark(0x0?, 0x1f9cc40?, 0xc0?, 0x0?, 0x2000000020?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000098620 sp=0xc000098600 pc=0x451c8e
runtime.runfinq()
	/usr/local/go/src/runtime/mfinal.go:194 +0x107 fp=0xc0000987e0 sp=0xc000098620 pc=0x42fc87
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000987e8 sp=0xc0000987e0 pc=0x4848e1
created by runtime.createfing in goroutine 1
	/usr/local/go/src/runtime/mfinal.go:164 +0x3d

goroutine 8 gp=0xc000308540 m=nil [GC worker (idle)]:
runtime.gopark(0xc0000947a8?, 0x41b6eb?, 0xf7?, 0xaa?, 0xc0003603c0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000094750 sp=0xc000094730 pc=0x451c8e
runtime.gcBgMarkWorker()
	/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000947e0 sp=0xc000094750 pc=0x432d25
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000947e8 sp=0xc0000947e0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 36 gp=0xc000392000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000398750 sp=0xc000398730 pc=0x451c8e
runtime.gcBgMarkWorker()
	/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0003987e0 sp=0xc000398750 pc=0x432d25
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0003987e8 sp=0xc0003987e0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 22 gp=0xc000308700 m=nil [GC worker (idle)]:
runtime.gopark(0x33db080?, 0x3?, 0x27?, 0xb?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000094f50 sp=0xc000094f30 pc=0x451c8e
runtime.gcBgMarkWorker()
	/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc000094fe0 sp=0xc000094f50 pc=0x432d25
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000094fe8 sp=0xc000094fe0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 9 gp=0xc000007dc0 m=nil [GC worker (idle)]:
runtime.gopark(0x44532888b6?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc00009a750 sp=0xc00009a730 pc=0x451c8e
runtime.gcBgMarkWorker()
	/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc00009a7e0 sp=0xc00009a750 pc=0x432d25
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00009a7e8 sp=0xc00009a7e0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 37 gp=0xc0003921c0 m=nil [GC worker (idle)]:
runtime.gopark(0x445326c249?, 0x3?, 0x80?, 0x2?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000398f50 sp=0xc000398f30 pc=0x451c8e
runtime.gcBgMarkWorker()
	/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc000398fe0 sp=0xc000398f50 pc=0x432d25
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000398fe8 sp=0xc000398fe0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 23 gp=0xc0003088c0 m=nil [GC worker (idle)]:
runtime.gopark(0x445327e261?, 0x3?, 0x34?, 0x68?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000095750 sp=0xc000095730 pc=0x451c8e
runtime.gcBgMarkWorker()
	/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0000957e0 sp=0xc000095750 pc=0x432d25
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000957e8 sp=0xc0000957e0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 10 gp=0xc0004c6000 m=nil [GC worker (idle)]:
runtime.gopark(0x445328e5a4?, 0x3?, 0xb7?, 0xa8?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc00009af50 sp=0xc00009af30 pc=0x451c8e
runtime.gcBgMarkWorker()
	/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc00009afe0 sp=0xc00009af50 pc=0x432d25
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00009afe8 sp=0xc00009afe0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 38 gp=0xc000392380 m=nil [GC worker (idle)]:
runtime.gopark(0x445326c131?, 0x3?, 0xe?, 0x10?, 0x0?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc000399750 sp=0xc000399730 pc=0x451c8e
runtime.gcBgMarkWorker()
	/usr/local/go/src/runtime/mgc.go:1310 +0xe5 fp=0xc0003997e0 sp=0xc000399750 pc=0x432d25
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0003997e8 sp=0xc0003997e0 pc=0x4848e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1234 +0x1c

goroutine 11 gp=0xc000392540 m=nil [IO wait]:
runtime.gopark(0x5?, 0x0?, 0x0?, 0x0?, 0xb?)
	/usr/local/go/src/runtime/proc.go:402 +0xce fp=0xc00039bc30 sp=0xc00039bc10 pc=0x451c8e
runtime.netpollblock(0x4d8ad8?, 0x418d66?, 0x0?)
	/usr/local/go/src/runtime/netpoll.go:573 +0xf7 fp=0xc00039bc68 sp=0xc00039bc30 pc=0x44a9f7
internal/poll.runtime_pollWait(0x7f1148fe4e70, 0x72)
	/usr/local/go/src/runtime/netpoll.go:345 +0x85 fp=0xc00039bc88 sp=0xc00039bc68 pc=0x47f125
internal/poll.(*pollDesc).wait(0xc0003613e0?, 0xc0002b7000?, 0x1)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00039bcb0 sp=0xc00039bc88 pc=0x4f5b07
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0003613e0, {0xc0002b7000, 0x1000, 0x1000})
	/usr/local/go/src/internal/poll/fd_unix.go:164 +0x27a fp=0xc00039bd48 sp=0xc00039bcb0 pc=0x4f6dfa
os.(*File).read(...)
	/usr/local/go/src/os/file_posix.go:29
os.(*File).Read(0xc0003a7f60, {0xc0002b7000?, 0x0?, 0x0?})
	/usr/local/go/src/os/file.go:118 +0x52 fp=0xc00039bd88 sp=0xc00039bd48 pc=0x502252
bufio.(*Scanner).Scan(0xc0003c0080)
	/usr/local/go/src/bufio/scan.go:219 +0x81e fp=0xc00039be60 sp=0xc00039bd88 pc=0x5607fe
github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture.func2()
	/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:58 +0x50 fp=0xc00039bfe0 sp=0xc00039be60 pc=0x196f510
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00039bfe8 sp=0xc00039bfe0 pc=0x4848e1
created by github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout.Capture in goroutine 1
	/go/src/github.com/NVIDIA/dcgm-exporter/internal/pkg/stdout/capture.go:57 +0x1d9

rax    0x0
rbx    0x7f11305f1740
rcx    0x7f11913cba1c
rdx    0x1
rdi    0x0
rsi    0x7f1134000fc0
rbp    0x4
rsp    0x7f1148fd9000
r8     0x90800
r9     0x7f1134000fc0
r10    0x0
r11    0x287
r12    0xffffffffffffff78
r13    0x2
r14    0x0
r15    0x7f1148fd91f0
rip    0x7f11913986aa
rflags 0x10206
cs     0x33
fs     0x0
gs     0x0
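
Both crash dumps above stop in the same place: a SIGSEGV with `addr=0x4`, raised during the cgo call to `dcgmGetCpuHierarchy`. The following is a minimal Go sketch, not part of dcgm-exporter, that reduces a pasted Go crash dump to those few fields so that separate reports can be compared quickly. The `CrashSummary` type and the `summarizeCrash` function are hypothetical names introduced here for illustration, and the regular expressions assume the dump layout shown in this thread.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// CrashSummary (hypothetical helper type) keeps only the fields of a Go
// runtime crash dump that are needed to compare two reports.
type CrashSummary struct {
	Signal  string // signal name, e.g. "SIGSEGV"
	PC      string // faulting program counter, as printed
	SigCode string // sigcode value, as printed
	Addr    string // faulting address, as printed
	InCgo   bool   // runtime printed "signal arrived during cgo execution"
	CgoFunc string // C function named by the first _Cfunc_ frame
}

// Patterns assume the dump format seen in this issue.
var (
	signalRe = regexp.MustCompile(`^(SIG[A-Z]+): `)
	faultRe  = regexp.MustCompile(`^PC=(0x[0-9a-f]+) .*sigcode=(\d+) addr=(0x[0-9a-f]+)`)
	cgoRe    = regexp.MustCompile(`\._Cfunc_(\w+)\(`)
)

// summarizeCrash scans the dump line by line and records the first match
// for each field; later goroutine stacks are ignored.
func summarizeCrash(dump string) CrashSummary {
	var s CrashSummary
	sc := bufio.NewScanner(strings.NewReader(dump))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case s.Signal == "" && signalRe.MatchString(line):
			s.Signal = signalRe.FindStringSubmatch(line)[1]
		case s.PC == "" && faultRe.MatchString(line):
			m := faultRe.FindStringSubmatch(line)
			s.PC, s.SigCode, s.Addr = m[1], m[2], m[3]
		case line == "signal arrived during cgo execution":
			s.InCgo = true
		case s.CgoFunc == "" && cgoRe.MatchString(line):
			s.CgoFunc = cgoRe.FindStringSubmatch(line)[1]
		}
	}
	return s
}

func main() {
	// Excerpt copied from the first crash dump in this thread.
	dump := `2025/02/05 03:57:07 INFO Initializing system entities of type 'CPU'
SIGSEGV: segmentation violation
PC=0x7ff3473336aa m=0 sigcode=1 addr=0x4
signal arrived during cgo execution

goroutine 1 gp=0xc0000061c0 m=0 mp=0x3378be0 [syscall]:
runtime.cgocall(0x1975150, 0xc00063e148)
github.com/NVIDIA/go-dcgm/pkg/dcgm._Cfunc_dcgmGetCpuHierarchy(0x7fffffff, 0xc0002cc000)
`
	s := summarizeCrash(dump)
	fmt.Printf("%s pc=%s addr=%s sigcode=%s cgo=%t func=%s\n",
		s.Signal, s.PC, s.Addr, s.SigCode, s.InCgo, s.CgoFunc)
}
```

Run against either dump, the summary should differ only in the program counter, which suggests the two reports describe the same fault. A fault address of `0x4` is consistent with a near-null pointer dereference inside the native library, but that is an inference from the dump and not a confirmed root cause.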

@UnknowViewer

> @UnknowViewer, what is your environment?

@nvvfedorov Amazon Linux 2023.6.20250128.
