2025/02/02 10:09:46 maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined
2025/02/02 10:09:46 INFO Starting dcgm-exporter Version=4.0.0-4.0.1
2025/02/02 10:09:46 INFO Attempting to initialize DCGM.
2025/02/02 10:09:46 INFO Initialized DCGM Fields module.
2025/02/02 10:09:46 INFO DCGM successfully initialized!
2025/02/02 10:09:46 INFO Attempting to initialize NVML library.
2025/02/02 10:09:46 INFO NVML provider successfully initialized!
2025/02/02 10:09:46 INFO Not collecting DCP metrics: This request is serviced by a module of DCGM that is not currently loaded
2025/02/02 10:09:46 INFO Falling back to metric file '/etc/dcgm-exporter/default-counters.csv'
2025/02/02 10:09:46 WARN Skipping line 20 ('DCGM_FI_PROF_GR_ENGINE_ACTIVE'): metric not enabled
2025/02/02 10:09:46 WARN Skipping line 21 ('DCGM_FI_PROF_PIPE_TENSOR_ACTIVE'): metric not enabled
2025/02/02 10:09:46 WARN Skipping line 22 ('DCGM_FI_PROF_DRAM_ACTIVE'): metric not enabled
2025/02/02 10:09:46 WARN Skipping line 23 ('DCGM_FI_PROF_PCIE_TX_BYTES'): metric not enabled
2025/02/02 10:09:46 WARN Skipping line 24 ('DCGM_FI_PROF_PCIE_RX_BYTES'): metric not enabled
2025/02/02 10:09:46 INFO Initializing system entities of type 'GPU'
2025/02/02 10:09:46 INFO Initializing system entities of type 'NvSwitch'
2025/02/02 10:09:46 INFO Not collecting NvSwitch metrics; no switches to monitor
2025/02/02 10:09:46 INFO Initializing system entities of type 'NvLink'
2025/02/02 10:09:46 INFO Not collecting NvLink metrics; no switches to monitor
2025/02/02 10:09:46 INFO Initializing system entities of type 'CPU'
SIGSEGV: segmentation violation
Technically I don't need any CPU metrics, so why does the exporter initialize entities of type 'CPU'? Maybe my understanding is incomplete. I am using the default ConfigMap provided by the Helm chart and have tried different versions of it, but the result is the same.
Ask your question
Logs:
More detailed logs here: https://gist.github.com/puneetloya/2ec191a1ac79b3e23f76036e1a7d70fa
My values.yaml:
I tried the steps mentioned in this issue: #385.