Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(nvidia): set default poller once with common db pkg #341

Closed
wants to merge 1 commit into from

Conversation

gyuho
Copy link
Collaborator

@gyuho gyuho commented Jan 28, 2025

c.f., #336

@gyuho gyuho added this to the v0.4.0 milestone Jan 28, 2025
@gyuho gyuho self-assigned this Jan 28, 2025
Copy link

codecov bot commented Jan 28, 2025

Codecov Report

Attention: Patch coverage is 0% with 82 lines in your changes missing coverage. Please review.

Project coverage is 21.46%. Comparing base (760ef3b) to head (c8b38f7).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
internal/server/server.go 0.00% 35 Missing ⚠️
components/diagnose/scan.go 0.00% 20 Missing and 1 partial ⚠️
...omponents/accelerator/nvidia/query/nvml/options.go 0.00% 8 Missing ⚠️
components/accelerator/nvidia/query/options.go 0.00% 8 Missing ⚠️
components/accelerator/nvidia/query/query.go 0.00% 4 Missing ⚠️
...nts/accelerator/nvidia/fabric-manager/component.go 0.00% 2 Missing and 1 partial ⚠️
components/accelerator/nvidia/query/nvml/nvml.go 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #341      +/-   ##
==========================================
- Coverage   21.47%   21.46%   -0.01%     
==========================================
  Files         300      300              
  Lines       27186    27100      -86     
==========================================
- Hits         5837     5817      -20     
+ Misses      20710    20641      -69     
- Partials      639      642       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gyuho gyuho force-pushed the simplify-nvml-initialization branch from f450442 to f77c452 Compare January 28, 2025 13:00
@xiang90
Copy link
Contributor

xiang90 commented Jan 29, 2025

tests?

nvidia_query.WithInfinibandClassDirectory(cfg.InfinibandClassDirectory),
)
nvidia_query.GetDefaultPoller().Start(cctx, cfg.Query, Name)
if nvidia_query.GetDefaultPoller() != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to check nil here but not in other components?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -88,6 +89,9 @@ type instance struct {
// read-only database instance
dbRO *sql.DB

xidEventsStore events_db.Store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rever this? let us do this in another pr

@@ -88,6 +89,9 @@ type instance struct {
// read-only database instance
dbRO *sql.DB

xidEventsStore events_db.Store
hwslowdownEventsStore events_db.Store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert?

dbRW *sql.DB
dbRO *sql.DB

xidEventsStore events_db.Store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert?

@xiang90
Copy link
Contributor

xiang90 commented Jan 29, 2025

let us only remove duplicate

	nvidia_query.SetDefaultPoller(
		nvidia_query.WithDBRW(cfg.Query.State.DBRW),
		nvidia_query.WithDBRO(cfg.Query.State.DBRO),
		nvidia_query.WithNvidiaSMICommand(cfg.NvidiaSMICommand),
		nvidia_query.WithNvidiaSMIQueryCommand(cfg.NvidiaSMIQueryCommand),
		nvidia_query.WithIbstatCommand(cfg.IbstatCommand),
		nvidia_query.WithInfinibandClassDirectory(cfg.InfinibandClassDirectory),
	)

in this pr. add tests as well

@gyuho
Copy link
Collaborator Author

gyuho commented Jan 29, 2025

@xiang90 Created a new PR #345 just for the default poller setting. Will rebase for passing the events store after #345.

@gyuho gyuho added the wip - do not merge working in progress label Jan 29, 2025
gyuho added a commit that referenced this pull request Jan 30, 2025
gyuho added a commit that referenced this pull request Jan 31, 2025
@gyuho
Copy link
Collaborator Author

gyuho commented Jan 31, 2025

This can be closed now in favor of #346.

@gyuho gyuho closed this Jan 31, 2025
gyuho added a commit that referenced this pull request Feb 6, 2025
c.f., #336

Requires #341.

Tested

<img width="1102" alt="Screenshot 2025-01-28 at 11 01 33 PM"
src="https://github.com/user-attachments/assets/78c12072-42df-4c17-8789-2eb900577eb5"
/>

<img width="1537" alt="Screenshot 2025-01-28 at 11 01 45 PM"
src="https://github.com/user-attachments/assets/9a520798-9aab-499f-9f03-433f1c8a9295"
/>

Signed-off-by: Gyuho Lee <[email protected]>
@gyuho gyuho deleted the simplify-nvml-initialization branch February 7, 2025 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wip - do not merge working in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants