jgf: refactor to use shared functions and fix containment #76
Problem: The "JGF" part of fluxjgf should be capitalized to refer to "JSON Graph Format" and to make it easier for the developer to read. Solution: Rename "Fluxjgf" to "FluxJGF". Signed-off-by: vsoch <[email protected]>
Problem: The convention for creating new structs is typically to use "New$Name" and return a reference or copy to it. Solution: Rename InitJGF to NewFluxJGF to mirror that practice, making it clear we get back a "new flux JGF" and the function is not initializing something that already exists. Signed-off-by: vsoch <[email protected]>
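For readers unfamiliar with the convention this commit refers to, here is a minimal sketch of a `New$Name` constructor. The `FluxJGF` fields below are hypothetical stand-ins chosen for illustration and are not the struct from this PR; only the names `FluxJGF` and `NewFluxJGF` come from the commit message.

```go
package main

import "fmt"

// FluxJGF is a simplified, hypothetical stand-in for the real struct.
// The fields are assumptions made for this example only.
type FluxJGF struct {
	Nodes []string
	Edges [][2]string
}

// NewFluxJGF follows the "New$Name" convention: it builds a fresh value
// and returns a pointer to it, rather than initializing an existing one.
func NewFluxJGF() *FluxJGF {
	return &FluxJGF{
		Nodes: []string{},
		Edges: [][2]string{},
	}
}

func main() {
	g := NewFluxJGF()
	g.Nodes = append(g.Nodes, "cluster0")
	fmt.Println(len(g.Nodes), len(g.Edges)) // prints "1 0"
}
```

The name tells the caller that a new value comes back, which is the point the commit makes about `InitJGF` being ambiguous.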
Problem: we want to have end-to-end (e2e) and regular Go tests. Solution: rename test.yaml to e2e-test.yaml, and prepare to add simple tests for the next round of changes (testify package). Signed-off-by: vsoch <[email protected]>
Problem: the containment paths are currently not set, and each resource type has a separate function. Solution: create a shared node generation function that uses a common resource counter, where the counter manages the global and resource-specific counts. This scoped commit includes a refactor of those JGF functions, which also means removing resources that are not present in the graph (e.g., rack and socket) and ensuring that we do not recreate the subnet, since we loop through nodes (and might hit the same one twice). I am also removing the NFD features because I think they have changed and might cause errors for anyone who has them. We should assess which of them we want/need and add them back strategically. I have a local test but want to push this first so we can look it over and run the current tests to determine if other changes are needed. Signed-off-by: vsoch <[email protected]>
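To make the idea in this commit concrete, here is a rough sketch of a counter that tracks a global count and a per-type count, plus a single shared node builder that sets a containment path. All names here (`ResourceCounter`, `Next`, `Node`, `MakeNode`) and the `/cluster0/node0/core0` path layout are assumptions for illustration, not the code in this PR.

```go
package main

import "fmt"

// ResourceCounter is a hypothetical sketch: one global count for unique
// ids, and one count per resource type for names like "core1".
type ResourceCounter struct {
	total  int            // nodes created so far, across all types
	byType map[string]int // nodes created so far, per resource type
}

func NewResourceCounter() *ResourceCounter {
	return &ResourceCounter{byType: map[string]int{}}
}

// Next returns the global id and per-type index for a new resource,
// then advances both counts.
func (c *ResourceCounter) Next(resourceType string) (int, int) {
	id, index := c.total, c.byType[resourceType]
	c.total++
	c.byType[resourceType]++
	return id, index
}

// Node holds only the fields needed for this example.
type Node struct {
	ID   int
	Type string
	Name string
	Path string // containment path, assumed to be parent path + "/" + name
}

// MakeNode is the single shared builder used for every resource type.
func MakeNode(c *ResourceCounter, resourceType, parentPath string) Node {
	id, index := c.Next(resourceType)
	name := fmt.Sprintf("%s%d", resourceType, index)
	return Node{ID: id, Type: resourceType, Name: name, Path: parentPath + "/" + name}
}

func main() {
	c := NewResourceCounter()
	cluster := MakeNode(c, "cluster", "")
	node := MakeNode(c, "node", cluster.Path)
	core0 := MakeNode(c, "core", node.Path)
	core1 := MakeNode(c, "core", node.Path)
	fmt.Println(cluster.ID, cluster.Path) // 0 /cluster0
	fmt.Println(node.ID, node.Path)       // 1 /cluster0/node0
	fmt.Println(core0.ID, core0.Path)     // 2 /cluster0/node0/core0
	fmt.Println(core1.ID, core1.Path)     // 3 /cluster0/node0/core1
}
```

Because every resource type goes through one builder, a containment path cannot be forgotten for a single type, which is the problem the commit describes.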
Problem: we will want to have scoped tests for fluence. Solution: to start, add a simple test file for JGF (the content of this current PR) and have an easy way to run it with "make test" from that directory. Currently this does not validate most of the structure - I would like to review the output, discuss a testing strategy, and then update the commit here to reflect that decision. Signed-off-by: vsoch <[email protected]>
@cmisale and @milroy - the changes here will tweak the Flux JGF a tiny bit so each has a containment path with (I think) correct indices. There of course could be bugs, so I'd like to look it over together (and myself again). I tried my best to keep the commits scoped, but in practice there is one large one and a few smaller ones. I will hopefully get better at this. What I'd like to suggest we do next is review the output of the tests - specifically here (click "Run Tests" and you'll see JGF output as I update the graph) where I print a generated JGF (not from a cluster, but from the test) and talk about the changes and if the structure is OK. Then we can decide on a testing strategy, I can implement and finish, and move forward with review. After this PR I'd like to do separate PRs in the following order:
I probably can't think ahead more than three things :) I'm also going to be scoping out a design for our turducken fluence / flux-core approach, but I need to finish some other work first. I hope you had a good Memorial Day weekend (I'm still not tired of using these emoji 🍓 ☁️ 🫐 )
This generally looks really good. Thank you!
We should discuss the index generation to make sure I'm interpreting the behavior correctly and to make sure it's what we want.
Thanks for the review! I'll wait to push (tiny changes so far) until we come to consensus on the above.
Problem: the current strategy to derive an index is scoped to a resource type globally across the graph. Solution: instead, provide a direct index counter for each new resource to ensure it is scoped to the parent. Signed-off-by: vsoch <[email protected]>
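The difference between the two indexing strategies is easiest to see side by side. The sketch below is illustrative only: `corePaths` is a hypothetical helper, and the `/cluster0/nodeN/coreM` path layout is an assumption rather than output from this PR.

```go
package main

import "fmt"

// corePaths builds containment paths for coresPerNode cores under each
// of `nodes` nodes. With scopedToParent false the core index keeps
// growing across the whole graph; with true it restarts under each parent.
func corePaths(nodes, coresPerNode int, scopedToParent bool) []string {
	paths := []string{}
	globalCore := 0 // graph-wide count of cores seen so far
	for n := 0; n < nodes; n++ {
		for i := 0; i < coresPerNode; i++ {
			index := globalCore
			if scopedToParent {
				index = i // position of this core within its parent node
			}
			paths = append(paths, fmt.Sprintf("/cluster0/node%d/core%d", n, index))
			globalCore++
		}
	}
	return paths
}

func main() {
	// Global indexing: the second node's cores are core2 and core3.
	fmt.Println(corePaths(2, 2, false))
	// Parent-scoped indexing: the second node's cores are core0 and core1.
	fmt.Println(corePaths(2, 2, true))
}
```

With parent scoping, every node with the same hardware gets the same set of child names, and the path stays unique because the parent segment differs.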
This looks great; thanks!
Problem: the containment paths are currently not always set, and each resource type has a separate function.
Solution: create a "ResourceCounter" that manages the global and resource-specific counts, and pass it to a common, shared node generation function!
This scoped commit includes:
I am also removing the NFD features (#75) because I think they have changed and might cause errors for anyone who has them, and I think it will be good to assess which we want/need (there are a lot!) and then add them back strategically.
I tried really hard to scope commits, but in practice it was like "a few little tiny things" and then the bulk of work I didn't know how to break into smaller pieces, so it's one chunk.
I could add more to this PR to further work on the logic of the ResourceCounter (for example, I think we should be counting things using it and not outside of it), but I don't want to make it too big, so that it stays easy to review.
I'll mark this ready when it's ready for review, and add a few more notes about questions/discussion.