negative ranks are reported in the pmix shell trace #98

Open
garlick opened this issue Feb 20, 2024 · 0 comments

garlick commented Feb 20, 2024

Problem: PMIx defines some special rank values, and they show up as puzzling negative numbers in the trace output:

0.104s: flux-shell[1]: TRACE: pmix: pmix server fence_upcall
    {"procs":[{"nspace":"ƒGnMtS4hm","rank":-2}],"info":

From the spec[1]:

The pmix_rank_t structure is a uint32_t type for rank values.

typedef uint32_t pmix_rank_t;

The following constants can be used to set a variable of the type pmix_rank_t. All definitions
were introduced in version 1 of the standard unless otherwise marked. Valid rank values start at
zero.

PMIX_RANK_UNDEF A value to request job-level data where the information itself is not
associated with any specific rank, or when passing a pmix_proc_t identifier to an
operation that only references the namespace field of that structure.

PMIX_RANK_WILDCARD A value to indicate that the user wants the data for the given key
from every rank that posted that key.

PMIX_RANK_LOCAL_NODE Special rank value used to define groups of ranks. This constant
defines the group of all ranks on a local node.

[1] section 3.2.3 of https://pmix.github.io/uploads/2021/10/pmix-standard-v4.1.pdf

And in pmix.h we have:

#define PMIX_RANK_UNDEF     UINT32_MAX
#define PMIX_RANK_WILDCARD  UINT32_MAX-1
#define PMIX_RANK_LOCAL_NODE    UINT32_MAX-2        // all ranks on local node
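
To connect these definitions to the trace, here is a minimal sketch of what happens when the same 32 bits are read as a signed integer. The `EX_` constants are local copies of the values quoted above, and `rank_as_signed` is a hypothetical helper, not part of pmix.h. Under this reading, the -2 in the trace corresponds to PMIX_RANK_WILDCARD.

```c
#include <stdint.h>
#include <string.h>

/* Local copies of the values quoted from pmix.h (names are hypothetical). */
#define EX_RANK_UNDEF       UINT32_MAX
#define EX_RANK_WILDCARD    (UINT32_MAX - 1)
#define EX_RANK_LOCAL_NODE  (UINT32_MAX - 2)

/* Reinterpret the 32 bits of a rank as a signed integer.
 * int32_t is two's complement by definition, so copying the bits
 * gives a well-defined result. */
int32_t rank_as_signed(uint32_t rank)
{
    int32_t s;
    memcpy(&s, &rank, sizeof(s));
    return s;
}
```

With these definitions, `rank_as_signed(EX_RANK_UNDEF)` is -1, `rank_as_signed(EX_RANK_WILDCARD)` is -2, and `rank_as_signed(EX_RANK_LOCAL_NODE)` is -3.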

Kind of meaningless to say that valid ranks start at zero when the type is an unsigned integer. But anyway.

The trace is a raw JSON dump of the interthread message used in the server upcall, so one option would be to encode ranks as I rather than i, so that the special values appear as large positive numbers rather than negative ones.
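
A rough sketch of the effect of the wider encoding, assuming I and i are jansson-style pack format characters (`i` taking an `int`, `I` taking a 64-bit `json_int_t`). That is an assumption about the shell code, and the sketch does not call jansson at all. It only shows that widening an unsigned 32-bit rank to 64 bits keeps it positive. Both helper names are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Widen without sign extension: the source type is unsigned, so the
 * result always lies in the range 0..UINT32_MAX. */
int64_t rank_widen(uint32_t rank)
{
    return (int64_t)rank;
}

/* Hypothetical helper: render the rank the way a 64-bit integer field
 * would appear in a JSON dump.  Returns the snprintf() result, i.e. the
 * number of characters needed, not counting the terminating NUL. */
int format_rank_field(uint32_t rank, char *buf, size_t len)
{
    return snprintf(buf, len, "\"rank\":%" PRId64, rank_widen(rank));
}
```

For the wildcard value (`UINT32_MAX - 1`), `format_rank_field` writes `"rank":4294967294` instead of `"rank":-2`.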

We could also encode them as a string and use the above names as the encoding for the special values, but that might be going too far for a debug trace.
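
If the string encoding were pursued, the lookup could be as small as the following sketch. `special_rank_name` is a hypothetical helper, and the case labels use the numeric values quoted from pmix.h rather than including the header, so the block stands alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the spec name for one of the special rank values, or NULL
 * for an ordinary rank, which the caller would encode as a number. */
const char *special_rank_name(uint32_t rank)
{
    switch (rank) {
    case UINT32_MAX:
        return "PMIX_RANK_UNDEF";
    case UINT32_MAX - 1:
        return "PMIX_RANK_WILDCARD";
    case UINT32_MAX - 2:
        return "PMIX_RANK_LOCAL_NODE";
    default:
        return NULL;
    }
}
```

A caller would check for a non-NULL result and emit a JSON string in that case, falling back to an integer otherwise. That makes the field's type vary between entries, which is part of why it may be more than a debug trace needs.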
