mdtest size > first == last hang #506

Open
jschwartz-cray opened this issue Jan 23, 2025 · 1 comment

@jschwartz-cray

The -f/-l options are used to restrict the number of tasks mdtest runs to smaller subsets of the total size specified via the job MPI parameters.

In this mode some of the ranks do not participate in the test, and those ranks have to be managed properly so that they rejoin the participating ranks at the end.

The recently refactored logic fixed one issue but created another in the corner case of size > first == last. In this scenario only one rank participates in the test, but all ranks duplicate MPI_COMM_WORLD and the barrier behavior is incorrect, which results in a hang.

The relevant code is here:

        if(i < last){
          MPI_Group testgroup;
          range.last = i - 1;
          MPI_Group_range_incl(worldgroup, 1, (void *)&range, &testgroup);
          MPI_Comm_create(world_com, testgroup, &testComm);
          MPI_Group_free(&testgroup);
          if(testComm == MPI_COMM_NULL){
            continue;
          }
        }else{
          MPI_Comm_dup(world_com, & testComm);
        }

One solution is to make the logic common and ensure that any ranks which aren't participating are handled in the same manner as they are in ior/src/ior.c (line 117 in 9f97b10):

        if (params->testComm == MPI_COMM_NULL) {

I will be submitting a PR which implements this fix and some minor error handling improvements as a separate commit.
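
To make the corner case concrete, here is a small model in plain C (no MPI calls) of which world ranks end up as members of the test communicator under the branching logic quoted above versus the proposed common logic. It is only a sketch: the function names are hypothetical, and it assumes the group range starts at rank 0 with a stride of 1, which the quoted snippet does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not mdtest code: membership of a world rank in the
 * test communicator for an iteration that should run on `i` tasks.
 * Assumption: the group range covers ranks 0 .. i-1 (first = 0, stride = 1). */

/* Branching logic as quoted in this issue: a sub-communicator is created
 * only while i < last; otherwise the world communicator is duplicated,
 * which makes every rank a member. */
bool in_test_comm_branching(int rank, int i, int last, int world_size)
{
    assert(rank >= 0 && rank < world_size);
    assert(i >= 1 && i <= world_size);
    if (i < last) {
        return rank < i;   /* ranks 0 .. i-1 are in the group */
    }
    return true;           /* duplicate of world: all ranks are members */
}

/* Common logic: a group of ranks 0 .. i-1 is always used, so every other
 * rank would receive MPI_COMM_NULL and skip the test. */
bool in_test_comm_common(int rank, int i, int world_size)
{
    assert(rank >= 0 && rank < world_size);
    assert(i >= 1 && i <= world_size);
    return rank < i;
}

/* Count the members of the test communicator under the branching logic. */
int count_members_branching(int i, int last, int world_size)
{
    int n = 0;
    for (int rank = 0; rank < world_size; rank++) {
        if (in_test_comm_branching(rank, i, last, world_size)) {
            n++;
        }
    }
    return n;
}

/* Count the members of the test communicator under the common logic. */
int count_members_common(int i, int world_size)
{
    int n = 0;
    for (int rank = 0; rank < world_size; rank++) {
        if (in_test_comm_common(rank, i, world_size)) {
            n++;
        }
    }
    return n;
}
```

In this model, a world size of 4 with first == last == 1 gives `count_members_branching(1, 1, 4) == 4` but `count_members_common(1, 4) == 1`: the branching logic puts all four ranks in the communicator even though only one task was requested, while the common logic leaves ranks 1 to 3 outside it.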

jschwartz-cray added a commit to jschwartz-cray/ior that referenced this issue Jan 23, 2025
The -f/-l options are used to restrict the number of tasks mdtest runs
to smaller subsets of the total size specified via the job MPI
parameters.

In this mode a subset of the ranks will not participate in the test, and
those ranks have to be managed properly so they join up with the ranks
that did at the end.

The recently refactored logic fixed one issue but created another in the
corner case of size > first == last. In this scenario only one rank
participates in the test, but all ranks were duping MPI_COMM_WORLD and
the barrier behavior was not correct for this scenario resulting in a
hang.

This solves the problem by making the logic common (a new group and
communicator will always be created for the test whether it is for all
ranks or a subset) and ensuring that any ranks which aren't
participating are handled in the same manner as in ior.c:117.
@jschwartz-cray

#507
