Commit

fix apple#619 remove nodeSelector provisioner-nodepool
samos123 committed Jan 25, 2025
1 parent 94c81cb commit 0fbf321
Showing 1 changed file with 0 additions and 14 deletions.
14 changes: 0 additions & 14 deletions axlearn/cloud/gcp/job.py
@@ -701,20 +701,6 @@ def _build_pod(self) -> Nested[Any]:
                     PRE_PROVISIONER_LABEL: cfg.name,
                 }
             )
-        else:
-            # Used by GCP auto-provisioner.
-            selector.update(
-                {
-                    # NOTE: This is an arbitrary key, with a value that must be unique to the
-                    # jobset. This forces the jobset to be associated with its own node pool;
-                    # without this, the TPU provisioner may create a node pool and the scheduler may
-                    # schedule a different jobset onto the node pool, which can cause conflicts if
-                    # the original jobset attempts to restart (node pool conflict). This is more
-                    # reliable at the moment but doesn't take advantage of node pool sharing. GCP is
-                    # working on a fix.
-                    "provisioner-nodepool-id": cfg.name,
-                }
-            )
 
         if os.environ.get(BASTION_JOB_VERSION_ENV_VAR):
             labels.update({BASTION_JOB_VERSION_LABEL: os.environ.get(BASTION_JOB_VERSION_ENV_VAR)})
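
The effect of the hunk above on the pod's node selector can be sketched as follows. This is a minimal illustration, not the code in `axlearn/cloud/gcp/job.py`: the function `build_node_selector`, its parameters `use_pre_provisioner` and `legacy`, and the string value given to `PRE_PROVISIONER_LABEL` are all hypothetical. Only the key `"provisioner-nodepool-id"` and the use of the job name as its value come from the diff.

```python
from typing import Dict

# Hypothetical stand-in: the real constant's value is not shown in the diff.
PRE_PROVISIONER_LABEL = "example-pre-provisioner-label"


def build_node_selector(
    name: str, use_pre_provisioner: bool, legacy: bool = False
) -> Dict[str, str]:
    """Returns the node selector entries relevant to this commit.

    `legacy=True` mimics the behavior before the commit, where a job without
    the pre-provisioner was pinned to a node pool keyed by its own name.
    """
    selector: Dict[str, str] = {}
    if use_pre_provisioner:
        # Unchanged by the commit: the pre-provisioner path still sets its label.
        selector.update({PRE_PROVISIONER_LABEL: name})
    elif legacy:
        # Removed by the commit: an arbitrary key whose value is unique per jobset.
        selector.update({"provisioner-nodepool-id": name})
    return selector


before = build_node_selector("job-a", use_pre_provisioner=False, legacy=True)
after = build_node_selector("job-a", use_pre_provisioner=False)
```

Under these assumptions, `before` carries the per-job `"provisioner-nodepool-id"` entry and `after` carries none, so after the commit the job is no longer tied to a node pool keyed by its own name.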