An older torchvision version is used in the nightly tpu image #8595

Open
hosseinsarshar opened this issue Jan 21, 2025 · 1 comment
Labels
model bug triaged This issue has been reviewed by the triage team and the appropriate priority assigned. usability Bugs/features related to improving the usability of PyTorch/XLA

Comments

@hosseinsarshar
Contributor

🐛 Bug - old torchvision is used in the nightly tpu image

I noticed that the nightly image ships an older torchvision. Here is what I ran:

gcloud compute tpus tpu-vm ssh $TPU_NAME --project $PROJECT --zone=$ZONE --worker=all --command='
DOCKER_IMAGE=us-central1-docker.pkg.dev/tpu-pytorch-releases/docker/xla:nightly_3.10_tpuvm

worker_id=1

cat >> /dev/null <<EOF
EOF

stdbuf -oL bash <<-PIPE_EOF 2>&1 | sed "s/^/[worker $worker_id] /g" | tee runlog
  set -o xtrace
  # Configure docker
  sudo groupadd docker
  sudo usermod -aG docker $USER
  # newgrp applies updated group permissions
  newgrp - docker
  gcloud auth configure-docker us-central1-docker.pkg.dev --quiet
  # Kill any running benchmarks
  docker kill $USER-test || true

  docker pull $DOCKER_IMAGE

  docker run \
    --name $USER-test \
    --privileged \
    -v /home/$USER:/tmp/home \
    --shm-size=16G \
    --net host \
    -u root \
    --rm $DOCKER_IMAGE python -c "import torch_xla; import torch; import torchvision; print(f'{torch_xla.devices()=}'); print(f'{torch.__version__=}'); print(f'{torchvision.__version__=}'); print(f'{torch_xla.__version__=}')"

PIPE_EOF
'

This is what I get:

[worker ] torch_xla.devices()=[device(type='xla', index=0)]
[worker ] torch.__version__='2.7.0'
[worker ] torchvision.__version__='0.19.0a0+d23a6e1'
[worker ] torch_xla.__version__='2.7.0+gita295f7d'
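
To make the mismatch in this output explicit, here is a minimal sketch that checks whether a torch / torchvision pair lines up. It assumes the release pairing torch 2.N <-> torchvision 0.(N+15), which holds for torch 2.0 through 2.7 (e.g. 2.4 <-> 0.19, 2.7 <-> 0.22). The helper names are hypothetical and not part of torch, torchvision, or torch_xla.

```python
# Assumption: torch 2.N pairs with torchvision 0.(N+15), based on the
# release pairings for torch 2.0 through 2.7. Not an official API.
TORCHVISION_MINOR_OFFSET = 15


def expected_torchvision_minor(torch_version: str) -> int:
    """Return the torchvision minor version assumed to pair with torch 2.x."""
    major, minor = torch_version.split("+")[0].split(".")[:2]
    if int(major) != 2:
        raise ValueError("pairing rule is only assumed for torch 2.x")
    return int(minor) + TORCHVISION_MINOR_OFFSET


def versions_match(torch_version: str, torchvision_version: str) -> bool:
    """Compare the installed torchvision minor against the expected one."""
    tv_minor = int(torchvision_version.split("+")[0].split(".")[1])
    return tv_minor == expected_torchvision_minor(torch_version)


# Versions reported by the nightly image above: torch 2.7 would be
# expected to pair with torchvision 0.22, not 0.19.
print(versions_match("2.7.0", "0.19.0a0+d23a6e1"))  # False
```

Under that assumption, the image pairs torch 2.7 with a torchvision build from the 0.19 development line, three minor versions behind.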

Looking up the commit ID in the upstream torchvision repo leads to a commit from May 2024. Is this intended, or should the image be updated to a more recent torchvision?

The git commit on torchvision: pytorch/vision@d23a6e1
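
For anyone repeating this lookup, a small sketch of how the commit hash can be pulled out of the version strings printed above. It assumes the local version suffix after `+` is the short git hash, optionally prefixed with `git` as in the torch_xla string. `split_dev_version` is a hypothetical helper, not an API of any of these packages.

```python
def split_dev_version(version: str):
    """Split 'X.Y.Z[a0]+<hash>' into (public_version, commit_hash_or_None)."""
    public, _, local = version.partition("+")
    # Assumption: torch_xla prefixes the hash with "git"; torchvision does not.
    commit = local[3:] if local.startswith("git") else local
    return public, (commit or None)


print(split_dev_version("0.19.0a0+d23a6e1"))   # ('0.19.0a0', 'd23a6e1')
print(split_dev_version("2.7.0+gita295f7d"))   # ('2.7.0', 'a295f7d')
```

The extracted hash can then be searched for in the corresponding upstream repository.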

Thanks

@hosseinsarshar hosseinsarshar changed the title An old torchvision commit is used in the nightly tpu image An older torchvision version is used in the nightly tpu image Jan 21, 2025
@miladm miladm added triaged This issue has been reviewed by the triage team and the appropriate priority assigned. model bug usability Bugs/features related to improving the usability of PyTorch/XLA labels Jan 21, 2025
@miladm
Collaborator

miladm commented Jan 21, 2025

cc @ysiraichi and @tengyifei to align on 2.6 wheel readiness and nightly wheel alignment. It seems the torchvision and torch_xla nightlies have been incompatible since ~20241216.
