Training stylegan2 on TPU #24
reidsanders
asked this question in
Q&A
Hi @reidsanders! I only tested it on GPUs, not on TPUs, so I only provided instructions for GPUs. It is interesting that inference works on the TPU but training does not. Could you also post your Colab for training? Then I can take a look. Cheers!
Running, e.g., the StyleGAN2 tests on a TPU VM yields:
============================================== short test summary info ==============================================
FAILED tests/stylegan2/discriminator/test_discriminator.py::test_reference_output_afhqcat - assert DeviceArray(0.0...
FAILED tests/stylegan2/discriminator/test_discriminator.py::test_reference_output_afhqwild - assert DeviceArray(0....
FAILED tests/stylegan2/discriminator/test_discriminator.py::test_reference_output_car - assert DeviceArray(0.00418...
FAILED tests/stylegan2/discriminator/test_discriminator.py::test_reference_output_cat - assert DeviceArray(0.00065...
FAILED tests/stylegan2/discriminator/test_discriminator.py::test_reference_output_church - assert DeviceArray(0.00...
FAILED tests/stylegan2/discriminator/test_discriminator.py::test_reference_output_ffhq - assert DeviceArray(0.0080...
FAILED tests/stylegan2/discriminator/test_discriminator.py::test_reference_output_horse - assert DeviceArray(0.001...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_afhqdog - assert DeviceArray(0.00501162,...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_afhqwild - assert DeviceArray(0.00320451...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_car - assert DeviceArray(0.0042854, dtyp...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_cat - assert DeviceArray(0.00249642, dty...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_church - assert DeviceArray(0.00342116, ...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_cifar10 - assert DeviceArray(0.00351433,...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_ffhq - assert DeviceArray(0.0016813, dty...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_horse - assert DeviceArray(0.00304833, d...
FAILED tests/stylegan2/generator/test_generator.py::test_reference_output_metfaces - assert DeviceArray(0.00100499...
=============================== 16 failed, 6 passed, 2 warnings in 105.92s (0:01:45) ================================
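One possible explanation, offered as an assumption rather than a diagnosis: if these failures are small numerical mismatches against reference outputs (the truncated assertions suggest this but do not confirm it), reduced precision could be involved. By default, JAX on TPU runs float32 matrix multiplications and convolutions with bfloat16 inputs, so results can drift from references computed on GPU. The sketch below uses only the Python standard library to emulate bfloat16 by keeping the top 16 bits of a float32 (real hardware rounds to nearest, so truncation is a simplification) and prints the relative error this introduces in a dot product. The helpers `to_bfloat16` and `dot` are hypothetical and not part of the repository.

```python
import struct

def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by truncating a float32 to its top 16 bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def dot(a, b, cast=lambda v: v):
    # Inputs are cast first; products are accumulated at full precision.
    return sum(cast(x) * cast(y) for x, y in zip(a, b))

# Arbitrary example vectors (illustrative data, not from the repository).
a = [0.1 * (i + 1) for i in range(64)]
b = [1.0 / (i + 3) for i in range(64)]

exact = dot(a, b)
reduced = dot(a, b, cast=to_bfloat16)
rel_err = abs(exact - reduced) / abs(exact)
print(f"relative error with bfloat16 inputs: {rel_err:.2e}")
```

If precision does turn out to be the cause, comparing the test outputs with a looser tolerance, or requesting higher matmul precision in JAX, would be the natural next things to try.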
Interestingly, the TPU inference example notebook seems to work fine.
Attempting to train with
gives a NaN/inf error. I'm trying 256 to see if that works.
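To narrow down a NaN/inf error like this, it can help to find which value first becomes non-finite. JAX has a `jax_debug_nans` config flag for that purpose; the sketch below is a library-free illustration of the same idea, walking a nested dictionary of values and returning the path to the first non-finite entry. The function name `first_nonfinite` and the example `grads` tree are hypothetical, not taken from the repository.

```python
import math

def first_nonfinite(tree, path=()):
    """Return the path to the first NaN/inf leaf in a nested dict/list, or None."""
    if isinstance(tree, dict):
        items = tree.items()
    elif isinstance(tree, (list, tuple)):
        items = enumerate(tree)
    else:
        # Leaf value: report its path only if it is NaN or infinite.
        return None if math.isfinite(tree) else path
    for key, value in items:
        found = first_nonfinite(value, path + (key,))
        if found is not None:
            return found
    return None

# Made-up gradient tree for illustration only.
grads = {
    "generator": {"w": [0.1, -0.2], "b": [0.0]},
    "discriminator": {"w": [0.3, float("inf")], "b": [float("nan")]},
}
print(first_nonfinite(grads))
```

Running a check like this after each training step would show whether the generator or the discriminator diverges first.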
I've run this with both the versions in requirements.txt and the most recent versions of TensorFlow.
Is a different setting required when using a TPU? What could be causing the difference?
Thanks!