loss : nan when training on a custom data set #6
Comments
Hi, can you provide a small reproducer for this bug?
Sorry @BelBES, could you please explain what you mean by "small reproducer"? FYI, this is the structure of my custom data set:
And, this is structure of my custom
In But still have same
When I tried to debug using
Note: I set cuda = False on my CPU dev laptop, but cuda = True on the GPU server above.
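For narrowing this down into a small reproducer, it may help to find the first training step at which the loss stops being finite, so that the offending batch can be saved and shared. The sketch below is only an illustration: `first_non_finite` is a hypothetical helper, not part of this repository, and it assumes each loss has already been converted to a plain Python float (for example via `loss.item()` in PyTorch 0.4).

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None.

    `losses` is any iterable of plain Python floats. This helper is a
    hypothetical debugging aid and does not come from the project itself.
    """
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


if __name__ == "__main__":
    # Made-up values for illustration only, not real training output.
    recorded = [2.31, 1.87, float("nan"), float("nan")]
    print(first_non_finite(recorded))
```

If the returned index is 0, the problem is probably already present in the very first batch, which would point at the data or labels rather than at a diverging optimizer.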
Hi @BelBES
I tried several batch sizes (8, 16, 32, 64, 128, 256), but training always ends with
loss : nan
in every epoch when training on my custom data set.
python train.py --data-path datatrain --test-init True --test-epoch 10 --output-dir snapshot --abc 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz:/. --batch-size 8
I am using PyTorch 0.4, Python 3.6, GTX 1080 Ti and Ubuntu 16.04
Can you help me solve this problem?
Kind regards
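One thing that may be worth ruling out with a custom data set is whether every label uses only characters from the alphabet passed to `--abc`. The sketch below counts characters that fall outside that alphabet. It assumes the labels are available as a list of plain strings; `unsupported_chars` is a hypothetical helper, not a function from this repository, and how the training script actually handles unknown characters is not something this sketch claims to know.

```python
from collections import Counter

# Alphabet copied from the --abc argument in the command above.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz:/."


def unsupported_chars(labels, alphabet=ALPHABET):
    """Count characters in `labels` that are not part of `alphabet`.

    Returns a dict mapping each unsupported character to how often it
    occurs across all labels. An empty dict means every label is covered.
    """
    allowed = set(alphabet)
    counts = Counter()
    for text in labels:
        counts.update(ch for ch in text if ch not in allowed)
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical labels for illustration; replace with the real ones.
    sample_labels = ["AB12", "a b", "x-y-z"]
    print(unsupported_chars(sample_labels))
```

Spaces, hyphens, and other punctuation are easy to miss here, since the alphabet above contains only digits, ASCII letters, and the three characters `:`, `/`, and `.`.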