
Using float 16 in training? #6

Open
sanjayss34 opened this issue Oct 30, 2020 · 2 comments

sanjayss34 commented Oct 30, 2020

I've noticed that during training some tensors have the float16 dtype, whereas in validation I only see float32. Is that in line with what you see, and is it intentional? I haven't found the part of the code that causes the float16 conversion; if there is such a conversion, could you please point me to where it is in the code?

tscholak (Contributor) commented Nov 1, 2020

Hi @sanjayss34, we are using PyTorch's automatic mixed precision mode (torch.cuda.amp), which was introduced in PyTorch 1.6 and originates from NVIDIA's APEX project. I recommend reading up on it here: https://pytorch.org/docs/stable/amp.html
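
For context, here is a minimal sketch of how torch.cuda.amp is typically wired into a training step. This is not duorat's actual code; the model, optimizer, and dummy data below are placeholders for illustration only:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

# Placeholder model, optimizer, and data -- not duorat's actual setup.
model = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loader = [(torch.randn(32, 128), torch.randint(0, 10, (32,))) for _ in range(4)]
scaler = GradScaler()  # scales the loss to avoid float16 gradient underflow

for inputs, targets in loader:
    inputs, targets = inputs.cuda(), targets.cuda()
    optimizer.zero_grad()

    # Ops inside autocast run in float16 where it is safe and float32 otherwise,
    # which is why float16 tensors show up during training.
    with autocast():
        outputs = model(inputs)
        loss = torch.nn.functional.cross_entropy(outputs, targets)

    scaler.scale(loss).backward()  # backward pass on the scaled loss
    scaler.step(optimizer)         # unscales gradients, then calls optimizer.step()
    scaler.update()                # adjusts the loss scale for the next iteration
```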

tscholak (Contributor) commented Nov 1, 2020

Regarding where and how automatic mixed precision is implemented in duorat, please have a look at the training loop, specifically around here: https://github.com/ElementAI/duorat/blob/master/scripts/train.py#L275
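
As for the float32-only observation during validation: if the evaluation path does not wrap its forward pass in autocast (an assumption here, but consistent with what was reported), the float32 parameters simply produce float32 activations. A hypothetical sketch, not duorat's actual eval code:

```python
import torch

# Hypothetical evaluation loop: without an autocast context, the model's
# float32 parameters produce float32 activations, so only float32 tensors
# appear during validation.
def evaluate(model, loader):
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():              # no autocast here -> plain float32 math
        for inputs, targets in loader:
            inputs, targets = inputs.cuda(), targets.cuda()
            outputs = model(inputs)    # outputs.dtype == torch.float32
            correct += (outputs.argmax(dim=-1) == targets).sum().item()
            total += targets.numel()
    return correct / total
```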
