modulenorm = p.grad.data.norm() AttributeError: 'NoneType' object has no attribute 'data' #6

Open
monajalal opened this issue Mar 8, 2018 · 2 comments

@monajalal

I am not sure how to fix this. Any idea?

[jalal@goku pytorch_sentiment_rnn]$ /scratch/sjn-p2/anaconda/anaconda2/bin/python train.py --batch-size 20 --rnn_type GRU --cuda --gpu 1 --lr 0.0001 --mdl RNN --clip_norm 1 --opt Adam
Using TensorFlow backend.
RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa
RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa
ERROR; return code from pthread_create() is 11
	Error detail: Resource temporarily unavailable
[jalal@goku pytorch_sentiment_rnn]$ /scratch/sjn-p2/anaconda/anaconda2/bin/python train.py --batch-size 20 --rnn_type GRU --cuda --gpu 1 --lr 0.0001 --mdl RNN --clip_norm 1 --opt Adam
Using TensorFlow backend.
RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa
RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa
There are 2 CUDA devices
Setting torch GPU to 1
Using device:1 
Stored Environment:['term_len', 'word_index', 'glove', 'max_len', 'train', 'dev', 'test', 'index_word']
Loaded environment
Creating Model...
Setting Pretrained Embeddings
Initialized GRU model
Starting training
Namespace(aggregation='mean', attention_width=5, batch_size=20, clip_norm=1, cuda=True, dataset='Restaurants', dev=1, dropout_prob=0.5, embedding_size=300, epochs=50, eval=1, gpu=1, hidden_layer_size=300, l2_reg=0.0, learn_rate=0.0001, log=1, maxlen=0, mode='term', model_type='RNN', opt='Adam', pretrained=1, rnn_direction='uni', rnn_layers=1, rnn_size=300, rnn_type='GRU', seed=1111, term_model='mean', toy=False, trainable=1)
========================================================================
/scratch2/debate_tweets/sentiment/pytorch_sentiment_rnn/models/rnn.py:51: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  decoded = self.softmax(decoded)
Traceback (most recent call last):
  File "train.py", line 343, in <module>
    exp.train()
  File "train.py", line 326, in train
    loss = self.train_batch(i)
  File "train.py", line 303, in train_batch
    coeff = clip_gradient(self.mdl, self.args.clip_norm)
  File "train.py", line 35, in clip_gradient
    modulenorm = p.grad.data.norm()
AttributeError: 'NoneType' object has no attribute 'data'
[jalal@goku pytorch_sentiment_rnn]$ vi train.py 

@monajalal
Author

Even though I fixed the NumPy RuntimeError by running the two conda commands below, I still get the same AttributeError:

sudo conda install anaconda
sudo conda update numpy
python train.py --batch-size 20 --rnn_type GRU --cuda --gpu 1 --lr 0.0001 --mdl RNN --clip_norm 1 --opt Adam
/scratch/sjn-p2/anaconda/anaconda2/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
There are 2 CUDA devices
Setting torch GPU to 1
Using device:1 
Stored Environment:['term_len', 'word_index', 'glove', 'max_len', 'train', 'dev', 'test', 'index_word']
Loaded environment
Creating Model...
Setting Pretrained Embeddings
Initialized GRU model
Starting training
Namespace(aggregation='mean', attention_width=5, batch_size=20, clip_norm=1, cuda=True, dataset='Restaurants', dev=1, dropout_prob=0.5, embedding_size=300, epochs=50, eval=1, gpu=1, hidden_layer_size=300, l2_reg=0.0, learn_rate=0.0001, log=1, maxlen=0, mode='term', model_type='RNN', opt='Adam', pretrained=1, rnn_direction='uni', rnn_layers=1, rnn_size=300, rnn_type='GRU', seed=1111, term_model='mean', toy=False, trainable=1)
========================================================================
/scratch2/debate_tweets/sentiment/pytorch_sentiment_rnn/models/rnn.py:51: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  decoded = self.softmax(decoded)
Traceback (most recent call last):
  File "train.py", line 343, in <module>
    exp.train()
  File "train.py", line 326, in train
    loss = self.train_batch(i)
  File "train.py", line 303, in train_batch
    coeff = clip_gradient(self.mdl, self.args.clip_norm)
  File "train.py", line 35, in clip_gradient
    modulenorm = p.grad.data.norm()
AttributeError: 'NoneType' object has no attribute 'data'

@monajalal
Author

I am not sure whether I solved the issue, because I do not get the accuracy reported in the README, but here is what I did.
First, I added an if p.grad is not None: check in clip_gradient:

import math

def clip_gradient(model, clip):
    """Computes a gradient clipping coefficient based on gradient norm."""
    totalnorm = 0
    for p in model.parameters():
        # Parameters that received no gradient have p.grad set to None; skip them.
        if p.grad is not None:
            modulenorm = p.grad.data.norm()
            totalnorm += modulenorm ** 2
    totalnorm = math.sqrt(totalnorm)
    return min(1, clip / (totalnorm + 1e-6))

Then I added the same if p.grad is not None: check in train_batch:

def train_batch(self, i):
    """Trains a regular RNN model on batch i and returns the loss."""
    sentence, targets, actual_batch = self.make_batch(self.train_set, i)
    if sentence is None:
        return None
    hidden = self.mdl.init_hidden(actual_batch)
    hidden = repackage_hidden(hidden)
    self.mdl.zero_grad()
    output, hidden = self.mdl(sentence, hidden)
    loss = self.criterion(output, targets)
    loss.backward()
    if self.args.clip_norm > 0:
        coeff = clip_gradient(self.mdl, self.args.clip_norm)
        for p in self.mdl.parameters():
            # Scale only the parameters that actually received a gradient.
            if p.grad is not None:
                p.grad.mul_(coeff)
    self.optimizer.step()
    return loss.data[0]
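
To check whether skipping None gradients changes the clipping itself, here is a minimal sketch of the same coefficient on plain Python lists instead of tensors. The function name clip_coefficient and the list-of-floats representation are assumptions made for illustration only; they are not part of train.py. It suggests that a skipped parameter has the same effect on the norm as an all-zero gradient, so the None check on its own should not alter the coefficient for the remaining parameters.

```python
import math

def clip_coefficient(grad_vectors, clip):
    """Global-norm clipping coefficient over a list of gradients.

    Each entry is a list of floats, or None for a parameter that received
    no gradient (a hypothetical stand-in for p.grad in clip_gradient).
    """
    total_sq = 0.0
    for g in grad_vectors:
        if g is None:
            continue  # adds nothing to the norm, same as an all-zero gradient
        total_sq += sum(x * x for x in g)
    total_norm = math.sqrt(total_sq)
    return min(1.0, clip / (total_norm + 1e-6))

# One parameter with gradient norm 5, plus one parameter with no gradient.
with_none = clip_coefficient([[3.0, 4.0], None], clip=1.0)
# The same case with the missing gradient written out as zeros.
with_zeros = clip_coefficient([[3.0, 4.0], [0.0, 0.0]], clip=1.0)
# A small gradient is left unscaled (coefficient capped at 1).
small = clip_coefficient([[0.1]], clip=1.0)
```

Both with_none and with_zeros come out at roughly 0.2, which is clip divided by the norm of 5.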

The accuracies I get are as follows:
for RNN model:

python train.py --batch-size 20 --rnn_type GRU --cuda --gpu 1 --lr 0.0001 --mdl RNN --clip_norm 1 --opt Adam
[Epoch 50] Train Loss=0.953654762131 T=0.51s
Test loss=0.90144520998
Output Distribution={2: 1120}
Accuracy=0.65

for TD-RNN model:

python train.py --batch-size 20 --rnn_type GRU --cuda --gpu 1 --lr 0.0001 --mdl TD-RNN --clip_norm 1 --opt Adam
[Epoch 50] Train Loss=0.64427837713 T=0.99s
Test loss=0.828059911728
Output Distribution={0: 165, 1: 138, 2: 817}
Accuracy=0.719642857143

However, the accuracy reported in the GitHub README for the RNN model is:

[Epoch 50] Train Loss=0.680990989366
Test loss=0.810974478722
Output Distribution={0: 158, 1: 158, 2: 804}
Accuracy=0.733035714286

What could cause such a difference in accuracy, and how would you fix the error without a drop in accuracy? I am not sure whether my fix is what causes the drop.
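
One detail in the logs above may help narrow this down: the RNN run reports Output Distribution={2: 1120}, meaning every test example was assigned class 2, while the TD-RNN run and the README run spread predictions over three classes. The sketch below is only a reading of those printed numbers; the helper name is_collapsed is hypothetical. If all 1120 predictions are class 2 and accuracy is 0.65, then about 728 test examples carry label 2, so 0.65 would simply be the share of that class.

```python
def is_collapsed(output_distribution):
    """True when every prediction falls into a single class."""
    nonzero = [cls for cls, count in output_distribution.items() if count > 0]
    return len(nonzero) == 1

# Output distributions copied from the two runs in this issue.
rnn_run = {2: 1120}
td_rnn_run = {0: 165, 1: 138, 2: 817}

rnn_collapsed = is_collapsed(rnn_run)
td_rnn_collapsed = is_collapsed(td_rnn_run)

# With every prediction equal to class 2, accuracy is the fraction of
# test examples labelled 2, so 0.65 * 1120 gives that class count.
implied_class2 = round(0.65 * 1120)
```

This does not show what causes the collapse; it only suggests the RNN run may not be learning to separate the classes, which is a different question from whether the clipping fix is correct.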

I took the fix from this thread: https://discuss.pytorch.org/t/model-parameters-is-none-while-training/6830/2?u=monajalal
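
To see which parameters are affected, one option is to list the names whose .grad is still None after loss.backward(). In PyTorch that happens when a parameter has requires_grad=False or did not take part in computing the loss. The following is a minimal sketch: the helper names_without_grad and the stand-in parameter names are hypothetical, and the function only assumes it is given (name, param) pairs such as those yielded by model.named_parameters().

```python
from types import SimpleNamespace

def names_without_grad(named_params):
    """Return the names of parameters whose .grad is None.

    named_params: iterable of (name, param) pairs, for example what
    model.named_parameters() yields in PyTorch.
    """
    return [name for name, p in named_params if p.grad is None]

# Stand-in objects with made-up names so the sketch runs without PyTorch.
fake_params = [
    ("embedding.weight", SimpleNamespace(grad=None)),
    ("rnn.weight_ih_l0", SimpleNamespace(grad=[0.1, -0.2])),
]
missing = names_without_grad(fake_params)
```

In train_batch this would be called right after loss.backward(), for example names_without_grad(self.mdl.named_parameters()), to check whether the skipped parameters are ones that are expected to be unused in the selected model type.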
