I'm trying to work through the CNN code on p. 232 of NLPIA and the get_data() function is getting hung up. The pip install of nlpia seemed to be fine.
Here's the offending line (changing the limit setting doesn't seem to change anything; I have gone as low as 5000):
word_vectors = get_data('w2v', limit=50000)
I also see this output the first time I run it:
2019-11-13 14:09:23,227 WARNING:nlpia.constants:107: Starting logger in nlpia.constants...
I'm running Ubuntu 16.04 and using the Spyder IDE. Any suggestions?
The warning is not a bug, just a bit too verbose. We've gotten rid of it in the latest release.
Unfortunately, the word2vec file that Google provides is compressed in a way that does not allow a partial download. So the "hangup" may be the download from Dropbox, where we store the w2v file. You'll need a machine with enough disk space and internet bandwidth to download the entire file. The limit arg only reduces the amount of RAM consumed. It is implemented inside the gensim KeyedVectors class, and we just pass it through, so we can't control how it works or whether it effectively limits the RAM consumed within the gensim code. You may need a machine with more RAM in order to experiment with CNNs and NLP.
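To illustrate why limit can only save RAM and not download time, here is a minimal sketch of a reader for the word2vec binary layout that stops after the first limit entries. It is not gensim's code and not what get_data runs. The function name read_word2vec_binary is made up for this example, and the sketch assumes little-endian float32 values and a header line of the form "vocab_size vector_size". Note that the reader needs the whole decompressed file on disk before it can start, even if it only keeps a few vectors in memory.

```python
import struct
from typing import BinaryIO, Dict, List, Optional


def read_word2vec_binary(stream: BinaryIO,
                         limit: Optional[int] = None) -> Dict[str, List[float]]:
    """Read at most `limit` word vectors from a word2vec binary stream.

    Hypothetical helper for illustration only; assumes little-endian float32.
    """
    # The header is one text line: "<vocab_size> <vector_size>\n".
    header = stream.readline().decode("utf-8").split()
    vocab_size, vector_size = int(header[0]), int(header[1])
    count = vocab_size if limit is None else min(limit, vocab_size)
    bytes_per_vector = 4 * vector_size  # each component is a 4-byte float

    vectors: Dict[str, List[float]] = {}
    for _ in range(count):
        # Each word is a byte string terminated by a single space.
        word_bytes = bytearray()
        while True:
            ch = stream.read(1)
            if not ch:
                raise EOFError("unexpected end of file while reading a word")
            if ch == b" ":
                break
            if ch != b"\n":  # some writers emit a newline before each word
                word_bytes.extend(ch)
        raw = stream.read(bytes_per_vector)
        if len(raw) != bytes_per_vector:
            raise EOFError("unexpected end of file while reading a vector")
        word = word_bytes.decode("utf-8", errors="ignore")
        vectors[word] = list(struct.unpack("<%df" % vector_size, raw))
    # Everything after the first `count` entries is never read into memory.
    return vectors
```

With gensim installed and the file already downloaded and decompressed, the corresponding direct call is KeyedVectors.load_word2vec_format(path, binary=True, limit=50000), where path is wherever the .bin file ended up on your machine.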
If you use Anaconda, you can install nlpia in a Python 3.6 environment. It has not been tested on Python 3.7, and this may be why it is hanging up on you. In Python 3.7 the re package seems to have a problem with the regular expressions we use to change the filenames during decompression. I'll check it and make sure there's not a bug in get_data.
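A quick way to confirm which interpreter Spyder is actually running is a check like the one below, run in the same console before calling get_data. This is only a sketch: the helper name is_tested_python is invented here, and the (3, 6) constant simply encodes the version mentioned above.

```python
import sys

# Assumption taken from this thread: nlpia has been tested on Python 3.6 only.
TESTED_PYTHON = (3, 6)


def is_tested_python(version_info=sys.version_info):
    """Return True if the major.minor version matches the tested one."""
    return tuple(version_info[:2]) == TESTED_PYTHON


if not is_tested_python():
    print("Running Python %d.%d; nlpia was tested on %d.%d."
          % (sys.version_info[0], sys.version_info[1],
             TESTED_PYTHON[0], TESTED_PYTHON[1]))
```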