Converting Models is High Memory Consuming... #11

ClaudeCoulombe opened this issue Jul 18, 2017 · 6 comments

ClaudeCoulombe commented Jul 18, 2017

Greetings,

I'm trying to convert models built with AdaGram.jl (Julia) to JSON and then to a Python model, as explained in the README.rst.

I've used a pretty big model, huang_super_300D_0.2_min20_hs_t1e-17.model, which has a file size of over 3.3 GB. The conversion to JSON gives two files: a 4.7 MB id2word.json and a 23 GB vm.json.

To convert the model, my command is:

```
sudo nohup python3 ./adagram/load_julia.py ./ model.joblib &
```

How much RAM should it take to convert the JSON files to the Python model?

Also, I did not understand why the Python model gets a .joblib extension; I expected a .pkl extension.


lopuhin commented Jul 18, 2017

@ClaudeCoulombe unfortunately, the expected RAM usage in this case is quite big, probably bigger than the size of the model. Someone suggested a more memory-efficient way of converting the models; if I recall correctly, it used the HDF format, but I can't find that suggestion at the moment.
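
A minimal sketch of that idea, assuming h5py is installed; the file layout and dataset names here are hypothetical, not code from this repository:

```python
# Sketch: store the model's large arrays in HDF5 instead of JSON, so
# loading reads binary data directly into numpy arrays without parsing
# a 23 GB text file or building float64 intermediates.
import h5py
import numpy as np

def save_arrays(path, in_matrix, out_matrix, counts):
    with h5py.File(path, 'w') as f:
        f.create_dataset('In', data=in_matrix, compression='gzip')
        f.create_dataset('Out', data=out_matrix, compression='gzip')
        f.create_dataset('counts', data=counts)

def load_arrays(path):
    with h5py.File(path, 'r') as f:
        # [...] reads each dataset straight into a numpy array of the
        # stored dtype (e.g. float32), with no text-parsing step.
        return f['In'][...], f['Out'][...], f['counts'][...]
```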


ClaudeCoulombe commented Jul 18, 2017

Greetings @lopuhin,

Thanks for your quick answer.

I've tried on a server with a huge amount of memory (64 GB of RAM); the code managed to fill all of it and raised a MemoryError...

The culprit seems to be `rand_arr` in `utils.py`, which returns `(np.array(np.random.rand(*shape), dtype=dtype) - 0.5) * norm`.

You probably understand the problem better than I do. Maybe a generator with yield could help in that case, but I'm not sure.

Below is the traceback:

```
Traceback (most recent call last):
  File "./adagram/load_julia.py", line 40, in <module>
    main()
  File "./adagram/load_julia.py", line 27, in main
    alpha=vm_data['alpha'])
  File "/usr/local/lib/python3.5/dist-packages/adagram/model.py", line 83, in __init__
    self.In = rand_arr((N, prototypes, dim), 1. / dim, np.float32)
  File "/usr/local/lib/python3.5/dist-packages/adagram/utils.py", line 8, in rand_arr
    return (np.array(np.random.rand(*shape), dtype=dtype) - 0.5) * norm
  File "mtrand.pyx", line 1347, in mtrand.RandomState.rand (numpy/random/mtrand/mtrand.c:19701)
  File "mtrand.pyx", line 856, in mtrand.RandomState.random_sample (numpy/random/mtrand/mtrand.c:15527)
  File "mtrand.pyx", line 167, in mtrand.cont0_array (numpy/random/mtrand/mtrand.c:6127)
MemoryError
```
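
For context, `np.random.rand(*shape)` first allocates the whole array in float64, the `np.array(..., dtype=np.float32)` cast then makes a second copy, and the subtraction and multiplication create further temporaries, so peak usage is several times the size of the final array. A minimal sketch of a lower-peak-memory variant (the name `rand_arr_lowmem` and the chunk size are hypothetical, not code from the repository):

```python
import numpy as np

def rand_arr_lowmem(shape, norm, dtype=np.float32, chunk=1000000):
    """Like rand_arr, but fills a preallocated array chunk by chunk,
    so the float64 temporaries stay small and peak memory remains
    close to the size of the final float32 array."""
    out = np.empty(shape, dtype=dtype)
    flat = out.reshape(-1)  # flat view into out, no copy
    for start in range(0, flat.size, chunk):
        stop = min(start + chunk, flat.size)
        block = np.random.rand(stop - start)     # small float64 temporary
        flat[start:stop] = (block - 0.5) * norm  # cast to float32 on assignment
    return out
```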


ahmarz commented Nov 29, 2017

Any updates regarding this issue? I expected a .pkl extension as well, but I get .joblib. Any ideas how I can get a .pkl file for the model?

Thanks in advance


lopuhin commented Nov 29, 2017

@ahmarz this is likely a different issue. The .joblib model should work fine.


ahmarz commented Nov 29, 2017

Thank you for your fast reply.

Upon executing "adagram/load_julia.py ...", I get about 9 .joblib files. I was wondering if there is a way to combine or convert them into a single .pkl file? Any ideas?


lopuhin commented Nov 29, 2017

I see - you could load them in Python with joblib, and then save with pickle or with different joblib options (there should be an option that gives a single file).
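
A minimal sketch of that, assuming the main dump is model.joblib and its companion .npy files sit in the same directory (filenames are illustrative):

```python
import pickle
import joblib

# joblib.load picks up the companion *.npy files automatically,
# as long as they sit next to the main .joblib file.
model = joblib.load('model.joblib')

# Option 1: re-save everything as a single pickle file.
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Option 2: stay with joblib; passing compress makes joblib.dump
# write one compressed file instead of separate .npy files.
joblib.dump(model, 'model_single.joblib', compress=3)
```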
