In Chapter 2, the memory usage of compact form seems unreal. #13

3zhang · 2023-03-06T08:11:24Z

"For one block, this representation takes up only ~0.002 MB of memory." This is probably based on the result of pandas memory_usage. But if you sum the NumPy nbytes of the arrays in the energy_consumption column, it comes to ~10 MB:
sum([x.nbytes for x in block1_compact['energy_consumption']])/1024**2
Out[32]: 9.8807373046875

I think memory_usage only considers the array pointers in the column (even with deep=True). Also, if you save the table to disk, it still takes ~15 MB.
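A minimal sketch of how such a gap can arise, using synthetic data rather than the book's block1_compact. It assumes the arrays stored in the column are views into a larger buffer; that is only one possible explanation, not a confirmed diagnosis. For an object column, memory_usage(deep=True) adds up each element's reported object size, and a NumPy array that does not own its data reports only its header, not the buffer it points to. The sizes and the names base, cells and compact are hypothetical.

```python
import numpy as np
import pandas as pd

n_rows, n_points = 100, 10_000  # hypothetical sizes, not the book's data

# One large buffer; every cell below is a slice (a view) that does not own its data.
base = np.zeros(n_rows * n_points, dtype="float64")

# Fill an object array element by element so each cell holds one ndarray view.
cells = np.empty(n_rows, dtype=object)
for i in range(n_rows):
    cells[i] = base[i * n_points:(i + 1) * n_points]

compact = pd.DataFrame({"energy_consumption": cells})

# What pandas reports, even with deep=True.
reported = compact.memory_usage(deep=True).sum()

# What the arrays actually reference, summed the same way as in the comment above.
actual = sum(x.nbytes for x in compact["energy_consumption"])

print(f"pandas deep=True: {reported / 1024**2:.4f} MB")
print(f"sum of nbytes:    {actual / 1024**2:.4f} MB")
print("first cell owns its data:", compact["energy_consumption"].iloc[0].flags.owndata)
```

If each cell is replaced with an owning copy (for example `base[...].copy()`), the two numbers should move much closer together, which would be one way to check whether views explain the discrepancy.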

manujosephv · 2023-03-07T00:46:16Z

Interesting! I thought deep=True was supposed to give you a better approximation of the size.

Maybe running a memory profiler and calculating the difference, or using Pympler, would give us better answers? I would be more than happy to change the claim in the book with some more data points.

Personally, I'm inclined to think that the memory usage from pandas might be flawed as well.
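A rough sketch of the "measure the difference" idea, using the standard library's tracemalloc in place of an external profiler (Pympler is not used here). It builds a synthetic compact-form table, not the book's data, and compares the growth in traced memory with the summed nbytes of the arrays. The function build_compact and the sizes are hypothetical, and the sketch assumes NumPy reports its buffer allocations to tracemalloc.

```python
import tracemalloc

import numpy as np
import pandas as pd


def build_compact(n_rows=50, n_points=10_000):
    """Build a synthetic table with one owning float64 array per cell."""
    cells = np.empty(n_rows, dtype=object)
    for i in range(n_rows):
        cells[i] = np.ones(n_points, dtype="float64")
    return pd.DataFrame({"energy_consumption": cells})


# Record traced memory before and after building the table.
tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
compact = build_compact()
after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

grown_mb = (after - before) / 1024**2
nbytes_mb = sum(x.nbytes for x in compact["energy_consumption"]) / 1024**2
deep_mb = compact.memory_usage(deep=True).sum() / 1024**2

print(f"traced growth:    {grown_mb:.3f} MB")
print(f"sum of nbytes:    {nbytes_mb:.3f} MB")
print(f"pandas deep=True: {deep_mb:.3f} MB")
```

Comparing the three numbers on the real block would give the extra data points mentioned above without relying on any single measurement.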
