In Chapter 2, the memory usage of compact form seems unreal. #13

Open
3zhang opened this issue Mar 6, 2023 · 1 comment


3zhang commented Mar 6, 2023

"For one block, this representation takes up only ~0.002 MB of memory." This figure is probably based on the result of pandas memory_usage. But if you sum the NumPy nbytes of the arrays in the energy consumption column, it comes to ~10 MB:
sum([x.nbytes for x in block1_compact['energy_consumption']])/1024**2
Out[32]: 9.8807373046875

I think memory_usage only considers the array pointers in the column (even with deep=True). Also, if you save the table to disk, it still takes ~15 MB.
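For anyone who wants to reproduce the gap, here is a minimal sketch on synthetic data. The sizes, the `object_column` helper, and the variable names are made up for illustration and are not from the book's notebook. It shows one way such a gap can arise: without `deep=True`, pandas counts only the object pointers, and with `deep=True` it adds each element's `__sizeof__()`, which for a NumPy array that is a view of another array does not include the data buffer. Whether the arrays in `block1_compact` are views is an assumption I have not checked.

```python
import numpy as np
import pandas as pd

# Hypothetical sizes, chosen only to keep the example small.
n_rows, n_points = 100, 1000

# One contiguous float64 buffer; each row slice below is a view into it.
base = np.zeros((n_rows, n_points))


def object_column(arrays):
    """Build an object-dtype Series whose elements are the given arrays."""
    col = np.empty(len(arrays), dtype=object)
    for i, arr in enumerate(arrays):
        col[i] = arr  # stores the array object itself, no copy
    return pd.Series(col, name="energy_consumption")


views = object_column([base[i] for i in range(n_rows)])         # do not own data
owned = object_column([base[i].copy() for i in range(n_rows)])  # own their data

# Bytes actually held by the array payloads (same for both columns).
payload = sum(x.nbytes for x in views)

# What pandas reports for the column, excluding the index.
shallow = views.memory_usage(index=False, deep=False)
deep_views = views.memory_usage(index=False, deep=True)
deep_owned = owned.memory_usage(index=False, deep=True)

print(f"payload (sum of nbytes):      {payload:>9,d} bytes")
print(f"memory_usage(deep=False):     {shallow:>9,d} bytes")
print(f"memory_usage(deep=True) view: {deep_views:>9,d} bytes")
print(f"memory_usage(deep=True) own:  {deep_owned:>9,d} bytes")
```

If the real column behaves like `views` here, that would explain why `deep=True` still reports far less than the summed `nbytes`.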

@manujosephv
Collaborator

Interesting! I thought deep=True was supposed to give you a better approximation of the size.

Maybe running a memory profiler and calculating the difference, or using Pympler, might give us better answers? I would be more than happy to change the claim in the book with some more data points.
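As a rough sketch of the "measure the difference" idea, the snippet below uses the standard library's `tracemalloc` as a stand-in for a dedicated memory profiler (it is not Pympler). It builds a synthetic compact-form frame and compares the traced growth with the summed `nbytes`. The sizes and names are hypothetical, and the arrays are freshly allocated here, which may differ from how the book builds them.

```python
import tracemalloc

import numpy as np
import pandas as pd

# Hypothetical sizes; each array is 8,000 bytes of float64.
n_rows, n_points = 100, 1000

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

# Build an object column holding one array per row (the compact form).
col = np.empty(n_rows, dtype=object)
for i in range(n_rows):
    col[i] = np.full(n_points, float(i))
frame = pd.DataFrame({"energy_consumption": col})

after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Memory that was allocated while building the frame and is still alive.
grew = after - before
# Bytes held by the array payloads alone.
payload = sum(x.nbytes for x in frame["energy_consumption"])

print(f"traced growth:   {grew / 1024**2:.3f} MiB")
print(f"sum of nbytes:   {payload / 1024**2:.3f} MiB")
```

The traced growth gives an independent number to set against both `memory_usage` and the `nbytes` sum, since NumPy reports its data allocations to `tracemalloc`.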

Personally, I'm inclined to think that the memory usage from pandas might be flawed as well.
