feat: adds option to change JVM max heap size #97

Merged 5 commits on Nov 28, 2024

stulacy
Collaborator

@stulacy stulacy commented Nov 22, 2024

Addresses #94

@llwiggins I've added an option to change the JVM's default max heap size, which should help with processing large datasets. I haven't been able to test it though, as my Java install is currently not working.

Before launching into a full dataset process, can you try the following to see if it works?

Create a Python script with the code below and run it. It should print the max memory (in GB) available to the JVM by default (it should do, but I can't check this until I fix my Java).

from cellphe.imagej import setup_imagej
import scyjava

setup_imagej()
print(f"Max memory = {scyjava.memory_max() // 1024 // 1024 // 1024}")

Then try running it again, passing in the max amount of memory you want, which should then be displayed in the output message.

from cellphe.imagej import setup_imagej
import scyjava

setup_imagej(max_heap=6)
print(f"Max memory = {scyjava.memory_max() // 1024 // 1024 // 1024}")
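Because a JVM's heap limit is fixed when the JVM starts, the two checks above are easiest to compare when each runs in its own fresh Python process. A minimal sketch of that, assuming `cellphe` and `scyjava` are installed in the current environment; `run_fresh` and `CHECK` are hypothetical names introduced here, and `setup_imagej` may print extra lines of its own.

```python
import subprocess
import sys


def run_fresh(code: str) -> str:
    """Run a Python snippet in a brand-new interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Template for the check; the setup_imagej arguments are filled in per run.
CHECK = (
    "from cellphe.imagej import setup_imagej\n"
    "import scyjava\n"
    "setup_imagej({args})\n"
    "print(scyjava.memory_max() / 1024**3)\n"
)

# Example (needs cellphe and a working Java install):
# for args in ("", "max_heap=6"):
#     print(args or "default", "->", run_fresh(CHECK.format(args=args)))
```

Each call starts a new JVM, so a setting from an earlier run cannot carry over into the next one.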

By the way, were you on your laptop or Google Colab when you had this issue before?

@llwiggins
Collaborator

Hi @stulacy,

I've tried this code out but get the same output for max memory before and after setting a max_heap value. I also get the same output for each max_heap value that I try! I was using my laptop when I ran into the initial memory issues.

Kind regards,
Laura

@stulacy
Collaborator Author

stulacy commented Nov 22, 2024

I've fixed my Java install on my Linux laptop and increasing that heap parameter is working for me.

On Windows I can't get it to work at all, which I think is a problem with my Java setup. I haven't tried running the tracking stuff on there before. I'll try and fix it and document it for other Windows users.

Are you still testing it on your Mac, or are you using Google Colab? If the latter, I suspect that you won't be able to modify the JVM settings.
What is the default heap size?

@stulacy
Collaborator Author

stulacy commented Nov 27, 2024

Update: still unable to get scyjava working on Windows, let alone test whether the JVM max heap size can be configured.
This appears to be related to jpype-project/jpype#1242

@stulacy
Collaborator Author

stulacy commented Nov 27, 2024

Hi @llwiggins, I've put a fix in place so the PyImageJ integration for Tracking should now also work on Windows.

I can also confirm that the max_heap parameter added by this PR is working as expected on both Windows and Linux.
Is it still not working for you? And if so, are you on your Mac or on Google Colab?

@llwiggins
Collaborator

Weird! I'm using it on my Mac rather than through Google Colab.

For me, the statement below prints a value of 3:

setup_imagej()
print(f"Max memory = {scyjava.memory_max() // 1024 // 1024 // 1024}")

@stulacy
Collaborator Author

stulacy commented Nov 28, 2024

I also have a default 3GB heap on my laptop (16GB RAM total), so that sounds reasonable (and also relatively low so it's understandable how it runs out of memory on large datasets).

But it's still not changing when you try to increase it with the max_heap argument?

setup_imagej(max_heap=6)
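On the 3GB default mentioned above: HotSpot JVMs typically default the max heap to about a quarter of physical RAM, and the value a running JVM reports can sit slightly below the configured limit, so dividing by 1024 three times with `//` can round a heap of nearly 4 GiB down to 3. A minimal sketch of reporting the figure with decimals instead; `to_gib` is a hypothetical helper and the byte count is an illustrative number, not a measurement from this thread.

```python
def to_gib(n_bytes: int, ndigits: int = 2) -> float:
    """Convert a byte count to GiB, keeping the fractional part."""
    return round(n_bytes / 1024**3, ndigits)


# Illustrative value only: 64 MiB short of 4 GiB.
almost_4_gib = 4 * 1024**3 - 64 * 1024**2

print(almost_4_gib // 1024 // 1024 // 1024)  # floor division -> 3
print(to_gib(almost_4_gib))                  # -> 3.94
```

With the real check, `to_gib(scyjava.memory_max())` would show whether a change smaller than 1 GiB is being hidden by the rounding.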

@llwiggins
Collaborator

Here's the output from running the code in my .ipynb file on my Mac. The max memory doesn't seem to change even when I change the max_heap parameter.

Screenshot 2024-11-28 at 10 30 09

@stulacy
Collaborator Author

stulacy commented Nov 28, 2024

That's really strange. I thought it might be a Jupyter problem, but on both Windows and Linux I can change the max heap in Jupyter.

Can you try 2 things?

First try setting max_heap=1.

Then try running this in a cell and pasting the output here:

import subprocess
out = subprocess.run(["java", "-version"], capture_output=True)
print(out.stderr.decode("utf-8"))

I'm wondering if you're using a 32-bit JVM which limits you to a 3GB heap.

@llwiggins
Collaborator

Here's my output:

openjdk version "1.8.0_402"
OpenJDK Runtime Environment Corretto-8.402.08.1 (build 1.8.0_402-b08)
OpenJDK 64-Bit Server VM Corretto-8.402.08.1 (build 25.402-b08, mixed mode)
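The 32-bit question can also be answered programmatically from this output, since a 64-bit HotSpot JVM includes "64-Bit" in its VM line. A minimal sketch, using the output above as the sample; `describe_jvm` is a hypothetical helper, and other JVM vendors may format this banner differently.

```python
import re


def describe_jvm(version_text: str) -> dict:
    """Pull the version string and 64-bit marker out of `java -version` output."""
    match = re.search(r'version "([^"]+)"', version_text)
    return {
        "version": match.group(1) if match else None,
        "is_64_bit": "64-Bit" in version_text,
    }


sample = """openjdk version "1.8.0_402"
OpenJDK Runtime Environment Corretto-8.402.08.1 (build 1.8.0_402-b08)
OpenJDK 64-Bit Server VM Corretto-8.402.08.1 (build 25.402-b08, mixed mode)"""

print(describe_jvm(sample))  # {'version': '1.8.0_402', 'is_64_bit': True}
```

To use it on a live install, pass in the decoded `stderr` from the `subprocess.run(["java", "-version"], ...)` call shown earlier.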

@stulacy
Collaborator Author

stulacy commented Nov 28, 2024

Ah that kills that theory.

And did setting max_heap=1 still keep it at 3GB?

@llwiggins
Collaborator

Yep still says 3 😢

Screenshot 2024-11-28 at 12 20 39

@stulacy
Collaborator Author

stulacy commented Nov 28, 2024

Hmmm this might be another M1 Mac specific issue. I'll see if I can reproduce it on another machine.

@stulacy
Collaborator Author

stulacy commented Nov 28, 2024

I've had a go on someone's MacBook Pro 2021 in the office and annoyingly I can't reproduce this at all. What Mac are you on?
It was interesting that you're using an Amazon JDK, but even installing that exact same version on this MacBook I couldn't reproduce it.

A few things to try (clutching at straws now):

  1. Restart the kernel and change sj.config.add_option(f"-Xmx6g") to sj.config.add_option(f"-mx6g")
  2. Restart the kernel and change sj.config.add_option(f"-Xmx6g") to sj.config.set_heap_max(gb=6)
  3. In a terminal run java -XX:+PrintFlagsFinal -Xmx6g -version -verbose | grep HeapSize and look at the MaxHeapSize value.
  4. In a terminal run java -XX:+PrintFlagsFinal -mx6g -version -verbose | grep HeapSize and look at the MaxHeapSize value.
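For steps 3 and 4, the MaxHeapSize value is printed in bytes, which is awkward to read by eye. A minimal sketch of extracting it and converting to GiB; `max_heap_bytes` is a hypothetical helper, and the sample lines only illustrate the typical `-XX:+PrintFlagsFinal` layout (they are not output captured in this thread).

```python
import re


def max_heap_bytes(flags_text: str):
    """Return MaxHeapSize in bytes from -XX:+PrintFlagsFinal output, or None."""
    # Values are shown after "=" or ":=" depending on how the flag was set.
    match = re.search(r"\bMaxHeapSize\s*:?=\s*(\d+)", flags_text)
    return int(match.group(1)) if match else None


# Illustrative lines in the usual layout.
sample = (
    "    uintx InitialHeapSize    := 268435456     {product}\n"
    "    uintx MaxHeapSize        := 6442450944    {product}\n"
)

heap = max_heap_bytes(sample)
print(heap / 1024**3)  # 6.0
```

If the `-Xmx6g` flag is being honoured, the parsed value should come out at 6 GiB.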

I'm mostly convinced now that this PR will work for the majority of users. We're now just troubleshooting why it doesn't work for you, and whether there might be others it doesn't work for.

@llwiggins
Collaborator

Weird, I set up a new venv and tried running the lines of code again.
Screenshot 2024-11-28 at 16 20 22

It seems I can change the max memory, but it's a bit intermittent whether it behaves as expected. For example, when I set it back to max_heap=3 it stays at 6 until I restart my kernel/notebook.

@stulacy
Collaborator Author

stulacy commented Nov 28, 2024

Huh, that's strange... At least it works now!

Yep, it staying the same until the kernel/Python is restarted is expected behaviour. The JVM is only started once, at the start of the program, and the heap size is fixed when the JVM starts.

I'll make that change go live then.
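The start-once behaviour described above can be made visible to the caller rather than silently ignored. A minimal sketch, not the implementation in this PR: `queue_max_heap` is a hypothetical helper that takes the imported `scyjava` module as an argument and relies on its `jvm_started()` and `config.add_option()` calls.

```python
def queue_max_heap(sj, gb: int) -> bool:
    """Queue a -Xmx option unless the JVM is already running.

    `sj` is the imported scyjava module, passed in so the function can
    also be tried with a stand-in object. Returns True if the option was
    queued, False if the JVM had already started and it is too late.
    """
    if gb < 1:
        raise ValueError("gb must be a positive integer")
    if sj.jvm_started():
        # The heap limit was fixed at JVM start; only a restart changes it.
        return False
    sj.config.add_option(f"-Xmx{gb}g")
    return True


# Example (needs scyjava installed):
# import scyjava
# if not queue_max_heap(scyjava, 6):
#     print("JVM already running; restart the kernel to change the heap size")
```

Returning a flag keeps the decision with the caller, who can warn the user or ask for a kernel restart.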

@stulacy stulacy merged commit 38b3f3a into main Nov 28, 2024
10 checks passed
@stulacy stulacy deleted the feature/jvm-max-heap branch November 28, 2024 16:47