Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add heap size hint #2774

Merged
merged 13 commits into from
Jan 23, 2024
Merged

Conversation

kirill-kondrashov
Copy link
Contributor

@kirill-kondrashov kirill-kondrashov commented Jan 15, 2024

This MR intends to integrate the heap size hint:

that enables users to set a limit on memory usage, after which the garbage collector (GC) will work more aggressively to clean up unused memory.

For the Julia processes launched in Pluto.

Copy link
Contributor

Try this Pull Request!

Open Julia and type:

julia> import Pkg
julia> Pkg.activate(temp=true)
julia> Pkg.add(url="https://github.com/kir19890817/Pluto.jl", rev="add-julia-heap-size-hint")
julia> using Pluto

@pankgeorg
Copy link
Collaborator

One thing we've found to work better than heap-size-hint, is actually invoking the GC.gc() function on a background task, when memory pressure on a system is important and is creating problems.

@kirill-kondrashov
Copy link
Contributor Author

One thing we've found to work better than heap-size-hint, is actually invoking the GC.gc() function on a background task, when memory pressure on a system is important and is creating problems.

I see; it sounds like a good thing to check. Let me have a look.

@fonsp
Copy link
Owner

fonsp commented Jan 16, 2024

Can you make this PR again, but:

  • Avoid formatting unrelated lines of code
  • Only add the option to CompilerOptions, not also to ServerOptions
  • Make sure that the tests pass

I would recommend searching for math_mode and doing the exact same for heap_size_hint

@fonsp fonsp marked this pull request as draft January 16, 2024 22:39
@fonsp
Copy link
Owner

fonsp commented Jan 16, 2024

@pankgeorg That's a nice suggestion but let's track that in a new issue and PR! This is just about a command line flag that was added in Julia 1.9

@pankgeorg
Copy link
Collaborator

@pankgeorg That's a nice suggestion but let's track that in a new issue and PR! This is just about a command line flag that was added in Julia 1.9

True, this is completely unrelated to the PR, it's mostly an offtopic response that addresses the intent of the PR, not the content of it. I'll open a new issue, and probably also a PR for memory management.

@kirill-kondrashov
Copy link
Contributor Author

kirill-kondrashov commented Jan 17, 2024

Can you make this PR again, but:

* Avoid formatting unrelated lines of code

* Only add the option to `CompilerOptions`, not also to `ServerOptions`

* Make sure that the tests pass

I would recommend searching for math_mode and doing the exact same for heap_size_hint

Sure, thanks for comment! I've initially tried to put it into Draft: status by typing it in the title, but it seems to not affect the state of the PR on github... So thanks for changing the status.

@kirill-kondrashov
Copy link
Contributor Author

@pankgeorg That's a nice suggestion but let's track that in a new issue and PR! This is just about a command line flag that was added in Julia 1.9

True, this is completely unrelated to the PR, it's mostly an offtopic response that addresses the intent of the PR, not the content of it. I'll open a new issue, and probably also a PR for memory management.

Hey, @pankgeorg thanks for the suggestion again!

Could you please elaborate a bit on what you mean by background task here? As far as I can track (with my poor understanding of the codebase as a disclaimer), for instance, in case of running the notebooks for exporting, in PlutoSliderServer.jl in the function run_directory which is called by export_directory, the new 'process' is created:

new = process(s; server_session, settings, output_dir, start_dir, progress)

process is implemented in Actions.jl. It triggers Pluto.jl/src/webserver/SessionActions.jl open. Because it seems to be a separate process, there's no way to trigger GC.gc() somewhere 'in the background', outside of Pluto.jl.

@fonsp
Copy link
Owner

fonsp commented Jan 22, 2024

Thanks @kirill-kondrashov ! Let us know when the PR is ready

@pankgeorg
Copy link
Collaborator

@pankgeorg That's a nice suggestion but let's track that in a new issue and PR! This is just about a command line flag that was added in Julia 1.9

True, this is completely unrelated to the PR, it's mostly an offtopic response that addresses the intent of the PR, not the content of it. I'll open a new issue, and probably also a PR for memory management.

Hey, @pankgeorg thanks for the suggestion again!

Could you please elaborate a bit on what you mean by background task here? As far as I can track (with my poor understanding of the codebase as a disclaimer), for instance, in case of running the notebooks for exporting, in PlutoSliderServer.jl in the function run_directory which is called by export_directory, the new 'process' is created:

new = process(s; server_session, settings, output_dir, start_dir, progress)

process is implemented in Actions.jl. It triggers Pluto.jl/src/webserver/SessionActions.jl open. Because it seems to be a separate process, there's no way to trigger GC.gc() somewhere 'in the background', outside of Pluto.jl.

Indeed. Pluto does inject some code to the host process though, so it could also add a Timer(() -> GC.gc(), 3600.; interval=3600.), or something a little bit more complex. Pluto itself runs a webserver too, so Pluto.run() could also have a function that periodically does a full sweep.

The tricky part (which julia could/should do, but currently doesn't) is to track

  1. whether it's a good time to run the GC now (good time = not too stressed)
  2. whether there has been enough allocations to warrant a GC sweep.

I think the GC doesn't keep statistics of allocations around, so knowing [2] isn't very easy, at least not without consulting the kernel.

On the other hand, none of this should be Pluto's job in the first place right? This effort should most likely be another package.

@kirill-kondrashov kirill-kondrashov force-pushed the add-julia-heap-size-hint branch 2 times, most recently from 188e392 to b230cc3 Compare January 22, 2024 16:23
@kirill-kondrashov kirill-kondrashov marked this pull request as ready for review January 22, 2024 17:20
@kirill-kondrashov kirill-kondrashov changed the title Draft: Add heap size hint Add heap size hint Jan 22, 2024
@kirill-kondrashov
Copy link
Contributor Author

Thanks @kirill-kondrashov ! Let us know when the PR is ready

Seems to pass the tests now

@kirill-kondrashov
Copy link
Contributor Author

@pankgeorg That's a nice suggestion but let's track that in a new issue and PR! This is just about a command line flag that was added in Julia 1.9

True, this is completely unrelated to the PR, it's mostly an offtopic response that addresses the intent of the PR, not the content of it. I'll open a new issue, and probably also a PR for memory management.

Hey, @pankgeorg thanks for the suggestion again!
Could you please elaborate a bit on what you mean by background task here? As far as I can track (with my poor understanding of the codebase as a disclaimer), for instance, in case of running the notebooks for exporting, in PlutoSliderServer.jl in the function run_directory which is called by export_directory, the new 'process' is created:

new = process(s; server_session, settings, output_dir, start_dir, progress)

process is implemented in Actions.jl. It triggers Pluto.jl/src/webserver/SessionActions.jl open. Because it seems to be a separate process, there's no way to trigger GC.gc() somewhere 'in the background', outside of Pluto.jl.

Indeed. Pluto does inject some code to the host process though, so it could also add a Timer(() -> GC.gc(), 3600.; interval=3600.), or something a little bit more complex. Pluto itself runs a webserver too, so Pluto.run() could also have a function that periodically does a full sweep.

The tricky part (which julia could/should do, but currently doesn't) is to track

1. whether it's a good time to run the GC _now_ (good time = not too stressed)

2. whether there has been enough allocations to warrant a GC sweep.

I think the GC doesn't keep statistics of allocations around, so knowing [2] isn't very easy, at least not without consulting the kernel.

On the other hand, none of this should be Pluto's job in the first place right? This effort should most likely be another package.

Makes sense! I think Pluto, indeed, shouldn't be the first place; PlutoSliderServer seemed to me as a suitable place to insert that procedure.

@fonsp fonsp merged commit eb2ee11 into fonsp:main Jan 23, 2024
6 checks passed
@fonsp
Copy link
Owner

fonsp commented Jan 23, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants