Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support using a shared sandbox instance #71

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

itegulov
Copy link
Contributor

@itegulov itegulov commented Feb 8, 2022

Fixes #64

Creating this raw draft just to share the direction I am going in. I have tested the shared sandbox in its current state and it seems to be working fine (after I fixed the issue with TLA account creation race condition; see the second commit for more details).

@ChaoticTempest
Copy link
Member

This looks like it's going generally in the right direction. There's a couple of things we can change up since we want to switch the defaults to the shared instance. sandbox_multi could be alternative to sandbox_shared where we can have sandbox just be controlled by a feature/env flag to choose between the two

@itegulov
Copy link
Contributor Author

itegulov commented Feb 9, 2022

@ChaoticTempest

sandbox_multi could be alternative to sandbox_shared where we can have sandbox just be controlled by a feature/env flag to choose between the two

Done. I have introduced a sandbox-multi feature (any suggestions for a better name?) that controls which file out of sandbox_shared.rs and sandbox_multi.rs gets compiled and uses the struct from that file for Sandbox from sandbox.rs. But I wonder if it would make sense to just move Sandbox inside sandbox_shared.rs/sandbox_multi.rs and completely remove sandbox.rs. This way we would not need this intermediary layer of SandboxMulti and SandboxShared. As always any feedback is welcome.

Also, I found out that Rust never calls drop on static objects, so SHARED_SERVER never gets stopped and just hangs as an orphan process... Still looking for an alternative way to do this.

@itegulov itegulov force-pushed the daniyar/sandbox-shared branch from 70e76e3 to 01a28cd Compare February 9, 2022 07:46
@ChaoticTempest
Copy link
Member

ChaoticTempest commented Feb 9, 2022

But I wonder if it would make sense to just move Sandbox inside sandbox_shared.rs/sandbox_multi.rs and completely remove sandbox.rs. This way we would not need this intermediary layer of SandboxMulti and SandboxShared

Thinking about this again, since we're adding all these structs, it might just be simplest to purely have a single Sandbox struct that controls how the server gets spun up. Like in Sandbox::new(), we can choose to startup a new sandbox server or not if there's a shared server based on a flag. Stuff like TLA_ACCOUNT_MUTEX.lock() can just be behind the flag too.

I have introduced a sandbox-multi feature (any suggestions for a better name?)

yeah, multi might not be the best name. How about sandbox-parallel?

Also, I found out that Rust never calls drop on static objects, so SHARED_SERVER never gets stopped and just hangs as an orphan process... Still looking for an alternative way to do this.

Let me know if you hit a snag on this one. It's pretty complicated one to try to solve. Was trying to accomplish similar things on one of the first iterations of workspaces, but at-exit hooks are non-existent in rust

@itegulov itegulov force-pushed the daniyar/sandbox-shared branch from 01a28cd to 3cc4eb8 Compare February 15, 2022 06:04
@itegulov
Copy link
Contributor Author

@ChaoticTempest I have managed to resolve the problem in a roundabout way with the divine intervention help from @matklad. Basically I made so that all sandbox instances are spawned with their stdin redirected to a pipe and made so that they shutdown when stdin closes. The nice part is that once a parent process dies OS automatically closes all of its pipes ends, meaning that all child sandboxes will die with it. This is helpful even for a multi-sandbox mode as you can't really rely on Drop to terminate child processes.

You can see a preview of changes I am proposing in this sandbox branch: https://github.com/near/sandbox/tree/daniyar/autocloseable. If you agree with the direction, I will open another PR to sandbox so we can discuss implementation details.

Also I collapsed two sandbox implementations into one as you suggested. Personally I think it looks better now.

@ChaoticTempest
Copy link
Member

@itegulov very interesting approach. I like it, but there's one concern I have with it since we need to make it cross platform, and there needs to be an equivalent way to kill the process for windows as well. I wonder if it would work the same with the equivalent windows kill command

@itegulov
Copy link
Contributor Author

@ChaoticTempest I have access to a windows PC, so I am going to test it later today.

@matklad
Copy link
Contributor

matklad commented Feb 17, 2022

Can we just modify the sandbox binary to kill itself when stdin closes? That way, we won't need an intermediate shell.

@itegulov
Copy link
Contributor Author

itegulov commented Feb 17, 2022

@matklad Yeah, the current state of https://github.com/near/sandbox/tree/daniyar/autocloseable is mostly a proof of concept. I just picked the path of least friction for now (otherwise I would have had to make a PR to nearcore). I will open a separate PR in sandbox repo and we can discuss which implementation makes the most sense. I am not 100% sure what kind of implications would be if we made NEAR node kill itself when stdin closes (but I guess we can always make it a configurable behaviour via a CLI parameter).

@itegulov
Copy link
Contributor Author

itegulov commented Feb 17, 2022

@ChaoticTempest so I tried to test this on Windows, but I forgot that sandbox does not support Windows platform right now (probably because we don't publish binaries for it to AWS S3 as I think NEAR node itself does work on Windows). So testing this is not as straightforward.

Also, in general is near-sandbox binary release process documented somewhere? I could not trace where you got this link from.

@ChaoticTempest
Copy link
Member

@itegulov

so I tried to test this on Windows, but I forgot that sandbox does not support Windows platform right now (probably because we don't publish binaries for it to AWS S3 as I think NEAR node itself does work on Windows). So testing this is not as straightforward.

This is a little more involved, but one way to test this is to compile nearcore sandbox on a windows machine, and have workspaces point to the binary via the NEAR_SANDBOX_BIN_PATH environment var.

Also, in general is near-sandbox binary release process documented somewhere? I could not trace where you got this link from.

The link is just the aws bucket + nearcore commit hash. So if you get any commit hash from nearcore, and combined it with the AWS part, it would be valid link given enough time has passed for it to be uploaded. You can take a look at how it gets uploaded via: https://github.com/near/nearcore/blob/master/scripts/mac-release.sh

@itegulov
Copy link
Contributor Author

itegulov commented Mar 1, 2022

As per my and @ChaoticTempest investigation, compiling nearcore on Windows is impossible right now due to how older versions (that are still being used by nearcore) of near/wasmer are written. We decided to put this issue on hold until nearcore finishes upgrade to wasmer 2.0 where some of the issues might get resolved.

@matklad
Copy link
Contributor

matklad commented Mar 15, 2022

As per investigation, compiling nearcore on Windows is impossible right now due to how older versions (that are still being used by nearcore) of near/wasmer are written.

I wouldn't do such inference here: the code shouldn't be using sell/kill external command in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: NEW❗
Development

Successfully merging this pull request may close these issues.

Explore using single sandbox instance
3 participants