-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RSDK-9591 - Add Kill to ManagedProcess #399
base: main
Are you sure you want to change the base?
Conversation
@@ -32,6 +32,10 @@ type ManagedProcess interface { | |||
// there's any system level issue stopping the process. | |||
Stop() error | |||
|
|||
// Kill will attempt to kill the process group and not wait for completion. Only use this if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the story behind killing the process group? Is the viam-server
part of the same process group? If we're intentionally trying to kill the viam-server + all modules running in one swift process group kill -9
, what's the utility in adding this functionality to the goutils/managed process code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the viam-server is not part of the same process group, each ManagedProcess is its own process group, I'm guessing the intention is to be able to Kill
or Stop
all spawned processes from one ManagedProcess at once
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the viam-server is not part of the same process group, each ManagedProcess is its own process group
I thought* that was a possibility. But I didn't see the code that did that. Is that in our code or the SDK side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
goutils/pexec/managed_process_unix.go
Line 69 in 8961a20
attrs := &syscall.SysProcAttr{Setpgid: true} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Just some initial comments.
@@ -126,6 +127,14 @@ func (p *managedProcess) kill() (bool, error) { | |||
return forceKilled, nil | |||
} | |||
|
|||
// forceKill kills everything in the process group. This will not wait for completion and may result in a zombie process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a zombie process
I have heard this term used in a couple different ways. What do you mean by "zombie process"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the process will have finished but not waited on by the parent process - it is still in the process table. I'm explicitly not waiting on the child process here so the Kills
will be zombies until the viam-server terminates and the system reaps these processes. https://en.wikipedia.org/wiki/Zombie_process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// forceKill kills everything in the process group. This will not wait for completion and may result in a zombie process. | |
// forceKill kills everything in the process group. This will not wait for completion and may result in the kill becoming a zombie process. |
func (p *managedProcess) forceKill() error { | ||
pidStr := strconv.Itoa(p.cmd.Process.Pid) | ||
p.logger.Infof("force killing entire process tree %d", p.cmd.Process.Pid) | ||
return exec.Command("taskkill", "/t", "/f", "/pid", pidStr).Start() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test this on windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not yet, I just copied this from kill() a bit above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a test that a managed process that starts another process in the same process can both be killed by Kill
(or KillGroup
if you take my renaming suggestion)?
@@ -32,6 +32,10 @@ type ManagedProcess interface { | |||
// there's any system level issue stopping the process. | |||
Stop() error | |||
|
|||
// Kill will attempt to kill the process group and not wait for completion. Only use this if | |||
// comfortable with leaking resources (in cases where exiting the program as quickly as possible is desired). | |||
Kill() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call this KillGroup
?
@@ -126,6 +127,14 @@ func (p *managedProcess) kill() (bool, error) { | |||
return forceKilled, nil | |||
} | |||
|
|||
// forceKill kills everything in the process group. This will not wait for completion and may result in a zombie process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// forceKill kills everything in the process group. This will not wait for completion and may result in a zombie process. | |
// forceKill kills everything in the process group. This will not wait for completion and may result in the kill becoming a zombie process. |
@@ -126,6 +127,14 @@ func (p *managedProcess) kill() (bool, error) { | |||
return forceKilled, nil | |||
} | |||
|
|||
// forceKill kills everything in the process group. This will not wait for completion and may result in a zombie process. | |||
func (p *managedProcess) forceKill() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call this forceKillGroup
?
This is part one of two PRs that will hopefully help with shutting down all module processes before viam-server exits. Part 2 is here
This is still a draft as I'm looking for thoughts and ideas around making this better.
The idea behind this is for a Kill() call to propagate from the viam-server, and we should not block on anything if possible. The
kill()
s here do not wait for the execed process to complete, and this is intentional - I couldn't find if syscall.Kill waits for completion, so did not use. I wasn't able to find documentation about whetherkill -9
blocks on the signal getting received or if it immediately returns, so opted for.Start()
instead of.Run()
, which means we do notWait()
for the process to return. This does mean that the assumption is that the viam-server ends immediately after and that the system will properly await the processes that we spawned to kill the managed process, otherwise those processes will be zombie'd. I can see changing it so that we do a.Run()
instead, but might mean we would have to do the module process killing in parallel in viam-server.There is a todo to add Kill to the processManager, but a quick glance at how Close works for processManager makes me think that it should be a followup task