-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Newer API #3
Comments
Thinking some more, another thing that I've wanted in the past is the ability to pipe in a C++ program. Using system() is dangerous if the string is able to be tainted, whilst with exec/popen etc, it's fine as the command arguments are strictly delineated. So, it's a bit annoying to invoke programs that use one another's output as their input. I can't really imagine what it would look like either, but perhaps something like: So that they're chained perhaps...? |
With regards to the daemon we can probably just call daemon. The interface would probably be similar to execute but without the lambda since you would be deliberately disconnecting from it. This will be much more "interesting" to do with windows. |
Regarding error handling, I'm having a read of this paper by herb in which he suggests that both forms are inadequate. It hasn't help me pick one. |
I'm thinking that we should stick to exceptions, with some of the enlightenment from that doc. Specifically because we're most likely going to have some operator overloads (>> and << at least) and that the kind of things that these interact with may be error prone - consider writing strings to a closed process (mid-way through writing or otherwise), what do you do? Well, it's not really feasible to return values from this stuff, so exceptions are the only real good option in this case. Yep, that daemon call seems good - another interesting thing is what would happen to the output? I can't remember what happens for daemons, so I'm not sure, but should we mimic something like nohup and have an explicit file set up for the daemon to output to? |
With regards to the new API I think it is strange that we require a list of strings for the arguments and a vector for the inputs. Is there a particular reason we can't just accept any iterable container of strings, or go full STL and only accept begin and end iterators for both? |
I suppose it's feasible - I've never seen the stdlib actually support that idea though. I imagine the declaration would look something like this then: template<class Container, class Container2>
int execute(const std::string& processPath, const Container& processArguments, const Container2& processStdin, ...); Where Looking at something from the stdlib that I've been annoyed that doesn't have this signature would be I agree that it's a good idea! |
How about we include all combinations? Eight is quite a lot (env also makes sense, right?) int execute(path, const Container& args, const Container2& stdin, const Container& env);
int execute(path, const Container& args, const Container2& stdin, Iterator beginEnv, Iterator endEnv);
int execute(path, const Container& args, Iterator beginStdin, Iterator endStdin, const Container& env);
int execute(path, const Container& args, Iterator beginStdin, Iterator endStdin, Iterator beginEnv, Iterator endEnv);
int execute(path, Iterator argsBegin, Iterator argsEnd, const Container2& stdin, const Container& env);
int execute(path, Iterator argsBegin, Iterator argsEnd, const Container2& stdin, Iterator beginEnv, Iterator endEnv);
int execute(path, Iterator argsBegin, Iterator argsEnd, Iterator beginStdin, Iterator endStdin, const Container& env);
int execute(path, Iterator argsBegin, Iterator argsEnd, Iterator beginStdin, Iterator endStdin, Iterator beginEnv, Iterator endEnv); |
all combinations? ick |
Just two? int execute(path, const Container& args, const Container2& stdin, const Container& env);
int execute(path, Iterator argsBegin, Iterator argsEnd, Iterator beginStdin, Iterator endStdin, Iterator beginEnv, Iterator endEnv); No builder pattern here, because the users probably won't like it |
I think just two but I'll just be using the iterator version for the internal Process class |
Alright, I'll update my branch when I get home (i'm working on the new API on the new_api branch on this repo) |
But you feel free to deviate, we'll be able to merge when you come back in |
The default behavior for iostreams is not to throw exceptions, you have to explicitly set what failure modes should throw exceptions https://en.cppreference.com/w/cpp/io/basic_ios/exceptions |
For the execute and check_output fns I ended up settling with this format: Execute/**
* Execute a subprocess and optionally call a function per line of stdout.
* @param commandPath - the path of the executable to execute, e.g. "/bin/cat"
* @param firstArg - the begin iterator for a list of arguments
* @param lastArg - the end iterator for a list of arguments
* @param stdinBegin - an InputIterator to provide stdin
* @param stdinEnd - the end of the InputIterator range for stdin
* @param lambda - a function that is called with every line from the executed process (default NOP function)
* @param envBegin - the begin of an iterator containing process environment variables to set
* @param envEnd - the end of the env iterator
*/
template<class ArgIt, class StdinIt, class EnvIt,
typename = typename std::enable_if<is_iterator<ArgIt>::value, void>::type,
typename = typename std::enable_if<is_iterator<StdinIt>::value, void>::type,
typename = typename std::enable_if<is_iterator<EnvIt>::value, void>::type>
int execute(const std::string& commandPath, ArgIt firstArg, ArgIt lastArg,
StdinIt stdinBegin = DevNullIterator(), StdinIt stdinEnd = DevNullIterator(),
const std::function<void(std::string)>& lambda = [](std::string){},
EnvIt envBegin = DevNullIterator(), EnvIt envEnd = DevNullIterator());
/**
* Execute a subprocess and optionally call a function per line of stdout.
* @param commandPath - the path of the executable to execute, e.g. "/bin/cat"
* @param commandArgs - the extra arguments for an executable e.g. {"argument 1", "henlo"} (default no arguments)
* @param stdinInput - a list of inputs that will be piped into the processes' stdin (default no stdin)
* @param lambda - a function that is called with every line from the executed process (default NOP function)
* @param env - a list of environment variables that the process will execute with (default nothing)
*/
template<class ArgIterable = std::vector<std::string>,
class StdinIterable = std::vector<std::string>,
class EnvIterable = std::vector<std::string>,
typename = typename std::enable_if<is_iterable<ArgIterable>::value, void>::type,
typename = typename std::enable_if<is_iterable<StdinIterable>::value, void>::type,
typename = typename std::enable_if<is_iterable<EnvIterable>::value, void>::type>
int execute(const std::string& commandPath, const ArgIterable& commandArgs = {},
const StdinIterable& stdinInput = {}, const std::function<void(std::string)>& lambda = [](std::string){},
const EnvIterable& env = {}); I made the arguments optional now, so people can write things like: // don't even have to provide args (they're default empty)
execute("/cool/path/shellscript.sh");
// for the iterator version you must provide some (at least the arg iterator)
// this is because there's no way to differentiate between the two then.
std::vector<std::string> args = {"argone", "argtwo", "arg3"};
execute("/cool/path/shellscript.sh", args.begin(), args.begin()+1); check_outputSimilar for check_output. /**
* Execute a subprocess and retrieve the output of the command
* @param commandPath - the path of the executable to execute, e.g. "/bin/cat"
* @param firstArg - the begin iterator for a list of arguments
* @param lastArg - the end iterator for a list of arguments
* @param stdinBegin - an InputIterator to provide stdin
* @param stdinEnd - the end of the InputIterator range for stdin
* @param lambda - a function that is called with every line from the executed process (default NOP function)
* @param envBegin - the begin of an iterator containing process environment variables to set
* @param envEnd - the end of the env iterator
*/
template<class ArgIt, class StdinIt, class EnvIt,
typename = typename std::enable_if<is_iterator<ArgIt>::value, void>::type,
typename = typename std::enable_if<is_iterator<StdinIt>::value, void>::type,
typename = typename std::enable_if<is_iterator<EnvIt>::value, void>::type>
std::vector<std::string> check_output(const std::string& commandPath, ArgIt firstArg, ArgIt lastArg,
StdinIt stdinBegin = DevNullIterator(), StdinIt stdinEnd = DevNullIterator(),
EnvIt envBegin = DevNullIterator(), EnvIt envEnd = DevNullIterator());
/**
* Execute a subprocess and retrieve the output of the command
* @param commandPath - the path of the executable to execute, e.g. "/bin/cat"
* @param commandArgs - the extra arguments for an executable e.g. {"argument 1", "henlo"}
* @param stdinInput - a list of inputs that will be piped into the processes' stdin
* @param env - a list of environment variables that the process will execute with (default nothing)
*/
template<class ArgIterable = std::vector<std::string>,
class StdinIterable = std::vector<std::string>,
class EnvIterable = std::vector<std::string>,
typename = typename std::enable_if<is_iterable<ArgIterable>::value, void>::type,
typename = typename std::enable_if<is_iterable<StdinIterable>::value, void>::type,
typename = typename std::enable_if<is_iterable<EnvIterable>::value, void>::type>
std::vector<std::string> check_output(const std::string& commandPath, const ArgIterable& commandArgs = {},
const StdinIterable& stdinInput = {}, const EnvIterable& env = {}) ; Unfortunately, there's not really any room for the check_output return code. Any ideas? I don't want to return a tuple for the check_output (like std::pair<std::vectorstd::string, int> or w/e), because it increases the difficulty to use this library. |
What even would be the return codes? It just returns the std::out of the executed process. Are you talking about error codes? |
yeah, the process's return code (0 for success, other values for failure) |
So we're not doing exceptions? Or is this for non exceptional cases. |
yeah, that's what I feel like makes sense, especially for that as there's no way to even indicate error codes nicely. exceptions it is? we may as well add these to the others then too. |
yeah can't really use error codes with a stateless call. regular streams store an error code internally. |
err, waitaminute, if we're throwing exceptions on non-zero statuses for execute, why are we returning an int for execute? maybe it does make sense for only check_output to throw exceptions, each of the other interfaces can access the error states/codes. |
legacy, from when I thought we might do error codes |
I've merged your stuff in the new_api branch. From our discussion we seemed to settle on stuff like this: Process p("/bin/cat", {"filename.out"}, {}, [&](std::string s) { std::cout << s; });
Process p2("/bin/grep", {"-i", "cool"}, {}, [&](..){ ok; });
Process p2("/usr/bin/lolcat", {});
std::ifstream myfile("outti.out");
p.pipe_to(p3);
p.pipe_to(p2).pipe_to(p3) >> myfile; // or pipe_to(myfile)..?
// equivalent to p >> p2 >> p3 >> myfile;
p.start();
p << 1 << std::endl;
// this shouldn't be allowed
// std::string output;
// p >> output; The current interface for the facing Process (still haven't renamed to ProcessStream) is: class ProcessStream {
// need some list of processes this process is supposed to pipe to
// shouldn't need to store the lambda, as the internal::Process object will store that
std::vector<ProcessStream&> nextProcesses;
public:
template<typename Functor = std::function<void(std::string)>>
ProcessStream(const std::string& commandPath, const std::vector<std::string>& commandArgs, const Functor& func = [](std::string){});
~ProcessStream();
// start the process and prevent any more pipes from being established.
// may throw an exception?
bool start();
// write a line to the subprocess's stdin
void write(const std::string& inputLine);
// read a line and block until received (or until timeout reached)
template<typename Rep>
std::string read(std::chrono::duration<Rep> timeout=-1);
// if there is a line for reading (optionally
template<typename Rep>
bool ready(std::chrono::duration<Rep> timeout=0);
// pipe some data to the receiver process, and return the receiver process
// we do this so we can have: process1.pipe_to(process2).pipe_to(process3)...etc
// if pipes are set up there are some restrictions on using the << and >> operators.
// if a process is receiving from another process, then they cannot use operator<< anymore
// if a process is outputting to another, they cannot use operator>>
ProcessStream& pipe_to(ProcessStream& receiver);
// ditto
ProcessStream& operator>>(ProcessStream& receiver);
// read a line into this process (so it acts as another line of stdin)
// instead of string, probably should be typename Stringable, and use stringstream and stuff.
ProcessStream& operator<<(const std::string& inputLine);
// retrieve a line of stdout from this process
ProcessStream& operator>>(std::string& outputLine);
// some other functions which maybe useful (perhaps take a timeout?)
// returns whether it could terminate
bool terminate();
// a more...extreme way
bool kill();
// send arbitrary signals to the subprocess
void signal(int signum);
class iterator;
// provide an iterator to iterate over the stdout produced
iterator begin();
iterator end();
}; We also had a todo list:
|
Our current API is a bit clunky and doesn't allow supporting all of the proposed features we wanted.
So, here's some newer endpoints I propose:
Executing a basic subprocess
Also, another
execute
function will exist, except that instead of taking in a std::vector to supply its stdin, it takes two input iterators (the start and end of a range). It's signature would look like this:Where the InputIt would require typical iterator methods implemented (similar to InputIterator defined here: https://en.cppreference.com/w/cpp/named_req/InputIterator).
Retrieving subprocess output
Similar to the
execute
function's parameters, thecheck_output
function (name TBD, this name is just copied from the python3 subprocess library, we should check if they have better names (or there's internal conflict about how shit that name is)) simply returns a vector containing all lines of stdout.Input iterator is similar to the above restrictions in the
execute
function.Interactive input/output
To enable dynamically setting what should be fed into the program based on the output, we introduce a newer ProcessStream class. This class has only a few methods available, but allows feeding a string into the process, and checking whether a line is available to read (there may not be one always available for each stdin, consider grep when the line doesn't match) and also extracting it.
I imagine it's use will be something along these lines:
There will naturally be non-operator overload equivalents to those endpoints (most likely something like
.write(std::string)
andstd::string .read(timeout)
.We require a timeout, because it's not really possible to check if there is a line available, without some kind of call to select (I think!).
Daemon spawning
People seem to like to spawn and forget processes (especially in some kind of management application), so providing a method to disown and mark them as daemons makes a whole lotta sense. I haven't written daemon code too often (it's been a couple years since I have I reckon), and that was in pure C. As this library has to use C to implement these features it's certainly doable, but would require different behaviour per OS, because this I imagine heavily differs between OSes. The problem is, I don't really know what a nice interface to doing so would be.
Any suggestions would be much appreciated.
Error Handling
As we're dealing with external applications, they may throw at any time and inconsistently (either due to the application, the OS/user killing them, or bugs), we need some kind of error handling.
Currently, the above APIs proposed return an error status, but I feel like this may not be very idiomatic, and that exceptions should be used instead. I suppose this requires a read up on C++ philosophy before being able to make a decision, but I'm thinking of mirroring how Python's subprocess library does it, which is pretty much to throw an exception whenever anything goes wrong, including non-zero return statuses.
Asynchronous Calls
Whether the user should be responsible for running this stuff in the background is again another philosophical question to ask, but as this is a library aimed at dumb-dumbs, its worth consideration if this is something we should hold their hand on. Currently, we provide a method to spawn a subprocess asynchronously, and return a
std::future
of the return code. Naturally, if we change the error handling to be a different strategy, this may not make sense, but I can't imagine asynchronous informing the main process about the status via exceptions can't be neat, so using futures on the return value seems like a good idea to cover this.Final remarks
In my mind, these cover the current TODOs I propose in the current README, so it seems like a good direction to head. The only thing that I don't know and could cause problems in these interfaces is how Windows deals with processes. Some of these things may not be possible.
Any comments are appreciated (and likely only Cameron is gonna comment, but I'll try and hook some people to comment).
Cheers
The text was updated successfully, but these errors were encountered: