Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(commands)!: change arg handling strategy #12441

Closed
wants to merge 38 commits into from

Conversation

RoloEdits
Copy link
Contributor

@RoloEdits RoloEdits commented Jan 7, 2025

Closes: #7453
Closes: #6171
Closes: #10549
Closes: #10993

Blocking: #12320
Blocking: #12288
Blocking: #11164

helix-term/src/commands/typed.rs Outdated Show resolved Hide resolved
helix-core/src/shellwords.rs Outdated Show resolved Hide resolved
helix-core/src/shellwords.rs Outdated Show resolved Hide resolved
helix-core/src/shellwords.rs Outdated Show resolved Hide resolved
@RoloEdits
Copy link
Contributor Author

The changes here seem to work well with the flags PR. This should be ready for review, I think I just need to touch up the wording in the errors when the params count it off. I will open this for review and then get to that as well as update the writeup with actual content.

@RoloEdits RoloEdits marked this pull request as ready for review January 8, 2025 14:02
@RoloEdits RoloEdits changed the title refactor(shellwords): change arg handling strategy refactor(commands): change arg handling strategy Jan 8, 2025
@RoloEdits RoloEdits changed the title refactor(commands): change arg handling strategy refactor(commands)!: change arg handling strategy Jan 8, 2025
helix-core/src/shellwords.rs Outdated Show resolved Hide resolved
helix-core/src/shellwords.rs Outdated Show resolved Hide resolved
helix-core/src/shellwords.rs Outdated Show resolved Hide resolved
helix-core/src/shellwords.rs Outdated Show resolved Hide resolved
helix-core/src/shellwords.rs Outdated Show resolved Hide resolved
helix-term/src/commands/typed.rs Outdated Show resolved Hide resolved
helix-term/src/commands/typed.rs Outdated Show resolved Hide resolved
helix-term/src/commands/typed.rs Outdated Show resolved Hide resolved
helix-term/src/commands/typed.rs Outdated Show resolved Hide resolved
helix-term/src/commands/typed.rs Outdated Show resolved Hide resolved
/// but a normal `&str`.
#[inline]
#[must_use]
pub fn peek(&self) -> Option<Cow<'_, str>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed: a Peekable should be used instead now that the item might be an allocation

Copy link
Contributor Author

@RoloEdits RoloEdits Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can just make this private instead. When its parsing flags it wont ever allocate. And when its collected it has them all enabled for Parameters, which could allocate, but it never used peek, just a collect. I need internal access to the idx. The string that passed to unescape has to have the correct lifetime, its its done through a function (like rest) im pretty sure it will error as not living long enough: unescape(&input[args.idx..], false)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's small so if we need to use it later we can reintroduce it. Let's remove it for now.

Copy link
Contributor Author

@RoloEdits RoloEdits Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove it here and then in flags add it if I need it. As it is really only meant to be used there.

Copy link
Contributor Author

@RoloEdits RoloEdits Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also just to clarify on why this is most likely needed, if Literal parsing can have flags, like with sh for example. I need to exhaust the flags but still be able to save the rest of the remaining string as one thing. Peekable does not give access to the internal state and the way it works it to store the next value in an Option<Option<I::Item>>. This advances the iterator one place too many when taking the idx into account. Thats why I clone in this one to get a new instance as no state can be advanced in the looping iterator. Not only does Peekable not have access to the rest function, it wouldnt work correctly even if it did as already stated, when passing to unescape it creates a lifetiem that doesnt live long enough. So I need to implement rest again here with unescape(&input[args.idx..]) which gives a correct lifetime tied to input. Regardless though, this is really only needed for the flags. I will just put it there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For things that use literal parsing we can enforce that flags are only at the beginning of the input. Kakoune has a similar concept, see ParameterDesc::Flags in parameters_parser.hh

Copy link
Contributor Author

@RoloEdits RoloEdits Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats how it is now, at least in being assumed as such. The main issue is that I need the remaining part of the input which means it needs to be aware of progress that was made in idx during the loop.

Copy link
Contributor Author

@RoloEdits RoloEdits Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:sh --no-popup wezterm spawn floating-pane lazygit
      yielded^ ^idx..
unescape(&'a input[idx..], _)                    

@RoloEdits RoloEdits force-pushed the shellwords branch 3 times, most recently from 5e9274f to de7858d Compare January 10, 2025 03:43
@RoloEdits
Copy link
Contributor Author

RoloEdits commented Jan 10, 2025

Hmm, im wondering if we need special case "" a bit. If we pass it to Args with Args::from(""), with its default ParseMode::Raw, it will always create a Vec, even if the only entry is vec![""]. This means that Args::from("").is_empty() is false, as there is currently one in the Vec (current implimentation is self.positionals.is_empty). We might need to check if the first argument is also empty or not with an is_some_and. Or maybe just filter out empty positions before collecting? filter(|arg| !arg.is_empty). Or return None whenever self.input.is_empty.

first and last also suffer similar problems if the only entry is "". len as well.

@RoloEdits
Copy link
Contributor Author

Or return None whenever self.input.is_empty.

I did this for now as it seems more intuitive to have it like this.

@RoloEdits
Copy link
Contributor Author

RoloEdits commented Jan 10, 2025

The completion paths for windows are handled in such a way that makes them wrong and hard to parse around in trying to adhere to the normal windows conventions. For example :o dir\"path to file.txt" is an invalid path in windows, yet this is what the completion is:
image

Because the backslashes are path segments, unless we make every windows path completion double backslash, you cant really escape spaces.

When typing / on windows it maintains the list of relative path completions, but with \ it provides drive level completions.
image
image

What we could do is just abstract the conventions and offer completions with / path segments. This way there wouldnt need to be compile time parser mode differences and spaces can be escaped in the same way and actually provide a correct representation to then convert into a proper path for the OS. Rust itself does this when handling strings to paths. This would simplify some things in code.

@RoloEdits
Copy link
Contributor Author

Another thing that could work is instead of replacing just the segment to surround with quotes, it replaces the whole thing once and puts an open quote at the start of the path. The way the parsing is done with the completions now this would still allow to complete segments and then you could just manually close the quote if you need to open more files. Otherwise just accepting with no closing quote would open the file correctly.

@the-mikedavis
Copy link
Member

I took a very long look into this, completions, flags and expansions. Usually I would prefer to see distinct features like these land independently but in this case I now think it's cleaner to land all at once since they can't really be independent code-wise. Each feature informs the design of the parsers, their output types and the completion code. I have changes locally now that add all of these features together based on Kakoune's approach. Kakoune's command line syntax and expansion strategy should be surprisingly effective for us. (Surprising because Kakoune doesn't support Windows.)

On the syntax level we should basically avoid supporting escapes like \n or \" - this simplifies compatibility with Windows. Unix can still do some escaping but it should be easy to avoid using backslashes. Unicode can instead be covered by expansions like %u{25CF} rather than \u{25CF}. The line ending should be covered by a variable expansion corresponding to the document's like ending %{line_ending} and maybe we can add special cases like %u{tab}/%u{lf}/u{crlf}. Otherwise the rest should be handled by quoting rules.

On the implementation level this looks like two parsers: one covering basically what shellwords does today (corresponding to ArgsParser), parsing the command line into tokens naively (unaware of the command's signature), and the other handling the distinction between flags and positionals (corresponding to Args). Both are used for the sake of completion. The expansion code feeds on the tokens from the first parser before passing them to the second.

I'll post my changes later this evening and I'd appreciate it if you could give them a try if you get a chance, especially on Windows. Also thanks for your work on this and its predecessor! These PRs have pushed the design forward quite a bit.

@RoloEdits
Copy link
Contributor Author

Usually I would prefer to see distinct features like these land independently but in this case I now think it's cleaner to land all at once since they can't really be independent code-wise.

This is something I heavily felt as well. Many times I had almost asked if I could just combine them into one PR.

The expansion code feeds on the tokens from the first parser before passing them to the second.

If im understanding this right, I think this is what we had landed on for that PR too.

I'll post my changes later this evening and I'd appreciate it if you could give them a try if you get a chance, especially on Windows.

Will do. I have come across a lot of pitfalls and edge cases in doing this, so I will be sure to give it a good peek.

Also thanks for your work on this and its predecessor! These PRs have pushed the design forward quite a bit.

Am I understanding that this PR is dead, and you will be combining them in a new one?

Regardless, its been a good learning experience, I just wish I could have had more understanding around this, as it feels my ignorance really hindered seeing it through to the end. But I'm glad the groundwork helped.

@the-mikedavis
Copy link
Member

Right yeah, I intend on continuing this with my own changes. I've posted #12527 and will hopefully merge it down within a few weeks. I would encourage you not to see this as a bad reflection on your work here - this and the predecessor are good changes and certainly far better than the questionable stuff we have in shellwords today. For me review is a balance of needing to familiarize myself with a problem and solution enough to know if a fix/enhancement will age well vs. spending time investigating it and working on it myself. In particularly complex areas like this that have lots of edge-cases and need design or necessary breaking changes it becomes more economical to make the change directly rather than guide an interested contributor. Aside from that we're generally more comfortable with accepting large (multi-K-line changes) from maintainers.

@RoloEdits
Copy link
Contributor Author

I would encourage you not to see this as a bad reflection on your work here

No worries, this just got out of my paygrade, with so many things that this effected. It was the correct choice, to subsume it all into one integrated implimentation. The project is better off for it.

Thanks once again, cant wait to see it all land!

@RoloEdits RoloEdits closed this Jan 14, 2025
@RoloEdits RoloEdits deleted the shellwords branch January 14, 2025 02:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants