Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: strings, bytes: CutLast #71151

Open
chipaca opened this issue Jan 7, 2025 · 9 comments
Open

proposal: strings, bytes: CutLast #71151

chipaca opened this issue Jan 7, 2025 · 9 comments
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal
Milestone

Comments

@chipaca
Copy link
Contributor

chipaca commented Jan 7, 2025

Proposal Details

A couple of times now while chopping up some strings I've happily enjoyed strings.Cut but had to write my own CutLast (the latest was implementing a subset of the package-url spec where some of the things you need to find from the back of the string, and some from the front). It's a silly thing but it feels like an oversight.

For reference, here is one of the CutLast I've implemented, verbatim. Trivial and obvious:

// CutLast slices s around the last instance of sep,
// returning the text before and after sep.
// The found result reports whether sep appears in s.
// If sep does not appear in s, CutLast returns s, "", false.
func stringsꞏCutLast(s, sep string) (before, after string, found bool) {
	if i := strings.LastIndex(s, sep); i >= 0 {
		return s[:i], s[i+len(sep):], true
	}
	return s, "", false
}
@gopherbot gopherbot added this to the Proposal milestone Jan 7, 2025
@chipaca
Copy link
Contributor Author

chipaca commented Jan 7, 2025

Thanks to @gabyhelp I note that in #46336 rsc called this LastCut, which obviously also works :-)

@earthboundkid
Copy link
Contributor

My last cut returns return "", s, false in the not found case. I think it's more useful that way. Conceptually, it's moving the balance from before to after as it searches backward through the string, so if it doesn't find anything, it pushes it all the way so that after is full and before is empty.

Also, mine is called LastCut instead of CutLast, which is more consistent with LastIndex.

@mibk
Copy link
Contributor

mibk commented Jan 7, 2025

Related: I also wanted (and implemented for my use case) a variant of strings.Cut, one that would be analogous to strings.IndexAny:

func CutAny(s, chars string) (before, after string, found bool) { … }

// e.g.:

next, rest, _ := strings.CutAny(s, " \t")

@mateusz834
Copy link
Member

My last cut returns return "", s, false in the not found case. I think it's more useful that way. Conceptually, it's moving the balance from before to after as it searches backward through the string, so if it doesn't find anything, it pushes it all the way so that after is full and before is empty.

Not sure about that, i think that i would still prefer the return s, "", false approach. I would expect after to be filled only when something has been found, consider:

fileName, ext, ok := strings.CutLast("test", ".")
fileName, ext, ok = strings.CutLast("test.test", ".")

I know that this is a bad example, this should use filepath.Ext, but in the first case i would not expect the "test" to be in ext.

@chipaca
Copy link
Contributor Author

chipaca commented Jan 7, 2025

Like with Cut, I think the "did not find the thing" return choice is arbitrary, and there are cases where the other alternative is what would avoid a check no matter which one we choose.

@chipaca
Copy link
Contributor Author

chipaca commented Jan 7, 2025

I like CutAny and CutLastAny (or LastCutAny) also, I've just never had to use them.


An argument against doing this is that it's trivial and clutters the already large docs (I sometimes wish we could group functions in the docs, so e.g. all the *Index* docs could be together). Another is that to this day I'm still having to work with code that's just using Split (not even SplitN) when a Cut would've been faster and cleaner, so it'll be at least 10 years before this has a positive effect.
Edited to add: by 'doing this' here I meant this whole proposal, not just CutAny etc.

@earthboundkid
Copy link
Contributor

I know that this is a bad example, this should use filepath.Ext, but in the first case i would not expect the "test" to be in ext.

Here is one of my uses of LastCut:

metadata.URLSlug = strings.TrimRight(metadata.URLSlug, "/")
_, metadata.URLSlug, _ = stringx.LastCut(metadata.URLSlug, "/")

In this case, I could probably have just used path.Base instead and because it's a URL, it's pretty much guaranteed to have at least one slash. Still, conceptually, I want the slug to capture everything and not nothing in the failure case.

We could have a compromise position and in the false case it could do return s, s, false. 😆

@gabyhelp gabyhelp added the LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool label Jan 7, 2025
@rogpeppe
Copy link
Contributor

FWIW I searched and found at least 5 definitions of cutLast in our code.
It's always cut-and-paste defined as:

func cutLast(s, sep string) (before, after string, found bool) {
	if i := strings.LastIndex(s, sep); i >= 0 {
		return s[:i], s[i+len(sep):], true
	}
	return "", s, false
}

Until right now, I thought it made intuitive sense to return the whole string in the second
return parameter because logically the "after" value is getting larger and larger
the further to the left the separator is, until the limit (the entire string) is reached.

However, I looked at my uses of cutLast where it was using the values regardless
of the value of found and found that clearly my intuition was wrong: I was actually
using it wrong in all these cases!

// parts.Path is known to be non-empty. The code is clearly
// expecting the second return value to be empty when there
// is no / character, but that's not the case.
_, last, _ := cutLast(parts.Path, "/")
if last != "" && last != parts.Qualifier {
	needQualifier = true
}

// In both the cases below, we're expecting the entire filename
// when there's no . character.
testName, _, _ = cutLast(filename, ".")

extless, _, _ := cutLast(filename, ".")

So, perhaps the right answer probably is to return the complete string in the first
value when the separator is not found.

Regardless of the decision, I think it would be very good to standardise on some semantics
because this kind of inconsistency is exactly the kind of thing that trips people
up and leads to unexpected bugs. I've certainly needed to reach for
this function on multiple occasions and strings has much good precedent
for adding Last variants of functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal
Projects
Status: Incoming
Development

No branches or pull requests

7 participants