Editorial: clarify URL validity #666
url.bs
Outdated
@@ -1170,7 +1170,9 @@ unified model would be, please file an issue.
 <li><p>The <a>URL serializer</a> takes a <a for=/>URL</a> and returns an <a>ASCII string</a>. (If
 that string is then <a lt="URL parser">parsed</a>, the result will <a for=url>equal</a> the <a
-for=/>URL</a> that was <a lt="URL serializer">serialized</a>.)
+for=/>URL</a> that was <a lt="URL serializer">serialized</a>.) The output of the
+<a>URL serializer</a> is not always a <a>valid URL string</a>. I.e., not all <a for=/>URLs</a> are
It might be worth adding a pointer to #379, because I am still hoping we can change this.
You don't think it's useful that `\` in HTTP URLs shows up as something you probably want to fix?
I mean, we can rehash that thread here if you want... yes, I think the serializer should always produce valid URLs, either by expanding the definition of valid, or changing the serializer.
Oh sorry, `\` is not applicable then, I think. I got this confused with the position of the people who think all inputs ought to be either valid or rejected. You don't necessarily think that, but you do think the invalid inputs that are accepted ought to be transformed into something valid when they are spit out again.
So yeah, the reason for that is mainly encouraging RFC 3986 interop. But I'm not sure anyone is really appreciative of that.
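To make the thread above concrete, here is a minimal sketch using the `URL` class that browsers and Node.js expose. It assumes the runtime follows the URL Standard's parser and serializer; the helper name `roundTrips` is hypothetical and only exists for this illustration. It shows an accepted-but-invalid input (`\` in an `https` URL) and checks the round-trip property quoted in the diff.

```typescript
// Hypothetical helper: checks that serializing, re-parsing, and serializing
// again yields the same string (the round-trip property quoted in the diff).
function roundTrips(input: string): boolean {
  const first = new URL(input).href; // parse, then serialize
  const second = new URL(first).href; // parse the serializer's output again
  return first === second;
}

// The parser accepts "\" in an https URL and treats it like "/".
const backslash = new URL("https://example.com\\foo").href;
console.log(backslash); // "https://example.com/foo"

// "|" is not a URL code point; this prints whatever the serializer emits for it,
// followed by whether that output round-trips.
const pipe = new URL("https://example.com/a|b").href;
console.log(pipe, roundTrips("https://example.com/a|b"));
```

The sketch only demonstrates observable behavior; it does not decide whether a given output counts as a valid URL string, which is the point under discussion.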
url.bs
Outdated
@@ -1160,7 +1160,7 @@ unified model would be, please file an issue.
 <ul>
 <li><p>The <a>URL parser</a> takes an arbitrary string and returns either failure or a
-<a for=/>URL</a>.
+<a for=/>URL</a>. It might also record zero or more <a>validation errors</a>.
Do we know if "URL parser records zero validation errors" implies "input string is a valid URL string"?
How about the other direction?
It would be great to clarify the purpose of these validation errors.
We do not, and I strongly suspect they are not equivalent. I think there are some open issues on it.
My preferred strategy has been to instrument whatwg-url with both modes of validation and fuzz to find examples where they mismatch. I haven't made the time to do so yet though.
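The fuzzing strategy described above could be sketched roughly as follows. Everything here is an assumption made for illustration: `findMismatches`, `randomInputs`, and the two toy checks are hypothetical names, and the toy checks are deliberately simplistic stand-ins, not the URL Standard's algorithms and not part of whatwg-url. The harness only shows the shape of the comparison: run both modes of validation on each input and keep the inputs where they disagree.

```typescript
type Check = (input: string) => boolean;

interface Mismatch {
  input: string;
  parserClean: boolean; // "parser recorded zero validation errors"
  validString: boolean; // "input is a valid URL string"
}

// Runs both checks on every input and collects the disagreements.
function findMismatches(inputs: string[], parserClean: Check, validString: Check): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const input of inputs) {
    const a = parserClean(input);
    const b = validString(input);
    if (a !== b) mismatches.push({ input, parserClean: a, validString: b });
  }
  return mismatches;
}

// Very small random generator: appends random characters to a fixed prefix.
function randomInputs(count: number, alphabet: string, maxLength: number): string[] {
  const inputs: string[] = [];
  for (let i = 0; i < count; i++) {
    const length = 1 + Math.floor(Math.random() * maxLength);
    let tail = "";
    for (let j = 0; j < length; j++) {
      tail += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
    }
    inputs.push("https://example.com/" + tail);
  }
  return inputs;
}

// Toy stand-ins (NOT the real algorithms): they disagree on "|" on purpose,
// so the harness has something to report.
const toyParserClean: Check = (s) => s.indexOf("\\") === -1;
const toyValidString: Check = (s) => s.indexOf("\\") === -1 && s.indexOf("|") === -1;

const report = findMismatches(randomInputs(200, "ab|\\", 4), toyParserClean, toyValidString);
console.log(report.length + " mismatching inputs found");
```

In a real experiment the two toy checks would be replaced by an instrumented parser and a valid-URL-string checker, neither of which is shown here.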
…roposal or consensus)
Closes #595.