`string.utf_codepoint` returns an error for valid codepoints `U+FFFE` and `U+FFFF` #778
Thank you |
Hello, if it is possible may I pick this up? For building out a solution and testing locally, should I just run |
Treats U+FFFE and U+FFFF as valid unicode codepoints rather than errors. See gleam-lang#778.
Hey @jooaf, I apologize--I did not notice you commented on this issue asking to work on it before I went ahead and fixed it. If you want I could close my PR and you could take it (since it is a good first issue!) |
Hey @mooreryan, no worries at all! That's so kind of you! Since you have already done the work, I think you should go ahead with your PR. I will just be on the lookout for another good first issue :) |
Completed in #781. |
The `string.utf_codepoint` function returns an error for the valid codepoints `U+FFFE` (65,534) and `U+FFFF` (65,535). The line of code is here.
These two codepoints are two of the 66 so-called "noncharacter code points". Here are some excerpts from 23.7.1 Noncharacters: U+FFFE, U+FFFF, and Others of the Unicode core spec:
If the goal is for `string.utf_codepoint` to return `Error(Nil)` for noncharacter codepoints, then many noncharacter codepoints are missing from the check and do not return an error, e.g., `U+1FFFF`.
However, I think that the correct behavior would be for the `string.utf_codepoint` function to return `Ok` for `U+FFFE` and `U+FFFF`. (For example, Gleam accepts the literals `"\u{FFFE}"` and `"\u{FFFF}"` as valid Unicode code points, and both Elixir and Rust also accept them as valid code points.)