-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests) #25
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests) #25
Conversation
… files, and thus string constants, have Unix line endings even in Windows test environment)
…CII) -- this matches the de-facto standard encoding of ELF files (that clang produces, for example)
It's a pleasure to receive so refined and well described patches! As for the encoding part, I should definitely go for the UTF-8 solution in the first place, good point. Other changes are also nice. The only problematic part is formatting (but frankly speaking, I can't tell whether it was crippled earlier or it is now - due to the problems with formatting in older mono versions). Anyway I'll fix it by myself since it actually means just running the formatter. I'm not able to do the nuget package today, but should be available tomorrow. |
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests)
Since I'd also like to add you to the list of contributors (which will be created in README), please just tell me how I should name you (i.e. name, nick, mail to use). |
Ah, sorry about the formatting. I've set my VS to use tabs and hadn't noticed the mismatch until you pointed it out. Oops! You can put my name as simply "Cameron", and email as "cameron [@] moodycamel.com". Thanks! |
The nuget package should be available now. |
Awesome, thanks! |
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests)
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests)
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests)
I've changed the StringTable to use UTF-8 to decode strings, instead of the previous ISO-8859-1 decoding (via casting bytes to chars). While no string encoding is mentioned in the ELF specification, it seems that the de-facto standard, like for much else in the Linux-y world, is UTF-8 (e.g. this is what clang produces).
The StringTable is now also populated lazily, and reads the bytes (once) in a block instead of one at a time. This should improve performance.
Also, some of the unit tests were failing locally since they were dependent on the environment of the user running the tests -- one of them depended on the current timezone, and two others depended on
Environment.NewLine
being"\n"
. I fixed that too (all the tests pass for me now).I would be most grateful if you could accept these patches into the master, and in particular update the NuGet.
Finally, thanks for writing this library! It's extremely useful :-)