Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests) #25

Merged
merged 3 commits into from
Mar 11, 2015

Conversation

cameron314
Copy link
Contributor

I've changed the StringTable to use UTF-8 to decode strings, instead of the previous ISO-8859-1 decoding (via casting bytes to chars). While no string encoding is mentioned in the ELF specification, it seems that the de-facto standard, like for much else in the Linux-y world, is UTF-8 (e.g. this is what clang produces).

The StringTable is now also populated lazily, and reads the bytes (once) in a block instead of one at a time. This should improve performance.

Also, some of the unit tests were failing locally since they were dependent on the environment of the user running the tests -- one of them depended on the current timezone, and two others depended on Environment.NewLine being "\n". I fixed that too (all the tests pass for me now).

I would be most grateful if you could accept these patches into the master, and in particular update the NuGet.

Finally, thanks for writing this library! It's extremely useful :-)

… files, and thus string constants, have Unix line endings even in Windows test environment)
…CII) -- this matches the de-facto standard encoding of ELF files (that clang produces, for example)
@konrad-kruczynski
Copy link
Owner

It's a pleasure to receive so refined and well described patches! As for the encoding part, I should definitely go for the UTF-8 solution in the first place, good point. Other changes are also nice. The only problematic part is formatting (but frankly speaking, I can't tell whether it was crippled earlier or it is now - due to the problems with formatting in older mono versions). Anyway I'll fix it by myself since it actually means just running the formatter.

I'm not able to do the nuget package today, but should be available tomorrow.

konrad-kruczynski added a commit that referenced this pull request Mar 11, 2015
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests)
@konrad-kruczynski konrad-kruczynski merged commit 3637bca into konrad-kruczynski:master Mar 11, 2015
@konrad-kruczynski
Copy link
Owner

Since I'd also like to add you to the list of contributors (which will be created in README), please just tell me how I should name you (i.e. name, nick, mail to use).

@cameron314
Copy link
Contributor Author

Ah, sorry about the formatting. I've set my VS to use tabs and hadn't noticed the mismatch until you pointed it out. Oops!

You can put my name as simply "Cameron", and email as "cameron [@] moodycamel.com". Thanks!

@konrad-kruczynski
Copy link
Owner

The nuget package should be available now.

@cameron314
Copy link
Contributor Author

Awesome, thanks!

konrad-kruczynski added a commit that referenced this pull request Nov 6, 2015
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests)
konrad-kruczynski added a commit that referenced this pull request Nov 6, 2015
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests)
konrad-kruczynski added a commit that referenced this pull request Nov 6, 2015
Use UTF-8 to decode strings everywhere (and fixed locale-dependent unit tests)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants