Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sorting like Xcode. Fixes #615. #677

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from
Draft

Conversation

Coeur
Copy link
Contributor

@Coeur Coeur commented Apr 25, 2019

I'm not fully familiar with Ruby (apart from a Podfile/Podspec), so advices are welcome.

@Coeur Coeur force-pushed the xcode_sort branch 3 times, most recently from af8c682 to b7ee6ed Compare April 25, 2019 07:30
Copy link
Member

@segiddins segiddins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the performance implications of this change?

@@ -437,10 +438,9 @@ def sort(options = nil)
next groups_position == :above ? 1 : -1
end
end

result = File.basename(x.display_name.downcase, '.*') <=> File.basename(y.display_name.downcase, '.*')
result = XcodeSortString.new(File.basename(x.display_name, '.*')) <=> XcodeSortString.new(File.basename(y.display_name, '.*'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should XcodeSortString internally handle sorting lexicographically by basename and then extname ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think this situation is just to handle identical filenames in different folders: let's keep the fallback independent from the sort helper..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this situation with Xcode, and Xcode does not sort at all when filenames are identical: it will simply keep the existing order. That supports that we shouldn't include this behavior in XcodeSortString itself, and we should simply keep it externally like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@segiddins actually, I misunderstood extname for "extended name" (aka fullpath) when actually it's a function to get the "file extension". So yes, you're right, there is something to fix here. I'm looking at it.

describe 'In general' do
it 'sorts names like Xcode' do
unsorted_names = [
# spaces comparison (test disabled: not properly supported)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be handled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if someone figures out how to do it. It's not straight forward like simply stripping the spaces, it's more about handling three kinds of patterns instead of two: ints, strings, spaces. And all the possible comparisons (int vs space, string vs space, space vs space). If the Xcode sorting was documented, it would be easier.

# `Integer`+`rescue`: credit to https://stackoverflow.com/a/39025521
@ints_and_strings = @scrubbed.scan(/\d+|\D+/).map { |s| begin Integer(s, 10); rescue ArgumentError; s end }
# comparing patterns: credit to https://rosettacode.org/wiki/Natural_sorting#Ruby
@i_s_pattern = @ints_and_strings.map { |el| el.is_a?(Integer) ? :i : :s }.join
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please give this ivar a more descriptive name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's the "integers and strings pattern". I'll try to rename it.

compare
else
# equality, we sort reverse raw
-(str <=> other.str)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to sort in reverse order, you can do other.str <=> str with no need for negation

@str = str
@scrubbed = str.downcase
# `Integer`+`rescue`: credit to https://stackoverflow.com/a/39025521
@ints_and_strings = @scrubbed.scan(/\d+|\D+/).map { |s| begin Integer(s, 10); rescue ArgumentError; s end }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this without having to swallow exceptions?

Copy link
Contributor Author

@Coeur Coeur Apr 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, absolutely, by adding a regex match: { |s| /d/ === s ? Integer(s, 10) : s }.
I do not know which one is the most performant, but I thought an exception may be faster than a regex match.
I'll change it.
[edit: actually, it doesn't give the same results, strange]

@Coeur
Copy link
Contributor Author

Coeur commented Apr 25, 2019

What are the performance implications of this change?

The filenames are split on their digits part, then the sort is made on those collections: the cost is probably quite low. We may do a performance test, but project filenames are generally short (often less than 50 chars) and don't often have more than 1 digit part. So I bet it would be at worst sorting twice slower than before.

@Coeur Coeur force-pushed the xcode_sort branch 3 times, most recently from c8a13c9 to ca7b41d Compare April 26, 2019 05:47
@Coeur
Copy link
Contributor Author

Coeur commented Apr 26, 2019

@segiddins I've now solved all my known test cases.

@Coeur
Copy link
Contributor Author

Coeur commented May 3, 2019

@dnkoutso Would you like to review this PR as well?

@Coeur
Copy link
Contributor Author

Coeur commented May 15, 2019

OK, I figured out something worth mentioning:
This PR solves the "integer" natural sorting, but it does not solve the "unicode" sorting.

Solving the unicode sort would most likely require activesupport dependency, which was removed two years ago by @segiddins according to the changelog.
So I'd like to know if this PR, as it is currently, is worth merging for its partial improvement on sorting, or if I need to re-introduce activesupport in Xcodeproj in order to be able to use transliterate or mb_chars.normalize(:kd) and attempt some Unicode Collation Algorithm sorting?

[edit]
Oh, nevermind, I just discovered https://ruby-doc.org/stdlib-2.3.1/libdoc/unicode_normalize/rdoc/String.html which doesn't require activesupport. I'll give it try and update the PR with it.

@dnkoutso
Copy link
Contributor

Mostly concerned with performance implications here. Large projects can take a big hit. Can you simulate a scenario with many many files and benchmark it?

@Coeur Coeur force-pushed the xcode_sort branch 3 times, most recently from a113a92 to 3ebffad Compare May 16, 2019 16:43
@Coeur
Copy link
Contributor Author

Coeur commented May 16, 2019

@dnkoutso I was busy solving some edge cases, and I figured out that Xcode sorting is broken for Unicode: http://www.openradar.me/radar?id=5012044621283328

I also figured out, from https://travis-ci.org/CocoaPods/Xcodeproj/builds/533316561 that ruby 2.4 and newer will sort Unicode strings differently than in ruby 2.3 and older.

Lastly, while macOS 10.14.5 is bundled with ruby 2.3.7p456, the upcoming macOS 10.15 will be bundled with ruby 2.6.0 according to Google: https://www.google.com/search?q=ruby+macOS+%2210.15%22.

All in all, we should wait for macOS 10.15 beta in a couple of weeks and see if Xcode sorting is affected by the newer version of ruby.

As for performances, I suggest generating 10,000 UDID (easy mix of letters and digits) and measuring sorting that data with a plain downcase compare and with our XcodeSortString helper. I haven't done the benchmark yet.

@dnkoutso
Copy link
Contributor

As for performances, I suggest generating 10,000 UDID (easy mix of letters and digits) and measuring sorting that data with a plain downcase compare and with our XcodeSortString helper. I haven't done the benchmark yet.

sg!

@dnkoutso
Copy link
Contributor

dnkoutso commented Jun 4, 2019

any updates here? Looking to make a release for Xcode 11 this week.

@Coeur
Copy link
Contributor Author

Coeur commented Jun 5, 2019

@dnkoutso :

require 'benchmark'
require 'securerandom'

array = (1..10000).map { SecureRandom.uuid }
array1 = array.dup
array2 = array.dup

Benchmark.bmbm do |x|
    x.report("downcase") { array1.sort! do |x, y|; x.downcase <=> y.downcase; end }
    x.report("XcodeSortString") { array2.sort! do |x, y|; Xcodeproj::XcodeSortString.new(x) <=> Xcodeproj::XcodeSortString.new(y); end }
end

output:

                      user     system      total        real
downcase          0.010000   0.000000   0.010000 (  0.011024)
XcodeSortString  14.490000   0.050000  14.540000 ( 14.556600)

lol ... I guess we need to give the choice between a fast sort and an accurate sort.

Another issue is that we're using sort!, which cancels the caching efforts made in the conversion. We would get better results with a sort_by!:

Benchmark.bmbm do |x|
    x.report("downcase") { array1.sort_by! do |x|; x.downcase; end }
    x.report("XcodeSortString") { array2.sort_by! do |x|; Xcodeproj::XcodeSortString.new(x); end }
end
                      user     system      total        real
downcase          0.010000   0.000000   0.010000 (  0.004472)
XcodeSortString   0.790000   0.000000   0.790000 (  0.793635)

@dnkoutso
Copy link
Contributor

dnkoutso commented Jun 5, 2019

If we can maintain the performance or make it negligible we can land this!

@Coeur Coeur force-pushed the xcode_sort branch 3 times, most recently from 48dfac8 to 864d6ce Compare June 5, 2019 06:27
@Coeur
Copy link
Contributor Author

Coeur commented Jun 5, 2019

I just found issues in sorting symbols... so it's not ready to merge.

@Coeur Coeur marked this pull request as draft October 4, 2024 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants