-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IO optimization #164
base: main
Are you sure you want to change the base?
IO optimization #164
Conversation
implented some small speedups
still inexplicably have an order-of-magnitude speed difference between calling `parse_newick` on a newick string and doing the _exact same thing_ within the io code...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a few minor comments. Some of the test failures are legitimate and at least one seems to be related to this PR (https://travis-ci.org/github/dendrograms/astrodendro/jobs/523668907#L1286).
self._smallest_index = min(self._indices) | ||
try: | ||
self._smallest_index = min(self._indices) | ||
except ValueError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What causes the ValueError in min? Is there a check somewhere if _indices
is empty?
parse_newick right now)
This is a WIP to speed up the reading and parsing of dendrograms from disk.
One key change in speeding up the parsing was removing a conversion from an array of coordinates to a list of tuples of coordinates. Creating N tuples was a very expensive step, apparently, and eliminating it cut the time by >50% in the
_fast_reader
.The WIP is (1) to make sure that we didn't break anything, really and (2) to try to speed up
parse_newick
.