Graph theoretic fragmentation via graphgen #86
Great improvement! Thank you!
I have a few strongly requested changes and some other comments.
Oh, and the most important comment: I really would add a test, probably something like |
Alright, unit tests have been added for both |
Looks really good now.
The only big question mark I still have is about the euclidean_norm function.
A general comment just for the record and potential improvements for the future; we talked about it in person.
I think there is unnecessary casting between lists, tuples and sets.
This graph operation can most likely be done entirely with sets and frozenset, with only one cast to an ordered container such as list at the end.
The current implementation needs more book-keeping to avoid duplicates, and it casts between containers more often than necessary.
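To make the suggestion concrete, here is a minimal sketch of the pattern, not the code in this PR. The function name `unique_fragments` and the input shape (an iterable of atom-index collections) are hypothetical and chosen only for illustration.

```python
def unique_fragments(candidates):
    """Deduplicate candidate fragments given as iterables of atom indices.

    Hypothetical helper: each fragment becomes a frozenset, so atom order and
    repeated indices disappear, and the enclosing set drops duplicate fragments.
    """
    unique = {frozenset(atoms) for atoms in candidates}
    # Single cast to ordered containers at the very end.
    return sorted(sorted(frag) for frag in unique)


fragments = unique_fragments([[0, 1, 2], (2, 1, 0), [3, 4], {4, 3}])
print(fragments)  # [[0, 1, 2], [3, 4]]
```

Because frozenset is hashable, no manual duplicate check is needed while the fragments are being collected.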
But the code works, is really well tested, and readable. If you want to keep it like this, then I am also fine.
The idea is rather straightforward: graphgen generalizes the BEn fragmentation scheme to arbitrary fragment sizes (I've tested BE1-BE9 so far) by using a graph-theoretic heuristic. Atoms are assigned to nodes in an adjacency graph, and edges are weighted by some distance metric. For a given fragment center site, Dijkstra's algorithm is used to find the shortest path from that center to each of its neighbors. The number of nodes visited on that shortest path determines the degree of separation of the corresponding neighbor. That is, all atoms whose shortest paths from the center site visit at most 1 node must be direct neighbors (adjacent to the center site), which gives BE2-type fragments; all atoms whose shortest paths visit at most 2 nodes must then be second-order neighbors, hence BE3; and so on. This depends on the NetworkX library.

Major points to note:
autogen for n=1->3)
EXTRA: now adds a FragmentMap data class.
EXTRA2: there are 14 new unittests covering both autogen and graphgen.
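
The shortest-path heuristic described above can be sketched as follows. This is an illustration under stated assumptions, not the graphgen implementation: it uses a plain heapq-based Dijkstra so that it runs without dependencies (the PR itself relies on NetworkX), and the names `build_adjacency`, `shortest_path_hops`, `fragment`, `cutoff` and `n_be` are hypothetical. It also assumes that "nodes visited" counts every node on the path except the center, so BEn keeps atoms at most n - 1 steps away.

```python
import heapq
import math


def build_adjacency(coords, cutoff):
    """Connect atoms closer than `cutoff`; edge weight is the Euclidean distance."""
    adj = {i: {} for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                adj[i][j] = d
                adj[j][i] = d
    return adj


def shortest_path_hops(adj, center):
    """Dijkstra from `center`; map each reachable atom to the number of
    steps on its shortest weighted path."""
    dist = {center: 0.0}
    hops = {center: 0}
    heap = [(0.0, center)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in adj[u].items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                hops[v] = hops[u] + 1
                heapq.heappush(heap, (nd, v))
    return hops


def fragment(adj, center, n_be):
    """Atoms whose shortest path from `center` has at most n_be - 1 steps."""
    hops = shortest_path_hops(adj, center)
    return sorted(atom for atom, h in hops.items() if h <= n_be - 1)


# Toy example: five atoms on a line, one unit apart.
chain = [(float(x), 0.0, 0.0) for x in range(5)]
adj = build_adjacency(chain, cutoff=1.1)
print(fragment(adj, center=2, n_be=2))  # [1, 2, 3]
print(fragment(adj, center=2, n_be=3))  # [0, 1, 2, 3, 4]
```

Note that with distance-weighted edges the shortest weighted path is not necessarily the path with the fewest steps; the sketch counts steps along the weighted path, as the description above does.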