issues with graphgen
#90
Labels: bug, documentation, maintenance
While comparing the output of the different fragmentation codes, i.e. "autogen", "graphgen", and my upcoming "chemgen" (#85), for larger molecules, I noticed a few problems with "graphgen".
Incomplete Initialization
The graphgen fragmentation exposes its result via the `fragpart` class and looks as if it can be used as a replacement for autogen. Unfortunately, a few components are not initialized, although the information should be available.
graphgen should either initialize them or make it explicit that it is incompletely initialized compared to autogen.
Suppose a user wants to visualize the fragments and loops over `Frag_atom`: this gives unexpected results for graphgen, because the attribute is still an empty list there. Since this is (?) relevant for the mixed basis code, I added @lweisburn and @mscho527 to this issue as well.
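To make such gaps visible, a check along the following lines could flag attributes that were left empty. This is only a sketch: `FragpartLike` is a stand-in dataclass and not the real `fragpart` class, and `other_attr` is a hypothetical second attribute; only `Frag_atom` is taken from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, List, Sequence


@dataclass
class FragpartLike:
    """Stand-in for a fragmentation result; NOT the real fragpart class."""

    Frag_atom: List[Any] = field(default_factory=list)
    other_attr: List[Any] = field(default_factory=list)  # hypothetical attribute


def is_empty(value: Any) -> bool:
    """Return True for None and for zero-length containers."""
    if value is None:
        return True
    try:
        return len(value) == 0
    except TypeError:
        # Objects without a length (e.g. scalars) count as initialized.
        return False


def empty_attributes(obj: Any, names: Sequence[str]) -> List[str]:
    """List the attribute names that are missing or still empty."""
    return [name for name in names if is_empty(getattr(obj, name, None))]


# Mimic a result where only one of the two attributes was filled in.
result = FragpartLike(other_attr=[[0, 1, 2]])
print(empty_attributes(result, ["Frag_atom", "other_attr"]))  # prints ['Frag_atom']
```

Running a check like this on the results of both fragmentation codes would show which attributes differ in their initialization.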
Fails on a large molecule
Testing graphgen on a large molecule (342 atoms) from the project of @lweisburn and @mscho527 fails with an error.
I cannot attach the xyz file here, but I will send it to you, @ShaunWeatherly.
To reproduce, just run the graphgen fragmentation on the xyz file.
Bad scaling
The aforementioned error occurs after 36 s, i.e. the fragmentation has not even finished after 36 s.
On the same molecule, both "autogen" and "chemgen" need less than a second.
Since graphgen is equally fast for small molecules such as octane, it apparently scales far worse than the other algorithms.
If it is documented as a more or less drop-in replacement, it should preferably have similar scaling behaviour.
Otherwise, it should be documented that it is a considerably slower algorithm.
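A scaling comparison of this kind can be reproduced with a small timing harness. The sketch below is an assumption-laden illustration: `linear_stand_in` and `quadratic_stand_in` are placeholder workloads, not the actual fragmentation routines, and `time_fragmentation` is a hypothetical helper name. The sizes 26 and 342 correspond to octane and the large molecule mentioned above.

```python
import time
from typing import Callable, Dict, Sequence


def time_fragmentation(
    fragment: Callable[[int], object], sizes: Sequence[int]
) -> Dict[int, float]:
    """Measure the wall time of `fragment` for each system size."""
    timings: Dict[int, float] = {}
    for n_atoms in sizes:
        start = time.perf_counter()
        fragment(n_atoms)
        timings[n_atoms] = time.perf_counter() - start
    return timings


def linear_stand_in(n_atoms: int) -> list:
    """Placeholder workload that grows linearly with the number of atoms."""
    return [i for i in range(n_atoms)]


def quadratic_stand_in(n_atoms: int) -> int:
    """Placeholder workload that grows quadratically with the number of atoms."""
    return sum(1 for _ in range(n_atoms) for _ in range(n_atoms))


# Compare both placeholders on a small, a medium, and a large system.
for name, func in [("linear", linear_stand_in), ("quadratic", quadratic_stand_in)]:
    print(name, time_fragmentation(func, [26, 100, 342]))
```

Replacing the placeholders with the real fragmentation calls would show how the run time of each algorithm grows with the number of atoms.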
Misleading name and interface
Coming back to our discussion from the review.
The interface and name of `euclidean_norm` are misleading and should be changed.
- The arguments should be typed as `Vector` and not just as arbitrary tensors (the `np.asarray` is fully redundant in both cases).
- `np.floating[Any]` is unnecessary from my understanding; it can be just `np.floating`.
- Someone reading `euclidean_norm(x, y)` could assume it is the norm of one 2D vector with scalar components `x, y`, while in reality it is the distance between the vectors `x` and `y`.
All in all, the function would be better named and typed so that it reflects that it computes the distance between two vectors.
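For illustration only, a function along these lines could look as follows. This is a sketch under assumptions, not the code from the repository: the name `euclidean_distance` is hypothetical, and `Vector` is defined here as a simple stand-in alias rather than the project's own type.

```python
import numpy as np
import numpy.typing as npt

# Stand-in alias for a vector type; the project's own `Vector` may differ.
Vector = npt.NDArray[np.float64]


def euclidean_distance(x: Vector, y: Vector) -> np.floating:
    """Return the Euclidean distance between the vectors x and y."""
    # The inputs are already arrays, so no np.asarray conversion is needed.
    return np.linalg.norm(x - y)


a = np.array([0.0, 0.0, 0.0])
b = np.array([3.0, 4.0, 0.0])
print(euclidean_distance(a, b))  # 5.0
```

With a name like this, a call such as `euclidean_distance(a, b)` reads as a distance between two points rather than as the norm of a single vector.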