Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Find the minimum side length for cells such that minimization doesn't fail sometimes. #2

Open
mrshirts opened this issue Feb 2, 2022 · 14 comments

Comments

@mrshirts
Copy link

mrshirts commented Feb 2, 2022

Ww need to figure out how to set the minimum distance such that minimization doesn't fail when the cell box vectors change.

In theory, one could have the boxes change by a very large amount, on minimization, so one cannot guarantee that it will never fail, but in practice, there is a minimum distance.

Sam did this with 1.8 nm, and some percentage got stuck (see the readme). Is 2.0 nm enough? Does it need to be larger? We should figure out.

@mrshirts mrshirts changed the title Find the minimum side length for cells such that minimization doesn't Find the minimum side length for cells such that minimization doesn't fail sometimes. Feb 2, 2022
@Yu-Tang-Lin
Copy link
Collaborator

I check Sam's code. Currently, the size of a supercell is 2.0 nm. I am trying 3.0 nm and 4.0 nm for testing.

@mrshirts
Copy link
Author

mrshirts commented Feb 9, 2022

I think Sam might only have tested 1.8, and 2.0 was put into the code but not tested. But I can't remember.

3.0 will be bigger than needed. It should work as long as the cell length is 2x the short-range energy cutoff (which is 0.9 nm). But there could be some cells that start at greater than 2.0, but end up less than 1.8 nm

I would try 2.2 and 2.4 nm. 3.0 nm is almost certainly not needed, and will be slower - the approximate compute time scales with the size of the system, and a system that is 3.0 x 3.0 x 3.0 nm will be (3/2)**3 = 3.38 times slower.

@shirtsgroup shirtsgroup deleted a comment from Yu-Tang-Lin Feb 9, 2022
@Yu-Tang-Lin
Copy link
Collaborator

Okay, I will try this 2.2/2.4. Currently, we have around 300 COD files, and only 10% of files can find the minimization energy.

@mrshirts
Copy link
Author

Huh, can you check in with Sam Kennedy on this? I think he was getting the majority of the files to minimize. Check the README.md file

@Yu-Tang-Lin
Copy link
Collaborator

Sure, it will be better for us to confirm that.

@Yu-Tang-Lin
Copy link
Collaborator

Note for @Yu-Tang-Lin
Number of min energy found from 292 COD file

For different supercells,
20A: 33/292
22A: 37/292
24A: 53/292

Also, check with Sam whether the majority of the files minimize.

@Yu-Tang-Lin
Copy link
Collaborator

Yu-Tang-Lin commented Feb 14, 2022

Note for @Yu-Tang-Lin

Number of min energy found from 292 COD file

  1. Around 120 files exist SMILES database error
  2. For energy minimization with box, total around 170 files
    28 files can find minimization energy (18A)
    33 files can find minimization energy (20A)
    37 files can find minimization energy (22A)
    53 files can find minimization energy (24A)
    56 files can find minimization energy (26A)
    56 files can find minimization energy (28A)
    57 files can find minimization energy (30A)
    57 files can find minimization energy (32A)

@mrshirts
Copy link
Author

So, I think that we need more information:
So probably something like 24 A covers most cases.
For the ~170 files that don't have smiles error, how many of them:

  • Can minimize with OpenMM alone (i.e. minimize coordinates, without changing the box
  • Figure out what error occurs when they minimize box vectors as well - where does it fail in the process and what error is produced by that part of the code?

One thing that could be investigated is the effect of choosing different amounts of change when computing finite difference derivatives of the box vectors. Is minimization more stable if that difference is 2x bigger, or 2x smaller?

Probably makes sense (after collecting the data above) to see if minimizing in a,b,c,\alpha,\beta,\gamma is better than minimizing in the 6 box vectors (will need to play around with math!)

@Yu-Tang-Lin
Copy link
Collaborator

Note for @Yu-Tang-Lin

The reason for error occurs when they minimize box vectors. (Generic error - Total 134 files)

(36 files) ERROR: root: The periodic box size has decreased to less than twice the nonbonded cutoff.
(97 files) ERROR:root: Periodic box vectors must be in reduced form.
Chick this, it happened before
michellab/Sire#345
(1 file) - Topology

@Yu-Tang-Lin
Copy link
Collaborator

Note for @Yu-Tang-Lin

I guess I know the reason. First, when I am changing the epsilon it indeed can solve some partial problems for
(1) ERROR: root: The periodic box size has decreased to less than twice the nonbonded cutoff.
(2) ERROR: root: Periodic box vectors must be in reduced form.
Some of the files for this error may be solved or the error log shows a different message
(For example, for the same file, originally is (1), change epsilon the error shows (2))

Let me first assume those two issues may come from the same root cause.
Check the source code of OpenMM
https://github.com/openmm/openmm/blob/master/openmmapi/src/ContextImpl.cpp
Line:281
if (a[0] <= 0.0 || b[1] <= 0.0 || c[2] <= 0.0 || a[0] < 2*fabs(b[0]) || a[0] < 2*fabs(c[0]) || b[1] < 2*fabs(c[1])) throw OpenMMException("Periodic box vectors must be in reduced form.")

It come from the rule of OpenMM Periodic Boundary Conditions
http://docs.openmm.org/7.6.0/userguide/theory/05_other_features.html
image
We should add this in our code to make sure the crystal length fit the rule of OpenMM Periodic Boundary Conditions

@mrshirts
Copy link
Author

OK, interesting. I would imagine that a_x>0, b_y> 0, c_z > 0 would always be satisfied as long as there isn't a numerical stability issue with the minimization.

The question is whether one of the other box conditions is not satisfied, not because it is unphysical, but because it is in the wrong format.

I wonder if this could be fixed by initially, before minimization, rotating the box so that the longest side of the box is along the a_x axis, and the second longest axis along the b_xy plane? I am not entirely certain, though, I'm not quite understanding why OpenMM requests this format of the cells.

@mrshirts
Copy link
Author

mrshirts commented Feb 23, 2022

Here is one package that could potentially be used to understand how to do some of these calculations: https://github.com/qzhu2017/PyXtal.

This code (and the functions it links to) can explain how to do the conversion as well.

Specifically, look at: https://github.com/qzhu2017/PyXtal/blob/master/pyxtal/lattice.py#L1621 (para2matrix)

And https://github.com/qzhu2017/PyXtal/blob/master/pyxtal/lattice.py#L1581 (matrix2para)

@Yu-Tang-Lin
Copy link
Collaborator

Yu-Tang-Lin commented Mar 1, 2022

Note for @Yu-Tang-Lin

For a coordinate system,
Let's summarize how OpenFFCrystalBenchmarking and OpenMM current code do first
COD (A, B, C, alpha, beta, gamma) → OpenMM getPeriodicBoxVectors (change to Cartesian coordinates) → OpenMM getPotentialEnergy (x, y, z) → optimizer

What we want to do is
COD (A, B, C, alpha, beta, gamma) → OpenMM getPeriodicBoxVectors (x, y, z) → coordinate convert to (A, B, C, alpha, beta, gamma)→OpenMM getPotentialEnergy (x, y, z) → optimizer

Currently, I found our object function only accepts (x, y, z) as input, let me think about how to fix that
Maybe start from our jacobian rather initial value!

@Yu-Tang-Lin
Copy link
Collaborator

COD (A, B, C, alpha, beta, gamma) → OpenMM getPeriodicBoxVectors (x, y, z) → OpenMM getPotentialEnergy (x, y, z) → epsilon convert to (A, B, C, alpha, beta, gamma) → add to to jacobian → convert back to (x, y, z) → OpenMM getPotentialEnergy (x, y, z) → optimizer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants