This repository has been archived by the owner on Apr 27, 2023. It is now read-only.

Train the model for customised train-test split #281

Open
kdmsit opened this issue Aug 28, 2021 · 4 comments

kdmsit commented Aug 28, 2021

I have around 40K crystal structures from the Materials Project database in .cif format. I want to train the MEGNet model from scratch using my own train-test split (e.g., 20% train, 80% test) for the formation energy and band gap properties. Could you please help me with how to do that?
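A custom split like the one described here can be built from a shuffled list of sample indices. The following is only a minimal sketch, assuming the structures are numbered `0` to `n_samples - 1`; the helper name `split_indices`, the seed, and the sample count are illustrative and not part of MEGNet.

```python
import numpy as np


def split_indices(n_samples, train_fraction=0.2, seed=0):
    """Return (idx_train, idx_test): disjoint index arrays covering range(n_samples)."""
    rng = np.random.default_rng(seed)      # fixed seed so the split is reproducible
    perm = rng.permutation(n_samples)      # shuffled copy of 0..n_samples-1
    n_train = int(round(train_fraction * n_samples))
    return perm[:n_train], perm[n_train:]


# Hypothetical sizes matching the question: ~40K structures, 20% train, 80% test.
idx_train, idx_test = split_indices(40_000, train_fraction=0.2)
print(len(idx_train), len(idx_test))
```

The resulting `idx_train` and `idx_test` arrays can then be used to select which .cif files go into training and which are held out for evaluation.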

Contributor

chc273 commented Aug 30, 2021

@kdmsit can you be more specific?

Please see the example notebooks for how to use the models. Also, the MEGNet model predicts intensive properties, so for extensive properties you will need to convert them to per-atom quantities.
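As a rough illustration of the per-atom conversion mentioned above, an extensive target (such as a total energy for the whole cell) can be divided by the number of atoms in the cell. This is a sketch with made-up numbers; the helper name `to_per_atom` is hypothetical, and with pymatgen the atom count could be taken from `len(structure)`.

```python
def to_per_atom(total_value, num_atoms):
    """Convert an extensive quantity (per cell) to an intensive one (per atom)."""
    if num_atoms <= 0:
        raise ValueError("num_atoms must be positive")
    return total_value / num_atoms


# Made-up example: a cell with 4 atoms and a total energy of -12.0 eV.
e_per_atom = to_per_atom(-12.0, 4)
print(e_per_atom)
```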

Author

kdmsit commented Aug 31, 2021

I am using the following code snippet for it:

```python
import os

import numpy as np
from megnet.data.crystal import CrystalGraph
from megnet.models import MEGNetModel
from pymatgen.core.structure import Structure

# idx_train, idx_test, data_path, id_prop_data and index are defined elsewhere.
nfeat_bond = 100
epoch = 1000
r_cutoff = 5
gaussian_centers = np.linspace(0, r_cutoff + 1, nfeat_bond)
gaussian_width = 0.5
graph_converter = CrystalGraph(cutoff=r_cutoff)
model = MEGNetModel(graph_converter=graph_converter, centers=gaussian_centers, width=gaussian_width)

# Convert each training structure to a graph; set aside the ones that fail.
graphs_valid = []
targets_valid = []
structures_invalid = []
for i in idx_train:
    crystal = Structure.from_file(os.path.join(data_path, str(i) + '.cif'))
    p = float(id_prop_data[i][index])
    try:
        graph = graph_converter.convert(crystal)
        graphs_valid.append(graph)
        targets_valid.append(p)
    except Exception:
        structures_invalid.append(crystal)
print("Train Data Load Done......")

print("Training the model......")
model.train_from_graphs(graphs_valid, targets_valid, epochs=epoch)

# Absolute error for each test structure.
for i in idx_test:
    try:
        new_structure = Structure.from_file(os.path.join(data_path, str(i) + '.cif'))
        pred_target = model.predict_structure(new_structure)
        true_target = float(id_prop_data[i][index])
        ae = abs(float(pred_target[0]) - true_target)
    except Exception:
        # The posted snippet is cut off before its except clause;
        # skipping the structure here is an assumption.
        continue
```

But I am not able to achieve good results. Could you please help me understand whether I am doing the training correctly?
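One way to put "good results" in context is to compare the test-set mean absolute error against a trivial baseline that always predicts the training-set mean. The sketch below uses made-up numbers, and the helper names `mean_absolute_error` and `baseline_mae` are illustrative only; in practice the lists would hold the true and predicted targets collected in the test loop.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Mean of |y_true - y_pred| over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def baseline_mae(y_train, y_test):
    """MAE obtained by predicting the training-set mean for every test sample."""
    y_test = np.asarray(y_test, dtype=float)
    constant = np.full_like(y_test, np.mean(y_train))
    return mean_absolute_error(y_test, constant)


# Made-up targets and predictions, for illustration only.
y_train = [0.0, 1.0, 2.0]
y_test = [1.0, 3.0]
y_pred = [1.5, 2.5]

model_mae = mean_absolute_error(y_test, y_pred)
base_mae = baseline_mae(y_train, y_test)
print(model_mae, base_mae)
```

If the model's MAE is not clearly below the baseline, that points to a problem with the targets or the data pipeline rather than with the number of epochs.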

Contributor

chc273 commented Sep 8, 2021

@kdmsit I don't see an issue in the code. In general, you need to check whether the target properties are intensive and whether they can be predicted from the structure. Please provide more details if you still cannot find a solution.

Contributor

chc273 commented Sep 8, 2021

If it is only MP structures, formation energy, and band gap, those should be fairly easy to train. See this notebook for an example: https://github.com/materialsvirtuallab/megnet/blob/master/notebooks/crystal_example.ipynb
