Encode molecules starting from their SMILES #26
Comments
Hi @marcosbodio, Thank you for your question.
Hi @chao1224, thank you for your answer. I see that Table 5 in your paper lists results on DTA tasks with Davis and KIBA. These datasets contain SMILES of molecules, so how did you use GraphMVP (or GraphMVP-G, GraphMVP-C) on them? It would be very useful to see the code, because that would clarify the proper way to use your model starting from the SMILES of a molecule.
Hi @marcosbodio, Sure, you can check this Python script; specifically, this line sets which dataset to use.
Hi @chao1224, I have looked at the script that you linked above, and I think it is for fine-tuning your model, which I would prefer to avoid. I was hoping to use a checkpoint of your model instead. For example, I wonder if I could do something like this:
import torch
from rdkit import Chem
from rdkit.Chem.rdDistGeom import EmbedMolecule
from src_classification.GEOM_dataset_preparation import mol_to_graph_data_obj_simple_3D
smiles = 'Cn1cnc(c1)C(=O)c1ccc(CN2[C@H](Cc3ccccn3)C(=O)Nc3cc(Cl)ccc3C2=O)cc1'
mol = Chem.MolFromSmiles(smiles)
mol = Chem.AddHs(mol)
EmbedMolecule(mol=mol)
data = mol_to_graph_data_obj_simple_3D(mol)
and then feed data to the model loaded from the checkpoint to compute an embedding of the SMILES. What do you think?
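One detail worth guarding against in the snippet above: RDKit's `EmbedMolecule` returns `-1` when it cannot generate a conformer, and `MolFromSmiles` returns `None` for an unparsable string. A minimal sketch of the SMILES-to-3D step with both checks follows. The helper name `smiles_to_3d_mol` and the `seed` parameter are illustrative assumptions, not part of the GraphMVP repository.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_3d_mol(smiles, seed=0):
    """Parse a SMILES string and embed one 3D conformer (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # RDKit signals a parse failure by returning None rather than raising.
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")

    # Explicit hydrogens usually give more reasonable 3D geometries.
    mol = Chem.AddHs(mol)

    # EmbedMolecule returns the new conformer id, or -1 if embedding failed.
    conf_id = AllChem.EmbedMolecule(mol, randomSeed=seed)
    if conf_id < 0:
        raise RuntimeError(f"3D embedding failed for SMILES: {smiles!r}")
    return mol


if __name__ == "__main__":
    mol = smiles_to_3d_mol("CCO")
    # Atom coordinates of the single embedded conformer, shape (num_atoms, 3).
    print(mol.GetConformer().GetPositions().shape)
```

The returned molecule carries one conformer, so it can then be passed to `mol_to_graph_data_obj_simple_3D` as in the snippet above.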
Hi @marcosbodio, Yes, I think this is right if you want to use the 3D representation.
where |
Hi @chao1224, I have tried to load one of your model checkpoints, but I do not see
model_path = 'output/3D_hybrid_02_masking/GEOM_3D_nmol50000_nconf5_nupper1000/CL_1_VAE_1/6_51_10_0.1/0.3_EBM_dot_prod_0.1_normalize_l2_detach_target_2_100_0/pretraining_model.pth'
model = torch.load(f=model_path, map_location=torch.device('cpu'))
print(model.keys())
print('model_3D' in model)
The previous code prints the following:
Am I loading the wrong checkpoint?
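When it is unclear which sub-models a checkpoint contains, it can help to list its top-level keys and one level of nested entries before looking for a specific key. The sketch below is a generic helper for any dict-like object returned by `torch.load`. The function name `describe_checkpoint` and the toy dictionary are illustrative assumptions and do not reflect the actual layout of the GraphMVP checkpoints.

```python
from collections.abc import Mapping


def describe_checkpoint(checkpoint, max_depth=1, _prefix=""):
    """Return one description line per entry of a dict-like checkpoint.

    max_depth is the number of nested mapping levels to expand below the
    top level (0 lists only the top-level keys).
    """
    lines = []
    for key, value in checkpoint.items():
        name = f"{_prefix}{key}"
        if isinstance(value, Mapping):
            lines.append(f"{name}: mapping with {len(value)} entries")
            if max_depth > 0:
                lines.extend(describe_checkpoint(value, max_depth - 1, name + "/"))
        elif hasattr(value, "shape"):
            # Tensors and arrays expose .shape; report it instead of the values.
            lines.append(f"{name}: array-like with shape {tuple(value.shape)}")
        else:
            lines.append(f"{name}: {type(value).__name__}")
    return lines


if __name__ == "__main__":
    # Toy stand-in for a loaded checkpoint; with a real file one would use
    #   checkpoint = torch.load(model_path, map_location="cpu")
    toy = {"encoder": {"w": [1, 2], "inner": {"a": 1}}, "epoch": 3}
    print("\n".join(describe_checkpoint(toy)))
```

If the expected key is missing from the listing, the file was probably saved with a different structure than the loading code assumes.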
Hi @marcosbodio, I need to double-check the checkpoint files when I get time. Meanwhile, you should be able to use this checkpoint, which is one of the SOTA PaiNN pretraining methods (paper link).
Hello, I would like to know if it is possible to use GraphMVP to encode molecules starting from their SMILES. I have read this issue, but it does not help much. I would be really grateful if you could provide some explanation, and ideally an example. Thank you!