how to run this project #21

Open
wangfei-123 opened this issue Feb 23, 2024 · 6 comments

Comments

@wangfei-123

Hello author, I am trying to reproduce your project, but I am not sure which file is the entry point or which command should be entered in the terminal to run it. I have also seen that the project includes a pretrained model, but how should this model be used for prediction (i.e., which commands should be run)? Thank you!!

@ARY2260
Owner

ARY2260 commented Feb 23, 2024

Hello there. We recently posted a tutorial on DeepChem about OpenPOM usage. I hope it helps you get started:

https://deepchem.io/tutorials/predict-multi-label-odor-descriptors-using-openpom/

Feel free to ask anything else. Also, note that the benchmark scores come from an ensemble of 10 models. You can find the code for that in the examples section.
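As a rough illustration of what an ensemble prediction can look like, here is a minimal sketch that averages per-model probabilities element-wise. This is an assumption for illustration only: the thread does not say how the 10 models are combined, so check the examples section for the actual procedure. The function name `average_ensemble` and the toy numbers are hypothetical.

```python
from statistics import fmean


def average_ensemble(per_model_probs):
    """Average probabilities element-wise across models.

    per_model_probs: one matrix per model, each shaped
    [n_molecules][n_tasks] (nested lists or NumPy arrays both work).
    Averaging is an assumed combination rule, not confirmed by the thread.
    """
    n_molecules = len(per_model_probs[0])
    n_tasks = len(per_model_probs[0][0])
    return [
        [fmean(float(m[i][j]) for m in per_model_probs) for j in range(n_tasks)]
        for i in range(n_molecules)
    ]


# Toy values standing in for two models' outputs (not real predictions).
model_a = [[0.2, 0.8], [0.6, 0.4]]
model_b = [[0.4, 0.6], [0.2, 0.0]]
averaged = average_ensemble([model_a, model_b])
```

With real models, each entry of `per_model_probs` would be the output of one restored model's `predict` call on the same dataset.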

@wangfei-123
Author

wangfei-123 commented Feb 24, 2024 via email

@wangfei-123
Author

wangfei-123 commented Feb 25, 2024 via email

@ARY2260
Owner

ARY2260 commented Feb 28, 2024

I am sorry for the confusion and the missing docs for inference. I hope this resolves the issue.

To load the model, you need to initialize it with the same parameters that were used during training.

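from openpom.feat.graph_featurizer import GraphFeaturizer, GraphConvConstants
from openpom.models.mpnn_pom import MPNNPOMModel

# n_tasks, learning_rate and train_ratios must be defined beforehand
# with the same values that were used when the checkpoint was trained.

# initialize model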
model = MPNNPOMModel(n_tasks = n_tasks,
                     batch_size = 128,
                     learning_rate = learning_rate,
                     class_imbalance_ratio = train_ratios,
                     loss_aggr_type = 'sum',
                     node_out_feats = 100,
                     edge_hidden_feats = 75,
                     edge_out_feats = 100,
                     num_step_message_passing = 5,
                     mpnn_residual = True,
                     message_aggregator_type = 'sum',
                     mode = 'classification',
                     number_atom_features = GraphConvConstants.ATOM_FDIM,
                     number_bond_features = GraphConvConstants.BOND_FDIM,
                     n_classes = 1,
                     readout_type = 'set2set',
                     num_step_set2set = 3,
                     num_layer_set2set = 2,
                     ffn_hidden_list = [392, 392],
                     ffn_embeddings = 256,
                     ffn_activation = 'relu',
                     ffn_dropout_p = 0.12,
                     ffn_dropout_at_input_no_act = False,
                     weight_decay = 1e-5,
                     self_loop = False,
                     optimizer_name = 'adam',
                     log_frequency = 32,
                     model_dir = './experiments',
                     device_name ='cuda')

Then restore the model from a ".pt" checkpoint file:
model.restore("some_model.pt")

Suppose this is the .csv file containing the SMILES strings for inference:

SMILES
CC(O)CN
CCC(=O)C(=O)O
O=C(O)CCc1ccccc1
OCc1ccc(O)cc1
O=Cc1ccc(O)cc1
O=C(O)c1ccc(O)cc1
CC(=O)O
CC=O
CC(=O)C(C)O
CC(C)=O
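If it helps, the file above can be created programmatically. The sketch below writes the same ten SMILES strings to `infer_smiles.csv` (the filename used in the loading step) with Python's standard `csv` module; writing to the current working directory is an assumption.

```python
import csv

# The ten example SMILES strings listed above.
smiles = [
    "CC(O)CN",
    "CCC(=O)C(=O)O",
    "O=C(O)CCc1ccccc1",
    "OCc1ccc(O)cc1",
    "O=Cc1ccc(O)cc1",
    "O=C(O)c1ccc(O)cc1",
    "CC(=O)O",
    "CC=O",
    "CC(=O)C(C)O",
    "CC(C)=O",
]

# Write a single-column CSV with a "SMILES" header row.
with open("infer_smiles.csv", "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["SMILES"])
    writer.writerows([s] for s in smiles)
```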

Now load the CSV file and predict using the model:

import deepchem as dc
import pandas as pd
from openpom.feat.graph_featurizer import GraphFeaturizer

inference_csv_filepath = "infer_smiles.csv"
df = pd.read_csv(inference_csv_filepath)

# Featurize test SMILES
featurizer = GraphFeaturizer()
featurized_data = featurizer.featurize(df['SMILES']) # 'SMILES' is the name of the column that contains the SMILES strings

# Get predictions from the trained model
prediction = model.predict(dc.data.NumpyDataset(featurized_data))
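
To turn the raw output into readable odor labels, something like the following sketch may help. It assumes `prediction` is a 2-D array shaped (n_molecules, n_tasks) holding one probability per descriptor, which is worth confirming with `prediction.shape`. The helper `top_descriptors`, the three task names, and the toy probabilities are hypothetical; the real task list is the one used during training.

```python
def top_descriptors(probabilities, task_names, k=3):
    """Return the k highest-scoring (descriptor, probability) pairs per molecule.

    probabilities: rows of per-task scores (nested lists or a NumPy array).
    task_names: descriptor names in the same order as the model's tasks.
    """
    results = []
    for row in probabilities:
        ranked = sorted(
            zip(task_names, row), key=lambda pair: float(pair[1]), reverse=True
        )
        results.append([(name, float(score)) for name, score in ranked[:k]])
    return results


# Hypothetical task names and toy scores, not real model output.
task_names = ["fruity", "sweet", "green"]
toy_probabilities = [[0.1, 0.7, 0.2], [0.9, 0.3, 0.5]]
top = top_descriptors(toy_probabilities, task_names, k=2)
```

With the real model, `top_descriptors(prediction, TASKS)` would be the equivalent call, where `TASKS` stands for the training task list.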

@ARY2260
Owner

ARY2260 commented Feb 28, 2024

The included pretrained model, example_model.pt, is just an example. It may not give good results.

@wangfei-123
Author

wangfei-123 commented Mar 1, 2024 via email
