a simple VAE to encode protein condensate + small molecule trajectory data

requirements

numpy
pandas
tqdm
scikit-learn (imported as sklearn)
pytorch (pip package: torch)
torchvision
pyro (pip package: pyro-ppl)
seaborn

data format

The data is expected to be a CSV file in which each row is one datapoint. If your data has a label column that you'd like to exclude from the representation, pass label_col=NAME_OF_COL when initializing InteractionDataset in trainer.py. The default label_col is protein.
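To make the expected layout concrete, here is a minimal sketch of separating a label column from the feature columns with pandas. It is not the code inside InteractionDataset; the helper name split_features_and_labels and the feature column names f0, f1, f2 are made up for illustration, and only the default label_col of protein comes from this README.

```python
import io

import pandas as pd


def split_features_and_labels(csv_source, label_col="protein"):
    """Read a CSV (one datapoint per row) and split off the label column.

    Hypothetical helper for illustration only; InteractionDataset may
    handle this differently.
    """
    df = pd.read_csv(csv_source)
    if label_col in df.columns:
        labels = df[label_col]
        features = df.drop(columns=[label_col])
    else:
        # No label column present: every column is treated as a feature.
        labels = None
        features = df
    return features.to_numpy(dtype="float32"), labels


# Tiny in-memory CSV standing in for a real data file.
example_csv = io.StringIO(
    "protein,f0,f1,f2\n"
    "A,0.1,0.2,0.3\n"
    "B,0.4,0.5,0.6\n"
)
features, labels = split_features_and_labels(example_csv)
print(features.shape)  # (2, 3)
```

In this sketch the number of feature columns (3) is what you would presumably pass as --input_dim when training.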

training

1. In train.py, update DATA_DIR and MODEL_DIR to match your directory setup.
2. Run python train.py --input_dim 239 --z_dim 16 --hidden_dim 64, substituting your desired dimensions.
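For readers unfamiliar with how flags like these are usually consumed, the sketch below parses the three dimensions with argparse. It is an assumption about how a script such as train.py might read them, not a copy of the actual file; the defaults simply reuse the values from the command above, and the help strings reflect the usual VAE meaning of each dimension.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(description="Sketch of VAE training flags")
    # Assumed meaning: number of feature columns per datapoint.
    parser.add_argument("--input_dim", type=int, default=239,
                        help="size of each input vector")
    # Assumed meaning: dimensionality of the latent space.
    parser.add_argument("--z_dim", type=int, default=16,
                        help="size of the latent representation")
    # Assumed meaning: width of the encoder/decoder hidden layer.
    parser.add_argument("--hidden_dim", type=int, default=64,
                        help="size of the hidden layer")
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(
    ["--input_dim", "239", "--z_dim", "16", "--hidden_dim", "64"]
)
print(args.input_dim, args.z_dim, args.hidden_dim)
```

If your CSV has a different number of feature columns, input_dim is the value most likely to need changing.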

If you'd like to use a Slurm submission script (particularly on MIT's SuperCloud), edit the llsub.sh file and try running LLsub llsub.sh -g volta:1. If you are using a conda environment, change the environment name in the script (mine is called pcsm).

evaluation

An example of running evaluation is included in llsub.sh.

lindseyguan/PCSM-VAE
