Duplicate SMILES cntd. #20

Open
simolon opened this issue Jan 21, 2025 · 2 comments

simolon commented Jan 21, 2025

I understand that tails 4 and 5 appear as duplicate structures in the article figure, and I also saw the resulting duplicate SMILES codes in the data, as described in #12 (comment).

Do you have a clean version of the experimental data in which the values for the cis and trans lipids can be distinguished, and could you share it? Alternatively, is there a way to derive it from the shared dataset, e.g. from the order of the tails you used when constructing the lipid library from its components?
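To make the second idea concrete, here is a minimal sketch of what I have in mind. It assumes the library was enumerated as a plain Cartesian product of the components and that the dataset rows kept that order, with the tail varying fastest. I do not know whether either holds for the shared data, and the component names below are hypothetical placeholders.

```python
from itertools import product


def label_rows_by_components(heads, linkers, tails):
    """Return (head, linker, tail) labels in itertools.product order.

    The last component (tails) varies fastest, so row i of an
    enumeration written in this order corresponds to labels[i].
    """
    return list(product(heads, linkers, tails))


# Hypothetical component names; the real components and their order
# are not given in this thread.
heads = ["H1", "H2"]
linkers = ["L1"]
tails = ["T4", "T5"]  # stand-ins for two tails that share one SMILES string

labels = label_rows_by_components(heads, linkers, tails)
for row_index, label in enumerate(labels):
    print(row_index, label)
```

If the rows really were written in such an order, two rows with identical SMILES could be told apart by their position alone, and a single record of which tail index is cis and which is trans would be enough to label them.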

Thanks a lot!

@cpuxuyue

The SMILES strings were enumerated using MARVIN (https://chemaxon.com/marvin), which has inherent limitations and cannot distinguish between cis and trans isomers. Addressing this issue would require manual annotation to clean the data, a highly time-intensive process. However, since the transfection potency of the cis and trans tails was found to be highly comparable, such efforts are unlikely to significantly enhance data quality.
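As a general illustration of why such duplicates arise (a sketch of the SMILES notation only, not of MARVIN's behavior): SMILES encodes double-bond geometry with the directional bond symbols `/` and `\`. When these symbols are absent, a cis isomer and a trans isomer are written as the same string. The helper name below is hypothetical, and 2-butene is used only as a small textbook example.

```python
def strip_double_bond_stereo(smiles: str) -> str:
    """Remove the directional bond symbols '/' and '\\' from a SMILES string.

    This is a crude text-level operation for illustration; it does not
    canonicalize the SMILES or touch tetrahedral stereo markers.
    """
    return smiles.replace("/", "").replace("\\", "")


trans_2_butene = "C/C=C/C"   # substituents on opposite sides of the double bond
cis_2_butene = "C/C=C\\C"    # substituents on the same side of the double bond

# With the directional symbols removed, both isomers give the same string.
print(strip_double_bond_stereo(trans_2_butene))
print(strip_double_bond_stereo(cis_2_butene))
```

Once the geometry is missing from the string, it cannot be recovered from the SMILES alone, which is why separate annotation would be needed.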


simolon commented Jan 21, 2025

Thanks for the explanations! I had hoped that the experimental data had been recorded or processed in a structure similar to that of the heat map, i.e. categorized by the three components. Then one would only need to identify which tail was which isomer, possibly from a single entry in the lab journal. But of course I could only guess at what data and documentation are available, so I was probably too optimistic.

In any case, thanks for sharing the code and data!
