How to define fixed bin set ? #3

Open
IamksGEEK opened this issue Sep 4, 2023 · 2 comments
Comments

IamksGEEK commented Sep 4, 2023

Hi
Could you please guide me on obtaining the fixed bin set? It's a critical component for validating the GEMINI model on an independent cohort. However, I couldn't find this vital information in either your code or your paper.
While reviewing the related paper for this code, I came across the explanation in Supplementary Figure 9.a, which seems to hint at the meaning of the fixed bin set used in the fixed GEMINI model. I've quoted the relevant portion here:

Supplementary Figure 9. Genome-wide fixed bins utilized for analysis of single molecule mutation frequencies and detection of lung cancer in cfDNA. a, Percent similarity of bins identified as being enriched for mutations in lung cancer and non-cancer samples in each training fold compared to the sets of bins utilized in the fixed model that were identified from analyses of all samples.

Does this imply that the fixed bin set consists of bins with the largest regional difference for all samples, specifically in the LUCAS cohort (n=365)?
I appreciate any assistance you can provide. Thank you for your help!

Best regards,
Kongshuang

IamksGEEK changed the title from "How to apply model generated by GMINI on new data?" to "How to apply model generated by GEMINI on new data?" Sep 5, 2023
IamksGEEK changed the title from "How to apply model generated by GEMINI on new data?" to "How to fix GEMINI model to predict new sample?" Sep 5, 2023
IamksGEEK changed the title from "How to fix GEMINI model to predict new sample?" to "How to get fixed bin set ?" Sep 5, 2023
IamksGEEK changed the title from "How to get fixed bin set ?" to "How to define fixed bin set ?" Sep 5, 2023

yeyup commented Sep 21, 2023

Hi, IamksGEEK. I believe that the fixed bin sets obtained from the LUCAS cohort were generated using all training samples (n=110) without employing the leave-one-out validation strategy. I tried this approach on my independent dataset, but it yielded poor results. I am still investigating whether other factors might be affecting the outcome.
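
For readers following along, here is a minimal Python sketch of the difference between selecting bins once from all training samples (the "fixed" set) and reselecting them inside a leave-one-out fold. This is not the GEMINI code. The ranking rule (largest cancer minus non-cancer difference in mean per-bin mutation frequency), the function names `select_fixed_bins` and `percent_overlap`, and the toy numbers are all assumptions made for illustration, since the exact criterion is not confirmed in this thread.

```python
from statistics import mean


def select_fixed_bins(freqs, labels, n_bins):
    """Return the n_bins bin indices with the largest cancer-minus-control
    difference in mean mutation frequency (assumed ranking rule).

    freqs  : dict of sample id -> list of per-bin mutation frequencies
    labels : dict of sample id -> True (cancer) or False (non-cancer)
    """
    cancer = [freqs[s] for s in freqs if labels[s]]
    control = [freqs[s] for s in freqs if not labels[s]]
    num_bins = len(next(iter(freqs.values())))
    diffs = [
        mean(v[b] for v in cancer) - mean(v[b] for v in control)
        for b in range(num_bins)
    ]
    ranked = sorted(range(num_bins), key=lambda b: diffs[b], reverse=True)
    return sorted(ranked[:n_bins])


def percent_overlap(fold_bins, fixed_bins):
    """Percent of the fixed bins that also appear in a fold's bin set."""
    fixed = set(fixed_bins)
    return 100.0 * len(fixed & set(fold_bins)) / len(fixed)


# Toy data: 4 samples, 5 bins (made-up values, for illustration only).
freqs = {
    "c1": [5.0, 1.0, 4.0, 1.0, 1.0],
    "c2": [6.0, 1.0, 5.0, 1.0, 2.0],
    "n1": [1.0, 1.0, 1.0, 1.0, 1.0],
    "n2": [2.0, 1.0, 1.0, 1.0, 1.0],
}
labels = {"c1": True, "c2": True, "n1": False, "n2": False}

# "Fixed" set: chosen once from every training sample, no hold-out.
fixed_bins = select_fixed_bins(freqs, labels, n_bins=2)

# One leave-one-out fold: hold out sample "c1" and reselect the bins.
fold_bins = select_fixed_bins(
    {s: v for s, v in freqs.items() if s != "c1"},
    {s: y for s, y in labels.items() if s != "c1"},
    n_bins=2,
)

print(fixed_bins, percent_overlap(fold_bins, fixed_bins))
```

Applying a model to an independent cohort would then mean reusing `fixed_bins` unchanged on the new samples rather than reselecting bins from them.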

IamksGEEK commented Sep 22, 2023

Hi, yeyup.
Thank you for your reply! I also used all of the training samples to build the model, but its accuracy was very poor, which was disappointing :(. I also found that the number of mutations is influenced by factors such as sequencing depth, which causes large fluctuations in mutation counts between samples. I have now given up on analyzing mutations quantitatively and have started analyzing mutation signatures (mutsignatures) instead, which perform better on our data.
I hope you get the results you want in your experiments.
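
To illustrate the depth effect described above, here is a small Python sketch that converts raw mutation counts into a rate per million evaluated bases. This is a generic normalization offered as an assumption, not something taken from GEMINI or confirmed to fix the problem in this thread. The helper name `mutations_per_megabase` and the sample numbers are hypothetical.

```python
def mutations_per_megabase(mutation_count, bases_evaluated):
    """Scale a raw mutation count by the number of bases evaluated,
    so samples sequenced at different depths become comparable."""
    if bases_evaluated <= 0:
        raise ValueError("bases_evaluated must be positive")
    return mutation_count * 1_000_000 / bases_evaluated


# Made-up samples: the deeper one has 4x the raw count and 4x the bases.
samples = {
    "low_depth": {"mutations": 200, "bases": 2_000_000_000},
    "high_depth": {"mutations": 800, "bases": 8_000_000_000},
}

rates = {
    name: mutations_per_megabase(s["mutations"], s["bases"])
    for name, s in samples.items()
}

for name, rate in rates.items():
    print(name, rate)
```

In this toy case the raw counts differ fourfold, but the per-megabase rates are identical, which is the kind of fluctuation a depth-aware rate is meant to remove.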
