The params file format #4

Open
RupalHatkar opened this issue Jul 16, 2021 · 4 comments
Comments

@RupalHatkar

Hello MutSig2CV Authors,

The instructions for the params file say that an example file is available at test/input/params.txt, but there is no test folder.

I made a params_file.txt file containing remove_duplicate_patients = FALSE to instruct MutSig2CV not to remove duplicate patients, but it is not working: it keeps removing all the patients, leaving me with no results files.

Can you please guide me? I am getting the following error:

0 patients
WARNING: MutSig is not applicable to single patients.
Error using |
Matrix dimensions must agree.
Error in add_helper_is_fields (line 19)
Error in impute_callschemes (line 12)
Error in MutSig_2CV_v3_11_core (line 296)
Error in MutSig_2CV_v3_11_wrapper (line 50)
MATLAB:dimagree

Thank you!
Rupal

@julianhess
Contributor

Hi Rupal,

Running MutSig on a cohort with a large mutation overlap between samples will yield very poor results. MutSig's background model explicitly assumes that mutations arise independently across samples, which is likely not the case when there is a substantial overlap. For example, multiple serial biopsies from the same patient will invariably share common ancestral events. In that case, violating the independence assumption will generate problematic results since passenger genes containing truncal mutations will appear to be recurrently mutated across many biopsies and show up as significant.
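One way to check for this kind of overlap before running MutSig is to compare the mutation sets of every pair of samples. The following is a minimal sketch, not part of MutSig2CV: it assumes a tab-delimited MAF with the standard column names Chromosome, Start_Position, Reference_Allele, Tumor_Seq_Allele2 and Tumor_Sample_Barcode (your file may use different names), and the function names are hypothetical.

```python
import csv
from collections import defaultdict
from itertools import combinations


def read_maf(path):
    """Read a tab-delimited MAF into a list of dicts, skipping '#' comment lines."""
    with open(path, newline="") as fh:
        lines = (ln for ln in fh if not ln.startswith("#"))
        return list(csv.DictReader(lines, delimiter="\t"))


def site_key(row):
    """Identify a mutation by position and alleles (assumed MAF column names)."""
    return (row["Chromosome"], row["Start_Position"],
            row["Reference_Allele"], row["Tumor_Seq_Allele2"])


def mutation_sets(rows, sample_col="Tumor_Sample_Barcode"):
    """Group the distinct mutations by sample."""
    sets = defaultdict(set)
    for row in rows:
        sets[row[sample_col]].add(site_key(row))
    return sets


def pairwise_overlap(sets):
    """For each sample pair, the fraction of the smaller sample's mutations that are shared."""
    overlap = {}
    for a, b in combinations(sorted(sets), 2):
        shared = len(sets[a] & sets[b])
        overlap[(a, b)] = shared / min(len(sets[a]), len(sets[b]))
    return overlap


if __name__ == "__main__":
    # Toy rows standing in for read_maf("my_cohort.maf"); the path is a placeholder.
    def row(sample, pos):
        return {"Tumor_Sample_Barcode": sample, "Chromosome": "1",
                "Start_Position": str(pos), "Reference_Allele": "C",
                "Tumor_Seq_Allele2": "T"}

    rows = [row("S1", 100), row("S1", 200),
            row("S2", 100), row("S2", 200), row("S2", 300),
            row("S3", 400)]
    for pair, frac in sorted(pairwise_overlap(mutation_sets(rows)).items()):
        print(pair, round(frac, 2))
```

Pairs with a high shared fraction are candidates for being biopsies of the same patient. What counts as "high" is a judgment call for your cohort; the sketch does not set a threshold.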

Another common reason for substantial overlap between samples is germline contamination (common germline SNPs appear as recurrent somatic mutations), or poorly filtered/QC'd somatic variant calls that contain sequencing artifacts recurring across samples. Obviously, these would also yield poor MutSig results.
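Recurrent germline SNPs or artifacts tend to show up as the exact same variant in many samples. A rough sketch for listing such sites is below. It is an illustration only, not part of MutSig2CV: it assumes rows parsed from a MAF with the standard column names, and both the function name and the `min_samples` cutoff are hypothetical.

```python
from collections import defaultdict


def recurrent_sites(rows, min_samples=3, sample_col="Tumor_Sample_Barcode"):
    """Return {site: number of distinct samples} for sites seen in >= min_samples samples.

    A site is the tuple (chromosome, position, reference allele, alternate allele),
    using assumed MAF column names.
    """
    samples_by_site = defaultdict(set)
    for row in rows:
        site = (row["Chromosome"], row["Start_Position"],
                row["Reference_Allele"], row["Tumor_Seq_Allele2"])
        samples_by_site[site].add(row[sample_col])
    return {site: len(samples)
            for site, samples in samples_by_site.items()
            if len(samples) >= min_samples}


if __name__ == "__main__":
    # Toy rows: one variant shared by three samples, one private variant.
    def row(sample, pos):
        return {"Tumor_Sample_Barcode": sample, "Chromosome": "7",
                "Start_Position": str(pos), "Reference_Allele": "G",
                "Tumor_Seq_Allele2": "A"}

    rows = [row("S1", 500), row("S2", 500), row("S3", 500), row("S1", 900)]
    for site, n in recurrent_sites(rows).items():
        print(site, n)
```

Sites flagged this way are not necessarily artifacts, since genuine hotspot mutations also recur, so each one would still need to be checked against germline databases or the raw reads.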

I would carefully QC your mutation data before attempting to run with this option disabled.

—Julian

@RupalHatkar
Author

RupalHatkar commented Jul 16, 2021 via email

@liux2250

liux2250 commented Sep 9, 2022

I got the same issue as Rupal. Could you kindly provide some sample MAF data so that beginners can run the package? That would be very helpful.

Best,
Yang

@dansteiert

It seems to me that the remove_duplicate_patients option is silently ignored in the file src/MutSig_2CV_v3_11_core.m.
If you want to change this, I added an if clause as shown below and set the default for remove_duplicate_patients to true instead of 1.

    % src/MutSig_2CV_v3_11_core.m, starting around line 196
    % remove duplicate patients
    %% LINE ADDED HERE: the if clause (and its closing end)
    if P.remove_duplicate_patients
      fprintf('Scanning for duplicate patients...\n');
      X = new_find_duplicate_samples(M.mut);
      if ~isempty(X.drop)
        fprintf('Removing the following %d duplicate patients:\n',length(X.drop));
        disp(X.drop);
        M.mut = reorder_struct_exclude(M.mut,ismember(M.mut.patient,X.drop));
      end
    end

Best!
