Run workflow with sample data on local machine #59

Open
Al-Murphy opened this issue Aug 16, 2022 · 3 comments

Comments

@Al-Murphy

Hi,

I'm trying to run the workflow locally with WDL using the sample data provided; however, I run into errors (full error log attached: msg_log.txt).

I ran:

java -jar /shared/aemurphy/cromwell/cromwell-83.jar run finemapping-pipeline/wdl/finemap.wdl --inputs finemapping-pipeline/wdl/finemap_inputs.json

and got:

[2022-08-16 14:57:41,48] [info] WorkflowManagerActor: Workflow fe1a47c3-db70-46b4-82f0-ccfcf076a738 failed (during ExecutingWorkflowState): java.lang.RuntimeException: Failed to evaluate 'finemap.phenos' (reason 1 of 1): Evaluating read_lines(phenolistfile) failed: java.lang.IllegalArgumentException: Could not build the path "gs://r7_data/finemap/demopheno/conf/extended_demopheno". It may refer to a filesystem not supported by this instance of Cromwell. Supported filesystems are: HTTP, LinuxFileSystem. Failures: 
HTTP: gs://r7_data/finemap/demopheno/conf/extended_demopheno does not have an http or https scheme (IllegalArgumentException)
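The error says this Cromwell instance only supports HTTP and LinuxFileSystem paths, so any input that still points at `gs://` will fail in the same way. A minimal sketch for listing such inputs before a local run is shown here; it assumes the inputs JSON is a flat mapping of input names to values, and the key names in the example are hypothetical stand-ins rather than copied from `finemap_inputs.json`.

```python
import json


def find_gcs_inputs(inputs):
    """Return {input_name: value} for every string input that starts with gs://."""
    return {
        key: value
        for key, value in inputs.items()
        if isinstance(value, str) and value.startswith("gs://")
    }


# Hypothetical excerpt of an inputs JSON; both key names are assumed for illustration.
example_inputs = json.loads("""
{
  "finemap.phenolistfile": "gs://r7_data/finemap/demopheno/conf/extended_demopheno",
  "finemap.some_local_file": "/data/local/file.txt"
}
""")

gcs_inputs = find_gcs_inputs(example_inputs)
for name, uri in sorted(gcs_inputs.items()):
    # Each of these would need a local (or HTTP) replacement for a local Cromwell run.
    print(f"{name}\t{uri}")
```

To check a real file, the same function could be applied to `json.load(open("finemapping-pipeline/wdl/finemap_inputs.json"))`.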

I also tried downloading the Google Cloud Storage data locally before running, but I don't have access:

gsutil cp gs://r7_data/finemap/demopheno/conf/extended_demopheno .
AccessDeniedException: 403 [email protected] does not have storage.objects.list access to the Google Cloud Storage bucket.

I imagine there is a simple way around all this that I'm missing, so any help would be great!

Cheers,
Alan.

Versions:
openjdk version "11.0.16"
cromwell-83

@mkanai
Collaborator

mkanai commented Aug 24, 2022

Hi Alan, thanks for reaching out. Unfortunately, gs://r7_data is our private bucket, and we cannot grant access to users outside FinnGen.

extended_demopheno is a plain text file, containing a list of phenotypes to fine-map (e.g., T2D). Please generate yours locally.
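As a rough sketch of what generating the list locally could look like: the snippet below writes one phenotype per line (the layout WDL's `read_lines()` reads) and points an inputs mapping at the new file. The input key `finemap.phenolistfile` is an assumption inferred from the error message above, and `my_phenolist.txt` is a hypothetical file name.

```python
import json
from pathlib import Path


def write_pheno_list(phenotypes, path):
    """Write one phenotype name per line and return the file path."""
    path = Path(path)
    path.write_text("".join(f"{name}\n" for name in phenotypes))
    return path


def point_inputs_at(inputs, key, local_path):
    """Return a copy of the inputs dict with `key` set to an absolute local path."""
    updated = dict(inputs)
    updated[key] = str(Path(local_path).resolve())
    return updated


# "T2D" is the example phenotype mentioned above; the file name is hypothetical.
pheno_file = write_pheno_list(["T2D"], "my_phenolist.txt")

# The key name is an assumption based on the read_lines(phenolistfile) error.
inputs = {"finemap.phenolistfile": "gs://r7_data/finemap/demopheno/conf/extended_demopheno"}
local_inputs = point_inputs_at(inputs, "finemap.phenolistfile", pheno_file)
print(json.dumps(local_inputs, indent=2))
```

The original mapping is left untouched, so the `gs://` version of the inputs can be kept alongside the local one.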

Best,
Masa

@Al-Murphy
Author

Hey Masa,

That makes sense, thank you (and sorry for the delayed reply)! There are, however, some reference files listed that are stored in the private bucket:

  • finemap.ldstore_finemap.filter_and_summarize.snp_annot_file: gs://r7_data/annotations/R7_annotated_variants_v0.gz
  • finemap.ldstore_finemap.filter_and_summarize.snp_annot_file_tbi: gs://r7_data/annotations/R7_annotated_variants_v0.gz.tbi
  • finemap.ldstore_finemap.ldstore.sample: gs://r7_data/R7_321464_samples.sample
  • finemap.ldstore_finemap.ldstore.bgen_pattern: gs://r7_data/bgen/chrom/finngen_R7_{CHR}.bgen

Can these, or the code used to generate each, be shared? I want to make sure I follow the same procedure for this pipeline, so having the exact same reference files would be paramount.

Thanks,
Alan.

@mkanai
Collaborator

mkanai commented Aug 31, 2022

Hi Alan,

Please find the file descriptions below:

  • snp_annot_file: This is a large annotation file that we use internally for multiple purposes. In our fine-mapping pipeline, we essentially refer only to two fields, most_severe and gene_most_severe, which are based on VEP annotations. You can generate annotations of your own variants using existing databases (e.g., gnomAD) or scripts (e.g., Hail).

Example lines of the annotation file should look like:

#variant        chr     pos     gene_most_severe        most_severe
1:13668:G:A     1       13668   DDX11L1 non_coding_transcript_exon_variant
1:14506:G:A     1       14506   WASH7P  splice_region_variant
1:14717:G:A     1       14717   WASH7P  intron_variant
1:14773:C:T     1       14773   WASH7P  intron_variant
  • snp_annot_file_tbi: This is a tabix index of the above file.
  • bgen_pattern and sample: These are private genotype bgen and sample files from FinnGen. Please use your cohort's genotype files.
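For the annotation file, a minimal sketch of writing the five columns shown above might look like the following. It assumes the file is tab-separated, that the variant ID is built as `chr:pos:ref:alt`, and that rows should be sorted by chromosome and position so the file can be indexed; the `records` tuples are illustrative and reuse values from the example lines, and the consequence and gene fields would come from your own VEP-style annotation.

```python
import csv
import io

COLUMNS = ["#variant", "chr", "pos", "gene_most_severe", "most_severe"]


def chrom_sort_key(chrom):
    """Order numeric chromosomes numerically, then others (e.g. X) alphabetically."""
    return (0, int(chrom), "") if chrom.isdigit() else (1, 0, chrom)


def write_annotations(records, handle):
    """Write records as a tab-separated table sorted by chromosome and position."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    ordered = sorted(records, key=lambda r: (chrom_sort_key(r[0]), r[1]))
    for chrom, pos, ref, alt, gene, consequence in ordered:
        # Variant ID layout (chr:pos:ref:alt) is assumed from the example lines.
        writer.writerow([f"{chrom}:{pos}:{ref}:{alt}", chrom, pos, gene, consequence])


# Illustrative records: (chr, pos, ref, alt, gene_most_severe, most_severe).
records = [
    ("1", 14506, "G", "A", "WASH7P", "splice_region_variant"),
    ("1", 13668, "G", "A", "DDX11L1", "non_coding_transcript_exon_variant"),
]

buffer = io.StringIO()
write_annotations(records, buffer)
print(buffer.getvalue(), end="")
```

The plain-text output would still need BGZF compression with `bgzip` before a tabix index can be built (Python's `gzip` module does not produce BGZF). With this column layout, something like `tabix -s 2 -b 3 -e 3 annotations.tsv.gz` should work, though the exact options used for the FinnGen file are not stated in this thread.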

Hope this helps!

Best,
Masa
