Generating master files
Use the generate_masterfile_job.sh script to run demultiplexing and data processing steps for ribosome profiling of individual samples.
Change the parameter values between the EDIT BELOW HERE and STOP EDITING lines to match the requirements of your job. Do not change a parameter's name; change only the value that follows it. Make sure every value is enclosed in quotes.
The script creates one output: the master file for the selected comparison.
Parameter names to edit:
CONTROL_MANIFEST and TREATMENT_MANIFEST: The manifest files listing the primary outputs of the 'Riboseq Sample Processing' tool. Each manifest should contain the following file types (third column):
Gene_counts
Gene_posRPM
Gene_negRPM
The two manifests can be the same file if you are comparing two samples that were processed in the same run.
CONTROL_SAMPLE and TREATMENT_SAMPLE: The names of the control and treatment samples. These must match the sample names in the second column of the manifests.
Fold changes in the output master file are calculated as treatment/control.
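A quick check of a manifest before submitting the job can catch a missing file type or a misspelled sample name. The sketch below is not part of generate_masterfile_job.sh. It assumes the manifest is tab-delimited, with the sample name in column 2 and the file type in column 3 as described above. The function name missing_file_types, the example paths, and the sample names are hypothetical.

```python
import csv
import io

# File types the master file step expects for each sample (from the list above).
REQUIRED_TYPES = {"Gene_counts", "Gene_posRPM", "Gene_negRPM"}


def missing_file_types(manifest_lines, sample):
    """Return the required file types that `sample` lacks in the manifest.

    Assumes tab-delimited rows: column 2 = sample name, column 3 = file type.
    """
    found = set()
    for row in csv.reader(manifest_lines, delimiter="\t"):
        if len(row) >= 3 and row[1] == sample:
            found.add(row[2])
    return REQUIRED_TYPES - found


# Illustrative manifest content; the first column and sample names are made up.
example = io.StringIO(
    "path/a.txt\tWT\tGene_counts\n"
    "path/b.txt\tWT\tGene_posRPM\n"
    "path/c.txt\tWT\tGene_negRPM\n"
    "path/d.txt\tKO\tGene_counts\n"
)
rows = list(example)
print(missing_file_types(rows, "WT"))          # nothing missing for WT
print(sorted(missing_file_types(rows, "KO")))  # KO lacks the two RPM files
```

An empty result for both CONTROL_SAMPLE and TREATMENT_SAMPLE suggests the manifests contain what the script needs. A sample name that appears nowhere in the manifest shows up as missing all three types.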
REFERENCE: Reference genome in FASTA format.
ANNOTATION: Gene annotations for your genome. This is a 5-column, tab-delimited file, formatted as:
[ID] [start] [end] [strand] [name/description]
For example:
ACT41903.1;thrL 190 255 + thr operon leader peptide
ACT41904.1;thrA 336 2798 + Bifunctional aspartokinase/homoserine dehydrogenase 1
ACT41905.1;thrB 2800 3732 + homoserine kinase
ACT41906.1;thrC 3733 5019 + L-threonine synthase
ACT41907.1;yaaX 5232 5528 + DUF2502 family putative periplasmic protein
ACT41908.1;yaaA 5681 6457 - peroxide resistance protein%2C lowers intracellular iron
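Because a malformed annotation row is easy to miss by eye, it can help to validate the file before running the job. The following is a minimal sketch, assuming exactly five tab-separated fields per row and a strand of either + or -, as in the example rows above. The Gene record and parse_annotation_line are hypothetical names, not part of the script.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    gene_id: str
    start: int
    end: int
    strand: str
    name: str


def parse_annotation_line(line):
    """Parse one tab-delimited annotation row into a Gene record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-delimited columns, got {len(fields)}")
    gene_id, start, end, strand, name = fields
    # Assumption: only '+' and '-' are valid strands, as in the example rows.
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    # int() raises ValueError if a coordinate is not a whole number.
    return Gene(gene_id, int(start), int(end), strand, name)


gene = parse_annotation_line(
    "ACT41903.1;thrL\t190\t255\t+\tthr operon leader peptide\n"
)
print(gene.gene_id, gene.start, gene.end, gene.strand)
```

Running this over every row of the annotation file flags rows where spaces were used instead of tabs, since those rows will not split into five fields.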
OUTPUT_NAME: Name for output file.
GENE_THRESHOLD: Minimum coverage threshold to include a gene in the master file, across both samples.
CODON_THRESHOLD_SAMPLE and CODON_THRESHOLD_TREATMENT: Minimum codon coverage threshold (counts per codon) to include a codon in the master file, for control or treatment samples respectively.
RPM_SEGMENT: Set to segment or all to choose how codon normalization is performed:
- segment: Segmentation of the gene, normalizing separately within the first 10 codons, the last 10 codons, and the rest of the gene (default, standard behavior).
- all: Normalization over the entire gene.
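To make the segment option concrete, the sketch below splits a list of per-codon values into the three regions described above. It illustrates only the partitioning; the normalization the script applies within each region is not reproduced here. The function name split_into_segments and the handling of genes shorter than 20 codons are assumptions.

```python
def split_into_segments(codon_counts, edge=10):
    """Split per-codon values into (first `edge`, middle, last `edge`).

    Assumption: for genes shorter than 2 * edge codons, the first segment
    takes up to `edge` codons and the last segment takes whatever remains,
    so no codon is assigned to two segments.
    """
    n = len(codon_counts)
    head_end = min(edge, n)
    tail_start = max(n - edge, head_end)
    head = codon_counts[:head_end]
    middle = codon_counts[head_end:tail_start]
    tail = codon_counts[tail_start:]
    return head, middle, tail


counts = list(range(25))  # stand-in values for a 25-codon gene
head, middle, tail = split_into_segments(counts)
print(len(head), len(middle), len(tail))  # 10 5 10
```

With RPM_SEGMENT set to all, the equivalent step would treat the whole list as a single region.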