Analysis of RNA-Seq data of E. coli cells expressing targeting/nontargeting Type VI CRISPR-Cas system
Each top-level directory contains the data and scripts for one experiment:
- LshCas13a_C3000 - RNA-Seq of total RNA extracted from E. coli C3000 cells carrying activated/nonactivated LshCas13a enzyme;
- LshCas13a_d10LVM - RNA-Seq of total RNA extracted from E. coli $\Delta$10LVM cells carrying activated/nonactivated LshCas13a enzyme;
- LshCas13a_in_vitro_total_RNA - RNA-Seq of total RNA extracted from E. coli C3000 cells after in vitro incubation with activated/nonactivated LshCas13a enzyme;
- LshCas13a_in_vitro_tRNAs - RNA-Seq of total tRNA sample after in vitro incubation with activated/nonactivated LshCas13a enzyme.
Each directory contains the following subdirectories:
- Data - directory containing the raw read data;
- Annotations - directory containing GFF tables with genomic features;
- Alignments - directory containing alignments produced by the read_mapping.sh script;
- Reference_sequences - directory containing FASTA files of the sequences used for read mapping;
- Scripts - directory containing scripts for data processing;
- Results - directory containing the results of data processing.
The "Results" directory contains the following subdirectories:
- Tables
    - Ends_counts - contains files with coordinates of 5' ends of fragments;
    - Fragment_coords - contains files with coordinates of fragments (SeqID - Fragment_start - Fragment_end - Strand);
    - Merged_ends_counts - contains tables with 5' end counts derived from the samples designated for comparison;
    - Read_pairs_TABs - contains tables with coordinates of read pairs.
- WIG_files - contains WIG files with 5' end coverage.
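To illustrate how these outputs relate to each other, here is a minimal Python sketch that derives 5' end counts from fragment coordinates and renders them as a variableStep WIG track. It is not the code of the repository scripts. The function names `count_five_prime_ends` and `to_wig`, the tuple layout, the example values, and the convention that Fragment_start <= Fragment_end in 1-based forward-strand coordinates are all assumptions made for the example; the actual file layouts may differ.

```python
from collections import Counter


def count_five_prime_ends(fragments):
    """Count fragment 5' ends per (SeqID, Strand, Position).

    Each fragment is a (SeqID, Fragment_start, Fragment_end, Strand) tuple.
    Assumption: Fragment_start <= Fragment_end in forward-strand coordinates,
    so the 5' end is Fragment_start for '+' and Fragment_end for '-'.
    """
    counts = Counter()
    for seq_id, start, end, strand in fragments:
        five_prime = start if strand == "+" else end
        counts[(seq_id, strand, five_prime)] += 1
    return counts


def to_wig(counts, strand):
    """Render the 5' end counts of one strand as variableStep WIG text."""
    by_seq = {}
    for (seq_id, s, position), n in counts.items():
        if s == strand:
            by_seq.setdefault(seq_id, []).append((position, n))
    lines = []
    for seq_id in sorted(by_seq):
        lines.append(f"variableStep chrom={seq_id}")
        # WIG data lines are "position value", sorted by position.
        lines.extend(f"{position} {n}" for position, n in sorted(by_seq[seq_id]))
    return "\n".join(lines) + "\n"


# Hypothetical fragments, not taken from the experiments.
fragments = [
    ("chr", 10, 60, "+"),
    ("chr", 10, 75, "+"),
    ("chr", 25, 90, "-"),
    ("chr", 40, 90, "-"),
]
counts = count_five_prime_ends(fragments)
print(to_wig(counts, "+"))
print(to_wig(counts, "-"))
```

Keeping one track per strand mirrors the strand column of the fragment tables; whether the real WIG files are split this way is an assumption.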
The "Scripts" directory contains a set of scripts for the data processing. There is a "basic" set of scripts which is common for all experiments:
- raw_data_processing.sh - performs read quality assessment, removes adapters and discards low-quality reads.
    - Requirements:
        - fastqc, trimmomatic
- read_mapping.sh - maps paired-end reads to the reference sequences. Since the SAM alignment files are quite large, the output data is compressed using gzip.
    - Requirements:
        - bowtie2
- return_fragment_coords_table.py - takes alignment files (in gzipped SAM format), generates all tables stored in the "Results/Tables" directory (except "Merged_ends_counts") and produces WIG files with 5' end coverage;
    - Requirements:
        - python3 with gzip and pandas modules
- merge_ends_count_tables.py - combines 5' end count tables from different samples into one table
    - Requirements:
        - python3 with pandas, gzip, re and functools modules
- TCS_calling.R - performs a statistical test, producing a table of positions with their logFC and p-values.
    - Requirements:
        - R with dplyr, data.table, tidyr and edgeR packages
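The merging step can be sketched as follows. This is only an illustration of the idea of combining per-sample 5' end counts into one table, not the code of merge_ends_count_tables.py: it uses plain dictionaries from the standard library instead of pandas so that it stays self-contained. The function name `merge_ends_counts`, the sample names, the column order and the choice to fill positions missing from a sample with 0 are assumptions.

```python
def merge_ends_counts(tables):
    """Combine per-sample 5' end counts into one table.

    `tables` maps a sample name to {(SeqID, Strand, Position): count}.
    Returns (header, rows) with one count column per sample.
    Assumption: a position absent from a sample gets a count of 0.
    """
    samples = list(tables)
    # Union of all positions observed in at least one sample.
    positions = sorted(set().union(*tables.values()))
    header = ["SeqID", "Strand", "Position", *samples]
    rows = [
        [*key, *(tables[name].get(key, 0) for name in samples)]
        for key in positions
    ]
    return header, rows


# Hypothetical counts for two samples designated for comparison.
tables = {
    "targeting": {("chr", "+", 10): 5, ("chr", "+", 42): 1},
    "nontargeting": {("chr", "+", 10): 3, ("chr", "-", 90): 2},
}
header, rows = merge_ends_counts(tables)
print("\t".join(header))
for row in rows:
    print("\t".join(str(value) for value in row))
```

A merged table of this shape, with one row per position and one count column per sample, is the kind of input a count-based test such as the one in TCS_calling.R would typically expect, although the exact format used there is not specified here.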