Generic integration searching #9

NanoporeEnthusiast · 2024-12-23T19:16:21Z

Is your feature related to a problem?

This workflow seems nearly compatible with integration-site searching in general, apart from requiring the expected integration sites that CRISPR experiments provide. Could this, or a similar workflow, be used to search for retroviral integration sites, which may integrate more randomly?

Describe the solution you'd like

As opposed to including an expected insertion site, can the workflow search for the insertional sequence and output the flanking genomic sites and proportions of insertions?

Describe alternatives you've considered

Number of insertions and locations (chromosome, nucleotide number, Gene ID if applicable), UMI discrimination functions, whether the insertions are in coding regions or non-coding regions, etc.
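To make the requested output concrete, here is a minimal sketch of the per-site tally, assuming each read has already been assigned an integration site as a `(chrom, pos, strand)` tuple by some upstream alignment step. The function name `summarise_sites` and the output field names are hypothetical and are not part of any existing EPI2ME workflow.

```python
from collections import Counter


def summarise_sites(read_sites):
    """Count reads per integration site and report each site's share.

    read_sites: iterable of (chrom, pos, strand) tuples, one per read.
    This input format is an assumption made for illustration only.
    """
    counts = Counter(read_sites)
    total = sum(counts.values())
    return [
        {"chrom": c, "pos": p, "strand": s, "reads": n, "proportion": n / total}
        for (c, p, s), n in sorted(counts.items())
    ]


if __name__ == "__main__":
    example = [("chr1", 100, "+")] * 3 + [("chr2", 50, "-")]
    for row in summarise_sites(example):
        print(row)
```

Gene IDs, coding/non-coding status, and UMI-based deduplication would need an annotation file and UMI extraction on top of this, which the sketch does not attempt.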

Additional context

No response

nrhorner · 2024-12-31T12:58:26Z

Thanks for your question @NanoporeEnthusiast.

Currently this workflow is not able to identify random integration sites. This functionality is also not currently available in the other EPI2ME workflows as far as I know. It does seem like a good idea for a new workflow. I'll let you know if we decide to make this.

NanoporeEnthusiast · 2025-01-06T14:47:51Z

Thank you nrhorner, I hope that it is something that can be done in the future. Here is an example of how something similar has been done with R-based tools, from Ajoge et al. (https://doi.org/10.1038/s41467-022-35379-y):

Integration site library and computational analysis
Genomic DNA was processed for integration site analysis and sequenced using the Illumina MiSeq platform [36,50]. Briefly, genomic DNA was restriction enzyme digested using MseI and NarI and the 3’ LTR-host genome junctions were amplified by ligation-mediated PCR. After gel purification of the PCR products, the purified DNA samples were processed using the Nextera XT DNA Sample Preparation kit. A limited-cycle PCR reaction was performed to amplify the insert DNA, which was then sequenced using Illumina MiSeq using 2×150 bp chemistry at the London Regional Genomics Centre (Robarts Research Institute, Western University, Canada). Fastq sequencing reads were quality trimmed and unique integration sites identified using our in-house bioinformatics pipeline [36], which is called the Barr Lab Integration Site Identification Pipeline (BLISIP version 2.9) and includes the following updates: bedtools (v2.25.0), bioawk (awk version 20110810), bowtie2 (version 2.3.4.1), and restrSiteUtils (v1.2.9). HIV-1 3’ LTR-containing fastq sequences were identified and filtered by allowing up to a maximum of five mismatches with the reference NL4-3 3’ LTR sequence and if the 3’ LTR sequence had no match with any region of the human genome (GRCh37/hg19). Integration sites were determined from the sequence junction of the 3’ LTR and human genome sequences. All genomic sites in each dataset that hosted two or more sites (i.e., identical sites) were collapsed into one unique site for our analysis. Sites located in various common genomic features and non-B DNA motifs were quantified and heatmaps were generated using our in-house python program BLISIP Heatmap (BLISIPHA v1.0). Sites that could not be unambiguously mapped to a single region in the genome were excluded from the study. All non-B DNA motifs were defined according to previously established criteria [88].
Matched random control integration sites were generated by matching each experimentally determined site with 10 random sites in silico that were constructed to be the same number of bases away from the restriction site as was the experimental site [36]. Unique HIV 3’ LTRs were identified with BLISIP, aligned with MUSCLE (version 10.1.7) [89] and gap-stripped with trimAl (version 1.2) [90]. All columns with gaps in more than 40% of the population were gap-stripped. Unique LTR sequence logos were generated using WebLogo (version 3.6) [52].
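The core steps in the quoted methods (find the LTR end in each read with up to five mismatches, take the host sequence past the junction, collapse identical sites) can be sketched roughly as below. This is not BLISIP and does not reproduce its behaviour. The function names, the `min_flank` cutoff, and the simple sliding-window Hamming match are all assumptions for illustration. Mapping the flanks to the genome would still need an aligner such as bowtie2 or, for nanopore reads, minimap2.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_ltr_end(read: str, ltr_end: str, max_mismatches: int = 5) -> Optional[int]:
    """Return the read index just past the best LTR match, or None.

    Slides the LTR terminal sequence along the read and keeps the first
    window with the fewest mismatches, if that count is within the limit.
    Indels are not handled (substitutions only).
    """
    best_pos, best_mm = None, max_mismatches + 1
    for start in range(len(read) - len(ltr_end) + 1):
        mm = hamming(read[start:start + len(ltr_end)], ltr_end)
        if mm < best_mm:
            best_pos, best_mm = start + len(ltr_end), mm
    return best_pos


def host_flanks(reads: Iterable[str], ltr_end: str,
                max_mismatches: int = 5, min_flank: int = 20) -> Iterator[str]:
    """Yield the host-genome sequence downstream of the LTR junction.

    min_flank is a hypothetical cutoff so that very short flanks, which
    are unlikely to map uniquely, are skipped.
    """
    for read in reads:
        junction = find_ltr_end(read, ltr_end, max_mismatches)
        if junction is not None and len(read) - junction >= min_flank:
            yield read[junction:]


Site = Tuple[str, int, str]  # (chrom, pos, strand) as reported by an aligner


def collapse_sites(sites: Iterable[Site]) -> List[Site]:
    """Collapse identical sites into one unique site each, sorted."""
    return sorted(set(sites))
```

A usage example with a placeholder sequence (not a real LTR):

```python
ltr = "GATTACACAT"
reads = ["AA" + ltr + "CCGGTTAACCGGTTAACCGGTT", "C" * 24]
print(list(host_flanks(reads, ltr)))
```

Long nanopore reads carry indel errors, so a real workflow would more likely align the insert sequence to the reads than rely on a substitution-only match like this one.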

nrhorner · 2025-01-08T08:50:30Z

Thanks again @NanoporeEnthusiast, I will take a look at this.

nrhorner added the "question" label (Further information is requested) on Dec 31, 2024
