Using other reference genome builds #62

Open
mhSepehri opened this issue Sep 13, 2024 · 1 comment

Comments

@mhSepehri

Is there any way to use reference genomes of organisms other than hg19, hg38, mm9, and mm10?

I am trying to use facets for dog samples (canFam3). The first step, generating the pileup file, works fine and creates output.csv.gz with all 38 autosomes plus chromosome X, but the final results are not produced correctly.

@dariober
Owner

dariober commented Sep 13, 2024

Is there any way to use reference genomes of organisms other than hg19, hg38, mm9, and mm10?

Hi, I'm afraid this is not possible at the moment. Implementing it has been on my todo list for a while, and it shouldn't be too difficult. facets (the R package) does support it, though.


Note to myself - this is how it could be implemented:

  • --gbuild/-g option should also accept a path to a bed file where the 4th column is the percentage GC in 1kb windows (users can prepare this file with bedtools). Make a list (gcData) from this file whose items are per-chromosome vectors of %GC (4th column). This is the format used in the pctGCdata package.

  • If gbuild is a bed file, function make_header can get the chrom sizes from gcData.

  • In the preProcSample function call, if gbuild is a bed file, set gbuild="udef", ugcpct=gcData. Use ugcpct=NULL otherwise.

  • Function reset_chroms can probably do nothing if gbuild is a bed file.
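
As a rough illustration of the first two points, here is a minimal Python sketch (not the tool's R code) of turning such a bed file into a gcData-like mapping and deriving chromosome sizes from it. The function name read_gc_bed is hypothetical, the sizes are assumed to be the largest window end seen per chromosome, and the GC values are kept exactly as given in the 4th column.

```python
def read_gc_bed(lines):
    """Group the 4th bed column (GC per window) by chromosome.

    Hypothetical helper: returns (gc_data, chrom_sizes), where gc_data maps
    each chromosome to its list of GC values in file order and chrom_sizes
    maps each chromosome to the largest window end seen (an assumption).
    """
    gc_data = {}
    chrom_sizes = {}
    for raw in lines:
        line = raw.rstrip("\n")
        # Skip blank lines and the usual bed header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom = fields[0]
        end = int(fields[2])
        gc = float(fields[3])
        gc_data.setdefault(chrom, []).append(gc)
        chrom_sizes[chrom] = max(chrom_sizes.get(chrom, 0), end)
    return gc_data, chrom_sizes


# Toy input standing in for a real bed file (values are made up).
bed_lines = [
    "chr1\t0\t1000\t41.5\n",
    "chr1\t1000\t1800\t39.0\n",
    "chrX\t0\t500\t50.0\n",
]
gc_data, chrom_sizes = read_gc_bed(bed_lines)
```

For a file on disk, the same function can be called with an open file handle in place of the list.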

Alternatively:

  • If --gbuild is a preset string, rename chromosomes as appropriate and assign the pctGCdata list to object gcData. If gbuild is a bed file, read it, make it a list of chroms and assign it to gcData.

  • Proceed as if gbuild is always a custom genome and use gcData instead.
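
The alternative can be sketched the same way: resolve gcData once, then always take the custom-genome path. This is a Python illustration under assumptions, not the actual implementation. resolve_gc_data and preproc_arguments are hypothetical names, and load_preset stands in for whatever would look up the pctGCdata list and rename chromosomes for a preset build.

```python
PRESET_BUILDS = ("hg19", "hg38", "mm9", "mm10")


def read_gc_bed(path):
    """Read a bed file into {chrom: [GC values from the 4th column]}."""
    gc_data = {}
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Ignore comments and lines without a 4th column.
            if line.startswith("#") or len(fields) < 4:
                continue
            gc_data.setdefault(fields[0], []).append(float(fields[3]))
    return gc_data


def resolve_gc_data(gbuild, load_preset, read_bed=read_gc_bed):
    """Return gcData for either a preset build name or a bed file path."""
    if gbuild in PRESET_BUILDS:
        # Hypothetical: preset lookup plus chromosome renaming happens here.
        return load_preset(gbuild)
    return read_bed(gbuild)


def preproc_arguments(gc_data):
    """Always treat the genome as user-defined, as the second bullet suggests."""
    return {"gbuild": "udef", "ugcpct": gc_data}
```

The appeal of this variant is that downstream code has a single path to maintain: it only ever sees gcData, whether it came from a preset or from a user-supplied file.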
