Commit

Update projects page
anishmss committed Nov 20, 2024
1 parent 1c3ca09 commit 7f417a8
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions _data/projs.yml
@@ -10,17 +10,17 @@

- title: Exploring applications of biological language models
image: molecule.png
-description: Representing biological sequences as numerical vectors is the first step in building machine learning tools for any bioinformatics task. However, since sequences can be characterized using hundreds of properties, it is difficult to select which features are most likely to be informative. This challenge has motivated the development of biological language models. By learning patterns from large-scale protein databases, these models are able to automatically transform sequences into vectors that have been shown to capture aspects of the "grammar of life." Our lab has been working on applying these biological language models to important downstream tasks such as predicting phage-host interaction.
+description: Representing biological sequences as numerical vectors is the first step in building machine learning tools for any bioinformatics task. However, since sequences can be characterized using hundreds of properties, it is difficult to select which features are most likely to be informative. This challenge has motivated the development of biological language models. By learning patterns from large-scale protein databases, these models are able to automatically transform sequences into vectors that have been shown to capture aspects of the "grammar of life." Our lab has been working on applying these biological language models to important downstream tasks such as predicting phage-host interaction. Check out this [paper](https://doi.org/10.1371/journal.pone.0289030){:target="\_blank"}, this [preprint](https://www.biorxiv.org/content/10.1101/2024.08.24.609479v1){:target="\_blank"}, and some code [here](https://github.com/bioinfodlsu/phage-host-prediction){:target="\_blank"} and [here](https://github.com/bioinfodlsu/PHIStruct){:target="\_blank"}.
# Attribution: https://cdn-icons-png.flaticon.com/512/11359/11359324.png

- title: Computational interpretation of genomic regions implicated by genome-wide association studies in rice
image: rice.png
-description: Rice feeds half of humanity. The production of rice needs to match human population growth while being environmentally sustainable and climate change-resilient. These challenges have motivated the identification of genetic factors behind agronomically important traits, often using genome-scale techniques such as QTL analysis or genome-wide association studies (GWAS). These studies report regions in the genome that are statistically significant, but they remain short of explaining the biological significance. Our lab has been working on software solutions to gain biological insights on statistically significant genomic sites.
+description: Rice feeds half of humanity. The production of rice needs to match human population growth while being environmentally sustainable and climate change-resilient. These challenges have motivated the identification of genetic factors behind agronomically important traits, often using genome-scale techniques such as QTL analysis or genome-wide association studies (GWAS). These studies report regions in the genome that are statistically significant, but they remain short of explaining the biological significance. Our lab has been working on software solutions to gain biological insights on statistically significant genomic sites. Check out this [paper](https://doi.org/10.1093/gigascience/giae013){:target="\_blank"}, this [web app](www.ricepilaf.bioinfodlsu.com){:target="\_blank"} and its [source code](https://github.com/bioinfodlsu/rice-pilaf){:target="\blank"}.
# Attribution: https://cdn-icons-png.flaticon.com/512/898/898133.png

- title: Differential gene expression analysis for non-model organisms
image: gene_expression.png
-description: RNA-seq is being increasingly adopted for gene expression studies in a panoply of non-model organisms, with applications spanning the fields of agriculture, aquaculture, ecology, and environment. For organisms that lack a well-annotated reference genome or transcriptome, a conventional RNA-seq data analysis workflow requires constructing a de-novo transcriptome assembly and annotating it against a high-confidence protein database. We propose a shortcut that avoids the computationally demanding assembly process and instead obtains counts for differential expression analysis by directly aligning RNA-seq reads to the high-confidence proteome that would have been otherwise used for annotation.
+description: RNA-seq is being increasingly adopted for gene expression studies in a panoply of non-model organisms, with applications spanning the fields of agriculture, aquaculture, ecology, and environment. For organisms that lack a well-annotated reference genome or transcriptome, a conventional RNA-seq data analysis workflow requires constructing a de-novo transcriptome assembly and annotating it against a high-confidence protein database. We propose a shortcut that avoids the computationally demanding assembly process and instead obtains counts for differential expression analysis by directly aligning RNA-seq reads to the high-confidence proteome that would have been otherwise used for annotation. Check out these papers [here](https://doi.org/10.1186/s12864-021-07891-w){:target="\blank"}, [here](https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-021-08278-7){:target="\blank"}, and [here](https://doi.org/10.1186/s12859-024-05924-1){:target="\blank"}, and these source codes [here]() and [here]().
# Attribution: https://cdn-icons-png.flaticon.com/512/1186/1186539.png

- title: Bioinformatics for HIV surveillance