Authors: Jesse Zaneveld1, Nia Prabhu*1, Aziz Bajouri*1,2, Ayomikun Akinrinade*1,3, Dr. Mushtaq Bilal*4
* Chapter and Vignette authors contributed equally and are listed in chronological order of first contribution.
1 Division of Biological Sciences, School of STEM, University of Washington, Bothell, Washington, USA
2 Division of Computer and Software Systems, School of STEM, University of Washington, Bothell, Washington, USA
3 Division of Health Studies, School of Nursing and Health Studies, University of Washington, Bothell, Washington, USA
4
Full Spectrum Bioinformatics is a free online text designed to introduce key topics in Bioinformatics using the Python programming language. The text is written in interactive Jupyter Notebooks, which allow you to try out and modify example code and analyses.
In addition to explanations of concepts, Full Spectrum Bioinformatics also includes Bioinformatics Vignettes written by readers of the text. Each vignette focuses on a particular core concept and shows how readers have applied that concept to their research projects.
If you are already familiar with GitHub and Jupyter Notebooks, you can download the entire project and run it interactively, or click the 'Open in Colab' links to open interactive versions of each section in Google Colab (you will need to 'Save as' your own copy in order to change code).
If you would just like to read a chapter, you can also view a static version of each section using the nbviewer links. nbviewer stands for 'notebook viewer', so this is just a way to view chapters with code in them without actually running the code. This will generally be the best way to view the chapters non-interactively.
Finally, you can also use the direct GitHub links (the link that's the name of each chapter) to view any chapter. This shows the chapter on GitHub. It usually works well, but you may sometimes get a GitHub error message. Reloading the page or using the nbviewer link usually avoids this issue.
The text is currently in prototype status. Chapters with content you can preview are linked below:
The Many Paths to Bioinformatics
An Absurdly Brief Introduction to Biology
An Absurdly Brief Introduction to Computer Science
An Absurdly Brief Introduction to Statistics
Exercise. Little Brother is Missing: practice navigating on the command line
Exercise. Duck vs. Yeast: using BLAST+ on the command line to detect sequence similarity
Warm-up Exercise: Spot the Difference
A Tour of Python Data Types (ints, floats, boolean values, strings, lists, dicts, & sets)
A Tour of Python Syntax (functions, conditions, iteration, classes)
A Quick Win: using Python to run Statistical Tests and Make Simple Graphs
Another Quick Win: Loading tabular data with Pandas DataFrames
Using Literature Surveys to Ask Good Questions and Propose Testable Hypotheses
Write a Literature Synthesis...and get your Introduction for free!
Zotero for Beginners (a.k.a. How to Avoid Repeatedly Reformatting 96 Citations by Hand)
An introduction to Biological Sequences
Representing and Manipulating Biological Sequences as Python Strings
Analyzing Biological Sequences with For Loops and If Statements
Reading and writing FASTA files using Python
Vignette (Aziz Bajouri): Using set objects to find circular RNAs involved in multiple diseases
Exercise: Error Bingo
Vignette (Nia Prabhu): Using For Loops and Dictionaries to Compare Nucleotide Composition in Pandemic and Non-Pandemic Causing Influenza Strains
Capstone: testing for depletion of CG dinucleotides in the human genome
Working with Tabular 'Omic data in Python using Pandas
Joining and Filtering Pandas DataFrames
Pandas Case Study: Analyzing tabular sleep data from the NHANES health survey
Analyzing Microbiome Alpha Diversity in Python
Analyzing Microbiome Beta Diversity in Python
Simulating the Effect of Sequencing Depth on Diversity Estimates
Reflecting on your Project so Far
Project Organization Strategies for Collaborative and Reproducible Research
Test Code: a powerful strategy for ensuring your results aren't lies.
Graphs as a Visual Language
Representing Distribution
Homology and Alignment
Global Alignment with the Needleman-Wunsch algorithm
Local Alignment with the Smith-Waterman algorithm
BLAST and the k-mer trick
Tree thinking
Representing Phylogenetic Trees with Python Classes
Generating Trees Using Birth-Death Models
Working with Traits on Trees
Maximum Parsimony Ancestral State Reconstruction
Phylogenetic Comparative Methods
Trait prediction
Simulating the Population Genetics of Natural Selection and Genetic Drift
Simulating Networks
Simulating the Evolution of Social Behavior
Linear Models - a Statistical Swiss Army Knife
Monte Carlo simulation and the Fundamental Unity of Statistical Hypothesis Tests
Statistical Distributions and Parametric Tests
Monte Carlo simulation of Effect Size, Sample Size, and Significance
Dealing with Multiple Comparisons
Exercise: Revising your writing about statistical results
An Introduction to Maximum Likelihood optimization
The Best Model of A Cat is a Cat - model complexity, overfitting, and the AIC
An Introduction to Bayesian Approaches
Unsupervised Classification: of ordination, clustering and fishtanks
Supervised Classification: from lines to trees to forests.
Vignette (Ayomikun Akinrinade): Using K-Nearest Neighbors and Binary Decision Tree Algorithms to Predict Enzyme Function from Protein Sequences
From Data to Conclusion: building a research manuscript brick by brick
Resistance is Futile: becoming a language Borg
Exercise: generating a targeted title using templating
The Inverted Pyramid: optimizing your text from a reader's perspective
Fighting for an Inclusive Workplace
Best practices for Success: Happiness Matters, Radical Collaboration, and Networking
Open-source Science as Shield and Sword
Applying for Grants
Data Sources for Bioinformatics Projects
Timesaving Starter Code
Template Script with Interface and Test Code
IUPAC codes in Python
Standard Translation Tables in Python
This project is being developed with support from NSF Integrative and Organismal Systems award .
You can submit feedback about completed chapters at the following link