Reproducible code for the results shown in our manuscript Multiscale Comparative Connectomics (MCC).
Vivek Gopalakrishnan, Jaewon Chung, Eric Bridgeford, Benjamin D. Pedigo, Jesús Arroyo, Lucy Upchurch, G. Allan Johnson, Nian Wang, Youngser Park, Carey E. Priebe, and Joshua T. Vogelstein. “Multiscale Comparative Connectomics”. arXiv:2011.14990 (Nov. 2020).
Click any link for individual instructions on how to generate that specific figure.
Alternatively, execute the file code/run to generate all figures at once.
MCC uses both Python 3.8 and R 3.6.1.
The analyses above depend on the following packages:
# Conda Python packages
jupyter==1.0.0
rpy2==3.3.6
statsmodels==0.12.1
# Pip install the latest version of graspologic
graspologic @ git+https://github.com/microsoft/graspologic@dev
# Conda R packages
r-base==3.6.1
bioconductor-complexheatmap==2.2.0
r-cairo==1.5_10
r-circlize==0.4.12
r-data.table==1.12.2
r-essentials==3.6.0
r-future==1.21.0
r-future.apply==1.7.0
r-ggplot2==3.1.1
r-globaloptions==0.1.2
r-igraph==1.2.4.1
r-mltools==0.3.5
r-tidyverse==1.2.1
# CRAN R packages
cdcsis==2.0.3
These analyses have been tested on macOS x64 and Linux x64.
We created a dedicated Miniconda environment with these packages (setup should take about 5 minutes):
conda env create --file environment.yml --name mcc
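Once the environment is active, a quick check that the pinned Python packages resolved to the expected versions can save debugging time later. The following is a minimal sketch, not part of the repository: the helper name `find_mismatches` is hypothetical, and only the three Python pins listed above are checked (the R packages and graspologic are not covered).

```python
from importlib.metadata import PackageNotFoundError, version

# Python pins copied from the dependency list above.
PYTHON_PINS = {"jupyter": "1.0.0", "rpy2": "3.3.6", "statsmodels": "0.12.1"}


def find_mismatches(pins, lookup=version):
    """Return {package: (expected, found)} for every unsatisfied pin.

    `found` is None when the package is not installed. `lookup` is
    injectable so the function can be exercised without the real packages.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Prints one line per mismatch; prints nothing if all pins are satisfied.
    for name, (expected, found) in find_mismatches(PYTHON_PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Run it with the `mcc` environment activated; no output means the three pins match.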
To get rpy2 running on an M1 Mac, it's currently necessary to install it in API mode: RPY2_CFFI_MODE=API pip install rpy2. Also, it might be easier to install the R packages directly through the R scripting interface instead of through conda, since many compatibility conflicts haven't been resolved for the new arm64 versions.
Scripts to reproduce the figures in MCC are organized below.
- Run code/1_statistical_framework_graphs.ipynb (expected runtime: 5 seconds)
  - This script uses igraph to generate the sample connectomes and graph models seen above
- Run code/2_plot_adjacency_matrices.ipynb (expected runtime: 5 seconds)
  - This script uses ComplexHeatmap to generate average connectomes for each mouse strain
- Run code/3_cc_emedding.ipynb (expected runtime: 5 seconds)
  - This script uses graspologic to embed the corpus callosum brain region of every mouse in a low-dimensional space
- Run code/4a_identifying_signal_components.ipynb (expected runtime: 2 minutes)
  - This script uses graspologic and various k-sample hypothesis testing packages to identify the strongest signal edges, vertices, and communities
- Run code/4b_format_signal_components_tables.ipynb (expected runtime: 30 seconds)
  - This script uses pandas to format the results generated for Figure 4 into publication-ready tables
  - Tables are found in the Supplement of the MCC manuscript
- Run code/5_whole_brain_emedding.ipynb (expected runtime: 5 seconds)
  - This script uses classical multidimensional scaling (cMDS) to embed the results of the omnibus embedding in a low-dimensional space
- Run code/6_conditional_independence_anatomy.ipynb (expected runtime: 5 hours on a 48-core machine, probably much longer on a normal laptop)
  - This script uses cdcsis to run a series of conditional independence tests
  - The purpose of these tests is to determine whether our methods recover information about network topology not encoded in neuroanatomy
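For readers unfamiliar with the classical multidimensional scaling (cMDS) step mentioned for code/5_whole_brain_emedding.ipynb, the textbook procedure can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the notebook's code: the function name `classical_mds` is hypothetical, and it assumes the input is a symmetric matrix of pairwise distances (for example, distances between per-mouse embeddings), which may differ from how the notebook prepares its input.

```python
import numpy as np


def classical_mds(distances, n_components=2):
    """Embed points in `n_components` dimensions from pairwise distances."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip small negative eigenvalues caused by noise or non-Euclidean input.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy usage: recover a 2-D configuration (up to rotation, reflection,
# and translation) from its own pairwise distances.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))
pairwise = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
embedding = classical_mds(pairwise, n_components=2)
```

When the input distances are Euclidean, the embedding reproduces them exactly once enough components are kept; otherwise it gives the best low-rank approximation of the centered Gram matrix.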