\section{Introduction}
\emph{Functional connectivity} is a statistical description of observed \emph{multineuronal} activity patterns not reducible to the response properties of the individual cells. Functional connectivity reflects local synaptic connections, shared inputs from other regions, and endogenous network activity. Although functional connectivity is a phenomenological description without a strict mechanistic interpretation, it can be used to generate hypotheses about the anatomical architecture of the neural circuit and to test hypotheses about the processing of information at the population level.

Pearson correlations between the spiking activity of pairs of neurons are among the most familiar measures of functional connectivity \citep{Averbeck:2006, Zohary:1994, Kohn:2005, Bair:2001, Ecker:2010}. In particular, \emph{noise correlations}, \emph{i.e.}\;the correlations of trial-to-trial response variability between pairs of neurons, have a profound impact on stimulus coding \citep{Zohary:1994, Abbott:1999, Sompolinsky:2001, Nirenberg:2003, Averbeck:2006, Josic:2009, Berens:2011, Ecker:2011}. In addition, noise correlations and correlations in spontaneous activity have been hypothesized to reflect aspects of synaptic connectivity \citep{Gerstein:1964}. Interest in neural correlations has been sustained by a series of discoveries of their nontrivial relationships to various aspects of circuit organization, such as the physical distances between the neurons \citep{Smith:2008, Denman:2013}, their synaptic connectivity \citep{Ko:2011}, stimulus response similarity \citep{Bair:2001, Arieli:1995, Chiu:2002, Kenet:2003, Kohn:2005, Cohen:2008, Cohen:2009, Ecker:2010, Rothschild:2010, Ko:2011, Smith:2013b}, cell types \citep{Hofer:2011}, cortical layer specificity \citep{Hansen:2012, Smith:2013}, progressive changes in development and in learning \citep{Golshani:2009, Gu:2011, Ko:2013}, and changes due to sensory stimulation and global brain states \citep{Greenberg:2008, Goard:2009, Kohn:2009, Rothschild:2010, Ecker:2014, Renart:2010}.

Neural correlations do not come with ready or unambiguous mechanistic interpretations. They can arise from monosynaptic or polysynaptic interactions, common or correlated inputs, oscillations, top-down modulation, background network fluctuations, and other mechanisms \citep{Perkel:1967, Moore:1970, Shadlen:1998, Salinas:2001, Ostojic:2009, Rosenbaum:2011}. But multineuronal recordings do provide more information than an equivalent number of separately recorded pairs of cells. For example, the eigenvalue decomposition of the covariance matrix expresses shared correlated activity components across the population; common fluctuations of population activity may be accurately represented by only a few eigenvectors that affect all correlation coefficients. Alternatively, a correlation matrix can be specified using the \emph{partial correlations} between pairs of the recorded neurons. The partial correlation coefficient between two neurons reflects their linear association conditioned on the activity of all the other recorded cells \citep{Whittaker:1990}. Under some assumptions, partial correlations measure conditional independence between variables and may approximate causal effects between components of complex systems more directly than correlations do \citep{Whittaker:1990}. For this reason, partial correlations have been used to describe interactions between genes in functional genomics \citep{Schafer:2005, Peng:2009} and between brain regions in imaging studies \citep{Varoquaux:2012, Ryali:2012}. These opportunities have not yet been explored in neurophysiological studies, where most analyses have only considered the distributions of pairwise correlations \citep{Zohary:1994, Bair:2001, Smith:2008, Ecker:2010}.
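
Concretely, under a multivariate Gaussian model the partial correlations are determined by the inverse of the covariance matrix, the precision matrix $\Omega = \Sigma^{-1}$:
\[
\rho_{ij \mid \mathrm{rest}} = -\frac{\Omega_{ij}}{\sqrt{\Omega_{ii}\,\Omega_{jj}}},
\]
so that a zero entry of the precision matrix corresponds to conditional independence between the two neurons given the activity of all the others.
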

However, estimation of correlation matrices from large populations presents a number of numerical challenges. The amount of recorded data grows only linearly with population size, whereas the number of estimated coefficients increases quadratically. This mismatch leads to an increase in spurious correlations, overestimation of common activity (\emph{i.e.}\;overestimation of the largest eigenvalues) \citep{Ledoit:2004}, and poorly conditioned partial correlations \citep{Schafer:2005}. The \emph{sample correlation matrix} is an unbiased estimate of the true correlations, but its many free parameters make it sensitive to sampling noise. As a result, on average, the sample correlation matrix is farther from the true correlation matrix than some structured estimates.
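
As a minimal numerical illustration (a generic white-noise example, not part of our analysis), even when the true covariance is the identity, the eigenvalues of the sample covariance spread widely when the number of samples is only a few times the number of neurons:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_neurons = 200, 100              # only 2 samples per dimension
X = rng.standard_normal((n_samples, n_neurons))  # true covariance: identity

eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))
print(eigvals.min(), eigvals.max())          # roughly 0.1 and 2.9, not 1 and 1
\end{verbatim}
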

Estimation can be improved through \emph{regularization}, the technique of deliberately imposing a structure on an estimate in order to reduce its estimation error \citep{Schafer:2005, Bickel:2006}. To `impose a structure' on an estimate means to bias (`shrink') it toward a reduced representation with fewer free parameters, the \emph{target estimate}. The optimal target estimate and the optimal amount of shrinkage can be obtained from the data sample either analytically \citep{Ledoit:2003, Ledoit:2004, Schafer:2005} or by cross-validation \citep{Friedman:1989}. An estimator whose estimates are, on average, closer to the truth for a given sample size is said to be more \emph{efficient}.
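
In its simplest linear form, shrinkage replaces the sample covariance matrix with a convex combination of the sample estimate and the target $T$,
\[
\hat{C}(\lambda) = (1-\lambda)\,C_{\sf sample} + \lambda\,T, \qquad 0 \le \lambda \le 1,
\]
where the shrinkage intensity $\lambda$ controls the bias--variance trade-off and can be chosen analytically \citep{Ledoit:2004} or by cross-validation.
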
Although regularized covariance matrix estimation is commonplace in finance \citep{Ledoit:2003}, functional genomics \citep{Schafer:2005}, and brain imaging \citep{Ryali:2012}, surprisingly little work has been done to identify optimal regularization of neural correlation matrices.

Improved estimation of the correlation matrix is beneficial in itself. For example, improved estimates can be used to optimize decoding of the population activity \citep{Friedman:1989, Berens:2012}. But reduced estimation error is not the only benefit of regularization. Finding the most efficient among many regularized estimators leads to insights about the system itself: the structure of the most efficient estimator is a parsimonious representation of the regularities in the data.

The advantages of regularization increase with the size of the recorded population. With the advent of big neural data \citep{Alivisatos:2013}, optimal regularization will become increasingly relevant to any model of population activity. Because the optimal regularization scheme is specific to the system under investigation, inferring functional connectivity from large-scale neural data will entail a search for such schemes, which may combine heuristic rules with numerical techniques designed for particular types of neural circuits.

What structures of correlation matrices best describe the multineuronal activity in specific circuits and in specific brain states? More specifically, are correlations in the visual cortex during visual stimulation best explained by common fluctuations or by local interactions within the recorded microcircuit?

To address these questions, we evaluated four regularized covariance matrix estimators, each imposing a different structure on the estimate, together with the unregularized sample estimator as a baseline (a computational sketch follows the list). The estimators are designated as follows:
\begin{description}
\item[$C_{\sf sample}$] -- sample covariance matrix, the unbiased, unregularized estimator.
\item[$C_{\sf diag}$] -- linear shrinkage of covariances toward zero, \emph{i.e.}\;toward a diagonal covariance matrix.
\item[$C_{\sf factor}$] -- a low-rank approximation of the sample covariance matrix, representing inputs from unobserved shared factors (latent units).
\item[$C_{\sf sparse}$] -- sparse partial correlations, \emph{i.e.}\;a large fraction of the \emph{partial} correlations between pairs of neurons are set to zero.
\item[$C_{\sf sparse+latent}$] -- sparse partial correlations between the recorded neurons \emph{and} linear interactions with a number of latent units.
\end{description}
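
Our own implementations are tailored to our data and differ in their targets and fitting details, but the general flavor of two of these estimators can be conveyed with off-the-shelf tools. The sketch below is illustrative only (the class names are scikit-learn's, not ours): linear shrinkage toward a scaled identity is close in spirit to $C_{\sf diag}$, and an $\ell_1$-penalized precision matrix is close in spirit to $C_{\sf sparse}$.
\begin{verbatim}
import numpy as np
from sklearn.covariance import LedoitWolf, GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 30))   # placeholder for binned responses,
                                     # shape (n_samples, n_neurons)

# Linear shrinkage toward a scaled identity (C_diag-like).
C_shrunk = LedoitWolf().fit(X).covariance_

# L1-penalized precision matrix, i.e. sparse partial correlations
# (C_sparse-like); alpha sets the sparsity of the estimate.
fit = GraphicalLasso(alpha=0.05).fit(X)
P = fit.precision_
partial = -P / np.sqrt(np.outer(np.diag(P), np.diag(P)))
np.fill_diagonal(partial, 1.0)       # unit diagonal by convention
\end{verbatim}
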

First, we used simulated data to demonstrate that selecting the most efficient estimator indeed recovered the true structure of the dependencies in the data.

We then performed a cross-validated evaluation to establish which of the four regularized estimators was most efficient for representing the population activity of dense groups of neurons in mouse primary visual cortex recorded with high-speed 3D random-access two-photon imaging of calcium signals. In our data, the sample correlation coefficients were largely positive and low. We found that the best estimate of the correlation matrix was $C_{\sf sparse+latent}$. This estimator revealed a sparse network of partial correlations (`interactions') between the observed neurons; it also inferred latent units exerting linear effects on the observed neurons. We analyzed these networks of partial correlations and found the following: whereas significant noise correlations were predominantly positive, the inferred interactions had a large fraction of negative values, possibly reflecting inhibitory circuitry. Moreover, these interactions exhibited a stronger relationship to the physical distances and to the differences in preferred orientations than noise correlations did. In contrast, the inferred negative interactions were less selective.
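
The cross-validated comparison can be schematized as follows (an illustrative sketch with scikit-learn estimators standing in for ours; the loss function and data handling in our analysis differ): fit each estimator on training folds, score the held-out folds by Gaussian log-likelihood, and select the estimator with the highest cross-validated score.
\begin{verbatim}
import numpy as np
from sklearn.covariance import (EmpiricalCovariance, LedoitWolf,
                                GraphicalLasso)
from sklearn.model_selection import KFold

def cv_loglik(estimator, X, n_splits=5):
    """Mean held-out Gaussian log-likelihood of a covariance estimator."""
    scores = [estimator.fit(X[tr]).score(X[te])
              for tr, te in KFold(n_splits).split(X)]
    return float(np.mean(scores))

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 30))   # placeholder for binned responses
for est in (EmpiricalCovariance(), LedoitWolf(),
            GraphicalLasso(alpha=0.05)):
    print(type(est).__name__, cv_loglik(est, X))
\end{verbatim}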