Commit

Revisions to paper
ghoshm committed Jan 22, 2025
1 parent c35b9b1 commit 8d53bd5
Showing 10 changed files with 117 additions and 944 deletions.
2 changes: 1 addition & 1 deletion paper/paper.md
@@ -113,7 +113,7 @@ downloads:


+++ {"part": "abstract"}
Neuroscientists are increasingly initiating large-scale collaborations which bring together tens to hundreds of researchers, and make their data, methods and results publicly available. This approach not only allows for larger scale problems to be tackled, but also enables contributions from a wider community including more diverse perspectives. Inspired by projects in pure mathematics, we set out to test the feasibility of making large-scale collaborative neuroscience even more inclusive by running a grassroots, massively collaborative project in computational neuroscience. The key difference with previous approaches in neuroscience was that anyone was welcome to join at any time. We launched a public Git repository, with code for training spiking neural networks to solve a sound localisation task via surrogate gradient descent. We then invited anyone, anywhere to use this code as a springboard for exploring questions of interest to them, and encouraged participants to share their work both asynchronously through Git and synchronously at monthly online workshops. The aim was to use the diversity of perspectives to make new discoveries that a single team would have been unlikely to find. At a scientific level, our work investigated how a range of biologically-relevant parameters, from time delays to membrane time constants and levels of inhibition, could impact sound localisation in networks of spiking units. At a more macro-level, our project brought together 31 researchers from multiple countries, provided hands-on research experience to early career participants, and opportunities for supervision and teaching to later career participants. Although the scientific results were not groundbreaking in this pilot project, looking ahead, our project provides a glimpse of what open, collaborative science without borders could look like, and provides a necessary, tentative step towards it.
Neuroscientists are increasingly initiating large-scale collaborations which bring together tens to hundreds of researchers. At this scale, such projects can tackle large-scale challenges and engage participants with diverse backgrounds and perspectives. Inspired by projects in pure mathematics, we set out to test the feasibility of widening access to such projects even further, by running a massively collaborative project in computational neuroscience. The key differences from prior neuroscientific efforts were that our entire project (code, results, writing) was public from day one, and that anyone could participate. To achieve this, we launched a public Git repository, with code for training spiking neural networks to solve a sound localisation task via surrogate gradient descent. We then invited anyone, anywhere to use this code as a springboard for exploring questions of interest to them, and encouraged participants to share their work both asynchronously through Git and synchronously at monthly online workshops. Our aim was to use the diversity of perspectives to make discoveries that a single team would have been unlikely to find. At a scientific level, our work investigated how a range of biologically-relevant parameters, from time delays to membrane time constants and levels of inhibition, could impact sound localisation in networks of spiking units. At a more macro-level, our project brought together 31 researchers from multiple countries, provided hands-on research experience to early career participants, and opportunities for supervision and teaching to later career participants. While our scientific results were not groundbreaking, our project demonstrates the potential of massively collaborative projects to transform neuroscience.
+++

# Introduction
Binary file not shown.
4 changes: 2 additions & 2 deletions paper/sections/basicmodel/basicmodel.md
@@ -62,7 +62,7 @@ The network is trained by defining a loss function that increases the further aw

The loss function we use is composed of two terms. The first is the cross entropy or negative log likelihood loss that measures how far our predicted probability distribution $x_k$ is from the true probability distribution (which has value 1 for the correct $k$ and 0 for all other $k$). The second term, which is not used in all the notebooks in this project, is an optional regularisation term. In [](../research/time-constant-solutions.ipynb) we regularise based on the firing rates of the hidden layer neurons. We compute the firing rate for each hidden neuron $r_m$. If this is below a minimum threshold $r_-$ it contributes nothing to the loss; otherwise we compute $L_m=((r_m-r_-)/(r_+-r_-))^2$ for each neuron, for a constant $r_+$ explained below. We then average over neurons and multiply by a constant: $L=c\sum_m L_m/N_h$. The constant $r_+$ is the maximum firing rate we would like to see in the network, so that $L_m=1$ if $r_m=r_+$. The constant $c$ is chosen to be the expected initial cross-entropy loss of the network before training. This ensures that a firing rate of $r_m=r_+$ is heavily penalised relative to the cross-entropy loss, while any firing rate below $r_-$ incurs no penalty. We chose $r_-=100$ sp/s and $r_+=200$ sp/s.
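As a concrete illustration, the regularisation term described above can be sketched in a few lines of NumPy. The function name and example rates here are our own, not from the project code:

```python
import numpy as np

def firing_rate_regulariser(rates, r_min=100.0, r_max=200.0, c=1.0):
    """Sketch of the firing-rate regularisation term described above.

    rates : array of hidden-neuron firing rates r_m (sp/s).
    r_min, r_max : the thresholds r_- and r_+ from the text.
    c : scale constant (the expected initial cross-entropy loss).
    """
    # Rates below r_- contribute nothing; above r_-, L_m grows
    # quadratically and reaches exactly 1 at r_m = r_+.
    excess = np.clip(rates - r_min, 0.0, None) / (r_max - r_min)
    return c * np.mean(excess ** 2)

# A neuron at r_+ = 200 sp/s contributes L_m = 1; neurons at or below
# r_- = 100 sp/s contribute nothing.
print(firing_rate_regulariser(np.array([50.0, 100.0, 150.0, 200.0])))  # 0.3125
```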

For the results in this section, the model is trained on $128^2=16,384$ samples in batches of 128, 100 epochs using the Adam optimiser {cite:p}`kingma2017adammethodstochasticoptimization` with a learning rate of 0.001. The network needs to be retrained for each frequency, in this section we only use $f=50$ Hz. Test results are shown for a fresh draw of 4,096 samples.
For the results in this section, the model is trained on $128^2=16,384$ samples in batches of 128, for 100 epochs using the Adam optimiser {cite:p}`kingma2017adammethodstochasticoptimization` with a learning rate of 0.001. The network needs to be retrained for each frequency, in this section we only use $f=50$ Hz. Test results are shown for a fresh draw of 4,096 samples.
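For reference, the training hyperparameters above can be collected into a single configuration sketch (a plain dictionary; the key names are ours):

```python
# Hypothetical configuration sketch of the training setup described above.
train_config = {
    "n_train_samples": 128 ** 2,   # 16,384 samples
    "batch_size": 128,
    "n_epochs": 100,
    "optimiser": "Adam",
    "learning_rate": 0.001,
    "stimulus_frequency_hz": 50,   # the network is retrained per frequency
    "n_test_samples": 4096,        # fresh draw for testing
}
```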

### Results

@@ -80,7 +80,7 @@ Confusion matrix. True interaural phase difference (IPD) is shown on the x-axis,
Hidden neuron firing rates, with the same setup as in [](#confusion-matrix).
```

Analysis of the trained networks show that it uses an unexpected strategy. Firstly, the hidden layer neurons might have been expected to behave like the encoded neurons in Jeffress' place theory, and like recordings of neurons in the auditory system, with a low baseline response and an increase for a preferred phase difference (best phase). However, very reliably they find an inverse strategy of having a high baseline response with a reduced response at a least preferred phase difference ({ref}`tuning-curves-hidden`). Note that the hidden layer neurons have been reordered in order of their least preferred delay to highlight this structure. These shapes are consistently learned, but the ordering is random. By contrast, the output neurons have the expected shape ({ref}`tuning-curves-output`). Interestingly, the tuning curves are much flatter at the extremes close to an IPD of $\pm \pi/2$. We can get further insight into the strategy found by plotting the weight matrices $W_{ih}$ from input to hidden layer, and $W_{ho}$ from hidden layer to output, as well as the product $W_{io}=W_{ih}\cdot W_{ho}$ which would give the input-output matrix for a linearised version of the network ({ref}`basic-weights`).
Analysis of the trained networks shows that they use an unexpected strategy. Firstly, the hidden layer neurons might have been expected to behave like the encoded neurons in Jeffress' place theory, and like recordings of neurons in the auditory system, with a low baseline response and an increase for a preferred phase difference (best phase). However, very reliably they find an inverse strategy of having a high baseline response with a reduced response at a least preferred phase difference ({ref}`tuning-curves-hidden`). Note that the hidden layer neurons have been reordered in order of their least preferred delay to highlight this structure. These shapes are consistently learned, but the ordering is random. By contrast, the output neurons have the expected shape ({ref}`tuning-curves-output`). Interestingly, the tuning curves are much flatter at the extremes close to an IPD of $\pm \pi/2$. We can get further insight into the strategy found by plotting the weight matrices $W_{ih}$ from input to hidden layer, and $W_{ho}$ from hidden layer to output, as well as their product $W_{io}=W_{ih}\cdot W_{ho}$ which would give the input-output matrix for a linearised version of the network ({ref}`basic-weights`).
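The linearised input-output matrix mentioned above is simply a matrix product; a minimal sketch, with illustrative layer sizes that are our assumptions rather than the project's actual dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 100, 8, 12             # illustrative layer sizes
W_ih = rng.standard_normal((n_in, n_hidden))   # input -> hidden weights
W_ho = rng.standard_normal((n_hidden, n_out))  # hidden -> output weights

# For a linearised network, hidden activity is h = x @ W_ih and output is
# o = h @ W_ho, so the effective input -> output map is the product:
W_io = W_ih @ W_ho
assert W_io.shape == (n_in, n_out)
```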

```{figure} sections/basicmodel/tuning-hidden.png
:label: tuning-curves-hidden
4 changes: 2 additions & 2 deletions paper/sections/delays/Delays.md
@@ -105,15 +105,15 @@ Left: Evolution of training loss of the differential delay layer model as a function
```{figure} sections/delays/Confuse.png
:label: DelayConfuse
:width: 100%
Analysis of classifications by the trained differential delay layer model. Date is shown for errors made on the training data set (A) and test data set (B). Left shows a histogram of the true IPDs (blue) and estimated IPDs (orange). Right shows the confusion matrices on a blue-yellow colour scale (so perfect prediction would correspond to a blue image with a yellow diagonal).
Analysis of classifications by the trained differential delay layer model. Data are shown for errors made on the training data set (A) and test data set (B). Left shows a histogram of the true IPDs (blue) and estimated IPDs (orange). Right shows the confusion matrices on a blue-yellow colour scale (so perfect prediction would correspond to a blue image with a yellow diagonal).
```

Next, we show results for the dilated convolutions with learnable spacings (DCLS) algorithm, in this case using 12 IPD classes instead of 36 ([](#DelaySpikeHistograms2)). This algorithm performed better on this task, with a mean absolute error on the test dataset of $4.2^\circ$.
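The mean absolute error quoted above is computed between true and estimated IPDs; a minimal sketch, with made-up IPD values in degrees:

```python
import numpy as np

# Minimal sketch of the mean-absolute-error metric quoted above,
# using made-up true/estimated IPD values (degrees).
true_ipd = np.array([-60.0, -15.0, 30.0, 75.0])
est_ipd = np.array([-55.0, -20.0, 30.0, 70.0])
mae = np.mean(np.abs(est_ipd - true_ipd))
print(mae)  # 3.75 for these example values
```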

```{figure} sections/delays/Confuse_dcls.png
:label: DelaySpikeHistograms2
:width: 100%
Analysis of classifications by the trained dilated convolutions with learnable spacings (DCLS) model. Date is shown for errors made on the training data set (A) and test data set (B). Left shows a histogram of the true IPDs (blue) and estimated IPDs (orange). Right shows the confusion matrices on a blue-yellow colour scale (so perfect prediction would correspond to a blue image with a yellow diagonal).
Analysis of classifications by the trained dilated convolutions with learnable spacings (DCLS) model. Data are shown for errors made on the training data set (A) and test data set (B). Left shows a histogram of the true IPDs (blue) and estimated IPDs (orange). Right shows the confusion matrices on a blue-yellow colour scale (so perfect prediction would correspond to a blue image with a yellow diagonal).
```

Learning synaptic delays with weights enables the visualization of the 'receptive field' of postsynaptic neurons, as illustrated in [](#rf). Five randomly chosen neurons from the hidden layer are plotted, revealing clear spatiotemporal separation of excitation and inhibition.
4 changes: 2 additions & 2 deletions paper/sections/discussion.md
@@ -1,6 +1,6 @@
## What went well

The decision to start from the code base of the [Cosyne tutorial](https://neural-reckoning.github.io/cosyne-tutorial-2022/) {cite:p}`10.5281/zenodo.7044500` was very helpful. It meant that users had a clear entry path for the project without needing prior expertise, and a code base that was designed to be easy to understand. In addition, the popularity of the tutorial (over 38k views on YouTube at the time of writing) meant that many people heard about this project and were interested in participating. In addition, the GitHub-based infrastructure allowed for asynchronous work our website, that was automatically updated each time anyone made a change to their code or to the text of the paper, allowed for easy sharing of results.
Starting from a tutorial we ran {cite:p}`10.5281/zenodo.7044500` meant that users had a clear entry point to the project without needing prior expertise, and a code base that was designed to be easy to understand. In addition, the popularity of the tutorial (over 38k views on YouTube at the time of writing) meant that many people heard about this project and were interested in participating. Finally, the GitHub-based infrastructure, which automatically updated our website whenever anyone made a change to their code or to the text of the paper, allowed for easy sharing of results.

By providing models which used spiking neurons to transform sensory inputs into behavioural outputs, participants were free to explore in virtually any direction they wished, much like an open-world or sandbox video game. Indeed, over the course of the project we explored the full sensory-motor transformation, from manipulating the nature of the input signals to perturbing unit activity and assessing network behaviour. Consequently, in addition to its role in research, our code forms an excellent basis for teaching, as concepts from across neuroscience can be introduced and then implemented in class. In this direction, we integrated our project into two university courses and provide slides and a highly annotated Python notebook for those interested in teaching with these models.

@@ -17,4 +17,4 @@ Ultimately, while the project explored many interesting directions, which will f

## Conclusions

This paper does not present a scientific breakthrough. However, it does demonstrate the feasibility of open research projects which bring together large number of participants across countries and career stages to work together collaboratively on scientific projects. Looking ahead, we hope that by lowering the barrier to entry, these projects will welcome a wider and more diverse set of expertise and perspectives, generating new ideas and leading to discoveries beyond what any single group could realise.
This paper does not present a scientific breakthrough. However, it does demonstrate the feasibility of open research projects which bring together large numbers of participants, across countries and career stages, to work collaboratively on scientific problems. Looking ahead, we hope that by lowering the barrier to entry, these projects will welcome a wider and more diverse set of expertise and perspectives, generate new ideas and lead to discoveries beyond what any single group could realise.