ACDC Version 2022 is a Python 3.7 release of the Animal Call Detection and Classification code, powered by Keras/TensorFlow.
Studying the social behavior of animals in a laboratory setting typically generates a wealth of unstructured data. These data, such as hours-long video and/or audio recordings, are tedious for humans to analyze manually. ACDC is a project that seeks to solve this problem for audio analysis, helping researchers train a neural network and then automatically detect and classify animal calls, outputting the results in a convenient format.
To train, the user places training data for discrete call types in the `training_data` folder and then instructs ACDC to develop detection and classification models based on that data. To process recordings, the user places recordings in the `recordings` folder and tells ACDC to process them; files containing the call labels are then generated and placed in an output folder. Operation is mainly through a command-line interface with numbered options, where the user enters the number of the action to perform.
- Removal of lesser-used features
- Fixed issues with newer versions of several packages, including Keras/TensorFlow
- Save to Audacity labels format
- Model save and load debugged
- Installation using requirements.txt
The full recording is split up into overlapping segments, each of a certain length (e.g. 0.5s). Each segment is fed to the multi-class classifier, which determines which type of call that segment contains. Since there is a high degree of overlap between the segments, each section of the spectrogram is essentially covered many times. The results are put in a time series, and the "Scanner" class then goes through the raw results, smoothing them and discarding contiguous runs of segments that are shorter than a certain proportion of the average call length (e.g. if the average phee call is 1s and a contiguous set of segments was labeled phee but lasted only 0.3s in total, it would be discarded). These steps effectively create a "voting" scheme. If there is a false positive in one segment and one segment only, these steps will likely smooth it over or weed it out. Conversely, a false negative in a sea of true positives will not disrupt the chain.
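As an illustration of this smoothing-and-filtering idea (this is not ACDC's actual `Scanner` class; the window size, the 50% duration cutoff and the function names are assumptions made for the sketch):

```python
# Illustrative sketch only -- not ACDC's Scanner implementation.
# Input: one predicted label per overlapping segment, in time order.
from collections import Counter

def smooth_labels(labels, window=5):
    """Majority vote over a sliding window of segment labels (window size is an assumption)."""
    half = window // 2
    return [Counter(labels[max(0, i - half):i + half + 1]).most_common(1)[0][0]
            for i in range(len(labels))]

def drop_short_runs(labels, seg_step_s, avg_call_len_s, min_fraction=0.5):
    """Relabel contiguous runs of a call that last less than
    min_fraction * the average call length (durations in seconds)."""
    out = list(labels)
    start = 0
    while start < len(labels):
        end = start
        while end < len(labels) and labels[end] == labels[start]:
            end += 1
        call = labels[start]
        if call != 'Noise' and (end - start) * seg_step_s < min_fraction * avg_call_len_s.get(call, 0.0):
            out[start:end] = ['Noise'] * (end - start)   # discard the short run
        start = end
    return out

# Example: a lone false-positive 'Ph' is voted away by its neighbors.
raw = ['Noise', 'Ph', 'Noise', 'Noise', 'Ph', 'Ph', 'Ph', 'Noise']
print(smooth_labels(raw))
```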
- Download the repo and unzip the files in the directory where you want them
- Install Anaconda https://www.anaconda.com/
- Create a new environment with Python 3.7 in Anaconda Navigator
- Click the environment, click Open Terminal (a command line terminal will open)
- In the terminal, navigate to the directory that has the ACDC files (where `acdc.py` is)
- Type `pip install -r requirements.txt` and hit enter. Pip should now install all the required packages.
To run ACDC, type `python acdc.py`. The following menu should appear:
Now enter the number for the action you want to perform. Note that you first need data to run these options.
Option 1 (prepare training data) requires having training data in the `training_data` folder.
Option 2 (train models) assumes that option 1 has already been done.
Option 3 (process recordings) requires a trained model in the `models` folder and a recording in the `recordings` folder.
- Put training data in the `training_data` folder. There should be a sub-folder for each class, and the name of each sub-folder has to match a class listed in the variable `WINDOW_LENGTHS` in `variables.py` (a quick layout check is sketched after this list). The folders should contain wave files (.wav format, mono, 48kHz, 16 bits/sample) with the target calls, edited to start and stop at the beginning and end of the call. The training samples need to be good-quality, clean, representative examples of what will be encountered in the recordings. There should also be a folder named `Noise` with representative samples of noises that are loud enough to cross the threshold but do not belong to any of the target classes. Set `TRAINING_SEGMENTS_PER_CALL` sufficiently high for data augmentation to take place.
- To run data preparation, enter the corresponding number in the menu. Output is a file called `acdc.tdata` in the `models` folder.
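As a quick sanity check of this folder layout (the snippet below is not part of ACDC; it only assumes it is run from the ACDC directory so that `variables.py` is importable):

```python
# Hypothetical helper, not part of ACDC: check that training_data contains
# one sub-folder per class in WINDOW_LENGTHS, plus the Noise folder.
import os
from variables import WINDOW_LENGTHS   # variables.py ships with ACDC

expected = set(WINDOW_LENGTHS) | {'Noise'}
found = {d for d in os.listdir('training_data')
         if os.path.isdir(os.path.join('training_data', d))}

print('Missing folders:   ', sorted(expected - found))
print('Unexpected folders:', sorted(found - expected))
```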
- Once *prepare training data* has been run, there should be a file called `acdc.tdata` in the `models` folder and models can be trained. Make sure to set `TRAINING_EPOCHS` in `variables.py` sufficiently high (>10) for the model to optimize (illustrative values are shown after this list).
- To run model training, enter the corresponding number in the menu. Output is a set of files and sub-folders in `models` representing the trained model.
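For reference, the training-related settings live in `variables.py`. The values below are purely illustrative examples, not recommendations from the ACDC authors:

```python
# Illustrative values in variables.py -- tune them for your own data set.
TRAINING_EPOCHS = 20               # >10 so the model has enough iterations to optimize
TRAINING_SEGMENTS_PER_CALL = 500   # example target; set it near the segment count of the largest class
```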
- Once a model has been trained, it should be in the `models` folder and recordings (.wav format, mono, 48kHz, 16 bits/sample) can be processed. Put wave files for analysis in the `recordings` folder.
- To process recordings, enter the corresponding number in the menu. Results are stored in a new sub-directory in `results`. Sub-directories are named according to the date and time of the run, like this: [YYYYMMDD][HHMMSS][recording filename]. Results are lists of call labels in .csv and .txt format (tab-delimited, Audacity-readable) with a row for each call: the 1st column is the start time (seconds), the 2nd column the end time (seconds) and the 3rd column the call type (‘Tr’, ‘Tw’, ‘Ph’ or ‘Chi’). The csv and txt files contain the same information (an example label file is shown below).
- An easy way to view the results is by loading the wave file into Audacity (https://www.audacityteam.org/) in Spectrogram view, and then doing 'File', 'Import', 'Labels...' and selecting the .txt file with the labels.
- The user may want to try out different values for `CONFIDENCE_THRESHOLD` and `VOLUME_AMP_MULTIPLE` (both in `variables.py`) to get a better result. If that does not work, re-training with more samples may be necessary. Finally, to use a model architecture of your own, the current framework can still be useful; edit `model.py` to enter the new model (a sketch of what such an architecture could look like is included below).
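For illustration, a results label file for a recording with two detected calls might look like this (the times and call types are made up; columns are tab-separated start time, end time and call type):

```
1.25	2.31	Ph
5.80	6.05	Tw
```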
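If you do try an architecture of your own, the sketch below shows the general shape of a small Keras CNN over spectrogram segments. It is only an illustration of the kind of model one might adapt when editing `model.py`; the input shape, layer sizes, class count and function name are assumptions and do not reflect ACDC's actual code:

```python
# Illustrative only: a small CNN over spectrogram segments.
import tensorflow as tf
from tensorflow.keras import layers

def build_custom_model(input_shape=(128, 64, 1), n_classes=5):
    """Hypothetical example model; shapes and sizes are assumptions."""
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),                # spectrogram: freq x time x 1
        layers.Conv2D(16, 3, activation='relu'),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation='relu'),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation='softmax'),  # e.g. Tr, Tw, Ph, Chi, Noise
    ])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```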
`variables.py` contains constants used in various modules. Some of them are highlighted here because changing their values according to the user's needs can help get better results.
CONFIDENCE_THRESHOLD
This is the value that needs to be exceeded in the final layer of the model to trigger detection of a call. Lowering this value makes the model more likely to detect something but can lead to more false positives. Raising this value makes the model less likely to detect something but reduces false positives.
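Conceptually, the decision works like the snippet below (this is not ACDC's code; the class order, threshold and probabilities are made up for illustration):

```python
import numpy as np

CONFIDENCE_THRESHOLD = 0.8                        # illustrative value
class_names = ['Tr', 'Tw', 'Ph', 'Chi', 'Noise']  # assumed class order

probs = np.array([0.05, 0.04, 0.86, 0.03, 0.02])  # stand-in for the model output on one segment
if probs.max() > CONFIDENCE_THRESHOLD:
    print('Detected:', class_names[probs.argmax()])
else:
    print('Below threshold -- no call detected in this segment')
```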
TRAINING_EPOCHS
This determines the number of iterations for which the model trains. We have had good experience using at least 10 epochs.
WINDOW_LENGTHS = {'Chi': 0.25,'Tr': 0.25,'Ph': 0.40,'Tw': 0.5}
Window lengths in seconds are set for each vocalization type. The names of the calls ‘Chi’, ‘Tr’, ‘Ph’ and ‘Tw’ have to correspond to folder names in the `training_data` folder. If different or additional classes need to be trained, this variable needs to change accordingly.
TRAINING_SEGMENTS_PER_CALL
This is a target number of segments per class that determines whether the data need to be augmented. It makes sense to set this value equal to the segment count of the class with the most segments, so that the other classes are augmented up to the same number, removing class imbalance.
VOLUME_AMP_MULTIPLE
This variable determines by how much the data are amplified. A threshold is applied, and segments that do not cross it are discarded. Change this value to get the best balance between false positives and false negatives.
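As a purely conceptual illustration (not ACDC's internal code; the threshold value, multiplier and test signal are made up), amplification interacts with the threshold roughly like this:

```python
import numpy as np

VOLUME_AMP_MULTIPLE = 10.0    # illustrative value
amplitude_threshold = 0.1     # hypothetical threshold a segment must cross

t = np.linspace(0, 0.25, 12000)                 # 0.25 s at 48 kHz
segment = 0.02 * np.sin(2 * np.pi * 8000 * t)   # quiet test tone
amplified = segment * VOLUME_AMP_MULTIPLE
print('Crosses threshold:', np.abs(amplified).max() > amplitude_threshold)
```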
models
This is where trained models and pre-processed training data are stored.
recordings
This is where recordings for analysis (.wav files, mono, 48kHz, 16 bits/sample) are stored.
results
Results of processing a file are stored in this folder. A new sub-directory is created each time a file is processed. Sub-directories are named according to the date and time of the run, like this: [YYYYMMDD][HHMMSS][recording filename]. Results are lists of call labels in .csv and .txt format (tab-delimited, Audacity-readable) with a row for each call: the 1st column is the start time (seconds), the 2nd column the end time (seconds), and the 3rd column the call type (‘Tr’, ‘Tw’, ‘Ph’ or ‘Chi’). The csv and txt files contain the same information.
training_data
Training data for training a new model goes here. There should be a folder for each call type ‘Tr’, ‘Tw’, ‘Ph’, ‘Chi’ and ‘Noise’. Each training sample should be a .wav file stored in the folder corresponding to the call type. The ‘Noise’ folder should contain a representative sampling of noises that are not vocalizations but do occur in the environment where the recordings are made, such as doors opening and closing, cage sounds, et cetera. Very low-amplitude background noise does not need to be represented because thresholding already discards it.
Collaborators
Samvaran Sharma, Karthik Srinivasan, and Rogier Landman
Additional info in paper 'Unobtrusive vocalization recording in freely moving marmosets' (in prep)
This project was developed in collaboration with MIT Brain and Cognitive Sciences (c) 2016-2022.