Lab 3. Using Neural Nets on the UrbanSound Dataset
In your terminal, change to the directory where you keep your workshop repository. Use git pull
to get the new files.
XXXX:01_Spectrum Generation xxxx$ git pull
Git will warn you if the pull would overwrite any of your files. A simple way to make sure that nothing gets overwritten is to rename your files first.
XXXX:01_Spectrum Generation xxxx$ mv Standard.SpecVar WendyStandard.SpecVar
Git should update the files in the repository but leave your own files alone unless they have the same name as files that are in the repository.
To enable the pulldown menus for the new .ipynb files, install ipywidgets:
conda install -c conda-forge ipywidgets
For this lab, we're repeating the process we used for Cats vs. Dogs, but using the UrbanSound Dataset. If you like, you can retrain the whole network instead of just the last layer; this takes longer.
The UrbanSound Dataset contains 1302 labeled recordings of sound events from 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, and street_music. The audio codec, sampling rate, bit depth, and number of channels are the same as those of the original file uploaded to Freesound (and hence may vary from file to file).
Take a little time to look at the number of files, and look at some of the files.
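One way to get a quick overview is to count the files in each class folder and tally their extensions, since the formats vary from file to file. The following is only a minimal sketch using the Python standard library; the function name summarize_dataset is made up for this illustration, and it assumes the class folders sit directly under the directory you pass in.

```python
from collections import Counter
from pathlib import Path


def summarize_dataset(root):
    """Return {class_folder_name: Counter of file extensions} for each subfolder of root."""
    summary = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip loose files such as READMEs
        # Count every file below the class folder, grouped by lower-cased extension.
        summary[class_dir.name] = Counter(
            f.suffix.lower() for f in class_dir.rglob("*") if f.is_file()
        )
    return summary
```

Calling summarize_dataset on the folder that holds the class folders (for the dataset as downloaded, that is probably UrbanSound/data) and printing sum(counts.values()) for each entry gives the number of files per class.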
As with Cats vs. Dogs, the process for performing classification is:
Organizing Data -> Generating Spectrums -> Training the Neural Network -> Running the Neural Net.
The UrbanSound dataset comes with most of the data in a folder called data
. To use the same notebook code that we used in Cats vs. Dogs, the folders in UrbanSound/data
need to be moved up a level so that the directory tree structure looks like this:
.
├── Cats-Vs-Dogs
│   ├── Cats
│   └── Dogs
└── UrbanSound
    ├── air_conditioner
    ├── car_horn
    ├── children_playing
    ├── data
    ├── dog_bark
    ├── drilling
    ├── engine_idling
    ├── gun_shot
    ├── jackhammer
    ├── siren
    └── street_music
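You can move the folders by hand in a file browser, or script it. Here is a minimal sketch of the scripted version; the function name move_class_folders_up is made up for this illustration, and it assumes every class folder sits directly inside the data folder. Folders whose name already exists one level up are skipped rather than overwritten.

```python
import shutil
from pathlib import Path


def move_class_folders_up(dataset_dir, inner_name="data"):
    """Move each subfolder of dataset_dir/inner_name into dataset_dir.

    Returns the names of the folders that were moved. The (now emptier)
    inner folder itself is left in place.
    """
    dataset_dir = Path(dataset_dir)
    inner = dataset_dir / inner_name
    moved = []
    # sorted() reads the directory listing up front, before anything is moved.
    for folder in sorted(inner.iterdir()):
        target = dataset_dir / folder.name
        if folder.is_dir() and not target.exists():
            shutil.move(str(folder), str(target))
            moved.append(folder.name)
    return moved
```

For example, move_class_folders_up("UrbanSound") would be run from the directory that contains the UrbanSound folder; adjust the path to wherever your copy lives.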
The next step is to compute images from the audio data.
The notebook GeneratingSpectrums2.ipynb
in the 01_Spectrum Generation
folder will allow you to select which dataset in the 'AudioData' folder you wish to use.
Another update to GeneratingSpectrums2.ipynb
is that you can select your Spectrum Variables file. You can create alternatives to Standard.SpecVar
by opening SpectrumsSettingTool2.ipynb
, experimenting with the spectrum settings, and saving them.
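To make the idea of turning audio into an image more concrete, here is a rough sketch of a magnitude spectrogram. It is not the code from the notebook, whose exact method and settings are not shown here. The frame_size and hop parameters are illustrative assumptions, and the sketch uses a slow, naive DFT from the standard library so that it runs without extra packages; real code would normally use an FFT library.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists magnitudes for bins 0..frame_size//2."""
    # A Hann window tapers each frame to reduce leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT: correlate the frame with a complex sinusoid at bin k.
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames
```

Each inner list is one column of the image (one moment in time), and each entry in it is the brightness of one frequency bin. Settings like the frame size and hop are the kind of choices that change how the resulting image looks.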
As before, we will be using the ResNet CNN.
Please open the notebook TrainingResNets2
in the folder 02_Training
. This notebook has been updated to let you select other GeneratedData sets of Spectrograms, and it has more labels between cells to help you understand what is happening at different places in the code.
The training will take much longer than Cats vs. Dogs, especially if you enable training on all variables instead of just the last layer.
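The difference between the two training modes can be illustrated with a toy model. This is only a sketch and has nothing to do with the notebook's actual code or with ResNet itself: the "network" is y = w2 * tanh(w1 * x), where w1 stands in for the pretrained early layers and w2 for the last layer. The train function and its train_all flag are made up for this illustration.

```python
import math


def train(xs, ys, w1, w2, train_all, steps=200, lr=0.05):
    """Fit y ~ w2 * tanh(w1 * x) by gradient descent on mean squared error.

    When train_all is False, w1 is frozen and only w2 (the "last layer")
    is updated. When train_all is True, both weights are updated.
    """
    for _ in range(steps):
        g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            h = math.tanh(w1 * x)                  # feature from the early layer
            err = w2 * h - y                       # prediction error
            g2 += 2 * err * h                      # d(loss)/d(w2)
            g1 += 2 * err * w2 * (1 - h * h) * x   # d(loss)/d(w1)
        w2 -= lr * g2 / len(xs)
        if train_all:
            w1 -= lr * g1 / len(xs)
    return w1, w2


# Toy data generated from w1 = 0.8, w2 = 1.5.
xs = [-2.0, -1.0, 0.5, 1.0, 2.0]
ys = [1.5 * math.tanh(0.8 * x) for x in xs]
last_layer_only = train(xs, ys, 0.3, 0.1, train_all=False)
all_weights = train(xs, ys, 0.3, 0.1, train_all=True)
```

With train_all=False, w1 comes back unchanged and only w2 moves. In a real network the early layers hold most of the weights, so updating all of them means far more computation per step, which is why full retraining is slower.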
Try out your neural net using ResNetInferenceInteractive
in the folder 03_Running
! Does it work as well as Cats vs. Dogs? Why or why not?
As promised, this version of the Inference functions provides more details on how the code responds to the inferences made by the neural net than ResNetInference
.
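As a rough idea of what happens after the network produces an inference, here is a minimal sketch that turns one raw score per class into probabilities and picks the most likely labels. It assumes the network outputs ten raw scores in the class order listed earlier; that ordering, and the function names softmax and top_predictions, are assumptions for this illustration and not taken from the notebook.

```python
import math

CLASSES = ["air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
           "engine_idling", "gun_shot", "jackhammer", "siren", "street_music"]


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_predictions(scores, labels=CLASSES, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

Looking at the top few probabilities rather than only the single best label makes it easier to see when the network is unsure, for example when two classes receive similar scores.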
Try this again with the Stanford Sounds Dataset! Also, feel free to supplement the sounds from that dataset with other sounds from Freesound.