From e3d6792b5161fa4d43d155b7f711d00622b8335d Mon Sep 17 00:00:00 2001 From: Jyotika Singh Date: Thu, 10 Jun 2021 19:59:54 -0700 Subject: [PATCH] Fix typos --- paper/paper.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/paper/paper.md b/paper/paper.md index 8bc18bb..af937ed 100644 --- a/paper/paper.md +++ b/paper/paper.md @@ -53,7 +53,7 @@ The motivation behind this software is understanding the popularity of Python fo Different data processing techniques work well for different types of data. For example, word vector formations work great for text data [@nlp]. However, passing numbers data, an audio signal or an image through word vector formation is not likely to bring back any meaningful numerical representation that can be used to train machine learning models. Different data types correlate with feature formation techniques specific to their domain rather than a "one size fits all". -PyAudioProcessing is a Python based library for processing audio data into features and building Machine Learning models. Audio processing and feature extraction research is popular in MATLAB. There are comparatively fewer resources for audio processing and classification in Python. This tool contains implementation of popular and different audio feature extraction that can be use in combination with most scikit-learn classifiers. Audio data can be trained, tested and classified using pyAudioProcessing. The output consists of cross validation scores and results of testing on custom audio files. +PyAudioProcessing is a Python based library for processing audio data into features and building Machine Learning models. Audio processing and feature extraction research is popular in MATLAB. There are comparatively fewer resources for audio processing and classification in Python. This tool contains implementation of popular and different audio feature extraction that can be used in combination with most scikit-learn classifiers. Audio data can be trained, tested and classified using pyAudioProcessing. The output consists of cross validation scores and results of testing on custom audio files. The library lets the user extract aggregated data features calculated per audio file. Unique feature extractions such as Mel Frequency Cepstral Coefficients (MFCC) [@6921394], Gammatone Frequency Cepstral Coefficients (GFCC) [@inbook], spectral coefficients, chroma features and others are available to extract and use in combination with different backend classifiers. While MFCC features find use in most commonly encountered audio processing tasks such as audio type classification, speech classification, GFCC features have been found to have application in speaker identification or speaker diarization. Many such applications, comparisons and uses can be found in this IEEE paper [@6639061]. All these features are also helpful for a variety of other audio classification tasks. @@ -81,7 +81,7 @@ This software offers pre-trained models. This is an evolving feature as new data | music | 48.80 | 1.20 | | speech | 0.60 | 49.40 | -2. Audio type classifier to determine speech versus music versus bird sounds: Trained SVM classifier that classifying audio into three possible classes - music, speech and birds. This classifier was trained using MFCC, spectral and chroma features. Confusion matrix has scores such as follows. +2. Audio type classifier to determine speech versus music versus bird sounds: Trained SVM classifier that classifies audio into three possible classes - music, speech and birds. This classifier was trained using MFCC, spectral and chroma features. Confusion matrix has scores such as follows. | | music | speech | birds | | --- | --- | --- | --- |