Table 11 Acoustic event detection

From: An analytical study of information extraction from unstructured and multidimensional big data

 

| Ref. | Purpose | Technique | Dataset | Results | Limitations/benefits |
|------|---------|-----------|---------|---------|----------------------|
| [62] | Modeling data with exemplars; to explicitly model background events | Exemplar-based method with NMF (see the NMF sketch after the table) | Office Live recordings (1 to 3 min) and Office Synthetic with background noise | With time warping, the F-score improved from 50.2% to 65.2% on the Office Live dataset, whereas results on the Office Synthetic dataset were not promising | Proposed solution suffers from data scarcity and overfitting |
| [65] | To overcome the overfitting limitation; to improve performance on large-scale input | CNN trained end to end for AED, plus a data augmentation method to prevent overfitting (see the augmentation sketch after the table) | Acoustic event classification database | Achieved a 16% improvement over Bag of Audio Words (BoAW) and a classical CNN | Results reported with and without data augmentation showed that augmentation improves performance |
| [63] | To explore the impact of feature extraction in AED; to explore the effectiveness of deep learning approaches | Multiple single-resolution recognizers, selection of an optimal set of events, and merging or removing repeated labels | CHIL2007 | CNN performed better with the combination scheme of the multi-resolution approach | DNNs can model high-dimensional data |
| [64] | To improve detection accuracy by extracting context information | A context recognition phase (UBM to capture unknown events) followed by a sound event detection stage | Audio database of 103 recordings totaling 1133 min | Knowledge of context, used as a context-dependent event prior, improves accuracy | Context-dependent event selection and accurate sound event modeling are two key factors for improving AED |
| [66] | To improve the efficiency of acoustic scene classification and acoustic event detection | Gated recurrent neural network (GRNN) + linear discriminant analysis (LDA) (see the GRNN sketch after the table) | DCASE2016 Task 1 | Achieved 79.1% overall accuracy on the DCASE2016 challenge, a 19.8% relative improvement over a GMM baseline | LDA minimizes within-class variance but is not efficient for high-dimensional data |
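To make the exemplar-based NMF idea of [62] concrete, the following minimal sketch (not the authors' implementation) fixes a dictionary W of exemplar magnitude spectra, estimates the activations H of a test spectrogram V ≈ WH with multiplicative updates, and flags an event wherever its pooled activations exceed a threshold. All names, shapes, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-10):
    """Estimate activations H so that V ~= W @ H, keeping the exemplar
    dictionary W fixed (multiplicative updates for the KL divergence)."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
    return H

# Toy example: 2 background exemplars + 2 "event" exemplars,
# each a non-negative spectrum with F frequency bins.
F, T = 64, 100
rng = np.random.default_rng(1)
W = rng.random((F, 4))                      # columns = exemplar spectra
true_H = np.zeros((4, T))
true_H[:2] = 0.5                            # background always active
true_H[2:, 40:60] = 1.0                     # event active in frames 40-59
V = W @ true_H + 0.01 * rng.random((F, T))  # observed spectrogram

H = nmf_activations(V, W)
event_score = H[2:].sum(axis=0)             # pool the event exemplars
detected = event_score > 0.5 * event_score.max()
print("event frames:", np.flatnonzero(detected))
```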
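Row [65] attributes the overfitting fix to data augmentation but does not specify the transformations. As a generic illustration only, the sketch below applies two common waveform augmentations, a random circular time shift and additive Gaussian noise at a chosen SNR, to multiply the training material before feature extraction; the cited work may have used different perturbations.

```python
import numpy as np

def augment(wave, rng, max_shift=4000, snr_db=20.0):
    """Return a randomly perturbed copy of a 1-D waveform: circular
    time shift plus additive Gaussian noise at a given signal-to-noise
    ratio (illustrative parameter choices only)."""
    shifted = np.roll(wave, rng.integers(-max_shift, max_shift + 1))
    signal_power = np.mean(shifted ** 2) + 1e-12
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=shifted.shape)
    return shifted + noise

rng = np.random.default_rng(0)
clip = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s test tone
batch = np.stack([augment(clip, rng) for _ in range(8)])   # 8 variants
print(batch.shape)  # (8, 16000)
```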
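For the GRNN of [66], a minimal sketch of a gated recurrent classifier is shown below, assuming PyTorch and the 15 scene classes of DCASE2016 Task 1. How the cited work couples LDA with the recurrent network is not detailed in the table, so LDA is omitted here; the class and parameter names are assumptions for illustration.

```python
import torch
import torch.nn as nn

class GRNNClassifier(nn.Module):
    """Minimal gated recurrent scene classifier: a GRU runs over a
    sequence of per-frame features and the last hidden state is
    mapped to class logits."""
    def __init__(self, n_features=40, hidden=64, n_classes=15):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, x):            # x: (batch, frames, n_features)
        _, h = self.gru(x)           # h: (1, batch, hidden)
        return self.out(h[-1])       # logits: (batch, n_classes)

model = GRNNClassifier()
logits = model(torch.randn(2, 500, 40))   # 2 clips, 500 frames each
print(logits.shape)                       # torch.Size([2, 15])
```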