Multi-Temporal Resolution Convolutional Neural Networks for Acoustic Scene Classification

11/11/2018
by Alexander Schindler, et al.

In this paper we present a Deep Neural Network architecture for the task of acoustic scene classification which harnesses information from increasing temporal resolutions of Mel-Spectrogram segments. The architecture is composed of separate, parallel Convolutional Neural Networks which learn spectral and temporal representations for each input resolution. The resolutions are chosen to cover fine-grained characteristics of a scene's spectral texture as well as its distribution of acoustic events. The proposed model shows a 3.56% improvement over the best performing single-resolution model and a 12.49% improvement over the DCASE 2017 Acoustic Scene Classification task baseline.
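
The parallel, per-resolution structure described in the abstract can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration: the segment lengths (2, 4 and 8 frames), the centre-cropping strategy, the function names, and especially `branch_embedding`, which replaces a learned CNN branch with a simple mean over time. The abstract does not state how the paper crops its segments or combines the branch outputs; concatenation is used here only as one plausible choice.

```python
# Sketch only: segment lengths, cropping strategy and the pooling "branch"
# are illustrative assumptions, not the configuration used in the paper.
from typing import List, Sequence

Spectrogram = List[List[float]]  # frames x mel bands


def center_segment(spec: Spectrogram, n_frames: int) -> Spectrogram:
    """Return the n_frames frames centred in the spectrogram."""
    if n_frames > len(spec):
        raise ValueError("segment longer than spectrogram")
    start = (len(spec) - n_frames) // 2
    return spec[start:start + n_frames]


def branch_embedding(segment: Spectrogram) -> List[float]:
    """Stand-in for one CNN branch: mean over time for each mel band."""
    n = len(segment)
    n_bands = len(segment[0])
    return [sum(frame[b] for frame in segment) / n for b in range(n_bands)]


def multi_resolution_features(spec: Spectrogram,
                              resolutions: Sequence[int]) -> List[float]:
    """Run one branch per temporal resolution and concatenate the outputs."""
    features: List[float] = []
    for n_frames in resolutions:
        features.extend(branch_embedding(center_segment(spec, n_frames)))
    return features


# Toy input: 8 frames, 2 mel bands; band 0 holds the frame index.
toy = [[float(t), 1.0] for t in range(8)]
fused = multi_resolution_features(toy, resolutions=(2, 4, 8))
```

In a real model each branch would be a trained convolutional stack with its own weights, and the fused vector would feed a classification layer over the scene labels.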
