General audio detection and segmentation by means of machine learning is a highly demanding task.
Relevant work over the last decade aims at modeling audio in order to perform semantic analysis and high-level categorization. A generic strategy that detects audio events as transitions from one audio state to another is of interest and would support the whole classification workflow. This work investigates the possibilities of designing a robust event transition detection algorithm for audio that performs well under different conditions, without relying on complicated machine learning schemes and with minimal prior knowledge in the detection model, thereby delivering consistent performance for any input signal and computing environment. Additionally, a modern user-generated content approach for populating and updating ground truth databases is presented. Both techniques are embedded in a mobile application called iSMAARTer (Intelligent Sound Measurement Audio Analysis & Recording Tool). iSMAARTer was presented at the 2015 Audio Mostly Conference as a poster paper.
You can view the poster or download the dataset that was used for training and evaluation.