Cognitive Source Separation
Humans excel at analyzing their auditory environment and separating different sound sources. A classic example of human sound source separation is the “cocktail party problem”, in which several people talk simultaneously in the same room. Human listeners have no difficulty attending to a single speaker while ignoring all the other speakers and the background noise. Today's computational approaches to analyzing and separating auditory scenes, especially in reverberant environments, are far from matching this extraordinary ability of the human brain.
The goal of this project is to find out whether human cognition can be used to enhance computational auditory source separation. Human source analysis and separation capabilities are strongly influenced by assumptions and learned knowledge about the specific sources and the auditory scene. When humans enter an auditory scene, e.g. a cocktail party, they first analyze the scene with regard to the positions, types and special characteristics of possible sound sources. On entering the scene they classify each source as familiar vs. unfamiliar, speech vs. non-speech, … and analyze the characteristics of each classified source (e.g. the fundamental frequency of a speaking person). When source separation is performed, humans can draw on this information to support the decisions of the separation process. This project tries to mimic these cognitive strategies of the human brain and to provide the estimated characteristics of the auditory scene as additional input to a computational source separation approach.
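As an illustration of one such source characteristic, the fundamental frequency of a single voiced frame could be estimated with a simple autocorrelation peak search. This is only a minimal sketch under stated assumptions, not the method used in the project: the function names `estimate_f0` and `synth_tone`, the 80 to 400 Hz search range and the synthetic test tone are all hypothetical choices made for this example.

```python
import math


def synth_tone(f0, sample_rate, n_samples):
    """Hypothetical helper: a harmonic test tone (fundamental plus 2nd harmonic)."""
    return [
        math.sin(2 * math.pi * f0 * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / sample_rate)
        for i in range(n_samples)
    ]


def estimate_f0(frame, sample_rate, f_min=80.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Returns None if no lag in the search range has positive correlation.
    The search range f_min..f_max is an assumed range for speech.
    """
    # Remove the mean so a constant offset does not dominate the correlation.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]

    # Convert the frequency range into a range of candidate periods (lags).
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), len(x) - 1)

    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        n = len(x) - lag
        # Average product of the signal with a copy of itself shifted by `lag`.
        score = sum(x[i] * x[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag is None:
        return None
    # The lag with the strongest self-similarity is taken as the period.
    return sample_rate / best_lag


if __name__ == "__main__":
    tone = synth_tone(125.0, 8000, 1024)
    print(estimate_f0(tone, 8000))
```

An estimate of this kind could then be passed to a separation stage as side information, in the spirit of the additional input described above. A real reverberant recording with several simultaneous sources would need a considerably more robust, multi-pitch estimator.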
Bob – The Robotic Head
To imitate human listening behavior, a robotic dummy head called Bob is used to explore the auditory scene. Bob consists of a Neumann KU-100 dummy head mounted on a pan-tilt-roll unit that is controlled from a host computer via an RS-232 serial port. The pan-tilt-roll unit can move Bob into any human-like head position in all three dimensions, which allows Bob to investigate the auditory scene around him in a human-like manner.
Finished Theses
- Diploma Thesis: Binaural Sound Source Separation using a human dummy head – Andreas Neufang
- Diploma Thesis: Sound Source Localization using a movable human dummy head – Eric Haschke
- Study Thesis: Generation of Point Sound Sources and Surround Sound Effects – Tobias Jung
- Bachelor's Thesis: Multiple Fundamental Frequency Estimation for Cognitive Source Separation – Jochen Krämer
- Bachelor's Thesis: Binaural Source Tracking Using a Human Dummy Head – Janosch Offenberg
- Master's Thesis: Audio Signal Classification in Non-Ideal Reverberant Environments – Ivan Mironenko