We live in a world of sounds, full of beautiful music, birds chirping, and the voices of our friends. It’s a rich cacophony, with blaring beeps, accented alarms, and knock-knock jokes. The sound of a door opening can alert us to a friend’s arrival, and a door slamming can alert us to an impending argument.
HEARBO (HEAR-ing roBOt) is a robot developed at Honda Research Institute–Japan (HRI-JP), and its job is to understand this world of sound, in a field called Computational Auditory Scene Analysis.
At this year’s IEEE International Conference on Intelligent Robots and Systems (IROS) and RO-MAN, several papers describing HEARBO’s latest capabilities were presented.
With the dream of the futuristic robotic butler, researchers are trying to make robots understand our voice commands, a bit like Apple’s Siri but from 2 meters away. Typical approaches use a method called beamforming to “focus” on a sound, like a person speaking. The system then applies noise reduction to that sound and tries to understand what the person is saying using automatic speech recognition.
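The “focusing” idea behind beamforming can be illustrated with a minimal delay-and-sum sketch (this is an assumption-laden toy, not HEARBO’s actual code): a source reaches each microphone with a direction-dependent delay, so aligning the channels for one direction reinforces sound from there and attenuates sound from elsewhere.

```python
# Minimal delay-and-sum beamformer sketch (illustrative only, not HEARBO's code).
# Aligning channels by the delays implied by a chosen direction reinforces
# sound arriving from that direction.

def delay_and_sum(channels, delays):
    """Align each channel by its integer-sample delay, then average."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[d + i] for ch, d in zip(channels, delays)) / len(channels)
            for i in range(n)]

# Toy scene: the source signal arrives at mic 2 one sample later than at mic 1.
source = [0, 1, 0, -1, 0, 1, 0, -1, 0]
mic1 = source
mic2 = [0] + source[:-1]                                  # delayed copy

aligned = delay_and_sum([mic1, mic2], delays=[0, 1])      # steered at the source
misaligned = delay_and_sum([mic1, mic2], delays=[1, 0])   # steered elsewhere

def energy(x):
    return sum(v * v for v in x)
```

Steering toward the source (`aligned`) reconstructs the signal at full amplitude, while steering elsewhere (`misaligned`) largely cancels it, which is exactly the spatial “focus” the article describes.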
The beamforming approach is widely used, but HEARBO takes it a step further. What about when the TV is on, the kids are playing on one side of the room, and the doorbell rings? Can our robot butler detect that? HEARBO’s researchers say it can, using their own three-step paradigm: localization, separation, and recognition. This system, called HARK, recovers the original sounds from a mixture based on where each sound is coming from. Their reasoning is that “noise” shouldn’t just be suppressed, but separated out and analyzed afterwards, since what counts as noise is highly dependent on the situation. For example, a crying baby may be considered noise, or it may convey very important information.
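The localization step of a localize/separate/recognize pipeline can be sketched as follows (a hypothetical illustration, not HARK’s actual implementation): estimate each source’s direction by finding the inter-microphone time delay that maximizes the cross-correlation between channels.

```python
# Hypothetical sketch of the "localization" step: estimate the time-difference-
# of-arrival (TDOA) between two microphones via naive cross-correlation.
# The winning lag maps to a source direction given the mic geometry.

def estimate_delay(ref, other, max_lag):
    """Return the lag (in samples) of `other` relative to `ref` with the
    highest cross-correlation; positive means `other` lags behind `ref`."""
    def corr(lag):
        return sum(ref[i] * other[i + lag]
                   for i in range(len(ref))
                   if 0 <= i + lag < len(other))
    return max(range(-max_lag, max_lag + 1), key=corr)

source = [0, 2, 3, 1, -1, -2, 0, 1]
mic1 = source
mic2 = [0, 0] + source[:-2]   # the sound reaches mic 2 two samples later

tdoa = estimate_delay(mic1, mic2, max_lag=4)
```

In a real system this per-source delay feeds the separation stage, which extracts each stream by direction before recognition is run on every stream individually.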