The human brain uses several cues to localize sound, including a sound’s intensity (loudness), timing, and frequency. Because we need to detect and react to threats quickly, humans have become quite good at locating the source of a sound: within two degrees of space, in fact! If you can hear, it’s the reason you can tell whether a fire truck is getting closer or farther away, and it’s what has made Marco Polo a favorite pool game for generations.
Sound localization in an autonomous robot like Misty may not be a matter of survival (and certainly not of pool games), but it is extremely important to her ability to interact and engage with her environment and with you.
Misty’s sound localization: the tech
Misty is packed with capabilities and many of them are focused on her ability to interact and engage. Sound localization is considered an interaction capability along with face detection and recognition, wake word event, audio recording, audio playback, and capacitive touch. So how does it work?
Put simply, Misty localizes sound using the three far-field microphones in her head, which use Qualcomm® Fluence™ PRO technology. These microphones provide echo cancellation and noise suppression, sound position tracking to determine the user’s location relative to the device, and sound focus to capture voice from specific areas.
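To make the timing cue concrete, here is a minimal sketch of how an arrival-time difference between two microphones can be turned into a bearing. It is a generic far-field estimate, not a description of how Qualcomm® Fluence™ PRO or Misty works internally. The function name, the sign convention, and the 10 cm microphone spacing in the usage note are assumptions made for illustration.

```python
import math

# Approximate speed of sound in air at room temperature (assumption).
SPEED_OF_SOUND_M_S = 343.0


def bearing_from_delay(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate a sound's bearing in degrees from a two-microphone time delay.

    delay_s is (arrival time at the left mic) minus (arrival time at the
    right mic), so a positive delay means the sound reached the right mic
    first. 0 degrees is straight ahead; positive angles are to the right.
    This hypothetical helper assumes a distant (far-field) source.
    """
    # Far-field geometry: the extra path length is spacing * sin(angle).
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    # Clamp so measurement noise cannot push asin() out of its domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

With a hypothetical 10 cm spacing, a delay of about 146 microseconds (0.05 / 343 seconds) corresponds to a source roughly 30 degrees to the right. A single pair of microphones cannot tell front from back, which is one reason arrays use more than two.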
Misty Skill: Move to Sound
While sound localization opens the door to many possibilities, the real magic happens when you incorporate this capability into a real-world use case via your code. Misty Robotics Prototype Engineer CP understood this, which inspired him to build the Move to Sound Misty Skill.
In this skill, when Misty detects a voice, she turns towards the person speaking and moves in their direction while raising an arm to wave. (More on that turning movement below.) The interaction continues if the speaker touches the capacitive sensor on Misty’s chin; she responds to the touch by taking a picture of whatever is in front of her.
Because Misty easily integrates with third-party APIs, CP used Microsoft Cognitive Services to pull data about what Misty took pictures of. He then used a text-to-speech service to have Misty vocalize what she saw when her chin sensor was activated. As Misty begins to understand the context of her environment, she is able to take on more complex tasks both in the home and for business applications.
Let’s revisit the way Misty turns — first her head and then her body — before she begins moving after detecting CP’s voice. The movement and mannerisms of robots matter. The more natural they are, the easier it is for humans to accept them as helpful companions versus just another piece of equipment; the more human-like they are, the easier it is to interact with them in our everyday lives in meaningful ways.
To enable this natural movement, CP leveraged Misty’s patent-pending neck, which has three degrees of freedom, via the Command Center.
In short, CP eliminated the jerky, “robotic” movements in favor of a more fluid movement. First, he programmed Misty to turn her head and, as the head approaches its yaw limit, her body begins turning to face the person, too. Then, as the body turns, the head turns back toward center, which gives the appearance that her head is holding still while her body turns beneath it.
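The head-then-body handoff described above can be sketched as a small planning function. This is not CP’s code and it calls no Misty API. It only computes a sequence of poses that a skill could send to the robot. The function name, the 60-degree neck limit, and the 15-degree step size are hypothetical values chosen for illustration.

```python
def plan_turn(target_deg, neck_limit_deg=60.0, step_deg=15.0):
    """Plan a head-led turn as a list of (head_yaw, body_heading) poses.

    head_yaw is measured relative to the body; body_heading is measured
    relative to the robot's starting orientation. Both are in degrees,
    and their sum is the direction the robot is looking.
    """
    direction = 1.0 if target_deg >= 0 else -1.0
    remaining = abs(target_deg)
    head, body = 0.0, 0.0
    poses = []

    # Phase 1: the head leads, turning toward the sound up to the neck limit.
    head_goal = min(remaining, neck_limit_deg)
    while head < head_goal:
        head = min(head + step_deg, head_goal)
        poses.append((direction * head, direction * body))

    # Phase 2: the body follows while the head counter-rotates toward center,
    # so the gaze settles on the target and then holds there.
    while body < remaining:
        body = min(body + step_deg, remaining)
        head = min(neck_limit_deg, remaining - body)
        poses.append((direction * head, direction * body))

    return poses
```

For a sound 90 degrees to one side, plan_turn(90.0) first swings the head out to the limit, then turns the body while the head unwinds, and ends at (0.0, 90.0) with head centered and body facing the speaker. Smaller steps give a smoother motion at the cost of more commands.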
While this smooth movement makes all the difference in this skill and in any use case built on it, it requires only a few lines of code, which can be found here on CP’s GitHub repo.
Incorporating sound localization into your own skills
The Move to Sound Misty Skill is one that can invite engagement the way CP illustrated in the video above. Misty can greet you in the morning when you walk out of your bedroom and say her name, welcome your child when he returns home from school, and be the infinitely cheerful greeter at your next party or office event. (Well, she may need a charge after a few hours but don’t we all?)
Alternatively, you can opt to have Misty move away from noise, which also adds to her personality. Just as a human or an animal would likely move away from a loud noise because it might present a threat to their safety, you can program Misty to move away when she detects a potentially threatening noise. Some examples include a loud coffee grinder, a slamming door, a barking dog, or a vacuum (hey, that could be scary for a 14-inch robot!). A slight change to the Move to Sound Misty Skill code and voilà, you’ve just created another skill.
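As a rough illustration of how small that change can be, the sketch below picks a heading toward a quiet sound and directly away from a loud one. It is not taken from the Move to Sound skill. The function names, the loudness input, and the 80 dB threshold are all hypothetical, and a real skill would need to get bearing and loudness from whatever data the robot actually reports.

```python
def normalize_deg(angle_deg):
    """Wrap any angle in degrees into the range [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def reaction_heading(sound_bearing_deg, loudness_db, startle_threshold_db=80.0):
    """Choose the heading to turn to, relative to the robot's current facing.

    Sounds below the (hypothetical) startle threshold are approached, so the
    heading is the sound's own bearing. Louder sounds are avoided, so the
    heading is flipped by 180 degrees to point directly away.
    """
    if loudness_db >= startle_threshold_db:
        return normalize_deg(sound_bearing_deg + 180.0)
    return normalize_deg(sound_bearing_deg)
```

A voice at 90 degrees and 60 dB yields a heading of 90.0 (turn toward it), while a vacuum at the same bearing and 95 dB yields -90.0 (turn away).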
Building the first professional platform robot for developers requires serious technology. Misty’s three far-field microphones with Qualcomm® Fluence™ PRO, her dual Qualcomm® Snapdragon processors, 4K camera, and all of the other features packed neatly into this six-pound robot ensure that the ideas you have for her can be executed successfully. All she needs now is your code.
Please share your ideas and the skills you’re building in our Community Forum, and find Misty Skills that others have already built!