Robots Learn to Speak Body Language

03 Aug 2017


By Alyssa Pagano

If your friend says she feels relaxed, but you see that her fists are clenched, you may doubt her sincerity. Robots, on the other hand, might take her word for it. Body language says a lot, but even with advances in computer vision and facial recognition technology, robots struggle to notice subtle body movement and can miss important social cues as a result.

Researchers at Carnegie Mellon University developed a body-tracking system that might help solve this problem. Called OpenPose, the system can track body movement, including hands and face, in real time. It uses computer vision and machine learning to process video frames, and can even keep track of multiple people simultaneously. This capability could ease human-robot interactions and pave the way for more interactive virtual and augmented reality as well as intuitive user interfaces.

One notable feature of the OpenPose system is that it can track not only a person’s head, torso, and limbs but also individual fingers. To do that, the researchers used CMU’s Panoptic Studio, a dome lined with 500 cameras, where they captured body poses at a variety of angles and then used those images to build a data set.

They then passed those images through what is called a keypoint detector to identify and label specific body parts. The software also learns to associate the body parts with individuals, so it knows, for example, that a particular person’s hand will always be close to his or her elbow. This makes it possible to track multiple people at once.
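To make the grouping idea concrete, here is a minimal sketch in Python. It is not OpenPose's actual association method, only a simplified nearest-neighbor stand-in for the intuition above that a person's hand stays close to his or her elbow. The function name `group_hands_by_elbow`, the `max_dist` threshold, and the pixel coordinates are all hypothetical.

```python
from math import dist


def group_hands_by_elbow(elbows, hands, max_dist=150.0):
    """Pair each person's elbow with the closest unclaimed hand.

    elbows:   dict mapping a person label to that person's elbow (x, y) in pixels
    hands:    list of detected hand (x, y) positions with no owner yet
    max_dist: hypothetical cutoff; pairs farther apart than this are rejected
    """
    # Every possible elbow-hand pairing, shortest distance first.
    candidates = sorted(
        (dist(elbow_xy, hand_xy), person, hand_idx)
        for person, elbow_xy in elbows.items()
        for hand_idx, hand_xy in enumerate(hands)
    )

    assigned = {}
    used_hands = set()
    for distance, person, hand_idx in candidates:
        if distance > max_dist:
            break  # all remaining candidates are even farther apart
        if person in assigned or hand_idx in used_hands:
            continue  # this elbow or this hand is already paired
        assigned[person] = hands[hand_idx]
        used_hands.add(hand_idx)
    return assigned


# Two people in one frame; the hand detections arrive in arbitrary order.
elbows = {"person_a": (100.0, 200.0), "person_b": (400.0, 210.0)}
hands = [(420.0, 260.0), (110.0, 250.0)]
pairs = group_hands_by_elbow(elbows, hands)
print(pairs)
```

Each hand ends up with the person whose elbow is nearest, which is what lets a tracker keep several people apart in a single frame.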

The images from the dome were captured in 2D, but the researchers triangulated the detected keypoints in 3D to help their body-tracking algorithms understand how each pose appears from different perspectives. With all of this data processed, the system can determine how the whole hand looks when it’s in a particular position, even if some fingers are obscured.
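Triangulation can be illustrated with a short Python sketch. It uses the generic midpoint method for two cameras, not necessarily the procedure the CMU researchers followed, and it assumes idealized pinhole cameras that all face the same direction. The focal length, image center, camera positions, and function names are hypothetical.

```python
FOCAL = 500.0          # assumed focal length in pixels
CX, CY = 320.0, 240.0  # assumed image center in pixels


def project(point, cam_center):
    """Project a 3D point into a camera at cam_center looking along +z."""
    x, y, z = (p - c for p, c in zip(point, cam_center))
    return (FOCAL * x / z + CX, FOCAL * y / z + CY)


def pixel_to_ray(pixel):
    """Direction of the viewing ray that passes through a pixel."""
    u, v = pixel
    return ((u - CX) / FOCAL, (v - CY) / FOCAL, 1.0)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate(cam1, pixel1, cam2, pixel2):
    """Return the midpoint of the shortest segment joining the two viewing rays."""
    d1, d2 = pixel_to_ray(pixel1), pixel_to_ray(pixel2)
    w = tuple(a - b for a, b in zip(cam1, cam2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are parallel; cannot triangulate")
    s = (b * e - c * d) / denom  # distance parameter along ray 1
    t = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = tuple(o + s * k for o, k in zip(cam1, d1))
    p2 = tuple(o + t * k for o, k in zip(cam2, d2))
    return tuple((m + n) / 2.0 for m, n in zip(p1, p2))


# A fingertip seen by two cameras placed one unit apart.
fingertip = (0.5, 0.2, 4.0)
cam_a, cam_b = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
estimate = triangulate(cam_a, project(fingertip, cam_a),
                       cam_b, project(fingertip, cam_b))
print(estimate)  # approximately (0.5, 0.2, 4.0)
```

With hundreds of cameras rather than two, the same keypoint can be recovered from whichever views see it clearly, which is one way a fingertip hidden in some images can still be placed in 3D.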

Now that the system has this data set to draw from, it can run with only one camera and a laptop. It no longer requires the camera-lined dome to determine body poses, making the technology mobile and accessible. The researchers have already released their code to the public to encourage experimentation.

They say this technology could be applied to all sorts of interactions between humans and machines. It could play a huge role in VR experiences, allowing finer detection of the user’s physical movement without any additional hardware, like stick-on sensors or gloves.

It could also facilitate more natural interactions with a home robot. You could tell your robot to “pick that up,” and it could immediately understand what you’re pointing at. By perceiving and interpreting your physical gestures, the robot may even learn to read your emotions. So when you’re silently crying with your face in your hands because a robot has taken your job, it might offer you a tissue.
