Columbia Robot Learns Realistic Lip Sync From Mirrors and YouTube

A peer-reviewed study details a flexible 26‑motor face designed to make human–robot interaction feel more natural.

Overview

  • Columbia Engineering’s Creative Machines Lab reported the work in Science Robotics, describing a robotic head called EMO that learns facial control.
  • EMO first built a vision-to-action model by watching its own reflection, then mapped audio to mouth motions after studying hours of video of humans speaking and singing (a sketch of this two-stage pipeline follows the list).
  • The flexible silicone face is driven by 26 internal motors that coordinate nuanced lip shapes synchronized with synthetic speech.
  • Demonstrations show EMO forming words in multiple languages and performing a track from an AI‑generated debut album titled “hello world_.”
  • Researchers note current weaknesses with sounds such as “B” and “W” and expect them to improve with more training.
  • They point to applications pairing the face with conversational AI in education, entertainment, healthcare, and elder care, as some economists predict large‑scale humanoid production.
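
The bullets above describe a two-stage approach: a self-supervised “vision-to-action” inverse model learned from mirror self-observation, followed by an audio-to-face-motion model learned from human video. The sketch below is a minimal illustration of that idea, not the authors’ code; the tensor shapes (68 facial landmarks, 80 mel-spectrogram bins) and the network architectures are assumptions for illustration, with only the 26-motor count taken from the article.

    # Minimal sketch of the two-stage pipeline (assumed shapes and models).
    import torch
    import torch.nn as nn

    NUM_MOTORS = 26        # motor count reported in the article
    LANDMARK_DIM = 2 * 68  # assumption: 68 (x, y) facial landmarks per frame
    AUDIO_DIM = 80         # assumption: 80 mel-spectrogram bins per audio frame

    # Stage 1: inverse model from mirror self-observation. The robot issues
    # motor commands, watches its reflection, and records (landmarks, commands)
    # pairs, so it can learn a landmarks -> commands mapping.
    inverse_model = nn.Sequential(
        nn.Linear(LANDMARK_DIM, 256), nn.ReLU(),
        nn.Linear(256, NUM_MOTORS), nn.Tanh(),  # commands normalized to [-1, 1]
    )

    # Stage 2: audio-to-landmark model trained on videos of humans speaking
    # and singing: predict the facial landmarks that accompany the audio.
    audio_encoder = nn.GRU(input_size=AUDIO_DIM, hidden_size=256, batch_first=True)
    landmark_head = nn.Linear(256, LANDMARK_DIM)

    def lip_sync(audio_features: torch.Tensor) -> torch.Tensor:
        """Map (batch, time, AUDIO_DIM) audio features to (batch, time,
        NUM_MOTORS) motor commands by chaining the two learned models."""
        hidden, _ = audio_encoder(audio_features)
        landmarks = landmark_head(hidden)  # human-like landmark trajectory
        return inverse_model(landmarks)    # motor commands realizing it

    # Usage with stand-in data: one clip, 100 audio frames.
    commands = lip_sync(torch.randn(1, 100, AUDIO_DIM))
    print(commands.shape)  # torch.Size([1, 100, 26])

One appeal of chaining the stages this way is that no paired audio-to-motor data is ever needed: the mirror stage grounds facial landmarks in the robot’s motor space, while the video stage grounds audio in landmark space.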