Overview
- Columbia Engineering’s Creative Machines Lab first trained the system with mirror-based vision‑to‑action learning, then used hours of human video to associate audio with lip shapes (a simplified sketch of this audio‑to‑lip mapping follows the list).
- The peer‑reviewed study, led by Professor Hod Lipson with PhD researcher Yuhang Hu, is accompanied by a public demo of the robot speaking and singing.
- The robot performed a track from an AI‑generated debut album titled “hello world_,” demonstrating lip articulation synchronized with the song.
- Researchers note current shortcomings with plosive consonants like “B” and lip‑puckering sounds such as “W”; performance is expected to improve with continued training.
- The team says pairing the capability with conversational AI such as ChatGPT or Gemini could open up applications in entertainment, education, medicine, and elder care.
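
To make the first bullet concrete, here is a minimal sketch of the pipeline’s second stage: learning to predict lip shapes from audio, assuming paired training data of audio frames and lip-landmark positions extracted from human video. Everything in it (feature sizes, landmark count, the choice of a scikit-learn `MLPRegressor`) is an illustrative assumption, not the team’s implementation.

```python
# Minimal sketch of the second training stage described above: learning a
# mapping from audio features to lip shapes. This is NOT the lab's actual
# code; the data, feature sizes, and model choice here are all assumptions.
import numpy as np
import librosa
from sklearn.neural_network import MLPRegressor

N_MELS = 40        # mel bands per audio frame (assumed feature size)
N_LANDMARKS = 20   # hypothetical number of tracked lip landmarks

def audio_to_frames(wav_path: str) -> np.ndarray:
    """Turn a speech clip into per-frame log-mel feature vectors."""
    y, sr = librosa.load(wav_path, sr=16000)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS)
    return librosa.power_to_db(mel).T  # shape: (num_frames, N_MELS)

# Placeholder training data. In the study, pairs like these would come from
# hours of human video: audio frames aligned with lip-landmark positions.
X_train = np.random.randn(5000, N_MELS)           # audio features
y_train = np.random.randn(5000, N_LANDMARKS * 2)  # lip landmark (x, y) targets

# Small feed-forward regressor: one audio frame in, one lip shape out.
model = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=200)
model.fit(X_train, y_train)

# At runtime, predicted lip shapes would be handed to the motor controller
# learned in the first, mirror-based vision-to-action stage.
frames = np.random.randn(100, N_MELS)  # stand-in for audio_to_frames("clip.wav")
lip_shapes = model.predict(frames)     # (num_frames, N_LANDMARKS * 2)
```

A per-frame regressor like this ignores temporal context; a real system would more plausibly use a sequence model and output motor commands rather than raw landmarks, but the data flow is the same: audio features in, lip shape out.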