
LipNeRF: What is the right feature space to lip-sync a NeRF?

  • Aggelina Chatziagapi
  • Shah Rukh Athar
  • Abhinav Jain
  • M. V. Rohith
  • Vimal Bhat
  • Dimitris Samaras
  • Stony Brook University
  • Amazon Prime Video

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

26 Scopus citations

Abstract

Synthesizing high-fidelity talking head videos of an arbitrary identity, lip-synced to a target speech segment, is a challenging problem. Recent GAN-based methods succeed by training a model on a large number of videos, allowing the generator to learn a variety of audio-lip representations. However, they are unable to handle head pose changes. On the other hand, Neural Radiance Fields (NeRFs) model the 3D face geometry more accurately. Current audio-conditioned NeRFs are not as good at lip synchronization as GANs, since they are trained on limited video data of a single identity. In this work, we propose LipNeRF, a lip-syncing NeRF that bridges the gap between the accurate lip synchronization of GAN-based methods and the accurate 3D face modeling of NeRFs. LipNeRF is conditioned on the expression space of a 3DMM, instead of the audio feature space. We experimentally demonstrate that the expression space gives a better representation of the lip shape than the audio feature space. LipNeRF shows a significant improvement in lip-sync quality over the current state-of-the-art, especially in high-definition videos of cinematic content, with challenging pose, illumination, and expression variations.

Original language: English
Title of host publication: 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition, FG 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350345445
State: Published - 2023
Event: 17th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2023 - Waikoloa Beach, United States
Duration: Jan 5 2023 - Jan 8 2023

Publication series

Name: 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition, FG 2023

Conference

Conference: 17th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2023
Country/Territory: United States
City: Waikoloa Beach
Period: 01/5/23 - 01/8/23
