TY - GEN
T1 - MI-NeRF
T2 - Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024
AU - Chatziagapi, Aggelina
AU - Chrysos, Grigorios G.
AU - Samaras, Dimitris
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
AB - NeRFs have shown remarkable results in modeling the 4D dynamics and appearance of human faces. However, they require per-identity optimization. A crucial step towards building foundation models for humans would be to learn a unified representation for multiple subjects. In this work, we introduce MI-NeRF (multi-identity NeRF), a single network that models complex non-rigid facial motion for multiple identities, using only monocular videos. The core premise in our method is to learn the non-linear interactions between identity and non-identity specific information with a multiplicative module. We present an extensive study of different variants of our proposed module and their technical derivations. We demonstrate results for both facial expression transfer and talking face video synthesis. By training on multiple videos simultaneously, MI-NeRF not only reduces the total training time compared to standard single-identity NeRFs, but also demonstrates robustness in synthesizing novel expressions for any input identity. Our method can be further personalized for a target identity given only a short video. Project page: https://aggelinacha.github.io/MI-NeRF/.
KW - Face Representation
KW - Multiple Identities
KW - Neural Radiance Fields
UR - https://www.scopus.com/pages/publications/105006937240
DO - 10.1007/978-3-031-92591-7_30
M3 - Conference contribution
AN - SCOPUS:105006937240
SN - 9783031925900
T3 - Lecture Notes in Computer Science
SP - 451
EP - 469
BT - Computer Vision – ECCV 2024 Workshops, Proceedings
A2 - Del Bue, Alessio
A2 - Canton, Cristian
A2 - Pont-Tuset, Jordi
A2 - Tommasi, Tatiana
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 29 September 2024 through 4 October 2024
ER -