An Adversarial Neuro-Tensorial Approach for Learning Disentangled Representations

  • Mengjiao Wang
  • Zhixin Shu
  • Shiyang Cheng
  • Yannis Panagakis
  • Dimitris Samaras
  • Stefanos Zafeiriou

  • Imperial College London
  • Stony Brook University
  • Middlesex University

Research output: Contribution to journal › Article › peer-review

19 Scopus citations

Abstract

Several factors contribute to the appearance of an object in a visual scene, including pose, illumination, and deformation, among others. Each factor accounts for a source of variability in the data, while the multiplicative interactions of these factors emulate the entangled variability, giving rise to the rich structure of visual object appearance. Disentangling such unobserved factors from visual data is a challenging task, especially when the data have been captured in uncontrolled recording conditions (also referred to as “in-the-wild”) and label information is not available. In this paper, we propose a pseudo-supervised deep learning method for disentangling multiple latent factors of variation in face images captured in-the-wild. To this end, we propose a deep latent variable model, where the multiplicative interactions of multiple latent factors of variation are explicitly modelled by means of multilinear (tensor) structure. We demonstrate that the proposed approach indeed learns disentangled representations of facial expressions and pose, which can be used in various applications, including face editing, as well as 3D face reconstruction and classification of facial expression, identity and pose.
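
As a rough illustration of what "multiplicative interactions" expressed through a multilinear (tensor) structure can look like, the NumPy sketch below decodes an observation from two latent factor vectors through a third-order tensor. It is not the paper's model or code: the function `multilinear_decode`, the names `core`, `z_expression` and `z_pose`, and all dimensions are hypothetical choices made for this example only.

```python
import numpy as np


def multilinear_decode(core, z_a, z_b):
    """Combine two latent factors multiplicatively through a 3rd-order tensor.

    Computes x[d] = sum_{i,j} core[d, i, j] * z_a[i] * z_b[j].
    The output is linear in z_a when z_b is fixed, and vice versa.
    """
    return np.einsum("dij,i,j->d", core, z_a, z_b)


# Hypothetical sizes: a 6-dimensional observation and two small latent factors.
rng = np.random.default_rng(0)
d, k_expression, k_pose = 6, 3, 2

core = rng.standard_normal((d, k_expression, k_pose))
z_expression = rng.standard_normal(k_expression)
z_pose = rng.standard_normal(k_pose)

x = multilinear_decode(core, z_expression, z_pose)

# Same computation in matrix form: unfold the tensor along its first mode
# and multiply by the Kronecker product of the two factor vectors.
unfolded = core.reshape(d, k_expression * k_pose)
x_kron = unfolded @ np.kron(z_expression, z_pose)

assert np.allclose(x, x_kron)
```

In this toy setting, changing `z_pose` while holding `z_expression` fixed alters the output through a linear map determined by `z_expression`, which is the sense in which the factors interact multiplicatively rather than additively.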

Original language: English
Pages (from-to): 743-762
Number of pages: 20
Journal: International Journal of Computer Vision
Volume: 127
Issue number: 6-7
State: Published - Jun 1 2019

Keywords

  • Adversarial autoencoder
  • Disentangled representation
  • Tensor decomposition
