
HanDiffuser: Text-to-Image Generation with Realistic Hand Appearances

  • Supreeth Narasimhaswamy
  • Uttaran Bhattacharya
  • Xiang Chen
  • Ishita Dasgupta
  • Saayan Mitra
  • Minh Hoai
  • Stony Brook University
  • Adobe Systems Incorporated

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

24 Scopus citations

Abstract

Text-to-image generative models can generate high-quality humans, but realism is lost when generating hands. Common artifacts include irregular hand poses, shapes, incorrect numbers of fingers, and physically implausible finger orientations. To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings in the generative process. HanDiffuser consists of two components: a Text-to-Hand-Params diffusion model to generate SMPL-Body and MANO-Hand parameters from input text prompts, and a Text-Guided Hand-Params-to-Image diffusion model to synthesize images by conditioning on the prompts and hand parameters generated by the previous component. We incorporate multiple aspects of hand representation, including 3D shapes and joint-level finger positions, orientations and articulations, for robust learning and reliable performance during inference. We conduct extensive quantitative and qualitative experiments and perform user studies to demonstrate the efficacy of our method in generating images with high-quality hands. Project page: https://supreethn.github.io/research/handiffuser/index.html
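The two-component data flow described in the abstract can be sketched as follows. This is only an illustrative outline, not the authors' implementation: both stages are random-number stand-ins for the actual diffusion models, and every name here (`HandBodyParams`, `text_to_hand_params`, `hand_params_to_image`, `generate`) is hypothetical. The vector sizes follow the standard SMPL (72 pose, 10 shape) and MANO (48 pose including global orientation, 10 shape) parameterizations; whether HanDiffuser uses exactly these layouts is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class HandBodyParams:
    """Hypothetical container for the stage-1 output."""
    smpl_pose: list   # 72 values: 24 body joints x 3 axis-angle components
    smpl_shape: list  # 10 body shape coefficients
    mano_pose: list   # 48 values: 3 global orientation + 15 hand joints x 3
    mano_shape: list  # 10 hand shape coefficients


def text_to_hand_params(prompt, seed=0):
    """Stand-in for the Text-to-Hand-Params model.

    A real model would denoise parameters conditioned on the prompt;
    this stub ignores the prompt and draws Gaussian placeholders.
    """
    rng = random.Random(seed)

    def draw(n):
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    return HandBodyParams(draw(72), draw(10), draw(48), draw(10))


def hand_params_to_image(prompt, params, size=8, seed=0):
    """Stand-in for the Text-Guided Hand-Params-to-Image model.

    Returns a size x size grid of floats in place of an image. The grid
    is shifted by the mean hand pose only to show that stage 2 consumes
    the stage-1 output; the prompt is again ignored by this stub.
    """
    rng = random.Random(seed)
    bias = sum(params.mano_pose) / len(params.mano_pose)
    return [[rng.random() + bias for _ in range(size)] for _ in range(size)]


def generate(prompt, seed=0):
    """Chain the two stages: prompt -> hand/body parameters -> image."""
    params = text_to_hand_params(prompt, seed=seed)
    image = hand_params_to_image(prompt, params, seed=seed)
    return params, image


if __name__ == "__main__":
    params, image = generate("a person waving at the camera")
    print(len(params.mano_pose), len(image), len(image[0]))
```

The point of the sketch is the interface between the stages: the second stage receives both the original prompt and the parameters produced by the first, mirroring the conditioning described above.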

Original language: English
Title of host publication: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Publisher: IEEE Computer Society
Pages: 2468-2479
Number of pages: 12
ISBN (Electronic): 9798350353006
ISBN (Print): 9798350353006
State: Published - 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Seattle, United States
Duration: Jun 16 2024 to Jun 22 2024

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Country/Territory: United States
City: Seattle
Period: 06/16/24 to 06/22/24

Keywords

  • Diffusion
  • Hands
  • Humans
