Skip to main navigation Skip to search Skip to main content

DocUNet: Document Image Unwarping via a Stacked U-Net

  • Stony Brook University
  • Megvii Technology Limited

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

154 Scopus citations

Abstract

Capturing document images is a common way for digitizing and recording physical documents due to the ubiquitousness of mobile cameras. To make text recognition easier, it is often desirable to digitally flatten a document image when the physical document sheet is folded or curved. In this paper, we develop the first learning-based method to achieve this goal. We propose a stacked U-Net [25] with intermediate supervision to directly predict the forward mapping from a distorted image to its rectified version. Because large-scale real-world data with ground truth deformation is difficult to obtain, we create a synthetic dataset with approximately 100 thousand images by warping non-distorted document images. The network is trained on this dataset with various data augmentations to improve its generalization ability. We further create a comprehensive benchmark1 that covers various real-world conditions. We evaluate the proposed model quantitatively and qualitatively on the proposed benchmark, and compare it with previous non-learning-based methods.

Original languageEnglish
Title of host publicationProceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
PublisherIEEE Computer Society
Pages4700-4709
Number of pages10
ISBN (Electronic)9781538664209
DOIs
StatePublished - Dec 14 2018
Event31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 - Salt Lake City, United States
Duration: Jun 18 2018Jun 22 2018

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print)1063-6919

Conference

Conference31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
Country/TerritoryUnited States
CitySalt Lake City
Period06/18/1806/22/18

Fingerprint

Dive into the research topics of 'DocUNet: Document Image Unwarping via a Stacked U-Net'. Together they form a unique fingerprint.

Cite this