CuTeX: A system for extracting data from text tables

  • Stony Brook University

Research output: Contribution to journal › Conference article › peer-review

Abstract

A system for extracting data from irregular text tables is designed and implemented. This system, CuTeX, is based on an association between every item in a column. It is implemented in Java and is approximately 3000 lines of code. The system automatically partitions the set of input text tables into directories containing correct and incorrect extractions. This paper focuses on demonstrating the robustness of the clustering algorithm and the iterative process of improving its extraction yield.

Original language: English
Pages (from-to): 457
Number of pages: 1
Journal: SIGIR Forum (ACM Special Interest Group on Information Retrieval)
State: Published - 2002
Event: Proceedings of the Twenty-Fifth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - Tampere, Finland
Duration: Aug 11 2002 to Aug 15 2002
