
Identifying co-referential names across large corpora

  • Stony Brook University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Scopus citations

Abstract

A single logical entity can be referred to by several different names over a large text corpus. We present our algorithm for finding all such co-reference sets in a large corpus. Our algorithm involves three steps: morphological similarity detection, contextual similarity analysis, and clustering. Finally, we present experimental results on a large corpus of real news text to analyze the performance of our techniques.
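
The three steps named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the string-similarity measure (`difflib.SequenceMatcher`), the cosine measure over bag-of-words contexts, the two thresholds, the connected-components grouping used in place of the authors' clustering step, and the function names themselves.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations
from math import sqrt


def morphological_similarity(a, b):
    # Step 1 (assumed measure): character-level similarity of two names.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def contextual_similarity(ctx_a, ctx_b):
    # Step 2 (assumed measure): cosine similarity of the word counts
    # observed around each name.
    dot = sum(ctx_a[w] * ctx_b[w] for w in ctx_a.keys() & ctx_b.keys())
    norm_a = sqrt(sum(v * v for v in ctx_a.values()))
    norm_b = sqrt(sum(v * v for v in ctx_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coreference_sets(contexts, morph_threshold=0.6, ctx_threshold=0.5):
    # Step 3 (assumed method): link every pair of names that passes both
    # tests, then return the connected components via union-find.
    parent = {name: name for name in contexts}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(contexts, 2):
        if morphological_similarity(a, b) < morph_threshold:
            continue  # names look too different; skip the context check
        if contextual_similarity(contexts[a], contexts[b]) >= ctx_threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in contexts:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy input: each name maps to counts of nearby words.
example_contexts = {
    "George Bush": Counter("president white house texas".split()),
    "George W. Bush": Counter("president white house iraq".split()),
    "Kate Bush": Counter("singer album music".split()),
}
example_sets = coreference_sets(example_contexts)
```

The sketch compares all pairs of names, which is quadratic in the number of names; a corpus-scale system would need a cheaper way to propose candidate pairs before the contextual check.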

Original language: English
Title of host publication: Combinatorial Pattern Matching - 17th Annual Symposium, CPM 2006, Proceedings
Publisher: Springer Verlag
Pages: 12-23
Number of pages: 12
ISBN (Print): 3540354557, 9783540354550
State: Published - 2006
Event: 17th Annual Symposium on Combinatorial Pattern Matching, CPM 2006 - Barcelona, Spain
Duration: Jul 5 2006 → Jul 7 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4009 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th Annual Symposium on Combinatorial Pattern Matching, CPM 2006
Country/Territory: Spain
City: Barcelona
Period: 07/5/06 → 07/7/06
