
Vector-based similarity measurements for historical figures

  • Stony Brook University

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

7 Scopus citations

Abstract

Historical interpretation benefits from identifying analogies among famous people: Who are the Lincolns, Einsteins, Hitlers, and Mozarts? We investigate several approaches to convert approximately 600,000 historical figures into vector representations to quantify similarity according to their Wikipedia pages. We adopt an effective reference standard based on the number of human-annotated Wikipedia categories being shared and use this to demonstrate the performance of our similarity detection algorithms. In particular, we investigate four different unsupervised approaches to representing the semantic associations of individuals: (1) TF-IDF, (2) Weighted average of distributed word embedding, (3) LDA Topic analysis and (4) Deepwalk embedding from page links. All proved effective, but Deepwalk embedding yielded an overall accuracy of 91.33% in our evaluation to uncover historical analogies. Combining LDA and Deepwalk yielded even higher performance.
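As a rough illustration of the first approach (TF-IDF) and of the category-based reference standard, the following minimal sketch builds sparse TF-IDF vectors for a few toy documents, compares them with cosine similarity, and counts shared categories. The abstract does not state the exact weighting scheme or similarity measure, so the plain `tf * log(N/df)` weighting and the use of cosine similarity are assumptions here, and the function names, placeholder figures, and texts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each name to a sparse TF-IDF vector (dict of term -> weight).

    Assumed weighting: relative term frequency times log(N / document frequency).
    """
    n_docs = len(docs)
    tokens = {name: text.lower().split() for name, text in docs.items()}

    # Document frequency: number of documents containing each term.
    df = Counter()
    for words in tokens.values():
        df.update(set(words))

    vectors = {}
    for name, words in tokens.items():
        tf = Counter(words)
        total = len(words)
        vectors[name] = (
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
            if total
            else {}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def shared_categories(cats_a, cats_b):
    """Reference-standard signal: number of categories two pages have in common."""
    return len(set(cats_a) & set(cats_b))


# Hypothetical toy pages standing in for Wikipedia article text.
docs = {
    "physicist_a": "physicist theory relativity physics",
    "physicist_b": "physicist quantum theory physics",
    "composer_a": "composer symphony opera music",
}
vecs = tfidf_vectors(docs)
sim_close = cosine(vecs["physicist_a"], vecs["physicist_b"])
sim_far = cosine(vecs["physicist_a"], vecs["composer_a"])
```

In this toy setup the two physicist pages share vocabulary and score higher than the physicist and composer pair, which have no terms in common. A pair's similarity score could then be checked against `shared_categories` for the same two pages, in the spirit of the evaluation the abstract describes.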

Original language: English
Title of host publication: Similarity Search and Applications - 8th International Conference, SISAP 2015, Proceedings
Editors: Richard Connor, Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro
Publisher: Springer Verlag
Pages: 179-190
Number of pages: 12
ISBN (Print): 9783319250861
DOIs
State: Published - 2015
Event: 8th International Conference on Similarity Search and Applications, SISAP 2015 - Glasgow, United Kingdom
Duration: Oct 12 2015 to Oct 14 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9371
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th International Conference on Similarity Search and Applications, SISAP 2015
Country/Territory: United Kingdom
City: Glasgow
Period: 10/12/15 to 10/14/15

Keywords

  • Deepwalk
  • People similarity
  • Vector representations
