
Cluster appearance glyphs: A methodology for illustrating high-dimensional data patterns in 2-d data layouts

  • Stony Brook University

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Two-dimensional space embeddings such as Multi-Dimensional Scaling (MDS) are a popular means to gain insight into high-dimensional data relationships. However, in all but the simplest cases these embeddings suffer from significant distortions, which can lead to misinterpretations of the high-dimensional data. These distortions occur at both the global inter-cluster and the local intra-cluster levels. The former leads to misinterpretation of the distances between the various N-D cluster populations, while the latter hampers the appreciation of their individual shapes and composition, which we call cluster appearance. The distortion of cluster appearance incurred in the 2-D embedding is unavoidable, since such low-dimensional embeddings always come at the cost of some of the intra-cluster variance. In this paper, we propose techniques to overcome these limitations by conveying the N-D cluster appearance via a framework inspired by illustrative design. Here we make use of Scagnostics, which offers a set of intuitive feature descriptors to describe the appearance of 2-D scatterplots. We extend the Scagnostics analysis to N-D and then devise and test, via crowd-sourced user studies, a set of parameterizable texture patterns that map to the various Scagnostics descriptors. Finally, we embed these N-D Scagnostics-informed texture patterns into shapes derived from N-D statistics to yield what we call Cluster Appearance Glyphs. We demonstrate our framework with a dataset acquired to analyze program execution times in file systems.
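The abstract's point about lost intra-cluster variance can be illustrated with a minimal sketch. It is not the paper's method: it assumes a synthetic 5-D Gaussian cluster and uses PCA as a stand-in for the embedding (classical MDS on Euclidean distances yields the same 2-D layout as a PCA projection). The function name `retained_variance_2d` and the chosen axis scales are hypothetical, introduced only for this example.

```python
import numpy as np


def retained_variance_2d(points):
    """Fraction of a cluster's total variance kept by its best linear 2-D embedding."""
    centered = points - points.mean(axis=0)
    # Sample covariance matrix of the cluster (dimensions x dimensions).
    cov = centered.T @ centered / (len(points) - 1)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(cov)
    # The two largest eigenvalues are the variance captured by the 2-D projection.
    return eigvals[-2:].sum() / eigvals.sum()


rng = np.random.default_rng(0)
# Hypothetical cluster: 200 points in 5-D with a different spread along each axis.
cluster = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 1.0, 0.5])
frac = retained_variance_2d(cluster)
print(f"variance retained in 2-D: {frac:.2f}")
```

Any cluster that genuinely extends beyond two dimensions gives a fraction below 1 here; the remainder is the intra-cluster variance that the 2-D layout cannot show.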

Original language: English
Article number: 3
Journal: Information (Switzerland)
Volume: 13
Issue number: 1
State: Published - Jan 2022

Keywords

  • Glyphs
  • High-dimensional data
  • Visual analytics
