TY - GEN
T1 - Clustering high dimensional categorical data via topographical features
AU - Chen, Chao
AU - Quadrianto, Novi
N1 - Publisher Copyright:
© 2016 by the author(s).
PY - 2016
Y1 - 2016
N2 - Analysis of categorical data is a challenging task. In this paper, we propose to compute topographical features of high-dimensional categorical data. We propose an efficient algorithm to extract modes of the underlying distribution and their attractive basins. These topographical features provide a geometric view of the data and can be applied to visualization and clustering of real world challenging datasets. Experiments show that our principled method outperforms state-of-the-art clustering methods while also admits an embarrassingly parallel property.
AB - Analysis of categorical data is a challenging task. In this paper, we propose to compute topographical features of high-dimensional categorical data. We propose an efficient algorithm to extract modes of the underlying distribution and their attractive basins. These topographical features provide a geometric view of the data and can be applied to visualization and clustering of real world challenging datasets. Experiments show that our principled method outperforms state-of-the-art clustering methods while also admits an embarrassingly parallel property.
UR - https://www.scopus.com/pages/publications/84999027894
M3 - Conference contribution
AN - SCOPUS:84999027894
T3 - 33rd International Conference on Machine Learning, ICML 2016
SP - 4000
EP - 4008
BT - 33rd International Conference on Machine Learning, ICML 2016
A2 - Weinberger, Kilian Q.
A2 - Balcan, Maria Florina
PB - International Machine Learning Society (IMLS)
T2 - 33rd International Conference on Machine Learning, ICML 2016
Y2 - 19 June 2016 through 24 June 2016
ER -