TY - GEN
T1 - Fast low-rank matrix approximation with locality sensitive hashing for quick anomaly detection
AU - Xie, Gaogang
AU - Xie, Kun
AU - Huang, Jun
AU - Wang, Xin
AU - Chen, Yuxiang
AU - Wen, Jigang
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/10/2
Y1 - 2017/10/2
N2 - Detecting anomalous traffic is a critical task for advanced Internet management. The traditional approaches based on Principal Component Analysis (PCA) are effective only when the corruption is caused by small additive i.i.d. Gaussian noise. The recent Direct Robust Matrix Factorization (DRMF) is proven to be more robust and accurate in anomaly detection, but it incurs a high computation cost due to its need of singular value decomposition (SVD) for low-rank matrix approximation and the iterative use of SVD execution to find the final solution. To enable the anomaly detection for large traffic matrix with the use of DRMF, we formulate the low-rank matrix approximation problem as a problem of searching for the subspace to project the traffic matrix with the minimum error. We propose a novel approach, LSH-subspace, for fast low-rank matrix approximation. To facilitate the matrix partition for the quick search of the subspace, we propose several novel techniques: a multi-layer locality sensitive hashing (LSH) table to reorder the OD pairs based on LSH function, a partition principle to guide the partition to minimize the projection error, and a lightweight algorithm to exploit the sparsity of the outlier matrix to update the LSH table at low overhead. Our extensive simulations based on real trace data demonstrate that our LSH-subspace is 3 times faster than DRMF with high anomaly detection accuracy.
AB - Detecting anomalous traffic is a critical task for advanced Internet management. The traditional approaches based on Principal Component Analysis (PCA) are effective only when the corruption is caused by small additive i.i.d. Gaussian noise. The recent Direct Robust Matrix Factorization (DRMF) is proven to be more robust and accurate in anomaly detection, but it incurs a high computation cost due to its need of singular value decomposition (SVD) for low-rank matrix approximation and the iterative use of SVD execution to find the final solution. To enable the anomaly detection for large traffic matrix with the use of DRMF, we formulate the low-rank matrix approximation problem as a problem of searching for the subspace to project the traffic matrix with the minimum error. We propose a novel approach, LSH-subspace, for fast low-rank matrix approximation. To facilitate the matrix partition for the quick search of the subspace, we propose several novel techniques: a multi-layer locality sensitive hashing (LSH) table to reorder the OD pairs based on LSH function, a partition principle to guide the partition to minimize the projection error, and a lightweight algorithm to exploit the sparsity of the outlier matrix to update the LSH table at low overhead. Our extensive simulations based on real trace data demonstrate that our LSH-subspace is 3 times faster than DRMF with high anomaly detection accuracy.
KW - Anomaly Detection
KW - Low-Rank Matrix Approximation
UR - https://www.scopus.com/pages/publications/85034116998
U2 - 10.1109/INFOCOM.2017.8057217
DO - 10.1109/INFOCOM.2017.8057217
M3 - Conference contribution
AN - SCOPUS:85034116998
T3 - Proceedings - IEEE INFOCOM
BT - INFOCOM 2017 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE Conference on Computer Communications, INFOCOM 2017
Y2 - 1 May 2017 through 4 May 2017
ER -