TY - GEN
T1 - Efficiently Inferring Top-k Elephant Flows based on Discrete Tensor Completion
AU - Xie, Kun
AU - Tian, Jiazheng
AU - Wang, Xin
AU - Xie, Gaogang
AU - Wen, Jigang
AU - Zhang, Dafang
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/4
Y1 - 2019/4
AB - Finding top-k elephant flows is a critical task in network measurement, with applications such as congestion control, anomaly detection, and traffic engineering. The traditional top-k flow detection problem focuses on using a small amount of memory to measure the total number of packets or bytes of each flow. Instead, we study the challenging problem of inferring the top-k elephant flows in a practical system where measurement data are incomplete as a result of sub-sampling for scalability or data loss. A recent study shows that missing data can be interpolated more accurately with a 3-D tensor than with a 2-D matrix. Taking full advantage of the multilinear structure, we apply tensor completion to first recover the missing data and then find the top-k elephant flows. To reduce the computational overhead, we propose a novel discrete tensor completion model that uses binary codes to represent the factor matrices. Based on the model, we further propose three novel techniques to speed up the whole top-k flow inference process: a discrete optimization algorithm to train the binary factor matrices, bit operations to infer missing data quickly, and binary code partition to simplify finding the top-k elephant flows. In our discrete tensor completion model, each entry in the factor matrices needs only one bit instead of the real value (32 bits) needed in a traditional tensor completion model, so the storage cost is reduced significantly. Extensive experiments using two real traces demonstrate that, compared with state-of-the-art tensor completion algorithms, our discrete tensor completion algorithm achieves similar data inference accuracy using significantly less time and storage space.
KW - Tensor completion
KW - Top-k elephant flow inference
UR - https://www.scopus.com/pages/publications/85068219522
U2 - 10.1109/INFOCOM.2019.8737482
DO - 10.1109/INFOCOM.2019.8737482
M3 - Conference contribution
AN - SCOPUS:85068219522
T3 - Proceedings - IEEE INFOCOM
SP - 2170
EP - 2178
BT - INFOCOM 2019 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Conference on Computer Communications, INFOCOM 2019
Y2 - 29 April 2019 through 2 May 2019
ER -