TY - GEN
T1 - Characterizing JSON traffic patterns on a CDN
AU - Vargas, Santiago
AU - Steiner, Moritz
AU - Goel, Utkarsh
AU - Balasubramanian, Aruna
N1 - Publisher Copyright: © 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 978-1-4503-6948-0/19/10...$15.00
PY - 2019/10/21
Y1 - 2019/10/21
N2 - Content delivery networks serve a major fraction of the Internet traffic, and their geographically deployed infrastructure makes them a good vantage point to observe traffic access patterns. We perform a large-scale investigation to characterize Web traffic patterns observed from a major CDN infrastructure. Specifically, we discover that responses with application/json content-type form a growing majority of all HTTP requests. As a result, we seek to understand what types of devices and applications are requesting JSON objects and explore opportunities to optimize CDN delivery of JSON traffic. Our study shows that mobile applications account for at least 52% of JSON traffic on the CDN and embedded devices account for another 12% of all JSON traffic. We also find that more than 55% of JSON traffic on the CDN is uncacheable, showing that a large portion of JSON traffic on the CDN is dynamic. By further looking at patterns of periodicity in requests, we find that 6.3% of JSON traffic is periodically requested and reflects the use of (partially) autonomous software systems, IoT devices, and other kinds of machine-to-machine communication. Finally, we explore dependencies in JSON traffic through the lens of ngram models and find that these models can capture patterns between subsequent requests. We can potentially leverage this to prefetch requests, improving the cache hit ratio.
KW - Content Delivery Networks (CDNs)
KW - JSON
KW - Web
UR - https://www.scopus.com/pages/publications/85074841368
U2 - 10.1145/3355369.3355594
DO - 10.1145/3355369.3355594
M3 - Conference contribution
AN - SCOPUS:85074841368
T3 - Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC
SP - 195
EP - 201
BT - IMC 2019 - Proceedings of the 2019 ACM Internet Measurement Conference
PB - Association for Computing Machinery
T2 - 19th ACM Internet Measurement Conference, IMC 2019
Y2 - 21 October 2019 through 23 October 2019
ER -