TY - GEN
T1 - GFWeb
T2 - 33rd USENIX Security Symposium, USENIX Security 2024
AU - Hoang, Nguyen Phong
AU - Dalek, Jakub
AU - Crete-Nishihata, Masashi
AU - Christin, Nicolas
AU - Yegneswaran, Vinod
AU - Polychronakis, Michalis
AU - Feamster, Nick
N1 - Publisher Copyright: © USENIX Security Symposium 2024. All rights reserved.
PY - 2024
Y1 - 2024
AB - Censorship systems such as the Great Firewall (GFW) have been continuously refined to enhance their filtering capabilities. However, most prior studies, in particular those of the GFW, have been limited in scope and conducted over short time periods, leading to gaps in our understanding of the GFW's evolving Web censorship mechanisms. We introduce GFWeb, a novel system designed to discover domain blocklists used by the GFW for censoring Web access. GFWeb exploits the GFW's bidirectional and loss-tolerant blocking behavior to enable testing of hundreds of millions of domains on a monthly basis, thereby facilitating large-scale longitudinal measurement of HTTP and HTTPS blocking mechanisms. Over the course of 20 months, GFWeb has tested a total of 1.02 billion domains and detected 943K and 55K pay-level domains censored by the GFW's HTTP and HTTPS filters, respectively. To the best of our knowledge, our study presents the most extensive set of domains censored by the GFW discovered to date, many of which have never been detected by prior systems. Analyzing the longitudinal dataset collected by GFWeb, we observe that the GFW has been upgraded to mitigate several issues previously identified by the research community, including overblocking and failure to reassemble fragmented packets. More importantly, we discover that the GFW's bidirectional blocking is not symmetric as previously thought, i.e., it can only be triggered by certain domains when probed from inside the country. We discuss the implications of our work for existing censorship measurement and circumvention efforts. We hope that insights gained from our study can help inform future research, especially in monitoring censorship and developing new evasion tools.
UR - https://www.scopus.com/pages/publications/85199538234
M3 - Conference contribution
AN - SCOPUS:85199538234
T3 - Proceedings of the 33rd USENIX Security Symposium
SP - 2617
EP - 2633
BT - Proceedings of the 33rd USENIX Security Symposium
PB - USENIX Association
Y2 - 14 August 2024 through 16 August 2024
ER -