
PAT: Geometry-Aware Hard-Label Black-Box Adversarial Attacks on Text

  • Muchao Ye
  • Jinghui Chen
  • Chenglin Miao
  • Han Liu
  • Ting Wang
  • Fenglong Ma
  • Pennsylvania State University
  • Iowa State University
  • Dalian University of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Scopus citations

Abstract

Despite a plethora of prior explorations, conducting text adversarial attacks in practical settings is still challenging with the following constraints: black box - the inner structure of the victim model is unknown; hard label - the attacker only has access to the top-1 prediction results; and semantic preservation - the perturbation needs to preserve the original semantics. In this paper, we present PAT, a novel adversarial attack method employed under all these constraints. Specifically, PAT explicitly models the adversarial and non-adversarial prototypes and incorporates them to measure semantic changes for replacement selection in the hard-label black-box setting to generate high-quality samples. In each iteration, PAT finds original words that can be replaced back and selects better candidate words for perturbed positions in a geometry-aware manner guided by this estimation, which maximally improves the perturbation construction and minimally impacts the original semantics. Extensive evaluation with benchmark datasets and state-of-the-art models shows that PAT outperforms existing text adversarial attacks in terms of both attack effectiveness and semantic preservation. Moreover, we validate the efficacy of PAT against industry-leading natural language processing platforms in real-world settings.
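The hard-label black-box setting and the "replace back" step mentioned in the abstract can be illustrated with a minimal sketch. It is not PAT itself: the prototype modeling and geometry-aware candidate selection are omitted, and the names `toy_victim` and `replace_back`, along with the keyword-based classifier, are hypothetical stand-ins chosen for illustration. The sketch only shows how an attacker who sees nothing but top-1 labels might restore original words while the text stays adversarial.

```python
from typing import Callable, List


def toy_victim(tokens: List[str]) -> int:
    """Hypothetical hard-label victim: exposes only a top-1 label (1 = positive)."""
    positive = {"good", "great", "excellent"}
    negative = {"bad", "awful", "poor"}
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return 1 if score > 0 else 0


def replace_back(
    original: List[str],
    adversarial: List[str],
    predict: Callable[[List[str]], int],
) -> List[str]:
    """Greedily restore original words while the top-1 label stays flipped.

    Assumption: both texts have the same length (word-level substitution only).
    """
    if len(original) != len(adversarial):
        raise ValueError("texts must be aligned word by word")
    true_label = predict(original)
    current = list(adversarial)
    if predict(current) == true_label:
        raise ValueError("starting text is not adversarial")

    for i, (orig_tok, adv_tok) in enumerate(zip(original, adversarial)):
        if orig_tok == adv_tok:
            continue  # position was never perturbed
        trial = current[:i] + [orig_tok] + current[i + 1:]
        # Only the predicted label is observable: no scores, no gradients.
        if predict(trial) != true_label:
            current = trial  # still adversarial, so keep the restored word
    return current


original = ["a", "great", "and", "excellent", "movie"]
adversarial = ["a", "fine", "and", "decent", "film"]
reduced = replace_back(original, adversarial, toy_victim)
print(reduced)
```

In this toy run, only "movie" can be restored without flipping the label back, so the number of perturbed positions drops from three to two. A method such as PAT would additionally choose which positions to restore and which candidate words to substitute using its estimate of semantic change, which this sketch does not attempt.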

Original language: English
Title of host publication: KDD 2023 - Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 3093-3104
Number of pages: 12
ISBN (Electronic): 9798400701030
State: Published - Aug 4 2023
Event: 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023 - Long Beach, United States
Duration: Aug 6 2023 to Aug 10 2023

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
ISSN (Print): 2154-817X

Conference

Conference: 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023
Country/Territory: United States
City: Long Beach
Period: 08/6/23 to 08/10/23

Keywords

  • hard-label adversarial attack
  • robustness of language model
