FALCON: Fair Active Learning for Content Moderation

  • Zuhui Wang
  • Sandra Sajeev
  • Gaurav Mittal
  • Matthew Hall
  • Ye Yu
  • Zhaozheng Yin
  • Mei Chen
  • Stony Brook University
  • Microsoft USA

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Content moderation is the task of filtering inappropriate content (e.g., rude, hateful, or toxic posts) on online platforms. Deep learning models have been developed to address this task; however, they are prone to making unfair decisions for underrepresented groups such as racial minorities. Most popular methods for improving fairness focus only on single-group, single-class bias, while multi-group and multi-class biases are prevalent and challenging in content moderation. In this paper, we present a novel framework, Fair Active Learning for CONtent moderation (FALCON), that helps mitigate multi-group and multi-class biases simultaneously while maintaining performance. We present a novel group-aware sample selection algorithm to actively select a subset of the entire dataset for training, and novel augmented uncertainty information that improves the query sample selection strategy by considering group fairness levels. We validate FALCON using multiple fairness evaluation metrics on three public datasets, including the Jigsaw Unintended Bias dataset. Our results show that FALCON maintains performance comparable to several bias mitigation methods while obtaining higher group fairness across multiple axes and datasets, as measured by an average improvement of 22.5% in demographic parity difference and 8.4% in equalized odds. Experiments on the Amazon Review dataset demonstrate the general applicability of FALCON beyond content moderation datasets. Warning: some content in this paper may be harmful, racist, and inappropriate.
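The abstract reports results in terms of demographic parity difference and equalized odds. As a rough illustration of what such metrics measure, the sketch below computes both for binary predictions over several groups. It follows one common definition (the largest between-group gap in positive-prediction rate, and the larger of the between-group gaps in true-positive and false-positive rate); this is an assumption for illustration, not necessarily the exact formulation used in the paper. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def _rate(flags):
    """Fraction of True values in a list; 0.0 for an empty list."""
    return sum(flags) / len(flags) if flags else 0.0


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    by_group = defaultdict(list)
    for pred, group in zip(y_pred, groups):
        by_group[group].append(pred == 1)
    rates = [_rate(flags) for flags in by_group.values()]
    return max(rates) - min(rates)


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the between-group gaps in true-positive and false-positive rate."""
    tpr, fpr = defaultdict(list), defaultdict(list)
    for truth, pred, group in zip(y_true, y_pred, groups):
        # Positives feed the TPR table, negatives feed the FPR table.
        (tpr if truth == 1 else fpr)[group].append(pred == 1)
    gaps = []
    for table in (tpr, fpr):
        rates = [_rate(flags) for flags in table.values()]
        gaps.append(max(rates) - min(rates) if rates else 0.0)
    return max(gaps)


# Toy data (hypothetical): 1 = flagged as inappropriate, two groups "a" and "b".
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

dpd = demographic_parity_difference(y_pred, groups)
eod = equalized_odds_difference(y_true, y_pred, groups)
print(f"demographic parity difference: {dpd:.2f}")
print(f"equalized odds difference: {eod:.2f}")
```

Both values are 0 when every group is treated identically under the respective criterion, and larger values indicate a larger disparity. Because each metric takes the maximum gap over all groups, the same functions work unchanged for more than two groups.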

Original language: English
Title of host publication: Computer Vision – ECCV 2024 Workshops, Proceedings
Editors: Alessio Del Bue, Cristian Canton, Jordi Pont-Tuset, Tatiana Tommasi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 1-17
Number of pages: 17
ISBN (Print): 9783031926471
DOIs
State: Published - 2025
Event: Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: Sep 29 2024 to Oct 4 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15643 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024
Country/Territory: Italy
City: Milan
Period: 09/29/24 to 10/4/24

Keywords

  • Active Learning
  • Content Moderation
  • Fairness
