
M4X: Enhancing Cross-View Generalizability in RF-Based Human Activity Recognition by Exploiting Synthetic Data in Metric Learning

  • Mengjing Liu
  • Zongxing Xie
  • Fan Ye
  • Stony Brook University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Scopus citations

Abstract

Human activity recognition provides insights into physical and mental well-being by monitoring patterns of movement and behavior, facilitating personalized interventions and proactive health management. Radio Frequency (RF)-based human activity recognition (HAR) is gaining attention due to its lower privacy exposure and non-contact characteristics. However, it suffers from data scarcity and is sensitive to environment changes. Collecting and labeling such data is labor-intensive and time-consuming. The limited training data makes generalization challenging when the sensor is deployed at a very different relative view in the real world. Synthetic data generation from abundant videos has the potential to address data scarcity, yet the domain gaps between synthetic and real data constrain its benefit. In this paper, we first share our investigations and insights on the intrinsic limitations of existing video-based data synthesis methods. We then present M4X, a method that uses metric learning to extract effective view-independent features from the more abundant synthetic data despite their domain gaps, thus enhancing cross-view generalizability. We explore two main design issues: different mining strategies for constructing contrastive pairs/triplets, and different forms of loss functions. We find that the best choices are offline triplet mining with real data as anchors, balanced triplets, and a triplet loss function without hard negative mining for higher discriminative power. Comprehensive experiments show that M4X consistently outperforms baseline methods in cross-view generalizability. In the most challenging case of the least amount of real training data, M4X outperforms three baselines by 7.9-16.5% on all views, and by 18.9-25.6% on a view with only synthetic but no real data during training. This demonstrates its effectiveness in extracting view-independent features from synthetic data despite their domain gaps.
We also observe that given limited sensor deployments, a participant-facing viewpoint and another at a large angle (e.g. 60°) tend to produce much better performance.
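
The abstract names the design choices (offline triplet mining, real data as anchors, balanced triplets, no hard negative mining) without giving formulas. The following is a minimal sketch of how such a setup could look, not the authors' implementation. It assumes a standard Euclidean triplet margin loss, assumes positives and negatives are drawn from the synthetic set, and reads "balanced" as "the same number of triplets per activity class". The function names, the margin value, and the toy data are all hypothetical.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss, averaged over a batch of embeddings."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def mine_triplets_offline(real_labels, syn_labels, per_class, rng):
    """Build (real anchor, synthetic positive, synthetic negative) index
    triplets before training, with the same count for every class.
    Negatives are sampled uniformly at random (no hard negative mining)."""
    triplets = []
    for c in np.unique(real_labels):
        anchors = np.flatnonzero(real_labels == c)
        positives = np.flatnonzero(syn_labels == c)
        negatives = np.flatnonzero(syn_labels != c)
        if len(positives) == 0 or len(negatives) == 0:
            continue  # class has no usable synthetic samples
        for _ in range(per_class):
            triplets.append(
                (rng.choice(anchors), rng.choice(positives), rng.choice(negatives))
            )
    return np.array(triplets, dtype=int)


# Toy usage: random vectors stand in for embeddings of real and synthetic RF samples.
rng = np.random.default_rng(0)
real_emb = rng.normal(size=(6, 8))
real_labels = np.array([0, 0, 1, 1, 2, 2])
syn_emb = rng.normal(size=(30, 8))
syn_labels = np.repeat([0, 1, 2], 10)

triplets = mine_triplets_offline(real_labels, syn_labels, per_class=4, rng=rng)
a, p, n = triplets.T
loss = triplet_loss(real_emb[a], syn_emb[p], syn_emb[n])
```

In a real pipeline the embeddings would come from a trainable encoder and the loss would be minimized by gradient descent; the sketch only illustrates how the triplets are assembled and scored.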

Original language: English
Title of host publication: Proceedings - 2024 IEEE/ACM Conference on Connected Health
Subtitle of host publication: Applications, Systems and Engineering Technologies, CHASE 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 49-60
Number of pages: 12
ISBN (Electronic): 9798350345018
DOIs
State: Published - 2024
Event: 2024 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies, CHASE 2024 - Wilmington, United States
Duration: Jun 19 2024 - Jun 21 2024

Publication series

Name: Proceedings - 2024 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies, CHASE 2024

Conference

Conference: 2024 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies, CHASE 2024
Country/Territory: United States
City: Wilmington
Period: 06/19/24 - 06/21/24

Keywords

  • Cross-view Generalizability
  • Human Activity Recognition
  • Metric Learning
  • Radio Frequency
