Abstract
The commoditization of private data has become an attractive research topic with the emergence of the Big Data era. In this paper, we study the trading of high-dimensional private data with differential privacy guarantees. We propose Cheap, a novel Correlated data trading framework for High-dimEnsionAl Private data. Cheap first models data correlations among high-dimensional user attributes and builds an initial attribute clustering scheme. Building on this scheme, Cheap devises a novel data perturbation mechanism by solving the optimal attribute clustering (OAC) problem, which improves the utility of the traded data and yields a privacy-preserving high-dimensional dataset whose joint distribution is close to that of the original. Because the OAC problem is NP-hard, Cheap quantifies privacy loss based on a near-optimal attribute clustering scheme, and then compensates data owners cost-effectively by running an auction. We evaluate the performance of Cheap on the UserBehavior and Obesity datasets. Our evaluation and analysis demonstrate that Cheap balances data utility and privacy protection well, and achieves the desired economic properties of budget balance, individual rationality, and truthfulness.
| Original language | English |
|---|---|
| Pages (from-to) | 1047-1059 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Parallel and Distributed Systems |
| Volume | 34 |
| Issue number | 3 |
| State | Published - Mar 1 2023 |
Keywords
- Data correlation
- data privacy
- data trading
Fingerprint
Dive into the research topics of 'Towards Correlated Data Trading for High-Dimensional Private Data'. Together they form a unique fingerprint.