
A comparative study of demographic attribute inference in Twitter

  • Emory University

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

38 Scopus citations

Abstract

Social media platforms have become a major gateway for receiving and analyzing public opinion. Understanding users can provide invaluable contextual information about their social media posts and significantly improve traditional opinion analysis models. Demographic attributes, such as ethnicity, gender, and age, among others, have been extensively applied to characterize social media users. While studies have shown that user groups formed by demographic attributes can have coherent opinions towards political issues, these attributes are often not explicitly stated by users in their profiles. Previous work has demonstrated the effectiveness of different user signals, such as users' posts and names, in determining demographic attributes. Yet these efforts mostly evaluate linguistic signals from users' posts and train models on artificially balanced datasets. In this paper, we propose a comprehensive list of user signals: self-descriptions and posts aggregated from users' friends and followers, users' profile images, and users' names. We provide a side-by-side comparative study of these signals on the tasks of inferring three major demographic attributes, namely ethnicity, gender, and age. We use realistic unbalanced datasets, whose demographic makeup is similar to that of Twitter, for training models and for evaluation experiments. Our experiments indicate that self-descriptions provide the strongest signal for ethnicity and age inference and clearly improve the overall performance when combined with tweets. Profile images for gender inference have the highest precision score, with an overall score close to the best result in our setting. This suggests that signals in self-descriptions and profile images have the potential to facilitate demographic attribute inference in Twitter, and are promising for future investigation.

Original language: English
Title of host publication: Proceedings of the 9th International AAAI Conference on Web and Social Media, ICWSM 2015
Publisher: AAAI Press
Pages: 590-593
Number of pages: 4
Edition: 1
ISBN (Electronic): 9781577357339
State: Published - 2015
Event: 9th International AAAI Conference on Web and Social Media, ICWSM 2015 - Oxford, United Kingdom
Duration: May 26 2015 to May 29 2015

Publication series

Name: Proceedings of the 9th International Conference on Web and Social Media, ICWSM 2015
Number: 1
Volume: 9

Conference

Conference: 9th International AAAI Conference on Web and Social Media, ICWSM 2015
Country/Territory: United Kingdom
City: Oxford
Period: 05/26/15 to 05/29/15
