TY - GEN
T1 - Identifying Self-Disclosures of Use, Misuse and Addiction in Community-based Social Media Posts
AU - Yang, Chenghao
AU - Chakrabarty, Tuhin
AU - Hochstatter, Karli R.
AU - Slavin, Melissa N.
AU - El-Bassel, Nabila
AU - Muresan, Smaranda
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - In the last decade, the United States has lost more than 500,000 people from an overdose involving prescription and illicit opioids, making it a national public health emergency (USDHHS, 2017). Medical practitioners require robust and timely tools that can effectively identify at-risk patients. Community-based social media platforms such as Reddit allow self-disclosure for users to discuss otherwise sensitive drug-related behaviors. We present a moderate-size corpus of 2,500 opioid-related posts from various subreddits labeled with six different phases of opioid use: Medical Use, Misuse, Addiction, Recovery, Relapse, Not Using. For every post, we annotate span-level extractive explanations and crucially study their role both in annotation quality and model development. We evaluate several state-of-the-art models in a supervised, few-shot, or zero-shot setting. Experimental results and error analysis show that identifying the phases of opioid use disorder is highly contextual and challenging. However, we find that using explanations during modeling leads to a significant boost in classification accuracy, demonstrating their beneficial role in a high-stakes domain such as studying the opioid use disorder continuum.
UR - https://www.scopus.com/pages/publications/85197915576
U2 - 10.18653/v1/2024.findings-naacl.161
DO - 10.18653/v1/2024.findings-naacl.161
M3 - Conference contribution
AN - SCOPUS:85197915576
T3 - Findings of the Association for Computational Linguistics: NAACL 2024 - Findings
SP - 2507
EP - 2521
BT - Findings of the Association for Computational Linguistics
A2 - Duh, Kevin
A2 - Gomez, Helena
A2 - Bethard, Steven
PB - Association for Computational Linguistics (ACL)
T2 - 2024 Findings of the Association for Computational Linguistics: NAACL 2024
Y2 - 16 June 2024 through 21 June 2024
ER -