Result: Using Large Language Models for sentiment analysis of health-related social media data: empirical evaluation and practical tips.

Title:
Using Large Language Models for sentiment analysis of health-related social media data: empirical evaluation and practical tips.
Authors:
He L; Joseph J. Zilber College of Public Health, University of Wisconsin-Milwaukee, USA.
Omranian S; Department of Computer Science, University of Wisconsin-Milwaukee, USA.
McRoy S; Department of Computer Science, University of Wisconsin-Milwaukee, USA.
Zheng K; Department of Informatics, Donald Bren School of Information and Computer Science, University of California, Irvine, USA.
Source:
AMIA ... Annual Symposium proceedings. AMIA Symposium [AMIA Annu Symp Proc] 2025 May 22; Vol. 2024, pp. 503-512. Date of Electronic Publication: 2025 May 22 (Print Publication: 2024).
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: American Medical Informatics Association; Country of Publication: United States; NLM ID: 101209213; Publication Model: eCollection; Cited Medium: Internet; ISSN: 1942-597X (Electronic); Linking ISSN: 1559-4076; NLM ISO Abbreviation: AMIA Annu Symp Proc; Subsets: MEDLINE
Imprint Name(s):
Original Publication: Bethesda, MD : American Medical Informatics Association, c2003-
References:
J Am Med Inform Assoc. 2021 Jun 12;28(6):1125-1134. (PMID: 33355353)
JMIR Ment Health. 2024 Jan 25;11:e50150. (PMID: 38271138)
J Med Internet Res. 2024 Jan 30;26:e51069. (PMID: 38289662)
J Am Med Inform Assoc. 2024 Sep 1;31(9):1812-1820. (PMID: 38281112)
J Biomed Inform. 2022 Aug;132:104142. (PMID: 35835437)
Brief Bioinform. 2023 Nov 22;25(1):. (PMID: 38168838)
AMIA Annu Symp Proc. 2021 Jan 25;2020:544-553. (PMID: 33936428)
J Am Med Inform Assoc. 2019 Nov 1;26(11):1297-1304. (PMID: 31265066)
NPJ Digit Med. 2024 Jan 11;7(1):6. (PMID: 38200151)
AMIA Annu Symp Proc. 2020 Mar 04;2019:607-616. (PMID: 32308855)
J Med Internet Res. 2017 May 26;19(5):e167. (PMID: 28550002)
J Am Med Inform Assoc. 2021 Jul 14;28(7):1564-1573. (PMID: 33690794)
J Biomed Semantics. 2017 Mar 3;8(1):9. (PMID: 28253919)
J Med Internet Res. 2024 Mar 1;26:e49139. (PMID: 38427404)
Entry Date(s):
Date Created: 20250526; Date Completed: 20250526; Latest Revision: 20250527
Update Code:
20250527
PubMed Central ID:
PMC12099340
PMID:
40417506
Database:
MEDLINE

Further Information

Health-related social media data generated by patients and the public provide valuable insights into patient experiences and opinions toward health issues such as vaccination and medical treatments. Analyzing such data with Natural Language Processing (NLP) methods, however, often requires high-quality annotations that are difficult to obtain. Recently emerged Large Language Models (LLMs) such as the Generative Pre-trained Transformers (GPTs) have shown promising performance on a variety of NLP tasks in the health domain with little to no annotated data, yet their potential for analyzing health-related social media data remains underexplored. In this paper, we report empirical evaluations of LLMs (GPT-3.5-Turbo, FLAN-T5, and BERT-based models) on a common NLP task for health-related social media data: sentiment analysis for identifying opinions toward health issues. We explored how different prompting and fine-tuning strategies affect the performance of LLMs on social media datasets across diverse health topics, including healthcare reform, vaccination, mask wearing, and healthcare service quality. We found that LLMs outperformed VADER, a widely used off-the-shelf sentiment analysis tool, but remained far from producing accurate sentiment labels. Their performance can be improved, however, by data-specific prompts that include information about the context, task, and targets. The highest-performing LLMs were BERT-based models fine-tuned on aggregated data. We provide practical tips for researchers on using LLMs with health-related social media data for optimal outcomes. We also discuss the future work needed to keep improving the performance of LLMs in analyzing health-related social media data with minimal annotations.
(©2024 AMIA - All rights reserved.)
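
The abstract reports that data-specific prompts carrying information about the context, task, and targets improved LLM performance. The sketch below illustrates one way such a prompt could be assembled and the model's reply mapped back to a label. It is not the authors' prompt: the field names, the three-way label set, the wording of the template, and the fallback to "neutral" are all assumptions made for illustration, and no model is actually called.

```python
from dataclasses import dataclass

# Assumed label set; the abstract does not state which labels were used.
LABELS = ("positive", "negative", "neutral")


@dataclass
class PromptSpec:
    """Hypothetical container for the three kinds of information the abstract names."""
    context: str  # where the posts come from
    task: str     # what the model is asked to do
    target: str   # the health issue the opinion is directed at


def build_prompt(spec: PromptSpec, post: str) -> str:
    """Assemble a data-specific prompt for a single social media post."""
    labels = ", ".join(LABELS)
    return (
        f"Context: {spec.context}\n"
        f"Task: {spec.task}\n"
        f"Target: {spec.target}\n"
        f"Post: {post}\n"
        f"Answer with exactly one of: {labels}."
    )


def parse_label(reply: str, default: str = "neutral") -> str:
    """Map a free-text model reply to a label; fall back to `default` (an assumption)."""
    text = reply.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    return default
```

In use, a researcher would build one `PromptSpec` per dataset (for example, one whose target is vaccination), send `build_prompt(spec, post)` to the model of their choice, and pass the reply through `parse_label` before comparing against annotated labels.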