Challenges in utilizing the FAERS database for adverse drug reaction data mining: A critical analysis

Xuelin Sun, Yatong Zhang, Dongfang Qian

Precision Medication, 2026, Vol. 3, Issue 1: 100080

DOI: 10.1016/j.prmedi.2026.100080
research-article

Abstract

The FDA Adverse Event Reporting System (FAERS) database is a vital tool for identifying adverse drug reactions (ADRs). However, data mining in FAERS faces significant challenges, including data quality issues (e.g., integrity, consistency, and completeness) and limitations in traditional model selection. These issues can introduce bias and reduce the reliability of safety signal detection. This review critically analyzes the current state and limitations of FAERS data mining and briefly compares FAERS with other mainstream global databases to put its unique challenges in context. It then proposes optimization strategies focused on improved data preprocessing, algorithm refinement, and the integration of emerging technologies. We emphasize the potential of artificial intelligence (AI) and multi-source data fusion to enhance detection sensitivity, shorten the risk signal identification cycle, and address challenges in data-limited scenarios such as rare diseases. We recommend promoting database standardization, strengthening validation, and pursuing policy changes to fully realize the potential of FAERS for precision pharmacovigilance.

Keywords

FAERS database / Adverse drug reactions / Data mining / Artificial intelligence / Database optimization

Cite this article

Xuelin Sun, Yatong Zhang, Dongfang Qian. Challenges in utilizing the FAERS database for adverse drug reaction data mining: A critical analysis. Precision Medication, 2026, 3(1): 100080. DOI: 10.1016/j.prmedi.2026.100080

Declarations

Not applicable.

CRediT authorship contribution statement

Xuelin Sun: Conceptualization, Methodology, Writing - Original draft preparation. Yatong Zhang: Literature retrieval and manuscript revision. Dongfang Qian: Visualization, Investigation.

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors have read and agreed to the published version of the manuscript and give their consent for publication in this journal.

Availability of data and materials

Not applicable.

Funding

This work was supported by the National High Level Hospital Clinical Research Funding (No. BJ-2023-200).

Declaration of Competing interest

The authors declare no competing interests.

Acknowledgements

Not applicable.

Authors' other information

Not applicable.

