Publication

Adverse drug reaction profile prediction: denoising, signal enhancement and missing row imputation

Zhong, Yezhao
Abstract
Adverse Drug Reactions (ADRs) pose significant risks to human health, making it essential to identify potential ADRs in the early stages of drug development. However, this process is costly and time-consuming, so developing advanced computational methods to predict ADR profiles is important. We developed a series of approaches to enhance ADR profile prediction across three main strategies. First, to address noise in imbalanced ADR data, we proposed a novel hybrid method, Kernel Regression (KR) on V (VKR), which combines Non-negative Matrix Factorization (NMF) with KR on the drug-component matrix V derived from NMF. Second, we introduced Smoothed KR (SKR) to enhance signal detection for rare ADRs. Finally, we developed a missing row imputation strategy to enrich drug databases by imputing missing rows of features for non-overlapping drugs, increasing the dataset's breadth and predictive capability. Our three strategies yielded significant improvements. VKR outperformed existing methods on both single features and integrated features. SKR significantly improved prediction performance for rare ADRs, outperforming other methods in this challenging category, while improvements for common ADRs were more modest. The expanded dataset further enhanced model performance with both single features and integrated features, indicating the benefit of the missing row imputation strategy. Together, these methods provide a robust framework for ADR prediction, addressing ADR data noise, rare ADR detection, and limitations of ADR data usage. VKR effectively reduces noise introduced by imbalanced data and the binary representation of drug-ADR data, while SKR addresses the gap in rare ADR prediction, which is crucial for real-world applications. Current models often overlook rare ADRs due to the dominance of common ADR signals. However, capturing these rare ADRs, which often lead to severe cases, is crucial for comprehensive drug safety assessment. The limited overlap of drugs across feature databases significantly reduces usable training data, making the missing row imputation strategy a valuable addition for preserving critical drug information and improving predictive outcomes.
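As a rough illustration of the VKR idea described in the abstract (NMF followed by kernel regression on the drug-component matrix V), the sketch below may help. It is not the thesis implementation: the multiplicative-update NMF, the Gaussian kernel, the Nadaraya-Watson form of kernel regression, and all names (`nmf`, `gaussian_kernel`, `predict_adr_scores`, `sigma`, `k`) and toy data are assumptions made for the example.

```python
import numpy as np

def nmf(Y, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate Y (drugs x ADRs) as V @ H using multiplicative updates.
    Assumed variant; the thesis may use a different NMF solver."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    V = rng.random((n, k)) + eps   # drug-component matrix
    H = rng.random((k, m)) + eps   # component-ADR matrix
    for _ in range(n_iter):
        H *= (V.T @ Y) / (V.T @ V @ H + eps)
        V *= (Y @ H.T) / (V @ H @ H.T + eps)
    return V, H

def gaussian_kernel(A, B, sigma=1.0):
    """Pairwise Gaussian similarities between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def predict_adr_scores(X_train, Y_train, X_new, k=2, sigma=1.0, eps=1e-12):
    """Hypothetical VKR-style predictor: regress V on drug features, then map back via H."""
    V, H = nmf(Y_train, k)
    K = gaussian_kernel(X_new, X_train, sigma)
    W = K / (K.sum(axis=1, keepdims=True) + eps)  # Nadaraya-Watson weights
    V_new = W @ V                                  # estimated component loadings for new drugs
    return V_new @ H                               # non-negative ADR scores

# Toy data (invented): 4 training drugs, 3 features, 5 ADRs.
X_train = np.array([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 1.0],
                    [1.0, 1.0, 0.0]])
Y_train = np.array([[1, 0, 0, 1, 0],
                    [0, 1, 0, 0, 0],
                    [1, 0, 1, 1, 0],
                    [1, 1, 0, 0, 0]], dtype=float)
X_new = np.array([[0.0, 1.0, 0.5],
                  [1.0, 0.5, 0.0]])

scores = predict_adr_scores(X_train, Y_train, X_new)
print(scores.shape)
```

Regressing on V rather than on the raw binary drug-ADR matrix means predictions for a new drug are built from low-rank component loadings, which is one plausible reading of how the factorization step could suppress noise before the kernel step.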
Publisher
University of Galway
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International