DivHeart: A Novel Machine Learning Framework for the Prediction of Drug Induced Cardiotoxicity

Divyanshu Patel, Vineet Richhariya

Keywords:

Cardiotoxicity, hERG inhibition, Machine Learning, Extra Trees Classifier, SMOTE, Predictive Modeling

Abstract

Drug-induced cardiotoxicity remains a critical concern in the pharmaceutical industry, particularly when it arises from inhibition of the human ether-à-go-go-related gene (hERG) potassium channel, which leads to QT prolongation and arrhythmias. Accurate prediction of hERG channel liability is vital to reducing the risk of late-stage drug failure. This paper presents DivHeart, a novel machine learning (ML)-based quantitative structure-activity relationship (QSAR) framework designed to predict drug-induced cardiotoxicity. The framework addresses class imbalance in cardiotoxicity datasets by applying the Synthetic Minority Oversampling Technique (SMOTE) and evaluates several classifiers, including Extra Trees, Random Forest, and K-Nearest Neighbors (KNN). The model achieved an accuracy of 0.93, a sensitivity of 0.94, and a specificity of 0.93, outperforming previous models. Additionally, the DivHeart framework is deployed as an accessible web tool that enables real-time predictions during drug discovery. The findings suggest that this approach can significantly aid the early screening of drugs for potential cardiotoxic effects, reducing the risk of advancing compounds that cause QT prolongation.
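As a rough illustration of two ideas named in the abstract, the sketch below shows how SMOTE-style oversampling interpolates new minority-class points and how accuracy, sensitivity, and specificity follow from confusion-matrix counts. It is not the authors' implementation: the function names `smote_like_oversample` and `confusion_metrics`, the toy descriptor vectors, and the toy labels are all assumptions made for this example, and a real pipeline would more likely use a maintained implementation such as `SMOTE` from the imbalanced-learn package.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea).
    Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), and specificity
    (true negative rate) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy two-dimensional "descriptor" vectors for the minority class (made up).
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(minority_samples, n_new=6)

# Toy labels and predictions (made up): 3 TP, 1 FN, 3 TN, 1 FP.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = confusion_metrics(y_true, y_pred)
print(metrics)  # accuracy, sensitivity, and specificity are each 0.75 here
```

Because every synthetic point is a convex combination of two existing minority samples, it always lies on the segment between them. Oversampling of this kind is normally applied to the training split only, so that the test-set sensitivity and specificity are measured on real compounds.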
