Enhancing Heart Disease Prediction through Effective Handling of Data Imbalances and the Use of Ensemble Learning Techniques.

Authors

  • Noura Marie Elshiebani (1), Geber Khalifa Geber (2), Mostafa Nser brka (3), Abraheem Mohammed sulayman Alsubayh (4)
    Higher Institute of Science and Technology - Suluq; Higher Institute of Science and Technology - Suluq; Higher Institute of Science and Technology, Al Shomokh; Faculty of Arts and Sciences, University of Benghazi, Solouq, Libya

DOI:

https://doi.org/10.65405/.v10i37.650

Keywords:

Heart Disease Prediction; Cardiovascular Disease (CVD); Machine Learning; Feature Selection; Class Imbalance; SMOTE; Ensemble Learning; Stacking

Abstract

This study investigates the prediction of heart disease severity by utilising the Cleveland Heart
Disease dataset from the UCI Machine Learning Repository. The research commenced with
extensive data preprocessing, such as handling missing values and applying one-hot encoding to
categorical variables. Several machine learning models were tested, and the combination of
LightGBM with SMOTE produced the highest accuracy (0.6739) and ROC AUC (0.8941).
To further improve predictive performance, a stacking ensemble approach was applied, combining
the strengths of several machine learning models. Age and exercise-induced angina emerged as
critical predictors, offering insights for early detection and better management of heart disease.
These results highlight the relevance of these variables to assessing heart disease severity and point
toward possible improvements in patient care. The analysis was carried out in Python on Google Colab,
using its machine learning libraries and tools.
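
The abstract names SMOTE as the class-balancing step but does not describe how it was implemented. As a rough illustration only, the core idea can be sketched in plain Python: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. The function name `smote_oversample`, its parameters, and the toy feature values below are hypothetical and are not taken from the study, which may well have used a library implementation instead.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority samples.

    Illustrative sketch of the SMOTE idea, not the study's actual code.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical, already scaled): 3 minority rows against 6 majority rows.
minority = [[0.2, 0.9], [0.3, 0.8], [0.25, 0.7]]
majority_count = 6
new_rows = smote_oversample(minority, majority_count - len(minority), k=2)
balanced_minority = minority + new_rows  # now as many rows as the majority class
```

In a real workflow, oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as accuracy and ROC AUC are computed on untouched data.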

Published

2025-11-25

How to Cite

Enhancing Heart Disease Prediction through Effective Handling of Data Imbalances and the Use of Ensemble Learning Techniques. (2025). Comprehensive Journal of Science, 10(37), 2922-2938. https://doi.org/10.65405/.v10i37.650