Enhancing Heart Disease Prediction through Effective Handling of Data Imbalances and the Use of Ensemble Learning Techniques.
DOI:
https://doi.org/10.65405/.v10i37.650
Keywords:
Heart Disease Prediction; Cardiovascular Disease (CVD); Machine Learning; Feature Selection; Class Imbalance; SMOTE; Ensemble Learning; Stacking
Abstract
This study investigates the prediction of heart disease severity by utilising the Cleveland Heart
Disease dataset from the UCI Machine Learning Repository. The research commenced with
extensive data preprocessing, such as handling missing values and applying one-hot encoding to
categorical variables. Several machine learning models were tested, and the combination of
LightGBM with SMOTE produced the highest accuracy (0.6739) and ROC AUC (0.8941).
To further improve predictive performance, a stacking ensemble approach was applied, combining
the strengths of several machine learning models. Age and exercise-induced angina emerged as
critical predictors, offering insights for early detection and better management of heart disease.
These results highlight the importance of these variables in predicting heart disease severity and point
toward possible improvements in patient care. The analysis was carried out in Python on Google Colab,
using its libraries and tools.
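
The class-imbalance step named in the abstract can be illustrated with a minimal sketch of the idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. The abstract does not say which implementation the study used, so the function name `smote_oversample`, its parameters, and the toy feature vectors below are all hypothetical and serve only as illustration. In practice a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used, and oversampling would be applied to the training split only.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea).
    Hypothetical helper for illustration; not the study's implementation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up feature vectors (not taken from the Cleveland dataset).
majority = [[63.0, 0.0], [58.0, 0.0], [45.0, 0.0], [51.0, 0.0], [60.0, 0.0], [39.0, 0.0]]
minority = [[67.0, 1.0], [70.0, 1.0], [62.0, 1.0]]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints "6 6"
```

Because the synthetic points are interpolated, they always lie between existing minority samples. Plain interpolation is not meaningful for one-hot encoded categorical columns in general. The toy example avoids the issue only because every minority sample shares the same value in the second column.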
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.








