Heart Disease Prediction Based on Health Indicators
Abstract
This research aims to predict heart disease using machine learning, with a dataset obtained from Kaggle. Predicting the likelihood that a person will develop heart disease can support prevention and encourage lifestyle changes. Three machine learning algorithms are compared: K-Nearest Neighbor, Decision Tree, and Random Forest. Hyperparameter tuning is applied to each algorithm to find the parameter values that give the best performance. The experiments show that Random Forest performs best, with an F1-score of 94% (micro), 73% (macro), and 94% (weighted), and a balanced accuracy of 74%. The best parameters found by hyperparameter tuning are: for Random Forest, max_depth = 80, max_features = 4, min_samples_leaf = 4, min_samples_split = 10, and n_estimators = 400; for Decision Tree, criterion = gini, max_depth = 100, max_features = sqrt, min_samples_leaf = 1, and min_samples_split = 3; and for K-Nearest Neighbor, leaf_size = 10, n_neighbors = 40, and weights = distance. In addition to the experiments, this research also produces a web-based system that accepts health indicators as input and returns a prediction.
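The reported micro and weighted F1-scores (94%) are much higher than the macro F1-score (73%) and the balanced accuracy (74%). The sketch below illustrates how such a gap can arise, under the assumption that one class is much rarer than the other. The labels are invented toy data, not the study's dataset, and the helper names (`per_class_counts`, `f1_averages`, `balanced_accuracy`) are hypothetical. scikit-learn provides equivalent metrics through `f1_score(average=...)` and `balanced_accuracy_score`; the plain-Python version is used here only to make the arithmetic visible.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for every label seen in either list."""
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        counts[label] = (tp, fp, fn)
    return counts


def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_averages(y_true, y_pred):
    """Return the (micro, macro, weighted) F1-scores."""
    counts = per_class_counts(y_true, y_pred)
    support = Counter(y_true)  # number of true samples per class
    per_class = {label: f1(*c) for label, c in counts.items()}
    # Micro: pool TP, FP, FN over all classes, then compute a single F1.
    micro = f1(*(sum(col) for col in zip(*counts.values())))
    # Macro: unweighted mean of per-class F1, so a rare class counts fully.
    macro = sum(per_class.values()) / len(per_class)
    # Weighted: mean of per-class F1 weighted by class support.
    weighted = sum(per_class[c] * support[c] for c in per_class) / len(y_true)
    return micro, macro, weighted


def balanced_accuracy(y_true, y_pred):
    """Mean recall over the classes that occur in y_true."""
    counts = per_class_counts(y_true, y_pred)
    recalls = [tp / (tp + fn) for tp, _, fn in counts.values() if tp + fn > 0]
    return sum(recalls) / len(recalls)


# Invented toy labels (assumption): 90 healthy (0) and 10 diseased (1) cases.
y_true = [0] * 90 + [1] * 10
# The toy classifier gets 88 of 90 healthy and 5 of 10 diseased cases right.
y_pred = [0] * 88 + [1] * 2 + [1] * 5 + [0] * 5

micro, macro, weighted = f1_averages(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)

print(f"f1 micro={micro:.3f} macro={macro:.3f} weighted={weighted:.3f}")
print(f"balanced accuracy={bal_acc:.3f}")
```

On this toy data the micro F1-score is 0.93, while the macro F1-score (about 0.77) and the balanced accuracy (about 0.74) are pulled down by the minority class, where only half of the cases are detected. Whether the same effect explains the figures reported here depends on the class distribution of the actual dataset.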