Application of K-Means and Decision Tree for Disease Prediction Using Data Mining Approach

Authors

  • Riah Ukur Ginting, Universitas Sari Mutiara
  • Fernando H Sinaga, Sari Mutiara Indonesia University
  • Rianto Sitanggang, Sari Mutiara Indonesia University
  • Ivan Elisabeth Purba, Sari Mutiara Indonesia
  • Aprima A Matondang, Medan State Polytechnic

Keywords:

K-Means Clustering, Data Mining, Decision Tree

Abstract

This study aims to analyze the distribution patterns of patient diseases using a data mining approach at UPTD Puskesmas Pakkat. The dataset consists of secondary data from 4,633 patients collected between January 2022 and December 2023, obtained from digital medical records, with variables including age, gender, and 22 disease diagnosis categories. The K-Means Clustering method was employed to identify disease grouping patterns based on patient characteristics. The optimal number of clusters was determined using the Silhouette Score, with the best value of 0.5556 at K=6. Cluster quality was further evaluated using the Davies-Bouldin Index (DBI) with a value of 0.6722, indicating good cluster separation. To support the classification process, the Decision Tree algorithm was applied to predict cluster membership for new patient data. Model evaluation was conducted using a train-test split scheme and k-fold cross-validation to enhance reliability and minimize the risk of overfitting. The results indicate distinct disease patterns across age groups, where infectious diseases such as acute respiratory infections (ARI) and diarrhea dominate in children, while non-communicable diseases such as hypertension and diabetes are more prevalent among adults and the elderly. This study contributes by integrating clustering and classification methods and provides data-driven epidemiological insights that can support decision-making in primary healthcare services.
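
The workflow described in the abstract (choosing K by Silhouette Score, checking the Davies-Bouldin Index, then training a Decision Tree to predict cluster membership under a train-test split and k-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic blobs standing in for the patient records, and the function names, the candidate range K=2..8, the 80/20 split, the 5 folds, and the tree depth are all assumptions made for the example. Real records would also need gender and diagnosis category encoded numerically before clustering.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier


def pick_k_by_silhouette(X, k_values, seed=0):
    """Fit K-Means for each candidate k and return (best_k, {k: silhouette})."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = float(silhouette_score(X, labels))
    best_k = max(scores, key=scores.get)  # higher silhouette is better
    return best_k, scores


def run_pipeline(seed=0):
    """Hypothetical end-to-end run on synthetic data (NOT the study's dataset)."""
    # Synthetic numeric features used as a stand-in for age, gender, diagnosis.
    X, _ = make_blobs(n_samples=600, centers=4, n_features=3,
                      cluster_std=1.0, random_state=seed)

    # Step 1: choose the number of clusters by Silhouette Score.
    best_k, scores = pick_k_by_silhouette(X, range(2, 9), seed)

    # Step 2: refit with the chosen k and compute the Davies-Bouldin Index
    # (lower values indicate better-separated clusters).
    labels = KMeans(n_clusters=best_k, n_init=10, random_state=seed).fit_predict(X)
    dbi = float(davies_bouldin_score(X, labels))

    # Step 3: train a Decision Tree to predict cluster membership,
    # evaluated on a held-out test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.2, random_state=seed, stratify=labels)
    tree = DecisionTreeClassifier(max_depth=5, random_state=seed)
    tree.fit(X_train, y_train)
    holdout_accuracy = float(tree.score(X_test, y_test))

    # Step 4: k-fold cross-validation on a fresh tree as a second estimate.
    cv_scores = cross_val_score(
        DecisionTreeClassifier(max_depth=5, random_state=seed), X, labels, cv=5)

    return {
        "best_k": best_k,
        "silhouette": scores[best_k],
        "dbi": dbi,
        "holdout_accuracy": holdout_accuracy,
        "cv_accuracy": float(np.mean(cv_scores)),
    }


if __name__ == "__main__":
    for name, value in run_pipeline().items():
        print(name, value)
```

Because the tree is trained on labels produced by K-Means rather than on independent ground truth, its accuracy measures how well simple decision rules reproduce the cluster boundaries, not how well it predicts disease outcomes.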

Published

2026-03-31

How to Cite

Riah Ukur Ginting, Fernando H Sinaga, Rianto Sitanggang, Ivan Elisabeth Purba, & Aprima A Matondang. (2026). Application of K-Means and Decision Tree for Disease Prediction Using Data Mining Approach. JOURNAL ZETROEM, 8(1), 83–88. Retrieved from https://ejournal.unibabwi.ac.id/index.php/Zetroem/article/view/7634

Section

Article