Application of K-Means and Decision Tree for Disease Prediction Using Data Mining Approach

Riah Ukur Ginting; Fernando H Sinaga; Rianto Sitanggang; Ivan Elisabeth Purba; Aprima A Matondang

doi:10.36526/ztr.v8i1.7634

Authors

Riah Ukur Ginting, Universitas Sari Mutiara
Fernando H Sinaga, Sari Mutiara Indonesia University
Rianto Sitanggang, Sari Mutiara Indonesia University
Ivan Elisabeth Purba, Sari Mutiara Indonesia
Aprima A Matondang, Medan State Polytechnic

Keywords:

K-Means Clustering, Data Mining, Decision Tree

Abstract

This study aims to analyze the distribution patterns of patient diseases using a data mining approach at UPTD Puskesmas Pakkat. The dataset consists of secondary data from 4,633 patients collected between January 2022 and December 2023, obtained from digital medical records, with variables including age, gender, and 22 disease diagnosis categories. The K-Means Clustering method was employed to identify disease grouping patterns based on patient characteristics. The optimal number of clusters was determined using the Silhouette Score, with the best value of 0.5556 at K=6. Cluster quality was further evaluated using the Davies-Bouldin Index (DBI) with a value of 0.6722, indicating good cluster separation. To support the classification process, the Decision Tree algorithm was applied to predict cluster membership for new patient data. Model evaluation was conducted using a train-test split scheme and k-fold cross-validation to enhance reliability and minimize the risk of overfitting. The results indicate distinct disease patterns across age groups, where infectious diseases such as acute respiratory infections (ARI) and diarrhea dominate in children, while non-communicable diseases such as hypertension and diabetes are more prevalent among adults and the elderly. This study contributes by integrating clustering and classification methods and provides data-driven epidemiological insights that can support decision-making in primary healthcare services.
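The cluster-selection step described in the abstract (choose K by the highest Silhouette Score, then check separation with the Davies-Bouldin Index, where lower values indicate better separation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state which library or initialization the authors used, the points are synthetic 2-D blobs rather than patient records, and the deterministic farthest-point initialization in `kmeans` is chosen only to keep the example reproducible. The Decision Tree step is not shown.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def group(points, labels):
    """Map each cluster label to the list of points carrying it."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    return clusters


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with farthest-point initialization (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Empty clusters do not appear in group(), so they keep their old center.
        for j, members in group(points, labels).items():
            centers[j] = mean_point(members)
    return labels


def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two non-empty clusters."""
    clusters = group(points, labels)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(dist(p, q) for q in own) / (len(own) - 1)  # mean intra-cluster distance
        b = min(sum(dist(p, q) for q in other) / len(other)  # nearest other cluster
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)


def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / center distance."""
    clusters = group(points, labels)
    centers = {m: mean_point(c) for m, c in clusters.items()}
    scatter = {m: sum(dist(p, centers[m]) for p in c) / len(c)
               for m, c in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                 for j in clusters if j != i) for i in clusters]
    return sum(worst) / len(worst)


# Synthetic stand-in data: three Gaussian blobs (NOT the study's patient data).
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
          for cx, cy in blob_centers for _ in range(30)]

# Score each candidate K and keep the one with the highest silhouette.
scores = {k: silhouette_score(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
best_dbi = davies_bouldin_index(points, kmeans(points, best_k))
print(best_k, round(scores[best_k], 4), round(best_dbi, 4))
```

On real records, categorical variables such as gender and diagnosis category would need to be encoded numerically before any distance-based clustering; how the authors handled this is not described in the abstract.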
