Abstract:
Pneumonia is a form of acute respiratory tract infection (ARTI) that affects the lungs. It is the single leading cause of mortality in children under five and a major cause of child mortality in every region of the world, with most deaths occurring in developing countries. Many children who die from pneumonia do so because of inappropriate treatment that follows misdiagnosis of signs and symptoms. Despite the emergence of new diagnostic tests and clinical guidelines, clinicians still face challenges in making proper clinical decisions because the presenting signs and symptoms are nonspecific, may be subtle, particularly in infants and young children, and vary with the patient’s age, the responsible pathogen, and the severity of the infection. The objective of this study is to develop a predictive model for identifying the severity of pneumonia by mining retrospective patient data collected over several years. To achieve this objective, the hybrid model was followed, using data extracted from Arbaminch General Hospital. Two families of classification algorithms, Decision Trees and Rule Induction classifiers, were used for model building. The WEKA 3.6.1 data mining tool was used because it contains the algorithms and functionalities that aid the knowledge discovery process. 10-fold cross-validation was applied for model training and testing. The performance of the classification and prediction models was measured in terms of accuracy, weighted true positive rate (WTP Rate), weighted false positive rate (WFP Rate), and AUC. In this study, the unpruned PART rule induction classifier was found to perform best, with a weighted ROC area (WROC) of 91%. The predictive model was therefore developed using PART with the options -U -M 2 -C 0.25 -Q 1. The results of this study are encouraging and confirm that data mining techniques can support building a predictive model of pneumonia severity in under-five children in Ethiopia.
In the future, integrating large demographic and health survey data sets with clinical data sets, and employing other classification algorithms, tools, and techniques, could yield better results.
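
The weighted measures named in the abstract can be illustrated with a small worked computation. The following is a minimal sketch, assuming that WTP Rate and WFP Rate denote per-class true positive and false positive rates averaged with weights proportional to each class's number of actual instances (as in the "Weighted Avg." row of WEKA's summary output). The function name `weighted_rates` and the three-class confusion matrix are hypothetical and are not taken from the study's data or results.

```python
def weighted_rates(cm):
    """Return (weighted TP rate, weighted FP rate) for a square confusion matrix.

    cm[i][j] is the number of instances of actual class i predicted as class j.
    Weights are each class's share of all instances (an assumption, see above).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    wtp = 0.0
    wfp = 0.0
    for k in range(n):
        support = sum(cm[k])                        # actual instances of class k
        tp = cm[k][k]                               # correctly predicted as class k
        fp = sum(cm[i][k] for i in range(n)) - tp   # other classes predicted as k
        negatives = total - support                 # instances not in class k
        tp_rate = tp / support if support else 0.0
        fp_rate = fp / negatives if negatives else 0.0
        weight = support / total
        wtp += weight * tp_rate
        wfp += weight * fp_rate
    return wtp, wfp


# Hypothetical 3-class confusion matrix, for illustration only.
example_cm = [
    [50, 5, 5],
    [4, 30, 6],
    [2, 3, 15],
]
wtp, wfp = weighted_rates(example_cm)
print(f"Weighted TP rate: {wtp:.4f}")
print(f"Weighted FP rate: {wfp:.4f}")
```

Under this weighting, the weighted TP rate equals the overall accuracy, because each class's correct predictions are summed and divided by the total number of instances.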