Document Type: Original Article
Authors
1
Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran.
2
Student Research Committee, Department of Medical Informatics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
3
Department of Medical Informatics, Faculty of Medicine, University of Medical Sciences, Mashhad, Iran
4
Medical Toxicology and Drug Abuse Research Center (MTDRC), Birjand University of Medical Sciences, Birjand, Iran
5
Medical Toxicology and Drug Abuse Research Center (MTDRC), Birjand University of Medical Sciences,
Abstract
Purpose:
One goal of medical research is to identify the factors associated with diseases and their prognosis. Diabetes is one of the most common metabolic diseases in Iran. The aim of this study was to identify the factors that predict diabetes using artificial neural network and decision tree algorithms, and to compare the performance of the two models.
Methods:
This study used the records of 901 people referred to health centers in Mashhad. The data were first analyzed using descriptive and analytical statistics. Then, 70% of the data were randomly selected to construct the artificial neural network and decision tree models, and the remaining 30% were used to test them. Finally, the performance of the two models was compared using the ROC curve.
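The split and comparison steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names `split_70_30` and `roc_auc`, the fixed random seed, and the use of integer indices in place of real patient records are all assumptions made for the example. The area under the ROC curve is computed here through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case).

```python
import random


def split_70_30(records, seed=0):
    """Randomly assign about 70% of the records to training and the rest to testing.

    The seed is a hypothetical choice so the example is reproducible.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve from 0/1 labels and model scores.

    Counts, over all positive/negative pairs, how often the positive case
    has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Placeholder indices standing in for the 901 records (not real data).
train, test = split_70_30(range(901))
print(len(train), len(test))
```

In practice each fitted model would supply a score for every record in the test portion, and `roc_auc` would be called once per model on the same test labels so that the two values are directly comparable.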
Results:
The two predictive models were developed using 13 input (independent) variables and 1 output (dependent) variable. The models were evaluated in terms of area under the ROC curve, sensitivity, specificity and accuracy. For the artificial neural network model, the area under the ROC curve, sensitivity, specificity and accuracy were 69.1, 74.2, 56.03 and 61.3, respectively. For the CART decision tree algorithm, the area under the ROC curve, sensitivity, specificity and accuracy were 68.9, 64.77, 63.47 and 65.3, respectively. In all models, family history of diabetes, triglycerides, body mass index, low density lipoprotein, and systolic and diastolic blood pressure were the most important factors associated with type 2 diabetes.
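For readers unfamiliar with the evaluation measures named above, the following sketch shows how sensitivity, specificity and accuracy are derived from the four cells of a confusion matrix. The function name `classification_metrics` and the counts in the usage line are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy as fractions in [0, 1].

    tp: diabetic cases predicted diabetic      (true positives)
    fn: diabetic cases predicted non-diabetic  (false negatives)
    tn: non-diabetic cases predicted non-diabetic (true negatives)
    fp: non-diabetic cases predicted diabetic  (false positives)
    """
    sensitivity = tp / (tp + fn)          # share of actual positives detected
    specificity = tn / (tn + fp)          # share of actual negatives detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not the study's results).
sens, spec, acc = classification_metrics(tp=40, fn=10, tn=30, fp=20)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Because sensitivity and specificity trade off against each other as the decision threshold moves, a model can have the higher sensitivity while the other has the higher specificity and accuracy, as is the case for the two models reported here.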
Conclusion:
The results showed that the multilayer perceptron neural network model performed better than the CART decision tree in terms of area under the ROC curve for predicting type 2 diabetes. Low density lipoprotein was also identified as the most important factor associated with type 2 diabetes. The study suggests that modern data mining techniques such as artificial neural networks and decision trees can be used to identify factors associated with disease.
Keywords
Main Subjects