Fuzzy Logic and Correlation-Based Hybrid Classification on Hepatitis Disease Data Set


Başarslan M. S. , Bakır H., Yücedağ İ.

in: Artificial Intelligence and Applied Mathematics in Engineering Problems, Hemanth D., Kose U., Editors, Springer Nature, Zug, pp.787-800, 2020

  • Publication Type: Book Chapter / Chapter Research Book
  • Publication Date: 2020
  • Publisher: Springer Nature
  • City: Zug
  • Page Numbers: pp.787-800
  • Editors: Hemanth D., Kose U.

Abstract

Developments in the health field closely affect humanity, and advances in information technology amplify this effect. This study aims to support decision-makers by increasing the accuracy of hepatitis detection. The data set was obtained from the UCI Machine Learning Repository. Data preprocessing, attribute selection, and classifier modeling were applied to this data set, in that order. After the missing values in the hepatitis patient data were normalized, correlation-based and fuzzy-based rough attribute selection methods were applied, and the attributes that contributed to the classification were selected. The full hepatitis data set and the reduced data sets formed by the attributes chosen by the correlation-based and fuzzy-based rough attribute selection methods were classified using the k-nearest neighbor, Random Forest, Naive Bayes, and Logistic Regression algorithms, and the results were compared. Accuracy, sensitivity, precision, ROC curve, and F-measure values were used to compare the classification algorithms. The data set was split into training and test sets using 5-fold cross-validation. The fuzzy rough clustering algorithm was observed to be more successful than the k-nearest neighbor, Random Forest, Naive Bayes, and Logistic Regression classification methods in the detection of hepatitis disease.
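The evaluation steps named in the abstract (correlation-based attribute scoring, 5-fold cross-validation, and the accuracy, sensitivity, precision, and F-measure values) can be illustrated with a minimal Python sketch that uses only the standard library. This is not the authors' implementation. The function names `rank_attributes`, `k_fold_indices`, and `binary_metrics` are hypothetical, and ranking attributes by absolute Pearson correlation with the class label is a simplified stand-in assumed here for illustration; the chapter's correlation-based and fuzzy rough selection methods may work differently.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(rows, labels):
    """Order attribute indices by |correlation with the class|, strongest first.

    Assumption: a simple ranking heuristic, not the chapter's exact method.
    """
    n_attrs = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_attrs)]
    return sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall), precision, and F-measure for two classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "precision": precision,
        "f_measure": 2 * precision * sensitivity / denom if denom else 0.0,
    }
```

In use, a classifier would be trained on each `train` index list from `k_fold_indices`, its predictions on the matching `test` list passed to `binary_metrics`, and the five results averaged. To avoid leaking test information, `rank_attributes` would be run on the training portion of each fold only.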
