A Contemporary Machine Learning Method for Accurate Prediction of Cervical Cancer

Jesse Jeremiah Tanimu; Mohamed Hamada; Mohammed Hassan; Saratu Yusuf Ilu

doi:10.1051/shsconf/202110204004

SHS Web Conf., 102 (2021) 04004

Open Access

Issue: SHS Web Conf., Volume 102, 2021. The 3rd ETLTC International Conference on Information and Communications Technology (ETLTC2021)
Article Number: 04004
Number of page(s): 6
Section: Applications in Computer Science
DOI: https://doi.org/10.1051/shsconf/202110204004
Published online: 03 May 2021

Jesse Jeremiah Tanimu¹*, Mohamed Hamada²**, Mohammed Hassan³*** and Saratu Yusuf Ilu³****

¹ Department of Computer Science, Bayero University, Kano, Nigeria
² University of Aizu, Japan
³ Department of Software Engineering, Bayero University, Kano, Nigeria

* e-mail: tanimujessej@gmail.com
** e-mail: hamada@u-aizu.ac.jp
*** e-mail: mhassan.se@buk.edu.ng
**** e-mail: syilu.cs@buk.edu.ng

Abstract

With the advent of new technologies in the medical field, huge amounts of cancer-related data have been collected and are readily accessible to the medical research community. Over the years, researchers have employed advanced data mining and machine learning techniques to develop better models that can analyze datasets to extract patterns, insights, and hidden knowledge. The mined information can be used to support decision making in diagnostic processes. These techniques can effectively predict the future outcomes of certain diseases, and they can also discover and identify patterns and relationships within complex datasets. In this research, a model has been developed to predict the outcome of patients' cervical cancer results, given risk patterns from individual medical records and preliminary screening tests. This work presents a decision tree (DT) classification algorithm and shows the advantage of feature selection in the prediction of cervical cancer, using recursive feature elimination (RFE) for dimensionality reduction to improve the accuracy, sensitivity, and specificity of the model. The dataset employed here suffers from missing values and is highly imbalanced. To address the imbalance, SMOTETomek, a combination of under- and oversampling techniques, was employed. A comparative analysis of the proposed model has been performed to show the effect of feature selection and class-imbalance handling on the classifier's accuracy, sensitivity, and specificity. The DT with the selected features and SMOTETomek gives better results, with an accuracy of 98%, sensitivity of 100%, and specificity of 97%. The decision tree classifier is shown to perform very well on the classification task when the features are reduced and the problem of class imbalance is addressed.
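The abstract reports three evaluation measures: accuracy, sensitivity, and specificity. As a minimal sketch of how these are conventionally computed from binary predictions (not the authors' code), the function below counts the confusion-matrix cells directly. The name `binary_metrics` and the example labels are hypothetical and chosen only for illustration; label 1 is assumed to mean a positive (cancer) result.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity, and specificity for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    nan = float("nan")  # returned when a ratio is undefined
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
    }


# Hypothetical, imbalanced labels for illustration only; not data from the study.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced data, accuracy alone can look high even when the minority class is poorly detected, which is presumably why sensitivity and specificity are reported alongside it.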

This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
