The use of CLAHE for improving the accuracy of CNN architecture for detecting pneumonia

Artificial intelligence (AI) has grown rapidly and now supports many aspects of human life, one of which is medical image processing. The world is still suffering from the COVID-19 pandemic, which has affected more than 36 million people and is estimated to have caused more than 1 million deaths. Early detection of COVID-19 sufferers is needed to help doctors and medical experts determine the next treatment for patients and to avoid a worsening condition that leads to death. An AI-based model using deep learning can assist medical experts in accurately detecting and classifying lung conditions from a patient's chest X-ray (CXR) image. In this paper, the authors propose the use of contrast limited adaptive histogram equalization (CLAHE) for pre-processing the medical images, combined with the CNN AlexNet architecture. The result of this method is then compared with non-CLAHE CNN AlexNet and with a self-made CNN architecture. The results are promising: the accuracy of the CNN AlexNet architecture is 91.11%.


Introduction
Artificial intelligence (AI) has grown rapidly and has been applied to many aspects of human life, one of which is the use of deep learning in image classification, especially in medical imaging [1]. AI-based image classification models are often used in computer-aided diagnosis (CAD) technology to help doctors or medical experts diagnose patients more accurately. Currently, the world is suffering from the COVID-19 pandemic, which has affected more than 36 million people and is estimated to have caused more than 1 million deaths [2], [3]. Early detection of COVID-19 sufferers is needed to assist doctors in deciding on follow-up actions and to prevent a worsening of the patient's condition due to a lower oxygen level, which leads to death [4]. The main technique for diagnosing COVID-19 is reverse transcription polymerase chain reaction (RT-PCR); however, chest X-ray (CXR) images can be used for early detection of a patient's condition with an accuracy of up to 80% [5]. CAD can assist doctors in determining from CXR images whether or not a patient is infected with COVID-19. Using a deep learning model, the convolutional neural network (CNN), CXR images are classified into specific categories, which makes this model an attractive one to develop.

Related Work
In previous research, CNN was used to classify pneumonia and tuberculosis from CXR images under several classification settings. The results showed promising accuracy: binary classification (Normal vs COVID-19) reached 99.7%, three-class classification (Normal vs COVID-19 vs Pneumonia) reached 95.02%, and four-class classification (Normal vs COVID-19 vs Pneumonia vs Tuberculosis (TB)) reached 94.53% [1]. Other research, conducted by Muhamet Fatih Aslan, Muhammed Fahri Unlersen, Kadir Sabanci and Akif Durdu, used a modified AlexNet with added flatten and bidirectional long short-term memory (BiLSTM) layers. These layers were added to ensure that the data are transformed into a one-dimensional array. The best accuracy obtained in that research was 98.70% [2]. Another method used long short-term memory (LSTM) to detect COVID-19 from CT images [3]; the highest accuracy obtained was 99.37%. LSTM was also combined with a recurrent neural network (RNN) to detect COVID-19 from CXR images, obtaining 95.04% accuracy after robust normalization. AI was also applied to COVID-19 detection through a modified CNN, named AI-enables smart biomedical diagnosis system (AIRBiS), which detects and distinguishes whether an image shows COVID-19 infection or pneumonia. The best accuracy obtained was 97.18% [5]. In addition, transfer learning has been used to detect COVID-19 pneumonia, viral pneumonia, and normal conditions. In that work, eight different architectures were combined with transfer learning and image augmentation to speed up the training process and increase the accuracy. The best accuracy, 99.70%, was obtained with the DenseNet201 architecture with image augmentation [6].
Another study proposed COV-MCNet (multi-classification network), which comprises eight modified architectures for classifying CXR images into four classes, namely normal conditions, COVID-19 pneumonia, viral pneumonia, and bacterial pneumonia [7].

Convolutional Neural Network (CNN)
CNN is a deep learning algorithm that is usually used to process spatial data, as in image processing. CNN has the dynamic ability to comprehend spatial information in a gradual, low-to-high level pattern, which was inspired by the workings of the human nervous system [8].
CNN operations are highly complex and require a large amount of data and execution time during training because of their feed-forward form: the operation is hierarchical, so the output of each process is used as the input to the next. A CNN generally consists of convolutional layers, down-sampling (pooling) layers, activation functions, and fully connected layers.

Convolutional Layer
The convolutional layer is the main layer in a CNN. It applies a set of mathematical operations in which an array of numbers (a kernel) is convolved with the input, producing a linear transformation of it. The output retains spatial information that can be used by later stages of the CNN [8].
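
The sliding-window operation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function name `conv2d_valid`, the 4x4 input, and the 2x2 kernel are all hypothetical, and no padding or stride other than 1 is considered. As in most deep learning frameworks, the kernel is not flipped.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current window and the kernel, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical 4x4 input and a 2x2 kernel that responds to left-to-right change
image = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12],
                  [13, 14, 15, 16]], dtype=float)
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
```

Because every value in this input increases by 1 from left to right, each entry of the resulting feature map is 2.0, which shows how one kernel reports the same local pattern wherever it occurs.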

ReLU Activation
ReLU is an activation function that sets all pixel values less than 0 to 0 and leaves values greater than 0 unchanged [9]. The ReLU activation equation is shown below:

f(x) = max(0, x)
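
As a small illustration of this rule, the function f(x) = max(0, x) can be applied element-wise to an array. The sample values below are hypothetical.

```python
import numpy as np

def relu(x):
    """Replace every negative entry with 0 and keep non-negative entries unchanged."""
    return np.maximum(0, x)

# Hypothetical activations before ReLU
values = np.array([-3.0, -0.5, 0.0, 2.0, 7.5])
activated = relu(values)
print(activated)
```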

Pooling Layer
The pooling layer is a down-sampling layer that reduces the spatial size, lowers the number of model parameters, and reduces the computational complexity, thereby speeding up data classification. Pooling is generally divided into two categories: maximum pooling, which takes the highest value in each region of the image matrix, and average pooling, which takes the average value of each region [10].
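
The two pooling categories can be compared on a small example. The sketch below assumes non-overlapping square windows (window size equal to stride), which is a common choice but is not stated in this paper; `pool2d` and the 4x4 input are hypothetical.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Down-sample a 2-D array with non-overlapping size x size windows."""
    h_out = feature_map.shape[0] // size
    w_out = feature_map.shape[1] // size
    # Crop so both dimensions divide evenly, then group pixels into windows
    cropped = feature_map[:h_out * size, :w_out * size]
    blocks = cropped.reshape(h_out, size, w_out, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")

# Hypothetical 4x4 feature map
feature_map = np.array([[1, 3, 2, 4],
                        [5, 7, 6, 8],
                        [9, 11, 10, 12],
                        [13, 15, 14, 16]], dtype=float)

max_pooled = pool2d(feature_map, size=2, mode="max")      # [[7, 8], [15, 16]]
avg_pooled = pool2d(feature_map, size=2, mode="average")  # [[4, 5], [12, 13]]
```

Both variants reduce the 4x4 input to 2x2, so the following layers process a quarter of the values.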

Fully Connected Layer
A fully connected layer has neurons that are connected to all activations in the previous and subsequent layers. The fully connected layer classifies the image into one of the available categories, and the recognition results at this layer are expressed as percentages. In summary, feature extraction occurs in the convolutional and pooling layers, and the extracted features are then handled by the fully connected layer to generate the classification result [10].

Softmax Activation
The softmax activation layer is an activation layer that produces the classification output for more than two categories. It is a generalization of logistic regression to multiple classes.
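
A minimal sketch of softmax for three categories is given below. The input scores are hypothetical, and subtracting the maximum before exponentiating is a standard numerical-stability step that does not change the result.

```python
import numpy as np

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical scores for three classes, e.g. COVID-19, Normal, Viral Pneumonia
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
predicted_class = int(np.argmax(probs))  # index of the most probable class
```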

Contrast Limited Adaptive Histogram Equalization (CLAHE)
Contrast limited adaptive histogram equalization (CLAHE) is a development of adaptive histogram equalization (AHE) that increases contrast in an image by widening its intensity range, that is, by stretching out the most frequent intensity values in the image [10]. In CLAHE, the image is broken down into sub-images called tiles or blocks, and histogram equalization is performed on each tile. Histogram bins that exceed a certain limit, which would cause the image to be overamplified, are clipped, and the clipped pixels are redistributed back into the histogram, making the contrast in the image more visible [10], [11].
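
The clip-and-redistribute step for a single tile can be sketched as follows. This is a simplified illustration, not a full CLAHE implementation and not necessarily the procedure used in this study: it processes one tile only, performs a single redistribution pass, and omits the interpolation between neighbouring tiles that complete implementations apply. The function name, the tile contents, and the clip limit are hypothetical. In practice a library routine such as OpenCV's `cv2.createCLAHE` would normally be used.

```python
import numpy as np

def clipped_equalize_tile(tile, clip_limit, n_bins=256):
    """Equalize one 8-bit tile using a histogram clipped at `clip_limit` counts per bin."""
    hist = np.bincount(tile.ravel(), minlength=n_bins).astype(float)
    # Counts above the limit are removed from their bins ...
    excess = np.sum(np.maximum(hist - clip_limit, 0))
    hist = np.minimum(hist, clip_limit)
    # ... and spread evenly over all bins (single pass)
    hist += excess / n_bins
    # The normalized cumulative histogram becomes the intensity mapping
    cdf = np.cumsum(hist)
    cdf /= cdf[-1]
    lut = np.round(cdf * (n_bins - 1)).astype(np.uint8)
    return lut[tile]

# Hypothetical low-contrast 8x8 tile with intensities between 100 and 119
tile = (100 + (np.arange(64) % 20)).reshape(8, 8).astype(np.uint8)
enhanced = clipped_equalize_tile(tile, clip_limit=3)

spread_before = int(tile.max()) - int(tile.min())
spread_after = int(enhanced.max()) - int(enhanced.min())
```

Here `spread_after` is larger than `spread_before`, showing the wider intensity range, while the clip limit bounds how steep the mapping can become for any one intensity.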

AlexNet
AlexNet consists of multiple sets of convolutional layers combined with pooling layers and fully connected layers. The convolutional layers extract features from the pixel values of the input images. The process then continues in the pooling layers, where the maximum or average value is used for down-sampling to make the model compact without losing valuable spatial information. After feature extraction is done, the fully connected layers classify the extracted features into the categories defined in this research. The block diagram of AlexNet is shown in Table 1.

Our proposed scheme
This research was conducted to find the most suitable parameters for each scenario. The authors propose three scenarios: the first uses AlexNet alone, and its result is then compared with CLAHE + AlexNet and with YENS-Net, a self-made CNN architecture. First, the parameters, such as the number of epochs, optimizer, batch size, and learning rate, are optimized for the AlexNet architecture to find the combination that gives the best accuracy. The same procedure is applied to the CLAHE + AlexNet and YENS-Net architectures, so each scenario may end up with a different batch size, learning rate, or other parameter. Accuracy is the ratio of correctly predicted data to all predicted data; the equation can be seen below.

Accuracy and Confusion Matrix
In the equation, n denotes the number of categories. This classification has three categories.
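
As a worked illustration of this definition, accuracy for n categories can be computed as the sum of the correctly predicted samples of each category divided by the total number of samples. The sketch below assumes a confusion matrix with rows as true classes and columns as predicted classes. The counts are hypothetical and are not results from this study.

```python
def accuracy_from_confusion(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical confusion matrix for n = 3 categories
confusion = [
    [90, 5, 5],   # true COVID-19
    [4, 92, 4],   # true Normal
    [6, 3, 91],   # true Viral Pneumonia
]

acc = accuracy_from_confusion(confusion)
print(f"{acc:.4f}")  # 0.9100
```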

Dataset Description
The image data were collected from open-source data that is publicly accessible. The three types of CXR images, namely COVID-19, Normal, and Viral Pneumonia, were all obtained from Kaggle (https://www.kaggle.com/tawsifurrahman/covid19radiography-database), which provides a large number of CXR images of each type. The dataset used in the study consisted of 3616 COVID-19 CXRs, 1345 Pneumonia CXRs, and 3993 Normal CXRs, for a total of 8954 images in PNG format. In this study, the data were divided into 75% for training (6715 images) and 25% for testing (2239 images). The trained model was then evaluated on the test data to determine its accuracy.
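
The split sizes follow directly from the class counts. In the short check below, rounding the training size down is an assumption; it is the choice that reproduces the reported 6715 and 2239.

```python
# Class counts as reported for the dataset
class_counts = {"COVID-19": 3616, "Viral Pneumonia": 1345, "Normal": 3993}

total = sum(class_counts.values())
train_size = int(total * 0.75)   # assumption: fractional image is rounded down
test_size = total - train_size   # remaining images form the test set

print(total, train_size, test_size)  # 8954 6715 2239
```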

Pre-processing
In this research, the input images are first reshaped so that all inputs have a uniform size; the images are resized to 50x50. After resizing, the images are processed by the AlexNet architecture and by the YENS-Net architecture. The results are then compared with those for images that received two-step pre-processing: first the images are resized to 50x50, and then CLAHE is applied to them.

Result and Discussion
After several measurements, the best accuracy was obtained with 50 iterations (epochs) and the Adamax optimizer for every architecture (AlexNet, CLAHE + AlexNet, and YENS-Net). The results show no significant difference between using and not using CLAHE to pre-process the input images before they are passed to the architecture. The result can be seen in the figure below.

Best Model
After finding the best parameters for our model, CNN AlexNet has a slight advantage over CLAHE + AlexNet. The best accuracy obtained is 91.11%, with a precision of 90%, a recall of 90%, and an F1-score of 90%.
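
Precision, recall, and F1-score can each be derived from a confusion matrix. The sketch below is a minimal illustration that assumes macro averaging over the three classes, since the averaging scheme is not stated here. The confusion matrix is hypothetical (rows are true classes, columns are predicted classes) and does not contain the study's actual counts.

```python
def per_class_metrics(matrix):
    """Return a (precision, recall, f1) tuple for every class."""
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[i][k] for i in range(n)) - tp  # predicted k, but wrong
        fn = sum(matrix[k]) - tp                       # true k, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results

def macro_average(results):
    """Unweighted mean of each metric over the classes."""
    n = len(results)
    return tuple(sum(r[j] for r in results) / n for j in range(3))

# Hypothetical 3-class confusion matrix
confusion = [
    [90, 5, 5],
    [4, 92, 4],
    [6, 3, 91],
]

macro_precision, macro_recall, macro_f1 = macro_average(per_class_metrics(confusion))
```

With macro averaging every class contributes equally regardless of its size; a weighted average would instead favour the larger classes.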

Conclusion and Future Work
This research applied two different CNN architectures, with CLAHE added in pre-processing, to pneumonia detection. The authors first studied the use of CLAHE to enhance image quality for detecting and classifying pneumonia. They then determined the best parameters for each scenario on the same database and compared the scenarios: CNN AlexNet, CLAHE + CNN AlexNet, and YENS-Net, a self-made CNN architecture. The results show that using CLAHE makes no significant difference to the accuracy. The best accuracy, 91.11%, was obtained by AlexNet. Since COVID-19 is still raging all over the world, future studies can continue to work on detecting this disease precisely, so that deterioration of the patient's condition can be avoided.