Abstract:
Lung cancer is among the fastest-growing diseases worldwide. It can be cured if it is treated in its initial stage, so early identification is essential for diagnosis. The stage of lung disease can be predicted using image processing techniques. The proposed algorithm segments the lungs using the Dual-Tree Complex Wavelet Transform (DT-CWT). Features are then extracted from the segmented lungs using the Gray Level Co-occurrence Matrix (GLCM). Prediction is performed by a classifier, which in the proposed work is a Back Propagation Neural Network. The neural network predicts the class of the tumor level. This prediction indicates the early stage of cancer and supports treatment of the cancerous region in the human body. Once a tumor is detected, the affected part is identified using morphological operations, namely filling, dilation, opening and closing. This technique gives the total affected area and the region of the lung that is affected. The proposed work achieves an accuracy of 90%.
KEYWORDS: Dual-Tree Complex Wavelet Transform (DT-CWT), Gray Level Co-occurrence Matrix (GLCM), Back Propagation Neural Network.
Introduction
Image processing is a widely used technique for predicting disease in living beings. After heart disease, cancer is one of the most common diseases in the world, and lung cancer is the single largest cause of cancer-related deaths [1]. The symptoms of lung cancer usually come to light only at the final stage, so the disease is very difficult to identify in its early stage. For this reason, the death rate for lung cancer is very high in comparison with all other types of cancer. The two kinds of lung cancer, which develop and spread in different ways, are small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) [2]. Prediction and diagnosis of lung cancer are mainly done using computed tomography (CT) images. Disease should be detected in the patient as soon as possible, especially in the case of tumors [3]. In the proposed work, the database images were collected from the NCBI database, which contains CT scan images of different patients taken at various angles. Lung cancer can occur in all living beings, regardless of gender. According to the World Health Organization (WHO), lung cancer ranks first among cancer types, ahead of liver, gastric, colorectal, breast and esophageal cancer. According to the National Cancer Institute (NCI), in the US 159,260 deaths among 224,210 cases are due to lung cancer alone. In India, 87% of male and 85% of female lung cancer patients are affected due to smoking. The main causes of lung cancer are cigarette smoking, carcinogenic environments such as exposure to radioactive gas, and air pollution. Based on histopathology, lung cancers are of two types, SCLC and NSCLC, and around 80% of lung cancer deaths are due to the NSCLC type [4].
The main aim of the proposed work is to detect lung cancer from CT scan images. This would increase the accuracy of the prediction process. The paper is organized as follows. Section II reviews the existing approaches for lung cancer detection along with their results and future scope. Section III describes the methodology of the proposed system. Section IV discusses the results of the proposed approach, and Section V concludes the paper with the analysis and findings from the approach.
Section II
Literature Survey
Considerable work has been carried out on lung cancer detection and prediction. A detailed survey of lung cancer classification using the Support Vector Machine (SVM) is presented in [6], where cases are classified as normal, benign or malignant tumors. Segmentation techniques for lung cancer detection in CT scan images are presented in [7]. Classification using an Artificial Neural Network, the back propagation technique and multilayer perceptrons is covered in [8]. Classification of lung cancer using Artificial Neural Networks and Fuzzy Clustering Methods is investigated in [9]. Almas Pathan and Bairu Saptalkar used a Neural Network to predict lung cancer from a dataset of X-ray images [10]. Researchers in Zagreb, Croatia studied the classification of asthma and chronic obstructive pulmonary disease (COPD) using fuzzy rules with training phases [11]. The proposed work is based on segmentation using the Dual-Tree Complex Wavelet Transform (DT-CWT) and classification using Back Propagation Neural Networks.
Section III
Methodology
A. Database
The input images are collected from the Cancer Archive database, which contains the records of a large number of lung cancer patients with tumors at different locations in the lungs. Around 100 images from the collected database are used for training, and the testing phase is carried out using 50 database images. The CT scan images, which include normal as well as abnormal lung images, are shown in Figure 1(a) and 1(b).
Figure 1 (a) Normal lung, (b) Abnormal lung
B. Methods
The detection process begins with a pre-processing stage, which consists of grayscale conversion and filtering. The input CT scan image is converted into a grayscale image whose pixel values range from 0 to 255. Filtering is carried out using a Gabor filter to remove noise from the input image. An FFT algorithm, which computes the Fourier transform of the image, is also used for filtering in the proposed work. Segmentation is carried out using the Dual-Tree Complex Wavelet Transform (DT-CWT) to find the region of interest. Features of the extracted region are calculated using the Gray Level Co-occurrence Matrix (GLCM); the features extracted from the GLCM include homogeneity, energy, contrast, correlation and variance. The GLCM features are then used in the training phase of a neural network, which in the proposed work is the Back Propagation Neural Network. Using the Back Propagation Neural Network increases the accuracy of the proposed algorithm, and the training time is shorter than with other neural networks. The step-by-step process is shown in Figure 2.
Figure 2 Proposed work
C. Results and Discussion
The proposed method comprises four major processing blocks: image pre-processing, image segmentation, feature extraction and classification.
1. Image Pre-processing
The pre-processing stage takes the input CT scan image from the collected database, as shown in Figure 4(a). The CT scan images are regularized to a size of 207 x 207 pixels. The stage includes grayscale conversion, which converts the input CT scan image into the grayscale range, as shown in Figure 4(b).
Figure 4 (a) Original image, (b) Gray-scale image
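The grayscale conversion step can be illustrated with a minimal NumPy sketch. The paper does not state which conversion formula is used, so the luminance weights below (the common ITU-R BT.601 values), the function name `to_grayscale` and the random stand-in image are all assumptions made for illustration.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to an 8-bit grayscale image (0 to 255)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Assumed luminance weights (ITU-R BT.601); the paper does not name a formula.
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# Hypothetical 207 x 207 colour input standing in for a CT slice.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(207, 207, 3), dtype=np.uint8)
gray = to_grayscale(image)
print(gray.shape, gray.dtype)
```

Rounding and clipping before the cast keep every output pixel inside the 0 to 255 range described in the text.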
The color conversion is followed by a filtering process, which consists of Gabor filtering and an FFT filtering algorithm. The Gabor filter is used to remove noise from the grayscale image, as shown in Figure 4(c).
Figure 4(c) Gabor filtered image
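A Gabor filter of the kind mentioned above can be sketched as follows. The paper gives no filter parameters, so the kernel size, sigma, orientation, wavelength and aspect ratio are illustrative assumptions, and `gabor_kernel` and `filter_image` are hypothetical helper names. The filtering is done as a circular convolution through the FFT to keep the example short.

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, wavelength=8.0, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor kernel; all default values are illustrative."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotate the coordinate system by theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier

def filter_image(gray, kernel):
    """Circular convolution of an image with a kernel, computed via the FFT."""
    gray = np.asarray(gray, dtype=np.float64)
    padded = np.zeros_like(gray)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Shift the kernel so that its centre sits at index (0, 0).
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(gray) * np.fft.fft2(padded)))

# Hypothetical grayscale input standing in for the image of Figure 4(b).
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(207, 207)).astype(np.float64)
filtered = filter_image(gray, gabor_kernel())
print(filtered.shape)
```

In practice several orientations (values of `theta`) are often combined; a single orientation is used here only to keep the sketch small.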
The FFT filtering algorithm performs noise removal using the Fast Fourier Transform. The pre-processing stage is followed by the image segmentation process.
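One common way to remove noise with the Fast Fourier Transform is a frequency-domain low-pass filter. The sketch below assumes an ideal circular low-pass mask; the paper does not specify the filter shape or cutoff, so the `cutoff` value and the function name `fft_lowpass` are assumptions.

```python
import numpy as np

def fft_lowpass(gray, cutoff=0.25):
    """Suppress spatial frequencies above `cutoff` (cycles per pixel, at most 0.5 per axis)."""
    gray = np.asarray(gray, dtype=np.float64)
    spectrum = np.fft.fft2(gray)
    # Frequency coordinates of every FFT bin along each axis.
    fy = np.fft.fftfreq(gray.shape[0])[:, None]
    fx = np.fft.fftfreq(gray.shape[1])[None, :]
    mask = np.sqrt(fx ** 2 + fy ** 2) <= cutoff   # assumed ideal low-pass mask
    return np.real(np.fft.ifft2(spectrum * mask))

# Hypothetical noisy image: a constant background plus random noise.
rng = np.random.default_rng(0)
noisy = 100.0 + rng.normal(0.0, 10.0, size=(64, 64))
smoothed = fft_lowpass(noisy)
print(noisy.std(), smoothed.std())
```

A smooth mask (for example Gaussian) would reduce ringing artifacts compared with the hard cutoff used here.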
2. Image Segmentation
Image segmentation is used to locate the cancer-affected region. The affected area is calculated using the Dual-Tree Complex Wavelet Transform (DT-CWT). The DT-CWT is computed as a complex transform [12] built from two separate Discrete Wavelet Transform decomposition trees. The DT-CWT removes the fissures and detects the cancer-affected region. The segmented part of the grayscale image then undergoes the feature extraction process. The segmented tumor image is shown in Figure 5.
Figure 5 segmented tumor image
3. Feature Extraction
The feature extraction process of the proposed method uses the GLCM algorithm. The GLCM features are defined by the following equations.
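Contrast = $\sum_{i,j} (i - j)^2 \, p(i,j)$ (1)
Energy = $\sum_{i,j} p(i,j)^2$ (2)
Entropy = $-\sum_{i,j} p(i,j) \log p(i,j)$ (3)
Here $p(i,j)$ is the normalized GLCM entry for the gray-level pair $(i, j)$.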
Usually the angles used are 0°, 45°, 90° and 135° [13]. The GLCM features used in the proposed work are energy, contrast and entropy.
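The three features can be computed from a co-occurrence matrix as in the sketch below. It uses the standard textbook definitions of contrast, energy and entropy; the pixel distance of 1, the quantization to 8 gray levels, the symmetric counting, the base-2 logarithm and the helper names `glcm` and `glcm_features` are assumptions, since the paper does not state them.

```python
import numpy as np

# Pixel offsets (row, column) for the four usual directions at distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(gray, angle, levels=8):
    """Normalized, symmetric co-occurrence matrix for one direction."""
    gray = np.asarray(gray)
    q = (gray.astype(np.int64) * levels) // 256   # quantize 0..255 into `levels` bins
    dr, dc = OFFSETS[angle]
    rows, cols = q.shape
    counts = np.zeros((levels, levels), dtype=np.float64)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[q[r, c], q[r2, c2]] += 1
    counts += counts.T                            # count each pair in both directions
    return counts / counts.sum()

def glcm_features(p):
    """Contrast, energy and entropy of a normalized co-occurrence matrix."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]                            # skip empty cells to avoid log(0)
    return {
        "contrast": float(np.sum((i - j) ** 2 * p)),
        "energy": float(np.sum(p ** 2)),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }

# Hypothetical region of interest standing in for a segmented tumor image.
rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
for angle in OFFSETS:
    print(angle, glcm_features(glcm(roi, angle)))
```

Some libraries define energy as the square root of the sum used here, so feature values should only be compared between implementations that share the same definition.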
D. Classification
The extracted features are passed to the classifier for training. The classifier used in the proposed work is the Back Propagation Neural Network. The classification process consists of a training phase and a testing phase. Training uses the Back Propagation Neural Network with a distinct target value for each class: normal lung images are assigned the value 1 and abnormal or defective lung images the value 0. Testing uses the initial parameters and weights of the neural network, with a learning rate of 0.3, a hidden layer value of 20 and 1000 epochs. These settings increase the accuracy and specificity of the proposed work. The training and validation percentage in the confusion matrix is 90%. The classification is shown in Figure 6.
The performance is measured by calculating the accuracy, sensitivity and specificity of the proposed algorithm, as shown in Figure 7.
Section IV
Conclusion
Various image processing stages have been carried out to detect lung cancer and the area affected by it. The testing phase is carried out with 50 database CT images and the training phase with 50% of the database CT images. The proposed work achieves an accuracy of 90%. To obtain higher accuracy, further research is needed to improve the pre-processing, image segmentation, feature extraction and learning processes.
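As an illustration of the classification stage and its performance measures, the following sketch trains a small back propagation network and computes accuracy, sensitivity and specificity. The learning rate (0.3) and epoch count (1000) come from the text; reading "hidden layer value 20" as 20 hidden units is an assumption, as are the sigmoid activations, the squared-error loss, the synthetic three-feature data and all function names. It is not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bpnn(X, y, hidden=20, lr=0.3, epochs=1000, seed=0):
    """Full-batch gradient descent on a one-hidden-layer sigmoid network."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    t = y.reshape(-1, 1).astype(np.float64)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)               # forward pass
        out = sigmoid(h @ W2 + b2)
        d_out = (out - t) * out * (1 - out)    # squared-error gradient at the output
        d_h = (d_out @ W2.T) * h * (1 - h)     # error propagated back to the hidden layer
        W2 -= lr * (h.T @ d_out) / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_h) / len(X)
        b1 -= lr * d_h.mean(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    out = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    return (out >= 0.5).astype(int).ravel()

def performance(y_true, y_pred, positive=0):
    """Accuracy, sensitivity and specificity; `positive` is the abnormal class (0 in the paper)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == positive) & (y_pred == positive))
    tn = np.sum((y_true != positive) & (y_pred != positive))
    fp = np.sum((y_true != positive) & (y_pred == positive))
    fn = np.sum((y_true == positive) & (y_pred != positive))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Synthetic stand-in for GLCM feature vectors: label 1 = normal, label 0 = abnormal.
rng = np.random.default_rng(1)
normal = rng.normal(loc=1.0, scale=0.3, size=(50, 3))
abnormal = rng.normal(loc=-1.0, scale=0.3, size=(50, 3))
X = np.vstack([normal, abnormal])
y = np.array([1] * 50 + [0] * 50)
params = train_bpnn(X, y)
print(performance(y, predict(params, X)))
```

The metrics here are computed on the training data only to keep the sketch short; a held-out test set, as described in the paper, is needed for a meaningful accuracy figure.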