Year: 2020 | Volume: 16 | Issue: 1 | Page: 40-52
Detecting anomalous growth of skin lesion using threshold-based segmentation algorithm and Fuzzy K-Nearest Neighbor classifier
S Sivaraj, R Malmathanraj, P Palanisamy
Department of Electronics and Communication Engineering, National Institute of Technology, Tiruchirappalli, Tamil Nadu, India
Date of Submission: 29-Mar-2017
Date of Decision: 01-Oct-2017
Date of Acceptance: 25-Feb-2018
Date of Web Publication: 26-Oct-2018
Department of Electronics and Communication Engineering, National Institute of Technology, Tiruchirappalli - 620 015, Tamil Nadu
Source of Support: None, Conflict of Interest: None
Context: Skin cancer is a complex and life-threatening disease caused primarily by genetic instability and the accumulation of multiple molecular alterations.
Aim: There is currently great interest in using image processing to provide quantitative information about a skin lesion that is relevant to clinical assessment and can also serve as a stand-alone warning tool.
Setting and Design: To recognize skin cancer reliably without performing unnecessary skin biopsies, this article presents a new hybrid technique for the classification of skin images using the Firefly algorithm with a K-Nearest Neighbor classifier (FKNN).
Materials and Methods: The FKNN classifier is used to predict and classify skin cancer, along with threshold-based segmentation and ABCD feature extraction. Image preprocessing and feature extraction techniques are mandatory for any image-based application.
Statistical Analysis Used: Initially, it is essential to eliminate the illumination variation and other unwanted shadow areas present in the skin image; this preprocessing is done by homomorphic filtering.
Results: The proposed method is compared with other existing methods, and the obtained results are discussed comprehensively.
Conclusion: The proposed FKNN provides quantitative information about a skin lesion through a hybrid of KNN and firefly optimization, which helps recognize skin cancer more efficiently than other techniques, with low computational complexity and time.
Keywords: ABCD features, Fuzzy K-Nearest Neighbor classifier, homomorphic filtering, preprocessing, red, green, and blue to grayscale conversion, skin cancer, threshold-based segmentation
How to cite this article:
Sivaraj S, Malmathanraj R, Palanisamy P. Detecting anomalous growth of skin lesion using threshold-based segmentation algorithm and Fuzzy K-Nearest Neighbor classifier. J Can Res Ther 2020;16:40-52
How to cite this URL:
Sivaraj S, Malmathanraj R, Palanisamy P. Detecting anomalous growth of skin lesion using threshold-based segmentation algorithm and Fuzzy K-Nearest Neighbor classifier. J Can Res Ther [serial online] 2020 [cited 2020 Jun 6];16:40-52. Available from: http://www.cancerjournal.net/text.asp?2020/16/1/40/244209
Introduction
Skin cancer, which is caused by the uncontrolled growth of abnormal cells in the body, is one of the major causes of death worldwide. Detecting skin cancer is difficult because of the confusing appearance of a wide variety of skin lesions. There are three commonly known types of skin cancer: malignant melanoma, basal cell carcinoma, and squamous cell carcinoma. Malignant melanoma, the deadliest form of skin cancer, is among the most rapidly increasing cancers in the world. Image processing techniques are widely used for the detection of skin cancer. If melanoma is diagnosed and treated in its initial stages, it can be cured; if the diagnosis comes late, however, melanoma can grow deeper into the skin and spread to other parts of the body. Advances in dermoscopy technology have contributed significantly to improved detection and survival rates. With the development of sophisticated computer-aided image analysis technologies, dermatologists have shown great interest in computer-aided diagnosis (CAD) software to assist in diagnosing the malignancy of skin lesions.
The fundamental steps in the computerized/automatic identification of skin cancer are detection and segmentation of lesions that might correspond to malignant melanoma or other types of skin cancer. A few segmentation schemes applied to images of pigmented skin lesions (PSLs), a large portion of them dealing with the segmentation of clinical images, have been proposed. For segmenting skin lesions, some segmentation techniques such as region-based segmentation, watershed segmentation, level set segmentation, texture-based segmentation, histogram thresholding-based segmentation, clustering-based segmentation, edge-based segmentation, morphological segmentation, model-based segmentation, soft computing segmentation, and active contour segmentation are used.
After segmentation, one of the essential techniques for analyzing skin cancer images is the feature extraction process. Feature extraction strategies are broadly used to reduce the dimensionality of data and to enhance the discriminatory information. Features such as border abruptness, asymmetry, the number of colors, and the number of structural components present in the lesion must be extracted. The extracted features are based on the geometry, colors, and texture of the lesions and involve complex image processing techniques. They are used for the classification of the skin lesions in the abnormal image. Several classifiers are used to locate and classify the regions from the input image. In addition, for recognition of diseases from skin images, popular feature descriptors such as local binary pattern, gray-level co-occurrence matrix (GLCM), discrete cosine transform, and discrete Fourier transform are used with support vector machine-based classifiers. It is essential to detect the irregular border of the lesion as a result of classification. Hence, several research studies have been carried out to improve classification accuracy and to detect the boundary irregularity of the lesion.
In this article, skin cancer lesions are distinguished in the digital image using a threshold-based segmentation technique, which separates the lesion-affected region from normal skin and indicates whether a skin cancer-affected area is present in the given image. The result is then classified by an effective classifier, for which a hybrid classifier is proposed in this article. This article is organized as follows: Section 2 gives some of the recent research works done for detecting skin lesions. The proposed methodology and its detailed explanation are presented in Section 3. The simulation results and the performance comparison of the proposed method with other traditional methods are presented in Section 4, followed by the conclusion in Section 5.
Some of the works related to the skin cancer detection are described as follows:
Sujitha et al. have implemented a noninvasive real-time automated skin lesion analysis system for early detection and prevention of melanoma. The first component was a real-time alert to help users avoid skin burn caused by sunlight. A novel equation to compute the time for skin to burn was introduced for this purpose. The second component was an automated image analysis module comprising image acquisition, hair detection and exclusion, lesion segmentation, feature extraction, and classification. The user could capture images of skin moles, and the image processing module classified the category into which the moles fall, such as atypical or melanoma. An alert would be given to the user to seek medical assistance if the mole belonged to the atypical or melanoma category.
Shivangi and Pise have developed a combined segmentation approach for melanoma skin cancer diagnosis. The images of the skin lesions are collected. Preprocessing is performed using different filters that remove noise and other unwanted artifacts such as hair and sweat bubbles from the image. Segmentation is an important process that differentiates the affected part from the background skin. From the segmented image, the necessary features are extracted. The postprocessing step is then carried out on the image. Using the extracted features, classification is done with soft computing techniques such as fuzzy logic or artificial neural networks. The most important features of the affected skin lesion were ABCD (A: Asymmetry, B: Border, C: Color, D: Diameter), the conventional technique for melanoma detection. This strategy is used because the lesions have asymmetrical shapes, irregular borders, varying color composition, and different diameters. By extracting these features, melanoma and benign skin lesions can be classified easily and accurately.
Bhowmik et al . have executed a computer-supported melanoma skin cancer detection utilizing image processing. The input for the system was the image of the skin lesion which is suspected to be a melanoma lesion. This image was then preprocessed to enhance the image quality. The automatic thresholding process and edge detection are utilized for image segmentation. The segmented image is given to the feature extraction block which consists of lesion region investigation for its geometrical features and ABCD features. The geometrical features are proposed since they were the most conspicuous features of the skin cancer lesion.
Sigurdsson et al. have proposed Raman spectroscopy for medical diagnostics, ranging from in vitro biofluid assays to in vivo cancer detection. The use of excitation lasers with wavelengths in the visible and near-infrared regions permits efficient coupling of Raman spectrometers with optical microscopes. Such Raman microspectrometers allow mapping of the molecular properties of samples with diffraction-limited spatial resolution. The most common technique for acquiring Raman spectral images is raster scanning, in which the laser spot is scanned across the specimen and a uni- or multivariate spectral model is then applied to each Raman spectrum. Raman spectral imaging based on line mapping (i.e., the laser beam is extended to form a line spot on the specimen surface) can considerably decrease the imaging time compared with single-point raster scanning, by up to a factor equal to the number of sampling points measured simultaneously, provided that the laser power used for single-point mapping is maintained over the entire laser line.
Schmid and Philippe have described segmentation of digitized dermatoscopic images by two-dimensional (2D) color clustering. The primary processing step was the segmentation of the images based on color or texture information. Texture examination was important for the investigation of PSLs since the presence of specific pigmented structures may influence the determination significantly. They were important features for the analysis of malignant melanoma and different types of skin cancer. Since color also conveys significant information, a color-based segmentation scheme appeared to be more suited as a first processing step. It should take into consideration the isolation of the lesion from the sound skin and for the separation between different homogeneously colored regions.
Glaister et al. have implemented a multistage illumination modeling algorithm to correct illumination variation in skin lesion photographs. The first stage was to compute an initial estimate of the illumination map of the photograph using a Monte Carlo nonparametric modeling technique. The second stage was to obtain a final estimate of the illumination map through a parametric modeling strategy, where the initial nonparametric estimate was used as a prior. Finally, the corrected photograph was obtained using the final estimated illumination map.
Jeniva and Santhi have proposed a lesion segmentation algorithm based on the concept of texture distinctiveness to identify skin lesions in photographs. The distinct nonlesion zones happened because of common pigmentation and texture characteristics of the skin are highlighted and classification can be done using the Texture Distinctiveness Lesion Segmentation algorithm. The representative texture distribution and texture distinctiveness were highlighted using texture map that gives the dissimilarity between a pair of texture distribution. The texture distribution image was divided into a large number of small regions using Statistical Region Merging algorithm. The regions were classified as normal and lesion skin based on the result of Otsu's thresholding.
Glaister et al . have implemented a texture-based skin lesion segmentation algorithm. A set of representative texture distributions were learned from illumination-corrected photograph and texture distinctiveness metric was calculated for each distribution. Next, based on the occurrence of representative texture distributions, regions in the image were classified as normal skin or lesion. The proposed segmentation framework was evaluated by comparing lesion segmentation, melanoma classification, and the results of other state-of-the-art algorithm. The framework has higher segmentation precision contrasted with all other tested algorithms.
Korotkov and Garcia have presented a review of automated systems for recognizing skin cancer from images acquired in vivo, covering microscopic (dermoscopic) and macroscopic (clinical) images of PSLs. The review aims to: (1) give an extensive introduction and clarify ambiguities in the literature and (2) categorize and group relevant references so as to simplify literature searches on a specific subtopic. The existing literature was classified by the nature of publication (clinical or computer vision articles) and by whether it addresses individual or multiple PSL image analysis. The importance of the difference in content between dermoscopic and clinical images is highlighted.
Various methodologies for actualizing PSL CAD systems and their standard work process components are reviewed and summary tables are provided. An extended categorization of PSL feature descriptors was also proposed, associating them with the specific methods for diagnosing melanoma and then separating images with two modalities.
Yingding and Xie have suggested a method of automatic skin lesion segmentation based on texture analysis and supervised learning. First, the training images were grouped into homogeneous regions using mean-shift clustering; next, fused texture features were extracted from each clustered region using Gabor and GLCM features; then, the classifier model was produced through supervised learning based on the Library for Support Vector Machines (LIBSVM); finally, lesion regions of an unseen image were predicted automatically by the trained classifier. The method was compared with three state-of-the-art methods, and the results demonstrated that it achieves both robust and accurate lesion segmentation in dermoscopy images.
Detecting anomalous growth of skin lesion using proposed method
A melanoma is a cancer that begins in skin cells called melanocytes, which are normally responsible for producing the skin pigment melanin. If left untreated, a melanoma can spread and cause cancers in other organs of the body. It is important to differentiate normal lesions from abnormal lesions accurately. Detection of skin cancer is essential for reducing the death rate and the severity of the disease. However, accurate detection of skin cancer from digital images is a big challenge. Because of illumination variation, digital images may contain shadows and bright areas, which makes segmenting digital images of skin lesions more difficult. Special segmentation algorithms are required to take illumination variation into account; otherwise, segmentation algorithms tend to identify areas with shadows as part of the skin lesion. In addition, using such special algorithms increases the complexity and time of segmentation. Hence, preprocessing is required, which corrects illumination variation and removes the shadows and bright areas from the input digital image. Before features are extracted from the skin lesion and the lesion is classified as malignant or benign, the location of the lesion border must be identified using a segmentation algorithm. It is important that the skin lesion segmentation is accurate, as the resulting segmentation is used as input to the feature extraction and classification algorithms. Several existing methods for segmentation and classification are based mainly on the color of the skin. Some of the existing segmentation techniques are region-based, edge-based, feature-based clustering, and model-based segmentation. All of these segmentation techniques fail to produce accurate results when the image is noisy.
Similarly, the KNN classifier is one of the efficient classifiers used in past decades. It calculates the distance/similarity function for all neighbors and then chooses the k nearest neighbors for classification, which increases complexity and time. To overcome these problems, we propose a new hybrid Fuzzy K-Nearest Neighbor (FKNN) classifier that first selects k neighbors and then calculates the distance/similarity function only for the selected k neighbors. The block diagram of the proposed method explains the overall procedure for skin cancer segmentation and classification.
As given below in [Figure 1] the proposed methodology includes four main stages: preprocessing using homomorphic filtering, threshold-based segmentation, extraction of ABCD features, and classification and detection which is based on FKNN classifier. The quality of an image is to be enhanced by the removal of noise from an input image during the preprocessing phase. Next, the image gets converted from red, green, and blue (RGB) to gray which helps in image resizing. Based on the intensity values, the threshold-based segmentation is performed and the size of the image gets reduced. This makes the further processing steps to concentrate more on the area which is affected by skin cancer and reduce the number of unwanted computations. The features such as asymmetry, border irregularity, color variation, and diameter are extracted from the segmented image and these extracted features are given to the FKNN classifier, which characterizes the given input image as normal or abnormal.
Accurate detection of skin cancer from digital images is a big challenge due to illumination variation. Digital images may contain shadows and bright areas throughout the image because of this illumination variation. In our proposed method, a digital image is taken as input image to detect and classify the area affected by skin cancer. The digital image may contain some generic problems such as noise, shadows, and bright areas. Furthermore, if the skin lesion is covered by many hairs, the segmentation will be affected by the high gradient at the hairs so that segmenting digital images of skin lesions is a more difficult problem. Special segmentation algorithms are required to take into account illumination variation. These segmentation algorithms tend to identify areas with shadows as a part of the skin lesion because of illumination variation. Use of these algorithms increases the complexity and increases time in segmentation. Hence, preprocessing is required, which corrects illumination variation and removes the shadows and bright areas from the input digital image. In our proposed method, preprocessing is done using homomorphic filtering for illumination correction and noise removal.
Homomorphic filtering is a frequency-domain filtering process. In general, an image can be viewed as a 2D function of the form $I(x, y)$, whose value at spatial coordinates $(x, y)$ is a positive scalar quantity whose physical meaning is determined by the source of the image. For grayscale images created by a physical process, the values are proportional to the energy radiated by a physical source. The intensity is the product of illumination (the amount of source light incident on the scene being viewed) and reflectance. If we denote illumination as $f(x, y)$ and reflectance as $r(x, y)$, then an image $I(x, y)$ can be expressed as follows:
$$I(x, y) = r(x, y) \cdot f(x, y) \tag{1}$$
When the image is captured, illumination resulting from the lighting condition is present, and it changes when the lighting condition changes. Reflectance results from the way the objects in an image reflect light and is determined by the intrinsic properties of the objects themselves, which do not change. Illumination typically varies slowly across the image, whereas reflectance changes abruptly at object boundaries, so reducing the contribution of illumination with a high-pass filter enhances the reflectance. To do this, the two components in Equation 1 must be separated. Because the Fourier transform $\mathcal{F}$ is linear over sums but not over products, the multiplication in Equation 1 must first be converted into an addition, after which high-pass filtering becomes straightforward. An obvious way to do this is to take the natural logarithm of both sides of Equation 1.
$$z(x, y) = \ln I(x, y) = \ln[r(x, y) \cdot f(x, y)] = \ln r(x, y) + \ln f(x, y) \tag{2}$$
Taking the Fourier transform of Equation 2,
$$\mathcal{F}\{z(x, y)\} = \mathcal{F}\{\ln r(x, y)\} + \mathcal{F}\{\ln f(x, y)\} \tag{3}$$
$$Z(u, v) = F_r(u, v) + F_f(u, v) \tag{4}$$
where $F_r(u, v)$ and $F_f(u, v)$ are the Fourier transforms of $\ln r(x, y)$ and $\ln f(x, y)$, respectively. Now $Z(u, v)$ can be high-pass filtered by means of a filter function $H(u, v)$ in the frequency domain to obtain a filtered version $S(u, v)$,
$$S(u, v) = H(u, v) \cdot Z(u, v) = H(u, v) \cdot F_r(u, v) + H(u, v) \cdot F_f(u, v) \tag{5}$$
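Taking the inverse Fourier transform of Equation 5,
$$s(x, y) = \mathcal{F}^{-1}\{S(u, v)\} = \mathcal{F}^{-1}\{H(u, v) \cdot F_r(u, v)\} + \mathcal{F}^{-1}\{H(u, v) \cdot F_f(u, v)\} \tag{6}$$
A Butterworth high-pass filter is normally used for this purpose, which in its standard form is defined as,
$$H(u, v) = \frac{1}{1 + [D_0 / D(u, v)]^{2n}} \tag{7}$$
where $n$ defines the order of the filter, $D_0$ is the cutoff distance from the center, and $D(u, v)$ is given by,
$$D(u, v) = \left[(u - M/2)^2 + (v - N/2)^2\right]^{1/2} \tag{8}$$
where $M$ and $N$ are the number of rows and columns of the original image. Then, the desired filtered image $I'(x, y)$ can be obtained by the exponential operation,
$$I'(x, y) = e^{s(x, y)} \tag{9}$$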
The whole process performed in homomorphic filtering is shown in [Figure 2].
As a result of the above preprocessing steps, the illumination variations present in the input image get reduced and the enhancement of the contrast and quality of the image is achieved.
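The homomorphic filtering chain described above (logarithm, Fourier transform, high-pass filtering, inverse transform, exponential) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `homomorphic_filter`, the cutoff `d0 = 30`, the filter order `2`, and the `+ 1.0` offset inside the logarithm are assumptions introduced here, and the filter is the standard Butterworth high-pass form.

```python
import numpy as np

def homomorphic_filter(gray, d0=30.0, order=2):
    """Illustrative homomorphic filter for a 2D grayscale array.

    d0 (cutoff distance) and order are assumed example values,
    not parameters reported in the article.
    """
    img = np.asarray(gray, dtype=np.float64)
    rows, cols = img.shape

    # Logarithm turns the product of reflectance and illumination into a sum.
    # The +1.0 offset is an assumption added here to avoid log(0) on black pixels.
    z = np.log(img + 1.0)

    # Fourier transform, shifted so the zero frequency sits at the center.
    Z = np.fft.fftshift(np.fft.fft2(z))

    # Distance of every frequency sample from the center (M/2, N/2).
    u = np.arange(rows).reshape(-1, 1) - rows / 2
    v = np.arange(cols).reshape(1, -1) - cols / 2
    D = np.sqrt(u ** 2 + v ** 2)

    # Standard Butterworth high-pass filter; the lower bound on D avoids
    # a division by zero at the center frequency.
    H = 1.0 / (1.0 + (d0 / np.maximum(D, 1e-9)) ** (2 * order))

    # Filter in the frequency domain, then return to the spatial domain.
    S = H * Z
    s = np.real(np.fft.ifft2(np.fft.ifftshift(S)))

    # Exponential undoes the logarithm.
    return np.exp(s)
```

Because a pure high-pass filter removes the mean of the log image, the output values cluster around 1; in practice the result would typically be rescaled to the display range before segmentation.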
Image segmentation is the fundamental step in analyzing images and extracting data from them, and it directly influences how well the image is understood. The location of the lesion border must be identified with a segmentation algorithm before features are extracted from the skin lesion and the lesion is classified as malignant or benign. It is important that the skin lesion segmentation is accurate, because the segmentation result is used as input to the feature extraction and classification algorithms. In our proposed method, image segmentation is used to partition the input image at the pixel level. Performing feature extraction and classification for each and every pixel improves the accuracy of lesion detection but increases the detection time. Hence, RGB to gray conversion of the image is performed first. Using a threshold-based segmentation algorithm, pixels with values less than the threshold value are considered to be unaffected by skin cancer and are neglected. As a result of segmentation, we can detect whether any area affected by skin cancer is present in the given image. In short, our proposed segmentation method can be explained by the following steps.
- RGB to gray conversion
- Image resizing and segmenting based on intensity and threshold values.
Red, green, and blue to gray conversion
In the RGB color model, each color appears in its primary spectral components of red, green, and blue. The color of a pixel is made up of three components (R, G, and B), described by their corresponding intensities. Color components are also called color channels or color planes. A color image can be represented by the intensity function in the RGB color model.
$$I'(x, y)_{R,G,B} = (F_R, F_G, F_B) \tag{10}$$
where $F_R(x, y)$ is the intensity of the pixel $(x, y)$ in the red channel, $F_G(x, y)$ is the intensity of the pixel $(x, y)$ in the green channel, and $F_B(x, y)$ is the intensity of the pixel $(x, y)$ in the blue channel. The intensity of every color channel is stored using eight bits, so the quantization level is 256. That is, a pixel in a color image requires a total storage of 24 bits, which can express $2^{24} = 256 \times 256 \times 256 = 16777216$ distinct colors. This number of colors is sufficient for the display of general images. Such images may be called true color images, where the data of every pixel are stored using 24 bits.
The color images are transformed into grayscale images if only the brightness information is needed. The grayscale transformation can be done using the following equation:
$$I_y(x, y) = 0.333\,F_r + 0.5\,F_g + 0.1666\,F_b \tag{11}$$
where $F_r$, $F_g$, and $F_b$ are the intensities of the R, G, and B components, respectively, and $I_y$ is the intensity of the equivalent gray-level image of the RGB image.
As an example, suppose the R, G, and B components of pixel (1, 1) have intensities of 187, 179, and 176, respectively. When the RGB image is converted into a gray image, the intensity of pixel (1, 1) is calculated by substituting these values into the above transformation.
$$I_y(1, 1) = 0.333 \times 187 + 0.5 \times 179 + 0.1666 \times 176 \approx 181.09 \tag{12}$$
In this way, we can calculate all the gray-level values by using the above transformation.
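The per-pixel transformation of Equation 11 extends to a whole image as a weighted sum over the color axis. The following is a minimal sketch under the assumption that the image is held as a NumPy array of shape (height, width, 3) in R, G, B order; the names `WEIGHTS` and `rgb_to_gray` are illustrative and do not come from the article.

```python
import numpy as np

# Channel weights taken from Equation 11 (R, G, B order).
WEIGHTS = np.array([0.333, 0.5, 0.1666])

def rgb_to_gray(rgb):
    """Convert an (H, W, 3) RGB array to an (H, W) gray array using Equation 11."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # The matrix product applies the weighted sum to the last (channel) axis.
    return rgb @ WEIGHTS

# A one-pixel image holding the example intensities used with Equation 11.
pixel = np.array([[[187, 179, 176]]])
gray = rgb_to_gray(pixel)
```

Here `gray[0, 0]` holds the gray level of the single example pixel.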
Image resizing and segmenting
After RGB to grayscale conversion, the size of the image is reduced based on the threshold value. In general, pixels containing skin lesions and pixels containing normal skin differ in their intensity values. Based on this concept, a threshold value is fixed at the start of the segmentation process. After RGB to grayscale conversion, intensity values that are less than the fixed threshold value are set to zero, and those pixels are excluded from further processing steps. This reduces the size of the input image, avoids unwanted calculations, and reduces the processing time. The input image is an image of a particular area of skin that contains both areas affected and areas unaffected by skin cancer. Finally, the resized image is segmented.
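A minimal sketch of this thresholding step is given below. It follows the rule stated in the text (gray levels below the threshold are set to zero and ignored), while the interpretation of "resizing" as cropping to the bounding box of the retained pixels is an assumption made here for illustration, as are the function name `threshold_segment` and the example threshold of 128.

```python
import numpy as np

def threshold_segment(gray, threshold):
    """Zero out pixels below `threshold` and crop to the retained region.

    Cropping to the bounding box is an assumed reading of the resizing
    step; the article does not specify how the image is resized.
    """
    gray = np.asarray(gray, dtype=np.float64)

    # Pixels below the threshold are treated as unaffected and set to zero.
    mask = gray >= threshold
    segmented = np.where(mask, gray, 0.0)

    if not mask.any():
        # Nothing passed the threshold: no candidate region in this image.
        return np.zeros((0, 0)), mask

    # Bounding box of the retained pixels.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    cropped = segmented[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    return cropped, mask

example = np.array([
    [10, 10, 10, 10],
    [10, 200, 220, 10],
    [10, 10, 210, 10],
    [10, 10, 10, 10],
])
cropped, mask = threshold_segment(example, threshold=128)
```

An empty result signals that no region passed the threshold, which corresponds to the case where no affected area is detected in the image.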
Extraction of ABCD features
After segmentation, feature extraction is the important step in detecting skin lesions. The segmented image is given as input to feature extraction. Several existing methods are based mainly on the color of the skin. This leads to inaccurate detection because the skin may be affected by other external injuries that look like skin cancer lesions. Image feature extraction for the diagnosis of skin cancer requires detection and localization of the lesion in an image. The common features extracted from an image are asymmetry, border irregularity, color variation, and diameter (ABCD features). The asymmetry feature is extracted to check the degree of symmetry. Border irregularity is calculated with measures such as the compact index (CI), fractal index, and edge abruptness. Color variation is one of the early signs of skin cancer, since skin cancer cells are often colored brown or black, depending on the production of the melanin pigment at different depths in the skin. A skin cancer lesion tends to grow larger than common moles and in an irregular shape. By measuring the diameter across the lesion, skin lesions can be identified.
An important feature for skin cancer lesion detection is asymmetry. In general, the shape of the skin cancer lesion is irregular. To detect the boundary of the skin cancer lesion, the asymmetry index is calculated. Asymmetry index is calculated using the following equation:
where $AI_{I(x,y)}$ is the asymmetry index, $A(I_{x,y})$ is the area of the total image $I(x, y)$, and $\Delta A(I_{x,y})$ is the area difference between the total image and the lesion area.
The border irregularity can be calculated by different measures such as CI, fractal index, and edge abruptness.
The compact index (CI) is a measure of border shape for 2D objects. However, the CI is very sensitive to noise along the boundary. It is determined by the following equation:
where $P_L(I_{x,y})$ is the perimeter of the lesion in image $I(x, y)$ and $A_L(I_{x,y})$ is the area of the lesion in image $I(x, y)$.
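For illustration, the sketch below computes a compact index from a lesion perimeter and area. It assumes the commonly used definition CI = P² / (4πA), which equals 1 for a perfect circle and grows as the border becomes more irregular; this may differ from the exact form used in the article. The function name `compact_index` is hypothetical.

```python
import math

def compact_index(perimeter, area):
    """Compact index under the assumed definition P^2 / (4 * pi * A)."""
    if area <= 0:
        raise ValueError("area must be positive")
    return perimeter ** 2 / (4.0 * math.pi * area)

# A circle of radius 5 is the most compact shape.
ci_circle = compact_index(2 * math.pi * 5, math.pi * 5 ** 2)

# A square of side 4 has a longer border for its area, so its index is higher.
ci_square = compact_index(4 * 4, 4 * 4)
```

Under this definition, a lesion with a ragged border has a larger perimeter for the same area and therefore a larger index than a smooth, round mole.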
Dimension is generally an integer. For example, the dimension of a line is 1, of a plane is 2, of a cube is 3, and so on. However, the fractal dimension is unusual in that it may take a fractional value. This fractal dimension can be used as a characteristic of an image. The fractal dimension can be calculated by the box-counting method. To find the fractal dimension of an image, the Hausdorff dimension calculation method is simple and effective. Let us discuss this method with the help of an example. Let $N(e)$ be the smallest number of $e$-sided cubes that can cover a line. The dimension of this line is then,
Using Equation 15, the fractal dimension of an image can be easily calculated.
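The box-counting idea can be sketched as follows: count the boxes of side e that contain part of the shape for several values of e, then estimate the dimension as the slope of log N(e) against log(1/e). This is an illustrative sketch that assumes a non-empty binary mask as input; the function names and the box sizes (1, 2, 4, 8, 16) are choices made here, not values from the article.

```python
import numpy as np

def box_count(mask, size):
    """Number of size-by-size boxes that contain at least one foreground pixel."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = mask.shape
    # Pad with background so both dimensions are multiples of the box size.
    padded = np.pad(mask, ((0, (-rows) % size), (0, (-cols) % size)))
    blocks = padded.reshape(padded.shape[0] // size, size,
                            padded.shape[1] // size, size)
    return int(blocks.any(axis=(1, 3)).sum())

def fractal_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Estimate the box-counting dimension of a non-empty binary mask."""
    counts = [box_count(mask, s) for s in sizes]
    # Slope of log N(e) versus log(1/e) is the dimension estimate.
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# Sanity checks on shapes with known dimension.
line = np.zeros((32, 32), dtype=bool)
line[0, :] = True
square = np.ones((32, 32), dtype=bool)
dim_line = fractal_dimension(line)
dim_square = fractal_dimension(square)
```

A straight line gives an estimate of 1 and a filled square gives 2; an irregular lesion border would be expected to fall between these values.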
A lesion with an irregular boundary (edge abruptness) has a large variation in radial distance. Border irregularity can be estimated by analyzing the distribution of the radial distance differences.
where $m_d$ is the mean of the distance $d_2$ between the center point and the border.
One of the early signs of skin cancer is the emergence of color variations. Since the cancer cells grow in pigment, they are often colored brown or black, depending on the production of the melanin pigment at different depths in the skin. Color descriptors are statistical parameters, such as the average value and standard deviation, calculated from different color channels such as RGB or hue, saturation, and lightness. In this article, the color variation of the RGB image has been calculated using the HSV channels.
Skin cancer lesions tend to grow larger than common moles, with a diameter of about 6 mm or more. Since the lesion often has an irregular shape, the diameter is found by drawing lines from each edge pixel through the midpoint to the opposite edge pixel and taking the middle value.
Classification and detection based on Fuzzy K-Nearest Neighbor classifier
Finding an accurate estimate of the lesion border is important because of the types of features used for classification. A number of features are extracted from the segmented image and given as input to our proposed classifier. The existing KNN classifier fails to classify skin lesions accurately if the number of features is reduced. To achieve complete detection of skin cancer, an efficient classifier is used that classifies the affected area by estimating the lesion border. In our proposed method, a new hybrid FKNN classifier is used for classification, which improves the classification accuracy and performance. This hybrid FKNN classifier operates on the premise that an unknown instance can be classified by relating it to known instances according to some distance/similarity function. Thus, the affected skin area in the input image is located and classified accurately. [Figure 3] shows the overall process performed in our proposed method.
KNN classification algorithm
One of the most straightforward instance-based learning algorithms is the nearest neighbor algorithm. KNN is based on the principle that instances in close proximity within a dataset have similar properties. When the instances are tagged with a classification label, the label of an unclassified instance can be determined by observing the classes of its nearest neighbors. KNN locates the k nearest instances to the query instance and determines its class by identifying the most frequent class label among them. The training phase for KNN consists of storing all known instances and their class labels.
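The majority-vote rule described above can be sketched in a few lines. Euclidean distance and the function name `knn_predict` are assumptions made for illustration; any other distance function could be substituted.

```python
from collections import Counter
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training instances."""
    train_X = np.asarray(train_X, dtype=float)
    # Euclidean distance from the query to every stored training instance.
    dists = np.linalg.norm(train_X - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Because training only stores the instances, all of the computational cost is paid at query time.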
Firefly optimization algorithm
The Firefly Algorithm (FA) is based on the flashing patterns and behavior of fireflies. FA works based on the following three idealized rules:
- Fireflies are unisex, so that one firefly will be attracted to other fireflies regardless of their sex
- The attractiveness is proportional to the brightness, and both decrease as the distance between fireflies increases. Thus, for any two flashing fireflies, the less bright one moves toward the brighter one. If there is no firefly brighter than a particular firefly, it moves randomly
- The brightness of a firefly is determined by the landscape of the objective function.
A firefly's attractiveness is proportional to the light intensity seen by adjacent fireflies. The variation of attractiveness β with distance r can be defined by

$$\beta = \beta_0 e^{-\gamma r^2}$$

Where β0 is the attractiveness at r = 0 and γ is the light absorption coefficient. The movement of a firefly i that is attracted to another, more attractive (brighter) firefly j is determined by

$$x_i^{t+1} = x_i^t + \beta_0 e^{-\gamma r_{ij}^2}\left(x_j^t - x_i^t\right) + \alpha_t \epsilon_i^t$$

Where the second term denotes the attraction, the third term is randomization with αt being the randomization parameter, and ε_i^t is a vector of random numbers drawn from a Gaussian distribution or uniform distribution at time t. If β0 = 0, it becomes a simple random walk. On the other hand, if γ = 0, it reduces to a variant of particle swarm optimization.
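A minimal sketch of the firefly update is given below. It assumes the commonly used exponential attractiveness β = β0·exp(−γr²) and a Gaussian random term whose scale decays over time; the function name `firefly_minimize`, the parameter defaults, the initialization range, and the decay factor are illustrative assumptions rather than the settings used in this article.

```python
import numpy as np

def firefly_minimize(objective, dim, n_fireflies=15, n_iter=100,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Minimize `objective` with a basic firefly search (lower cost = brighter)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=(n_fireflies, dim))  # assumed search range
    cost = np.array([objective(p) for p in x])
    for t in range(n_iter):
        alpha_t = alpha * 0.97 ** t  # assumed decay of the random step size
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter, so i moves toward it
                    r2 = np.sum((x[i] - x[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2)
                    x[i] = x[i] + beta * (x[j] - x[i]) + alpha_t * rng.normal(size=dim)
                    cost[i] = objective(x[i])
        # Simplification: the brightest firefly stays in place instead of moving randomly.
    best = np.argmin(cost)
    return x[best], cost[best]
```

Because the current brightest firefly never moves in this simplified version, the best cost found cannot get worse from one iteration to the next.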
Proposed Fuzzy K-Nearest Neighbor classifier
This article aims to hybridize FA with the KNN classifier to accurately diagnose the cancer lesions present in the input image. If the input image consists of N pixels after segmentation, the image can be resized, and four features are then extracted for each pixel. In total, N × 4 features are extracted from the image and given as input to the classifier. The FKNN classifier works based on the training provided to it. FA is a randomized, population-based search and optimization technique with a large amount of implicit parallelism. It searches complex, large, and multimodal landscapes and provides near-optimal solutions for the objective or fitness function of an optimization problem. FA chooses the K neighbors based on a similarity function, using the fitness function given by
Where m = 10, d = 1, 2… n. The combination of firefly optimization with the KNN classifier provides the optimal features to the classifier and makes the classification accurate. The KNN classification algorithm predicts the test sample's category from the K training samples that are its nearest neighbors and assigns it to the category with the largest category probability. The process by which the KNN algorithm classifies a sample X is as follows.
- Suppose there are j training categories C1, C2, …, Cj and the total number of training samples is N; after feature reduction, the samples become m-dimensional feature vectors
- Represent the test sample X as a feature vector of the same form (X1, X2, …, Xm) as all training samples
- Calculate the similarities between all training samples and X. Taking the ith sample di (di1, di2, …, dim) as an example, the similarity SIM(X, di) is as follows:
- Choose the k samples with the largest of the N similarities SIM(X, di), (i = 1, 2, …, N), and treat them as the KNN collection of X. Then, calculate the probability that X belongs to each category with the following formula:
Where y(di, Cj) is the category attribute function.
- Judge sample X to belong to the category which has the largest P(X, Cj).
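The steps above can be sketched as follows. Two points are assumptions, since the corresponding formulas are not reproduced here: cosine similarity stands in for SIM(X, di), and the category attribute function y(di, Cj) is taken to be 1 when di belongs to Cj and 0 otherwise, so that P(X, Cj) is the sum of the similarities of the selected neighbors in Cj. The function names are illustrative.

```python
import numpy as np

def cosine_similarity(x, d):
    """Cosine of the angle between two feature vectors (assumed form of SIM)."""
    x, d = np.asarray(x, dtype=float), np.asarray(d, dtype=float)
    denom = np.linalg.norm(x) * np.linalg.norm(d)
    return float(x @ d / denom) if denom > 0 else 0.0

def classify_by_similarity(train_X, train_y, x, k=3):
    """Pick the k most similar samples and sum their similarities per category."""
    sims = np.array([cosine_similarity(x, d) for d in train_X])
    top_k = np.argsort(sims)[::-1][:k]  # indices of the k largest similarities
    scores = {}
    for i in top_k:
        # y(di, Cj) acts as an indicator: only the sample's own category is credited.
        scores[train_y[i]] = scores.get(train_y[i], 0.0) + sims[i]
    return max(scores, key=scores.get), scores
```

The returned scores are unnormalized; dividing by their sum would turn them into values that add up to one without changing the winning category.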
The proposed FKNN classifier produces its output based on the similarities and the distance function of the nearest neighbors. The basic idea is to use FA instead of calculating the similarities between all training and test samples and then choosing the k neighbors for classification. Only k neighbors are chosen at each iteration, and the distance-based similarities are then calculated for them. After that, the test samples are classified with these neighbors and the accuracy is calculated. This process is repeated L times to reach high accuracy; hence, the calculation complexity of KNN is reduced and there is no need to consider the weight of the samples.
Results and Discussions
In this article, we have proposed a method for detection of skin cancer lesions present in the input digital image using segmentation and classification by a new FKNN classifier.
The system configuration is as follows: Operating System: Windows 8; Processor: Intel Core i3; RAM: 4 GB; and Platform: MATLAB.
In this section, the proposed hybrid techniques are used for the classification and detection of skin lesions. A total of 100 input images were used for detecting and classifying skin lesions with the proposed method. Of these images, [Figure 4] shows thirty input images and the corresponding results.
Figure 4: Obtained results of the data sets (a) input images, (b) red, green, and blue to gray image, (c) preprocessed output, (d) resized image, (e) segmented image, (f) edge-detected output
In our proposed method, the input image is first converted from the RGB color model to the gray scale model. The converted images are then resized based on some intensity and threshold values, which helps to focus mainly on the skin cancer-affected areas. The size of the image is reduced to avoid unwanted calculations and reduce the detection time. The output image obtained after RGB to gray conversion is shown in [Figure 4]b. The quality of the image is enhanced by the illumination variation correction done in preprocessing; the preprocessed output is shown in [Figure 4]c. The segmentation results, which concentrate on the affected area, are shown in [Figure 4]e. The affected and unaffected areas are differentiated based on their intensity values. Accurate edges of the skin cancer image are detected, which makes the classification of normal and abnormal images easier; the detected edges are shown in [Figure 4]f. Furthermore, the areas affected by skin cancer are marked separately in the output image. In addition, the condition of the lesion (normal or abnormal) is indicated by the classifier. The simulation produces an accurate segmented output and classifies the images as normal or abnormal.
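The gray conversion and intensity-based separation can be sketched as follows. The luminance weights are the widely used ITU-R BT.601 values; the fixed threshold and the assumption that the lesion is darker than the surrounding skin are illustrative choices, not the values used in this work.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Weighted sum of the R, G, and B channels (ITU-R BT.601 luminance weights)."""
    return np.asarray(rgb, dtype=float) @ np.array([0.299, 0.587, 0.114])

def threshold_segment(gray, threshold):
    """Binary mask of the affected area, assuming the lesion is darker than the skin."""
    return np.asarray(gray) < threshold

# Toy 4 x 4 image: light skin with a dark 2 x 2 patch (made-up values).
image = np.full((4, 4, 3), (200, 170, 150), dtype=np.uint8)
image[1:3, 1:3] = (60, 40, 30)
lesion_mask = threshold_segment(rgb_to_gray(image), threshold=100)
print(int(lesion_mask.sum()))  # number of pixels marked as affected
```

In practice the threshold would be chosen from the image histogram instead of being fixed by hand.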
The performance of our proposed method is evaluated in terms of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), false-positive rate, false-negative rate, false discovery rate (FDR), accuracy, F1 score, and Matthews correlation coefficient (MCC). [Table 1] shows the contingency table for the classification.
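Where TP is the number of true-positive pixels, FP is the number of false-positive pixels, TN is the number of true-negative pixels, and FN is the number of false-negative pixels.
1. Sensitivity or true positive rate:
It is calculated by the following equation:
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
It indicates how well disease (positive) cases are identified.
2. Specificity or true negative rate:
It is calculated by the following equation:
$$\text{Specificity} = \frac{TN}{TN + FP}$$
It indicates how well non-disease (negative) cases are identified.
3. Positive predictive value:
It is calculated by the following equation:
$$\text{PPV} = \frac{TP}{TP + FP}$$
4. Negative predictive value:
It is calculated by the following equation:
$$\text{NPV} = \frac{TN}{TN + FN}$$
5. False positive rate:
It is calculated by the following equation:
$$\text{FPR} = \frac{FP}{FP + TN}$$
6. False negative rate:
It is calculated by the following equation:
$$\text{FNR} = \frac{FN}{FN + TP}$$
7. False discovery rate:
It is calculated by the following equation:
$$\text{FDR} = \frac{FP}{FP + TP}$$
8. Accuracy:
It is calculated by the following equation:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
It indicates whether the diagnostic test is performed correctly.
9. F1 score:
It is the harmonic mean of the precision and sensitivity. It is calculated by the following equation:
$$F_1 = \frac{2\,TP}{2\,TP + FP + FN}$$
10. Matthews correlation coefficient:
It is calculated by the following equation:
$$\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$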
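For reference, the ten measures can be computed from the four contingency-table counts using their standard definitions, as in the sketch below. The function name `classification_metrics` and the example counts are made up for illustration and are not results from this work.

```python
import math

def _ratio(num, den):
    # Return NaN rather than raising an error when a denominator is zero.
    return num / den if den else float("nan")

def classification_metrics(tp, fp, tn, fn):
    """Compute the ten evaluation measures from contingency-table counts."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "fpr": _ratio(fp, fp + tn),
        "fnr": _ratio(fn, fn + tp),
        "fdr": _ratio(fp, fp + tp),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }

# Made-up counts, for illustration only.
m = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print({name: round(value, 3) for name, value in m.items()})
```

Several of the measures are complements of one another (for example, FNR = 1 − sensitivity and FDR = 1 − PPV), which provides a quick consistency check on reported values.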
In the existing methods, skin cancer lesion detection and classification have been performed using many conventional classifiers. However, many of them fail to produce accurate and effective results because of low image quality and illumination variations. Our proposed method concentrates on these two factors and produces better results than the other methods. [Table 1] shows the comparison of our proposed FKNN classifier with existing classifiers such as the artificial neural network (ANN) and Probabilistic Neural Network (PNN) in terms of sensitivity, specificity, and accuracy. The sensitivity, specificity, and accuracy values of the different classifiers are based on the true-positive, true-negative, false-positive, and false-negative values. Other parameters include PPV, NPV, false-positive rate, false-negative rate, FDR, F1 score, and MCC. The comparison of the proposed classifier with other existing classifiers is shown in [Table 2].
From [Table 2], it is evident that our proposed classifier produces accurate results with greater sensitivity and specificity than the ANN and KNN classifiers. [Figure 5] shows the comparison graph of the proposed FKNN classifier with the other existing classifiers such as ANN and PNN.
Figure 5: Comparison between positive predictive value of different classifiers
The comparison graphs shown in [Figure 5], [Figure 6], [Figure 7], [Figure 8], [Figure 9], [Figure 10], [Figure 11], [Figure 12], [Figure 13], [Figure 14] compare the performance of the three types of classifiers, and it is clear that our proposed classifier produces better results than the others.
Figure 9: Comparison between negative predictive value of different classifiers
Figure 10: Comparison between False Predictive Value (FPV) of different classifiers
Figure 11: Comparison between False Negative Rate (FNR) of different classifiers
Figure 12: Comparison between false discovery rate of different classifiers
Figure 14: Comparison between Matthews correlation coefficient of different classifiers
Discussion
This work presents a segmentation and classification method for skin lesions in digital images. The technique preprocesses the input image with homomorphic filtering in order to preserve and enhance useful information at the lesion boundaries. A threshold-based segmentation method is used to segment the skin lesion from the original skin image. The detection of skin cancer lesions using feature extraction and classification through a new hybrid FKNN model is proposed in this article. Segmentation is performed to avoid unwanted computation on unaffected image pixels, and RGB to gray conversion is performed. Next, feature extraction is carried out on the segmented image pixel by pixel. For skin cancer detection, the ABCD parameters are extracted and given to the classifier as input. Based on the performance evaluation and comparison tables, it is clear that our proposed classification method is better than the other existing classification methods in terms of sensitivity, specificity, and accuracy. One advantage of our method is its simplicity, since its computational cost is low compared with solving complex mathematical models. The FKNN classifier efficiently distinguishes benign from malignant lesions and also works faster than other classifiers, so the proposed classifier can be used to detect abnormal skin lesions in a given input image accurately and efficiently. With the use of image processing in the medical field, skin cancer can be predicted and the mortality percentage due to skin cancer can be reduced.
Conclusion
KNN is one of the most popular neighborhood classifiers used for skin cancer lesion detection. However, it has some issues related to skin cancer detection, such as computational complexity, full dependence on the training set, and the absence of any weight difference between classes. To combat this, a hybrid approach that improves the classification performance of KNN using the firefly optimization algorithm (FKNN) is proposed. Initially, the input image is preprocessed to enhance its contrast and quality. The preprocessed image is segmented, and the unaffected area present in the image is neglected based on some threshold values. Computation time and complexity are reduced by the preprocessing and segmentation processes. The extracted features are given as input to the classifiers. Our proposed FKNN classifier is applied for classification; the test samples are then classified with the neighbors and the accuracy is calculated. This process is repeated L times to reach high accuracy; hence, the calculation complexity of KNN is reduced and there is no need to consider the weight of the samples. The experimental results show that our proposed method not only reduces the computational complexity but also increases the classification accuracy. From the results and comparison, it was observed that our proposed method is a technically viable option among the available skin cancer detection techniques.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.