

ORIGINAL ARTICLE 

Year : 2013  Volume
: 9
 Issue : 3  Page : 456466 

A research about breast cancer detection using different neural networks and KMICA algorithm
AA Kalteh^{1}, Payam Zarbakhsh^{2}, Meysam Jirabadi^{3}, Jalil Addeh^{4}
^{1} Department of Electrical Engineering, Aliabadkatoul Branch, Islamic Azad University, Aliabadkatoul, Iran ^{2} Department of Electrical Engineering, Ahar Branch, Islamic Azad University, Ahar, Iran ^{3} Department of Medical Science, Ardabil Branch, Islamic Azad University, Ardabil, Iran ^{4} Bargh Gostar Baharan Golestan Corporation, Gonbad Kavus, Iran
Date of Web Publication  8Oct2013 
Correspondence Address: Jalil Addeh Bargh Gostar Baharan, Golestan Corporation, P.O. Box 4971684981, Gonbad Kavus Iran
Source of Support: None, Conflict of Interest: None  Check 
DOI: 10.4103/09731482.119350
Breast cancer is the second leading cause of death for women all over the world. The correct diagnosis of breast cancer is one of the major problems in the medical field. From the literature it has been found that different pattern recognition techniques can help them to improve in this domain. This paper presents a novel hybrid intelligent method for detection of breast cancer. The proposed method includes two main modules: Clustering module and the classifier module. In the clustering module, first the input data will be clustered by a new technique. This technique is a suitable combination of the modified imperialist competitive algorithm (MICA) and Kmeans algorithm. Then the Euclidean distance of each pattern is computed from the determined clusters. The classifier module determines the membership of the patterns using the computed distance. In this module, several neural networks, such as the multilayer perceptron, probabilistic neural networks and the radial basis function neural networks are investigated. Using the experimental study, we choose the best classifier in order to recognize the breast cancer. The proposed system is tested on Wisconsin Breast Cancer (WBC) database and the simulation results show that the recommended system has high accuracy. Keywords: Breast cancer, clustering, Kmeans algorithm, modified imperialist competitive algorithm, neural networks
How to cite this article: Kalteh A A, Zarbakhsh P, Jirabadi M, Addeh J. A research about breast cancer detection using different neural networks and KMICA algorithm. J Can Res Ther 2013;9:45666 
How to cite this URL: Kalteh A A, Zarbakhsh P, Jirabadi M, Addeh J. A research about breast cancer detection using different neural networks and KMICA algorithm. J Can Res Ther [serial online] 2013 [cited 2019 Nov 17];9:45666. Available from: http://www.cancerjournal.net/text.asp?2013/9/3/456/119350 
> Introduction   
Each year it is estimated that nearly 200,000 women will be diagnosed with breast cancer (BC) and more than 40,000 will die. Approximately 1,700 men will also be diagnosed with breast cancer and 450 will die each year. The evaluation of men with breast masses is similar to that in women, including mammography. The goal of mammography is the early detection of breast cancer, typically through detection of characteristic masses and/or micro calcifications. A mammogram can find breast cancer before it can be felt. However, it is not perfect. It is still challenging for radiologists to differentiate between benign and malignant tumors. The existence of breast tumor is usually reflected in the mammogram. The current procedure assumes that the image is recorded on an Xray film and that it is interpreted by a human expert. Obviously this suffers from the human error and error with visual inspection, which may further be enhanced by poor quality of the mammogram images. ^{[1],[2],[3]} Many investigators believe that automation of mammogram screening analysis increases the rate of early detection. ^{[4]}
Several approaches have been proposed for breast cancer detection. In ^{[5]} artificial metaplasticity neural network was used to classify WBC database. In ^{[6],[7],[8],[9],[10],[11]} neurofuzzy approach was used in the diagnosis breast cancer. In ^{[12]} associated rules and neural networks was used to classify the WBC. In ^{[13]} WBC dataset was classified using multilayer perceptron neural network, combined neural network, probabilistic neural network, recurrent neural network and support vector machine. In ^{[14]} artificial neural network and multivariate adaptive regression splines approach was used to classify the breast cancer patterns. In ^{[15]} hybrid hidden Markov model (HMM)fuzzy was used in the breast cancer identification. In ^{[16]} support vector machine (SVM) was used in the breast cancer detection.
This paper presents a novel hybrid intelligent method for detection of the breast cancer patterns. The proposed method includes two main modules: clustering module and the classifier module. In the clustering module, first the input data will be clustered by a new technique. This technique is a suitable combination of the modified imperialist competitive algorithm (MICA) and Kmeans algorithm. Then the Euclidean distance of each pattern is computed from the determined clusters. It is used as the characteristic feature. The classifier module determines the membership of the patterns using the computed distance. Literature review shows the systems that use artificial neural networks (ANNs) as the classifiers have high performances. In this module, several neural networks, such as the multilayer perceptron, probabilistic neural networks and the radial basis function neural networks are investigated.
The rest of paper is organized as follows. Section 2 presents the WBC database. Section 3 explains the general scheme of the proposed method. Section 4 and Section 5 describe the main parts of the proposed method, i.e., clustering technique and classifier module, respectively. Section 6, shows some simulation results and finally Section 7 concludes the paper.
> Wisconsin Breast Cancer (WBC) Database   
Breast cancer is the most common cancer among women; excluding non melanoma skin cancers. This cancer affects one in eight women during their lives. It occurs in both men and women, although male breast cancer is rare. Breast cancer is a malignant tumor that has developed from cells of the breast. Although scientists know some of the risk factors (e.g. ageing, genetic risk factors, family history, menstrual periods, not having children, obesity) that increase a woman's chance of developing breast cancer, they do not yet know what causes most breast cancers or exactly how some of these risk factors cause cells to become cancerous. Research is under way to learn more and scientists are making great progress in understanding how certain changes in DNA can cause normal breast cells to become cancerous. ^{[12],[13]} In this study, the WBC database was used and analyzed. They have been collected by Dr. William H. Wolberg (19891991) at the University of WisconsinMadison Hospitals. There are 699 records in this database. Each record in the database has nine attributes. The nine attributes detailed in [Table 1] are graded on an interval scale from a normal state of 0.11, with 1 being the most abnormal state. In this database, 241 (65.5%) records are malignant and 458 (34.5%) records are benign. ^{[12],[13]} The WBC features are illustrated in [Figure 1]. As seen in this figure, features of WBC database have a close value. Thus diagnoses of benign and malignant tumors are difficult with these features.  Figure 1: Nine feature of WBC. red line represent malignant tumor and blue line represent benign tumor (a) Clump thickness (b) Uniformity of cell Size (c) Uniformity of cell Shape (d) Marginal adhesion (f) Single epithelial cell Size (g) Bare nuclei (h) Bland chromatin (i) Normal nucleoli (j) Mitoses
Click here to view 
> General Structure of the Proposed Method   
The proposed method includes two main modules: clustering module and the classifier module. [Figure 2] shows the general scheme of this method. In the clustering module, first the input patterns data will be clustered by a new technique. This technique is a suitable combination of the modified imperialist competitive algorithm (MICA) and Kmeans algorithm. It is named the KMICA clustering technique. ^{[17]} More details regarding the KMICA clustering technique can be found in. ^{[17],[18],[19],[20],[21],[22],[23],[24],[25],[26]} Then the Euclidean distance of each pattern (benign and malignant) is computed from the determined clusters. It is used as the characteristic feature. It is obvious that the Euclidean distance of each pattern from its own cluster center is smaller than the value of Euclidean distance of that pattern from other cluster center. Subsequently, each pattern is shown as a 1*2 vector, which one of the rows is a small value and the remaining row has a large value. Application of this approach causes that the dimension of the classifier is decreased. Also we have a more efficient feature that is better than the raw data. The classifier module determines the membership of the patterns using the computed distance. As mentioned, in the classifier module, several neural networks such as the multilayer perceptron (MLP), ^{[27],[28],[29]} radial basis function (RBF) ^{[30]} and probabilistic neural network (PNN) ^{[31]} are used.
> KMICA Clustering Technique   
Clustering is an important problem that must often be solved as a part of more complicated tasks in pattern recognition, image analysis and other fields of science and engineering. Clustering procedures partition a set of objects into clusters such that objects in the same cluster are more similar to each other than objects in different clusters according to some predefined criteria. In this section the applied clustering algorithm is described.
Kmeans
Kmeans is one of the simplest unsupervised learning algorithms. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume K clusters) fixed as a priori. The main idea is to determine K centroids. These centroids should be placed in a cunning way as different locations cause different results. So, the best choice is to place them as much as possible far away from each other. The next step is to take each point referring to a given data set and associate it to the nearest centroid. When no point is left out, the first step is completed and an early grouping is done. At this point we need to recalculate K new centroids as centers of the clusters resulting from the previous step. After we have these K new centroids, a new binding has to be done between the same data set points and the nearest new centroid. A loop has been generated. As a result of this loop, we may notice that the K centroids change their location step by step until no more changes are done. In other words, centroids do not move any further. Finally, the goal of this algorithm is to minimize an objective function, which in this case is a squared error function. ^{[18],[19]}
The objective function has been calculated as follows:
where ǀ ǀY _{i}  X _{i}ǀ ǀ is a chosen distance measurement between the data input Y _{i} and the cluster center X _{i} N and K are the number of input data and the number of cluster centers, respectively.
The algorithm is composed of the following steps:
 Place K points into the space represented by the objects that are clustered. These points represent initial group centroids.
 Assign each object to the group that has the closest centroid.
 When all objects have been assigned, recalculate the positions of the K centroids.
 Repeat Steps 2 and 3 until the centroids are immobilized.
Although it can be proved that the procedure will always terminate, the kmeans algorithm does not necessarily find the most optimal configuration, corresponding to the global objective function minimum. The algorithm is also significantly sensitive to the initial randomly selected cluster centers. The kmeans algorithm can be run multiple times to reduce this effect.
Imperialist Competitive Algorithm (ICA)
ICA is a populationbased stochastic search algorithm. It has been introduced by Atashpaz and Lucas. ^{[20],[21]} Since then, it is used to solve some kinds of optimization problem. The algorithm is inspired by imperialistic competition. It attempts to present the social policy of imperialisms to control more countries and use their sources when colonies are dominated by some rules. If one empire loses its power, the rest of them will compete to take its possession. In ICA, this process is simulated by individuals that are known as countries.
This algorithm starts with a randomly initial population and objective function which is computed for them. The most powerful countries are selected as imperialists and the others are colonies of these imperialists .Then the competition between imperialists take place to get more colonies .The best imperialist has more chance to possess more colonies. Then one imperialist with its colonies makes an empire. [Figure 3] shows the initial populations of each empire. ^{[20],[21],[22]} If the empire is bigger, its colonies are greater and the weaker ones are less. In this figure Imperialist 1 is the most powerful and has the greatest number of colonies.
After dividing colonies between imperialists, these colonies approach their related imperialist countries. [Figure 4] represents this movement. Based on this concept each colony moves toward the imperialist by a units and reaches its new position. Where a is a random variable with uniform (or any proper) distribution, β, a number greater than 1, causes colonies move toward their imperialists from different direction and S is the distance between colony and imperialist.
α ≈ U (0, β × S ) (2)
If after this movement one of the colonies possesses more power than its relevant imperialist, they will exchange their positions. To begin the competition between empires, total objective function of each empire should be calculated. It depends on objective function of both an imperialist and its colonies. Then the competition starts, the weakest empire loses its possession and powerful ones try to gain it. The empire that has lost all its colonies will collapse. At last the most powerful empire will take the possession of other empires and wins the competition.
α ≈ U (0, β × S ) (3)
If after this movement one of the colonies possesses more power than its relevant imperialist, they will exchange their positions. To begin the competition between empires, total objective function of each empire should be calculated. It depends on objective function of both an imperialist and its colonies. Then the competition starts, the weakest empire loses its possession and powerful ones try to gain it. The empire that has lost all its colonies will collapse. At last the most powerful empire will take the possession of other empires and wins the competition. The last Imperialist is the solution of the problem.
Modified ICA (MICA)
In order to improve the convergence velocity and accuracy of the ICA ^{[17]} recommends a modified imperialist competitive algorithm (MICA). Premature convergence may occur under different situation: the population converges to local optima of the objective function or the search algorithm proceeds slowly or does not proceed at all. In ^{[17]} a new mutation operator is proposed. Mutation is a powerful strategy which diversifies the ICA population and improves the ICA's performance on preventing premature convergence to local minima. The mathematical expressions for MICA are provided in the appendix A.
Hybrid KMICA
As mentioned before, Kmeans is used for its easiness and simplicity for applications. However, it has some drawbacks. First, its result may depend on initial values. Also, it may converge to local minimum. Recently, numerous ideas have been used to alleviate this drawback by using global optimization algorithms such as GA ^{[23]} hybrid PSOSA, ^{[24]} hybrid PSOACOK ^{[25]} and HBMO. ^{[26]} In this paper, we used method that delivered in ^{[17]} that called Hybrid KMICA. According to original ICA, first primary population generates and then empires with their possessions take place. Applying Kmeans to each empire causes us to improve the initial population of colonies. It makes the hybrid algorithm converges more quickly and prevents it from falling into local optima. The outputs of Kmeans form the initial empires of modified ICA.
To improve the income of algorithm, it is better that the powerless imperialist would not be removed when it loses all possessions. This imperialist is one of the best answers and that can be contributed in imperialistic competition as a weak colony or to be given to the powerful empire. Pseudocode of this algorithm is shown in [Figure 5].
> Classifiers   
This section briefly describes the neural network classifiers.
Multilayer perceptron (MLP) neural network
An MLP neural network consists of an input layer (of source nodes), one or more hidden layers (of computation nodes) and an output layer. The recognition basically consists of two phases: training and testing. In the training stage, weights are calculated according to the chosen learning algorithm. The issue of learning algorithm and its speed is very important for the MLP model. In this study the following learning algorithms are considered.
5.1.1. Backpropagation with momentum (BP with momentum)
The BP algorithm makes use of gradient descent with a momentum term to smooth out oscillation. ^{[27]} Eq. (4) gives the weight update for BP with momentum:
where w _{ij} represents the weight value from neuron j to neuron i, ε is the learning rate parameter, and E represents the error function. It adds an extra momentum parameter μ, to the weight changes.
Resilient backpropagation (RPROP) algorithm
RPROP considers the sign of derivatives as the indication for the direction of the weight update. ^{[28]} In doing so, the size of the partial derivative does not influence the weight step. The following equation shows the adaptation of the update values of Δij (weight changes) for the RPROP algorithm. For initialization, all are set to small positive values:
where η^{0} = 0,0 < η^{} < 1 < η^{+} , η^{,0,+} are known as the update factors. Whenever the derivative of the corresponding weight changes its sign, this implies that the previous update value is too large and it has skipped a minimum. Therefore, the update value is then reduced (η^{} ), as shown above. However, if the derivative retains its sign, the update value is increased (η^{+} ). This will help to accelerate convergence in shallow areas. To avoid overacceleration, in the epoch following the application of (η^{+} ), the new update value is neither increased nor decreased (η^{0} ) from the previous one. Note that the values of Δ_{ij} remain nonnegative in every epoch. This update value adaptation process is then followed by the actual weight update process, which is governed by the following equations:
The values of the training parameters adopted for the algorithms were determined empirically.
5.1.3. LevenbergMarquardt (LM) algorithm
The LM algorithm ^{[29]} uses the approximation to the Hessian matrix in the following Newtonlike update:
where J is the Jacobian matrix, e a vector of network errors and μ a constant.
Radial basis function neural networks (RBFNNs)
RBF neural networks with their structural simplicity and training efficiency are a good candidate to perform a nonlinear mapping between the input and the output vector space. RBFNN is fully connected feed forward structure and consist of three layers namely, an input layer, a single layer of nonlinear processing units, and an output layer. Input layer is composed of input nodes that are equal to the dimension of the input vector x. The output of the jth hidden neuron with Gaussian transfer function can be calculated as:
where h _{j} is the output of the jth neuron, x ∈ ℜ^{n×1} is an input vector, c _{j} ∈ ℜ^{n×1} is the jth RBF center, σ the center spread parameter, which controls the width of the RBF, and ǀǀ.ǀǀ ^{2} represent the Euclidean norm. The output of any neuron at the output layer of RBF network is calculated as:
where w _{ij} is the weight connecting hidden neuron j to output neuron i and k the number of hidden layer neurons.
Probabilistic neural networks (PNN)
Probabilistic neural network is a kind of radial basis network suitable for classification problems. PNN networks have three layers: input, pattern and summation. The input layer has as many elements as there are individual parameters needed to describe the samples to be classified. ^{[30]} The pattern layer organizes the training set such a way that an individual processing element represents each input vector. The pattern units in the probabilistic network are used to store pattern examples, taken directly from the training data. The entire set of training data is used, and so the number of units in the first hidden layer is set equal to the number of training cases. The summation layer has as many processing elements as there are classes to be recognized and simply collects the outputs from all hidden neurons of each respective class. The products of the summation layer are forwarded to the output (one neuron for each data class), where the estimated probability of the new pattern being a member of that data class is computed. The transfer function is radial basis function for the first layer and is competitive function for the second layer. Only the first layer has biases. Training of the probabilistic neural network is much easier than with backpropagation. It can be simply finished by setting the weights of the network using the training set. The outputs of summary layer are binary values y _{i} is larger than input of other neurons (which means that this input pattern belongs to class i), y _{i} is set to 1, otherwise it is set to zero.
> Simulation Results   
In this section we evaluate the performance of proposed recognizer. This study has used WBC database. The computer program was performed on MATLAB (version 7.8.0.347 (R2009)) environment by using the neural networks toolbox. In order to compare the performance of classifiers, the kfold cross validation technique is used. The kfold cross validation technique proposed by Salzburg ^{[32]} was employed in the experiments with k = 3. The data set was thus split into 3 portions, with each part of the data sharing the same proportion of each class of data. 2 data portion were used in the training process, while the remaining part was used in the testing process. The ANNtraining methods were run 3 times to allow each slice of the data to take turn as a testing data. The classification accuracy rate is calculated by summing up the individual accuracy rate for each run of testing, and then dividing of the total by 3. All the obtained results are the average of 50 independent runs.
We apply KMICA for finding the optimum cluster centers. The parameters of the clustering algorithm used in this study are shown in [Table 2]. These values were selected for the best performance after several experiments.
Patterns' Euclidean distances from cluster centers are shown in the [Figure 6]. Horizontal axis shows number of patients and vertical axis shows Euclidean distance of patterns from cluster centers. As an example, as shown in the [Figure 6](a), the benign patterns have smaller values due to malignant pattern. Also as shown in the [Figure 6](b), the malignant patterns have smaller values due to benign pattern.  Figure 6: Euclidean distance of signals. (a) Calculated based on benign cluster (b) Calculated based on malignant cluster
Click here to view 
Performance comparison of different neural network and proposed feature
The training parameters and the configuration of the MLP used in this study are shown in [Table 3]. The MLP classifiers were tested with various neurons for a single hidden layer and the best networks are selected. For the RBFNNs Gaussian activation function and a single hidden layer with 18 radial basis functions considered. For the PNN Gaussian activation function and a single hidden layer with 23 radial basis functions considered. These values were selected for the best performance after several experiments.
[Table 4] and [Table 5] show the recognition accuracy (RA) of different systems. In these tables, NNHL means the number neurons in the hidden layers. The obtained results are the average of 50 independent runs. As it is depicted in [Table 4], using various neural network and notprecede data (WBC database), the highest accuracy is 97.44%, which is achieved by MLP neural network with resilient backpropagation training algorithm. From [Table 5] it can be found that using the proposed features as the input of the classifier, the accuracy of system is increased to 99.11%. This rate of accuracy is achieved by RBFNNs.  Table 4: Performance comparisons of different classifiers with WBC database
Click here to view 
 Table 5: Performance comparisons of different classifiers with Euclidean distance of patterns from cluster center
Click here to view 
Comparison between [Table 4] and [Table 5] shows the efficiency of proposed feature. As it can be recognized from these tables the accuracy of MLP (BP with momentum), MLP (RP), MLP (LM), PNN and RBF neural network are increased to 97.10%, 97.96%, 97.64%, 99.11%, and 94.63% respectively which shows significant role of feature in classifier performance.
In order to indicate the details of the recognition for each pattern, the confusion matrix of the recognizer for best result is shown by [Table 6]. As we know, the values in the diagonal of confusion matrix show the correct performance of recognizer for each pattern. In other words, these values show that how many of considered pattern are recognized correctly by the system. The other values show the mistakes of system. For example, look at the first row of this matrix. The value of 98.22% shows the percentage of correct recognition of benign pattern and the value of 1.78% shows that this type of pattern is wrongly recognized with malignant pattern. In order to achieve the recognition accuracy (RA) of system, it is needed to compute the average value of that appears in diagonal.
Performance comparison of different clustering algorithms
The performance of the classifiers has been compared with different clustering algorithms. For this purpose, subtractive clustering, ^{[33],[34]} Kmean clustering ^{[18],[19]} and genetic algorithm clustering ^{[35]} are considered. [Table 7] shows the RA of different systems. From the [Table 5] and [Table 7] it can be seen that KMICA clustering based system achieves higher recognition accuracy of 99.11%.
Effects of the parameters of the KMICA on the performance of the method
In this subsection we have analyzed the sensitivity of the recognition system with respect to the N _{imp}, N _{pop}, β, ξ and γ, which control the behavior, and thus the goodness of the KMICA search process. The achieved results for 10 set of parameters are shown in [Table 8]. For surveying the influence of parameters in each case, RBFNNs is tested 50 times independently with various numbers of radial basis functions in hidden layer. The average of best obtained value is depicted in [Table 8]. It illustrates that this hybrid system have a little dependency on variation of the parameters.  Table 8: Recognition accuracy of the recognizer for different values of parameters
Click here to view 
> Comparison and Discussion   
For comparison purposes, [Table 9] gives the classification accuracies of our method and previous methods applied to the same database. As can be seen from the results, proposed method obtains a excellent classification accuracy.  Table 9: Classification accuracies obtained with proposed method and other classifiers from literature.
Click here to view 
> Conclusion   
Accurate recognition of breast cancer tumor is very important for sufficient treatment. This study has investigated the design of an automatic and accurate system for recognition of breast cancer. In this study, the usage of Euclidean distance of patterns from cluster centers are proposed as efficiency input of classifier. The proposed input (Euclidean distance of signals from cluster center) are applied to different neural network that using RBF neural networks, the highest rate of accuracy 99.11% is achieved for surveying the effect of clustering algorithm over system accuracy, various kind of clustering algorithms are performed which best performance is related to KMICA algorithm.
> Appendix:   
A. MICA
B. KMICA
To apply the ICA for clustering, the following steps have to be taken [17]:
Step 1: Generate an initial population
An initial population of input data is generated by chaos initialization as follows:
where center _{j} is jth cluster center for ith country. X _{i} is one of the countries. N _{initial} is the number of population and d is the dimension of each cluster center. x _{j}^{max} and x _{j}^{min} (each feature of center) are the maximum and minimum value of each point referring to the jth cluster center which are in order. K is the number of clusters. H is the number of state variables. X_{0} is an initial solution.
Step 2: Calculate objective function value.
Suppose that we have N sample feature vectors. The objective function is evaluated for each country as follows:
Step 21: i = 1 and Objec = 0
Step 22: select the ith sample.
Step 23: calculate the distances between the ith sample and Center _{j} (j = 1, 2, ..., K).
Step 24: add the value of Object with the minimum distance calculated in Step 23
Step 25: if all samples have been selected, go to the next step, otherwise i = i + 1 and return to step 22.
Step 26: Cost (X) = Objec.
The objective function is calculated mathematically as below:
(N=number of input data)
Step 3: Sort the initial population based on the objective function values.
The initial population is ascended based on the value of their objective function.
Step 4: Select the imperialist states.
Countries that have the minimum objective function are selected as the imperialist states and the remaining ones form the colonies of these imperialists.
Step 5: Divide colonies among imperialist.
Based on power of each imperialist the colonies are divided among them. The power of each imperialist calculated as followings:
In above equations, cost _{n} is the cost of nth imperialist and C _{n} the normalized cost of each one. The normalized power of each imperialist introduced as P _{n}, then the initial number of colonies for each empire will be C_{n} ^{norm} where Ncol and N _{imp} are number of all colonies and imperialists.
Step 6: Use Kmeans algorithm for each empire.
Step 7: Move colonies toward their imperialist states as described in Section 3.
Step 8: Use mutation to change the direction of colonies. It is mentioned in modified ICA.
Step 9: Check the cost of all colonies in each empire.
During the previous steps cost of each colony might have changed. Check the cost of all colonies of an empire if there is one that has a lower cost than its relevant imperialist, exchange their position.
Step 10: Check total cost of each empire.
Cost of each empire depends on power of both imperialist and its colonies. It is calculated as follows:
According to this equation cumulative probability for n = 1 is equal to its probability, while for the last n it corresponds to one.
Then a random number with uniform distribution generates and compares with all C_{Pn}.
Each sector with higher probability will have more chance to be chosen. Therefore the winner empire will specify.
As it is mentioned to use Roulette Wheel algorithm, computing cumulative distribution function is essential. To reduce this time consuming step an approach has been presented as below:
P is the vector of possession probability of all empires and R is a vector with uniformly distributed random numbers. Maximum index in D shows Winner Empire that gets the colony.
After realizing the winner empire, the weakest colony of the weakest empire will be given to the winner one. Then we should subtract one of the populations of this weak empire and add one to the winner's population.
Step 12: Remove weakest empire.
If there is any empire without colony, eliminate it. Replace one of the weakest colonies of best empire (low cost) with this imperialist.
Step 13: Apply chaotic local search (CLS) to search around the global solution. ICA has gained much attention and widespread applications in different fields. However, it often converges to local optima. In order to avoid this shortcoming, a CLS algorithm is used to search around the global solution in the paper. CLS is based on the logistic equation as follows:
where cx_{i} ^{j} indicates the jth chaotic variable, N_{chaos} is the number of individuals for CLS. rand(.) is a random number between zero and one.
The objective function is evaluated for all individuals of CLS. One country selected randomly is replaced with the best solution among them.
Step 14: Check number of empire.
If there is just one empire remained, stop. Else go to step 7.
^{[43]}
> References   
1.  Baguia S, Baguib S, Palc K, Pald N. Breast cancer detection using rank nearest neighbor classification rules. Pattern Recognit 2003;36:2534. 
2.  Furundzic D, Djordjevic M, Bekic A. Neural networks approach to early breast cancer detection. J Syst Architect 1998;44:61733. 
3.  Dengler J, Sabine B, Desaga JF. Segmentation of micro calcifications in mammograms, IEEE Trans Med Imaging 1993;12:63442. 
4.  Lame AF, Schuler S, Fan J, Huda W. Mammographic feature enhancement by multi scale analysis. IEEE transaction 1994. Imag13 (4). 
5.  MarcanoCedeno J, QuintanillaDominguez D, Andina. WBCD breast cancer database classification applying artificial metaplasticity neural network. Expert Syst Appl 2011;38:95739. 
6.  Mitra S, Hayashi Y. Neurofuzzy rule generation: Survey in soft computing framework. IEEE Transaction on Neural Networks 2000;11:74857. 
7.  Nieto J, Torres A. Midpoint for fuzzy sets and their application in medicine. Artif Intell Med 2003;27:32155. 
8.  Russo m, Jain L. Fuzzy learning and application. Englewood Cliffs, NJ: PrenticeHall; 2001. 
9.  Verma K, Zakos J. A computeraided diagnosis system for digital mammograms based on fuzzyneural and feature extraction techniques. IEEE Trans Inform Technol Biomed 2000;16:21923. 
10.  Keles A, Keles A, Yavuz U. Expert system based on neurofuzzy rules for diagnosis breast cancer. Expert Syst Appl 2011;38:571926. 
11.  Mousa R, Munib Q, Moussa A. Breast cancer diagnosis system based on wavelet analysis and fuzzyneural. Expert Syst Appl 2005;28:71323. 
12.  Karabatak M, Ince M. An expert system for detection of breast cancer based on association rules and neural network. Expert Syst Appl 2009;36:34659. 
13.  Ubeyli ED. Implementing automated diagnostic systems for breast cancer detection. Expert Syst Appl 2007;33:105462. 
14.  Choua SM, Leeb TS, Shaoc YE, Chenb IF. Mining the breast cancer pattern using artificial neural networks and multivariate adaptive regression splines. Expert Syst Appl 2004;27:13342. 
15.  Hassan MR, Hossain MM, Begg RK, Ramamohanarao K, Morsi Y. BreastCancer identification using HMMfuzzy approach. Comput Biol Med 2010;40:24051. [PUBMED] 
16.  Addeh J, Ebrahimzadeh A. Breast cancer recognition using a novel hydride intelligent method. journal of medical signals and sensors. JMSS 2012;2:2230. 
17.  Niknam T, Fard ET, Pourjafarian, Rousta A. An efficient hybrid algorithm based on modified imperialist competitive algorithm and Kmeans for data clustering. Eng Appl Artif Intell 2011;24:30617. 
18.  Morales AK, Erazo FR. A search space reduction methodology for data mining in large databases. Eng Appl Artif Intell 2009;22:5765. 
19.  Sakai T, Imiya A. Unsupervised cluster discovery using statistics in scale space. Eng Appl Artif Intell 2009;22:92100. 
20.  AtashpazGargari E, Lucas C. Designing an optimal PID controller using Colonial Competitive Algorithm. In: Proceedings of the First Iranian Joint Congress on Fuzzy and Intelligent Systems, Mashhad, Iran. 2007 
21.  AtashpazGargari E, Lucas C. Imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition. In: Proceedings of the IEEE Congress on Evolutionary Computation, Singapore 2007. p. 46617. 
22.  Rajabioun R, Hashemzadeh F, AtashpazGargari E. Colonial competitive algorithm A novel approach for PID controller design in MIMO distillation column process. Int J Intell Comput Cybernet 2008;3:33755. 
23.  Krishna K, Murty M. Genetic kmeans Algorithm. IEEE Transactions on Systems, Man and Cybernetics B Cybernet 1999;29:4339. 
24.  Niknam T, Amiri B, Olamaie J, Arefi A. An efficient hybrid evolutionary optimization algorithm based on PSO and SA for clustering. J Zhejiang Univ Sci 2009;10:5129. 
25.  Firouzi B, Sadeghi MS, Niknam T. A new hybrid algorithm based on PSO, SA, and Kmeans for cluster analysis. Int J Innov Comput Inform Control 2010;6:110. 
26.  Fathian M, Amiri B. A honeybee mating approach on clustering. Int J Adv Manufac Technol 2007;38:80921. 
27.  Haykin S. Neural networks: A comprehensive foundation. New York: MacMillan; 1999. 
28.  Riedmiller M, Braun H. A direct adaptive method for faster backpropagation learning: The rprop algorithm. In: Proc I CN N.; 1993. p. 58691. 
29.  Hagan MT, Menhaj M. Training feedforward networks with the Marquardt algorithm. IEEE Trans. Neural Networks 1994;5:98993. 
30.  Poggio T, Girosi F. Networks for approximation and learning. In: Proc. IEEE 1990;78:148197. 
31.  Specht DF. Probabilistic neural networks. Neural Networks 1990;3:10918. 
32.  Salzberg SL. On comparing classifiers: Pitfalls to avoid and a recommended approach. Data Min Knowl Discov 1997;1:31728. 
33.  Chiu S. Fuzzy Model Identification Based on Cluster Estimation. J Intell Fuzzy Syst 1994;2:3. 
34.  Yager R, Filev D. Generation of fuzzy rules by mountain clustering. J Intell Fuzzy Syst 1994;2:20919. 
35.  Murthy CA, Chowdhury N. In search of optimal clusters using genetic algorithms. Pattern Recogn Lett 1996;17:82532. 
36.  Quinlan JR. Improved use of continuous attributes in C4.5. J Artif Intell Res 1996;4:77909. 
37.  Hamiton HJ, Shan N, Cercone N. RIAC: A rule induction algorithm based on approximate classification. In International conference on engineering applications of neural networks, University of Regina, 1996. 
38.  Ster B, Dobnikar A0 . Neural networks in medical diagnosis: Comparison with other methods. In Proceedings of the international conference on engineering applications of neural networks 1996; 42730. 
39.  Nauck D, Kruse R. Obtaining interpretable fuzzy classification rules from medical data. Artif Intell Med 1999;16:14969. [PUBMED] 
40.  PenaReyes CA, Sipper M. A fuzzygenetic approach to breast cancer diagnosis. Artif Intell Med 1999;17:13155. [PUBMED] 
41.  Setiono R. Generating concise and accurate classification rules for breast cancer diagnosis. Artif Intell Med 2000;18:20517. [PUBMED] 
42.  Abonyi J, Szeifert F. Supervised fuzzy clustering for the identification of fuzzy classifiers. Pattern Recogn Lett 2003;14:2195207. 
43.  GuijarroBerdias B, FontenlaRomero O, PerezSanchez B, Fraguela P. A linear learning method for multilayer perceptrons using least squares. Lecture Notes Comput Sci 2007;3:36574. 
[Table 5], [Figure 1], [Figure 2], [Figure 3], [Figure 4], [Figure 5], [Figure 6]
[Table 1], [Table 2], [Table 3], [Table 4], [Table 6], [Table 7], [Table 8], [Table 9]
