Joao Paulo Papa edited Introduction.tex  over 8 years ago

Commit id: e2585a06b38a9a8473f0294a89c6af2ea6d903f3


The aforementioned scenario makes the $k$-means algorithm a natural candidate for optimization techniques, mainly those based on nature-inspired and evolutionary mechanisms. In fact, not only $k$-means but a number of other techniques have used meta-heuristic-based optimization to cope with problems that can be modeled as finding the decision variables that maximize or minimize a given fitness function. Chen et al.~\cite{Chen_2009}, for instance, employed Genetic Algorithms (GAs) and neural networks to classify both land-use and landslide zones in eastern Taiwan, the former being used to compute the set of weights that combine a number of landslide incidence factors. Nakamura et al.~\cite{Nakamura_2014} dealt with the task of band selection in hyper-spectral imagery through nature-inspired techniques. In essence, the idea is to model the problem of finding the most important bands as a feature selection task: the two problems are equivalent when each pixel is represented by its brightness. More recently, Goel et al.~\cite{Goel_2015} tackled the problem of remote sensing image classification using nature-inspired techniques, namely Cuckoo Search and Artificial Bee Colony. Senthilnatha et al.~\cite{Senthilnatha_2014} used GAs, Particle Swarm Optimization and the Firefly Algorithm for the automatic registration of multi-temporal remote sensing data. In short, the idea is to perform image registration while minimizing a criterion function (Mutual Information in that case). Artificial Immune Systems have also been used to classify remote sensing data~\cite{Kheddam_2014}, with a multi-band image covering the northeastern part of Algiers used for validation purposes.
Coming back to the $k$-means technique, Chandran and Nazeer~\cite{Chandran_2011} proposed to solve the problem of minimizing the distance from each dataset sample to its nearest centroid using Harmony Search, a meta-heuristic optimization technique based on the way musicians create songs in order to obtain the best harmony. Forsati et al.~\cite{Forsati_2008} employed a similar approach, but in the context of web page clustering, while Lin et al.~\cite{Lin_2012} proposed a hybrid approach that combines $k$-means clustering and Particle Swarm Optimization. Later on, Kuo et al.~\cite{Kuo_2013} integrated $k$-means and Artificial Immune Systems for dataset clustering, and Saida et al.~\cite{Saida_2014} employed Cuckoo Search to optimize $k$-means for document classification. Finally, a comprehensive study on the application of nature-inspired techniques to boost $k$-means was presented by Fong et al.~\cite{Fong_2014}. Although all the aforementioned works aimed at enhancing $k$-means using meta-heuristic techniques, little attention has been paid to the application of hyper-heuristic techniques for that purpose, and only a few works have addressed $k$-means optimization in the context of land-use/cover classification. The term ``hyper-heuristics'' was coined to describe algorithms designed to solve general problems by combining known meta-heuristics in such a way that each technique may compensate for the weaknesses of the others~\cite{Ross_2005}. In this context, Papa et al.~\cite{Papa_2015} were among the first to focus on the application of hyper-heuristics to optimize $k$-means, validating the proposed approach in the context of both satellite- and radar-based land-cover classification\footnote{This work was presented at IGARSS'2015.}. That work employed Genetic Programming to combine five variations of the Harmony Search algorithm, with promising results.
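To make the optimization view of $k$-means concrete, the objective that such meta-heuristic approaches minimize can be sketched as follows. This is only an illustrative formulation: it assumes the squared Euclidean distance, and the symbols $\mathcal{X}$ (the set of dataset samples), $\mathbf{c}_j$ (the $j$-th centroid) and $J$ (the fitness value) are notation introduced here for illustration.
\begin{equation}
J(\mathbf{c}_1,\ldots,\mathbf{c}_k) = \sum_{\mathbf{x}\in\mathcal{X}} \min_{1\leq j\leq k} \|\mathbf{x}-\mathbf{c}_j\|^2.
\end{equation}
Under this assumed formulation, a candidate solution would typically encode the $k$ centroids as a single vector of decision variables, and $J$ would play the role of the fitness function to be minimized.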
In this paper, we extend the work by Papa et al.~\cite{Papa_2015} with a deeper experimental analysis, in which Particle Swarm Optimization, the Bat Algorithm and the Firefly Algorithm are also considered, together with Harmony Search and its variants, for combination through Genetic Programming. The results obtained in this paper outperform those of the previous work by Papa et al.~\cite{Papa_2015}, thus emphasizing the benefits of the hyper-heuristic-based framework. The remainder of this paper is organized as follows. Sections~\ref{s.proposed} and~\ref{s.material} present the proposed approach and the experimental setup, respectively. Section~\ref{s.experiments} discusses the experiments, and Section~\ref{s.conclusions} states conclusions and future work.