
Four different snapshots (click to enlarge) from one of my latest books, recently published in Japan: *Ajith Abraham*, *Crina Grosan*, *Vitorino Ramos* (Eds.), “*Swarm Intelligence in Data Mining*” (**群知能と データマイニング**), *Tokyo Denki University* press [TDU], Tokyo, Japan, July 2012.

Figure – **Attractor basins** (fig. 2, p. 6 in Mária Ercsey-Ravasz and Zoltán Toroczkai, "*Optimization hardness as transient chaos in an analog approach to constraint satisfaction*", Nature Physics, vol. 7, pp. 966-970, 2011.)

*Mária Ercsey-Ravasz* and *Zoltán Toroczkai* have proposed a way of mapping satisfiability problems to differential equations and a deterministic algorithm that solves them in polynomial continuous time at the expense of exponential energy functions (so the discrete approximation of the algorithm does not run in polynomial time, and an analogue system would need exponential resources).

The map assigns a phase space to a problem; the algorithm chooses random initial conditions from within that phase space. In the graphs above and below, they pick a 2-d subspace of the phase space and for each initial point in that space they illustrate 1) the particular solution the algorithm finds, 2) the corresponding “solution cluster”, an equivalence class of solutions that identifies two solutions if they differ in exactly one variable assignment, and 3) the time it takes to solve the problem. Each row adds another clause to satisfy.
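The flavour of this mapping can be sketched numerically. The code below is a rough, simplified reading of the paper's construction — analog spins *s_i* ∈ [-1, 1], one exponentially growing auxiliary weight *a_m* per clause, and gradient descent on an energy in which each clause term *K_m* vanishes exactly when the clause is satisfied. The step sizes, normalizations and the toy instance are my own illustrative choices, not the authors' exact equations:

```python
import random

# Toy satisfiable 3-SAT instance: positive int = variable, negative = its negation.
clauses = [(1, 2, 3), (-1, 2, -3), (1, -2, 3)]
n = 3

random.seed(0)
s = [random.uniform(-1, 1) for _ in range(n)]  # analog spins in [-1, 1]
a = [1.0] * len(clauses)                       # auxiliary clause weights

def factors(m):
    """Per-literal factors; their product K_m is 0 iff clause m is satisfied."""
    out = []
    for lit in clauses[m]:
        c = 1 if lit > 0 else -1
        out.append((abs(lit) - 1, c, (1 - c * s[abs(lit) - 1]) / 2))
    return out

dt = 0.01
for _ in range(20000):
    grad = [0.0] * n
    for m in range(len(clauses)):
        facs = factors(m)
        km = 1.0
        for _, _, f in facs:
            km *= f
        for i, c, f in facs:
            kmi = 1.0  # K_m with the i-th factor removed (stays bounded)
            for j, _, g in facs:
                if j != i:
                    kmi *= g
            grad[i] += a[m] * c * km * kmi
        a[m] += dt * a[m] * km  # unsatisfied clauses gain weight exponentially
    for i in range(n):
        s[i] = max(-1.0, min(1.0, s[i] + dt * grad[i]))

assignment = [si > 0 for si in s]  # threshold the spins back to booleans
```

Restarting this integration from many random initial conditions and recording the fraction of runs still unsolved after time *t* is exactly how the escape rate discussed in the paper can be estimated.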

The especially interesting part of the paper is the notion of an **escape rate**: the proportion of trajectories still searching for a solution after a time *t*. In a companion paper on *Sudoku* combinatorial instances (*The Chaos Within Sudoku*, *Nature* Scientific Reports, August 2012), they show that this escape rate correlates strongly with human judgements of hardness. The *escape rate* is similar to *Kolmogorov complexity* in that it assigns a notion of hardness to individual problem instances rather than to classes of problems. The full paper can be retrieved from arXiv: *Mária Ercsey-Ravasz* and *Zoltán Toroczkai*, "**Optimization hardness as transient chaos in an analog approach to constraint satisfaction**", Nature Physics, vol. 7, pp. 966-970, 2011 (on arXiv since August 2012).

Figure – **Attractor basins for 3-XORSAT** (fig. 8, p. 18 in Mária Ercsey-Ravasz and Zoltán Toroczkai, "*Optimization hardness as transient chaos in an analog approach to constraint satisfaction*", Nature Physics, vol. 7, pp. 966-970, 2011.)

Fig. 1 – (click to enlarge) The optimal shortest path among *N*=1265 points depicting a Portuguese *Navalheira* crab, as a result of one of our latest Swarm-Intelligence based algorithms. The problem of finding the shortest path among *N* different points in space is *NP-hard*, known as the *Travelling Salesman Problem* (*TSP*), one of the major and hardest benchmarks in Combinatorial Optimization (link) and Artificial Intelligence. (*V. Ramos*, *D. Rodrigues*, 2012)

This summer my kids grabbed a tiny Portuguese *Navalheira* crab on the shore. After a small photo-session and some baby-sitting with a lettuce leaf, it was time to release it into the ocean again. It not only survived my kids; it is now entitled to a new on-line life on the World Wide Web. After the *Shortest path Sardine* (link) with 1084 points, here is the *Crab* with 1265 points. The algorithm ran for as few as 110 iterations.

Fig. 2 – (click to enlarge) Our 1265 initial points depicting a TSP Portuguese *Navalheira* crab. Could you already envision a minimal tour between all these points?

As usual in the *Travelling Salesman Problem* (*TSP*), we start with a set of points, in our case 1265 points or cities (fig. 2). Given a list of cities and their pairwise distances, the task is to find the *shortest possible tour* that visits each city exactly once. The problem was first formulated as a mathematical problem in 1930 and is one of the most intensively studied problems in optimization. It is used as a benchmark for many optimization methods.

Fig. 3 – (click to enlarge) Again the shortest path *Navalheira* crab, where the optimal contour path (in black: first fig. above) with 1265 points (or cities) was filled in dark orange.

TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing. In these applications, the concept city represents, for example, customers, soldering points, or DNA fragments, and the concept distance represents travelling times or cost, or a similarity measure between DNA fragments. In many applications, additional constraints such as limited resources or time windows make the problem considerably harder.

What follows (fig. 4) is the original crab photo after image segmentation, just before adding *Gaussian* noise in order to retrieve several data points for the initial TSP problem. The algorithm was then fed the extracted *x*,*y* coordinates of these data points (fig. 2) in order for it to discover the minimal path, in just 110 iterations. For extra details, pay a visit to the *Shortest path Sardine* (link) done earlier.

Fig. 4 – (click to enlarge) The original crab photo after some image processing as well as segmentation and just before adding *Gaussian* noise in order to retrieve several data points for the initial TSP problem.

Figure (click to enlarge) – Cover from one of my books published last month (10 July 2012), "*Swarm Intelligence in Data Mining*", recently translated and edited in Japan (by *Tokyo Denki University press* [TDU]). Cover image from Amazon.co.jp (url). Title was translated into **群知能と データマイニング**. Funny also, to see my own name translated into Japanese for the first time – I wonder if it's Kanji. A brief synopsis follows:

(…) **Swarm Intelligence** (SI) is an innovative distributed intelligent paradigm for solving optimization problems that originally took its inspiration from the biological examples of swarming, flocking and herding phenomena in vertebrates. *Particle Swarm Optimization* (PSO) incorporates swarming behaviours observed in flocks of birds, schools of fish, or swarms of bees, and even human social behaviour, from which the idea emerged. *Ant Colony Optimization* (ACO) deals with artificial systems inspired by the foraging behaviour of real ants, which are used to solve discrete optimization problems. Historically, the notion of finding useful patterns in data has been given a variety of names, including data mining, knowledge discovery, information extraction, etc. Data Mining is an analytic process designed to explore large amounts of data in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. In order to achieve this, data mining uses computational techniques from statistics, machine learning and pattern recognition. Data mining and Swarm intelligence may seem not to have many properties in common. However, recent studies suggest that they can be used together for several real-world data mining problems, especially when other methods would be too expensive or difficult to implement. This book deals with the application of swarm intelligence methodologies in data mining. Addressing the various issues of swarm intelligence and data mining using different intelligent approaches is the novelty of this edited volume. This volume comprises 11 chapters, including an introductory chapter giving the fundamental definitions and some important research challenges. Chapters were selected on the basis of fundamental ideas/concepts rather than the thoroughness of techniques deployed. (…) (more)

Fig. 1 – (click to enlarge) The optimal shortest path among *N*=1084 points depicting a Portuguese sardine, as a result of one of our latest Swarm-Intelligence based algorithms. The problem of finding the shortest path among *N* different points in space is *NP-hard*, known as the *Travelling Salesman Problem* (*TSP*), one of the major and hardest benchmarks in Combinatorial Optimization (link) and Artificial Intelligence. (*D. Rodrigues*, *V. Ramos*, 2011)

Almost summer time in Portugal, great weather as usual, and the perfect moment to eat sardines along with friends in open-air esplanades; in fact, a lot of grilled sardines. We usually eat grilled sardines with a tomato-onion salad along with barbecued cherry peppers in salt and olive oil. That's tasty, believe me. Not tasty enough, however, for me and one of my colleagues, *David Rodrigues* (blog link/twitter link). We decided to take this experience a little further, creating the first **shortest path sardine**.

Fig. 2 – (click to enlarge) Our 1084 initial points depicting a TSP Portuguese sardine. Could you already envision a minimal tour between all these points?

As usual in the *Travelling Salesman Problem* (*TSP*), we start with a set of points, in our case 1084 points or cities (fig. 2). Given a list of cities and their pairwise distances, the task is to find the *shortest possible tour* that visits each city exactly once. The problem was first formulated as a mathematical problem in 1930 and is one of the most intensively studied problems in optimization. It is used as a benchmark for many optimization methods. TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing. In these applications, the concept city represents, for example, customers, soldering points, or DNA fragments, and the concept distance represents travelling times or cost, or a similarity measure between DNA fragments. In many applications, additional constraints such as limited resources or time windows make the problem considerably harder. (link)

Fig. 3 – (click to enlarge) A well done and quite grilled shortest path sardine, where the optimal contour path (in blue: first fig. above) with 1084 points was filled in black colour. Nice T-shirt!

Even for toy problems like the present 1084-point *TSP sardine*, the number of possible paths is incredibly huge, and only one of those possible paths is the optimal (minimal) one. Consider for example a TSP with *N*=4 cities, *A*, *B*, *C*, and *D*. Starting in city *A*, the number of possible paths is 6: 1) A-B, B-C, C-D, D-A; 2) A-B, B-D, D-C, C-A; 3) A-C, C-B, B-D, D-A; 4) A-C, C-D, D-B, B-A; 5) A-D, D-C, C-B, B-A; and finally 6) A-D, D-B, B-C, C-A. That is, there are (*N*–1)! [i.e., *N*–1 *factorial*] possible *paths*. For *N*=3 cities, 2×1=2 possible paths; for *N*=4 cities, 3×2×1=6 possible paths; for *N*=5 cities, 4×3×2×1=24 possible paths; … for *N*=20 cities, 121,645,100,408,832,000 possible paths, and so on.
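The count above is easy to verify with a few lines of Python (a quick illustrative check, not our swarm algorithm):

```python
import itertools
import math

cities = ["A", "B", "C", "D"]
# Fix the starting city at A: every ordering of the remaining cities is one tour.
tours = list(itertools.permutations(cities[1:]))
print(len(tours))              # 6 tours, i.e. (N-1)! = 3!
print(math.factorial(20 - 1))  # N=20 gives 121645100408832000
```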

The most direct solution would be to try all permutations (ordered combinations) and see which one is cheapest (using computational brute-force search). The running time for this approach, however, lies within a polynomial factor of *O*(*n*!), the factorial of the number of cities, so this solution becomes impractical even for only 20 cities. One of the earliest applications of dynamic programming is the *Held–Karp algorithm*, which solves the problem in time *O*(*n*^{2}2^{n}).
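For completeness, here is a compact textbook-style sketch of Held–Karp: exact, but at *O*(*n*²2ⁿ) only practical up to roughly 20-25 cities. The example distance matrix (four corners of a unit square) is my own illustration:

```python
import itertools

def held_karp(dist):
    """Exact shortest TSP cycle length via dynamic programming, O(n^2 * 2^n)."""
    n = len(dist)
    # dp[(mask, j)]: shortest path starting at city 0, visiting exactly the
    # cities in `mask`, and ending at city j.
    dp = {(1 | (1 << j), j): dist[0][j] for j in range(1, n)}
    for size in range(3, n + 1):
        for subset in itertools.combinations(range(1, n), size - 1):
            mask = 1 | sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)
                dp[(mask, j)] = min(dp[(prev, k)] + dist[k][j]
                                    for k in subset if k != j)
    full = (1 << n) - 1
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))

# Four cities on the corners of a unit square: the optimal tour is the
# perimeter, with length 4.
square = [[0, 1, 2 ** 0.5, 1],
          [1, 0, 1, 2 ** 0.5],
          [2 ** 0.5, 1, 0, 1],
          [1, 2 ** 0.5, 1, 0]]
print(held_karp(square))  # 4.0
```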

In our present case (*N*=1084) we had to deal with 1083 factorial possible paths, leading to the astronomical number of 1.19×10^{2818} possible solutions. That's roughly 1 followed by 2818 zeroes! – better now to check this *Wikipedia* entry on very *large numbers*. Our new *Swarm-Intelligence* based algorithm, running on a normal PC, was however able to formulate a minimal solution (fig. 1) within just several minutes. We will soon post more about our novel self-organized stigmergic-based algorithmic approach, but meanwhile, if you enjoyed these drawings, do not hesitate to ask us for a grilled cherry pepper as well. We will be pleased to deliver you one by email.
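Since our own stigmergic algorithm is not yet published, here is instead a minimal, generic Ant System for the TSP (in the spirit of Dorigo's classic ACO) to illustrate what stigmergic optimization looks like; the instance size and all parameter values are arbitrary choices for the sketch:

```python
import math
import random

random.seed(7)
N = 12
pts = [(random.random(), random.random()) for _ in range(N)]
dist = [[math.hypot(px - qx, py - qy) for (qx, qy) in pts] for (px, py) in pts]

tau = [[1.0] * N for _ in range(N)]   # pheromone matrix: the stigmergic medium
ALPHA, BETA, RHO = 1.0, 2.0, 0.5      # pheromone weight, heuristic weight, evaporation

def tour_length(tour):
    return sum(dist[tour[i]][tour[(i + 1) % N]] for i in range(N))

best_tour, best_len = None, float("inf")
for _ in range(100):
    tours = []
    for _ant in range(10):
        current, unvisited, tour = 0, set(range(1, N)), [0]
        while unvisited:
            # Pick the next city with probability ~ pheromone^alpha * (1/d)^beta.
            weights = [(j, tau[current][j] ** ALPHA * (1.0 / dist[current][j]) ** BETA)
                       for j in unvisited]
            r, acc = random.uniform(0, sum(w for _, w in weights)), 0.0
            for j, w in weights:
                acc += w
                if acc >= r:
                    break
            tour.append(j); unvisited.remove(j); current = j
        tours.append(tour)
    for i in range(N):                 # evaporate old trails
        for j in range(N):
            tau[i][j] *= 1 - RHO
    for tour in tours:                 # shorter tours deposit more pheromone
        L = tour_length(tour)
        if L < best_len:
            best_tour, best_len = tour, L
        for i in range(N):
            x, y = tour[i], tour[(i + 1) % N]
            tau[x][y] += 1.0 / L
            tau[y][x] += 1.0 / L
```

The pheromone matrix is the only communication channel between ants: good edges accumulate trail, bad edges evaporate, and the colony's collective tour quality emerges from that indirect (stigmergic) feedback.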

p.s. – This is a joint twin post with *David Rodrigues*.

Fig. 4 – (click to enlarge) Zoom at the end sardine tail optimal contour path (in blue: first fig. above) filled in black, from a total set of 1084 initial points.

**On Bilateral Monopolies**: […] Mary has the world's only apple, worth fifty cents to her. John is the world's only customer for the apple, worth a dollar to him. Mary has a monopoly on selling apples; John has a monopoly (technically, a *monopsony*, a buying monopoly) on buying apples. Economists describe such a situation as *bilateral monopoly*.

**What happens?** Mary announces that her price is ninety cents, and if John will not pay it, she will eat the apple herself. If John believes her, he pays. Ninety cents for an apple he values at a dollar is not much of a deal but better than no apple. If, however, John announces that his maximum price is sixty cents and Mary believes him, the same logic holds. Mary accepts his price, and he gets most of the benefit from the trade. This is not a fixed-sum game. If John buys the apple from Mary, the sum of their gains is fifty cents, with the division determined by the price. If they fail to reach an agreement, the summed gain is zero. Each is using the threat of the zero outcome to try to force a fifty cent outcome as favorable to himself as possible. How successful each is depends in part on how convincingly he can commit himself, how well he can persuade the other that if he doesn't get his way the deal will fall through.

Every parent is familiar with a different example of the same game. A small child wants to get her way and will throw a tantrum if she doesn't. The tantrum itself does her no good, since if she throws it you will refuse to do what she wants and send her to bed without dessert. But since the tantrum imposes substantial costs on you as well as on her, especially if it happens in the middle of your dinner party, it may be a sufficiently effective threat to get her at least part of what she wants. Prospective parents resolve never to give in to such threats and think they will succeed. They are wrong. You may have thought out the logic of bilateral monopoly better than your child, but she has hundreds of millions of years of evolution on her side, during which offspring who succeeded in making parents do what they want, and thus getting a larger share of parental resources devoted to them, were more likely to survive to pass on their genes to the next generation of offspring. Her commitment strategy is hardwired into her; if you call her bluff, you will frequently find that it is not a bluff. If you win more than half the games and only rarely end up with a bargaining breakdown and a tantrum, consider yourself lucky.

Herman Kahn, a writer who specialized in thinking and writing about unfashionable topics such as thermonuclear war, came up with yet another variant of the game: the Doomsday Machine. The idea was for the United States to bury lots of very dirty thermonuclear weapons under the Rocky Mountains, enough so that if they went off, their fallout would kill everyone on earth. The bombs would be attached to a fancy Geiger counter rigged to set them off if it sensed the fallout from a Russian nuclear attack. Once the Russians know we have a Doomsday Machine we are safe from attack and can safely scrap the rest of our nuclear arsenal. The idea provided the central plot device for the movie *Doctor Strangelove*. The Russians build a Doomsday Machine but imprudently postpone the announcement (they are waiting for the premier's birthday) until just after an American Air Force officer has launched a unilateral nuclear attack on his own initiative. The mad scientist villain was presumably intended as a parody of Kahn. Kahn described a Doomsday Machine not because he thought we should build one but because he thought we already had. So had the Russians. Our nuclear arsenal and theirs were Doomsday Machines with human triggers. Once the Russians have attacked, retaliating does us no good just as, once you have finally told your daughter that she is going to bed, throwing a tantrum does her no good. But our military, knowing that the enemy has just killed most of their friends and relations, will retaliate anyway, and the knowledge that they will retaliate is a good reason for the Russians not to attack, just as the knowledge that your daughter will throw a tantrum is a good reason to let her stay up until the party is over. Fortunately, the real-world Doomsday Machines worked, with the result that neither was ever used.

For a final example, consider someone who is big, strong, and likes to get his own way. He adopts a policy of beating up anyone who does things he doesn't like, such as paying attention to a girl he is dating or expressing insufficient deference to his views on baseball. He commits himself to that policy by persuading himself that only sissies let themselves get pushed around and that not doing what he wants counts as pushing him around. Beating someone up is costly; he might get hurt and he might end up in jail. But as long as everyone knows he is committed to that strategy, other people don't cross him and he doesn't have to beat them up.

Think of the bully as a Doomsday Machine on an individual level. His strategy works as long as only one person is playing it. One day he sits down at a bar and starts discussing baseball with a stranger, also big, strong, and committed to the same strategy. The stranger fails to show adequate deference to his opinions. When it is over, one of the two is lying dead on the floor, and the other is standing there with a broken beer bottle in his hand and a dazed expression on his face, wondering what happens next. The Doomsday Machine just went off.

With only one bully the strategy is profitable: Other people do what you want and you never have to carry through on your commitment. With lots of bullies it is unprofitable: You frequently get into fights and soon end up either dead or in jail. As long as the number of bullies is low enough so that the gain of usually getting what you want is larger than the cost of occasionally having to pay for it, the strategy is profitable and the number of people adopting it increases. Equilibrium is reached when gain and loss just balance, making each of the alternative strategies, bully or pushover, equally attractive. The analysis becomes more complicated if we add additional strategies, but the logic of the situation remains the same.

This particular example of bilateral monopoly is relevant to one of the central disputes over criminal law in general and the death penalty in particular: Do penalties deter? One reason to think they might not is that the sort of crime I have just described (a barroom brawl ending in a killing; more generally, a crime of passion) seems to be an irrational act, one the perpetrator regrets as soon as it happens. How then can it be deterred by punishment? The economist's answer is that the brawl was not chosen rationally but the strategy that led to it was. The higher the penalty for such acts, the less profitable the bully strategy. The result will be fewer bullies, fewer barroom brawls, and fewer "irrational" killings. How much deterrence that implies is an empirical question, but thinking through the logic of bilateral monopoly shows us why crimes of passion are not necessarily undeterrable. […]

in Chapter 8, **David D. Friedman**, "*Law's Order: What Economics Has to Do With Law and Why it Matters*", Princeton University Press, Princeton, New Jersey, 2000.

Note – Further reading should include David D. Friedman's "*Price Theory and Hidden Order*". A more extensive treatment can be found in "*Game Theory and the Law*", by Douglas G. Baird, Robert H. Gertner and Randal C. Picker, Cambridge, Mass.: Harvard University Press, 1994.
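Friedman's bully-vs-pushover equilibrium is essentially the classic Hawk-Dove game from evolutionary game theory. With hypothetical payoff numbers of my own choosing (V = value of getting your way, C = cost of an all-out fight, C > V), the equilibrium fraction of bullies falls out directly:

```python
V, C = 50.0, 100.0  # hypothetical: prize value and fight cost

def bully_payoff(p):
    """Expected payoff of a bully when a fraction p of the population are bullies."""
    # Against another bully: fight, splitting the prize minus the fight cost.
    # Against a pushover: take the whole prize.
    return p * (V - C) / 2 + (1 - p) * V

def pushover_payoff(p):
    """Expected payoff of a pushover: nothing against bullies, split with pushovers."""
    return (1 - p) * V / 2

p_star = V / C  # equilibrium: both strategies earn the same
print(p_star, bully_payoff(p_star), pushover_payoff(p_star))  # 0.5 12.5 12.5
```

At p* = V/C the two strategies are equally attractive, exactly the "gain and loss just balance" condition in the excerpt; raising the fight cost C (a stiffer penalty) lowers the equilibrium share of bullies.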

Abraham, Ajith; Grosan, Crina; Ramos, Vitorino (Eds.), *Stigmergic Optimization*, Studies in Computational Intelligence (series), Vol. 31, Springer-Verlag, ISBN: 3-540-34689-9, 295 p., Hardcover, 2006.

TABLE OF CONTENTS (short /full) / CHAPTERS:

[1] Stigmergic Optimization: Foundations, Perspectives and Applications.

[2] Stigmergic Autonomous Navigation in Collective Robotics.

[3] A general Approach to Swarm Coordination using Circle Formation.

[4] Cooperative Particle Swarm Optimizers: a powerful and promising approach.

[5] Parallel Particle Swarm Optimization Algorithms with Adaptive Simulated Annealing.

[6] Termite: a Swarm Intelligent Routing Algorithm for Mobile Wireless Ad-hoc Networks.

[7] Linear Multiobjective Particle Swarm Optimization.

[8] Physically realistic Self-Assembly Simulation system.

[9] Gliders and Riders: A Particle Swarm selects for coherent Space-time Structures in Evolving Cellular Automata.

[10] Stigmergic Navigation for Multi-agent Teams in Complex Environments.

[11] Swarm Intelligence: Theoretical proof that Empirical techniques are Optimal.

[12] Stochastic Diffusion search: Partial function evaluation in Swarm Intelligence Dynamic Optimization.

With the current ongoing dramatic need of Africa for contemporary maps (currently, Google promises to launch its first exhaustive world-wide open-access digital cartography of the African continent very soon), back in 1999-2000 we envisioned a very simple idea within a research project (at my previous lab, CVRM IST). Instead of producing new maps in the regular standard way, which is costly (especially for African countries) as well as time-consuming (imagine the amount of money and time needed to cover the whole continent with high-resolution aerial photos), the idea was to hybridize, through an automatic procedure (with the help of Artificial Intelligence), new current data coming from satellites with old data coming from the computational analysis of images of old colonial maps. For instance, old roads segmented in old maps will help us find the new ones coming from the current satellite images, as well as those that were lost. The same goes for bridges, buildings, numbers, letters on the map, etc. However, in order to do this, several preparatory steps were needed. One of those crucial steps was to segment (known to be one of the hardest procedures in image processing) the old roads, buildings and airports in the old maps. Back in 1999-2000, while dealing with several tasks at this research project (**AUTOCARTIS** – **Automatic Methods for Updating Cartographic Maps**), I started to think of using evolutionary computation in order to tackle and surpass this precise problem, in what later became one of the first usages of Genetic Algorithms in image analysis. The result can be checked below. Meanwhile, the experience gained with AUTOCARTIS was later useful not only for old digital books (*Visão* magazine, March 2002), but also for helping us find water on Mars (at the **MARS EXPRESS** European project – *Expresso* newspaper, May 2003), of which CVRM lab was one of the European partners. Quite often in life, simple ideas (I owe it to Prof. *Fernando Muge* and Prof. *Pedro Pina*) are the best ones. This is particularly true in science.

Figure – One original image (left – Luanda, Angola map) and two segmentation examples, rivers and roads respectively, obtained through the proposed Genetic Algorithm (low-resolution images). [at the same time, this precise map of *Luanda* was used by me, along with the face of *Einstein*, to benchmark several dynamic image adaptive perception versus memory experiments via ant-like artificial life systems, over what I then entitled Digital *Image Habitats*]

[] Vitorino Ramos, Fernando Muge, "Map Segmentation by Colour Cube Genetic K-Mean Clustering", Proc. of ECDL'2000 – 4th European Conference on Research and Advanced Technology for Digital Libraries, J. Borbinha and T. Baker (Eds.), ISBN 3-540-41023-6, Lecture Notes in Computer Science, Vol. 1923, pp. 319-323, Springer-Verlag, Heidelberg, Lisbon, Portugal, 18-20 Sep. 2000.

Segmentation of a colour image composed of different kinds of texture regions can be a hard problem, namely computing exact texture fields and deciding the optimum number of segmentation areas in an image when it contains similar and/or non-stationary texture fields. In this work, a method is described for evolving adaptive procedures for these problems. In many real-world applications data clustering constitutes a fundamental issue, whenever behavioural or feature domains can be mapped into topological domains. We formulate the segmentation problem upon such images as an optimisation problem and adopt the evolutionary strategy of Genetic Algorithms for the clustering of small regions in colour feature space. The present approach embeds the k-Means unsupervised clustering method into Genetic Algorithms, namely for guiding this last Evolutionary Algorithm in its search for the optimal or sub-optimal data partition, a task that, as we know, requires a non-trivial search because of its NP-complete nature. To solve this task, the appropriate genetic coding is also discussed, since this is a key aspect in the implementation. Our purpose is to demonstrate the efficiency of Genetic Algorithms for automatic and unsupervised texture segmentation. Some examples on Colour Maps are presented and overall results discussed.
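The paper's exact chromosome coding isn't reproduced in this post, but the flavour of a Genetic-Algorithm-guided k-means in colour space can be sketched as follows. The synthetic two-colour "pixel" data, population sizes and GA operators below are all my own illustrative choices, not the published configuration:

```python
import random

random.seed(1)
# Synthetic "pixels": two well-separated colour blobs in RGB space.
data = ([(random.gauss(40, 8), random.gauss(40, 8), random.gauss(200, 8)) for _ in range(60)]
        + [(random.gauss(210, 8), random.gauss(60, 8), random.gauss(30, 8)) for _ in range(60)])
K = 2  # number of clusters; each chromosome encodes K RGB centroids

def sse(centroids):
    """Fitness: within-cluster sum of squared errors (lower is better)."""
    return sum(min(sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centroids)
               for p in data)

def kmeans_step(centroids):
    """One Lloyd iteration: reassign points, recompute cluster means."""
    groups = [[] for _ in centroids]
    for p in data:
        j = min(range(K), key=lambda j: sum((pi - ci) ** 2
                                            for pi, ci in zip(p, centroids[j])))
        groups[j].append(p)
    return [tuple(sum(d) / len(g) for d in zip(*g)) if g else c
            for g, c in zip(groups, centroids)]

def random_chromosome():
    return [tuple(random.uniform(0, 255) for _ in range(3)) for _ in range(K)]

pop = [random_chromosome() for _ in range(20)]
for gen in range(30):
    pop = [kmeans_step(ch) for ch in pop]  # k-means locally refines each chromosome
    pop.sort(key=sse)
    survivors = pop[:10]                   # elitist selection
    children = []
    for _ in range(10):
        p1, p2 = random.sample(survivors, 2)
        child = [random.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
        if random.random() < 0.3:          # mutation: jitter one centroid
            i = random.randrange(K)
            child[i] = tuple(min(255, max(0, ci + random.gauss(0, 20)))
                             for ci in child[i])
        children.append(child)
    pop = survivors + children

best = min(pop, key=sse)
```

The GA explores the space of centroid sets globally while each k-means step polishes candidates locally, which is the general hybrid idea the abstract describes.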

(to obtain the respective PDF file follow link above or visit chemoton.org)

The Springer book "Swarm Intelligence in Data Mining" (*Studies in Computational Intelligence* series, Vol. 34), published in late 2006, is receiving a fair amount of attention, so much so that early this year Tokyo Denki University press (TDU) decided to negotiate with Springer for the translation rights and copyrights in order to release it in their country in the Japanese language. The Japanese version will shortly become available, and I do hope – being one of the scientific editors – it will receive increasing attention in Japan as well, since Data Mining is one of the most difficult and extraordinary real-world areas we could work on nowadays within computer science. *Multiple Sequence Alignment* (MSA) within Bio-informatics is just one recent example; Financial Markets another. The amount of data CERN's *Large Hadron Collider* (LHC) will collect – 100,000 DVDs every year – is yet another. In order to transform data into information, and information into useful and critical knowledge, reliable and robust Data Mining is more than ever needed in our daily life.

Meanwhile, I wonder how the Japanese cover design will be?! Starting with its own title, which appears to be pretty hard to translate. According to Yahoo BabelFish, the Japanese characters (**群れの知性**) – derived among other language scripts from Kanji – correspond to the English sentence "**Swarm Intelligence**". I wonder if this translation is correct or not, since "*swarm*" in itself is kind of difficult to translate. Some meanings of it point to a *spaghetti dish* as well, which kind of makes some logic too. Moreover, the technical translation of it is also difficult. I guess the best person to handle the translation (at least from the list of colleagues around the world I know) is Claus Aranha (IBA Lab., University of Tokyo). Not only has he worked in Japan for several years now, but some of his work focuses on this precise area.

The SIDM book (*Swarm Int. in Data Mining*) focuses on the hybridization of these two areas. As you may probably know, *Data Mining* (see also *Knowledge Extraction*) refers to a collection of techniques – many of them classical – that aim to tackle large amounts of data, in order to perform classification, clustering, sorting, feature selection, search, forecasting, decision, meaningful extraction, association rule discovery, sequential pattern discovery, etc. In recent years, however (1985-2000), state-of-the-art Artificial Intelligence such as Evolutionary Computation was also used, since some of its problems could be seen as – or properly translated to – optimization problems (namely, combinatorial). The same now happens with Swarm Intelligence, since some of its unique self-organizing distributed features (allowing direct applications over *Grid Computing*) seem ideal for tackling some of the most complex data mining problems we may face today.
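As a concrete taste of the SI side, here is a minimal Particle Swarm Optimization loop in the standard Kennedy-Eberhart form, minimizing a toy sphere function; the parameter values are common textbook defaults, not taken from the book:

```python
import random

random.seed(42)

def sphere(x):
    """Toy objective: sum of squares, minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

DIM, SWARM, ITERS = 2, 30, 200
W, C1, C2 = 0.7, 1.5, 1.5  # inertia, cognitive and social weights

pos = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(SWARM)]
vel = [[0.0] * DIM for _ in range(SWARM)]
pbest = [p[:] for p in pos]                 # each particle's best-known position
gbest = min(pbest, key=sphere)[:]           # swarm's best-known position

for _ in range(ITERS):
    for i in range(SWARM):
        for d in range(DIM):
            r1, r2 = random.random(), random.random()
            # velocity update: inertia + pull toward personal and global bests
            vel[i][d] = (W * vel[i][d]
                         + C1 * r1 * (pbest[i][d] - pos[i][d])
                         + C2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if sphere(pos[i]) < sphere(pbest[i]):
            pbest[i] = pos[i][:]
            if sphere(pbest[i]) < sphere(gbest):
                gbest = pbest[i][:]
```

The whole swarm shares only `gbest`, yet that single piece of social information is enough to pull all particles toward good regions of the search space.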

For those willing for more, I will leave you with its contents (chapters), a foreword to this book by *James Kennedy* (one of the founding fathers of **PSO** – *Particle Swarm Optimization* – along with Russell C. Eberhart and Yuhui Shi), which I vividly recommend (starting with the sentence "**Science is a Swarm**"!), as well as a more detailed description of it:

Swarm Intelligence (SI) is an innovative distributed intelligent paradigm for solving optimization problems that originally took its inspiration from the biological examples of swarming, flocking and herding phenomena in vertebrates. Particle Swarm Optimization (PSO) incorporates swarming behaviors observed in flocks of birds, schools of fish, or swarms of bees, and even human social behavior, from which the idea emerged. Ant Colony Optimization (ACO) deals with artificial systems inspired by the foraging behavior of real ants, which are used to solve discrete optimization problems. Historically, the notion of finding useful patterns in data has been given a variety of names, including data mining, knowledge discovery, information extraction, etc. Data Mining is an analytic process designed to explore large amounts of data in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. In order to achieve this, data mining uses computational techniques from statistics, machine learning and pattern recognition. Data mining and Swarm intelligence may seem not to have many properties in common. However, recent studies suggest that they can be used together for several real-world data mining problems, especially when other methods would be too expensive or difficult to implement. This book deals with the application of swarm intelligence methodologies in data mining. Addressing the various issues of swarm intelligence and data mining using different intelligent approaches is the novelty of this edited volume. This volume comprises 11 chapters, including an introductory chapter giving the fundamental definitions and some important research challenges. Chapters were selected on the basis of fundamental ideas/concepts rather than the thoroughness of techniques deployed.

The eleven chapters are organized as follows. In Chapter 1, Grosan et al. present the biological motivation and some of the theoretical concepts of swarm intelligence with an emphasis on particle swarm optimization and ant colony optimization algorithms. The basic data mining terminologies are explained and linked with some of the past and ongoing works using swarm intelligence techniques. Martens et al. in Chapter 2 introduce a new algorithm for classification, named AntMiner+, based on an artificial ant system with inherent self-organizing capabilities. AntMiner+ differs from the previously proposed AntMiner classification technique in three aspects. Firstly, AntMiner+ uses a MAX-MIN ant system which is an improved version of the originally proposed ant system, yielding better performing classifiers. Secondly, the complexity of the environment in which the ants operate has substantially decreased. Finally, AntMiner+ leads to fewer and better performing rules. In Chapter 3, Jensen presents a feature selection mechanism based on ant colony optimization algorithm to determine a minimal feature subset from a problem domain while retaining a suitably high accuracy in representing the original features. The proposed method is applied to two very different challenging tasks, namely web classification and complex systems monitoring. Galea and Shen in the fourth chapter present an ant colony optimization approach for the induction of fuzzy rules. Several ant colony optimization algorithms are run simultaneously, with each focusing on finding descriptive rules for a specific class. The final outcome is a fuzzy rulebase that has been evolved so that individual rules complement each other during the classification process. In the fifth chapter Tsang and Kwong present an ant colony based clustering model for intrusion detection. The proposed model improves existing ant-based clustering algorithms by incorporating some meta-heuristic principles. 
To further improve the clustering solution and alleviate the curse of dimensionality in network connection data, four unsupervised feature extraction algorithms are also studied and evaluated. Omran et al. in the sixth chapter present particle swarm optimization algorithms for pattern recognition and image processing problems. First, a clustering method based on PSO is discussed, and its application to the problem of unsupervised classification and segmentation of images is investigated. Then PSO-based approaches that tackle the color image quantization and spectral unmixing problems are discussed.

In the seventh chapter, Azzag et al. present a new model for data clustering inspired by the self-assembly behavior of real ants, which can build complex structures by connecting themselves to each other. It is shown that this behavior can be used to build a hierarchical, tree-structured partitioning of the data according to the similarities between the data items. The authors also introduce an incremental version of the artificial-ants algorithm. Kazemian et al. in the eighth chapter present a new swarm data clustering method based on Flowers Pollination by Artificial Bees (FPAB). FPAB requires no parameter settings and no initial information such as the number of classes or the number of partitions of the input data. Initially, in FPAB, bees move pollen and pollinate flowers; each pollen grain grows in proportion to its garden's flowers, and better growth occurs in better conditions. After some iterations, natural selection reduces the pollen and flowers, and gardens of the same type of flower are formed. The prototypes of each garden are taken as the initial cluster centers for the Fuzzy C-Means algorithm, which is used to reduce obvious misclassification errors. In the next stage, the prototypes of the gardens are treated as single flowers and FPAB is applied to them again. Palotai et al. in the ninth chapter propose an Alife architecture for news foraging. News foragers on the Internet were evolved by a simple internal selective algorithm: selection concerned the memory components, which were finite in size and contained the list of the most promising supplies. Foragers received a reward for locating news not yet found and crawled using value estimation. Foragers were allowed to multiply if they passed a given productivity threshold.
A particular property of this community is that there is no direct interaction (here, communication) among the foragers, which allowed the authors to study compartmentalization, assumed to be important for scalability, in a very clear form. Veenhuis and Koppen in the tenth chapter introduce a data clustering algorithm based on species clustering. It combines methods from particle swarm optimization and flocking algorithms. A given set of data is interpreted as a multi-species swarm that wants to separate into single-species swarms, i.e., clusters. The data to be clustered are assigned to datoids, which form a swarm on a two-dimensional plane; a datoid can be imagined as a bird carrying a piece of data on its back. While swarming, this swarm divides into sub-swarms moving over the plane, each consisting of datoids carrying similar data. After swarming, these sub-swarms of datoids can be grouped together as clusters. In the last chapter, Yang et al. present a clustering ensemble model using an ant colony algorithm with a validity index and an ART neural network. Clusterings are formed visually on the plane by ants walking and picking up or dropping projected data objects with different probabilities. Adaptive Resonance Theory (ART) is employed to combine the clusterings produced by ant colonies with different moving speeds.

We are very grateful to the authors of this volume and to the reviewers for their tremendous service in critically reviewing the chapters. The editors would like to thank Dr. Thomas Ditzinger (Springer Engineering In-house Editor, Studies in Computational Intelligence Series), Professor Janusz Kacprzyk (Editor-in-Chief, Springer Studies in Computational Intelligence Series) and Ms. Heather King (Editorial Assistant, Springer Verlag, Heidelberg) for their editorial assistance and excellent cooperation in producing this important scientific work.
We hope that the reader will share our excitement to present this volume on ‘Swarm Intelligence in Data Mining’ and will find it useful.

April, 2006

*Ajith Abraham*, Chung-Ang University, Seoul, Korea

*Crina Grosan*, Cluj-Napoca, Babes-Bolyai University, Romania

*Vitorino Ramos*, IST Technical University of Lisbon, Portugal

Fig. – (Above) A 3D toroidal, fast-changing landscape describing a Dynamic Optimization (DO) control problem (8 frames in total). (Below) A self-organized swarm exhibiting a characteristic flocking migration behaviour, surpassing some local optima in intermediate steps over the 3D toroidal landscape above. At each foraging step, the swarm self-regulates its population and keeps tracking the extrema (44 frames in total). [extra details + PDF]

Vitorino Ramos, Fernandes, C., Rosa, A.C., Abraham, A., Computational Chemotaxis in Ants and Bacteria over Dynamic Environments, in CEC´07 – Congress on Evolutionary Computation, IEEE Press, USA, ISBN 1-4244-1340-0, pp. 1009-1017, Sep. 2007.

Chemotaxis can be defined as an innate behavioural response by an organism to a directional stimulus, in which bacteria and other single-cell or multicellular organisms direct their movements according to certain chemicals in their environment. This is important for bacteria in finding food (e.g., glucose) by swimming towards the highest concentration of food molecules, or in fleeing from poisons. Based on self-organized computational approaches and similar stigmergic concepts, we derive a novel swarm-intelligent algorithm. What is striking in these observations is that both eusocial insects such as ant colonies and bacteria rely on similar natural mechanisms, based on stigmergy, to produce coherent and sophisticated patterns of global collective behaviour. Keeping these characteristics in mind, we present a simple model of the collective adaptation of a social swarm based on real ant colony behaviors (the SSA algorithm), for tracking extrema in dynamic environments and in the highly multimodal complex functions of the well-known De Jong test suite. Later, for purposes of comparison, a recent model of artificial bacterial foraging (the BFOA algorithm) based on similar stigmergic features is described and analyzed. Final results indicate that the SSA collective intelligence is able to cope with and quickly adapt to unforeseen situations, even when, over the same cooperative foraging period, the community is asked to deal with two different and contradictory purposes, while outperforming BFOA in adaptive speed. The results indicate that the present approach deals well with severe Dynamic Optimization problems.
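The chemotactic behaviour described above, moving up a chemical gradient, can be illustrated with a generic run-and-tumble sketch: keep the current heading while the sensed concentration improves, and reorient at random (tumble) when it worsens. This is a textbook illustration of bacterial chemotaxis only, not the SSA or BFOA algorithms of the paper; the function names and parameters are hypothetical:

```python
import math, random

def chemotaxis(concentration, pos=(0.0, 0.0), steps=200, step_size=0.1):
    """Minimal run-and-tumble chemotaxis: accept a step while the
    concentration does not decrease, otherwise tumble to a random
    new heading. Illustrative sketch, not the paper's algorithms."""
    x, y = pos
    angle = random.uniform(0.0, 2.0 * math.pi)
    best = concentration(x, y)
    for _ in range(steps):
        nx = x + step_size * math.cos(angle)
        ny = y + step_size * math.sin(angle)
        c = concentration(nx, ny)
        if c >= best:          # run: keep swimming up the gradient
            x, y, best = nx, ny, c
        else:                  # tumble: reorient at random
            angle = random.uniform(0.0, 2.0 * math.pi)
    return x, y

# hypothetical food source peaking at (2, 3); concentration
# decays with squared distance from the peak
food = lambda x, y: -((x - 2.0) ** 2 + (y - 3.0) ** 2)
gx, gy = chemotaxis(food)
```

Because steps are only accepted when the concentration does not decrease, the final position is never worse than the start, which is the whole gradient-climbing point of the behaviour.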

(to obtain the respective PDF file follow link above or visit chemoton.org)

Video – Thousands of starlings gathering in flocks, flying in formation while displaying complex emergent patterns over S.W. Scotland (more photos & video by/at Fresh Pics, 2007). Here is an artificial version with different purposes: they are not birds, but an entirely different new animal.

[…] In contrast to negative feedback, positive feedback (PF) generally promotes change in the system (the majority of self-organizing (SO) systems use it). The explosive growth of the human population provides a familiar example of the effect of positive feedback. The snowballing, autocatalytic effect of PF takes an initial change in a system (arising from the amplification of fluctuations; a minimal, natural local cluster of objects could be a starting point) and reinforces that change in the same direction as the initial deviation. Self-enhancement, amplification, facilitation and autocatalysis are all terms used to describe positive feedback [9]. Another example is provided by the clustering or aggregation of individuals. Many birds, such as seagulls, nest in large colonies. Group nesting evidently provides individuals with certain benefits, such as better detection of predators or greater ease in finding food. The mechanism in this case is imitation (1): birds preparing to nest are attracted to sites where other birds are already nesting, and the behavioral rule could be synthesized as “I nest close to where you nest”. The key point is that the aggregation of nesting birds at a particular site is not purely a consequence of each bird being attracted to the site per se; rather, the aggregation evidently arises primarily because each bird is attracted to the others (see further references in [7,9]). In social insect societies, PF is illustrated by pheromone reinforcement on trails, allowing the entire colony to exploit past and present solutions. Generally, as in the cases above, positive feedback is imposed on the system implicitly and locally by each one of the constituent units. Fireflies flashing in synchrony [49] follow the rule “I signal when you signal”, fish traveling in schools abide by the rule “I go where you go”, and so forth.
In humans, the “infectious” quality of a yawn or of laughter is a familiar example of positive feedback of the form “I do what you do”. Seeing a person yawning (2), or even just thinking of yawning, can trigger a yawn [9]. There is, however, an associated risk when PF acts alone, without negative feedback, which per se can play a critical role in keeping this snowballing effect under control, providing inhibition to offset the amplification and helping to shape it into a particular pattern. Indeed, the amplifying nature of PF means that it has the potential to produce destructive explosions or implosions in any process where it plays a role. Thus the behavioral rule may be more complicated than initially suggested, possessing both an autocatalytic and an antagonistic aspect. In the case of fish [9], the minimal behavioral rule could be “I nest where others nest, unless the area is overcrowded” (*HEY*!! here we go again to the El Farol Bar problem!). In this case both positive and negative feedback may be coded into the behavioral rules of the fish. Finally, in other cases one finds that the inhibition arises automatically, often simply from physical constraints. […]
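The coupled rule “I nest where others nest, unless the area is overcrowded” can be sketched as a toy settlement model: each arriving individual prefers already-occupied sites in proportion to their occupancy (positive feedback), but a hard capacity cap removes crowded sites from consideration (negative feedback). Every name and parameter below is hypothetical, chosen only to make the two feedbacks visible:

```python
import random

def settle(n_birds=100, n_sites=5, attraction=1.0, capacity=30):
    """Toy aggregation model: occupied sites attract newcomers
    (positive feedback) until a capacity cap inhibits further
    growth (negative feedback). Illustrative parameters only."""
    counts = [0] * n_sites
    for _ in range(n_birds):
        # negative feedback: full sites are no longer options
        open_sites = [i for i in range(n_sites) if counts[i] < capacity]
        # positive feedback: weight each open site by its occupancy
        weights = [1.0 + attraction * counts[i] for i in open_sites]
        choice = random.choices(open_sites, weights=weights)[0]
        counts[choice] += 1
    return counts

colonies = settle()
```

Run repeatedly, the model shows the qualitative pattern the text describes: without the cap a single site tends to snowball; with it, growth saturates and spills over into new colonies, i.e. amplification shaped into a pattern by inhibition.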

in, V. Ramos et al., “Social Cognitive Maps, Swarm Collective Perception and Distributed Search on Dynamic Landscapes“.

(1) See also on this subject the seminal sociological work of Gabriel Tarde; Tarde, G., Les Lois de l’Imitation, Eds. du Seuil (2001), 1st Edition, Eds. Alcan, Paris, 1890.

(2) Similarly, Milgram et al. (Milgram, Bickman and Berkowitz, “Note on the Drawing Power of Crowds of Different Size”, Journal of Personality and Social Psychology, 13, 1969) found that if one person stood on a Manhattan street gazing up at a sixth-floor window, 20% of pedestrians looked up; if five people stood gazing, then 80% of people looked up.

(to obtain the respective PDF file follow this link or visit chemoton.org)
