Octavio Aburto David and Goliath CaboPulmo NatGeo2012

For several years, Octavio Aburto thought of one photo. Now, he finally got it. The recently published photograph by Aburto, titled “David and Goliath” (it is in fact David Castro, one of his research science colleagues, at the center of this stunning image), has been widely shared over the last few weeks. It was taken at Cabo Pulmo National Park (Mexico) and submitted to the National Geographic photo contest 2012. Here, he captures the sheer size of fish aggregations in perspective with a single human surrounded by abundant marine life. In a recent interview, he explains:

[…] … this “David and Goliath” image is speaking to the courtship behavior of one particular species of Jack fish. […] Many people say that a single image is worth a thousand words, but a single image can also represent thousands of data points and countless statistical analyses. One image, or a small series of images can tell a complicated story in a very simple way. […] The picture you see was taken November 1st, 2012. But this picture has been in my mind for three years — I have been trying to capture this image ever since I saw the behavior of these fish and witnessed the incredible tornado that they form during courtship. So, I guess you could say this image took almost three years. […], in mission-blue.org , Dec. 2012.

Video – Behind the scenes of the “David and Goliath” image. You can see more of his images from this place and of the Mexican seas on Octavio’s web link.

“There’s Plenty of Room at the Bottom”, ~ Richard Feynman (referring to nanotechnology).

There are huge life scales in our world with which we are not acquainted. While some prefer to wonder about “alien life” in movie theaters while eating popcorn, right here on planet Earth some lakes and rivers are full of it. What you see above is a tiny water flea, ‘Crown Thorns’, photographed by zoologist Jan Michels (Christian Albrecht University in Kiel, Germany). It was nominated as the best microscopic life image of 2009, last week, at BioScapes (short for Biological landscapes – a competition sponsored by Olympus in order to recognize microscope photos of plants, animals, and other life-forms that capture the “fascinating minutia of life”).

The snaking ridge at top left took top honors in the 2009 BioScapes microscope imaging contest. If water flea parents sense that their habitat is shared by their main predators, tadpole shrimp, the flea offspring sport these pointy crowns, which are unappetizing to the shrimp. Jan Michels added a dye to reveal the tiny animal’s exoskeleton (green) and cellular nuclei (blue smudges). The blue-and-red dots are one of the animal’s compound eyes, like those of a fly.

This image kind of reminds me of another one I used for a series of Artificial Intelligence conferences I held in 2004 (Budapest, Hungary), 2005 (Muroran, Japan) and 2006 (Jinan, China) (SIP workshop series, Swarm Intelligence and Patterns). The image below was used as the conference symbol: a termite head scanned through SEM (Scanning Electron Microscope), taken by the University of Toronto, Canada.

But probably one of the images I love most at this nano-scale is one of a red ant grabbing a tiny electronic circuit board (microchip) in its mouth (Science Museum, UK). The reason is simple: this image (below) could have several readings. In SEM, the image is formed by focusing an electron beam onto the sample surface. As the beam scans across the surface, the sample emits secondary electrons, which are then detected and used to modulate the image signal, much like a television. More electrons translate into a brighter image. As the beam scans the surface, each point is mapped out just as the electron beam in a television maps the image onto the screen. Here we are able to see all the details of one of nature’s smallest denizens holding one of mankind’s smallest creations, a silicon microchip (the building block of digital electronics).

What’s funny is that ant colonies are known (among many other interesting features) for their remarkable cemetery organization capabilities, that is, their sequential clustering of corpses and objects (such as this microchip below). Social insect colonies show that the coordination and regulation of building activities do not depend on the workers themselves but are mainly achieved by the nest structure: a stimulating object configuration triggers the response of a worker, transforming the configuration into another configuration that may in turn trigger another (possibly different) action performed by the same worker or any other worker in the colony.

Ants do all this by simply manipulating objects using stigmergic capabilities. Ants form piles of items such as dead bodies (corpses), larvae, or grains of sand. Initially, they deposit items at random locations. When other ants perceive deposited items, they are stimulated to deposit items next to them. This kind of cemetery clustering and brood sorting is a type of self-organization and adaptive behavior. Some bio-inspired branches of computer science use this kind of behavior to solve highly complex problems, such as data mining, data analysis and classification, data clustering, and image retrieval, among many others.
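
To make the pile-forming rule a bit more concrete, here is a minimal Python sketch of ant-like clustering on a small toroidal grid. It is only an illustration: the pick-up and drop probabilities follow the commonly used Deneubourg-style form, and the grid size, the constants k1 and k2, and all function names are assumptions of mine, not taken from any specific system mentioned in this post.

```python
import random

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def local_density(grid, x, y):
    """Fraction of the 8 neighbouring cells (toroidal grid) holding an item."""
    size = len(grid)
    neighbours = [grid[(x + dx) % size][(y + dy) % size]
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  if (dx, dy) != (0, 0)]
    return sum(neighbours) / len(neighbours)

def pick_probability(f, k1=0.1):
    """High when few items are nearby (isolated items get picked up)."""
    return (k1 / (k1 + f)) ** 2

def drop_probability(f, k2=0.3):
    """High when many items are nearby (items get dropped next to piles)."""
    return (f / (k2 + f)) ** 2

def simulate(size=20, n_items=60, n_ants=10, steps=2000, seed=0):
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    # Scatter items at random, distinct locations.
    for cell in rng.sample(range(size * size), n_items):
        grid[cell // size][cell % size] = 1
    ants = [{"x": rng.randrange(size), "y": rng.randrange(size), "carrying": False}
            for _ in range(n_ants)]
    for _ in range(steps):
        for ant in ants:
            # Random walk: one step up, down, left or right.
            dx, dy = rng.choice(MOVES)
            ant["x"] = (ant["x"] + dx) % size
            ant["y"] = (ant["y"] + dy) % size
            x, y = ant["x"], ant["y"]
            f = local_density(grid, x, y)
            if not ant["carrying"] and grid[x][y] == 1:
                if rng.random() < pick_probability(f):
                    grid[x][y] = 0
                    ant["carrying"] = True
            elif ant["carrying"] and grid[x][y] == 0:
                if rng.random() < drop_probability(f):
                    grid[x][y] = 1
                    ant["carrying"] = False
    return grid, ants
```

No ant knows where the piles are. Each one only reacts to what it perceives locally, and the items it moves change what the next ant perceives, which is the stigmergic loop described above.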

Indeed, life on its own is the ultimate science-fiction. And, as Richard Feynman mentioned once, there is plenty of room below!

Figure – Book cover of Toby Segaran’s, “Programming Collective Intelligence – Building Smart Web 2.0 Applications“, O’Reilly Media, 368 pp., August 2007.

{scopus online description} Want to tap the power behind search rankings, product recommendations, social bookmarking, and online matchmaking? This fascinating book demonstrates how you can build Web 2.0 applications to mine the enormous amount of data created by people on the Internet. With the sophisticated algorithms in this book, you can write smart programs to access interesting data-sets from other web sites, collect data from users of your own applications, and analyze and understand the data once you’ve found it. Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general — all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application.

{even if I don’t totally agree, here’s an “over-rated” description, especially on the scientific side, by someone “dwa” – link above} Programming Collective Intelligence is a new book from O’Reilly, written by Toby Segaran. The author graduated from MIT and is currently working at Metaweb Technologies. He develops ways to put large public data-sets into Freebase, a free online semantic database. You can find more information about him on his blog: http://blog.kiwitobes.com/. Web 2.0 cannot exist without Collective Intelligence. The “giants” use it everywhere: YouTube recommends similar movies, Last.fm knows what you would like to listen to, Flickr knows which photos are your favorites, etc. This technology empowers intelligent search, clustering, building price models and ranking on the web. I cannot imagine a modern service without data analysis. That is why it is worth starting to read about it. There are many titles about collective intelligence, but recently I have read two: this one and “Collective Intelligence in Action”. Both are very pragmatic, but the O’Reilly one is more focused on the merits of CI. The code listings are much shorter (the examples are written in Python, so that was easy). In general, comparing these books is like Java vs. Python. If you would like to build a recommendation engine the “in Action”/Java way, you would have to read the whole book, attach extra jars and design dozens of classes. The rapid Python way requires reading only 15 pages and voilà, you have got the first recommendations. It is awesome!

So how about the rest of the book? There are still 319 pages! Further chapters cover discovering groups, searching, ranking, optimization, document filtering, decision trees, price models and genetic algorithms. The book explains how to implement Simulated Annealing, k-Nearest Neighbors, a Bayesian Classifier and many more. Take a look at the table of contents (here: http://oreilly.com/catalog/9780596529321/preview.html); it does not list all the algorithms, but you can find more information there. Each chapter has about 20-30 pages. You do not have to read them all; you can choose the most important ones and still know what is going on. Every chapter contains a minimal theoretical introduction, which might not be enough for total beginners. I recommend this book to students who have had a statistics course (not only in IT or computing science); it will show you how to use your knowledge in practice, and there are many inspiring examples. For those who do not know Python: do not be afraid, at the beginning you will find a short introduction to the language syntax. All listings are very short and well described by the author, sometimes line by line. The book also contains the necessary information about the basic standard libraries responsible for XML processing and web page downloading. If you would like to start learning about collective intelligence, I would strongly recommend reading “Programming Collective Intelligence” first, then “Collective Intelligence in Action”. The first one shows how easy it is to implement basic algorithms; the second one shows how to use existing open source projects related to machine learning.

Video – Merci! (also referred to as Bodhisattva in metro), a short film by Belgian director Christine Rabette, awarded in 2003 with a Golden Wave for best Short Film (Court-Métrage), now climbing to more than half a million views on YouTube. Along with yawning and the flu, few things are as contagious and viral as laughter. After all, we are humans not androids, for god’s sake!

[…] In contrast to negative feedback, positive feedback (f+) generally promotes changes in the system (the majority of SO systems use them). The explosive growth of the human population provides a familiar example of the effect of positive feedback. The snowballing autocatalytic effect of f+ takes an initial change in a system (due to amplification of fluctuations; a minimal and natural local cluster of objects could be a starting point) and reinforces that change in the same direction as the initial deviation. Self-enhancement, amplification, facilitation, and autocatalysis are all terms used to describe positive feedback [9]. Another example could be provided by the clustering or aggregation of individuals. Many birds, such as seagulls, nest in large colonies. Group nesting evidently provides individuals with certain benefits, such as better detection of predators or greater ease in finding food. The mechanism in this case is imitation: birds preparing to nest are attracted to sites where other birds are already nesting, while the behavioral rule could be synthesized as “I nest close to where you nest”. The key point is that aggregation of nesting birds at a particular site is not purely a consequence of each bird being attracted to the site per se. Rather, the aggregation evidently arises primarily because each bird is attracted to others (check [7,9] for further references). In social insect societies, f+ could be illustrated by the pheromone reinforcement on trails, allowing the entire colony to exploit some past and present solutions. Generally, as in the above cases, positive feedback is imposed implicitly on the system and locally by each one of the constituent units. Fireflies flashing in synchrony [49] follow the rule, “I signal when you signal”, fish traveling in schools abide by the rule, “I go where you go”, and so forth.
In humans, the “infectious” quality of a yawn or of laughter is a familiar example of positive feedback of the form, “I do what you do”. Seeing a person yawning, or even just thinking of yawning, can trigger a yawn [9]. There is however one associated risk, generally when f+ acts alone, without the presence of negative feedback, which per se can play a critical role in keeping this snowballing effect under control, providing inhibition to offset the amplification and helping to shape it into a particular pattern. Indeed, the amplifying nature of f+ means that it has the potential to produce destructive explosions or implosions in any process where it plays a role. Thus the behavioral rule may be more complicated than initially suggested, possessing both an autocatalytic as well as an antagonistic aspect. In the case of fish [9], the minimal behavioral rule could be “I nest where others nest, unless the area is overcrowded”. In this case both the positive and negative feedback may be coded into the behavioral rules of the fish. Finally, in other cases one finds that the inhibition arises automatically, often simply from physical constraints. […], in, Social Cognitive Maps, Swarm Collective Perception and Distributed Search on Dynamic Landscapes.
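
The rule “I nest where others nest, unless the area is overcrowded” can be played with in a few lines of Python. This is a toy sketch under my own assumptions, not the model from the paper quoted above: attraction to a site is taken to be proportional to 1 + the number of occupants (positive feedback), and a hard capacity per site stands in for the negative feedback.

```python
import random

def choose_site(counts, capacity, rng):
    """Pick a site with weight 1 + occupants, or None if every site is full."""
    # Positive feedback: busier sites attract more newcomers.
    # Negative feedback: a full site has weight 0 and attracts nobody.
    weights = [(1 + c) if c < capacity else 0 for c in counts]
    if sum(weights) == 0:
        return None
    return rng.choices(range(len(counts)), weights=weights, k=1)[0]

def settle(n_birds=200, n_sites=10, capacity=50, seed=1):
    """Let birds arrive one by one and return the occupants per site."""
    rng = random.Random(seed)
    counts = [0] * n_sites
    for _ in range(n_birds):
        site = choose_site(counts, capacity, rng)
        if site is None:  # nowhere left to nest
            break
        counts[site] += 1
    return counts
```

With a generous capacity the early random choices get amplified and a few sites tend to end up much busier than the rest. Lowering the capacity lets the inhibition take over and spreads the birds across sites.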

With Africa’s current dramatic need for contemporary maps (Google promises to launch its first exhaustive, world-wide, open-access digital cartography of the African continent very soon), back in 1999-2000 we turned a very simple idea into a research project (at my previous lab, CVRM IST). Instead of producing new maps in the standard way, which is costly (especially for countries of the African continent) as well as time consuming (imagine the amount of money and time needed to cover the whole continent with high resolution aerial photos), the idea was to hybridize, through an automatic procedure (with the help of Artificial Intelligence), new data coming from satellites with old data coming from the computational analysis of images of old colonial maps. For instance, old roads segmented in old maps will help us find the new ones in current satellite images, as well as those that were lost. The same goes for bridges, buildings, numbers, letters on the map, etc. However, in order to do this, several preparatory steps were needed. One of those crucial steps was to obtain (segment, known to be one of the hardest procedures in image processing) the old roads, buildings and airports in the old maps. Back in 1999-2000, while dealing with several tasks in this research project (AUTOCARTIS, Automatic Methods for Updating Cartographic Maps), I started to think of using evolutionary computation to tackle this precise problem, in what later became one of the first uses of Genetic Algorithms in image analysis. The result can be seen below. Meanwhile, the experience gained with AUTOCARTIS was later useful not only for digitized old books (Visão Magazine, March 2002), but also for helping us find water on Mars (in the MARS EXPRESS European project, Expresso newspaper, May 2003), in which the CVRM lab was one of the European partners.
Very often in life, simple ideas (I owe this one to Prof. Fernando Muge and Prof. Pedro Pina) are the best ones. This is particularly true in science.

Figure – One original image (left – Luanda, Angola map) and two segmentation examples, rivers and roads respectively obtained through the Genetic Algorithm proposed (low resolution images). [at the same time this precise Map of Luanda, was used by me along with the face of Einstein to benchmark several dynamic image adaptive perception versus memory experiments via ant-like artificial life systems over what I then entitled Digital Image Habitats]

[] Vitorino Ramos, Fernando Muge, Map Segmentation by Colour Cube Genetic K-Mean Clustering, Proc. of ECDL´2000 – 4th European Conference on Research and Advanced Technology for Digital Libraries, J. Borbinha and T. Baker (Eds.), ISBN 3-540-41023-6, Lecture Notes in Computer Science, Vol. 1923, pp. 319-323, Springer-Verlag -Heidelberg, Lisbon, Portugal, 18-20 Sep. 2000.

Segmentation of a colour image composed of different kinds of texture regions can be a hard problem, namely computing exact texture fields and deciding the optimum number of segmentation areas in an image when it contains similar and/or non-stationary texture fields. In this work, a method is described for evolving adaptive procedures for these problems. In many real world applications data clustering constitutes a fundamental issue whenever behavioural or feature domains can be mapped into topological domains. We formulate the segmentation problem upon such images as an optimisation problem and adopt the evolutionary strategy of Genetic Algorithms for the clustering of small regions in colour feature space. The present approach embeds k-Means unsupervised clustering methods into Genetic Algorithms, namely for guiding this last Evolutionary Algorithm in its search for the optimal or sub-optimal data partition, a task that, as we know, requires a non-trivial search because of its NP-complete nature. To solve this task, the appropriate genetic coding is also discussed, since this is a key aspect in the implementation. Our purpose is to demonstrate the efficiency of Genetic Algorithms for automatic and unsupervised texture segmentation. Some examples in Colour Maps are presented and overall results discussed.
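
For readers who want a feel for how a Genetic Algorithm and k-Means can work together in colour space, below is a small Python sketch. It is not the genetic coding discussed in the paper (which is not reproduced in this post); it is a generic hybrid in which each individual is a list of k colour centres, one k-means step is used as local refinement, and fitness is the within-cluster sum of squared distances. All names, parameters and the selection scheme are assumptions chosen for illustration.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cost(pixels, centres):
    """Within-cluster sum of squared distances (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centres) for p in pixels)

def kmeans_step(pixels, centres):
    """One k-means update: move each centre to the mean of its pixels."""
    labels = [min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
              for p in pixels]
    new_centres = []
    for j, centre in enumerate(centres):
        members = [p for p, lab in zip(pixels, labels) if lab == j]
        if members:
            new_centres.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
        else:
            new_centres.append(centre)  # an empty cluster keeps its centre
    return new_centres

def evolve(pixels, k=3, pop_size=12, generations=15, sigma=10.0, seed=0):
    """Return the best set of k colour centres found (needs k >= 2, pop_size >= 4)."""
    rng = random.Random(seed)
    # Each individual starts as k centres drawn from the pixels themselves.
    population = [[tuple(rng.choice(pixels)) for _ in range(k)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # k-means guides the search: refine every individual by one step.
        population = [kmeans_step(pixels, ind) for ind in population]
        population.sort(key=lambda ind: cost(pixels, ind))
        parents = population[: pop_size // 2]  # keep the better half (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k)  # one-point crossover on the list of centres
            child = a[:cut] + b[cut:]
            # Mutation: occasionally jitter a centre (values are not clipped to 0..255).
            child = [tuple(v + rng.gauss(0, sigma) for v in c)
                     if rng.random() < 0.2 else c
                     for c in child]
            children.append(child)
        population = parents + children
    return min(population, key=lambda ind: cost(pixels, ind))
```

Here `pixels` is simply a list of (R, G, B) tuples. Labelling every pixel of a map with its nearest returned centre then gives a colour segmentation, with k fixed in advance in this toy version.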

(to obtain the respective PDF file follow link above or visit chemoton.org)

[...] People should learn how to play Lego with their minds. Concepts are building bricks [...] V. Ramos, 2002.
