EASEA platform

Revision as of 28 June 2010, 13:27, by Admin
Welcome to the EASEA platform

The EASEA massively parallel platform is developed by the SONIC (Stochastic Optimisation and Nature Inspired Computing) theme of the FDBT team at Université de Strasbourg.

EASEA (EAsy Specification of Evolutionary Algorithms) is an Artificial Evolution platform that allows scientists with only basic skills in computer science to exploit the massive parallelism of many-core architectures in order to optimize virtually any real-world problem (continuous, discrete or combinatorial), typically allowing for speedups of around x1,000 on a $5,000 machine (numbers valid in June 2010).

The EASEA code is also compatible with standard CPU machines, and a distributed version is under development that will make it possible to parallelise over a loosely coupled heterogeneous architecture (typically PCs under Windows or Linux and Macintoshes, with or without GPGPU cards).

EASEA Platform

When EASEA meets massive parallelism

The EASEA platform is the outcome of combining the low-cost ($50 to $400) massively parallel GPGPU (General Purpose Graphic Processing Unit) cards developed by the gaming industry with the "old" EASEA language that was developed in 1999 by Pierre Collet at INRIA, along with Evelyne Lutton, Marc Schoenauer and Jean Louchet (Collet, 2000, 2001).

A brief history of EASEA

Back in 1998, a French INRIA "Cooperative Research Action" (ARC) called EVOLAB was started between four French research laboratories, with the aim of coming up with a software platform that would help non-expert programmers try out evolutionary algorithms to optimise their applied problems.

A first prototype of the EASEA (EAsy Specification for Evolutionary Algorithms, pronounced [i:zi:]) language was demonstrated during the EA'99 conference in Dunkerque, and new versions regularly came out on its web-based sourceforge software repository until 2003.

In the meantime, the EASEA language (along with GUIDE (Collet, 2003), its dedicated Graphic User Interface) was used for many research projects and at least four PhDs, as well as for teaching evolutionary computation in several universities. Before 2000, the output of the EASEA compiler was a C++ source file that relied on either the GALib or the EO library for the implementation of the evolutionary operators.

By 2000, the DREAM (Distributed Resource Evolutionary Algorithm Machine) European Project was starting (Arenas, 2002), and it needed a programming language. EASEA became the programming language of the project and was modified so as to output Java source code for the DREAM, while feeding on the same input files. This was another nice feature of the EASEA language: the same .ez file could be compiled to create code for GALib, EO (Lutton, 2002) or the DREAM, depending on a single option flag. This functionality allowed replicability across platforms and computers, as a published .ez program could be recompiled and executed with minimal effort on a Mac, a PC running Windows or Linux, or any other machine where one of the target GALib / EO / DREAM libraries was installed.

Development of the EASEA language stopped with the end of the DREAM project in 2003, and even though it is not supported anymore, EASEA is still regularly downloaded from its sourceforge repository, with more than 5,000 downloads since March 2001.

With the advent of GPGPUs, its ability to generate fully functional and compilable source code from a simple C-like description of the evolutionary operators needed for a particular problem (namely the initialiser, evaluator, crossover operator and mutator) made it look like a wonderful solution for allowing non-expert programmers to use GPGPUs, since, due to their origins and history, these cards take a long time to understand and are quite difficult to program.

A brief history of GPGPU cards

Back in 1965, Moore predicted that the evolution of technology would allow the number of transistors per square millimetre of silicon to double every year, and later on every two years (Moore, 1965). At first, most people mistook this law for the doubling of computer power every two years. This was not a big mistake: even though being able to put twice as many transistors on the same surface did not imply that computers would run twice as fast, it so happened that clock frequency (and more generally the power of single-core computer chips) more or less did double at that rate.

But what was true in the past may not hold forever, and one must observe that the clock frequency of personal computers has now been stuck below 4 GHz (and even 3 GHz) for many years, independently of whether Moore's law still holds. Therefore, even if clock speed is not everything, and certainly not the best performance indicator, the speed of a single CPU core no longer increases as much as before, even though (thanks to Moore's law) the number of cores that chips can hold still increases at the same rate.

Even though a dual-core chip does not make a computer run twice as fast, manufacturers can claim that its power is multiplied by two, which is only correct if the application being run is parallel enough to use both cores at the same time. In computer graphics, many algorithms such as vertex or fragment shaders are inherently parallel, so graphics cards have been able to fully benefit from Moore's law by massively multiplying the cores on the card in order to compute the colour of many fragments in parallel. In 2005, some people realised that such cards, containing as many as 128 cores and claiming a computing power of several hundred gigaflops for less than a thousand dollars, could be used for other calculations than vertex shading.

In November 2006, NVidia launched CUDA (Compute Unified Device Architecture), a Software Development Kit and Application Programming Interface that allows a programmer to use a variation of the C language to code algorithms for GeForce 8 series GPUs. NVidia then came out with TESLA calculators, which are graphics cards without a graphic output, totally dedicated to scientific processing, that (in 2009) boasted 4.3 TeraFlops for around $8,000.

Exploiting massive parallelism to optimize any kinds of problems

Unfortunately, most human-designed algorithms cannot exploit parallelism, as they are sequential in essence, meaning that they rely on instructions that feed on the results of previous calculations. Right now, algorithms that can benefit from massively parallel hardware are quite rare, and most of the time it is not the whole algorithm that can be parallelised, but only very small parts of it.

So, apart from their primary use in the games industry and some rare scientific applications, it may seem that massively parallel hardware cannot be used to solve generic problems. Fortunately, this is where evolutionary algorithms (and the EASEA platform) come in, as the workflow of the image rendering algorithms that led to the development of GPGPU cards is virtually the same as that of evolutionary algorithms: for graphic rendering, the same pixel shader algorithm must be executed in parallel on millions of different pixels of the image, whereas in standard artificial evolution, the same "fitness function" can be executed in parallel to evaluate thousands of "individuals" of the "population".

GPGPU hardware and the EASEA compiler

In terms of hardware, why is it possible for GPGPU chips to host several hundred cores (480 cores for an nVidia GTX480 card in 2010) when "standard" CPUs are still limited to 4 cores (Intel Core i7 in 2010)?

Well, GPU chip designers exploit the fact that the same rendering algorithm must be applied to millions of different pixels or vertices. This allows the architecture to be drastically simplified since, for instance, one Fetch and Dispatch unit (which provides code instructions to the Arithmetic and Logic Units, ALUs) can be shared by many ALUs (which must then execute the same instruction at the same time, but on different data). This special kind of parallelism falls into Flynn's Single Instruction Multiple Data (SIMD) category. By contrast, the cores of a multi-core CPU cannot share the same Fetch and Dispatch unit, because they must be able to execute different instructions (i.e. different code) in parallel. Multi-core CPUs fall into Flynn's Multiple Instruction Multiple Data (MIMD) category.

Then, since rendering algorithms work on millions of pixels while only hundreds of cores are available, the same core will need to execute the very same program over and over again on thousands of different pixels, meaning that heavy pipelining can be envisaged. This makes it possible to do without a memory cache and use the saved space for yet more cores: high-latency memory is not a problem if enough threads can be stacked up on one core. Once more, a solution that is acceptable thanks to the specificities of graphics rendering algorithms is not acceptable for one core of a MIMD multi-core CPU, which needs cache memory because more standard algorithms are rarely required to run several hundred times on different data.

All in all, the hardware specificities of GPGPU cards make them hard to program efficiently, especially for applied scientists who may not be computer geeks, and, for evolutionary algorithms, this is where the EASEA language can provide some help.

In its original implementation, the EASEA compiler wrapped a complete evolutionary algorithm around basic functions representing a problem to be optimized. The modernized compiler of the EASEA massively parallel platform still does the same, but automatically parallelizes the evaluation of the population, possibly over several GPGPU cards, in a quite efficient way.

Running the EASEA platform

Basically, in order to implement an evolutionary algorithm with the EASEA language, the user only needs to write problem-related code in a my_project.ez file, containing:

  • a description of the structure of a potential solution (individual),
  • an initialisation operator in order to create the initial population,
  • a function to evaluate the quality of an individual (fitness function),
  • a crossover operator, allowing the creation of an offspring out of two parents, and
  • a mutation operator, allowing the created offspring to be mutated.

Once this is done, EASEA compiles the my_project.ez file into a set of C++ files that wrap a complete evolutionary algorithm around the provided code.
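As an illustration, a my_project.ez file for a toy problem (minimising the sum of squares of a vector of floats) could look roughly like the sketch below. The section names follow the style of the examples distributed with EASEA, but the exact syntax, helper functions (random, tossCoin) and available sections may differ between versions of the compiler, so this should be read as a hedged sketch rather than a reference.

```
\User declarations :
  #define SIZE 10
\end

\GenomeClass::declare :
  float x[SIZE];
\end

\GenomeClass::initialiser :
  for (int i = 0; i < SIZE; i++)
    Genome.x[i] = random(-1.0, 1.0);
\end

\GenomeClass::crossover :
  // barycentric crossover between the two parents
  float alpha = random(0.0, 1.0);
  for (int i = 0; i < SIZE; i++)
    child.x[i] = alpha * parent1.x[i] + (1 - alpha) * parent2.x[i];
\end

\GenomeClass::mutator :
  for (int i = 0; i < SIZE; i++)
    if (tossCoin(0.1)) Genome.x[i] += random(-0.1, 0.1);
\end

\GenomeClass::evaluator :
  float score = 0.0;
  for (int i = 0; i < SIZE; i++)
    score += Genome.x[i] * Genome.x[i];
  return score;
\end
```

Everything else (selection, replacement, the generational loop, and the GPGPU parallelisation of the evaluator) is produced by the compiler.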

If compiled with the -cuda option, the EASEA compiler automatically parallelises the code for nVidia cards, allowing speedups that depend on the type of card hosted by the computer and on the speed of the host CPU.

In order to get the EASEA platform to run on your machine, please follow the EASEA Quick Installation Guidelines.

Expected speedups

Typically, a reasonable guideline is that a GPGPU core is twice as slow as a CPU core of the same generation. Therefore, one can expect a speedup of around x60 on a 2006 nVidia 8800GTX with 128 cores vs a single core of an Intel CPU of the same era, and a speedup of around x250 on a GTX480 vs a single core of an Intel Core I7 (Maitre 2009a, 2009b, 2010a, 2010b, Kruger 2010).

The PC shown at the top of the page holds four $400 GTX480 cards and is therefore capable of a x1,000 speedup, which (thanks to the ability of evolutionary algorithms to address generic problems) makes it possible to tackle problems that only very large clusters could hope to envisage: one hour on this machine is equivalent to some 50 days of computation of a sequential algorithm on a Core i7, and a three-day weekend is equivalent to nearly 10 years!

If your computer uses an nVidia graphics card or if you plan to buy one, this page is a good reference to evaluate the kind of speedup that your card can reach.

Contact

The EASEA platform is being developed by the SONIC theme of the FDBT team, led by Prof Pierre Collet.

Should you need some help, please contact Frédéric Kruger, Lidia Yamamoto or Ogier Maitre.

References

  1. (Arenas, 2002) Arenas M. G., Collet P., Eiben A. E., Jelasity M., Merelo J. J., Paechter B., Preuss M. and Marc Schoenauer, "A Framework for Distributed Evolutionary Algorithms", Parallel Problem Solving from Nature — PPSN VII, Springer LNCS 2439, ISBN 978-3-540-44139-7, pp 665-675.
  2. (Collet, 2000) Collet P., Lutton E., Schoenauer M., Louchet J., "Take It EASEA", in Parallel Problem Solving from Nature PPSN VI, Springer LNCS 1917, ISBN 978-3-540-41056-0, pp 891-901.
  3. (Collet, 2001) Collet P., Lutton E., Schoenauer M., Louchet J., "EASEA : un langage de spécification pour les algorithmes évolutionnaires", RR-4218, INRIA, Projet Fractales.
  4. (Kruger, 2010) Krüger F., Maitre O., Jimenez S., Baumes L., Collet P., "Speedups between x70 and x120 for a generic local search (memetic algorithm) on a single GPGPU chip", EvoApplications, Springer LNCS 6024, ISBN 978-3-642-12238-5, pp 501-511.
  5. (Lutton, 2002) Lutton E., Collet P., Louchet J., "EASEA Comparisons on Test Functions: GALib versus EO", in Artificial Evolution, Springer LNCS 2310, ISBN 978-3-540-43544-0, pp 219-230.
  6. (Maitre, 2009a) Maitre O., Baumes L., Lachiche N., Corma A., Collet P., "Coarse Grain Parallelization of Evolutionary Algorithms on GPGPU cards with EASEA", Gecco'09, ACM, pp 1403-1410.
  7. (Maitre, 2009b) Maitre O., Lachiche N., Clauss P., Baumes L., Corma A., Collet P., "Efficient Parallel Implementation of Evolutionary Algorithms on GPGPU Cards", EuroPar Parallel Processing, Springer LNCS 5704, ISBN 978-3-642-03868-6, pp 974-985.
  8. (Maitre, 2010a) Maitre O., Lachiche N., Collet P., "Fast Evaluation of GP Trees on GPGPU by Optimizing Hardware Scheduling", Genetic Programming, Springer LNCS 6021, ISBN 978-3-642-12147-0, pp 301-312.
  9. (Maitre, 2010b) Maitre O., Querry S., Lachiche N., Collet P., "EASEA Parallelization of Tree-Based Genetic Programming", IEEE CEC 2010, in press.
  10. (Moore, 1965) Moore G., "Cramming More Components onto Integrated Circuits", Electronics Magazine, 38(8), April 19 1965.