Differences between revisions of "Parallelizing over GPU cards"

Revision as of 11:30, 5 May 2020

Without data transfer to the GPGPU card

Parallelizing an evolutionary algorithm over GPGPU cards with EASEA is straightforward: just add the "-cuda" option to the easena command line, provided the evaluation function does not need any extra data to evaluate individuals.
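To see why this case parallelizes so easily, here is a minimal Python sketch (not EASEA code, and not what easena generates). It assumes one common benchmark form of the Weierstrass function with constants a = 0.5, b = 3 and 21 terms, which may differ from the function used in weierstrass.ez. The names `weierstrass` and `evaluate_population` are hypothetical. The point is only that each individual is evaluated from its own genome, with no shared data, so the evaluations can be dispatched independently.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Usual benchmark constants (an assumption, not taken from weierstrass.ez).
A, B, K_MAX = 0.5, 3.0, 20


def weierstrass(x):
    """Benchmark Weierstrass function; its minimum is 0 at x = (0, ..., 0)."""
    n = len(x)
    total = 0.0
    for xi in x:
        for k in range(K_MAX + 1):
            total += A**k * math.cos(2.0 * math.pi * B**k * (xi + 0.5))
    # Constant offset that shifts the minimum value to 0.
    offset = n * sum(A**k * math.cos(math.pi * B**k) for k in range(K_MAX + 1))
    return total - offset


def evaluate_population(population, workers=4):
    """Evaluate every individual independently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(weierstrass, population))


population = [[0.0] * 5, [0.1] * 5, [-0.3, 0.2, 0.0, 0.4, -0.1]]
fitnesses = evaluate_population(population)
```

The thread pool only illustrates the independence of the evaluations; in CPython it is not expected to run faster than a plain loop. On a GPGPU card the same independence is what allows many individuals to be evaluated at the same time.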

For example, the weierstrass example (in the examples directory) can first be compiled for the CPU with:

$ make easeaclean ; easena weierstrass.ez ; make

On a PC with an Intel Core i7-9700K overclocked to 4.6 GHz and an NVIDIA GeForce RTX 2080 Ti, a run over 100 generations gives the following:

------------------------------------------------------------------------------------------------
|GENER.|    ELAPSED    |    PLANNED    |     ACTUAL    |BEST INDIVIDUAL|  AVG  | WORST | STAND |
|NUMBER|     TIME      | EVALUATION NB | EVALUATION NB |    FITNESS    |FITNESS|FITNESS|  DEV  |
------------------------------------------------------------------------------------------------
     0	         0.989s	           2048	           2048	1.232901840e+02	1.4e+02	4.8e+00	1.6e+02
	...
    99	        97.065s	         204800	         204800	5.162325287e+01	5.2e+01	1.4e-01	5.3e+01

Now, when compiled with:

$ make easeaclean ; easena weierstrass.ez -cuda ; make

the result becomes:

------------------------------------------------------------------------------------------------
|GENER.|    ELAPSED    |    PLANNED    |     ACTUAL    |BEST INDIVIDUAL|  AVG  | WORST | STAND |
|NUMBER|     TIME      | EVALUATION NB | EVALUATION NB |    FITNESS    |FITNESS|FITNESS|  DEV  |
------------------------------------------------------------------------------------------------
     0	         0.083s	           2048	           2048	1.232649612e+02	1.4e+02	4.8e+00	1.6e+02
	...
    99	         1.802s	         204800	         204800	5.043983459e+01	5.1e+01	1.5e-01	5.1e+01

so a speedup of roughly ×53.9 is observed (97.065 s versus 1.802 s of elapsed time at generation 99). Note that the results are not identical even though the seed is the same, because the random number generators used on the CPU and on the GPU are different.
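The two points above can be checked with a short Python sketch. The first part recomputes the speedup from the elapsed times at generation 99 in the two tables. The second part shows that two different generators given the same seed produce different streams. `ToyLCG` is a hypothetical generator written for this illustration; it is not the generator EASEA uses on either the CPU or the GPU.

```python
import random


def speedup(cpu_seconds, gpu_seconds):
    """Ratio of CPU elapsed time to GPU elapsed time."""
    return cpu_seconds / gpu_seconds


class ToyLCG:
    """Hypothetical linear congruential generator, for illustration only."""

    def __init__(self, seed):
        self.state = seed % 2**32

    def random(self):
        # Classic 32-bit LCG step, scaled to [0, 1).
        self.state = (1664525 * self.state + 1013904223) % 2**32
        return self.state / 2**32


# Elapsed times at generation 99, taken from the two tables.
ratio = speedup(97.065, 1.802)

# Same seed, two different generators: the streams do not match.
SEED = 42
mersenne = random.Random(SEED)  # Python's built-in Mersenne Twister
lcg = ToyLCG(SEED)
draws_a = [mersenne.random() for _ in range(3)]
draws_b = [lcg.random() for _ in range(3)]
```

Each generator is still deterministic on its own: reseeding it with the same value reproduces its stream, which is why repeated CPU runs (or repeated GPU runs) with the same seed can match each other while a CPU run and a GPU run do not.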