
High Performance Linpack

HPL (High Performance LINPACK) is the industry-standard benchmark for HPC. It solves a (random) dense linear system in double precision (64-bit) on distributed-memory computers.

HPL is well suited to testing clusters mainly because its result reflects processor power, the total amount of memory, and the interconnect between nodes. The program's input parameters are not prescribed in advance and can be tuned to obtain the best possible result, which is reported in Gflops. The theoretical peak performance of the HPCFS cluster is:

Rpeak = 2.93 GHz * 768 cores * 4 operations/cycle = 9001 Gflops ≈ 9 Tflops
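
The peak figure is a plain product of clock rate, core count, and floating-point operations per cycle. A minimal Python sketch of that arithmetic follows; the function name is illustrative and is not part of HPL or of the cluster's tooling.

```python
def theoretical_peak_gflops(clock_ghz: float, cores: int, flops_per_cycle: int) -> float:
    """Return the theoretical peak in Gflops: clock (GHz) * cores * flops per cycle."""
    return clock_ghz * cores * flops_per_cycle


if __name__ == "__main__":
    # Values quoted in the text for the HPCFS cluster.
    rpeak = theoretical_peak_gflops(2.93, 768, 4)
    print(f"Rpeak = {rpeak:.0f} Gflops = {rpeak / 1000:.1f} Tflops")
```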

The figure shows the course of the test on the HPCFS cluster with 768 cores (blue curve) for different system sizes. The highest performance achieved is Rmax = 8 Tflops out of the theoretical 9 Tflops. More memory, and therefore larger problem sizes N, gives somewhat better results. Above all, though, higher performance comes from a larger number of processor cores. With fewer cores, HPL performance is proportionally lower, as shown by the red curve, which represents the half-capacity run with 380 processes. The largest problem size that fits is smaller as well. Communication limits all of these factors, so the results level off as the problem grows. As N approaches the limit of available memory, the solution time rises sharply, which indicates that the system has started swapping. The tests showed that enabling SMT and Turbo mode does not slow down classic MPI programs such as HPL.
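
To see why the largest N is bounded by memory, it helps to estimate the size of the coefficient matrix alone: N * N double-precision values at 8 bytes each. The Python sketch below does that, assuming the matrix is spread evenly over all processes. It ignores HPL workspace, MPI buffers, and the operating system, so it is a lower bound only; the function names are illustrative.

```python
BYTES_PER_DOUBLE = 8  # 64-bit double precision, as stated in the introduction


def matrix_bytes(n: int) -> int:
    """Bytes needed to store a dense N x N matrix of doubles."""
    return BYTES_PER_DOUBLE * n * n


def matrix_gib_per_process(n: int, processes: int) -> float:
    """Matrix share per process in GiB, assuming an even distribution."""
    return matrix_bytes(n) / processes / 2**30


if __name__ == "__main__":
    # A few problem sizes taken from the HPL run on 768 processes.
    for n in (300048, 613872, 623112):
        print(f"N = {n}: {matrix_gib_per_process(n, 768):.2f} GiB per process")
```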

The best HPL result achieved on the HPCFS cluster is 8 Tflops, which corresponds to a parallel efficiency of 89%. The result is comparable to that of similar clusters.
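
The efficiency figure is the ratio of the best measured rate to the theoretical peak. A small Python sketch of the calculation, using the best Gflops value from the HPL output (8.002e+03 at N = 600096) and the 9001 Gflops peak quoted above; the function name is illustrative.

```python
def parallel_efficiency(rmax_gflops: float, rpeak_gflops: float) -> float:
    """Return measured performance as a fraction of theoretical peak."""
    if rpeak_gflops <= 0:
        raise ValueError("rpeak_gflops must be positive")
    return rmax_gflops / rpeak_gflops


if __name__ == "__main__":
    rmax = 8.002e3   # best Gflops value reported by HPL (N = 600096)
    rpeak = 9001.0   # theoretical peak quoted in the text
    print(f"efficiency = {parallel_efficiency(rmax, rpeak):.1%}")
```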

 

The following Linpack output was produced on 768 cores with SMT and Turbo enabled, using the Intel 11.1 compiler, MKL 10.2, and OpenMPI 1.4.3:

 ================================================================================
HPLinpack 2.0  --  High-Performance Linpack benchmark  --   September 10, 2008
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :    1008    10080    20160    30072    50064   100128   150024   200424
          300048   200424   300048   400008   418488   500136   600096   610008
          613872   619920   620928   623112
NB     :     168
PMAP   : Row-major process mapping
P      :      24
Q      :      32
PFACT  :   Crout
NBMIN  :       4
NDIV   :       2
RFACT  :   Crout
BCAST  :  2ringM
DEPTH  :       0
SWAP   : Mix (threshold = 128)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               2.220446e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4        1008   168    24    32               0.20              3.493e+00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0029225 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4       10080   168    24    32               1.74              3.917e+02
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0019798 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4       20160   168    24    32               2.93              1.867e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0017076 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4       30072   168    24    32               6.86              2.642e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0013021 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4       50064   168    24    32              21.22              3.942e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0011398 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      100128   168    24    32             109.84              6.093e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0008548 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      150024   168    24    32             319.56              7.044e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0007470 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      200424   168    24    32             727.34              7.379e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0007286 ...... PASSED
================================================================================
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      300048   168    24    32            2328.99              7.732e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0007188 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      200424   168    24    32             728.05              7.372e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0007286 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      300048   168    24    32            2331.25              7.725e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0007188 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      400008   168    24    32            5401.88              7.899e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0005789 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      418488   168    24    32            6167.65              7.922e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0005077 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      500136   168    24    32           10481.26              7.957e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0005143 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      600096   168    24    32           18004.10              8.002e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0005549 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      610008   168    24    32           18923.17              7.997e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0005519 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      613872   168    24    32           19276.84              8.000e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0005749 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      619920   168    24    32           23018.07              6.900e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0006658 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      620928   168    24    32           22583.46              7.067e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0004815 ...... PASSED
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR03C2C4      623112   168    24    32           29785.19              5.415e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0005175 ...... PASSED
================================================================================

Finished     20 tests with the following results:
             20 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
===============================================================================
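
The Gflops column can be cross-checked from N and Time. The Python sketch below assumes the operation count commonly quoted for HPL (LU factorization plus solve), 2/3 * N^3 + 3/2 * N^2; the N^2 term is negligible at these sizes. The function name is illustrative, and the rows are copied from the output above.

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Estimate the HPL rate in Gflops from problem size and wall time.

    Assumes the operation count 2/3 * N^3 + 3/2 * N^2.
    """
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


if __name__ == "__main__":
    # (N, Time in seconds, Gflops as reported) taken from the output above.
    rows = [
        (600096, 18004.10, 8.002e3),
        (613872, 19276.84, 8.000e3),
        (623112, 29785.19, 5.415e3),
    ]
    for n, seconds, reported in rows:
        print(f"N = {n}: computed {hpl_gflops(n, seconds):.0f}, reported {reported:.0f}")
```

The last row illustrates the drop near the memory limit: N grows by about 1.5% compared with N = 613872, while the wall time grows by more than half.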

LSF report output:

 Date: Sat, 8 Jan 2011 10:58:34 +0100
X-Authentication-Warning: cn52.hpc: leon set sender to lsfadmin using -f
To: leon@cn52.hpc
From: LSF <lsfadmin@hpc.fs.uni-lj.si>
Subject: Job 552: <HPL 2.0> Done

Job <HPL 2.0> was submitted from host <prelog> by user <leon> in cluster <hpcfs>.
Job was executed on host(s) <12*cn52>, in queue <normal>, as user <leon> in cluster <hpcfs>.
                            <12*cn53>
...
                            <12*cn39>
                            <12*cn03>
                            <12*cn02>
</home/leon> was used as the home directory.
</home/leon/hpl/bin/ompiicc> was used as the working directory.
Started at Thu Jan  6 14:02:16 2011
Results reported at Sat Jan  8 10:58:33 2011

Your job looked like:

------------------------------------------------------------
# LSBATCH: User input
#!/bin/sh
#BSUB -a openmpi
#BSUB -n 768
#BSUB -J "HPL 2.0"
#BSUB -N
module ()
{
    eval `/usr/local/Modules/default/bin/modulecmd bash $*`
}
ulimit -l unlimited
module load intel/11.1 intel-mkl/10.2 openmpi/1.4.3
mpirun  xhpl
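
The job requests 768 processes (#BSUB -n 768), and the HPL run uses a 24 x 32 process grid with NB = 168. The Python sketch below checks two things about such a setup: that P * Q matches the number of processes, and that each N is a multiple of NB. The second check is a tuning convention that this run happens to follow, not something HPL is known here to require. The function name is hypothetical.

```python
from typing import Iterable, List


def check_hpl_layout(p: int, q: int, nb: int, sizes: Iterable[int], processes: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if p * q != processes:
        problems.append(f"P*Q = {p * q} does not match {processes} processes")
    for n in sizes:
        if n % nb != 0:
            problems.append(f"N = {n} is not a multiple of NB = {nb}")
    return problems


if __name__ == "__main__":
    # Parameters copied from the HPL output of this run.
    sizes = [1008, 10080, 20160, 30072, 50064, 100128, 150024, 200424,
             300048, 400008, 418488, 500136, 600096, 610008,
             613872, 619920, 620928, 623112]
    issues = check_hpl_layout(p=24, q=32, nb=168, sizes=sizes, processes=768)
    print("OK" if not issues else "\n".join(issues))
```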