HPL (High Performance LINPACK) is the industry-standard benchmark for HPC. It solves a (random) dense linear system in double precision (64-bit) on distributed-memory computers.
HPL is well suited to testing clusters mainly because the test exercises processor performance, the total amount of memory, and the interconnect between the nodes. The program's input parameters are not prescribed in advance and can be tuned to obtain the best possible result, which is reported in Gflops. The theoretical peak performance of the HPCFS cluster is
Rpeak = 2.93 GHz * 768 cores * 4 operations/cycle = 9001 Gflops ≈ 9 Tflops
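The peak figure can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and the value of 4 floating-point operations per cycle per core is taken directly from the formula above.

```python
# Theoretical peak (Rpeak) of the cluster, using the values quoted above.
clock_ghz = 2.93        # core clock in GHz
cores = 768             # total number of cores
flops_per_cycle = 4     # double-precision operations per cycle per core

rpeak_gflops = clock_ghz * cores * flops_per_cycle
print(f"Rpeak = {rpeak_gflops:.0f} Gflops = {rpeak_gflops / 1000:.1f} Tflops")
# prints: Rpeak = 9001 Gflops = 9.0 Tflops
```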
The figure shows the course of the test on the HPCFS cluster with 768 processor cores (blue curve) for different problem sizes. The highest measured performance is Rmax = 8 Tflops out of the theoretical 9 Tflops. More memory, that is, a larger problem size N, gives somewhat better results. Above all, however, higher performance comes from a larger number of processor cores. With fewer cores the HPL performance is proportionally lower, as shown by the red curve, which corresponds to roughly half the performance with 380 processes. The largest feasible problem size is smaller as well. Communication limits all of these factors, so the results level off as the problem size grows. As the problem approaches the limit of available memory, the solution time rises sharply, which indicates that the system has started swapping. In the listing below, the rate drops from 8.000e+03 Gflops at N = 613872 to 6.900e+03 Gflops at N = 619920. The tests showed that enabling SMT and Turbo mode does not slow down classic MPI programs such as HPL.
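To see why the largest problem sizes run into the memory limit, it helps to estimate the size of the coefficient matrix, which HPL stores as N x N double-precision values of 8 bytes each. The sketch below is only an estimate: it ignores HPL's workspace and the MPI buffers, the function name `matrix_gib` is illustrative, and the installed memory of HPCFS is not stated here, so no limit is assumed. The N values are taken from the run listed further down.

```python
def matrix_gib(n: int, processes: int = 1) -> float:
    """Approximate size in GiB of an n x n double-precision matrix,
    divided evenly across the given number of processes."""
    return 8 * n * n / processes / 2**30

for n in (300048, 600096, 623112):
    total = matrix_gib(n)
    per_process = matrix_gib(n, processes=768)
    print(f"N = {n}: {total:8.1f} GiB total, {per_process:.2f} GiB per process")
```

Because the footprint grows with the square of N, the last few percent of increase in N consume a disproportionate share of the remaining memory.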
The best HPL result achieved on the HPCFS cluster is 8 Tflops, which corresponds to a parallel efficiency of 89%. The result is comparable to that of similar clusters.
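The efficiency figure is simply the ratio of the measured rate to the theoretical peak. A minimal sketch, assuming the best rate of 8.002e+03 Gflops reported in the listing for N = 600096 (the variable names are illustrative):

```python
# Parallel efficiency = measured rate (Rmax) / theoretical peak (Rpeak).
rmax_gflops = 8.002e3            # best measured rate in the listing (N = 600096)
rpeak_gflops = 2.93 * 768 * 4    # theoretical peak in Gflops

efficiency = rmax_gflops / rpeak_gflops
print(f"Efficiency = {efficiency:.1%}")
# prints: Efficiency = 88.9%
```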
The following Linpack output was produced on 768 cores with SMT and Turbo enabled, using the Intel 11.1 compiler, MKL 10.2, and OpenMPI 1.4.3:
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================
An explanation of the input/output parameters follows:
T/V : Wall time / encoded variant.
N : The order of the coefficient matrix A.
NB : The partitioning blocking factor.
P : The number of process rows.
Q : The number of process columns.
Time : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N : 1008 10080 20160 30072 50064 100128 150024 200424
300048 200424 300048 400008 418488 500136 600096 610008
613872 619920 620928 623112
NB : 168
PMAP : Row-major process mapping
P : 24
Q : 32
PFACT : Crout
NBMIN : 4
NDIV : 2
RFACT : Crout
BCAST : 2ringM
DEPTH : 0
SWAP : Mix (threshold = 128)
L1 : transposed form
U : transposed form
EQUIL : yes
ALIGN : 8 double precision words
--------------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 2.220446e-16
- Computational tests pass if scaled residuals are less than 16.0
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 1008 168 24 32 0.20 3.493e+00
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0029225 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 10080 168 24 32 1.74 3.917e+02
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0019798 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 20160 168 24 32 2.93 1.867e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0017076 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 30072 168 24 32 6.86 2.642e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0013021 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 50064 168 24 32 21.22 3.942e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0011398 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 100128 168 24 32 109.84 6.093e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0008548 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 150024 168 24 32 319.56 7.044e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0007470 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 200424 168 24 32 727.34 7.379e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0007286 ...... PASSED
================================================================================
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 300048 168 24 32 2328.99 7.732e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0007188 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 200424 168 24 32 728.05 7.372e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0007286 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 300048 168 24 32 2331.25 7.725e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0007188 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 400008 168 24 32 5401.88 7.899e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005789 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 418488 168 24 32 6167.65 7.922e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005077 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 500136 168 24 32 10481.26 7.957e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005143 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 600096 168 24 32 18004.10 8.002e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005549 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 610008 168 24 32 18923.17 7.997e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005519 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 613872 168 24 32 19276.84 8.000e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005749 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 619920 168 24 32 23018.07 6.900e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0006658 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 620928 168 24 32 22583.46 7.067e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0004815 ...... PASSED
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR03C2C4 623112 168 24 32 29785.19 5.415e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005175 ...... PASSED
================================================================================
Finished 20 tests with the following results:
20 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------
End of Tests.
===============================================================================
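The Gflops column in the listing above can be cross-checked from N and the solve time. The sketch below assumes the usual operation count for LU factorization with the solve, 2/3 N^3 + 3/2 N^2, and parses result lines of the form shown above. The helper names `parse_results` and `hpl_gflops` are illustrative and not part of HPL, and only three lines of the listing are used as sample input.

```python
import re

# Matches HPL result lines such as
# "WR03C2C4 600096 168 24 32 18004.10 8.002e+03"
RESULT = re.compile(
    r"^(W\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.eE+-]+)\s*$"
)

def parse_results(text):
    """Return a list of (n, seconds, gflops) tuples found in HPL output."""
    rows = []
    for line in text.splitlines():
        m = RESULT.match(line.strip())
        if m:
            rows.append((int(m.group(2)), float(m.group(6)), float(m.group(7))))
    return rows

def hpl_gflops(n, seconds):
    """Rate in Gflops from the operation count 2/3 N^3 + 3/2 N^2."""
    return (2.0 / 3.0 * n**3 + 1.5 * n**2) / seconds / 1e9

# Three result lines copied from the listing above.
sample = """\
WR03C2C4 600096 168 24 32 18004.10 8.002e+03
WR03C2C4 613872 168 24 32 19276.84 8.000e+03
WR03C2C4 619920 168 24 32 23018.07 6.900e+03
"""

rows = parse_results(sample)
best = max(rows, key=lambda r: r[2])   # row with the highest reported rate
print("best N:", best[0], "at", best[2], "Gflops")
for n, seconds, reported in rows:
    print(n, reported, round(hpl_gflops(n, seconds), 1))
```

The recomputed values agree with the reported ones to the printed precision, and the highest rate is the one at N = 600096.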
LSF report output:
Date: Sat, 8 Jan 2011 10:58:34 +0100
X-Authentication-Warning: cn52.hpc: leon set sender to lsfadmin using -f
To: leon@cn52.hpc
From: LSF <lsfadmin@hpc.fs.uni-lj.si>
Subject: Job 552: <HPL 2.0> Done
Job <HPL 2.0> was submitted from host <prelog> by user <leon> in cluster <hpcfs>.
Job was executed on host(s) <12*cn52>, in queue <normal>, as user <leon> in cluster <hpcfs>.
<12*cn53>
...
<12*cn39>
<12*cn03>
<12*cn02>
</home/leon> was used as the home directory.
</home/leon/hpl/bin/ompiicc> was used as the working directory.
Started at Thu Jan 6 14:02:16 2011
Results reported at Sat Jan 8 10:58:33 2011
Your job looked like:
------------------------------------------------------------
# LSBATCH: User input
#!/bin/sh
#BSUB -a openmpi
#BSUB -n 768
#BSUB -J "HPL 2.0"
#BSUB -N
module ()
{
eval `/usr/local/Modules/default/bin/modulecmd bash $*`
}
ulimit -l unlimited
module load intel/11.1 intel-mkl/10.2 openmpi/1.4.3
mpirun xhpl