2019-02-22, 12:03   #10
ATH
Einyen
 
Dec 2003
Denmark

2^2·13·61 Posts

Compiled it on the usual c5d.9xlarge with 18 cores and 36 threads:
gcc -c -O3 -march=skylake-avx512 -DUSE_AVX512 -DUSE_THREADS ../src/*.c >& build.log
grep -i error build.log
[Assuming above grep comes up empty]
gcc -o Mlucas *.o -lm -lpthread -lrt
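
The two steps above could be scripted so that the link only runs when the log is clean. This is just a sketch: `build_ok` and `build_mlucas` are names I made up, and the compile and link lines are the ones above, with the redirect written in POSIX form.

```shell
# Hypothetical helper: succeed only if the log exists and has no "error" lines.
build_ok() {
    [ -f "$1" ] && ! grep -qi error "$1"
}

# Hypothetical wrapper around the compile and link commands from this post.
build_mlucas() {
    gcc -c -O3 -march=skylake-avx512 -DUSE_AVX512 -DUSE_THREADS ../src/*.c > build.log 2>&1
    if build_ok build.log; then
        gcc -o Mlucas *.o -lm -lpthread -lrt
    else
        echo "errors found, see build.log" >&2
        return 1
    fi
}
```

Run `build_mlucas` from the build directory. Note that the case-insensitive grep also matches a warning that merely mentions the word "error", so a failed check may be a false alarm.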


-DCARRY_16_WAY is not needed in v18, right?

This time using all 18 cores was fastest for some reason.

Code:
18.0
./Mlucas -fftlen 4608 -iters 10000 -nthread 36
      4608  msec/iter =    3.24  ROE[avg,max] = [0.246743758, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -nthread 34
      4608  msec/iter =    3.18  ROE[avg,max] = [0.246743758, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -nthread 32
      4608  msec/iter =    3.15  ROE[avg,max] = [0.246743758, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -nthread 30
      4608  msec/iter =    3.07  ROE[avg,max] = [0.246740330, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -nthread 28
      4608  msec/iter =    3.03  ROE[avg,max] = [0.246740330, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -nthread 26
      4608  msec/iter =    3.08  ROE[avg,max] = [0.246740330, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -cpu 0:17
      4608  msec/iter =    2.96  ROE[avg,max] = [0.246740330, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -cpu 0:16
      4608  msec/iter =    3.12  ROE[avg,max] = [0.246740330, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -cpu 0:15
      4608  msec/iter =    3.09  ROE[avg,max] = [0.246740330, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -cpu 0:14
      4608  msec/iter =    4.05  ROE[avg,max] = [0.246727988, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -cpu 0:13
      4608  msec/iter =    4.18  ROE[avg,max] = [0.246727988, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -cpu 18:35
      4608  msec/iter =    3.00  ROE[avg,max] = [0.246740330, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107

./Mlucas -fftlen 4608 -iters 10000 -cpu 0:34:2
      4608  msec/iter =    4.27  ROE[avg,max] = [0.246740330, 0.312500000]  radices = 144 16 32 32  0  0  0  0  0  0	10000-iteration Res mod 2^64, 2^35-1, 2^36-1 = 13BB5C9DDF0CD3D6, 15982066709, 51703797107
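
To pick the fastest run out of a listing like this without reading it by eye, something along these lines should work. It is a sketch: `fastest_run` is a made-up name, and it assumes the log alternates `./Mlucas ...` command lines with result lines containing `msec/iter = <time>`, as above.

```shell
# Hypothetical helper: print the lowest msec/iter and the command that produced it.
fastest_run() {
    awk '
        /^\.\/Mlucas/ { cmd = $0 }              # remember the most recent command line
        /msec\/iter/ {
            for (i = 1; i < NF - 1; i++)
                if ($i == "msec/iter") t = $(i + 2) + 0   # value follows the "=" field
            if (!seen || t < best) { best = t; bestcmd = cmd; seen = 1 }
        }
        END { if (seen) printf "%.2f msec/iter: %s\n", best, bestcmd }
    ' "$1"
}
```

With the listing saved to a file (say `bench.log`, a name picked for illustration), `fastest_run bench.log` reports the `-cpu 0:17` run at 2.96 msec/iter.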

From the README.html: should this be -cpu 0:n-1?

Quote:
Hyperthreaded x86 CPUs: If Intel, use -cpu 0:n, where n is the number of physical cores on your system
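
To make the off-by-one concrete, here is a small sketch that counts how many logical CPUs a `lo:hi[:stride]` spec would select. It ASSUMES both endpoints are inclusive; that is what the timings above suggest (`-cpu 0:17` behaving like all 18 cores), but it is not confirmed here. `cpu_count` is a made-up name, not part of Mlucas.

```shell
# Hypothetical helper: count the core indices in a lo:hi[:stride] spec,
# assuming an inclusive range (an assumption, not confirmed Mlucas behaviour).
cpu_count() {
    echo "$1" | awk -F: '{
        lo = $1 + 0
        hi = (NF >= 2 ? $2 : $1) + 0
        step = (NF >= 3 ? $3 : 1) + 0
        if (step < 1) step = 1              # guard against a zero or negative stride
        n = 0
        for (c = lo; c <= hi; c += step) n++
        print n
    }'
}
```

Under that assumption `cpu_count 0:17` gives 18, while `cpu_count 0:18` (the README wording with n = 18) gives 19, one more than the number of physical cores.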

Last fiddled with by ATH on 2019-02-22 at 12:07