mersenneforum.org  

Old 2018-08-11, 08:26   #562
preda
"Mihai Preda"
Apr 2015
5³×11 Posts

Quote:
Originally Posted by SELROC View Post
I just tried out mfakto compilation on debian:
Code:
g++ sieve.o timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o filelocking.o output.o mfakto.o gpusieve.o perftest.o menu.o kbhit.o -m64  -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -L/home/sel/AMDAPPSDK-3.0/lib/x86_64 -L/opt/rocm/opencl/lib/x86_64 -lOpenCL -o ../mfakto
lto1: fatal error: bytecode stream in file ‘sieve.o’ generated with LTO version 5.2 instead of the expected 7.0
compilation terminated.
lto-wrapper: fatal error: g++ returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [Makefile:83: ../mfakto] Error 1
This may indicate that you are trying to link object files (.o) from different compilations: sieve.o was generated with LTO version 5.2, while the current compiler expects 7.0. Try cleaning up by removing all .o files (starting with "sieve.o"), then rebuild.
Old 2018-08-11, 09:30   #563
SELROC
2·5·347 Posts

Quote:
Originally Posted by preda View Post
This may indicate that you are trying to link object files (.o) from different compilations: sieve.o was generated with LTO version 5.2, while the current compiler expects 7.0. Try cleaning up by removing all .o files (starting with "sieve.o"), then rebuild.

Thank you, make clean was needed. Mfakto compiles now, though with some warnings:


Code:
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -funroll-all-loops -funsafe-loop-optimizations -fira-region=all -fsched-spec-load -fsched-stalled-insns=10 -fsched-stalled-insns-dep=10 -fno-align-labels  -c sieve.c -o sieve.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -c timer.c -o timer.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -c parse.c -o parse.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -c read_config.c -o read_config.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -c mfaktc.c -o mfaktc.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -c checkpoint.c -o checkpoint.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -c signal_handler.c -o signal_handler.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -c filelocking.c -o filelocking.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL -c output.c -o output.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL  -c mfakto.cpp -o mfakto.o
mfakto.cpp: In function ‘int init_CL(int, cl_int*)’:
mfakto.cpp:553:83: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
   commandQueue = clCreateCommandQueue(context, devices[*devnumber], props, &status);
                                                                                   ^
In file included from my_types.h:22,
                 from mfakto.h:23,
                 from mfakto.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
mfakto.cpp:553:83: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
   commandQueue = clCreateCommandQueue(context, devices[*devnumber], props, &status);
                                                                                   ^
In file included from my_types.h:22,
                 from mfakto.h:23,
                 from mfakto.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
mfakto.cpp:557:85: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
     commandQueue = clCreateCommandQueue(context, devices[*devnumber], props, &status);
                                                                                     ^
In file included from my_types.h:22,
                 from mfakto.h:23,
                 from mfakto.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
mfakto.cpp:557:85: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
     commandQueue = clCreateCommandQueue(context, devices[*devnumber], props, &status);
                                                                                     ^
In file included from my_types.h:22,
                 from mfakto.h:23,
                 from mfakto.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
mfakto.cpp:571:86: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
   commandQueuePrf = clCreateCommandQueue(context, devices[*devnumber], props, &status);
                                                                                      ^
In file included from my_types.h:22,
                 from mfakto.h:23,
                 from mfakto.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
mfakto.cpp:571:86: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
   commandQueuePrf = clCreateCommandQueue(context, devices[*devnumber], props, &status);
                                                                                      ^
In file included from my_types.h:22,
                 from mfakto.h:23,
                 from mfakto.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
mfakto.cpp: In function ‘int run_mod_kernel(cl_ulong, cl_ulong, cl_ulong, cl_float, cl_ulong*, cl_ulong*)’:
mfakto.cpp:1649:26: warning: ‘cl_int clEnqueueTask(cl_command_queue, cl_kernel, cl_uint, _cl_event* const*, _cl_event**)’ is deprecated [-Wdeprecated-declarations]
                  &mod_evt);
                          ^
In file included from my_types.h:22,
                 from mfakto.h:23,
                 from mfakto.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1373:1: note: declared here
 clEnqueueTask(cl_command_queue  /* command_queue */,
 ^~~~~~~~~~~~~
mfakto.cpp:1649:26: warning: ‘cl_int clEnqueueTask(cl_command_queue, cl_kernel, cl_uint, _cl_event* const*, _cl_event**)’ is deprecated [-Wdeprecated-declarations]
                  &mod_evt);
                          ^
In file included from my_types.h:22,
                 from mfakto.h:23,
                 from mfakto.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1373:1: note: declared here
 clEnqueueTask(cl_command_queue  /* command_queue */,
 ^~~~~~~~~~~~~
mfakto.cpp: In function ‘int run_kernel15(cl_kernel, cl_uint, int75, int, cl_uint8, cl_mem, cl_int, cl_int)’:
mfakto.cpp:1717:5: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
 int run_kernel15(cl_kernel l_kernel, cl_uint exp, int75 k_base, int stream, cl_uint8 b_in, cl_mem res, cl_int shiftcount, cl_int bin_max)
     ^~~~~~~~~~~~
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL  -c gpusieve.cpp -o gpusieve.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL  -c perftest.cpp -o perftest.o
perftest.cpp:1737:18: warning: invalid suffix on literal; C++11 requires a space between literal and string macro [-Wliteral-suffix]
     std::cerr << "\nKernel file \""KERNEL_FILE"\" not found, it needs to be in the same directory as the executable.\n";
                  ^
perftest.cpp: In function ‘GPUKernels test_cpu_tf_kernels(cl_uint)’:
perftest.cpp:1017:10: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘cl_ulong’ {aka ‘long unsigned int’} [-Wformat=]
   printf("exponent=%u, %lldM FCs (sieved: %lldM FCs) each, ",
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     mystuff.exponent, num_fcs >> 20, ((cl_ulong)num_loops*mystuff.threads_per_grid)>>20);
                       ~~~~~~~~~~~~~
perftest.cpp:1017:10: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 4 has type ‘cl_ulong’ {aka ‘long unsigned int’} [-Wformat=]
perftest.cpp:1021:10: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘cl_ulong’ {aka ‘long unsigned int’} [-Wformat=]
   printf("k=%llu, %f GHz-days (assignment), %f GHz-days (per test): ", k, ghzd, ghzdt); fflush(stdout);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~
perftest.cpp: In function ‘GPUKernels test_gpu_tf_kernels(cl_uint)’:
perftest.cpp:1153:10: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘cl_ulong’ {aka ‘long unsigned int’} [-Wformat=]
   printf("exponent=%u, %lldM FCs each, ", mystuff.exponent, num_fcs>>20);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                    ~~~~~~~~~~~
perftest.cpp:1156:10: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘cl_ulong’ {aka ‘long unsigned int’} [-Wformat=]
   printf("k=%llu, %f GHz-days (assignment), %f GHz-days (per test): ", k, ghzd, ghzdt); fflush(stdout);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ~
perftest.cpp: In function ‘void CL_test(cl_int)’:
perftest.cpp:1692:82: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
   commandQueue = clCreateCommandQueue(context, devices[devnumber], props, &status);
                                                                                  ^
In file included from perftest.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
perftest.cpp:1692:82: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
   commandQueue = clCreateCommandQueue(context, devices[devnumber], props, &status);
                                                                                  ^
In file included from perftest.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
perftest.cpp:1696:84: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
     commandQueue = clCreateCommandQueue(context, devices[devnumber], props, &status);
                                                                                    ^
In file included from perftest.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
perftest.cpp:1696:84: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
     commandQueue = clCreateCommandQueue(context, devices[devnumber], props, &status);
                                                                                    ^
In file included from perftest.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
perftest.cpp:1707:85: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
   commandQueuePrf = clCreateCommandQueue(context, devices[devnumber], props, &status);
                                                                                     ^
In file included from perftest.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
perftest.cpp:1707:85: warning: ‘_cl_command_queue* clCreateCommandQueue(cl_context, cl_device_id, cl_command_queue_properties, cl_int*)’ is deprecated [-Wdeprecated-declarations]
   commandQueuePrf = clCreateCommandQueue(context, devices[devnumber], props, &status);
                                                                                     ^
In file included from perftest.cpp:25:
/home/sel/AMDAPPSDK-3.0/include/CL/cl.h:1359:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
perftest.cpp:1767:3: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
   if (mystuff.CompileOptions[0])  // if mfakto.ini defined compile options, override the default with them
   ^~
perftest.cpp:1770:5: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
     printf("Compiling kernels (build options: \"%s\").", program_options);
     ^~~~~~
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL  -c menu.cpp -o menu.o
gcc -m64 -Wall -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/home/sel/AMDAPPSDK-3.0/include -DBUILD_OPENCL  -c kbhit.cpp -o kbhit.o
g++ sieve.o timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o filelocking.o output.o mfakto.o gpusieve.o perftest.o menu.o kbhit.o -m64  -O3 -funroll-loops  -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -L/home/sel/AMDAPPSDK-3.0/lib/x86_64 -L/opt/rocm/opencl/lib/x86_64 -lOpenCL -o ../mfakto
mfakto.cpp: In function ‘tf_class_opencl.constprop’:
mfakto.cpp:2696:35: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
           status = run_gs_kernel15(kernel_info[use_kernel].kernel, numblocks, shared_mem_required, k_base, b_in, shiftcount);
                                   ^
mfakto.cpp: In function ‘run_gs_kernel15’:
mfakto.cpp:2107:5: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
 int run_gs_kernel15(cl_kernel kernel, cl_uint numblocks, cl_uint shared_mem_required, int75 k_base, cl_uint8 b_in, cl_uint shiftcount)
     ^
read_config.c: In function ‘my_read_string’:
read_config.c:124:9: warning: ‘strncpy’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
         strncpy(string, buf + idx + 1, found);
         ^
read_config.c:120:30: note: length computed here
       found = (unsigned int) strlen(buf + idx + 1);
                              ^

Note the following line:


mfakto.cpp:2107:5: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
Old 2018-08-11, 13:27   #564
preda
"Mihai Preda"
Apr 2015
2537₈ Posts

Quote:
Originally Posted by SELROC View Post
mfakto.cpp:2107:5: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
Try it. If it passes the self-test, it's good to go.
Old 2018-08-11, 19:34   #565
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1011111001011₂ Posts
Transform selection in V3.5 OpenOwL

Code:
C:\msys64\home\ken\v35test>openowl-v35-457601f-w64 -h
gpuowl-OpenCL 3.5-457601f

Command line options:

-user <name>       : specify the user name.
-cpu  <name>       : specify the hardware name.
-time              : display kernel profiling information.
-fft <size>        : specify FFT size, such as: 5000K, 4M, +2, -1.
-block 100|200|400 : select PRP-check block size. Smaller block is slower but detects errors earlier.
-carry long|short  : force carry type. Short carry may be faster, but requires high bits/word.
-list fft          : display a list of available FFT configurations.
-device <N>        : select a specific device:
 0 : Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
 1 : gfx804-8x1203-@3:0.0 Radeon 550 Series
 2 : Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz-12x2394-@0:0.0
Code:
C:\msys64\home\ken\v35test>openowl-v35-457601f-w64 -list fft
gpuowl-OpenCL 3.5-457601f
   FFT  maxExp    W    H M
  0.5M   10.3M  512  512 1
  1.0M   20.3M 1024  512 1
  1.0M   20.3M  512 1024 1
  2.0M   39.8M 1024 1024 1
  2.0M   39.8M  512 2048 1
  2.0M   39.8M 2048  512 1
  2.5M   49.4M  512  512 5
  4.0M   78.0M 1024 2048 1
  4.0M   78.0M 2048 1024 1
  4.0M   78.0M 4096  512 1
  4.5M   87.5M  512  512 9
  5.0M   96.9M 1024  512 5
  5.0M   96.9M  512 1024 5
  8.0M  153.0M 2048 2048 1
  8.0M  153.0M 4096 1024 1
  9.0M  171.6M 1024  512 9
  9.0M  171.6M  512 1024 9
 10.0M  190.0M 1024 1024 5
 10.0M  190.0M  512 2048 5
 10.0M  190.0M 2048  512 5
 16.0M  300.0M 4096 2048 1
 18.0M  336.3M 1024 1024 9
 18.0M  336.3M  512 2048 9
 18.0M  336.3M 2048  512 9
 20.0M  372.5M 1024 2048 5
 20.0M  372.5M 2048 1024 5
 20.0M  372.5M 4096  512 5
 36.0M  659.0M 1024 2048 9
 36.0M  659.0M 2048 1024 9
 36.0M  659.0M 4096  512 9
 40.0M  730.0M 2048 2048 5
 40.0M  730.0M 4096 1024 5
 72.0M 1290.9M 2048 2048 9
 72.0M 1290.9M 4096 1024 9
 80.0M 1429.8M 4096 2048 5
144.0M 2527.5M 4096 2048 9

FFT 4096K: Width 1024 (256x4), Height 2048 (256x8); 18.48 bits/word
Note: using short carry kernels
 ...
But how does a user specify selection among transforms of the same size? (Most lengths above have more than one flavor.)

Presumably the program selects the minimum adequate length for speed. How does the program select one flavor of a given length versus another flavor?

Are there any known speed differences, or other differences, between the flavors?
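As an aid to reading the list above, here is a small sketch of how the columns appear to relate. It assumes that the FFT size equals 2 × W × H × M, which matches every row printed by -list fft but is inferred from the table only, not taken from the program's source. The function names are made up for the illustration.

```python
# Sketch: relate the columns of the "-list fft" output.
# Assumption (inferred from the table, not from the program's source):
# FFT size in words = 2 * W * H * M.

MEG = 1024 * 1024

def fft_words(w: int, h: int, m: int) -> int:
    """Number of words in the transform, assuming size = 2*W*H*M."""
    return 2 * w * h * m

def bits_per_word(exponent: float, w: int, h: int, m: int) -> float:
    """Average number of exponent bits carried by each FFT word."""
    return exponent / fft_words(w, h, m)

# A few rows copied from the list: (FFT label, maxExp, W, H, M)
rows = [
    ("0.5M", 10.3e6, 512, 512, 1),
    ("2.5M", 49.4e6, 512, 512, 5),
    ("4.0M", 78.0e6, 1024, 2048, 1),
    ("18.0M", 336.3e6, 1024, 1024, 9),
]

for label, max_exp, w, h, m in rows:
    size = fft_words(w, h, m)
    print(f"{label:>6}: {size / MEG:5.1f}M words, "
          f"{bits_per_word(max_exp, w, h, m):.2f} bits/word at maxExp")
```

Under that assumption, the maxExp column corresponds to a bits/word ceiling that shrinks as the transform grows, and the "18.48 bits/word" line printed for the 4096K transform sits below the ceiling for that length.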

Last fiddled with by kriesel on 2018-08-11 at 19:38
Old 2018-08-11, 21:53   #566
preda
"Mihai Preda"
Apr 2015
55F₁₆ Posts

Quote:
Originally Posted by kriesel View Post
Code:
 18.0M  336.3M 1024 1024 9
 18.0M  336.3M  512 2048 9
 18.0M  336.3M 2048  512 9
But how does a user specify selection among transforms of the same size? (Most lengths above have more than one flavor.)
As you say, by default the first variant of the size that fits is selected. The other ones can be specified using "-fft +1", "-fft +2", "-fft -1" etc. on the command line, which move through the list by the given increment (not ideal, I know).

I would be surprised if a larger FFT were ever faster than a smaller one. OTOH, among the same-size variants there are speed differences, and the first (the default) is not necessarily the fastest. The user should try them and choose the best.
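To make the rule concrete, here is a hypothetical model of it: pick the first listed configuration whose maxExp covers the exponent, then treat "-fft +N" or "-fft -N" as a move of N rows from that default. This is only a reading of the description above, not gpuowl's actual code; the table is a slice of the -list fft output and the function names are invented.

```python
# Hypothetical model of the selection rule described above; not gpuowl's code.
# Rows copied from the "-list fft" output: (FFT label, maxExp, W, H, M).
FFT_CONFIGS = [
    ("16.0M", 300.0e6, 4096, 2048, 1),
    ("18.0M", 336.3e6, 1024, 1024, 9),
    ("18.0M", 336.3e6, 512, 2048, 9),
    ("18.0M", 336.3e6, 2048, 512, 9),
    ("20.0M", 372.5e6, 1024, 2048, 5),
    ("20.0M", 372.5e6, 2048, 1024, 5),
]

def default_index(exponent, configs=FFT_CONFIGS):
    """Index of the first listed configuration whose maxExp covers the exponent."""
    for i, (_label, max_exp, *_dims) in enumerate(configs):
        if exponent <= max_exp:
            return i
    raise ValueError("exponent too large for the listed configurations")

def select_fft(exponent, delta=0, configs=FFT_CONFIGS):
    """Model '-fft +N' / '-fft -N' as a move of N rows from the default row."""
    i = default_index(exponent, configs) + delta
    if not 0 <= i < len(configs):
        raise IndexError("offset moves outside the list")
    return configs[i]

print(select_fft(332_000_000))           # default row for this exponent
print(select_fft(332_000_000, delta=2))  # what '-fft +2' would pick in this model
```

Note that in this model a negative offset can land on a smaller transform whose maxExp is below the exponent, so the caller has to check that the chosen row is still adequate.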
Old 2018-08-12, 07:44   #567
SELROC
21510₈ Posts

Quote:
Originally Posted by preda View Post
Try it. If it passes the self-test, it's good to go.



Mfakto passed the self test. Doing performance testing now.
Old 2018-08-12, 08:59   #568
SELROC
2²×13×29 Posts

Quote:
Originally Posted by SELROC View Post
Mfakto passed the self test. Doing performance testing now.

I also got some exponents to test, but the results.txt file does not seem to be in a format that can be submitted to PrimeNet...
Old 2018-08-12, 14:32   #569
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
6,091 Posts
samples of mfakto output

Quote:
Originally Posted by SELROC View Post
I also got some exponents to test, but the results.txt file does not seem to be in a format that can be submitted to PrimeNet...
Sample mfakto output:
Code:
[Sun Feb 25 22:55:24 2018]
UID: kriesel/condorella-rx550, no factor for M942477799 from 2^70 to 2^71 [mfakto 0.15pre6-Win cl_barrett15_73_gs_2]
Code:
[Fri Apr 13 11:13:22 2018]
UID: kriesel/condorella-rx550, M290001377 has a factor: 96303240212210144213599 [TF:76:77:mfakto 0.15pre6-Win cl_barrett15_82_gs_2]
[Fri Apr 13 13:53:29 2018]
UID: kriesel/condorella-rx550, found 1 factor for M290001377 from 2^76 to 2^77 [mfakto 0.15pre6-Win cl_barrett15_82_gs_2]
Code:
[Thu Apr 26 12:42:53 2018]
UID: kriesel/condorella-rx550, M457739 has a factor: 3300797548509019438097 [TF:71:72:mfakto 0.15pre6-Win cl_barrett15_73_gs_2]
[Sat Apr 28 22:07:45 2018]
UID: kriesel/condorella-rx550, M457739 has a factor: 2465216072071389650639 [TF:71:72:mfakto 0.15pre6-Win cl_barrett15_73_gs_2]
[Mon Apr 30 00:18:37 2018]
UID: kriesel/condorella-rx550, found 2 factors for M457739 from 2^71 to 2^72 [mfakto 0.15pre6-Win cl_barrett15_73_gs_2]
What does your output look like?
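A reported factor can be checked independently of mfakto. The sketch below relies on the standard fact that f divides 2^p − 1 exactly when 2^p ≡ 1 (mod f); the helper names are made up for the illustration, and the bit-level helper simply mirrors the "from 2^n to 2^(n+1)" wording in the result lines.

```python
def is_factor(p: int, f: int) -> bool:
    """True if f divides the Mersenne number 2**p - 1."""
    return pow(2, p, f) == 1

def bit_level(f: int) -> int:
    """n such that 2**n <= f < 2**(n+1)."""
    return f.bit_length() - 1

# Small known case: M11 = 2047 = 23 * 89
print(is_factor(11, 23), is_factor(11, 89), is_factor(11, 3))

# Factor reported above for M290001377 (TF from 2^76 to 2^77)
p, f = 290001377, 96303240212210144213599
print(is_factor(p, f), bit_level(f))
```

The three-argument pow keeps the check fast even for exponents of this size, since it never forms 2^p itself.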

Last fiddled with by kriesel on 2018-08-12 at 14:33
Old 2018-08-12, 15:04   #570
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
6,091 Posts
RX480 timings in V3.3 and V3.5 OpenOwL

See the attachment. With rare exceptions, the default FFT lengths are the fastest in V3.5. V3.5 is faster than V3.3 in every corresponding test, by a slight margin that varies with FFT length. All tests were performed on the same system (Win 7 x64), with the same patch state, driver version and GPU. Initial iterations are faster than later ones. Most tests ran for 160k iterations or more; the timings ignore the first 800 to 10,000 iterations and average the rest.

In a nutshell: use the default FFT lengths, except at 10M and 18M, where you should use -fft +2 on the command line or in a batch file.
It's possible some further gains could be found by testing carry choices.

A couple of V3.5 test exponents produced errors, indicating the maximum exponent guidance may be set a bit too high.
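The averaging described above, discarding an initial stretch of iterations and taking the mean of the rest, can be sketched as follows. The function name and the sample numbers are hypothetical; they are not the measurements from the attachment.

```python
from statistics import mean

def steady_state_ms(per_iter_ms, warmup):
    """Mean time per iteration after discarding the first `warmup` samples."""
    if warmup >= len(per_iter_ms):
        raise ValueError("warm-up covers the whole run")
    return mean(per_iter_ms[warmup:])

# Made-up timings in ms/iteration: the first samples are faster, as noted above.
samples = [3.9, 3.9, 4.0, 4.2, 4.2, 4.2, 4.2]
print(steady_state_ms(samples, warmup=3))
```

Comparing two -fft variants then comes down to comparing this one number for each run, provided the same warm-up cut is used for both.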
Attached Files
File Type: pdf openowl v33 and v35 timings.pdf (20.1 KB, 129 views)
Old 2018-08-12, 15:10   #571
SELROC
7664₁₀ Posts

Quote:
Originally Posted by kriesel View Post
What does your output look like?

It lacks the UID:


Code:
[Sun Aug 12 10:20:01 2018]
no factor for M218863159 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:22:37 2018]
no factor for M218863279 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:25:13 2018]
no factor for M218863283 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:27:48 2018]
no factor for M218863367 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:30:23 2018]
no factor for M218863409 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:32:59 2018]
no factor for M218863501 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:35:34 2018]
no factor for M218863529 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:38:11 2018]
no factor for M218863531 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:40:44 2018]
no factor for M218863649 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
[Sun Aug 12 10:43:19 2018]
no factor for M218863663 from 2^70 to 2^71 [mfakto 0.15pre6 cl_barrett15_73_gs_2]
Old 2018-08-12, 15:42   #572
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
13713₈ Posts

Quote:
Originally Posted by SELROC View Post
it lacks the UID...


UID is optional; modify the ini file and restart to include that. You'll still need to log in before pasting the results into the manual results submission form.
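To see at a glance which lines of a results.txt carry a UID, a small parser can help. This is a sketch that covers only the "no factor" and "found N factor(s)" line shapes shown in this thread; it does not reflect the rules of the PrimeNet submission form, and the names are made up for the illustration.

```python
import re

# Matches only the summary line shapes quoted in this thread.
RESULT_RE = re.compile(
    r"^(?:UID: (?P<uid>\S+), )?"
    r"(?:no factor|found (?P<count>\d+) factors?) for "
    r"M(?P<exponent>\d+) from 2\^(?P<lo>\d+) to 2\^(?P<hi>\d+) "
    r"\[(?P<program>[^\]]+)\]$"
)

def parse_result(line):
    """Return a dict for a recognized summary line, or None otherwise."""
    m = RESULT_RE.match(line.strip())
    if m is None:
        return None
    return {
        "uid": m.group("uid"),  # None when the UID prefix is absent
        "exponent": int(m.group("exponent")),
        "bits": (int(m.group("lo")), int(m.group("hi"))),
        "factors_found": int(m.group("count") or 0),
        "program": m.group("program"),
    }

with_uid = ("UID: kriesel/condorella-rx550, no factor for M942477799 "
            "from 2^70 to 2^71 [mfakto 0.15pre6-Win cl_barrett15_73_gs_2]")
without_uid = ("no factor for M218863159 from 2^70 to 2^71 "
               "[mfakto 0.15pre6 cl_barrett15_73_gs_2]")

for line in (with_uid, without_uid):
    print(parse_result(line))
```

Timestamp lines and "has a factor:" lines are deliberately left unmatched and come back as None.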
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
