Has anyone had any luck running mfakto on the Mali GPUs found in ARM devices, or am I barking up the wrong tree? After removing the x86-specific flag -m64 from the Makefile and pointing to the right include and lib directories, mfakto compiled but failed to compile the OpenCL kernels at runtime:
Compilation:
[code]
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -funroll-all-loops -funsafe-loop-optimizations -fira-region=all -fsched-spec-load -fsched-stalled-insns=10 -fsched-stalled-insns-dep=10 -fno-align-labels -c sieve.c -o sieve.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c timer.c -o timer.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c parse.c -o parse.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c read_config.c -o read_config.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c mfaktc.c -o mfaktc.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c checkpoint.c -o checkpoint.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c signal_handler.c -o signal_handler.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c filelocking.c -o filelocking.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c output.c -o output.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c mfakto.cpp -o mfakto.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c gpusieve.cpp -o gpusieve.o
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c perftest.cpp -o perftest.o
perftest.cpp:1737:18: warning: invalid suffix on literal; C++11 requires a space between literal and string macro [-Wliteral-suffix]
  std::cerr << "\nKernel file \""KERNEL_FILE"\" not found, it needs to be in the same directory as the executable.\n";
  ^
perftest.cpp: In function ‘GPUKernels test_cpu_tf_kernels(cl_uint)’:
perftest.cpp:1018:88: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘cl_ulong {aka long unsigned int}’ [-Wformat=]
  mystuff.exponent, num_fcs >> 20, ((cl_ulong)num_loops*mystuff.threads_per_grid)>>20);
  ~~~~~~~~~~~~~ ^
perftest.cpp:1018:88: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 4 has type ‘cl_ulong {aka long unsigned int}’ [-Wformat=]
perftest.cpp:1021:86: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘cl_ulong {aka long unsigned int}’ [-Wformat=]
  printf("k=%llu, %f GHz-days (assignment), %f GHz-days (per test): ", k, ghzd, ghzdt); fflush(stdout);
  ^
perftest.cpp: In function ‘GPUKernels test_gpu_tf_kernels(cl_uint)’:
perftest.cpp:1153:72: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘cl_ulong {aka long unsigned int}’ [-Wformat=]
  printf("exponent=%u, %lldM FCs each, ", mystuff.exponent, num_fcs>>20);
  ~~~~~~~~~~~^
perftest.cpp:1156:86: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘cl_ulong {aka long unsigned int}’ [-Wformat=]
  printf("k=%llu, %f GHz-days (assignment), %f GHz-days (per test): ", k, ghzd, ghzdt); fflush(stdout);
  ^
perftest.cpp: In function ‘void CL_test(cl_int)’:
perftest.cpp:1767:3: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
  if (mystuff.CompileOptions[0]) // if mfakto.ini defined compile options, override the default with them
  ^~
perftest.cpp:1770:5: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
  printf("Compiling kernels (build options: \"%s\").", program_options);
  ^~~~~~
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c menu.cpp -o menu.o
menu.cpp: In function ‘void handle_menu(mystuff_t*)’:
menu.cpp:204:10: warning: ignoring return value of ‘char* fgets(char*, int, FILE*)’, declared with attribute warn_unused_result [-Wunused-result]
  fgets(choice_string, 9, stdin); // std:cin does not allow empty input
  ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
gcc -Wall -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -I/usr/rk3399-libs/include -DBUILD_OPENCL -c kbhit.cpp -o kbhit.o
kbhit.cpp: In member function ‘int keyboard::getch()’:
kbhit.cpp:55:16: warning: ignoring return value of ‘ssize_t read(int, void*, size_t)’, declared with attribute warn_unused_result [-Wunused-result]
  } else read(0,&ch,1);
  ~~~~^~~~~~~~~
g++ sieve.o timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o filelocking.o output.o mfakto.o gpusieve.o perftest.o menu.o kbhit.o -O3 -funroll-loops -ffast-math -finline-functions -frerun-loop-opt -fgcse-sm -fgcse-las -flto -L/usr/rk3399-libs/lib64 -lOpenCL -o ../mfakto
[/code]
Runtime:
[code]
mfakto 0.15pre6 (64bit build)
Runtime options
  Inifile                   mfakto.ini
  Verbosity                 1
  SieveOnGPU                yes
  MoreClasses               yes
  GPUSievePrimes            81157
  GPUSieveProcessSize       24Ki bits
  GPUSieveSize              96Mi bits
  FlushInterval             0
  WorkFile                  worktodo.txt
  ResultsFile               results.txt
  Checkpoints               enabled
  CheckpointDelay           300s
  Stages                    enabled
  StopAfterFactor           class
  PrintMode                 compact
  V5UserID                  none
  ComputerID                none
  TimeStampInResults        yes
  VectorSize                2
  GPUType                   AUTO
  SmallExp                  no
  UseBinfile                mfakto_Kernels.elf
Compiletime options
Select device - Get device info:
WARNING: Unknown GPU name, assuming GCN. Please post the device name "Mali-T860 (ARM)" to http://www.mersenneforum.org/showthread.php?t=15646 to have it added to mfakto. Set GPUType in mfakto.ini to select a GPU type yourself to avoid this warning.
OpenCL device info
  name                      Mali-T860 (ARM)
  device (driver) version   OpenCL 1.2 v1.r13p0-00rel0-git(a4271c9).31ba04af2d3c01618138bef3aed66c2c (1.2)
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 4 (256 compute elements)
  clock rate                200MHz
Automatic parameters
  threads per grid          0
optimizing kernels for GCN
Compiling kernels.
BUILD OUTPUT
In file included from <source>:84:
./barrett15.cl:45:41: error: Ternary operator argument types do not match
  tmp.d0 = (tmp.d4 > a.d4) ? a.d0 : tmp.d0;
  ~~~~^~
./barrett15.cl:46:41: error: Ternary operator argument types do not match
  tmp.d1 = (tmp.d4 > a.d4) ? a.d1 : tmp.d1;
  ~~~~^~
./barrett15.cl:47:41: error: Ternary operator argument types do not match
  tmp.d2 = (tmp.d4 > a.d4) ? a.d2 : tmp.d2;
  ~~~~^~
./barrett15.cl:48:41: error: Ternary operator argument types do not match
  tmp.d3 = (tmp.d4 > a.d4) ? a.d3 : tmp.d3;
  ~~~~^~
./barrett15.cl:49:41: error: Ternary operator argument types do not match
  tmp.d4 = (tmp.d4 > a.d4) ? a.d4 : tmp.d4; // & 0x7FFF not necessary as tmp.d4 is <= a.d4
  ~~~~^~
./barrett15.cl:1993:41: error: Ternary operator argument types do not match
  tmp.d0 = (tmp.d5 > a.d5) ? a.d0 : tmp.d0;
  ~~~~^~
./barrett15.cl:1994:41: error: Ternary operator argument types do not match
  tmp.d1 = (tmp.d5 > a.d5) ? a.d1 : tmp.d1;
  ~~~~^~
./barrett15.cl:1995:41: error: Ternary operator argument types do not match
  tmp.d2 = (tmp.d5 > a.d5) ? a.d2 : tmp.d2;
  ~~~~^~
./barrett15.cl:1996:41: error: Ternary operator argument types do not match
  tmp.d3 = (tmp.d5 > a.d5) ? a.d3 : tmp.d3;
  ~~~~^~
./barrett15.cl:1997:41: error: Ternary operator argument types do not match
  tmp.d4 = (tmp.d5 > a.d5) ? a.d4 : tmp.d4;
  ~~~~^~
./barrett15.cl:1998:41: error: Ternary operator argument types do not match
  tmp.d5 = (tmp.d5 > a.d5) ? a.d5 : tmp.d5; // & 0x7FFF not necessary as tmp.d5 is <= a.d5
  ~~~~^~
error: Compiler frontend failed (error code 59)
END OF BUILD OUTPUT
Error -11 (Build program failure): clBuildProgram
ERROR: load_kernels(0) failed
[/code]
The environment can compile and successfully run a Hello World OpenCL program which uses a kernel. All I've tried so far is manually setting every GPUType in mfakto.ini. It errors at every instance of a ternary operator. I tried replacing one with an if statement, but I'm fumbling as I know zero about OpenCL kernels, and of course it didn't work. Any thoughts? |
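One direction that might be worth trying for the ternary errors, offered as a sketch and not as a confirmed fix: OpenCL C has a built-in `select(a, b, c)` which, for vector operands, returns `b` in each lane where the most significant bit of `c` is set and `a` otherwise. A line such as `tmp.d0 = (tmp.d4 > a.d4) ? a.d0 : tmp.d0;` could therefore in principle be written as `tmp.d0 = select(tmp.d0, a.d0, tmp.d4 > a.d4);`. Whether the Mali compiler accepts that depends on the operand types declared in barrett15.cl, which are not shown in this thread. The plain C below only emulates the lane-wise semantics so the intended behaviour can be checked on a CPU; `greater_than`, `select_lanes` and the array layout are hypothetical and not taken from mfakto.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lane-wise x > y. A true lane becomes all ones, which is how OpenCL
 * vector comparisons report "true"; a false lane becomes zero. */
void greater_than(const uint32_t *x, const uint32_t *y,
                  uint32_t *mask, size_t lanes)
{
    assert(x != NULL && y != NULL && mask != NULL);
    for (size_t i = 0; i < lanes; i++)
        mask[i] = (x[i] > y[i]) ? UINT32_MAX : 0u;
}

/* Emulates the vector form of select(a, b, mask):
 * take b[i] where the top bit of mask[i] is set, otherwise a[i]. */
void select_lanes(const uint32_t *a, const uint32_t *b,
                  const uint32_t *mask, uint32_t *out, size_t lanes)
{
    assert(a != NULL && b != NULL && mask != NULL && out != NULL);
    for (size_t i = 0; i < lanes; i++)
        out[i] = (mask[i] & 0x80000000u) ? b[i] : a[i];
}
```

With two lanes (matching `VectorSize 2` in the log), calling `greater_than(tmp_d4, a_d4, mask, 2)` followed by `select_lanes(tmp_d0, a_d0, mask, out, 2)` mirrors the first failing line: each lane of `out` gets `a_d0` only where `tmp_d4` exceeded `a_d4`.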
Hi,
I am playing a little with mfakto and now got this message with my AMD A8 7600. I posted the verbose output, since I didn't know if you would need it.
[code]
Select device - Get device info:
Device 1/1: Spectre (Advanced Micro Devices, Inc.), device version: OpenCL 1.2 AMD-APP (2639.3), driver version: 2639.3
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event
Global memory:2139906048, Global memory cache: 16384, local memory: 32768, workgroup size: 256, Work dimensions: 3[1024, 1024, 1024, 0, 0] , Max clock speed:720, compute units:6
WARNING: Unknown GPU name, assuming GCN. Please post the device name "Spectre (Advanced Micro Devices, Inc.)" to http://www.mersenneforum.org/showthread.php?t=15646 to have it added to mfakto. Set GPUType in mfakto.ini to select a GPU type yourself to avoid this warning.
OpenCL device info
  name                      Spectre (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 1.2 AMD-APP (2639.3) (2639.3)
  maximum threads per block 1024
  maximum threads per grid  1073741824
  number of multiprocessors 6 (384 compute elements)
  clock rate                720MHz
[/code] |
[QUOTE=cbug;506732]Hi,
I am playing a little with mfakto and now got this message with my AMD A8 7600. I posted the verbose output, since I didn't know if you would need it.
...[/QUOTE]
If the performance is low, run this command:
[CODE]./mfakto -d 2 --perftest 1[/CODE]
It will adjust the parameters for your GPU. |
I might be getting access to a Mac Pro soon. Will mfakto work on a Windows virtual machine?
|
[QUOTE=ixfd64;511437]I might be getting access to a Mac Pro soon. Will mfakto work on a Windows virtual machine?[/QUOTE]
Probably not optimally. But why don't you give it a go, and determine empirically? :smile: |
I just got the computer and installed mfakto on a Windows virtual machine. However, mfakto doesn't run and immediately crashes with a generic error code. I did some research and found some sources saying macOS doesn't support GPU passthrough. It seems that is still the case. :\
Currently I'm trying to compile mfakto myself on the Mac Pro. I saw a few posts from elsewhere saying that macOS doesn't need the AMD APP SDK as it contains a native OpenCL implementation, but the mfakto makefile specifically references the SDK directories. I copied the SDK from a Linux system but got a bunch of syntax errors. Do I even need the SDK to build mfakto? I did see a post from David (airsquirrels) saying that he managed to build mfakto on macOS after modifying the OpenCL kernels, but he hasn't been on this forum in a while. |
[QUOTE=M344587487;506054]Anyone had any luck running mfakto on the Mali GPUs found in ARM devices or am I barking up the wrong tree? After removing x86-specific flag -m64 from the Makefile and pointing to the right include and lib directories mfakto compiled but failed to compile OpenCL kernels at runtime:
...
[code]
Select device - Get device info:
WARNING: Unknown GPU name, assuming GCN. Please post the device name "Mali-T860 (ARM)" to [URL]http://www.mersenneforum.org/showthread.php?t=15646[/URL] to have it added to mfakto. Set GPUType in mfakto.ini to select a GPU type yourself to avoid this warning.
OpenCL device info
  name                      Mali-T860 (ARM)
  device (driver) version   OpenCL 1.2 v1.r13p0-00rel0-git(a4271c9).31ba04af2d3c01618138bef3aed66c2c (1.2)
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 4 (256 compute elements)
  clock rate                200MHz
Automatic parameters
  threads per grid          0
optimizing kernels for GCN
END OF BUILD OUTPUT
Error -11 (Build program failure): clBuildProgram
ERROR: load_kernels(0) failed
[/code]
The environment can compile and successfully run a Hello World OpenCL program which uses a kernel. All I've tried so far is manually setting every GPUType in mfakto.ini. It errors at every instance of a ternary operator. I tried replacing one with an if statement, but I'm fumbling as I know zero about OpenCL kernels, and of course it didn't work. Any thoughts?[/QUOTE]
I think it likely you'll need to do something similar to what BDot did to add the Intel GPU type, which allows mfakto to run on Intel iGPUs. Perhaps BDot could chime in with some guidance.
Consider independently confirming that OpenCL is present and working on the device and what its capabilities are, and testing your build process with some available sample programs. Perhaps there's something here you can use: [URL="https://www.cnx-software.com/2018/05/13/how-to-get-started-with-opencl-on-odroid-xu4-board-with-arm-mali-t628mp6-gpu/"]https://www.cnx-software.com/2018/05/13/how-to-get-started-with-opencl-on-odroid-xu4-board-with-arm-mali-t628mp6-gpu/[/URL]
The mfakto header info is encouraging: it shows that OpenCL is there and that mfakto is reading a bit about the Mali device. I think you may be one of the first explorers in this new territory, as far as GIMPS goes. Good luck! |
Do these seem like reasonable TF --perftest results for a Radeon VII? Just used --perftest on a single instance, don't know if that's how it should be done or how to compare them to the GPUs here: [URL]https://www.mersenne.ca/mfaktc.php?show=642[/URL]
[code]
WARNING: Unknown GPU name, assuming GCN. Please post the device name "gfx906 (Advanced Micro Devices, Inc.)" to http://www.mersenneforum.org/showthread.php?t=15646 to have it added to mfakto. Set GPUType in mfakto.ini to select a GPU type yourself to avoid this warning.

5. GPU tf kernels

exponent=2000093 ... calibrating
exponent=2000093, 24575M FCs each, k=73783545359978, 29.889569 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 1893.28 ms ==> 13611.22M FCs/s ==> 2292.67 GHz-days/day
  cl_barrett32_77_gs [64-77]: 1925.87 ms ==> 13380.87M FCs/s ==> 2253.87 GHz-days/day
  cl_barrett15_69_gs [60-69]: 2082.08 ms ==> 12376.98M FCs/s ==> 2084.77 GHz-days/day
  cl_barrett15_70_gs [60-69]: 2084.00 ms ==> 12365.52M FCs/s ==> 2082.85 GHz-days/day
  cl_barrett32_87_gs [65-87]: 2195.95 ms ==> 11735.12M FCs/s ==> 1976.66 GHz-days/day
  cl_barrett32_79_gs [64-79]: 2224.91 ms ==> 11582.40M FCs/s ==> 1950.94 GHz-days/day
  cl_barrett32_88_gs [65-88]: 2230.25 ms ==> 11554.69M FCs/s ==> 1946.27 GHz-days/day
  cl_barrett15_71_gs [60-70]: 2418.11 ms ==> 10656.98M FCs/s ==> 1795.06 GHz-days/day
  cl_barrett15_73_gs [60-73]: 2508.72 ms ==> 10272.09M FCs/s ==> 1730.23 GHz-days/day
  cl_barrett32_92_gs [65-92]: 2528.90 ms ==> 10190.14M FCs/s ==> 1716.42 GHz-days/day
  cl_barrett15_74_gs [60-74]: 2553.34 ms ==> 10092.60M FCs/s ==> 1700.00 GHz-days/day
  cl_barrett15_82_gs [60-81]: 2810.59 ms ==> 9168.81M FCs/s ==> 1544.39 GHz-days/day
  cl_barrett15_83_gs [60-82]: 3277.90 ms ==> 7861.68M FCs/s ==> 1324.22 GHz-days/day
  cl_barrett15_88_gs [60-87]: 3304.08 ms ==> 7799.38M FCs/s ==> 1313.73 GHz-days/day
Resulting speed for M2000093:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  2084.775  cl_barrett15_69_gs
  64 - 76  2292.670  cl_barrett32_76_gs
  76 - 77  2253.871  cl_barrett32_77_gs
  77 - 87  1976.661  cl_barrett32_87_gs
  87 - 88  1946.269  cl_barrett32_88_gs
  88 - 92  1716.425  cl_barrett32_92_gs

exponent=39000037 ... calibrating
exponent=39000037, 24575M FCs each, k=3783943912403, 1.532868 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 2383.45 ms ==> 10811.99M FCs/s ==> 1821.17 GHz-days/day
  cl_barrett32_77_gs [64-77]: 2418.97 ms ==> 10653.20M FCs/s ==> 1794.42 GHz-days/day
  cl_barrett15_69_gs [60-69]: 2648.55 ms ==> 9729.77M FCs/s ==> 1638.88 GHz-days/day
  cl_barrett15_70_gs [60-69]: 2649.91 ms ==> 9724.80M FCs/s ==> 1638.04 GHz-days/day
  cl_barrett32_87_gs [65-87]: 2789.28 ms ==> 9238.87M FCs/s ==> 1556.19 GHz-days/day
  cl_barrett32_79_gs [64-79]: 2832.18 ms ==> 9098.94M FCs/s ==> 1532.62 GHz-days/day
  cl_barrett32_88_gs [65-88]: 2844.32 ms ==> 9060.11M FCs/s ==> 1526.08 GHz-days/day
  cl_barrett15_71_gs [60-70]: 3122.32 ms ==> 8253.41M FCs/s ==> 1390.20 GHz-days/day
  cl_barrett15_73_gs [60-73]: 3237.86 ms ==> 7958.90M FCs/s ==> 1340.60 GHz-days/day
  cl_barrett32_92_gs [65-92]: 3242.44 ms ==> 7947.66M FCs/s ==> 1338.70 GHz-days/day
  cl_barrett15_74_gs [60-74]: 3298.67 ms ==> 7812.17M FCs/s ==> 1315.88 GHz-days/day
  cl_barrett15_82_gs [60-81]: 3606.80 ms ==> 7144.78M FCs/s ==> 1203.46 GHz-days/day
  cl_barrett15_83_gs [60-82]: 4262.84 ms ==> 6045.22M FCs/s ==> 1018.26 GHz-days/day
  cl_barrett15_88_gs [60-87]: 4287.53 ms ==> 6010.40M FCs/s ==> 1012.39 GHz-days/day
Resulting speed for M39000037:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  1638.880  cl_barrett15_69_gs
  64 - 76  1821.170  cl_barrett32_76_gs
  76 - 77  1794.423  cl_barrett32_77_gs
  77 - 87  1556.193  cl_barrett32_87_gs
  87 - 88  1526.083  cl_barrett32_88_gs
  88 - 92  1338.702  cl_barrett32_92_gs

exponent=66362159 ... calibrating
exponent=66362159, 24575M FCs each, k=2223766598517, 0.900843 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 2383.68 ms ==> 10810.92M FCs/s ==> 1820.99 GHz-days/day
  cl_barrett32_77_gs [64-77]: 2419.53 ms ==> 10650.73M FCs/s ==> 1794.01 GHz-days/day
  cl_barrett15_69_gs [60-69]: 2649.23 ms ==> 9727.28M FCs/s ==> 1638.46 GHz-days/day
  cl_barrett15_70_gs [60-69]: 2650.28 ms ==> 9723.44M FCs/s ==> 1637.81 GHz-days/day
  cl_barrett32_87_gs [65-87]: 2789.70 ms ==> 9237.50M FCs/s ==> 1555.96 GHz-days/day
  cl_barrett32_79_gs [64-79]: 2832.34 ms ==> 9098.41M FCs/s ==> 1532.53 GHz-days/day
  cl_barrett32_88_gs [65-88]: 2844.57 ms ==> 9059.31M FCs/s ==> 1525.95 GHz-days/day
  cl_barrett15_71_gs [60-70]: 3122.16 ms ==> 8253.83M FCs/s ==> 1390.27 GHz-days/day
  cl_barrett15_73_gs [60-73]: 3238.22 ms ==> 7958.02M FCs/s ==> 1340.45 GHz-days/day
  cl_barrett32_92_gs [65-92]: 3242.91 ms ==> 7946.52M FCs/s ==> 1338.51 GHz-days/day
  cl_barrett15_74_gs [60-74]: 3298.96 ms ==> 7811.48M FCs/s ==> 1315.76 GHz-days/day
  cl_barrett15_82_gs [60-81]: 3606.81 ms ==> 7144.77M FCs/s ==> 1203.46 GHz-days/day
  cl_barrett15_83_gs [60-82]: 4263.03 ms ==> 6044.94M FCs/s ==> 1018.21 GHz-days/day
  cl_barrett15_88_gs [60-87]: 4287.52 ms ==> 6010.43M FCs/s ==> 1012.39 GHz-days/day
Resulting speed for M66362159:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  1638.461  cl_barrett15_69_gs
  64 - 76  1820.989  cl_barrett32_76_gs
  76 - 77  1794.006  cl_barrett32_77_gs
  77 - 87  1555.962  cl_barrett32_87_gs
  87 - 88  1525.948  cl_barrett32_88_gs
  88 - 92  1338.509  cl_barrett32_92_gs

exponent=74000077 ... calibrating
exponent=74000077, 24575M FCs each, k=1994240527475, 0.807863 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 2475.67 ms ==> 10409.22M FCs/s ==> 1753.33 GHz-days/day
  cl_barrett32_77_gs [64-77]: 2507.11 ms ==> 10278.71M FCs/s ==> 1731.34 GHz-days/day
  cl_barrett15_69_gs [60-69]: 2743.33 ms ==> 9393.64M FCs/s ==> 1582.26 GHz-days/day
  cl_barrett15_70_gs [60-69]: 2744.54 ms ==> 9389.47M FCs/s ==> 1581.56 GHz-days/day
  cl_barrett32_87_gs [65-87]: 2902.57 ms ==> 8878.27M FCs/s ==> 1495.45 GHz-days/day
  cl_barrett32_79_gs [64-79]: 2947.52 ms ==> 8742.89M FCs/s ==> 1472.65 GHz-days/day
  cl_barrett32_88_gs [65-88]: 2951.63 ms ==> 8730.70M FCs/s ==> 1470.60 GHz-days/day
  cl_barrett15_71_gs [60-70]: 3223.58 ms ==> 7994.16M FCs/s ==> 1346.53 GHz-days/day
  cl_barrett15_73_gs [60-73]: 3365.31 ms ==> 7657.49M FCs/s ==> 1289.83 GHz-days/day
  cl_barrett32_92_gs [65-92]: 3379.26 ms ==> 7625.87M FCs/s ==> 1284.50 GHz-days/day
  cl_barrett15_74_gs [60-74]: 3428.84 ms ==> 7515.61M FCs/s ==> 1265.93 GHz-days/day
  cl_barrett15_82_gs [60-81]: 3742.41 ms ==> 6885.89M FCs/s ==> 1159.86 GHz-days/day
  cl_barrett15_83_gs [60-82]: 4411.98 ms ==> 5840.86M FCs/s ==> 983.83 GHz-days/day
  cl_barrett15_88_gs [60-87]: 4461.24 ms ==> 5776.37M FCs/s ==> 972.97 GHz-days/day
Resulting speed for M74000077:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  1582.262  cl_barrett15_69_gs
  64 - 76  1753.327  cl_barrett32_76_gs
  76 - 77  1731.343  cl_barrett32_77_gs
  77 - 87  1495.453  cl_barrett32_87_gs
  87 - 88  1470.596  cl_barrett32_88_gs
  88 - 92  1284.500  cl_barrett32_92_gs

exponent=78000071 ... calibrating
exponent=78000071, 24575M FCs each, k=1891972028970, 0.766434 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 2480.96 ms ==> 10387.04M FCs/s ==> 1749.59 GHz-days/day
  cl_barrett32_77_gs [64-77]: 2520.91 ms ==> 10222.41M FCs/s ==> 1721.86 GHz-days/day
  cl_barrett15_69_gs [60-69]: 2763.75 ms ==> 9324.21M FCs/s ==> 1570.57 GHz-days/day
  cl_barrett15_70_gs [60-69]: 2764.86 ms ==> 9320.49M FCs/s ==> 1569.94 GHz-days/day
  cl_barrett32_87_gs [65-87]: 2909.00 ms ==> 8858.66M FCs/s ==> 1492.15 GHz-days/day
  cl_barrett32_79_gs [64-79]: 2953.57 ms ==> 8724.97M FCs/s ==> 1469.63 GHz-days/day
  cl_barrett32_88_gs [65-88]: 2968.85 ms ==> 8680.05M FCs/s ==> 1462.07 GHz-days/day
  cl_barrett15_71_gs [60-70]: 3267.20 ms ==> 7887.43M FCs/s ==> 1328.56 GHz-days/day
  cl_barrett32_92_gs [65-92]: 3385.53 ms ==> 7611.74M FCs/s ==> 1282.12 GHz-days/day
  cl_barrett15_73_gs [60-73]: 3386.59 ms ==> 7609.36M FCs/s ==> 1281.72 GHz-days/day
  cl_barrett15_74_gs [60-74]: 3449.81 ms ==> 7469.92M FCs/s ==> 1258.23 GHz-days/day
  cl_barrett15_82_gs [60-81]: 3768.71 ms ==> 6837.83M FCs/s ==> 1151.76 GHz-days/day
  cl_barrett15_83_gs [60-82]: 4465.07 ms ==> 5771.43M FCs/s ==> 972.14 GHz-days/day
  cl_barrett15_88_gs [60-87]: 4487.44 ms ==> 5742.66M FCs/s ==> 967.29 GHz-days/day
Resulting speed for M78000071:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  1570.567  cl_barrett15_69_gs
  64 - 76  1749.590  cl_barrett32_76_gs
  76 - 77  1721.861  cl_barrett32_77_gs
  77 - 87  1492.150  cl_barrett32_87_gs
  87 - 88  1462.066  cl_barrett32_88_gs
  88 - 92  1282.119  cl_barrett32_92_gs

exponent=332900047 ... calibrating
exponent=332900047, 24575M FCs each, k=443298082771, 0.179579 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 2679.23 ms ==> 9618.36M FCs/s ==> 1620.11 GHz-days/day
  cl_barrett32_77_gs [64-77]: 2723.72 ms ==> 9461.25M FCs/s ==> 1593.65 GHz-days/day
  cl_barrett15_69_gs [60-69]: 2993.98 ms ==> 8607.20M FCs/s ==> 1449.79 GHz-days/day
  cl_barrett15_70_gs [60-69]: 2995.49 ms ==> 8602.87M FCs/s ==> 1449.07 GHz-days/day
  cl_barrett32_87_gs [65-87]: 3147.90 ms ==> 8186.34M FCs/s ==> 1378.91 GHz-days/day
  cl_barrett32_79_gs [64-79]: 3196.20 ms ==> 8062.63M FCs/s ==> 1358.07 GHz-days/day
  cl_barrett32_88_gs [65-88]: 3216.82 ms ==> 8010.95M FCs/s ==> 1349.36 GHz-days/day
  cl_barrett15_71_gs [60-70]: 3558.08 ms ==> 7242.61M FCs/s ==> 1219.94 GHz-days/day
  cl_barrett32_92_gs [65-92]: 3672.30 ms ==> 7017.35M FCs/s ==> 1182.00 GHz-days/day
  cl_barrett15_73_gs [60-73]: 3683.84 ms ==> 6995.36M FCs/s ==> 1178.30 GHz-days/day
  cl_barrett15_74_gs [60-74]: 3753.13 ms ==> 6866.22M FCs/s ==> 1156.54 GHz-days/day
  cl_barrett15_82_gs [60-81]: 4093.19 ms ==> 6295.78M FCs/s ==> 1060.46 GHz-days/day
  cl_barrett15_83_gs [60-82]: 4869.88 ms ==> 5291.67M FCs/s ==> 891.33 GHz-days/day
  cl_barrett15_88_gs [60-87]: 4887.28 ms ==> 5272.83M FCs/s ==> 888.15 GHz-days/day
Resulting speed for M332900047:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  1449.794  cl_barrett15_69_gs
  64 - 76  1620.114  cl_barrett32_76_gs
  76 - 77  1593.650  cl_barrett32_77_gs
  77 - 87  1378.906  cl_barrett32_87_gs
  87 - 88  1349.362  cl_barrett32_88_gs
  88 - 92  1182.000  cl_barrett32_92_gs

exponent=999900079 ... calibrating
exponent=999900079, 24575M FCs each, k=147588699800, 0.059788 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 2768.51 ms ==> 9308.17M FCs/s ==> 1567.87 GHz-days/day
  cl_barrett32_77_gs [64-77]: 2804.79 ms ==> 9187.79M FCs/s ==> 1547.59 GHz-days/day
  cl_barrett15_69_gs [60-69]: 3079.74 ms ==> 8367.54M FCs/s ==> 1409.43 GHz-days/day
  cl_barrett15_70_gs [60-69]: 3080.84 ms ==> 8364.54M FCs/s ==> 1408.92 GHz-days/day
  cl_barrett32_87_gs [65-87]: 3259.31 ms ==> 7906.51M FCs/s ==> 1331.77 GHz-days/day
  cl_barrett32_79_gs [64-79]: 3307.28 ms ==> 7791.84M FCs/s ==> 1312.46 GHz-days/day
  cl_barrett32_88_gs [65-88]: 3315.69 ms ==> 7772.08M FCs/s ==> 1309.13 GHz-days/day
  cl_barrett15_71_gs [60-70]: 3639.54 ms ==> 7080.51M FCs/s ==> 1192.64 GHz-days/day
  cl_barrett15_73_gs [60-73]: 3801.08 ms ==> 6779.60M FCs/s ==> 1141.95 GHz-days/day
  cl_barrett32_92_gs [65-92]: 3806.62 ms ==> 6769.74M FCs/s ==> 1140.29 GHz-days/day
  cl_barrett15_74_gs [60-74]: 3873.38 ms ==> 6653.06M FCs/s ==> 1120.64 GHz-days/day
  cl_barrett15_82_gs [60-81]: 4216.73 ms ==> 6111.32M FCs/s ==> 1029.39 GHz-days/day
  cl_barrett15_83_gs [60-82]: 4995.54 ms ==> 5158.56M FCs/s ==> 868.91 GHz-days/day
  cl_barrett15_88_gs [60-87]: 5051.22 ms ==> 5101.70M FCs/s ==> 859.33 GHz-days/day
Resulting speed for M999900079:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  1409.426  cl_barrett15_69_gs
  64 - 76  1567.866  cl_barrett32_76_gs
  76 - 77  1547.589  cl_barrett32_77_gs
  77 - 87  1331.771  cl_barrett32_87_gs
  87 - 88  1309.127  cl_barrett32_88_gs
  88 - 92  1140.294  cl_barrett32_92_gs

exponent=2001862367 ... calibrating
exponent=2001862367, 24575M FCs each, k=73718331001, 0.029863 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 2879.20 ms ==> 8950.32M FCs/s ==> 1507.59 GHz-days/day
  cl_barrett32_77_gs [64-77]: 2934.62 ms ==> 8781.31M FCs/s ==> 1479.12 GHz-days/day
  cl_barrett15_69_gs [60-69]: 3235.08 ms ==> 7965.74M FCs/s ==> 1341.75 GHz-days/day
  cl_barrett15_70_gs [60-69]: 3237.09 ms ==> 7960.79M FCs/s ==> 1340.91 GHz-days/day
  cl_barrett32_87_gs [65-87]: 3391.29 ms ==> 7598.82M FCs/s ==> 1279.94 GHz-days/day
  cl_barrett32_79_gs [64-79]: 3441.93 ms ==> 7487.02M FCs/s ==> 1261.11 GHz-days/day
  cl_barrett32_88_gs [65-88]: 3474.05 ms ==> 7417.80M FCs/s ==> 1249.45 GHz-days/day
  cl_barrett15_71_gs [60-70]: 3871.34 ms ==> 6656.57M FCs/s ==> 1121.23 GHz-days/day
  cl_barrett32_92_gs [65-92]: 3961.01 ms ==> 6505.87M FCs/s ==> 1095.85 GHz-days/day
  cl_barrett15_73_gs [60-73]: 3991.74 ms ==> 6455.79M FCs/s ==> 1087.41 GHz-days/day
  cl_barrett15_74_gs [60-74]: 4067.63 ms ==> 6335.34M FCs/s ==> 1067.12 GHz-days/day
  cl_barrett15_82_gs [60-81]: 4430.71 ms ==> 5816.18M FCs/s ==> 979.68 GHz-days/day
  cl_barrett15_88_gs [60-87]: 5301.90 ms ==> 4860.48M FCs/s ==> 818.70 GHz-days/day
  cl_barrett15_83_gs [60-82]: 5302.55 ms ==> 4859.89M FCs/s ==> 818.60 GHz-days/day
Resulting speed for M2001862367:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  1341.748  cl_barrett15_69_gs
  64 - 76  1507.590  cl_barrett32_76_gs
  76 - 77  1479.122  cl_barrett32_77_gs
  77 - 87  1279.943  cl_barrett32_87_gs
  87 - 88  1249.452  cl_barrett32_88_gs
  88 - 92  1095.847  cl_barrett32_92_gs

exponent=4201971233 ... calibrating
exponent=4201971233, 24575M FCs each, k=35120172035, 0.014227 GHz-days (assignment), 0.050239 GHz-days (per test): ..............
  cl_barrett32_76_gs [64-76]: 2961.94 ms ==> 8700.30M FCs/s ==> 1465.48 GHz-days/day
  cl_barrett32_77_gs [64-77]: 3001.67 ms ==> 8585.16M FCs/s ==> 1446.08 GHz-days/day
  cl_barrett15_69_gs [60-69]: 3300.24 ms ==> 7808.48M FCs/s ==> 1315.26 GHz-days/day
  cl_barrett15_70_gs [60-69]: 3301.97 ms ==> 7804.37M FCs/s ==> 1314.57 GHz-days/day
  cl_barrett32_87_gs [65-87]: 3495.23 ms ==> 7372.85M FCs/s ==> 1241.88 GHz-days/day
  cl_barrett32_79_gs [64-79]: 3548.02 ms ==> 7263.14M FCs/s ==> 1223.40 GHz-days/day
  cl_barrett32_88_gs [65-88]: 3555.39 ms ==> 7248.09M FCs/s ==> 1220.87 GHz-days/day
  cl_barrett15_71_gs [60-70]: 3909.34 ms ==> 6591.86M FCs/s ==> 1110.33 GHz-days/day
  cl_barrett32_92_gs [65-92]: 4087.63 ms ==> 6304.34M FCs/s ==> 1061.90 GHz-days/day
  cl_barrett15_73_gs [60-73]: 4088.04 ms ==> 6303.70M FCs/s ==> 1061.79 GHz-days/day
  cl_barrett15_74_gs [60-74]: 4167.08 ms ==> 6184.14M FCs/s ==> 1041.65 GHz-days/day
  cl_barrett15_82_gs [60-81]: 4528.72 ms ==> 5690.31M FCs/s ==> 958.47 GHz-days/day
  cl_barrett15_83_gs [60-82]: 5375.41 ms ==> 4794.02M FCs/s ==> 807.50 GHz-days/day
  cl_barrett15_88_gs [60-87]: 5438.51 ms ==> 4738.39M FCs/s ==> 798.13 GHz-days/day
Resulting speed for M4201971233:
bit_min - bit_max  GHz-days/day  kernelname
  60 - 64  1315.258  cl_barrett15_69_gs
  64 - 76  1465.477  cl_barrett32_76_gs
  76 - 77  1446.082  cl_barrett32_77_gs
  77 - 87  1241.881  cl_barrett32_87_gs
  87 - 88  1220.867  cl_barrett32_88_gs
  88 - 92  1061.902  cl_barrett32_92_gs
[/code] |
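For anyone wanting to sanity-check or compare numbers like these, the GHz-days/day column appears to follow directly from two values already in the log: the credit for one timed test (0.050239 GHz-days here) and the elapsed time in milliseconds. That relationship is inferred from the figures above, not taken from mfakto's source, and the function names below are hypothetical. A minimal sketch:

```c
#include <assert.h>

/* Milliseconds in one day, used to scale a single timed test up to a day. */
#define MS_PER_DAY 86400000.0

/* Throughput in GHz-days/day, given the GHz-days credit of one timed test
 * and how long that test took in milliseconds. */
double ghz_days_per_day(double ghz_days_per_test, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);
    return ghz_days_per_test * MS_PER_DAY / elapsed_ms;
}

/* Ratio of two throughputs, e.g. this card against a figure from a
 * published benchmark table (the reference value is supplied by the caller). */
double relative_speed(double mine, double reference)
{
    assert(reference > 0.0);
    return mine / reference;
}
```

Feeding in the first line of the log, `ghz_days_per_day(0.050239, 1893.28)`, gives roughly 2292.66, which agrees with the 2292.67 GHz-days/day printed for `cl_barrett32_76_gs` to within the rounding of the inputs. `relative_speed` can then be used against whichever table entry is being compared.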
1 Attachment(s)
I got past the "unknown argument" errors by telling [c]make[/c] to use GCC instead of Clang. However, I'm getting a [I]ton[/I] of errors. Any idea how to resolve this?
I'm trying to compile on a Mac Pro. |
1 Attachment(s)
Some progress: I was able to compile mfakto for macOS after making an OS-specific makefile and adding macros to detect macOS systems. However, mfakto crashes with an error:
[CODE]
OpenCL device info
  name                      AMD Radeon HD - FirePro D700 Compute Engine (AMD)
  device (driver) version   OpenCL 1.2 (1.2 (Jun 29 2018 18:33:51))
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 32 (2048 compute elements)
  clock rate                150MHz
Automatic parameters
  threads per grid          0
optimizing kernels for GCN
Compiling kernels (build options: "-I. -DVECTOR_SIZE=2 -DGCN -O3 -DMORE_CLASSES -DCL_GPU_SIEVE").
BUILD OUTPUT
END OF BUILD OUTPUT
Error -43 (Invalid build options): clBuildProgram
ERROR: load_kernels(0) failed
[/CODE]
Also, the program does not correctly detect the clock speed, which is supposed to be 850 MHz. Any ideas?
[B]Update:[/B] I think I got the program to run. Hell yeah! It turns out the [c]-O3[/c] flag isn't supported in this environment either. Disabling it resolved the [c]clBuildProgram[/c] error. mfakto still shows the wrong clock rate, but this doesn't seem to affect performance.
At any rate, attached is my macOS build. Please test it and let me know if it works. If there are no issues, I'll post the build instructions. |
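The fix described in the update amounts to leaving `-O3` out of the option string that the log shows being handed to `clBuildProgram`. The compiler warnings earlier in the thread quote a source comment suggesting mfakto.ini can override the default compile options, though the key name is not shown here. As a rough sketch of the same idea done in code (this is not mfakto's implementation, and `strip_build_option` is a hypothetical helper), one could remove the flag from the string before building, or after a first attempt fails with error -43:

```c
#include <assert.h>
#include <string.h>

/* Removes every whole-word occurrence of `flag` from the space-separated
 * option string `opts`, editing it in place.
 * Returns the number of occurrences removed. */
int strip_build_option(char *opts, const char *flag)
{
    assert(opts != NULL && flag != NULL);
    size_t flen = strlen(flag);
    int removed = 0;
    char *p = opts;

    if (flen == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        int starts_word = (p == opts) || (p[-1] == ' ');
        int ends_word = (p[flen] == '\0') || (p[flen] == ' ');
        if (starts_word && ends_word) {
            char *rest = p + flen;
            while (*rest == ' ')
                rest++;                         /* swallow the following spaces */
            memmove(p, rest, strlen(rest) + 1); /* shift the tail left, with NUL */
            removed++;
        } else {
            p += flen;                          /* part of a longer token: keep it */
        }
    }

    /* Drop a trailing space left behind when the flag was the last option. */
    size_t n = strlen(opts);
    while (n > 0 && opts[n - 1] == ' ')
        opts[--n] = '\0';

    return removed;
}
```

Applied to the option string from the log with the flag `"-O3"`, this leaves `-I. -DVECTOR_SIZE=2 -DGCN -DMORE_CLASSES -DCL_GPU_SIEVE`. Editing in place keeps the helper allocation-free; the caller must pass a writable buffer, not a string literal.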
[QUOTE=ixfd64;511785]
It turns out the [c]-O3[/c] flag isn't supported in this environment either. Disabling it resolved the [c]clBuildProgram[/c] error. mfakto still shows the wrong clock rate, but this doesn't seem to affect performance.[/QUOTE]
No Mac here to test with. The good news is that -O3 would only affect the CPU side, a small fraction of the overall performance, since mfaktx is primarily a GPU application. |