#1222

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5419₁₀ Posts
I don't know what they all are, or which if any the end users shouldn't mess with, but I found these in the gpuowl.cl file:
Code:
OLD_ISBIG
ORIG_SQ
ORIG_X2
INLINE_X2
FMA_X2
NEWEST_FFT8
NEW_FFT8
NEWEST_FFT5
NEW_FFT5
OLD_FFT5
NEWEST_FFT10
NEW_FFT10
OLD_FFT10
ALT_RESTRICT
ORIG_PAIRSQ
ORIG_PAIRMUL
TEST_KERNEL
MIDDLE_MUL_LOOP
WIDTH
SMALL_HEIGHT
MIDDLE
NH
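For orientation: gpuowl's `-use` option takes a comma-separated list of names like these, and logs later in this thread show `-use ORIG_X2` appearing in the OpenCL compiler arguments as `-DORIG_X2=1`. A minimal Python sketch of that mapping follows. `defines_to_cl_args` is a hypothetical helper for illustration, not gpuowl's actual code, and the `NAME=1` form is an assumption taken from those logs.

```python
def defines_to_cl_args(use_arg: str) -> str:
    """Turn a comma-separated -use list into OpenCL -D options.

    Assumption: each name becomes -DNAME=1, as seen in the logs
    (-use ORIG_X2 shows up as -DORIG_X2=1).
    """
    names = [n.strip() for n in use_arg.split(",") if n.strip()]
    return " ".join(f"-D{name}=1" for name in names)


print(defines_to_cl_args("NEW_FFT8,OLD_FFT5,NEW_FFT10"))
# -DNEW_FFT8=1 -DOLD_FFT5=1 -DNEW_FFT10=1
```

Names such as WIDTH, SMALL_HEIGHT and MIDDLE carry numeric values in the logs, so they probably should not be passed this way.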

#1223

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts
Code:
gpuowl v6.5-61-g5c0db85
Command line options:
-dir <folder> : specify work directory (containing worktodo.txt, results.txt, config.txt, gpuowl.log)
-user <name> : specify the user name.
-cpu <name> : specify the hardware name.
-time : display kernel profiling information.
-fft <size> : specify FFT size, such as: 5000K, 4M, +2, -1.
-block <value> : PRP GEC block size. Default 1000. Smaller block is slower but detects errors sooner.
-log <step> : log every <step> iterations, default 20000. Multiple of 10000.
-carry long|short : force carry type. Short carry may be faster, but requires high bits/word.
-B1 : P-1 B1 bound, default 500000
-B2 : P-1 B2 bound, default B1 * 30
-rB2 : ratio of B2 to B1. Default 30, used only if B2 is not explicitly set
-prp <exponent> : run a single PRP test and exit, ignoring worktodo.txt
-pm1 <exponent> : run a single P-1 test and exit, ignoring worktodo.txt
-results <file> : name of results file, default 'results.txt'
-iters <N> : run next PRP test for <N> iterations and exit. Multiple of 10000.
-use NEW_FFT8,OLD_FFT5,NEW_FFT10: comma separated list of defines, see the #if tests in gpuowl.cl (used for perf tuning).
-device <N> : select a specific device:
0 : Intel(R) UHD Graphics 630-24x1100-
1 : Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz-12x2200-
2 : GeForce GTX 1050 Ti-6x1620-
Code:
2019-05-30 13:17:51 config: -device 0
2019-05-30 13:17:51 85469147 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 18.11 bits/word
2019-05-30 13:17:51 using short carry kernels
2019-05-30 13:18:42 OpenCL compilation in 50608 ms, with "-DEXP=85469147u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-30 13:18:44 85469147.owl not found, starting from the beginning.
2019-05-30 13:25:50 85469147 EE 2000 0.00%; 95.53 ms/sq; ETA 94d 11:54; 91e7259a0ae0534b (check 96.17s)
2019-05-30 13:25:50 85469147.owl not found, starting from the beginning.
2019-05-30 13:32:39 85469147 EE 2000 0.00%; 156.09 ms/sq; ETA 154d 09:38; 91e7259a0ae0534b (check 96.44s)
Code:
>gpuowl-win -device 1 -fft +1 -carry short
2019-05-30 15:03:16 gpuowl v6.5-c48d46f
2019-05-30 15:03:16 Note: no config.txt file found
2019-05-30 15:03:16 config: -device 1 -fft +1 -carry short
2019-05-30 15:03:16 85469147 FFT 4608K: Width 64x4, Height 256x4, Middle 9; 18.11 bits/word
2019-05-30 15:03:16 using short carry kernels
2019-05-30 15:03:18 OpenCL compilation error -11 (args -DEXP=85469147u -DWIDTH=256u -DSMALL_HEIGHT=1024u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0)
2019-05-30 15:03:18 Compilation started
Compilation done
Linking started
Linking done
Device build started
Failed to build device program
Error: unimplemented function(s) used: _Z18work_group_barrierj12memory_scope is undefined
CompilerException Failed to parse IR
2019-05-30 15:03:18 Exception 9gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:220 build
2019-05-30 15:03:18 Bye
Code:
>gpuowl-win -device 1 -fft +1 -carry short -use ORIG_X2
2019-05-30 15:15:53 gpuowl v6.5-61-g5c0db85
2019-05-30 15:15:53 Note: no config.txt file found
2019-05-30 15:15:53 config: -device 1 -fft +1 -carry short -use ORIG_X2
2019-05-30 15:15:53 85469147 FFT 4608K: Width 64x4, Height 256x4, Middle 9; 18.11 bits/word
2019-05-30 15:15:53 using short carry kernels
2019-05-30 15:15:53 OpenCL args "-DEXP=85469147u -DWIDTH=256u -DSMALL_HEIGHT=1024u -DMIDDLE=9u -DFRAC=2089525580236878279ul -DWEIGHT_STEP=0xe.cab3fdd2379b8p-3 -DIWEIGHT_STEP=0x8.a747b4917f72p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DINVWEIGHT_LIMIT=0xe.38e38e38e38ep-29 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-30 15:15:54 OpenCL compilation error -11 (args -DEXP=85469147u -DWIDTH=256u -DSMALL_HEIGHT=1024u -DMIDDLE=9u -DFRAC=2089525580236878279ul -DWEIGHT_STEP=0xe.cab3fdd2379b8p-3 -DIWEIGHT_STEP=0x8.a747b4917f72p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DINVWEIGHT_LIMIT=0xe.38e38e38e38ep-29 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0)
2019-05-30 15:15:54 Compilation started
Compilation done
Linking started
Linking done
Device build started
Failed to build device program
Error: unimplemented function(s) used: _Z18work_group_barrierj12memory_scope is undefined
CompilerException Failed to parse IR
2019-05-30 15:15:54 Exception 9gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:215 build
2019-05-30 15:15:54 Bye
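The numbers in these log lines can be cross-checked. Judging only from the logs in this thread (not from the gpuowl source), the FFT size in words appears to equal WIDTH × SMALL_HEIGHT × MIDDLE × 2, and bits/word is the exponent divided by that size. A small Python sketch, with `fft_words` and `bits_per_word` as hypothetical helper names:

```python
def fft_words(width: int, small_height: int, middle: int) -> int:
    # Assumption inferred from the logs: the factor of 2 makes
    # 1024 * 256 * 9 * 2 equal the reported 4608K.
    return width * small_height * middle * 2


def bits_per_word(exponent: int, words: int) -> float:
    return exponent / words


words = fft_words(1024, 256, 9)
print(words // 1024, "K")                        # 4608 K
print(round(bits_per_word(85469147, words), 2))  # 18.11
```

Swapping WIDTH and SMALL_HEIGHT (256 and 1024, as with `-fft +1`) gives the same 4608K size, which matches the logs.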

#1224

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5419₁₀ Posts
gpuowl attempts on the i7-8750H's UHD630 IGP (OpenCL device 0) were unsuccessful in various ways:
Code:
>gpuowl-win-c48d46f -device 0 -fft +0 -carry short
2019-05-30 13:17:51 Note: no config.txt file found
2019-05-30 13:17:51 config: -device 0
2019-05-30 13:17:51 85469147 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 18.11 bits/word
2019-05-30 13:17:51 using short carry kernels
2019-05-30 13:18:42 OpenCL compilation in 50608 ms, with "-DEXP=85469147u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-30 13:18:44 85469147.owl not found, starting from the beginning.
2019-05-30 13:25:50 85469147 EE 2000 0.00%; 95.53 ms/sq; ETA 94d 11:54; 91e7259a0ae0534b (check 96.17s)
2019-05-30 13:25:50 85469147.owl not found, starting from the beginning.
2019-05-30 13:32:39 85469147 EE 2000 0.00%; 156.09 ms/sq; ETA 154d 09:38; 91e7259a0ae0534b (check 96.44s)
Code:
>gpuowl-win-c48d46f -device 0 -fft +1 -carry short
2019-05-30 17:47:08 gpuowl v6.5-c48d46f
2019-05-30 17:47:08 Note: no config.txt file found
2019-05-30 17:47:08 config: -device 0 -fft +1 -carry short
2019-05-30 17:47:08 85469147 FFT 4608K: Width 64x4, Height 256x4, Middle 9; 18.11 bits/word
2019-05-30 17:47:08 using short carry kernels
2019-05-30 17:48:01 OpenCL compilation in 53016 ms, with "-DEXP=85469147u -DWIDTH=256u -DSMALL_HEIGHT=1024u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-30 17:48:03 85469147.owl loaded: k 223000, block 1000, res64 6dc0ba3dd68cf05d
2019-05-30 17:50:30 85469147 EE loaded: 223000, blockSize 1000, ee2866e4a4297374 (expected 6dc0ba3dd68cf05d)
2019-05-30 17:50:30 Exiting because "error on load"
2019-05-30 17:50:30 Bye

>gpuowl-win-c48d46f -device 0 -fft +3 -carry short
2019-05-30 17:52:14 gpuowl v6.5-c48d46f
2019-05-30 17:52:14 Note: no config.txt file found
2019-05-30 17:52:14 config: -device 0 -fft +3 -carry short
2019-05-30 17:52:14 85469147 FFT 4608K: Width 512x8, Height 8x8, Middle 9; 18.11 bits/word
2019-05-30 17:52:14 using short carry kernels
2019-05-30 17:52:55 OpenCL compilation in 40489 ms, with "-DEXP=85469147u -DWIDTH=4096u -DSMALL_HEIGHT=64u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-30 17:52:57 85469147.owl loaded: k 223000, block 1000, res64 6dc0ba3dd68cf05d
Abort was called at 74 line in file:
D:\qb\workspace\19992\src\vpg-compute-neo\runtime/command_stream/linear_stream.h

>gpuowl-win-c48d46f -device 0 -fft +2 -carry short
2019-05-30 17:54:32 gpuowl v6.5-c48d46f
2019-05-30 17:54:32 Note: no config.txt file found
2019-05-30 17:54:32 config: -device 0 -fft +2 -carry short
2019-05-30 17:54:32 85469147 FFT 4608K: Width 64x8, Height 64x8, Middle 9; 18.11 bits/word
2019-05-30 17:54:32 using short carry kernels
2019-05-30 17:56:02 OpenCL compilation in 88926 ms, with "-DEXP=85469147u -DWIDTH=512u -DSMALL_HEIGHT=512u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-30 17:56:03 85469147.owl loaded: k 223000, block 1000, res64 6dc0ba3dd68cf05d
(no progress indicated for 4 hours, no response to CTRL-C, igp is busy; terminated process in Task Manager)

>time
The current time is: 22:16:37.56

>gpuowl-win-c48d46f -device 0 -fft +0 -carry long
2019-05-30 22:26:15 gpuowl v6.5-c48d46f
2019-05-30 22:26:15 Note: no config.txt file found
2019-05-30 22:26:15 config: -device 0 -fft +0 -carry long
2019-05-30 22:26:15 85469147 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 18.11 bits/word
2019-05-30 22:26:15 using long carry kernels
2019-05-30 22:27:06 OpenCL compilation in 50507 ms, with "-DEXP=85469147u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-30 22:27:08 85469147.owl loaded: k 223000, block 1000, res64 6dc0ba3dd68cf05d
2019-05-30 22:29:46 85469147 EE loaded: 223000, blockSize 1000, 5ba05a0a832d8141 (expected 6dc0ba3dd68cf05d)
2019-05-30 22:29:46 Exiting because "error on load"
2019-05-30 22:29:46 Bye
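The "EE" lines and the "error on load" exits above appear to come from the PRP Gerbicz error check (the "GEC" of the `-block` option). As a rough illustration of the idea only, and not gpuowl's implementation, the Python sketch below runs such a check on a toy Mersenne modulus: every `block` squarings the running value is multiplied into a checksum `d`, and the identity d_new = 3 · d_old^(2^block) is verified. The function name, the toy exponent and the fault-injection parameter are all assumptions made for this example.

```python
def prp_with_gec(p: int, block: int, corrupt_at: int = -1) -> bool:
    """Base-3 squarings mod 2**p - 1 with a Gerbicz-style check.

    Returns True if every block check passes. corrupt_at (hypothetical)
    injects a one-off error at that iteration to show detection.
    """
    n = (1 << p) - 1
    u = 3          # u_0
    d = 3          # checksum d_0 = u_0
    for i in range(1, p + 1):
        u = u * u % n
        if i == corrupt_at:
            u = (u + 1) % n          # simulated hardware error
        if i % block == 0:
            d_new = d * u % n
            # Check: d_new == u_0 * d**(2**block) (mod n)
            if d_new != 3 * pow(d, 1 << block, n) % n:
                return False
            d = d_new
    return True


print(prp_with_gec(127, 10))                 # True
print(prp_with_gec(127, 10, corrupt_at=42))  # False
```

A smaller block means the comparison runs more often, which fits the help text's remark that a smaller block is slower but detects errors sooner.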

#1225

2·13·293 Posts
It has been said earlier that gpuowl needs a discrete gpu. Your device 0 is an integrated gpu with shared memory.
https://www.notebookcheck.net/Intel-....257928.0.html

#1226

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100101011₂ Posts
Quote:
Some earlier IGPs lacked DP, so they could run mfakto but not gpuowl. The UHD630's OpenCL report indicates DP capability (as does the HD620's). From GPU-Z's Advanced tab for OpenCL:
Code:
General
Platform Name Intel(R) OpenCL
Platform Vendor Intel(R) Corporation
Platform Profile FULL_PROFILE
Platform Version OpenCL 2.1
Vendor Intel(R) Corporation
Device Name Intel(R) UHD Graphics 630
Version OpenCL 2.1 NEO
Driver Version 23.20.16.4973
C Version OpenCL C 2.1
IL Version SPIR-V_1.0
Profile FULL_PROFILE
Global Memory Size 6497 MB
Clock Frequency 1100 MHz
Compute Units 24
Device Available Yes
Compiler Available Yes
Linker Available Yes
Preferred Synchronization User
CMD Queue Properties Out of Order, Profiling
SVM Capabilities Coarse, Fine, Atomics
DP Capability Denorm, INF NAN, Round Nearest, Round Zero, Round INF, FMA
SP Capability Denorm, INF NAN, Round Nearest, Round Zero, Round INF, FMA
Half FP Capability Denorm, INF NAN, Round Nearest, Round Zero, Round INF, FMA
Address Bits 64
Preferred On-Device Queue 128 KB
Global Memory Cache 512 KB (RW Cache)
Global Memory Cacheline 0 KB
Preferred Global Atomic Alignment 0
Preferred Local Atomic Alignment 0
Preferred Platform Atomic Alignment 0
Local Memory Local (64 KB)
Memory Alignment 1024 bits
Pitch Alignment 4 pixels
Built-in Kernels block_motion_estimate_intel;block_advanced_motion_estimate_check_intel;block_advanced_motion_estimate_bidirectional_check_intel;
Little Endian Yes
Error Correction No
Execution Capability Kernel
Unified Memory Yes
Image Support Yes

Limits
Max Device Events 1024
Max Device Queues 1
Max On-Device Queue 65536 KB
Preferred Max Variable Size 3406522368 Bytes
Max Memory Allocation 3248 MB
Max Constant Buffer 3326682 KB
Max Constant Args 8
Max Pipe Args 16
Max Pipe Reservations 1
Max Pipe Packet Size 1024 Bytes
Max Read Image Args 128
Max Write Image Args 128
Max Read-Write Image Args 0
Max Samplers 16
Max Work Item Dims 3
Max Write Image Args 128

Native Vectors
Native Vector Width (CHAR) 16
Native Vector Width (SHORT) 8
Native Vector Width (INT) 4
Native Vector Width (LONG) 1
Native Vector Width (FLOAT) 1
Native Vector Width (DOUBLE) 1
Native Vector Width (HALF) 8
Preferred Vector Width (CHAR) 16
Preferred Vector Width (SHORT) 8
Preferred Vector Width (INT) 4
Preferred Vector Width (LONG) 1
Preferred Vector Width (FLOAT) 1
Preferred Vector Width (DOUBLE) 1
Preferred Vector Width (HALF) 8

Extensions
cl_khr_3d_image_writes
cl_khr_byte_addressable_store
cl_khr_fp16
cl_khr_depth_images
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_icd
cl_khr_image2d_from_buffer
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_intel_subgroups
cl_intel_required_subgroup_size
cl_intel_subgroups_short
cl_khr_spir
cl_intel_accelerator
cl_intel_media_block_io
cl_intel_driver_diagnostics
cl_intel_device_side_avc_motion_estimation
cl_khr_priority_hints
cl_khr_subgroups
cl_khr_il_program
cl_khr_fp64
cl_intel_planar_yuv
cl_intel_packed_yuv
cl_intel_motion_estimation
cl_intel_advanced_motion_estimation
cl_khr_gl_sharing
cl_khr_gl_depth_images
cl_khr_gl_event
cl_khr_gl_msaa_sharing
cl_intel_dx9_media_sharing
cl_khr_dx9_media_sharing
cl_khr_d3d10_sharing
cl_khr_d3d11_sharing
cl_intel_d3d11_nv12_media_sharing
cl_intel_simultaneous_sharing

#1227

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts
Back in gpuowl V1.9, there were four transform types: SP, DP, M31, and M61. M61 could go a bit higher on exponent than DP of the same length but was not nearly as fast on AMD, with its 1:16 DP:SP ratio.
Now in V6.5, gpuowl runs under OpenCL 1.2 or above on NVIDIA. Most NVIDIA GPUs have a lower DP:SP ratio than AMD does; specifically, the GTX 10xx is 1:32. If the M61 transform were available in gpuowl v6.x, it might be faster on NVIDIA than DP is. See the first attachment of https://www.mersenneforum.org/showpo...35&postcount=2, and https://www.mersenneforum.org/showpo...31&postcount=8
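For readers unfamiliar with the name: M61 here presumably refers to a transform using integer arithmetic modulo the Mersenne prime 2^61 − 1 rather than floating point, so its speed would not depend on the double-precision rate in the same way. The Python sketch below shows only the characteristic shift-and-add reduction for that modulus. It is an illustration under that assumption, not code from gpuowl V1.9, and `reduce_m61` and `mulmod_m61` are hypothetical names.

```python
M61 = (1 << 61) - 1


def reduce_m61(x: int) -> int:
    """Reduce a non-negative integer modulo 2**61 - 1 without division.

    Uses 2**61 == 1 (mod M61): fold the high bits onto the low bits.
    """
    while x > M61:
        x = (x & M61) + (x >> 61)
    return 0 if x == M61 else x


def mulmod_m61(a: int, b: int) -> int:
    return reduce_m61(a * b)


a, b = 123456789012345678, 987654321098765432
print(mulmod_m61(a, b) == (a * b) % M61)  # True
```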

#1228

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
12453₈ Posts
The latest makefile seems to get the strip step right on Windows; it requires specifying the target as gpuowl-win.exe.
Code:
$ make gpuowl-win.exe
cat head.txt gpuowl.cl tail.txt > gpuowl-wrap.cpp
echo \"`git describe --long --dirty --always`\" > version.new
diff -q -N version.new version.inc >/dev/null || mv version.new version.inc
echo Version: `cat version.inc`
Version: "v6.5-75-g4902439-dirty"
g++ -MT Pm1Plan.o -MMD -MP -MF .d/Pm1Plan.Td -Wall -O2 -std=c++17 -c -o Pm1Plan.o Pm1Plan.cpp
g++ -MT GmpUtil.o -MMD -MP -MF .d/GmpUtil.Td -Wall -O2 -std=c++17 -c -o GmpUtil.o GmpUtil.cpp
g++ -MT Worktodo.o -MMD -MP -MF .d/Worktodo.Td -Wall -O2 -std=c++17 -c -o Worktodo.o Worktodo.cpp
g++ -MT common.o -MMD -MP -MF .d/common.Td -Wall -O2 -std=c++17 -c -o common.o common.cpp
g++ -MT main.o -MMD -MP -MF .d/main.Td -Wall -O2 -std=c++17 -c -o main.o main.cpp
g++ -MT Gpu.o -MMD -MP -MF .d/Gpu.Td -Wall -O2 -std=c++17 -c -o Gpu.o Gpu.cpp
g++ -MT clwrap.o -MMD -MP -MF .d/clwrap.Td -Wall -O2 -std=c++17 -c -o clwrap.o clwrap.cpp
g++ -MT Task.o -MMD -MP -MF .d/Task.Td -Wall -O2 -std=c++17 -c -o Task.o Task.cpp
g++ -MT checkpoint.o -MMD -MP -MF .d/checkpoint.Td -Wall -O2 -std=c++17 -c -o checkpoint.o checkpoint.cpp
g++ -MT timeutil.o -MMD -MP -MF .d/timeutil.Td -Wall -O2 -std=c++17 -c -o timeutil.o timeutil.cpp
g++ -MT Args.o -MMD -MP -MF .d/Args.Td -Wall -O2 -std=c++17 -c -o Args.o Args.cpp
g++ -MT state.o -MMD -MP -MF .d/state.Td -Wall -O2 -std=c++17 -c -o state.o state.cpp
g++ -MT Signal.o -MMD -MP -MF .d/Signal.Td -Wall -O2 -std=c++17 -c -o Signal.o Signal.cpp
g++ -MT FFTConfig.o -MMD -MP -MF .d/FFTConfig.Td -Wall -O2 -std=c++17 -c -o FFTConfig.o FFTConfig.cpp
g++ -MT clpp.o -MMD -MP -MF .d/clpp.Td -Wall -O2 -std=c++17 -c -o clpp.o clpp.cpp
g++ -MT gpuowl-wrap.o -MMD -MP -MF .d/gpuowl-wrap.Td -Wall -O2 -std=c++17 -c -o gpuowl-wrap.o gpuowl-wrap.cpp
g++ -o gpuowl-win.exe Pm1Plan.o GmpUtil.o Worktodo.o common.o main.o Gpu.o clwrap.o Task.o checkpoint.o timeutil.o Args.o state.o Signal.o FFTConfig.o clpp.o gpuowl-wrap.o -lstdc++fs -lOpenCL -lgmp -pthread -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32 -L. -static
strip gpuowl-win.exe
Code:
>gpuowl-win -prp 3321928097
2019-06-03 14:22:00 gpuowl v6.5-75-g4902439-dirty
2019-06-03 14:22:00 Exception St12out_of_range: stol
2019-06-03 14:22:00 Bye

>gpuowl-win -prp 2147483659
2019-06-03 14:28:16 gpuowl v6.5-75-g4902439-dirty
2019-06-03 14:28:17 Exception St12out_of_range: stol
2019-06-03 14:28:17 Bye

>gpuowl-win -prp 2147483647 -use FMA_X2
2019-06-03 14:29:52 gpuowl v6.5-75-g4902439-dirty
2019-06-03 14:29:52 Note: no config.txt file found
2019-06-03 14:29:52 config: -prp 2147483647 -use FMA_X2
2019-06-03 14:29:52 2147483647 FFT 147456K: Width 512x8, Height 256x8, Middle 9; 14.22 bits/word
2019-06-03 14:29:52 using long carry kernels
2019-06-03 14:30:00 OpenCL args "-DEXP=2147483647u -DWIDTH=4096u -DSMALL_HEIGHT=2048u -DMIDDLE=9u -DWEIGHT_STEP=0xd.b745787f2c4cp-3 -DIWEIGHT_STEP=0x9.550d2c9e837e8p-4 -DWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-3 -DIWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-4 -DFMA_X2=1 -DFMA_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-06-03 14:30:04 OpenCL compilation in 4704 ms
2019-06-03 14:30:28 2147483647.owl not found, starting from the beginning.
2019-06-03 14:42:03 2147483647 OK 2000 0.00%; 162.835 ms/sq; ETA 4047d 06:30; fb12c8169932aa03 (check 172.72s)
(2147483647 < 2^31 < 2147483659; log10(2^3321928097 - 1) > 10^9)
Last fiddled with by kriesel on 2019-06-03 at 20:29
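The `stol` out_of_range exceptions are consistent with a 32-bit signed `long`, as on 64-bit Windows, although the thread does not confirm the cause. Under that assumption, the Python sketch below checks which of the exponents tried above fit in signed and unsigned 32-bit integers, and also verifies the digit-count remark. All names are illustrative.

```python
import math

INT32_MAX = 2**31 - 1     # largest value of a 32-bit signed long
UINT32_MAX = 2**32 - 1    # largest 32-bit unsigned value


def fits(exponent: int) -> tuple[bool, bool]:
    """Return (fits signed 32-bit, fits unsigned 32-bit)."""
    return exponent <= INT32_MAX, exponent <= UINT32_MAX


def mersenne_digits(p: int) -> int:
    """Decimal digits of 2**p - 1, via logarithms (no big-number work)."""
    return math.floor(p * math.log10(2)) + 1


for p in (2147483647, 2147483659, 3321928097):
    print(p, fits(p), mersenne_digits(p) > 10**9)
```

Only 2147483647 fits the signed range, while all three fit the unsigned range, which agrees with the `-DEXP=...u` suffix in the compiler arguments.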

#1229

"Mihai Preda"
Apr 2015
3×457 Posts
Dirty means that there are uncommitted local changes (edits) to some files. If the build is done from exactly the version that is checked out, then it's not dirty.
I tried to fix the stol(), please re-try with a >2G exponent.
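As an illustration of how such a version string breaks down, the Python sketch below splits the output format of `git describe --long --dirty --always` (tag, commits since the tag, abbreviated hash, optional `-dirty`). The parser and its name are hypothetical, and the bare-hash output that `--always` can produce when no tag exists is handled only minimally.

```python
import re

# tag-N-gHASH with an optional -dirty suffix
DESCRIBE_RE = re.compile(
    r"^(?P<tag>.+)-(?P<commits>\d+)-g(?P<hash>[0-9a-f]+)(?P<dirty>-dirty)?$"
)


def parse_describe(version: str) -> dict:
    m = DESCRIBE_RE.match(version)
    if m is None:
        # e.g. a bare abbreviated hash; report it unparsed
        return {"raw": version, "dirty": version.endswith("-dirty")}
    return {
        "tag": m.group("tag"),
        "commits_since_tag": int(m.group("commits")),
        "hash": m.group("hash"),
        "dirty": m.group("dirty") is not None,
    }


print(parse_describe("v6.5-75-g4902439-dirty"))
print(parse_describe("v6.5-61-g5c0db85"))
```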

#1230

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts
Looks good. Timings don't though. Eleven to 23.5 years for these on RX480.
Code:
>gpuowl-win -prp 2147483659 -use FMA_X2
2019-06-03 17:40:09 gpuowl v6.5-76-g1ca08e2-dirty
2019-06-03 17:40:09 Note: no config.txt file found
2019-06-03 17:40:09 config: -prp 2147483659 -use FMA_X2
2019-06-03 17:40:09 2147483659 FFT 147456K: Width 512x8, Height 256x8, Middle 9; 14.22 bits/word
2019-06-03 17:40:09 using long carry kernels
2019-06-03 17:40:16 OpenCL args "-DEXP=2147483659u -DWIDTH=4096u -DSMALL_HEIGHT=2048u -DMIDDLE=9u -DWEIGHT_STEP=0xd.b7456bd211bf8p-3 -DIWEIGHT_STEP=0x9.550d353e7752p-4 -DWEIGHT_BIGSTEP=0xc.5672a115506d8p-3 -DIWEIGHT_BIGSTEP=0xa.5fed6a9b15138p-4 -DFMA_X2=1 -DFMA_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-06-03 17:40:21 OpenCL compilation in 4679 ms
2019-06-03 17:40:47 2147483659.owl not found, starting from the beginning.
2019-06-03 17:52:25 2147483659 OK 2000 0.00%; 161.868 ms/sq; ETA 4023d 06:15; 25ac32a404e8574e (check 171.25s)
^CTerminate batch job (Y/N)? n

>gpuowl-win -prp 3321928097 -use ORIG_X2
2019-06-03 17:53:53 gpuowl v6.5-76-g1ca08e2-dirty
2019-06-03 17:53:53 Note: no config.txt file found
2019-06-03 17:53:53 config: -prp 3321928097 -use ORIG_X2
2019-06-03 17:53:53 3321928097 FFT 196608K: Width 512x8, Height 256x8, Middle 12; 16.50 bits/word
2019-06-03 17:53:53 using long carry kernels
2019-06-03 17:53:59 OpenCL args "-DEXP=3321928097u -DWIDTH=4096u -DSMALL_HEIGHT=2048u -DMIDDLE=12u -DWEIGHT_STEP=0xb.4feacf46035b8p-3 -DIWEIGHT_STEP=0xb.50b39ab42445p-4 -DWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-3 -DIWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-4 -DORIG_X2=1 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-06-03 17:54:03 OpenCL compilation in 4318 ms
2019-06-03 17:54:38 3321928097.owl not found, starting from the beginning.
2019-06-03 18:20:55 3321928097 OK 2000 0.00%; 222.996 ms/sq; ETA 8573d 19:14; 5388b104718177b6 (check 237.96s)
2019-06-03 18:24:37 Stopping, please wait..
2019-06-03 18:28:33 3321928097 OK 3000 0.00%; 221.702 ms/sq; ETA 8524d 01:06; faa54e1e75915eab (check 235.93s)
2019-06-03 18:28:37 Exiting because "stop requested"
2019-06-03 18:28:37 Bye
In the second case, 2000 iterations x 223 ms/sq + 238 sec = 684 sec, but elapsed time = 18:20:55 - 17:54:38 = 1577 sec. GPU RAM usage was ~6 GB in the second case.
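The arithmetic in that last remark, and the "eleven to 23.5 years" estimate, can be reproduced directly from the logged figures. A short Python check, using only numbers from the logs above (the function names are illustrative, and a full test is assumed to need about one squaring per bit of the exponent):

```python
from datetime import datetime


def eta_days(exponent: int, ms_per_sq: float) -> float:
    """Days for a full PRP test: one squaring per bit of the exponent."""
    return exponent * ms_per_sq / 1000 / 86400


def elapsed_seconds(start: str, end: str) -> float:
    fmt = "%H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


print(round(eta_days(2147483659, 161.868) / 365.25, 1))  # years, first case
print(round(eta_days(3321928097, 222.996) / 365.25, 1))  # years, second case

expected = 2000 * 0.223 + 238                    # squarings + check, seconds
actual = elapsed_seconds("17:54:38", "18:20:55")
print(round(expected), round(actual))            # 684 1577
```

The gap between the two figures is time not accounted for by the reported ms/sq and check time.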

#1231

"Mihai Preda"
Apr 2015
3×457 Posts
In a recent commit, the timing display is changed from ms/sq to us/sq ("micros") :)
Code:
2019-06-04 21:46:15 r7u 85504057 OK 78643000 91.97%; 794 us/sq; ETA 0d 01:31; 3dad4b579a2cd95c (check 0.97s)
2019-06-04 21:47:01 r7u 85504057 78700000 92.04%; 811 us/sq; ETA 0d 01:32; 13b0dc053fd74724
Last fiddled with by preda on 2019-06-04 at 11:47
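With the new unit, the ETA in such a line can be sanity-checked as (exponent − iteration) × µs/sq. The Python sketch below parses the first log line above with a regular expression written for this example only (the log format may differ between gpuowl versions) and recomputes the remaining time.

```python
import re

LINE = ("2019-06-04 21:46:15 r7u 85504057 OK 78643000 91.97%; "
        "794 us/sq; ETA 0d 01:31; 3dad4b579a2cd95c (check 0.97s)")

# Hypothetical pattern: exponent, optional OK/EE, iteration, percent, time per squaring
PATTERN = re.compile(r"(\d+) (?:OK |EE )?(\d+) +[\d.]+%; (\d+) us/sq")


def remaining_minutes(line: str) -> float:
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("unrecognized log line")
    exponent, iteration, us_per_sq = (int(g) for g in m.groups())
    return (exponent - iteration) * us_per_sq / 1e6 / 60


print(round(remaining_minutes(LINE)))  # 91, i.e. about 0d 01:31
```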

#1232

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
12453₈ Posts
I read through the commit listings back to mid-January and saw that Preda had acknowledged numerous contributions there from several individuals. A crude summary follows:
Code:
valeriob01 -w argument; readme.md work; description of cmd line arguments
& updates, display of parameters; primenet.py date & time;
makefile fix
k3ack3r fix some msys2 warnings; update makefile
chengsun fix alignment violation causing OUT_OF_RESOURCES error on NVIDIA
GPUs
sillygitter add -iters argument
gwoltman allow making small test kernels; new X2 definition; fft8 cleanup +
documentation; new sq macro; overhaul/comment fft5/fft10 macros;
improved pairSq and pairMul; faster 6m fft using new fft12 middle;
new 5.5m fft using new fft11 middle; increased precision of fft11
constants; inline X2; fft7 middle step; shorter multiply chains in
middle