#1805 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·29·127 Posts |
Code:
2020-01-23 16:54:19 condorella/rx480 82053239 FFT 4608K: Width 256x4, Height 64x4, Middle 9; 17.39 bits/word
2020-01-23 16:54:19 condorella/rx480 OpenCL args "-DEXP=82053239u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=9u -DWEIGHT_STEP=0xc.373107b1f3e78p-3 -DIWEIGHT_STEP=0xa.7a792f1683b7p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DAMDGPU=1 -DCARRY32=1 -DCHEBYSHEV_MIDDLEMUL2=1 -DMERGED_MIDDLE=1 -DMORE_SQUARES_MIDDLEMUL1=1 -DNEW_SLOWTRIG=1 -DNO_ASM=1 -DT2_SHUFFLE_HEIGHT=1 -DT2_SHUFFLE_WIDTH=1 -DUNROLL_HEIGHT=1 -DUNROLL_WIDTH=1 -DWORKINGIN1=1 -DWORKINGOUT1=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-01-23 16:54:22 condorella/rx480 OpenCL compilation in 2.65 s
2020-01-23 16:54:24 condorella/rx480 82053239 EE 0 loaded: blockSize 400, a8c3b11429b46cbf (expected 0000000000000003)
2020-01-23 16:54:24 condorella/rx480 Exiting because "error on load"
2020-01-23 16:54:24 condorella/rx480 Bye

This is a PRP-DC at 4608K FFT length, to which the -use options that are optimal for 5M FFT length were applied from config.txt, with fatal result. I'm not looking forward to tuning a long list of -use options on an fftlength-by-fftlength basis for numerous gpu models and swapping them out manually when exponents change, or to having such crashes discard 18 hours of gpu time instead of making progress. At a 3-5% speedup on many models, it takes a long time to pay that back.

Last fiddled with by kriesel on 2020-01-24 at 17:17
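One way to avoid carrying one FFT length's tuning over to another is to keep the -use lists in a small table keyed by FFT length. The following is only a sketch of that idea: the table layout and the `use_args` helper are hypothetical and not part of gpuowl, the option names are copied from the log above (list abbreviated), and the comma-separated `-use` syntax is the one that appears in gpuowl's own `config:` log lines.

```python
# Hypothetical wrapper logic, not part of gpuowl: choose -use options per
# FFT length instead of applying one global list from config.txt.
# Option names are copied from the log above; the list is abbreviated.
USE_BY_FFT = {
    "5M": [
        "CARRY32",
        "CHEBYSHEV_MIDDLEMUL2",
        "MERGED_MIDDLE",
        "T2_SHUFFLE_HEIGHT",
        "T2_SHUFFLE_WIDTH",
        "WORKINGIN1",
        "WORKINGOUT1",
    ],
    # No entry for "4608K" until that length has been tuned on its own.
}


def use_args(fft_length):
    """Return the extra command-line arguments for one FFT length.

    Untuned lengths get no -use options at all, so the program runs with
    its defaults rather than inheriting another length's tuning.
    """
    options = USE_BY_FFT.get(fft_length)
    if not options:
        return []
    return ["-use", ",".join(options)]


print(use_args("5M"))
print(use_args("4608K"))
```

Falling back to no options costs the few percent of tuning gain on untuned lengths, but it cannot reproduce the "error on load" exit shown above.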
#1806 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×29×127 Posts |
Likes a somewhat different combination than for 5M
#1807 |
"Jorge Coveiro"
Nov 2006
Moura, Portugal
2^4·3 Posts |
Hi!
Can someone help me? I have an Nvidia GTX1660 running gpuowl at around 8250 us/it (FFT 5632K). With some overclocking I can get less than 8000 us/it, but I'm not sure how best to test the GPU for errors or how to tune it with -use options. Can someone help me out?

More questions:
1. I'm considering buying 2x Radeon VII. Or should I wait for Big Navi?
2. Does anyone have AMD 5700 XT benchmarks to compare with the Radeon VII?
3. CudaLUCAS seems to run slower than gpuowl. Are there any other options?

Thanks!
#1808 |
"Mihai Preda"
Apr 2015
2^2×19^2 Posts |
My expectation is that the Radeon VII will still be better than "big navi" because it has such good DP (FP64) throughput. Its memory is also both large and fast. In addition, prices for the Radeon VII have moved down a bit.
#1809 | |
"Eric"
Jan 2018
USA
2^2×5×11 Posts |
A1: Definitely buy 2 Radeon VII over big navi. I seriously doubt AMD will put FP64 performance on big navi, since the norm right now for gaming GPUs is to cut down FP64 as much as possible to save die space for ray tracing or shaders.

A2: I think OpenCL is still broken on Navi GPUs and runs much more stably on GCN GPUs. Even if it's not broken, I am assuming that the 5700 XT should perform slightly better than a stock Vega 56 in PRP, so around 3000us/it for 5632K FFT. But a Radeon VII should get close to 1000us/it (I personally don't own one, but that is what I remember from other owners' benchmarks).

A3: gpuowl is already the fastest option for primality tests. Maybe future optimizations will make it even faster, but for now it's going to be way faster than CUDALucas on memory bound GPUs such as the Titan V or Radeon VII (the latter can't run CUDALucas at all, while on the Titan V gpuowl is 2x faster). In any case, whether you own a modern Nvidia (supporting OpenCL 2.0 and above) or AMD GPU, you should always run gpuowl over CUDALucas or CLLucas because of its superior error checking algorithm, which could potentially eliminate the need for double checking.

Last fiddled with by xx005fs on 2020-02-01 at 22:24
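To put the us/it figures in this thread into wall-clock terms, here is a rough sketch. It assumes a PRP test needs about as many iterations as the exponent has bits (roughly one squaring per bit) and that the speed stays constant; the exponent 99753809 is only an illustrative value for the 5632K FFT size, and the 3000 and 1000 us/it figures are the estimates from the answer above, not measurements.

```python
def days_per_test(exponent, us_per_iter):
    """Rough wall-clock days for one PRP test.

    Assumes about one iteration per bit of the exponent and a constant
    iteration time; ignores checkpoints, error checks and restarts.
    """
    seconds = exponent * us_per_iter * 1e-6
    return seconds / 86400


EXPONENT = 99_753_809  # illustrative exponent for a 5632K FFT

for label, us_per_iter in [
    ("GTX 1660 (reported)", 8250),
    ("5700 XT (estimate)", 3000),
    ("Radeon VII (estimate)", 1000),
]:
    print(f"{label}: {days_per_test(EXPONENT, us_per_iter):.1f} days")
```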
#1810 | |
"Jorge Coveiro"
Nov 2006
Moura, Portugal
2^4·3 Posts |
But I think they're a good investment anyway (for this kind of project). I hope they move down a bit more, since AMD discontinued them. |
#1811 | |
"Jorge Coveiro"
Nov 2006
Moura, Portugal
2^4×3 Posts |
Well... the AMD 5700 XT is a lot cheaper than the Radeon VII, at almost half the price. Also, 3000us/it for the 5700 XT is still good, but 1000us/it for the Radeon VII is awesome!
#1812 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×29×127 Posts |
"More questions" has been covered pretty well already by others.

For gpuowl -use option timing and tuning, I use the attached Windows batch file. Passes zero and one run together; the other passes run individually. From one pass to the next, edit the gotos to change the control flow and the sets to change the -use options in effect. That is what I did to produce my previous posts of tuning results. See the comments at both ends of the file for more info. (I had to zip it; the forum won't accept a .bat file.)

Please post your tuning results.

Last fiddled with by kriesel on 2020-02-01 at 23:01
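For readers who don't use Windows batch files, the same idea (time one -use combination per run) can be sketched in Python. This is not the attached batch file, just a minimal illustration: the `tuning_commands` helper and the executable name `gpuowl` are assumptions, while the `-time -iters 10000 -use A,B` form is copied from gpuowl's `config:` log lines quoted elsewhere in this thread. The sketch only builds and prints the command lines; running them and reading the timings is left out because that depends on the local setup.

```python
import shlex


def tuning_commands(exe, option_sets, iters=10000):
    """Build one timing command line per -use option combination.

    exe is the path to the gpuowl executable (an assumption here);
    option_sets is a list of lists of -use option names.
    """
    commands = []
    for options in option_sets:
        commands.append(
            [exe, "-time", "-iters", str(iters), "-use", ",".join(options)]
        )
    return commands


# Example combinations; the names are ones that appear in this thread.
PASS_ONE = [
    ["NO_ASM", "UNROLL_NONE"],
    ["NO_ASM", "UNROLL_WIDTH"],
    ["NO_ASM", "UNROLL_HEIGHT"],
]

for command in tuning_commands("gpuowl", PASS_ONE):
    print(shlex.join(command))
```

Each printed line can be pasted into a shell, or the lists can be handed to `subprocess.run` one at a time so that a failing combination does not stop the rest of the pass.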
#1813 | |
"Jorge Coveiro"
Nov 2006
Moura, Portugal
2^4×3 Posts |
But first, I just want to say that there is a bug in the program. I'm using gpuowl v6.11-134-g1e0ce1d.

#####################################

Running the batch file outputs the following errors:

Error#1
Running the Windows batch file at:
2020-02-01 23:55:14 config: -time -iters 10000 -use NO_ASM,UNROLL_NONE
outputs some errors, followed by:
2020-02-01 23:55:14 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

Error#2
Running the Windows batch file at:
2020-02-01 23:55:14 config: -time -iters 10000 -use NO_ASM,UNROLL_WIDTH
outputs some errors, followed by:
2020-02-01 23:55:15 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

Error#3
Running the Windows batch file at:
2020-02-01 23:55:15 config: -time -iters 10000 -use NO_ASM,UNROLL_HEIGHT
outputs some errors, followed by:
2020-02-01 23:55:15 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

Error#4
Running the Windows batch file at:
2020-02-01 23:55:15 config: -time -iters 10000 -use NO_ASM,UNROLL_MIDDLEMUL1
outputs some errors, followed by:
2020-02-01 23:55:16 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

Error#5
Running the Windows batch file at:
2020-02-01 23:55:16 config: -time -iters 10000 -use NO_ASM,UNROLL_MIDDLEMUL2
outputs some errors, followed by:
2020-02-01 23:55:16 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build

#####################################

Here are some more details on Error#1:

Code:
2020-02-01 23:55:14 config: -time -iters 10000 -use NO_ASM,UNROLL_NONE
2020-02-01 23:55:14 device 0, unique id ''
2020-02-01 23:55:14 GeForce GTX 1660-0 99753809 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.30 bits/word
2020-02-01 23:55:14 GeForce GTX 1660-0 OpenCL args "-DEXP=99753809u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xd.064531a6f6b48p-3 -DIWEIGHT_STEP=0x9.d3e00e7c301p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DNO_ASM=1 -DUNROLL_NONE=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-02-01 23:55:14 GeForce GTX 1660-0 OpenCL compilation error -11 (args -DEXP=99753809u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xd.064531a6f6b48p-3 -DIWEIGHT_STEP=0x9.d3e00e7c301p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DNO_ASM=1 -DUNROLL_NONE=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0 -DNO_ASM=1)
2020-02-01 23:55:14 GeForce GTX 1660-0 <kernel>:1386:3: error: expected identifier or '('
  for (i32 s = 4; s >= 0; s -= 2) {
  ^
<kernel>:1394:3: error: expected identifier or '('
  for (i32 s = 4; s >= 0; s -= 2) {
  ^
<kernel>:1404:3: error: expected identifier or '('
  for (i32 s = 3; s >= 0; s -= 3) {
  ^
<kernel>:1412:3: error: expected identifier or '('
  for (i32 s = 3; s >= 0; s -= 3) {
  ^
<kernel>:1422:3: error: expected identifier or '('
  for (i32 s = 6; s >= 0; s -= 2) {
  ^
<kernel>:1430:3: error: expected identifier or '('
  for (i32 s = 6; s >= 0; s -= 2) {
  ^
<kernel>:1440:3: error: expected identifier or '('
  for (i32 s = 6; s >= 0; s -= 3) {
  ^
<kernel>:1448:3: error: expected identifier or '('
  for (i32 s = 6; s >= 0; s -= 3) {
  ^
<kernel>:1458:3: error: expected identifier or '('
  for (i32 s = 5; s >= 2; s -= 3) {
  ^
<kernel>:1502:3: error: expected identifier or '('
  for (i32 s = 5; s >= 2; s -= 3) {
  ^
<kernel>:2478:3: error: expected identifier or '('
  for (i32 i = 0; i < MIDDLE; ++i) {
  ^
2020-02-01 23:55:14 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build
2020-02-01 23:55:14 GeForce GTX 1660-0 Bye

Last fiddled with by JCoveiro on 2020-02-02 at 01:15
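When a tuning pass produces many such build failures, it can help to condense each log into a list of kernel line numbers and messages before posting. A minimal sketch follows, assuming the log is read from a file with one record per line and that compiler diagnostics have the `<kernel>:LINE:COL: error: MESSAGE` shape seen above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches compiler diagnostics of the form "<kernel>:1386:3: error: message".
KERNEL_ERROR = re.compile(r"<kernel>:(\d+):(\d+): error: (.+)")


def kernel_errors(log_text):
    """Return (line, column, message) for each kernel compile error found."""
    found = []
    for record in log_text.splitlines():
        match = KERNEL_ERROR.search(record)
        if match:
            line, column, message = match.groups()
            found.append((int(line), int(column), message.strip()))
    return found


sample = """\
2020-02-01 23:55:14 GeForce GTX 1660-0 <kernel>:1386:3: error: expected identifier or '('
  for (i32 s = 4; s >= 0; s -= 2) {
  ^
<kernel>:2478:3: error: expected identifier or '('
  for (i32 i = 0; i < MIDDLE; ++i) {
  ^
"""

errors = kernel_errors(sample)
for line, column, message in errors:
    print(f"kernel line {line}, column {column}: {message}")
print(Counter(message for _, _, message in errors))
```

The counter makes it easy to see whether a failed build has one root message repeated many times, as in the log above, or several different ones.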
#1814 |
"Eric"
Jan 2018
USA
2^2·5·11 Posts |
I ran this file on my Titan V to try out the most recent update, but I got consistently slower results (657us/it vs 632us/it) compared to version 6.11-113-g6ecd9a2, which I am currently running. It seems that the default Nvidia optimization settings don't play well with the Titan V.
Last fiddled with by xx005fs on 2020-02-02 at 01:15 |
#1815 |
"Jorge Coveiro"
Nov 2006
Moura, Portugal
48₁₀ Posts |
I have found another bug, while trying to test M47 (a lower exponent).
Code:
2020-02-02 01:36:38 gpuowl v6.11-134-g1e0ce1d
2020-02-02 01:36:38 Note: not found 'config.txt'
2020-02-02 01:36:38 config: -use UNROLL_ALL,WORKINGIN4,WORKINGOUT4,T2_SHUFFLE,CARRY64,FANCYMIDDLEMUL1,LESS_ACCURATE
2020-02-02 01:36:38 device 0, unique id ''
2020-02-02 01:36:38 GeForce GTX 1660-0 43112609 FFT 2304K: Width 8x8, Height 256x8, Middle 9; 18.27 bits/word
2020-02-02 01:36:39 GeForce GTX 1660-0 OpenCL args "-DEXP=43112609u -DWIDTH=64u -DSMALL_HEIGHT=2048u -DMIDDLE=9u -DWEIGHT_STEP=0xd.3ca600d8f455p-3 -DIWEIGHT_STEP=0x9.ab80a96f8aeap-4 -DWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-3 -DIWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-4 -DCARRY64=1 -DFANCYMIDDLEMUL1=1 -DLESS_ACCURATE=1 -DT2_SHUFFLE=1 -DUNROLL_ALL=1 -DWORKINGIN4=1 -DWORKINGOUT4=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-02-02 01:36:39 GeForce GTX 1660-0 OpenCL compilation error -11 (args -DEXP=43112609u -DWIDTH=64u -DSMALL_HEIGHT=2048u -DMIDDLE=9u -DWEIGHT_STEP=0xd.3ca600d8f455p-3 -DIWEIGHT_STEP=0x9.ab80a96f8aeap-4 -DWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-3 -DIWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-4 -DCARRY64=1 -DFANCYMIDDLEMUL1=1 -DLESS_ACCURATE=1 -DT2_SHUFFLE=1 -DUNROLL_ALL=1 -DWORKINGIN4=1 -DWORKINGOUT4=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0 -DNO_ASM=1)
2020-02-02 01:36:39 GeForce GTX 1660-0 <kernel>:2009:2: error: WORKINGOUT4 not compatible with this FFT size
#error WORKINGOUT4 not compatible with this FFT size
 ^
2020-02-02 01:36:39 GeForce GTX 1660-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:247 build
2020-02-02 01:36:39 GeForce GTX 1660-0 Bye

Last fiddled with by JCoveiro on 2020-02-02 at 01:42
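Since the compiler names the rejected option in the message, a tuning script could drop that option and retry instead of stopping. The sketch below only does the text handling: it looks for the "NAME not compatible with this FFT size" wording from the log above and filters the -use list. The helper name is hypothetical, and whether the remaining options then build for this FFT size is not something the sketch can tell.

```python
import re

# Matches both "error: NAME not compatible ..." and "#error NAME not compatible ...".
INCOMPATIBLE = re.compile(r"error:?\s+(\w+) not compatible with this FFT size")


def drop_incompatible(use_options, log_text):
    """Return use_options without the options the build log rejected."""
    rejected = set(INCOMPATIBLE.findall(log_text))
    return [option for option in use_options if option not in rejected]


log_excerpt = (
    "<kernel>:2009:2: error: WORKINGOUT4 not compatible with this FFT size\n"
    "#error WORKINGOUT4 not compatible with this FFT size\n"
)

options = [
    "UNROLL_ALL", "WORKINGIN4", "WORKINGOUT4", "T2_SHUFFLE",
    "CARRY64", "FANCYMIDDLEMUL1", "LESS_ACCURATE",
]

print(",".join(drop_incompatible(options, log_excerpt)))
```

A retry may surface the next incompatible option, so a script built on this would need to repeat the build until it succeeds or the list is empty.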