We are struggling to run gpuOwl on windoze 7, nvidia card (2080Ti, but also 1080Ti). We are sure we are missing something. Can anyone point us to a tutorial? Right now, with cudaLucas, we are squeezing about 22 hours for a 55M LL test. We want to see what the owl can do, before replacing the cards with a couple of radeon vees (or.. it is wees? like in "waa wee cafè"?)
|
[QUOTE=kriesel;542591]Yikes, that means the LL side of gpuowl will be less reliable than CUDALucas v2.06, which has checks for known bad residues seen to occur,
0x0000000000000000, 0x0000000000000002, 0xffffffff80000000, 0xfffffffffffffffd, and excessive roundoff error. Gpuowl checks bits/word. A memory copy fail could give 0; +-2 values come from the residue getting zeroed and then the -2 and the squaring; the 33-bits-set value 0xffffffff80000000 comes from using far too short an fft length as was seen in both cllucas 1.02 and CUDALucas v2.03. [URL]https://mersenneforum.org/showpost.php?p=355661&postcount=232[/URL] [URL]https://mersenneforum.org/showpost.php?p=386081&postcount=299[/URL][/QUOTE]
Things may improve in time; this is an intermediary point in the timeline, not the final perfect LL. |
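To illustrate where the +-2 values in the quoted list come from, here is a minimal Python sketch of the Lucas-Lehmer recurrence s -> s*s - 2 (mod 2^p - 1). It is not code from CUDALucas or gpuowl; the function names are made up for this example, and the set of suspect values is simply copied from the quoted post. Starting from a zeroed residue, the next step gives -2, whose low 64 bits are 0xfffffffffffffffd, and every step after that gives 2.

```python
# Lucas-Lehmer sketch: s -> s*s - 2 (mod 2**p - 1), starting from s = 4.
# Illustrative only; names are hypothetical, not taken from any LL program.

MASK64 = (1 << 64) - 1

# Suspect 64-bit residues, copied from the list in the quoted post.
KNOWN_BAD = {
    0x0000000000000000,
    0x0000000000000002,
    0xFFFFFFFF80000000,
    0xFFFFFFFFFFFFFFFD,
}


def ll_step(s, p):
    """One LL iteration modulo the Mersenne number 2**p - 1."""
    return (s * s - 2) % ((1 << p) - 1)


def is_suspect(s):
    """True if the low 64 bits of s match one of the listed bad residues."""
    return (s & MASK64) in KNOWN_BAD


def ll_is_prime(p):
    """Full LL test for an odd prime exponent p (slow, for small p only)."""
    s = 4
    for _ in range(p - 2):
        s = ll_step(s, p)
    return s == 0


def residues_after_zeroing(p, steps=3):
    """Low 64 bits of the next few residues once s has been zeroed."""
    s = 0
    out = []
    for _ in range(steps):
        s = ll_step(s, p)
        out.append(s & MASK64)
    return out


if __name__ == "__main__":
    print(ll_is_prime(89))
    print([hex(r) for r in residues_after_zeroing(89)])
```

A check like `is_suspect` only makes sense on interim residues, since a final residue of 0 is exactly what a genuine prime produces.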
[QUOTE=LaurV;542605]We are struggling to run gpuOwl on windoze 7, nvidia card (2080Ti, but also 1080Ti). We are sure we are missing something. Can anyone point us to a tutorial? Right now, with cudaLucas, we are squeezing about 22 hours for a 55M LL test. We want to see what the owl can do, before replacing the cards with a couple of radeon vees (or.. it is wees? like in "waa wee cafè"?)[/QUOTE]
1. does clinfo run, detecting the GPUs?
2. does gpuowl -h run, printing the list of GPUs?
3. what fails? |
v6.11-134, no errors for months on this RX550 gpu running 9xM PRP.
v6.11-257, 2 errors in 5 hours, same gpu[CODE]2020-04-14 02:08:31 gpuowl v6.11-257-g39fc002
2020-04-14 02:08:31 config: -device 1 -user kriesel -cpu condorella/rx550 -yield -maxAlloc 3600 -use NO_ASM
2020-04-14 02:08:31 device 1, unique id ''
2020-04-14 02:08:31 condorella/rx550 94741139 FFT: 5M 1K:10:256 (18.07 bpw)
2020-04-14 02:08:31 condorella/rx550 Expected maximum carry32: 461E0000
2020-04-14 02:08:32 condorella/rx550 OpenCL args "-DEXP=94741139u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xf.3cd1fc0411148p-3 -DIWEIGHT_STEP=0x8.66790bf53aca8p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-14 02:08:38 condorella/rx550 OpenCL compilation in 5.62 s
2020-04-14 02:08:44 condorella/rx550 94741139 OK 6054400 loaded: blockSize 400, cc34a0f738ddbc39
2020-04-14 02:09:01 condorella/rx550 94741139 OK 6055200 6.39%; 13606 us/it; ETA 13d 23:12; 2b8af08eb69c5bb4 (check 5.61s)
2020-04-14 02:42:10 condorella/rx550 94741139 OK 6200000 6.54%; 13711 us/it; ETA 14d 01:13; 36657b8d4cf7b2b8 (check 5.63s)
2020-04-14 03:27:58 condorella/rx550 94741139 OK 6400000 6.76%; 13717 us/it; ETA 14d 00:36; 4941c60b1a288320 (check 5.63s)
2020-04-14 04:13:45 condorella/rx550 94741139 OK 6600000 6.97%; 13717 us/it; ETA 13d 23:50; 67aa94150e6fcccf (check 5.62s)
2020-04-14 04:59:31 condorella/rx550 94741139 OK 6800000 7.18%; 13712 us/it; ETA 13d 22:58; cab0b7a0fb0cc066 (check 5.65s)
2020-04-14 05:45:17 condorella/rx550 94741139 EE 7000000 7.39%; 13710 us/it; ETA 13d 22:09; 5e731e02beb738ea (check 5.61s)
2020-04-14 05:45:23 condorella/rx550 94741139 OK 6800000 loaded: blockSize 400, cab0b7a0fb0cc066
2020-04-14 05:54:37 condorella/rx550 94741139 OK 6840000 7.22%; 13711 us/it; ETA 13d 22:46; 4f7b98cea0650fb9 (check 5.63s) 1 errors
2020-04-14 06:22:07 condorella/rx550 94741139 OK 6960000 7.35%; 13714 us/it; ETA 13d 22:24; a47542d527e8a188 (check 5.63s) 1 errors
2020-04-14 06:49:37 condorella/rx550 94741139 EE 7080000 7.47%; 13711 us/it; ETA 13d 21:53; b71198a3d710f35b (check 5.62s) 1 errors
2020-04-14 06:49:43 condorella/rx550 94741139 OK 6960000 loaded: blockSize 400, a47542d527e8a188
2020-04-14 07:09:07 condorella/rx550 94741139 OK 7040000 7.43%; 13716 us/it; ETA 13d 22:08; b7ef942604ff7e9d (check 5.62s) 2 errors
[/CODE] |
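For anyone who wants to tally the OK/EE checks in a log like the one above, here is a rough Python sketch. The regular expression is inferred only from the progress lines shown in this post, so it is an assumption about the log format rather than anything taken from the gpuowl source. The ETA helper likewise assumes the plain estimate (exponent - iteration) * us/it, which may not be exactly how gpuowl computes it.

```python
import re

# Pattern inferred from the progress lines in the posted log (an assumption,
# not taken from the gpuowl source). "loaded:" lines deliberately do not match.
LINE_RE = re.compile(
    r"(?P<exp>\d+) (?P<status>OK|EE)\s+(?P<iter>\d+)\s+[\d.]+%; "
    r"(?P<us>\d+) us/it; ETA (?P<eta>\S+ \S+); (?P<res>[0-9a-f]{16})"
)


def parse_line(line):
    """Return a dict for a progress line, or None for any other line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "exponent": int(m.group("exp")),
        "status": m.group("status"),
        "iteration": int(m.group("iter")),
        "us_per_it": int(m.group("us")),
        "eta": m.group("eta"),
        "res64": m.group("res"),
    }


def estimate_eta(exponent, iteration, us_per_it):
    """Remaining time as (days, hours, minutes), assuming constant us/it."""
    secs = (exponent - iteration) * us_per_it // 1_000_000
    days, rem = divmod(secs, 86400)
    hours, rem = divmod(rem, 3600)
    return days, hours, rem // 60


def count_failed_checks(lines):
    """Number of progress lines whose check status is EE."""
    parsed = (parse_line(line) for line in lines)
    return sum(1 for rec in parsed if rec and rec["status"] == "EE")


if __name__ == "__main__":
    sample = ("2020-04-14 02:09:01 condorella/rx550 94741139 OK 6055200 "
              "6.39%; 13606 us/it; ETA 13d 23:12; 2b8af08eb69c5bb4 "
              "(check 5.61s)")
    rec = parse_line(sample)
    print(rec)
    print(estimate_eta(rec["exponent"], rec["iteration"], rec["us_per_it"]))
```

On the first progress line of the log the simple estimate lands within about a minute of the ETA that gpuowl printed.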
Luckily it was a motherboard that failed for me not the R7. I had a mammoth battle installing rocm-3.3.0 onto a different Debian Buster machine, my desktop, which involved a kernel upgrade and it works! With the memory at stock: 0 1000 820 1050 4 and two instances running I am getting 1409us/it each :smile:
|
[QUOTE=paulunderwood;542641]Luckily it was a motherboard that failed for me not the R7. I had a mammoth battle installing rocm-3.3.0 onto a different Debian Buster machine, my desktop, which involved a kernel upgrade and it works! With the memory at stock: 0 1000 820 1050 4 and two instances running I am getting 1409us/it each :smile:[/QUOTE]
Good for you - what sclk setting is the quoted timing at, and what's the total system wall wattage, if you have it running through a wattmeter? |
[QUOTE=ewmayer;542680]Good for you - what sclk setting is the quoted timing at, and what's the total system wall wattage, if you have it running through a wattmeter?[/QUOTE]
No meter. At sclk 4 it is drawing 214 watts according to sensors. The odd thing I noticed is that timings for a PRP test used to go down when the other instance was running P-1, but now (as a desktop GPU) it goes up when P-1 is running. |
[QUOTE=paulunderwood;542685]No meter. At sclk 4 it is drawing 214 watts according to sensors.
The odd thing I noticed is that timings for a PRP test used to go down when the other instance was running P-1, but now (as a desktop GPU) it goes up when P-1 is running.[/QUOTE]
I've noticed similar timing effects with one-PRP-one-P-1 ... I think it's due to some kind of internal GPU task-priority setting, which (AFAIK) the user has no control over. |
gpuowl-win v6.11-259-g83434d8 build fail
some usual-looking warnings, then:[CODE]Gpu.cpp: In member function 'void Gpu::printRoundoff(u32)':
Gpu.cpp:844:35: error: 'M_PI' was not declared in this scope; did you mean 'M_PIl'?
  844 |   double beta = sdev * (sqrt(6) / M_PI);
      |                                   ^~~~
      |                                   M_PIl
make: *** [Makefile:30: Gpu.o] Error 1
[/CODE] |
gpuowl-win v6.11-259-g83434d8 build
2 Attachment(s)
After the trivial edit, see preceding post, built ok.
|
[QUOTE=preda;542577]LL is "naked", no error check at all. Please try/tune combinations on PRP, which will help detect the invalid ones. Only after validation with PRP use any combination for LL.[/QUOTE]
Yes, be very careful tuning LL performance. I tried using the fastest settings that did not zero the residue, but the final residue did not match that of the initial LL test on the exponent. I was suspicious, so I did not turn in the result and instead reran the LL test with the settings from PRP tuning that I had used for my other successful LL double checks. This time the residue matched the first test, which completed the double check. So my first run was faulty because its settings were too aggressive. |