#1398 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
Quote:
|
|
|
|
|
|
|
#1399 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
Win7 X64 Pro, NVIDIA GTX1080Ti, gpuowl-win v6.11-6-g02fd645, M226m P-1 stage 2 continuation.
No -time:
Without -yield, it operates normally on the gpu but fully occupies a cpu core (in this case a hyperthread on one of the Xeon E5520 packages); a round took 9 minutes 24 seconds.
With -yield, zero cpu after 12 core-seconds of initialization, but also zero gpu load per GPU-Z, so probably zero progress.
With -time:
Without -yield, it operates normally on the gpu but fully occupies a cpu core (in this case a hyperthread on one of the Xeon E5520 packages); a round took 9 minutes 34.5 seconds, so -time overhead appears to be ~10 seconds / 564 seconds =~ 1.8%.
Code:
2019-09-22 11:27:32 226000127 P2 1628/2880: setup 4280 ms; 11400 us/prime, 51335 primes
2019-09-22 11:27:32 36.80% tailFusedMulDelta : 4118 us/call x 51335 calls
2019-09-22 11:27:32 33.56% carryFused : 3547 us/call x 54355 calls
2019-09-22 11:27:32 7.10% fftMiddleIn : 750 us/call x 54355 calls
2019-09-22 11:27:32 7.05% fftMiddleOut : 745 us/call x 54355 calls
2019-09-22 11:27:32 6.63% transposeW : 701 us/call x 54355 calls
2019-09-22 11:27:32 6.56% transposeH : 693 us/call x 54355 calls
2019-09-22 11:27:32 1.58% fftH : 1507 us/call x 6040 calls
2019-09-22 11:27:32 0.72% multiply : 1371 us/call x 3020 calls
2019-09-22 11:27:32 Total time 574.506 s
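For anyone wanting to redo the overhead arithmetic from the two round times, here is a minimal sketch. The helper names `to_seconds` and `overhead_percent` are made up for illustration; the only inputs are the 9 min 24 s and 9 min 34.5 s round times reported in this post.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to seconds."""
    return 60 * minutes + seconds


def overhead_percent(baseline_s, measured_s):
    """Relative slowdown of measured_s versus baseline_s, in percent."""
    return 100.0 * (measured_s - baseline_s) / baseline_s


baseline = to_seconds(9, 24)    # round time without -time: 564 s
timed = to_seconds(9, 34.5)     # round time with -time: 574.5 s
print(f"-time overhead: {overhead_percent(baseline, timed):.2f}%")
```

This is a single-round comparison, so it carries the same caveats as the figures above: run-to-run variation is not accounted for.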
|
|
|
|
|
#1400 |
|
"Eric"
Jan 2018
USA
2²·53 Posts |
|
|
|
|
|
|
#1401 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
Separate system, dual xeon e5-2670, Win7 X64 Pro, NVIDIA GTX1080, gpuowl-win v6.11-6-g02fd645, M228m P-1, similar behavior.
|
|
|
|
|
|
#1402 |
|
"Mihai Preda"
Apr 2015
3×457 Posts |
|
|
|
|
|
|
#1403 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
Quote:
Code:
C:\Users\ken\Documents\v6.11-9-g9ae3189>gpuowl-win -device 0 -use ORIG_X2 -user kriesel -cpu emu/gtx1080 -maxAlloc 8000 -yield
2019-09-22 17:42:39 gpuowl v6.11-9-g9ae3189
2019-09-22 17:42:39 Note: no config.txt file found
2019-09-22 17:42:39 config: -device 0 -use ORIG_X2 -user kriesel -cpu emu/gtx1080 -maxAlloc 8000 -yield
2019-09-22 17:42:39 228000037 FFT 14336K: Width 256x4, Height 256x4, Middle 7; 15.53 bits/word
2019-09-22 17:42:40 OpenCL args "-DEXP=228000037u -DWIDTH=1024u -DSMALL_HEIGHT=1024u -DMIDDLE=7u -DWEIGHT_STEP=0xb.12354e6de8db8p-3 -DIWEIGHT_STEP=0xb.8fc56ff3fadcp-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-09-22 17:42:40
2019-09-22 17:42:40 OpenCL compilation in 22 ms
2019-09-22 17:42:44 228000037 P1 B1=1840000, B2=42320000; 2654010 bits; starting at 1083301
2019-09-22 17:44:16 228000037 P1 1090000 41.07%; 13745 us/sq; ETA 0d 05:58; 646ebd24b9141139
2019-09-22 17:46:33 228000037 P1 1100000 41.45%; 13754 us/sq; ETA 0d 05:56; 5b076380f84fa1f8
2019-09-22 17:48:52 228000037 P1 1110000 41.82%; 13821 us/sq; ETA 0d 05:56; 49cac9f30cafb667
2019-09-22 17:51:09 228000037 P1 1120000 42.20%; 13768 us/sq; ETA 0d 05:52; 49039a105d434d61
2019-09-22 17:53:28 228000037 P1 1130000 42.58%; 13831 us/sq; ETA 0d 05:51; aed916597692a26e
2019-09-22 17:55:45 228000037 P1 1140000 42.95%; 13763 us/sq; ETA 0d 05:47; 0a39a801f50514e8
2019-09-22 17:58:04 228000037 P1 1150000 43.33%; 13877 us/sq; ETA 0d 05:48; a69b4685a5d5e8ed
2019-09-22 18:00:22 228000037 P1 1160000 43.71%; 13764 us/sq; ETA 0d 05:43; 8ba2709ae1589129
2019-09-22 18:02:39 228000037 P1 1170000 44.08%; 13760 us/sq; ETA 0d 05:40; f69bffc29181eec2
2019-09-22 18:04:58 228000037 P1 1180000 44.46%; 13826 us/sq; ETA 0d 05:40; e55aa4dce17619d2
2019-09-22 18:07:15 228000037 P1 1190000 44.84%; 13767 us/sq; ETA 0d 05:36; bd8a0062f3e8109b
2019-09-22 18:09:33 228000037 P1 1200000 45.21%; 13823 us/sq; ETA 0d 05:35; 15f4486494abaf74
2019-09-22 18:11:51 228000037 P1 1210000 45.59%; 13767 us/sq; ETA 0d 05:31; a652297a1008f956
2019-09-22 18:14:10 228000037 P1 1220000 45.97%; 13842 us/sq; ETA 0d 05:31; 78094c385b32ceac
2019-09-22 18:14:16 Stopping, please wait..
2019-09-22 18:14:17 Exiting because "stop requested"
2019-09-22 18:14:17 Bye
Terminate batch job (Y/N)? n
C:\Users\ken\Documents\v6.11-9-g9ae3189>gpuowl-win -device 0 -use ORIG_X2 -user kriesel -cpu emu/gtx1080 -maxAlloc 8000 -yield -time
2019-09-22 18:14:40 gpuowl v6.11-9-g9ae3189
2019-09-22 18:14:40 Note: no config.txt file found
2019-09-22 18:14:40 config: -device 0 -use ORIG_X2 -user kriesel -cpu emu/gtx1080 -maxAlloc 8000 -yield -time
2019-09-22 18:14:40 228000037 FFT 14336K: Width 256x4, Height 256x4, Middle 7; 15.53 bits/word
2019-09-22 18:14:40 OpenCL args "-DEXP=228000037u -DWIDTH=1024u -DSMALL_HEIGHT=1024u -DMIDDLE=7u -DWEIGHT_STEP=0xb.12354e6de8db8p-3 -DIWEIGHT_STEP=0xb.8fc56ff3fadcp-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-09-22 18:14:40
2019-09-22 18:14:40 OpenCL compilation in 25 ms
2019-09-22 18:14:45 228000037 P1 B1=1840000, B2=42320000; 2654010 bits; starting at 1220501
2019-09-22 18:16:57 228000037 P1 1230000 46.34%; 13941 us/sq; ETA 0d 05:31; d10c1a457f57634c
2019-09-22 18:16:57 36.96% tailFused : 5058 us/call x 9499 calls
2019-09-22 18:16:57 17.03% carryFused : 4762 us/call x 4650 calls
2019-09-22 18:16:57 16.21% carryFusedMul : 4347 us/call x 4848 calls
2019-09-22 18:16:57 7.52% transposeW : 1029 us/call x 9499 calls
2019-09-22 18:16:57 7.47% transposeH : 1022 us/call x 9499 calls
2019-09-22 18:16:57 7.41% fftMiddleIn : 1014 us/call x 9499 calls
2019-09-22 18:16:57 7.39% fftMiddleOut : 1011 us/call x 9499 calls
2019-09-22 18:16:57 Total time 129.985 s
Last fiddled with by kriesel on 2019-09-23 at 00:12 |
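As a sanity check on the -time profile, each kernel's share can be approximately reconstructed as (us/call x calls) / total time. The sketch below assumes the profile line layout seen in the log above; the regular expression and the `kernel_shares` helper are illustrative and not part of gpuowl.

```python
import re

# Assumed layout: "<pct>% <kernel> : <us> us/call x <n> calls"
PROFILE_RE = re.compile(r"(\d+\.\d+)%\s+(\w+)\s*:\s*(\d+) us/call x\s*(\d+) calls")


def kernel_shares(lines, total_s):
    """Return {kernel: percent of total_s}, recomputed from us/call and call count."""
    shares = {}
    for line in lines:
        match = PROFILE_RE.search(line)
        if match:
            _, name, us_per_call, calls = match.groups()
            kernel_s = int(us_per_call) * int(calls) / 1e6
            shares[name] = 100.0 * kernel_s / total_s
    return shares


# Three lines copied from the -yield -time run above.
lines = [
    "2019-09-22 18:16:57 36.96% tailFused : 5058 us/call x 9499 calls",
    "2019-09-22 18:16:57 17.03% carryFused : 4762 us/call x 4650 calls",
    "2019-09-22 18:16:57 16.21% carryFusedMul : 4347 us/call x 4848 calls",
]
for name, pct in kernel_shares(lines, total_s=129.985).items():
    print(f"{name}: {pct:.2f}%")
```

Because us/call is printed as a rounded integer, the recomputed shares can differ from the logged percentages in the last digit.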
|
|
|
|
|
|
#1404 |
|
"Eric"
Jan 2018
USA
2²·53 Posts |
On Windows, the -yield option works perfectly for PRP, dropping my CPU usage from about 5.5% of 16 threads down to almost nothing. The time per iteration rises slightly, from around 860 us/it to 880 us/it, which is insignificant, and the freed CPU time should more than compensate for it. Thanks Preda for addressing this bug (blame lies with Nvidia for sure).
Last fiddled with by xx005fs on 2019-09-23 at 01:21 |
|
|
|
|
|
#1405 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100111101₂ Posts |
PRP on GTX1080Ti on gpuowl V6.11-9 with -yield seems to be within 2% of gpu throughput of v6.7-4 (which saturates a cpu core). Observed prime95 throughput penalty with v6.7's cpu use was about 0.5% (2% of one of the 4 workers), thanks to hyperthreading mitigating the impact somewhat. These figures are very approximate. A more accurate check would use about an hour in each condition after ignoring the initial startup of 10 minutes or so for thermal stabilization.
Code:
2019-09-23 09:10:52 gpuowl v6.7-4-g278407a
2019-09-23 09:10:53 Note: no config.txt file found
2019-09-23 09:10:53 config: -device 0 -use ORIG_X2 -maxAlloc 10240 -user kriesel -cpu dodo-gtx1080ti
2019-09-23 09:10:53 87005279 FFT 5120K: Width 256x4, Height 64x4, Middle 10; 16.59 bits/word
2019-09-23 09:10:53 using short carry kernels
2019-09-23 09:10:53 OpenCL args "-DEXP=87005279u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xa.97d8cd06772f8p-3 -DIWEIGHT_STEP=0xc.1551b6b1158dp-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-09-23 09:10:53
2019-09-23 09:10:53 OpenCL compilation in 97 ms
2019-09-23 09:10:55 87005279.owl loaded: k 25172000, block 500, res64 2736c9728212e62e
2019-09-23 09:11:03 87005279 OK 25173000 28.93%; 3406 us/sq; ETA 2d 10:30; 8f25ad724e654078 (check 2.09s)
2019-09-23 09:12:36 87005279 25200000 28.96%; 3448 us/sq; ETA 2d 11:12; 7670ca7fa4cba9de
2019-09-23 09:15:32 87005279 OK 25250000 29.02%; 3472 us/sq; ETA 2d 11:34; 1d799dd231b858fc (check 2.11s)
2019-09-23 09:18:27 87005279 25300000 29.08%; 3513 us/sq; ETA 2d 12:12; 2ec8f55bc1a420aa
2019-09-23 09:21:07 Stopping, please wait..
2019-09-23 09:21:09 87005279 OK 25345500 29.13%; 3515 us/sq; ETA 2d 12:12; b879e7272e09c388 (check 2.12s)
2019-09-23 09:21:10 Exiting because "stop requested"
2019-09-23 09:21:10 Bye
Code:
2019-09-23 09:23:09 gpuowl v6.11-9-g9ae3189
2019-09-23 09:23:09 Note: no config.txt file found
2019-09-23 09:23:09 config: -device 0 -use ORIG_X2 -user kriesel -cpu dodo/gtx1080ti -maxAlloc 10240 -yield
2019-09-23 09:23:09 87005279 FFT 5120K: Width 256x4, Height 64x4, Middle 10; 16.59 bits/word
2019-09-23 09:23:10 OpenCL args "-DEXP=87005279u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xa.97d8cd06772f8p-3 -DIWEIGHT_STEP=0xc.1551b6b1158dp-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-09-23 09:23:10
2019-09-23 09:23:10 OpenCL compilation in 25 ms
2019-09-23 09:23:19 87005279 OK 25346500 29.13%; 3487 us/sq; ETA 2d 11:43; 2cdfabbcb0e97413 (check 2.15s)
2019-09-23 09:23:32 87005279 25350000 29.14%; 3501 us/sq; ETA 2d 11:58; 5921518eec88bf66
2019-09-23 09:26:28 87005279 25400000 29.19%; 3532 us/sq; ETA 2d 12:26; d6307af21b7c7f77
2019-09-23 09:29:26 87005279 25450000 29.25%; 3555 us/sq; ETA 2d 12:47; f9570edb50396289
2019-09-23 09:32:26 87005279 OK 25500000 29.31%; 3559 us/sq; ETA 2d 12:48; 076dfe1049b7bc9e (check 2.12s)
Last fiddled with by kriesel on 2019-09-23 at 14:43 |
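A rough way to compare the two runs is to pull the us/sq figures out of the progress lines and compare their averages. The sketch below assumes the "NNNN us/sq" field format seen in the logs above; the `mean_us_per_sq` helper is hypothetical, and an unweighted mean over reporting intervals of unequal length is only an approximation.

```python
import re
from statistics import mean

US_PER_SQ_RE = re.compile(r"(\d+) us/sq")


def mean_us_per_sq(log_lines):
    """Unweighted mean of the us/sq values found in gpuowl progress lines."""
    values = []
    for line in log_lines:
        match = US_PER_SQ_RE.search(line)
        if match:
            values.append(int(match.group(1)))
    return mean(values)


# Progress lines copied from the two logs above.
v6_7 = [
    "2019-09-23 09:11:03 87005279 OK 25173000 28.93%; 3406 us/sq; ETA 2d 10:30; 8f25ad724e654078 (check 2.09s)",
    "2019-09-23 09:12:36 87005279 25200000 28.96%; 3448 us/sq; ETA 2d 11:12; 7670ca7fa4cba9de",
    "2019-09-23 09:15:32 87005279 OK 25250000 29.02%; 3472 us/sq; ETA 2d 11:34; 1d799dd231b858fc (check 2.11s)",
    "2019-09-23 09:18:27 87005279 25300000 29.08%; 3513 us/sq; ETA 2d 12:12; 2ec8f55bc1a420aa",
    "2019-09-23 09:21:09 87005279 OK 25345500 29.13%; 3515 us/sq; ETA 2d 12:12; b879e7272e09c388 (check 2.12s)",
]
v6_11_yield = [
    "2019-09-23 09:23:19 87005279 OK 25346500 29.13%; 3487 us/sq; ETA 2d 11:43; 2cdfabbcb0e97413 (check 2.15s)",
    "2019-09-23 09:23:32 87005279 25350000 29.14%; 3501 us/sq; ETA 2d 11:58; 5921518eec88bf66",
    "2019-09-23 09:26:28 87005279 25400000 29.19%; 3532 us/sq; ETA 2d 12:26; d6307af21b7c7f77",
    "2019-09-23 09:29:26 87005279 25450000 29.25%; 3555 us/sq; ETA 2d 12:47; f9570edb50396289",
    "2019-09-23 09:32:26 87005279 OK 25500000 29.31%; 3559 us/sq; ETA 2d 12:48; 076dfe1049b7bc9e (check 2.12s)",
]

old, new = mean_us_per_sq(v6_7), mean_us_per_sq(v6_11_yield)
slowdown = 100.0 * (new - old) / old
print(f"v6.7-4: {old:.1f} us/sq, v6.11-9 -yield: {new:.1f} us/sq, slowdown {slowdown:.2f}%")
```

Both runs were short and still warming up, so the result is only indicative; a longer run in each condition would be needed for a firm number.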
|
|
|
|
|
#1406 |
|
Sep 2002
Database er0rr
EB2₁₆ Posts |
Have there been any developments on getting gpuOwl to crunch Wagstaff numbers?
|
|
|
|
|
|
#1407 |
|
"Mihai Preda"
Apr 2015
3×457 Posts |
|
|
|
|
|
|
#1408 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
12475₈ Posts |
Quote:
I think those would be straightforward to implement. (Following numbering arbitrary.)
|
|
|
|
|