#2685 |
"Mike"
Aug 2002
2³·19·53 Posts |
Intel UHD Graphics 610
Code:
2021-01-12 06:32:56 Intel(R) UHD Graphics 610-0 OpenCL compilation in 6.50 s
2021-01-12 06:33:20 Intel(R) UHD Graphics 610-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2021-01-12 06:34:24 Intel(R) UHD Graphics 610-0 77936867 OK 800 0.00%; 53884 us/it; ETA 48d 14:32; 1579c241dc63eca6 (check 21.50s)
2021-01-12 06:43:02 Intel(R) UHD Graphics 610-0 77936867 OK 10000 0.01%; 53947 us/it; ETA 48d 15:45; fc4f135f7cf4ad29 (check 21.46s)
2021-01-12 06:52:20 Intel(R) UHD Graphics 610-0 77936867 OK 20000 0.03%; 53590 us/it; ETA 48d 07:53; 3cd1bd9d5e09cbc5 (check 21.44s)
2021-01-12 07:02:27 Intel(R) UHD Graphics 610-0 77936867 OK 30000 0.04%; 58429 us/it; ETA 52d 16:28; c4e0ff35e3290d98 (check 22.70s)
2021-01-12 07:12:15 Intel(R) UHD Graphics 610-0 77936867 OK 40000 0.05%; 56653 us/it; ETA 51d 01:52; dffe1b1b0d748128 (check 22.27s)
2021-01-12 07:22:05 Intel(R) UHD Graphics 610-0 77936867 OK 50000 0.06%; 56585 us/it; ETA 51d 00:14; 52e286945371ed29 (check 23.69s)
2021-01-12 07:32:33 Intel(R) UHD Graphics 610-0 77936867 OK 60000 0.08%; 60573 us/it; ETA 54d 14:20; 0945da4dc08bdd95 (check 22.21s)
2021-01-12 07:42:47 Intel(R) UHD Graphics 610-0 77936867 OK 70000 0.09%; 59272 us/it; ETA 53d 10:02; 7131fa4eb77f4bb2 (check 21.81s)
2021-01-12 07:52:06 Intel(R) UHD Graphics 610-0 77936867 OK 80000 0.10%; 53773 us/it; ETA 48d 10:57; 8d76071d27ee4221 (check 21.37s)
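The ETA column in these logs can be roughly reproduced from the other fields. The sketch below assumes the PRP test needs about one iteration per bit of the exponent (consistent with the percentages logged, e.g. 100000 of 77936867 shown as 0.13%) and that the ETA is simply the remaining iterations times the current us/it. The function names are made up for illustration and are not part of gpuowl.

```python
def eta_seconds(exponent: int, iteration: int, us_per_it: float) -> float:
    """Remaining time in seconds, assuming one iteration per bit of the exponent."""
    return (exponent - iteration) * us_per_it / 1_000_000


def format_eta(seconds: float) -> str:
    """Format seconds as 'Dd HH:MM', dropping any leftover seconds."""
    minutes = int(seconds // 60)
    days, rem = divmod(minutes, 24 * 60)
    hours, mins = divmod(rem, 60)
    return f"{days}d {hours:02d}:{mins:02d}"


# Figures taken from log lines posted in this thread.
print(format_eta(eta_seconds(77936867, 800, 2769)))    # RTX 3070 at iteration 800
print(format_eta(eta_seconds(77936867, 800, 53884)))   # UHD Graphics 610 at iteration 800
```

The result can differ from the logged ETA by a minute or so, since the program's own rounding and averaging are not known here.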
#2686 |
"Mike"
Aug 2002
2³×19×53 Posts |
3070
Code:
2021-01-14 09:04:27 GeForce RTX 3070-0 OpenCL compilation in 0.02 s
2021-01-14 09:04:32 GeForce RTX 3070-0 77936867 OK 800 0.00%; 2769 us/it; ETA 2d 11:56; 1579c241dc63eca6 (check 1.14s)
2021-01-14 09:09:09 GeForce RTX 3070-0 77936867 OK 100000 0.13%; 2785 us/it; ETA 2d 12:14; 6d7296b9e2830f50 (check 1.15s)
2021-01-14 09:13:49 GeForce RTX 3070-0 77936867 OK 200000 0.26%; 2787 us/it; ETA 2d 12:11; f0b04b45b0855bd2 (check 1.15s)
2021-01-14 09:18:29 GeForce RTX 3070-0 77936867 OK 300000 0.38%; 2787 us/it; ETA 2d 12:06; 990aa099aad5bf9c (check 1.15s)
2021-01-14 09:23:09 GeForce RTX 3070-0 77936867 OK 400000 0.51%; 2788 us/it; ETA 2d 12:02; c03f94396a5aa29e (check 1.15s)
2021-01-14 09:27:49 GeForce RTX 3070-0 77936867 OK 500000 0.64%; 2787 us/it; ETA 2d 11:57; 591eecd8448042ad (check 1.15s)
#2687 |
"Mike"
Aug 2002
8056₁₀ Posts |
#2688 |
"Mike"
Aug 2002
1F78₁₆ Posts |
2080 Ti
Code:
2021-01-29 17:10:19 GeForce RTX 2080 Ti-0 OpenCL compilation in 1.66 s
2021-01-29 17:10:20 GeForce RTX 2080 Ti-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2021-01-29 17:10:23 GeForce RTX 2080 Ti-0 77936867 OK 800 0.00%; 2051 us/it; ETA 1d 20:24; 1579c241dc63eca6 (check 0.86s)
2021-01-29 17:17:10 GeForce RTX 2080 Ti-0 77936867 OK 200000 0.26%; 2040 us/it; ETA 1d 20:03; f0b04b45b0855bd2 (check 0.87s)
#2689 |
"Seth"
Apr 2019
269₁₀ Posts |
Is it possible to run stage 2 (of P-1/PM1) after factors were found in stage 1?
#2690 |
"Viliam Furík"
Jul 2018
Martin, Slovakia
446₁₀ Posts |
#2691 |
Aug 2020
2×7 Posts |
It's using 240W above system baseline running PRP with gpuowl-v7.2-21-g28dbf88.
Last fiddled with by Zenzoma on 2021-01-30 at 17:22 |
#2692 |
"Mike"
Aug 2002
2³×19×53 Posts |
3070
Code:
2021-02-11 16:59:30 GeForce RTX 3070-0 OpenCL compilation in 1.38 s
2021-02-11 16:59:32 GeForce RTX 3070-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2021-02-11 16:59:35 GeForce RTX 3070-0 77936867 OK 800 0.00%; 2836 us/it; ETA 2d 13:23; 1579c241dc63eca6 (check 1.17s)
2021-02-11 17:00:31 GeForce RTX 3070-0 77936867 OK 20000 0.03%; 2878 us/it; ETA 2d 14:17; 3cd1bd9d5e09cbc5 (check 1.16s)
2021-02-11 17:01:29 GeForce RTX 3070-0 77936867 OK 40000 0.05%; 2827 us/it; ETA 2d 13:10; dffe1b1b0d748128 (check 1.17s)
2021-02-11 17:02:27 GeForce RTX 3070-0 77936867 OK 60000 0.08%; 2837 us/it; ETA 2d 13:22; 0945da4dc08bdd95 (check 1.16s)
2021-02-11 17:03:25 GeForce RTX 3070-0 77936867 OK 80000 0.10%; 2822 us/it; ETA 2d 13:02; 8d76071d27ee4221 (check 1.16s)
2021-02-11 17:04:22 GeForce RTX 3070-0 77936867 OK 100000 0.13%; 2813 us/it; ETA 2d 12:49; 6d7296b9e2830f50 (check 1.16s)
2021-02-11 17:05:20 GeForce RTX 3070-0 77936867 OK 120000 0.15%; 2813 us/it; ETA 2d 12:49; 79ae5dad855057ad (check 1.16s)
2021-02-11 17:06:17 GeForce RTX 3070-0 77936867 OK 140000 0.18%; 2822 us/it; ETA 2d 12:59; e1db15f897271496 (check 1.16s)
2021-02-11 17:07:15 GeForce RTX 3070-0 77936867 OK 160000 0.21%; 2832 us/it; ETA 2d 13:11; 25b7b6206fc6f085 (check 1.18s)
2021-02-11 17:08:13 GeForce RTX 3070-0 77936867 OK 180000 0.23%; 2837 us/it; ETA 2d 13:16; 6bee5d054f770861 (check 1.16s)
2021-02-11 17:09:12 GeForce RTX 3070-0 77936867 OK 200000 0.26%; 2875 us/it; ETA 2d 14:05; f0b04b45b0855bd2 (check 1.16s)
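To compare cards from logs like these, it can help to pull the us/it figures out programmatically instead of reading them off by eye. This is a minimal sketch, assuming the progress-line layout seen in this thread (`<exponent> OK <iteration> <percent>%; <n> us/it; ...`); the regular expression and helper names are illustrative and not taken from gpuowl, and other gpuowl versions may log a different layout.

```python
import re
from statistics import mean

# Matches progress lines as they appear in this thread, e.g.
# "... 77936867 OK 800 0.00%; 2836 us/it; ETA 2d 13:23; ..."
PROGRESS = re.compile(r"(\d+) OK\s+(\d+)\s+[\d.]+%;\s+(\d+) us/it")


def parse_progress(line):
    """Return (exponent, iteration, us_per_it), or None for non-progress lines."""
    match = PROGRESS.search(line)
    return tuple(int(g) for g in match.groups()) if match else None


def mean_us_per_it(lines):
    """Average us/it over all progress lines; other lines are skipped."""
    return mean(p[2] for p in map(parse_progress, lines) if p)


# Sample lines copied from the posts above.
rtx3070 = [
    "2021-02-11 16:59:32 GeForce RTX 3070-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003",
    "2021-02-11 16:59:35 GeForce RTX 3070-0 77936867 OK 800 0.00%; 2836 us/it; ETA 2d 13:23; 1579c241dc63eca6 (check 1.17s)",
    "2021-02-11 17:00:31 GeForce RTX 3070-0 77936867 OK 20000 0.03%; 2878 us/it; ETA 2d 14:17; 3cd1bd9d5e09cbc5 (check 1.16s)",
]
uhd610 = [
    "2021-01-12 06:34:24 Intel(R) UHD Graphics 610-0 77936867 OK 800 0.00%; 53884 us/it; ETA 48d 14:32; 1579c241dc63eca6 (check 21.50s)",
    "2021-01-12 06:43:02 Intel(R) UHD Graphics 610-0 77936867 OK 10000 0.01%; 53947 us/it; ETA 48d 15:45; fc4f135f7cf4ad29 (check 21.46s)",
]

ratio = mean_us_per_it(uhd610) / mean_us_per_it(rtx3070)
print(f"UHD 610 takes about {ratio:.1f}x as long per iteration as the RTX 3070")
```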
#2693 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
7×719 Posts |
So many benchmarks in this thread. Many are much less useful or comparable than they could be, because they omit:
#2694 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5033₁₀ Posts |
Two great versions, head to head: P-1 speed, 103M, V6.11-380 vs v7.2-53

XFX Radeon VII, gpu ram clock 1150 MHz, Windows 10 X64 build 19041.867, Celeron G1840 cpu

V6.11-380-g79ea0cc
config.txt: -user kriesel -cpu asr2/radeonvii4 -d 4 -use NO_ASM -maxAlloc 15000 -cleanup
B1=1000000,B2=30000000;PFactor=AID,1,2,103218151,-1,76,2
P-1 start stage 1, 2021-03-21 11:17:00
P-1 begin stage 2, 2021-03-21 11:37:15, gpu stage 1 time 0:20:15
P-1 begin stage 2 GCD, 2021-03-21 12:02:56, gpu stage 2 time 0:25:41; gpu both stages time 0:45:56
P-1 finish last stage 2 GCD, 2021-03-21 12:03:50, total elapsed time 0:46:50
This program version can perform 24hr/0:45:56 = 31.35 wavefront P-1 to B1=1M, B2=30M per day on this gpu.
(The gpu time without the final gcd duration is used for computing throughput, since the gcd runs on the cpu in parallel with the start of the next worktodo entry on the gpu, and in this case the gpu is the limit on throughput.)

V7.2-53-ge27846f -maxalloc 15G
config.txt: -user kriesel -cpu asr2/radeonvii4 -d 4 -maxAlloc 15G -proof 9 -use NO_ASM -autoverify 10
B1=1000000,B2=30000000;PRP=AID,1,2,103276501,-1,76,2
P-1 start stage 1, 2021-03-25 17:34:31
P-1 begin stage 2, 2021-03-25 18:01:12, gpu stage 1 time 0:26:41
P-1 begin last stage 2 GCD, 2021-03-25 21:31:20, gpu stage 2 time 3:30:18, while no PRP progress occurs; gpu both stages time 3:56:59
P-1 finish last stage 2 GCD, 2021-03-25 21:32:17, total elapsed time 3:57:46
This program version can perform 24hr/3:56:59 = 6.08 wavefront P-1 to B1=1M, B2=30M per day on this gpu.
Even ignoring the stage 1 time, if the PRP progress occurring with stage 1 is useful, the drop in P-1 productivity is considerable.
(24hr/3:30:18 = 6.85/day of stage 2 time, 21.8% of the V6.11-380 rate of both stages. Is stage 2 being slowed by using too much gpu ram, causing paging? Retest with -maxalloc 14G.)

After system restart and v7.2-53 changed to -maxalloc 14G, run v7.2-53 again:
B1=1000000,B2=30000000;PRP=AID,1,2,103281509,-1,76,2
start program 2021-03-26 07:20:37 (don't count startup time, which is incurred once)
P-1 stage 1 start 2021-03-26 07:20:55
P-1 stage 1 end gpu compute, start stage 2 2021-03-26 07:46:35; stage 1 gpu time 0:25:40
P-1 stage 2 end gpu compute 2021-03-26 08:21:30, stage 2 gpu time 0:34:55, combined gpu time 1:00:35
P-1 stage 2 final gcd done & result written 2021-03-26 08:22:35, total elapsed time 1:01:40
(24hr/1:00:35 = 23.77 wavefront P-1 to B1=1M, B2=30M per day on this gpu as configured; needed lower -maxalloc than v6.11-380 for possibly full performance; 23.77/31.35 = 75.82% of v6.11-380 throughput, even after optimistically neglecting the V7.2 stage 1 cost)

V7.2-53 apparently does not support B2 extension after initial stage 2 completion, despite https://mersenneforum.org/showpost.p...8&postcount=61.
Test method:
Put B1=1000000,B2=30000000;PRP=AID,1,2,103276501,-1,76,2 in worktodo.
Run until those bounds are complete and the results file contains the P-1 result line, then halt the program.
Observe that the worktodo entry remains B1=1000000,B2=30000000;PRP=AID,1,2,103276501,-1,76,2; gpuowl did not change it to ...,76,0.
Modify it to B1=1000000,B2=40000000;PRP=AID,1,2,103276501,-1,76,2 and restart the program.
The program continues PRP only, without P-1 stage 2 from 30M to 40M or from 1M to 40M.
Perhaps the extension only works during the time window before B2=30000000 is reached.
Last fiddled with by kriesel on 2021-03-30 at 14:11
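The per-day throughput figures in the post above are 24 hours divided by the gpu time for both P-1 stages. A small sketch of that arithmetic, using the durations quoted in the post; `parse_hms` and `runs_per_day` are hypothetical helper names used only for this illustration.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse 'H:MM:SS' or 'MM:SS' into a timedelta."""
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)  # pad missing hours with zero
    hours, minutes, seconds = parts
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def runs_per_day(gpu_time: timedelta) -> float:
    """How many runs of this duration fit in 24 hours of gpu time."""
    return timedelta(days=1) / gpu_time


v6 = runs_per_day(parse_hms("0:45:56"))        # V6.11-380, both stages
v7_15g = runs_per_day(parse_hms("3:56:59"))    # v7.2-53 with -maxAlloc 15G
v7_14g = runs_per_day(parse_hms("1:00:35"))    # v7.2-53 with -maxalloc 14G

print(f"{v6:.2f} {v7_15g:.2f} {v7_14g:.2f} per day")
print(f"v7.2-53 at 14G reaches {100 * v7_14g / v6:.2f}% of V6.11-380 throughput")
```

Excluding the final gcd from the divisor follows the post's own reasoning that the gcd runs on the cpu while the gpu has already moved on to the next worktodo entry.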
#2695 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
7·719 Posts |
Quote:
Code:
Power  2^power  %max for transition to proof generation version (~point of no return/proof)
10     1024     0.0976
 9      512     0.1953
 8      256     0.3906
 7      128     0.7812
 6       64     1.5625
Occasionally I see it's getting close to a cutoff percentage, and save a copy of the exponent folder before continuing to benchmark non-proof versions for comparison purposes. A saved copy is also useful sometimes because gpuowl file versions migrate forwards within limits, but don't migrate backwards, and the fastest version is not always the newest proof-capable version or the same file version number.
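The percentages in the table are consistent with 100 / 2^power, cut off (not rounded) at four decimal places. The sketch below regenerates them on that assumption; it is inferred from the table values only, not from gpuowl's source, and `proof_cutoff_percent` is a made-up name.

```python
import math


def proof_cutoff_percent(power: int) -> float:
    """100 / 2**power, truncated (not rounded) to four decimal places."""
    return math.floor(100 / 2**power * 10_000) / 10_000


print("Power  2^power  %max")
for power in range(10, 5, -1):
    print(f"{power:>5}  {2**power:>7}  {proof_cutoff_percent(power):.4f}")
```

For an exponent near 77.9M and proof power 9, for example, 0.1953% corresponds to roughly the first 152,000 iterations.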