mersenneforum.org GpuOwl 7.x

2020-12-13, 04:31   #188
preda

"Mihai Preda"
Apr 2015

2×677 Posts

Quote:
 Originally Posted by ixfd64 Has 6.x reached end-of-life, or are you going to continue updating it alongside the 7.x branch?
I'm personally not using 6.x myself, thus I don't have a lot of motivation to improve it. From my POV, 7.x is now better in a couple of ways than 6.x, and I prefer to focus my (limited) resources on 7.x. OTOH in the open spirit I'm not against others having different opinions, and using different versions in different ways etc.

2020-12-13, 21:45   #189
frmky

Jul 2003
So Cal

2×3×347 Posts

P-1 factors seem to always be reported as found by stage 1. This leads to confusing reports like https://www.mersenne.ca/exponent/104592713. Can this be fixed easily, or is this behavior intentional?

2020-12-13, 22:40   #190
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1001110011100₂ Posts

Not a problem in gpuowl v6.11-380 (see 101224807), which, unlike gpuowl v7.x, only does the stage 2 GCD once, after the full B2, so there's no ambiguity.

Last fiddled with by kriesel on 2020-12-13 at 22:41

2020-12-19, 15:00   #191
storm5510
Random Account

Aug 2009

19·101 Posts

Quote:
 Originally Posted by preda I'm personally not using 6.x myself, thus I don't have a lot of motivation to improve it. From my POV, 7.x is now better in a couple of ways than 6.x, and I prefer to focus my (limited) resources on 7.x. OTOH in the open spirit I'm not against others having different opinions, and using different versions in different ways etc.
From my POV, removing the ability to run P-1's only is unfortunate...

2020-12-19, 21:14   #192
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

468₁₀ Posts

Yes, that would exclude everybody who is doing P-1 for exponent with known residues.

Last fiddled with by kruoli on 2020-12-19 at 21:15 Reason: Conjunctive.

2020-12-20, 18:56   #193
storm5510
Random Account

Aug 2009

19·101 Posts

Quote:
 Originally Posted by kruoli Yes, that would exclude everybody who is doing P-1 for exponent with known residues.
I run first-time P-1's from Primenet by manual reservation. The only thing done to them prior is TF to a specific bit level; currently, it is 2^76. I can also get these from GPUto72's website. A P-1 with a "known residue" would indicate to me that it has a factor, meaning it is composite and requires no further effort.

2021-01-17, 05:21   #194
Cheetahgod

May 2020

52 Posts

Does anyone have any benchmarks running gpuowl on the new gtx 30 series or the new AMD 6800x series gpus?

2021-01-17, 05:43   #195
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006

2³·5²·23 Posts

Quote:
 Originally Posted by Cheetahgod Does anyone have any benchmarks running gpuowl on the new gtx 30 series or the new AMD 6800x series gpus?
If you buy me one of each I'll run benchmarks.
Deal?

2021-01-17, 12:52   #196
Viliam Furik

"Viliam Furík"
Jul 2018
Martin, Slovakia

443₁₀ Posts

Quote:
 Originally Posted by Cheetahgod Does anyone have any benchmarks running gpuowl on the new gtx 30 series or the new AMD 6800x series gpus?
You should be able to find them in the main gpuOwl thread, and some of them may be in the Perpetual benchmarking thread.

2021-03-07, 19:14   #197
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

11634₈ Posts

Gpuowl speed regression

Among the v7.2 builds I've made and run on Win 10, -53 is fastest. But for straight PRP, v6.11-364 or -380 are faster still. There are some caveats to that result summary: tested on one system & one OS, with multiple fast gpus supported by a slow cpu. But some cases were substantial enough to more than erase the ostensible advantage of doing PRP & P-1 stage 1 simultaneously with many of the same squarings, compared to running separate V6.11 P-1 to normal bounds followed by V6.11 PRP. That's comparing time per iteration after initialization time, bringing gpu up to temperature for stable ~steady state iteration times, etc. Additional initialization in V7.2 puts it at an additional disadvantage (trig table setup for example).

In all V7.2 tests mentioned here, PRP iterations were comparable to V6.11 iterations: PRP was continuing in iterations after P-1 had already completed, or P-1 was suppressed by B1=0,B2=0 in the worktodo line or tests-saved=0 in the worktodo line for V7.2-x timing tests.

Part one
PRP iteration speed regression for gpuowl-win v7.2-x on Radeon VII, Win 10 Pro x64
Variation ~3-7%
623M 7312/7087 = 1.032
927M 12263/11458 = 1.070
Code:
gpuowl    us/it     us/it
version   623M      927M
V7.2-21   7119      11960
V7.2-69   7312      12257
V7.2-63   7311      12263
V7.2-53   x         x
system restart after these attempts
v7.2-53   7088      11459
V7.2-39   fatal cl compile errors, so no timings possible
V7.2-21   7106      11955  * good repeatability after restart and other versions
v7.2-13   7141      12365
v7.2-53   7087      11458  * fastest and very good repeatability of timing, saves ~8 days on 927M

fft size  36M       52M
split     4k:9:512  4K:13:512
gpu #     2         3

x = failed to resume after running v7.2-63; stalling after opening few lines.

This multigpu slow-cpu system sometimes exhibits loss of communication with a gpu, which can bog down other gpus also while system interrupts occupy 1 of the 2 cpu cores for ~1 hour or until a restart clears the condition.
Note, gpus on this system are configured to run at reduced electrical power.

Part two
100M, 300M, 900M, v6.11 & 7.2-53
PRP iteration speed regression for gpuowl-win v6.11-x on Radeon VII, Win 10 Pro x64
Variation 1.2 - 5.2%
100M 869/826 = 1.052, v6.11-380 fastest, > ~1.025 cost of usual-bounds (P-1+PRP) / PRP
300M 3360/3310 = 1.015, v6.11-364 fastest, > ~1.01 cost of usual-bounds (P-1 stage 1 + PRP) / PRP
900M 10895/10763 = 1.012, V6.11-364 -380 tie for fastest, ~1.01 cost of usual-bounds (P-1 stage 1 + PRP) / PRP
Differences between v6.11-364 and v6.11-380 were +/- 1 count, not significant.
Code:
gpuowl    us/it     us/it      us/it
version   300M      103M       900M
6.11-380  3311      826        10763
6.11-364  3310      827        10763
7.2-53    3360      869        10895
6.11-318  3327      836        10815

fft size  16M       5.5M       52M
split     1k:8:1k   1K:11:256  4k:13:512
gpu #     4         4          4

Note, gpus on this system are configured to run at reduced electrical power.
The 300M run in progress was subsequently found to have mismatching checksums on some early interim proof files and did not produce a proof. Its logs showed it had no problem verifying its proof before the v6.11-380, -364, and -318 were alternately run on the existing interim file sets.

Last fiddled with by kriesel on 2021-03-07 at 20:10

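As a rough cross-check of the ratios and the "~8 days" figure in the tables above, here is a minimal Python sketch. It assumes a PRP test needs roughly one squaring per bit of the exponent and treats "927M" as 927,000,000 iterations; both are approximations, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def slowdown(slow_us: float, fast_us: float) -> float:
    """Ratio of two per-iteration timings (both in microseconds)."""
    return slow_us / fast_us


def days_saved(slow_us: float, fast_us: float, iterations: int) -> float:
    """Wall-clock days saved over a whole run by the faster build.

    Assumption: `iterations` is about equal to the exponent.
    """
    seconds = (slow_us - fast_us) * 1e-6 * iterations
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Part one: v7.2-63 vs v7.2-53 at the 927M exponent
    print(round(slowdown(12263, 11458), 3))
    print(round(days_saved(12263, 11458, 927_000_000), 1))
    # Part two: v7.2-53 vs v6.11-380 at the ~100M exponent
    print(round(slowdown(869, 826), 3))
```

Under these assumptions the 927M case works out to between 8 and 9 days, which is in line with the note in the first table.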
2021-03-07, 20:06   #198
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²·5·251 Posts

Quote:
 Originally Posted by storm5510 From my POV, removing the ability to run P-1's only is unfortunate...
I think that can be approximated in v7.2 by using
Code:
-iters <N>         : run next PRP test for <N> iterations and exit. Multiple of 10000.
Select N to be just over what is required for P-1 stage 1. You'll get a little bit of PRP iteration after stage 2, depending on N and bounds. Also specify traditional bounds (~1M B1, 30M B2 for ~103M exponent), not the much higher v7.2 defaults. I think if run briefly without -iters it will show how many it takes. Or estimate by 1.443? x B1 and add a bit.
It may be faster though to stick with v6.11-380 for P-1, if the system is reliable enough and the exponent P-1 quick enough not to need the better error detection in P-1 of V7.2.
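The "1.443 x B1" rule of thumb can be sketched in Python. The constant is 1/ln 2 (about 1.4427), because the product of all prime powers up to B1, i.e. lcm(1..B1), has roughly B1/ln 2 bits. Whether gpuowl's actual stage 1 iteration count matches this exactly is an assumption on my part, the 2% padding is an arbitrary choice for "add a bit", and the function names are hypothetical.

```python
import math


def stage1_bits_estimate(b1: int) -> int:
    """Approximate bit length of lcm(1..B1) via log2(lcm(1..B1)) ~ B1 / ln 2."""
    return math.ceil(b1 / math.log(2))


def stage1_bits_exact(b1: int) -> int:
    """Exact bit length of lcm(1..B1); only practical for small B1."""
    acc = 1
    for k in range(2, b1 + 1):
        acc = acc * k // math.gcd(acc, k)
    return acc.bit_length()


def iters_for_stage1(b1: int, margin: float = 0.02, step: int = 10_000) -> int:
    """Pad the estimate by `margin`, then round up to a multiple of `step`,
    since -iters is documented as a multiple of 10000."""
    padded = stage1_bits_estimate(b1) * (1 + margin)
    return math.ceil(padded / step) * step


if __name__ == "__main__":
    print(iters_for_stage1(1_000_000))  # candidate value for -iters with B1 = 1M
```

For B1 = 1M this suggests 1480000. Running briefly without -iters, as described above, remains the way to see the count gpuowl actually uses.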

Last fiddled with by kriesel on 2021-03-07 at 20:32

All times are UTC.