mersenneforum.org  

Old 2020-12-13, 04:31   #188
preda
 
"Mihai Preda"
Apr 2015

Quote:
Originally Posted by ixfd64
Has 6.x reached end-of-life, or are you going to continue updating it alongside the 7.x branch?
I'm personally not using 6.x myself, thus I don't have a lot of motivation to improve it. From my POV, 7.x is now better in a couple of ways than 6.x, and I prefer to focus my (limited) resources on 7.x. OTOH, in the open spirit, I'm not against others having different opinions and using different versions in different ways, etc.
Old 2020-12-13, 21:45   #189
frmky
 
Jul 2003
So Cal

P-1 factors seem to always be reported as found by stage 1. This leads to confusing reports like https://www.mersenne.ca/exponent/104592713. Can this be fixed easily or is this behavior intentional?
Old 2020-12-13, 22:40   #190
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

Not a problem in gpuowl v6.11-380 (e.g. 101224807), which, unlike gpuowl v7.x, only does the stage 2 GCD once, after the full B2, so there's no ambiguity.

Last fiddled with by kriesel on 2020-12-13 at 22:41
Old 2020-12-19, 15:00   #191
storm5510
Random Account
 
Aug 2009

Quote:
Originally Posted by preda
I'm personally not using 6.x myself, thus I don't have a lot of motivation to improve it. From my POV, 7.x is now better in a couple of ways than 6.x, and I prefer to focus my (limited) resources on 7.x. OTOH in the open spirit I'm not against others having different opinions, and using different versions in different ways etc.
From my POV, removing the ability to run P-1's only is unfortunate...

Old 2020-12-19, 21:14   #192
kruoli
 
"Oliver"
Sep 2017
Porta Westfalica, DE

Yes, that would exclude everybody who is doing P-1 for exponents with known residues.

Last fiddled with by kruoli on 2020-12-19 at 21:15 Reason: Conjunctive.
Old 2020-12-20, 18:56   #193
storm5510
Random Account
 
Aug 2009

Quote:
Originally Posted by kruoli
Yes, that would exclude everybody who is doing P-1 for exponents with known residues.
I run first-time P-1's from Primenet by manual reservation. The only thing done to them prior is TF to a specific bit level; currently, it is 2^76. I can also get these from GPUto72's website. A P-1 with a "known residue" would indicate to me that it has a factor, meaning it is composite and requires no further effort.
Old 2021-01-17, 05:21   #194
Cheetahgod
 
May 2020


Does anyone have any benchmarks running gpuowl on the new gtx 30 series or the new AMD 6800x series gpus?
Old 2021-01-17, 05:43   #195
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

Quote:
Originally Posted by Cheetahgod
Does anyone have any benchmarks running gpuowl on the new gtx 30 series or the new AMD 6800x series gpus?
If you buy me one of each I'll run benchmarks.
Deal?

Old 2021-01-17, 12:52   #196
Viliam Furik
 
"Viliam Furík"
Jul 2018
Martin, Slovakia


Quote:
Originally Posted by Cheetahgod
Does anyone have any benchmarks running gpuowl on the new gtx 30 series or the new AMD 6800x series gpus?
You should be able to find them in the main gpuOwl thread, and some of them may be in the Perpetual benchmarking thread.
Old 2021-03-07, 19:14   #197
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

Gpuowl speed regression

Among the v7.2 builds I've made and run on Win 10, -53 is fastest. But for straight PRP, v6.11-364 or -380 are faster still. There are some caveats to that result summary: it was tested on one system and one OS, with multiple fast gpus supported by a slow cpu. But in some cases the slowdown was substantial enough to more than erase the ostensible advantage of doing PRP & P-1 stage 1 simultaneously with many of the same squarings, compared to running separate V6.11 P-1 to normal bounds followed by V6.11 PRP. That's comparing time per iteration after initialization, after bringing the gpu up to temperature for stable, roughly steady-state iteration times, etc. Additional initialization in V7.2 (trig table setup, for example) puts it at a further disadvantage.

In all V7.2 tests mentioned here, the PRP iterations timed were comparable to V6.11 iterations: either PRP was continuing after P-1 had already completed, or P-1 was suppressed for the V7.2-x timing tests by B1=0,B2=0 or tests-saved=0 in the worktodo line.

Part one
PRP iteration speed regression for gpuowl-win v7.2-x on Radeon VII, Win 10 Pro x64
Variation ~3-7%
623M 7312/7087 = 1.032
927M 12263/11458 = 1.070
Code:
gpuowl    us/it     us/it
version   623M      927M
V7.2-21   7119      11960
V7.2-69   7312      12257
V7.2-63   7311      12263
V7.2-53   x         x         system restart after these attempts
v7.2-53   7088      11459
V7.2-39   fatal cl compile errors, so no timings possible
V7.2-21   7106      11955     * good repeatability after restart and other versions
v7.2-13   7141      12365
v7.2-53   7087      11458     * fastest and very good repeatability of timing, saves ~8 days on 927M
fft size  36M       52M
split     4K:9:512  4K:13:512
gpu #     2         3
x = failed to resume after running v7.2-63; stalled after the opening few lines.
This multigpu slow-cpu system sometimes exhibits loss of communication with a gpu, which can bog down other gpus also while system interrupts occupy 1 of the 2 cpu cores for ~1 hour or until a restart clears the condition.
Note, gpus on this system are configured to run at reduced electrical power.
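
The slowdown ratios and the "~8 days" figure in Part one can be reproduced with a quick calculation. This is only a minimal Python sketch: the function names are made up for illustration (they are not part of gpuowl), and it assumes a PRP test needs roughly one iteration per bit of the exponent, so about 927e6 iterations for the 927M case.

```python
SECONDS_PER_DAY = 86_400

def slowdown(slow_us: float, fast_us: float) -> float:
    """Ratio of per-iteration times (1.0 means no difference)."""
    return slow_us / fast_us

def days_saved(iterations: float, slow_us: float, fast_us: float) -> float:
    """Wall-clock days saved over a full run by using the faster build."""
    return iterations * (slow_us - fast_us) * 1e-6 / SECONDS_PER_DAY

# Timings (us/it) taken from the Part one table: slowest vs fastest v7.2 build.
print(round(slowdown(7312, 7087), 3))     # 623M exponent
print(round(slowdown(12263, 11458), 3))   # 927M exponent

# Assumption: ~927e6 iterations, i.e. about one squaring per exponent bit.
print(round(days_saved(927e6, 12263, 11458), 1))
```

The same two helpers work for any pair of builds in the table; only the timings and the assumed iteration count change.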

Part two
100M, 300M, 900M, v6.11 & 7.2-53
PRP iteration speed regression for gpuowl-win v6.11-x on Radeon VII, Win 10 Pro x64
Variation 1.2 - 5.2%
100M 869/826 = 1.052, v6.11-380 fastest; exceeds the ~1.025 cost ratio of usual-bounds (P-1 + PRP) / PRP
300M 3360/3310 = 1.015, v6.11-364 fastest; exceeds the ~1.01 cost ratio of usual-bounds (P-1 stage 1 + PRP) / PRP
900M 10895/10763 = 1.012, v6.11-364 and -380 tie for fastest; about equal to the ~1.01 cost ratio of usual-bounds (P-1 stage 1 + PRP) / PRP
Differences between v6.11-364 and v6.11-380 were +/- 1 count, not significant.
Code:
gpuowl    us/it    us/it      us/it
version   300M     103M       900M
6.11-380  3311     826        10763
6.11-364  3310     827        10763
7.2-53    3360     869        10895
6.11-318  3327     836        10815
fft size  16M      5.5M       52M
split     1K:8:1K  1K:11:256  4K:13:512
gpu #     4        4          4
Note, gpus on this system are configured to run at reduced electrical power.
The 300M run in progress was subsequently found to have mismatching checksums on some early interim proof files and did not produce a proof. Its logs showed it had no problem verifying its proof before v6.11-380, -364, and -318 were alternately run on the existing interim file sets.
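
To make the Part two comparison concrete, here is a hedged Python sketch of the break-even reasoning. It treats v7.2's P-1 as completely free (the best case for v7.2) and asks by how many percentage points the v7.2-53 per-iteration slowdown exceeds the cost of running a separate v6.11 P-1 before PRP. The helper name is hypothetical, and the cost ratios are the approximate "~" figures quoted in this post, so small margins are within the noise.

```python
def excess_slowdown_pct(v7_us: float, v6_us: float, p1_cost_ratio: float) -> float:
    """Percentage points by which the v7.2 slowdown exceeds the separate P-1 cost.

    p1_cost_ratio is (P-1 + PRP) / PRP for a separate v6.11 run; the values
    used below are the approximate figures quoted in the post.
    """
    return (v7_us / v6_us - p1_cost_ratio) * 100

# (v7.2-53 us/it, fastest v6.11 us/it, approximate cost ratio)
cases = {
    "100M": (869, 826, 1.025),
    "300M": (3360, 3310, 1.01),
    "900M": (10895, 10763, 1.01),
}

for label, (v7_us, v6_us, ratio) in cases.items():
    # Positive value: separate v6.11 P-1 + PRP is faster overall.
    print(label, round(excess_slowdown_pct(v7_us, v6_us, ratio), 1))
```

A clearly positive result, as in the 100M case, means the separate v6.11 route wins even under this generous assumption for v7.2.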

Last fiddled with by kriesel on 2021-03-07 at 20:10
Old 2021-03-07, 20:06   #198
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

Quote:
Originally Posted by storm5510
From my POV, removing the ability to run P-1's only is unfortunate...

I think that can be approximated in v7.2 by using
Code:
-iters <N>         : run next PRP test for <N> iterations and exit. Multiple of 10000.
Select N to be just over what is required for P-1 stage 1. You'll get a little bit of PRP iteration after stage 2, depending on N and bounds. Also specify traditional bounds (~1M B1, 30M B2 for a ~103M exponent), not the much higher v7.2 defaults. I think if run briefly without -iters it will show how many iterations it takes. Or estimate it as about 1.443 x B1 (roughly 1/ln 2 bits of stage 1 exponent per unit of B1) and add a bit.
It may be faster though to stick with v6.11-380 for P-1, if the system is reliable enough and the exponent P-1 quick enough not to need the better error detection in P-1 of V7.2.
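
The 1.443 x B1 rule of thumb can be turned into a quick estimate of N. The sketch below is an illustration, not anything gpuowl provides: it assumes stage 1 raises to roughly lcm(1..B1), which has about B1 / ln(2) bits (one squaring per bit), pads that by a hypothetical 2% for "add a bit", and rounds up to the multiple of 10000 that -iters requires. The function name and the margin are assumptions.

```python
import math

def iters_for_stage1(b1: int, margin: float = 0.02, step: int = 10_000) -> int:
    """Estimate an -iters value just above the P-1 stage 1 length for bound b1.

    Stage 1 exponentiates by roughly lcm(1..B1), which has about
    B1 / ln(2) ~= 1.443 * B1 bits (one squaring per bit).
    margin is an arbitrary safety pad; step is the -iters granularity.
    """
    bits = b1 / math.log(2)
    padded = bits * (1 + margin)
    return math.ceil(padded / step) * step

# B1 = 1M, the traditional bound mentioned above for a ~103M exponent.
print(iters_for_stage1(1_000_000))
```

Running gpuowl briefly without -iters, as suggested above, remains the more reliable way to find the real stage 1 length; this only gives a starting value.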

Last fiddled with by kriesel on 2021-03-07 at 20:32

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.