#56

"Mihai Preda"
Apr 2015
10101011011₂ Posts
It's sort of starting to work. Feel free to experiment. Probably plenty of rough corners & bugs remaining; I'm still working on it.

#57

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
31×173 Posts
1) It requires using -pool. Not everyone uses it. I don't.

2) Calling stage 1 "large memory usage" means only one P-1 run per GPU at a time. A pair of P-1-required-only worktodo files on the same GPU would run one process and stall the other, even if there's enough GPU memory to run four stage ones. Traditionally stage 1 does not require much more RAM than a primality test, which takes only 573 MB even at a 181M exponent in v6.11-380, and 371 MB for a 100M exponent.

3) The performance advantage from multiple instances was reduced by recent raw performance improvements in gpuowl. As I recall, cases were found where running multiple instances reduced performance.

Are GPU RAM requirements much larger for v7 for the same exponent? With v6.11 I've successfully run P-1 both stages up to a 500M exponent on 8GB GPUs, and up to 1G on 16GB (single instance per GPU).

Last fiddled with by kriesel on 2020-10-07 at 09:04

#58

"Mihai Preda"
Apr 2015
2533₈ Posts
But it's possible -- just give each one 50% of RAM; they go a bit slower, that's all.

#59

"Mihai Preda"
Apr 2015
3×457 Posts
The standalone P-1 worktype was replaced/integrated into PRP. So a worktodo line is a normal PRP line:

PRP=xxxxxAIDxxxx,1,2,100238077,-1,76,2

Note the last digit above, "2", indicating that a P-1 test is desired for this exponent before the PRP (in other words, the "2" indicates that the exponent didn't have P-1 done before).

The line can optionally be preceded by explicit bounds, e.g.:

B1=6000000;PRP=xxxxxAIDxxxx,1,2,100238077,-1,76,2
B2=50000000;PRP=xxxxxAIDxxxx,1,2,100238077,-1,76,2
B1=6000000,B2=50000000;PRP=xxxxxAIDxxxx,1,2,100238077,-1,76,2

(this allows setting bounds per exponent). The bounds can also be specified "for all exponents" in config.txt or on the command line with -b1 and -b2.

GpuOwl will run P-1 during the PRP if any of these is met:
- "1" or "2" at the end of the PRP line (instead of "0")
- B1 or B2 specified on the PRP line
- b1 or b2 on the command line or in config

How the bounds are established: explicit bounds override defaults, and per-exponent bounds override config bounds. The default B1 (used when no explicit B1 is specified) is roughly equal to exponent/20, which comes to 5M or 5.5M at the wavefront. The default B2 is 20×B1. (So for a 100M exponent without explicit bounds but with "2" at the end, the default bounds would be B1=5M, B2=100M.)
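To make the default-bounds rule concrete, here is a tiny sketch (my reading of the post above, not gpuowl source code) of how the defaults work out for a given exponent:

```python
# Sketch of the default-bounds rule described above (an illustration,
# not gpuowl code): B1 is roughly exponent/20, B2 is 20 x B1.

def default_bounds(exponent):
    b1 = exponent // 20   # default B1 ~ exponent/20
    b2 = 20 * b1          # default B2 = 20 x B1
    return b1, b2

# For a 100M exponent with no explicit bounds:
print(default_bounds(100_000_000))  # (5000000, 100000000), i.e. B1=5M, B2=100M
```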

#60

"Composite as Heck"
Oct 2017
32E₁₆ Posts
Assuming the large memory requirement is within a single time interval, memlock is a simple way to get the processes out of phase and should do the job fine. Assuming the normal case of the queued exponents being close and slowly increasing, there should be barely any stalls after the initial one, barring the occasional small exponent from a previously expired allocation knocking the processes back into phase. It might be wise to let the processes get a bit more out of phase than immediately needed, to account for most of the variability, but that's only if micro-stalls are considered a problem.

#61

"Mihai Preda"
Apr 2015
3·457 Posts
The B1 bound can't be changed during a PRP test. It must be specified with the same value over the length of the PRP (and from the beginning).

To change/update B1 after the PRP test has started, you need to move or delete the exponent folder (with savefiles) and start the PRP anew with the new B1.

B2 can be changed during the PRP. Simply specify the new value, and the P2 will either be extended (if the new value is larger) or end early (if smaller), etc.

Unintentionally changing B1 during an ongoing PRP test can be a bit annoying, as the PRP test will refuse to start with a changed B1. If so, one can always override B1 "only for this exponent" in the worktodo line (to keep it constant for the ongoing test).
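Concretely, using the per-exponent syntax from earlier in the thread, pinning B1 for an ongoing test would be a worktodo line like this (the exponent and bound here are just the earlier illustrative values):

```
B1=6000000;PRP=xxxxxAIDxxxx,1,2,100238077,-1,76,2
```

The per-exponent B1 takes precedence over config.txt or command-line bounds, so the running test keeps seeing the value it started with.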

#62

"Bill Staffen"
Jan 2013
Pittsburgh, PA, USA
1A8₁₆ Posts
I find myself again on the same page as M344587487, and again confused by the response.

The P-1 portion of the PRP run is at the beginning, right? It does the first stage (FS) up to the bound, stops PRP'ing, runs the second stage (SS), and then picks back up again with the PRP. Wouldn't it release the memory at the end of the second stage and spend the rest of the time with a small memory footprint, at which point we could start the second PRP job with large memory?

#63

"Composite as Heck"
Oct 2017
2×11×37 Posts
I was responding to the description of memlock and theorising how to minimise the number of stalls for the typical use case, which at best is a micro-optimisation, as the stalls might be numerous in that case but short. preda is giving us an info dump of how to customise bounds, not responding to me.

Your description is as I understand it: big memory is required until it isn't. Memlock is just a simple way to let only one process use big memory at a time.

#64

P90 years forever!
Aug 2002
Yeehaw, FL
16546₈ Posts
Should manual lock removal become an irritation, the memlock-N directory could contain the process-ID of gpuowl. The memory would be considered locked only if memlock-N exists and the process-ID of gpuowl also matches. |
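A minimal sketch of that suggestion (hypothetical, not gpuowl code; POSIX assumed, and the directory/file names are illustrative): the lock only counts while the recorded process is still alive, so a crashed instance never requires manual cleanup.

```python
# Hypothetical sketch of a PID-stamped memlock directory: the lock is
# honored only if the directory exists AND its recorded process lives.
import os

PID_FILE = "pid"

def holder_alive(lock_dir):
    """True if the lock's recorded process still exists."""
    try:
        with open(os.path.join(lock_dir, PID_FILE)) as f:
            pid = int(f.read())
        os.kill(pid, 0)  # signal 0: existence check, sends nothing
        return True
    except (OSError, ValueError):
        return False

def acquire(lock_dir):
    try:
        os.mkdir(lock_dir)  # atomic: fails if the lock already exists
    except FileExistsError:
        if holder_alive(lock_dir):
            return False    # genuinely locked by a live process
        # Stale lock: the holder died without releasing. Reclaim it.
        pid_path = os.path.join(lock_dir, PID_FILE)
        if os.path.exists(pid_path):
            os.remove(pid_path)
        os.rmdir(lock_dir)
        os.mkdir(lock_dir)
    with open(os.path.join(lock_dir, PID_FILE), "w") as f:
        f.write(str(os.getpid()))
    return True

def release(lock_dir):
    os.remove(os.path.join(lock_dir, PID_FILE))
    os.rmdir(lock_dir)
```

A real implementation would need care around races (two processes reclaiming the same stale lock at once), but the check-the-PID-then-honor-the-lock idea is as simple as described.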

#65

Undefined
"The unspeakable one"
Jun 2006
My evil lair
2²·1,549 Posts
How come a mutex or semaphore can't work here? That is the primary use case those primitives were made for.

#66

"Composite as Heck"
Oct 2017
2·11·37 Posts
Creating a file lock is implementing a mutex at the process level: to run two gpuowl workers you run two instances of the program, so there are two processes. A normal mutex that you'd use to coordinate threads within a process does not apply, AFAIK.
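For illustration (my sketch, not anything gpuowl does): on POSIX, an advisory file lock via flock() behaves exactly like a cross-process mutex, which is the file-lock-as-mutex equivalence described above.

```python
# Sketch of a cross-process mutex backed by flock() on a lock file
# (illustrative names; POSIX only).
import fcntl

class ProcessLock:
    """Advisory file lock acting as a mutex between processes."""

    def __init__(self, path):
        self.f = open(path, "w")

    def try_acquire(self):
        try:
            # LOCK_NB: fail immediately instead of blocking if held
            fcntl.flock(self.f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except BlockingIOError:
            return False

    def release(self):
        fcntl.flock(self.f, fcntl.LOCK_UN)
```

Two instances opening the same lock file exclude each other just like two threads on an in-process mutex, and the kernel drops the lock automatically if the holder dies, which sidesteps the stale-lock problem of a plain marker file.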