mersenneforum.org


tuckerkao 2022-12-16 00:30

mersenne.ca "GPU72" target
 
Pminus1 with the recommended bounds from GPU72 typically takes less time to complete.

kriesel 2022-12-16 01:06

[QUOTE=tuckerkao;619925]Pminus1 with the recommended bounds from GPU72 typically takes less time to complete.[/QUOTE]Running P-1 bounds smaller than the optimum (the bounds that balance P-1 effort against the time probably saved by avoiding some primality tests) would be quicker for the P-1 itself but slower on average for the P-1 plus the needed PRP. It seems you're missing the big picture.

chalsall 2022-12-17 04:19

[QUOTE=James Heinrich;619935]Yes, doing things worse is usually faster. You can lower the bounds to near-zero and be done almost instantly, but that doesn't really benefit anyone. Please let Prime95 decide what bounds should be used based on the exponent status and your available RAM.[/QUOTE]

Just for the record... As usual, tuckerkao has presented inaccurate information.

GPU72 doesn't have recommended bounds for P-1. It actually gives Pfactor lines with overly aggressive "Tests saved" values (it shouldn't be 2 anymore; more like 1.3), and lets the Prime95/mprime client choose the correct bounds.

I just wanted to correct tuckerkao's misrepresentation of reality.

tuckerkao 2022-12-17 04:45

[QUOTE=chalsall;620043]GPU72 doesn't have recommended bounds for P-1.[/QUOTE]
Who's the GPU72 that presents the recommendations for the TF bit depth and P-1 bounds (very specific numbers for B1 and B2) on this page - [URL="https://www.mersenne.ca/exponent/108377323"]https://www.mersenne.ca/exponent/108377323[/URL]?

Comparison of Actual, Primenet, GPU72, Difference.

Maybe this GPU72 is not your group.

[QUOTE=Prime95;620033]The small FFT torture test does not test RAM as much as a blend torture test[/QUOTE]
I think it's almost certain that you've found the cure. After the memory speed was tuned down, Prime95 has run without errors for at least several hours.

James Heinrich 2022-12-17 04:49

[QUOTE=chalsall;620043]GPU72 doesn't have recommended bounds for P-1[/QUOTE]No, that's probably my fault -- if you look at the mersenne.ca page for any exponent (e.g. [ca]168455141[/ca]) there's the "Actual", "PrimeNet" and "GPU72" lines. The TF bitlevel is probably pretty close to what GPU72 recommends, the bounds are a vague approximation of what a machine might pick for that exponent with 1 test saved. But those numbers are not endorsed by GPU72, so I should probably rename my data to something other than "PrimeNet" and "GPU72" (which only applies to the TF level).

James Heinrich 2022-12-17 16:22

[QUOTE=Andrew Usher;620095]it's surely his settings that caused it to run hot in the first place
CPUs don't normally reach that temperature with default settings
Most people including me have a max CPU temp around 70 C.[/QUOTE]Historically, yes, but Zen4 is different: AMD has said they're designed to run at up to [URL="https://www.hardwaretimes.com/amd-ryzen-7000-cpus-are-built-to-run-at-95c-24x7-without-affecting-lifespan-or-reliability/"]95°C continuously[/URL].
But hitting 90+°C at 156W indicates a cooling problem (small/inadequate cooler).
I went big with my system ([URL="https://www.arctic.de/en/Liquid-Freezer-II-420-A-RGB/ACFRE00109A"]Arctic Liquid Freezer II 420 AIO[/URL]) and the 7950X is still happy to get above 90°C, especially when running small FFTs, but it takes 230+W of power to do so (nominal TDP is 170W for the 7950X, but it's happy to exceed that up to its thermal limit).

I have queued up those 6 suspect P-1s for re-do (on my non-7950X system so it'll take a few days). Running to a bit higher bounds (especially B2).

James Heinrich 2022-12-17 16:35

[QUOTE=James Heinrich;620046]I should probably rename my data to something other than "PrimeNet" and "GPU72"[/QUOTE]I have decided to remove the "PrimeNet" line entirely, since both the TF level and B1/B2 are irrelevant. I now have "Actual" and "Target", and the "Difference" line makes more sense than it did before.

It's easy to ignore inconsistencies because "it's always been that way", so thanks [i]chalsall[/i] for pointing it out.

Andrew Usher 2022-12-17 18:43

I was not aware of those specifics, just saying that his symptoms sounded like they could be heat-related. It is not just the CPU that can be affected and cause instabilities. At least it's one thing to try in addressing the problem.

I think your last change may have been hasty. Although it certainly was no help to have two (not much) different P-1 levels there, the Primenet TF level is not wholly obsolete - the server still uses it. More seriously, I think you changed something you didn't intend, because all the bounds at higher exponents have doubled. Only just above 1M is it about the same; going higher it rapidly approaches twice the B1 and B2 that the 'GPU72' was. For example, the recommendation at the OBD level, which Kriesel accepted, was 17M/1G, and is now an unreasonable 32M/2G. At Tucker Kao's level, it has also doubled, and I hope this isn't an attempt to get him to change, because while his B2 would have stood doubling (because of the new 30.8 algorithm) his B1 certainly shouldn't be doubled. I would recommend reverting until you figure out what happened (tests saved value changed?) as there certainly should have been no change for most or all exponents. (By the way, the optimums for sub-wavefront exponents are still relevant as they should be used before PRP tests of cofactors if there's been no P-1 or significant ECM yet - a common situation.)

kriesel 2022-12-17 18:54

And the target values ought not increase, for exponents beyond the reach of mprime v30.8, which maxes out at 1169M on AVX512, 920M on AVX2, 596M on AVX. The polynomial multiplication implementation is only present in mprime/prime95 v30.8 and up, not lower versions, and not at all in CUDAPm1, Gpuowl, or Mlucas released P-1 implementations.
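
The ceilings quoted above can be captured in a small lookup. This is only a sketch: it assumes "M" means 1,000,000 and that the limits are inclusive, and the names `V308_MAX_EXPONENT` and `within_v308_reach` are illustrative, not taken from mprime.

```python
# Exponent ceilings for mprime/prime95 v30.8 as quoted in the post above.
# Assumptions: "M" means 1,000,000 and the ceiling itself is still reachable.
V308_MAX_EXPONENT = {
    "AVX512": 1_169_000_000,
    "AVX2": 920_000_000,
    "AVX": 596_000_000,
}


def within_v308_reach(exponent: int, instruction_set: str) -> bool:
    """Return True if the exponent is at or below the quoted v30.8 ceiling."""
    return exponent <= V308_MAX_EXPONENT[instruction_set.upper()]
```

For example, `within_v308_reach(168455141, "AVX2")` is true, while an exponent of 1,000,000,000 would be in reach only on AVX512.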

Andrew Usher 2022-12-17 19:02

His bounds are still computed based on the 'old' algorithm (linear/prime-pairing) as always, so _that_ is not the issue. While we are on this subject I'd also advise that 'target' values be removed for exponents below 1M and above 2^32: the former are meaningless and based on a TF value much too far from the actual; the latter are not attainable in the foreseeable future.

James Heinrich 2022-12-17 19:15

[QUOTE=Andrew Usher;620110]I think you changed something you didn't intend, because all the bounds at higher exponents have doubled.[/QUOTE]It was intentional. The P-1 bounds/probability algorithm (of which mersenne.ca uses a simplified version that ignores available RAM and therefore tends to underestimate B2) intrinsically outputs three B1/B2 pairs: "minimum", "balanced" and "maximum". The former GPU72 line showed "balanced", the new "Target" line shows "maximum". Running the 168M exponent with 32GB, Prime95 selects a lower B1 and higher B2 than shown.

The short answer is that P-1 bounds selection is complex and depends not only on exponent and TF level but also on available RAM. Actual bounds should always be selected by Prime95/gpuowl/whatever, not taken from the rough ballpark estimates on mersenne.ca.

Here is a selection of bounds and probabilities selected across a wide range of exponents by Prime95 v30.8b17 with 45GB allocated (targeting 1.0 tests saved):[code]Prime95 v30.8b17, 45GB RAM, 1.0 tests saved:
Pfactor=N/A,1,2,1000003,-1,60,1 = B1= 11000, B2= 26088000, 5.13%
Pfactor=N/A,1,2,1333357,-1,60,1 = B1= 13000, B2= 25856000, 5.62%
Pfactor=N/A,1,2,1777823,-1,61,1 = B1= 18000, B2= 27546000, 5.71%
Pfactor=N/A,1,2,2370451,-1,63,1 = B1= 22000, B2= 30218000, 5.01%
Pfactor=N/A,1,2,3160607,-1,64,1 = B1= 31000, B2= 24500000, 4.91%
Pfactor=N/A,1,2,4214173,-1,65,1 = B1= 31000, B2= 30967000, 4.73%
Pfactor=N/A,1,2,5618903,-1,66,1 = B1= 45000, B2= 40113000, 5.00%
Pfactor=N/A,1,2,7491893,-1,67,1 = B1= 59000, B2= 40842000, 4.96%
Pfactor=N/A,1,2,9989191,-1,68,1 = B1= 80000, B2=102211000, 5.65%
Pfactor=N/A,1,2,13318951,-1,68,1 = B1= 108000, B2=101248000, 6.20%
Pfactor=N/A,1,2,17758603,-1,69,1 = B1= 141000, B2=136199000, 6.36%
Pfactor=N/A,1,2,23678147,-1,70,1 = B1= 181000, B2=152884000, 6.34%
Pfactor=N/A,1,2,31570867,-1,71,1 = B1= 228000, B2=156349000, 6.21%
Pfactor=N/A,1,2,42094513,-1,72,1 = B1= 291000, B2=156084000, 6.08%
Pfactor=N/A,1,2,56126053,-1,73,1 = B1= 372000, B2=167054000, 6.01%
Pfactor=N/A,1,2,74834741,-1,74,1 = B1= 463000, B2=161188000, 5.83%
Pfactor=N/A,1,2,99779663,-1,76,1 = B1= 562000, B2=154313000, 5.14%
Pfactor=N/A,1,2,133039553,-1,77,1 = B1= 685000, B2=145504000, 4.95%
Pfactor=N/A,1,2,177386113,-1,78,1 = B1= 891000, B2=147436000, 4.86%
Pfactor=N/A,1,2,236514827,-1,79,1 = B1=1085000, B2=145788000, 4.71%
Pfactor=N/A,1,2,315353107,-1,80,1 = B1=1357000, B2=154771000, 4.63%
Pfactor=N/A,1,2,420470833,-1,82,1 = B1=1609000, B2=138241000, 4.03%
Pfactor=N/A,1,2,560627831,-1,83,1 = B1=2042000, B2=146766000, 3.97%[/code]
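
For anyone wanting to compare these rows programmatically, the listing can be parsed with a short script. This is a minimal sketch, not part of Prime95: the names `PfactorResult` and `parse_line` are made up for illustration, and the reading of the fields (exponent in the fourth position, TF bit level in the sixth, tests saved in the seventh) is an assumption based on how the listing above is laid out.

```python
import re
from typing import NamedTuple


class PfactorResult(NamedTuple):
    exponent: int
    tf_bits: int          # assumed: bit level already trial-factored to
    tests_saved: float    # assumed: last field of the Pfactor line
    b1: int
    b2: int
    probability: float    # as a fraction, e.g. 0.0513 for "5.13%"


# Matches rows shaped like the listing above:
# Pfactor=N/A,1,2,<exponent>,-1,<TF bits>,<tests saved> = B1=<B1>, B2=<B2>, <prob>%
_LINE = re.compile(
    r"Pfactor=[^,]*,1,2,(\d+),-1,(\d+),([\d.]+)\s*=\s*"
    r"B1=\s*(\d+),\s*B2=\s*(\d+),\s*([\d.]+)%"
)


def parse_line(line: str) -> PfactorResult:
    """Parse one row of the listing; raise ValueError if it does not match."""
    m = _LINE.search(line)
    if m is None:
        raise ValueError(f"unrecognised line: {line!r}")
    exp, bits, saved, b1, b2, prob = m.groups()
    return PfactorResult(
        int(exp), int(bits), float(saved), int(b1), int(b2), float(prob) / 100
    )
```

Parsing every row and printing `b2 // b1` makes the pattern in the listing easy to see: the B2/B1 ratio is in the thousands for the smallest exponents and falls below 100 for the largest.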

Andrew Usher 2022-12-17 19:31

There seems to be no good reason for that change; 'balanced' would be called that for a reason. As your bounds clearly aren't based on 30.8 (and have little resemblance to those chosen by it), test results should have been from an earlier version - and as Kriesel points out, programs other than prime95 have not implemented the new algorithm at all.

It may be true for the casual user that he should always let the program select bounds for him, but for more mathematically sophisticated people that's not reasonable and sounds condescending.

James Heinrich 2022-12-17 19:38

[QUOTE=Andrew Usher;620117]As your bounds clearly aren't based on 30.8[/quote]The bounds algorithm is my simplified PHP implementation of the one written by [URL="https://github.com/preda/gpuowl/blob/master/pm1/pm1.cpp"]Mihai for gpuowl[/URL], and adopted by George for Prime95 v30.(?) I was going to say v30.8, but perhaps it was v30.7, I don't really remember.
edit: it must be v30.7, my comments say I converted the code on 2020-Aug-10. So perhaps I need to dig into the v30.8 code and see if I can figure out a new algorithm. Not looking forward to that.
edit2: the top of [c]pm1prob.c[/c] shows:[code]/* Copied with permission from https://github.com/preda/gpuowl/pm1 on 2020-08-11 */
/* Code courtesy of Mihai Preda */
/* Modified to work on non-Mersennes and P+1 factoring (variable takeAwaybits) */[/code]

[QUOTE=Andrew Usher;620117]It may be true for the casual user that he should always let the program select bounds for him, but for more mathematically sophisticated people that's not reasonable and sounds condescending.[/QUOTE]Presumably those mathematically sophisticated people (of which I most certainly am [b]not[/b] one) know better and can make their own judgement calls about what bounds to select in their use-case.

Andrew Usher 2022-12-17 20:12

Simply reverting to the old 'GPU72' numbers would be fine. The numbers currently posted [B][I]can't possibly[/I][/B] be correct for 1 test saved, because (except the smallest exponents) the ratio (P-1 GHz-days)/(PRP GHz-days) exceeds the factor probability (since this is pre-30.8, GHz-days are a valid measure). If your algorithm is correctly using just 1 test saved, then its 'maximum' outputs are intentionally overkill.

By keeping those values up, you are advising people to do wasteful P-1, by either using both bounds directly, or using B1 and allowing prime95 to determine the optimal B2 for it.
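
The break-even argument above can be written as a one-line inequality: P-1 pays for itself only if the factor probability times the tests saved exceeds the ratio of P-1 cost to PRP cost. The sketch below illustrates that reasoning only. The function names are hypothetical, costs are assumed to be in a common unit such as GHz-days, and the numbers in the usage note are invented rather than taken from mersenne.ca.

```python
def p1_net_saving(p1_cost: float, prp_cost: float,
                  factor_probability: float, tests_saved: float = 1.0) -> float:
    """Expected work saved by running P-1 first, in the same unit as the costs.

    A factor is found with probability `factor_probability`, which avoids
    `tests_saved` primality tests; the P-1 cost is always paid.
    """
    return factor_probability * tests_saved * prp_cost - p1_cost


def is_wasteful(p1_cost: float, prp_cost: float,
                factor_probability: float, tests_saved: float = 1.0) -> bool:
    """True when P-1 costs more than it is expected to save."""
    return p1_net_saving(p1_cost, prp_cost, factor_probability, tests_saved) < 0
```

With invented figures, a P-1 run costing 6 GHz-days with a 5% factor chance ahead of a 100 GHz-day PRP test is wasteful at 1 test saved (cost ratio 0.06 exceeds 0.05), while the same run costing 3 GHz-days is not. Passing this check is necessary for sensible bounds but does not show that they are optimal.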

James Heinrich 2022-12-17 20:17

[QUOTE=Andrew Usher;620121]Simply reverting to the old 'GPU72' numbers would be fine[/QUOTE]I have reverted the Target B1/B2 to what GPU72 line showed before.

In the longer term I'll try to get the v30.8 bounds selection code ported to PHP, but that may take a while.

Andrew Usher 2022-12-17 20:34

Thanks. This really belongs more in the mersenne.ca forum, where I've just posted about a different (low-priority) issue with your site.

chalsall 2022-12-18 01:33

[QUOTE=James Heinrich;620104]I have decided to remove the "PrimeNet" line entirely, since both the TF level and B1/B2 are irrelevant. I now have "Actual" and "Target", and the "Difference" line makes more sense than it did before.[/QUOTE]

Channeling my Mr. Burns... "Excellent..."

[QUOTE=James Heinrich;620104]It's easy to ignore inconsistencies because "it's always been that way", so thanks [i]chalsall[/i] for pointing it out.[/QUOTE]

Thanks for the call-out. But, really, this is all your work... Seriously. 8-) To give thanks where thanks are due...

When GPU72 first started we had no idea what we were doing, other than feeding compute questions to answer. This was leveraging early work Oliver (TheJudger) had done as a proof of concept (which then scaled nicely) using nVidia CUDA.

James quickly stepped in, and did a ***deep-dive*** of the optimal economic cross-over points we were actually working with. As a function of each GPU make and model. With a huge amount of peer review.

[URL="https://www.mersenne.ca/cudalucas.php?model=774"]This is just one example of the reports that informed us.[/URL] Thus, I think it is completely appropriate for the nomenclature change on MERSENNE.CA.

P.S. I sometimes go over the top with compliments. But few appreciate just how much talent, dedication, and *years* of work are involved with maintaining the longest running Distributed Computing project.

P.P.S. The BH3 Christmas Party was this evening. Good tunes. Good friends. Serious conversations. It was nice getting back out into the "Big Blue Room"...

chalsall 2022-12-18 01:37

[QUOTE=James Heinrich;620122]In the longer term I'll try to get the v30.8 bounds selection code ported to PHP, but that may take a while.[/QUOTE]

No good deed goes unpunished... 9^)

