mersenneforum.org (https://www.mersenneforum.org/index.php)
-   -   What determine if P-1 factoring is used? (https://www.mersenneforum.org/showthread.php?t=26849)

 kriesel 2021-07-22 02:27

[QUOTE=Siegmund;583723]It does seem like the default ought to be 2 for LL testers and only 1 for PRP testers.[/QUOTE]When standalone P-1 is performed, how is the server to predict whether the first primality test that will be assigned later and performed later will be LL, PRP without proof or with bad proof, or PRP with a good proof that verifies as correct?

 LaurV 2021-07-22 02:41

My two cents: the value should stay 2. A little more P-1 won't hurt anybody, and it may be beneficial in the long term.

 Zhangrc 2021-07-22 04:27

[QUOTE=LaurV;583731]the value should stay 2.[/QUOTE]

If one is interested in P-1 factors, he or she could of course use 2-tests-saved bounds. However, some people just want to test as many exponents as possible using PRP with proof, and for them the 1-test-saved bounds are OK. Of course, we could meet in the middle and use 1.2-tests-saved bounds, since some PRP tests come back with bad proofs or stall out.
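The intermediate figure can be motivated as a weighted average over outcome types. A rough sketch, where the probabilities are illustrative assumptions (not project statistics):

```python
# Rough expected-tests-saved estimate for choosing P-1 bounds.
# A found factor saves ~2 tests when the first test would be LL
# (test + double-check), but only ~1 test for PRP with a verified proof.
# The probabilities below are assumed for illustration, not project data.
p_ll        = 0.05  # assumed share of first tests still done as LL
p_bad_proof = 0.10  # assumed share of PRP tests with a bad/missing proof

tests_saved = (p_ll * 2.0                               # LL: saves 2 tests
               + (1 - p_ll) * p_bad_proof * 2.0         # PRP, bad proof: ~2
               + (1 - p_ll) * (1 - p_bad_proof) * 1.0)  # PRP, good proof: 1
print(f"expected tests saved per factor found: {tests_saved}")
```

With these assumed shares the expectation lands between 1 and 2, which is the kind of reasoning behind a fractional tests_saved value.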

Personally I suggest doing more TF work at the current PRP wavefront. Summing the throughput of the top 500 producers, we did 64 million GHzDays of PRP tests in the last year, but over 147 million GHzDays of TF (over twice as much work!). For this reason, we could TF a bit higher, say to 2^77 (or even 2^78), just as we have done for 95M exponents.

 Uncwilly 2021-07-22 04:51

[QUOTE=Zhangrc;583737]Personally I suggest doing more TF work at current PRP wavefront.[/QUOTE]Quite a bit of the TF work is being done away from the area of FTCs. SRBase is moving through exponents bit level by bit level and not staying below 120M. Also, user TJAOI is doing a lot of work on low exponents (well below the FTC range, and at lower bit levels). So these two should not be counted toward the TF total. Also, with PRP and certs there is less work needed to test and confirm exponents. This changes the calculus of what makes sense WRT TF vs primality testing. Those running GPU72 closely watch the front of the Cat 4 FTC wavefront, the Cat 3 FTC wavefront, the currently available TF firepower for working ahead of the FTCs, and what sort of work the various users prefer.

 LaurV 2021-07-22 05:26

[QUOTE=Uncwilly;583739]Also... <snip>

Also... <snip>

Also... <snip>
[/QUOTE]
Also... we are comparing apples with watermelons; the TF credit unit and the PRP credit unit are very different. One good GPU can spit out 3000-6000 GHzDays for every day it does TF, but only 300-800 GHzDays for every day it does LL/PRP/P-1. This is a remnant from the time when CPUs were used for TF, and the credit values were calculated to be approximately equal per unit of time spent [U]by the CPU[/U] on each work type. GPUs joining the fight completely changed the equation: now you get 5 to 10 times more "credit" if you use your GPU for TF than for PRP, and there are people still motivated by that, especially young gamers whose gaming cards are not good at FP64 flops (needed for PRP) but are excellent at FP32 flops (good for gaming and TF).

Unfortunately (or more exactly, fortunately) this was never fixed, because rebalancing is not easy and would upset some people. On the other hand, giving a lot more TF credit per unit of time may be beneficial, because that is the ONLY incentive given for TF. Some people with gaming GPUs (which are anyhow better at TF and worse at LL/PRP) will join and do TF to climb the tops fast - two average gaming cards can put you on the tops in a few weeks - thereby helping the project, which is always in need of "more TF". TF has no other incentive (unlike PRP, where you can find a prime and take some money) besides altruism ("we want to help the project"), idiocy ("we want to find factors, or to get a lot of credits, although we know neither is of any use"), or entertainment ("yeah, it is fun! hihi" and make donkey face). So, let TF give more credit, that's OK. I personally will jump to grab some of it! :razz:

 Zhangrc 2021-07-22 05:39

[QUOTE=LaurV;583740]now you can get 5 to 10 times more "credit" if you use your GPU for TF than for PRP.[/QUOTE]
Sometimes it's 30 times more, depending on the GPU model. My GPU (a GTX 1650) earns approximately 900 GHzDays per day doing TF but less than 30 GHzDays doing PRP. As a result, I factor every exponent I test to at least 2^77, sometimes to 2^79.

 Siegmund 2021-07-22 06:39

[QUOTE=kriesel;583729]When standalone P-1 is performed, how is the server to predict whether the first primality test that will be assigned later and performed later will be LL, PRP without proof or with bad proof, or PRP with a good proof that verifies as correct?[/QUOTE]

Perhaps 2 is reasonable if a person requests a standalone P-1 assignment.

It seems less reasonable when one receives a PRP assignment for a number that has not yet had P-1 done on it. (I have been getting quite a few of these, for the past year or so.)

And isn't the intention that all future world-record-sized testing will be PRP, not LL? So the expected number of tests saved is something like 1.03?

If extra factors are found, great - it just seems the default ought to be to minimize time needed to resolve each exponent.

 LaurV 2021-07-22 06:47

[QUOTE=Zhangrc;583743]As a result, I factor every exponent I test to at least 2^77, sometimes to 2^79.[/QUOTE]
Yep, that's exactly what I mean. I do the same. But in the end, what should drive you (the general you, not you personally) is the speed at which you eliminate exponent candidates. If you can run 150 TF assignments and find two factors per day, but it would take you more than half a day to run one PRP test on the same hardware - and I mean at the wavefront, not cherry-picking low-hanging fruit like large exponents at low TF bit levels - then your hardware should do TF. You help the project more that way.
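The "speed of eliminating candidates" comparison can be sketched numerically. The figures below are the illustrative ones from this post (150 TF assignments, two factors, half a day per PRP test), not benchmarks:

```python
# Compare "exponents eliminated per day" for TF vs PRP on one machine,
# using the illustrative numbers from the discussion (not measurements).
tf_assignments_per_day = 150
tf_factors_per_day     = 2        # factors found among those 150 attempts
prp_days_per_test      = 0.5      # "more than half a day" per PRP test

# A found factor eliminates the exponent with no primality test needed;
# a completed PRP test (with proof) settles one exponent directly.
tf_eliminated  = tf_factors_per_day
prp_eliminated = 1 / prp_days_per_test

print(f"TF:  ~{tf_eliminated} exponents eliminated per day")
print(f"PRP: ~{prp_eliminated} exponents settled per day")
```

Whichever mode eliminates more candidates per day on your hardware is, by this argument, the better use of it; a found factor is also cheaper for the project since it removes the need for any future test on that exponent.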

 kriesel 2021-07-22 06:51

[QUOTE=LaurV;583740]you can get 5 to 10 times more "credit" if you use your GPU for TF than for PRP[/QUOTE]The lowest I have on record from older GPU models is 12:1 ((TF credit/day) / (PRP, LL, or P-1 credit/day)). [URL]https://www.mersenneforum.org/showpost.php?p=497567&postcount=16[/URL] RTX 20xx and GTX 16xx are much higher; I think RTX 30xx higher yet. (Mersenne.ca has the RTX 3090 at 4900 GHzD/day TF vs ~98 PRP etc., about 50:1; the RTX 2080 at 2623 TF vs 62.5 PRP etc., about 42:1.)
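The quoted ratios follow directly from the per-day credit figures cited from mersenne.ca; a quick check:

```python
# TF-to-PRP credit ratios from the per-day GHzDays figures quoted above
# (mersenne.ca numbers as cited in the post).
cards = {
    "RTX 3090": (4900, 98),    # (GHzD/day doing TF, GHzD/day doing PRP)
    "RTX 2080": (2623, 62.5),
}
for name, (tf, prp) in cards.items():
    print(f"{name}: about {tf / prp:.0f}:1")
```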

The Radeon VII with mfakto v0.15pre6 was ~1300 GHzD/day on Windows; there are benchmark results supporting up to 486/day in GpuOwl; that's 2.67:1. And it is noted for its incomparable PRP performance: at \$700 original list price it beat, IIRC by a factor of 2, the \$2500 used Tesla P100. Some of the Teslas may have sufficiently strong DP to show low ratios as well.

The CPUs I've checked were 0.7:1 to 1.3:1.

Hence the general rule: TF on GPUs; PRP, P-1, or LL on CPUs - except on the Radeon VII and other recent AMD GPUs, and maybe Teslas.

 drkirkby 2021-07-23 07:49

[QUOTE=Siegmund;583747]Perhaps 2 is reasonable if a person requests a standalone P-1 assignment.
<snip>

If extra factors are found, great - it just seems the default ought to be to minimize time needed to resolve each exponent.[/QUOTE]
Having done a quick test, with results obtained after a few beers,

[URL]https://www.mersenneforum.org/showpost.php?p=583797&postcount=54[/URL]

which I intend to redo more thoroughly when 100% sober, I'm not convinced that any value of [B]tests_saved[/B] can really be said to maximise throughput [B]unless you measure the P-1 timing on your computer(s).[/B] I tested the run-time of the P-1 test on my Dell PC under the same circumstances:
[LIST]
[*]Using the exponent [URL="https://www.mersenne.org/report_exponent/?exp_lo=M105216541&full=1"]M105216541[/URL]
[*]4 workers
[*]3 workers doing PRP tests with exponents around 104-105 million (13 cores each)
[*]1 worker doing a P-1 test (13 cores)
[*]Dell 7920 tower workstation with 2 x Intel Xeon 8167M CPUs
[/LIST]
What I found in that quick test was:
[LIST=1]
[*]Based on saving 1 primality test: B1=434000, B2=21339000. Estimated chance of finding a factor: 3.60%. Started 15:15, finished 16:57. [B]Runtime = 1 hour, 42 minutes = 102 minutes.[/B] Used 207872 MB (203 GB) RAM. 9.0812 GHz days credit.
[*]Based on saving 2 primality tests: B1=889000, B2=52784000. Estimated chance of finding a factor: 4.66%. Started 17:01, finished 22:16. [B]Runtime = 5 hours, 15 minutes = 315 minutes.[/B] Used 311330 MB (304 GB) RAM. 21.4559 GHz days credit.
[/LIST]
The ratio of the P-1 runtimes (315/102 = 3.08824:1) was a lot more than the ratio of the GHz days credits (21.4559/9.0812 = 2.36267:1).
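The mismatch between the runtime ratio and the credit ratio is easy to reproduce from the measured figures in this post:

```python
# Compare the ratio of measured P-1 runtimes against the ratio of awarded
# GHzDays credit, using the figures reported above for M105216541.
runtime_1_saved = 102        # minutes, bounds for tests_saved = 1
runtime_2_saved = 315        # minutes, bounds for tests_saved = 2
credit_1_saved  = 9.0812     # GHzDays awarded
credit_2_saved  = 21.4559    # GHzDays awarded

runtime_ratio = runtime_2_saved / runtime_1_saved   # ~3.09
credit_ratio  = credit_2_saved / credit_1_saved     # ~2.36

print(f"runtime ratio: {runtime_ratio:.5f}")
print(f"credit ratio:  {credit_ratio:.5f}")
# If credit tracked real cost on this machine, the two ratios would match;
# since they don't, bounds optimized against credit can be off here.
```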

Given that the optimal bounds for P-1 are based on the calculated computational effort (GHz days), the tests_saved value will not be optimal if the actual run-time of the test (in minutes) does not reflect the credit in GHz days. It changes where the optimal point is. Clearly that optimal point [B]could[/B] depend on things such as:
[LIST=1]
[*]What else is running on the machine
[*]Cache
[*]RAM
[*]Whether a CPU or GPU is used, and which model
[*]Number of cores devoted to the task
[*]The actual exponent
[*]FFT size, which is a function of the exponent
[*]Bounds B1 and B2
[*]Phase of the moon and direction of wind flow
[*]Things I have not thought of
[/LIST]
As I said, I did not do this under ideal circumstances, so the results need double-checking, but I intend to test assuming tests_saved values of 0.9 (if mprime accepts values below 1.0) and 1.1, then measure the actual run-time of the PRP test.

IMHO it is a shame so much effort (GHz days) is being put into testing exponents well away from the wavefront. They are not really helpful in finding primes.

 axn 2021-07-23 08:42

[QUOTE=drkirkby;583811] I'm not convinced that any value of[B] tests_saved[/B] can really be said to maximise the throughput [B]unless you test the P-1 timing on your computer(s).[/B] I tested the run-time of the P-1 test on my Dell PC under the same circumstances
<snip>
The ratio of runtimes of the P-1 tests (315/102=3.08824:1) was a lot more than the ratio of GHz days credits (21.4559/9.0812=2.36267).
[/QUOTE]
I do not believe the server has been adjusted to give the correct GHzDay credit for the improved P-1 Stage 2 implemented from 30.4 onwards. I believe it still assumes that the P-1 has been done by the older algorithm and credits accordingly. Hence do not draw any hard and fast conclusions based on those numbers. Ideally, due to the improved stage 2, the credit should be suitably discounted.

However, the optimality calculations done by the program itself _do_ take into account the specifics of the new algorithm. TL;DR: don't trust the GHzDay numbers.
