mersenneforum.org > Data > Deep dive TF

2019-01-09, 16:03   #12
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7×491 Posts

Quote:
 Originally Posted by Uncwilly
 I suggest that you use James H's worktodo.txt balancer. Try to make each chunk posted as close as possible to the same GHz-days. Here is what it looks like as balanced as it can be:
 Code:
 [Worker #1]
 Factor=353000101,72,81
 Factor=453000013,78,82
 Factor=453000029,71,82
 [Worker #2]
 Factor=153000349,75,77
 Factor=253000937,76,79
 Factor=203000117,73,79
 Factor=303000227,70,80
 Factor=403000069,71,81
 Factor=353000047,77,81
 [Worker #3]
 Factor=153000277,75,77
 Factor=253000079,74,79
 Factor=203000101,73,79
 Factor=303000119,70,80
 Factor=403000067,71,81
 This breaks down to:
 • Worker #1 = 5,572.802 GHz-days
 • Worker #2 = 4,489.192 GHz-days
 • Worker #3 = 3,233.924 GHz-days
 No one has to buy a whole thing, you can just reprocess it for the next batch.
How I had it ordered was in time order, assuming one gpu and one worker instance. (Or LaurV takes the front part, and someone does the rest later.) The first (lower) few are fast enough on a current model gpu that it does not much matter what order they occur in. After that, there's benefit in one exponent per limit exploration bin. That ordering allows for considerable pipelining; a partial report of the TF lets P-1 testing get started, and then P-1 and TF occur mostly in parallel, with a bit of stagger. Putting the hard, slow, high exponents first in the TF does not make sense to me, since at those exponents the P-1 may already fail at some lower level and so become unneeded for CUDAPm1 testing. (They could still be run eventually on prime95, or on a gpu model that reaches a little higher than others in CUDAPm1; the high-water mark so far in CUDAPm1 v0.20 testing is on a GTX 1060 3GB, 431M < pmax < 432.6M, and 432.2M is looking promising with a little over a day to go.)
I forgot to mention, I didn't find good candidates at 103, 123, 133, or 143 in the first 1000 spans. But any assignment in the neighborhood would do. And if they are already TF'd to target, there's no TF to delegate before doing a P-1.
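For anyone who would rather script this kind of split than run it through James H's balancer by hand, here is a minimal greedy sketch. The `tf_ghzdays` cost model (work per bit level roughly proportional to 2^b / p) and its `scale` constant are rough assumptions for illustration, not mersenne.ca's actual credit formula; the sample `Factor=` lines are taken from the quote above.

```python
# Hypothetical worktodo.txt balancer sketch. Assumed cost model: TF work for
# one bit level b on exponent p scales as 2**b / p; the scale constant is a
# placeholder, not real GHz-days credit.

def tf_ghzdays(p, b1, b2, scale=1.0e-12):
    """Approximate GHz-days to TF exponent p from 2**b1 up to 2**b2."""
    return scale * sum(2**b / p for b in range(b1 + 1, b2 + 1))

def balance(lines, n_workers):
    """Greedy bin-packing: biggest jobs first, each into the lightest worker."""
    jobs = []
    for line in lines:
        p, b1, b2 = map(int, line.split("=", 1)[1].split(","))
        jobs.append((tf_ghzdays(p, b1, b2), line))
    jobs.sort(reverse=True)
    workers = [[0.0, []] for _ in range(n_workers)]
    for cost, line in jobs:
        lightest = min(workers, key=lambda w: w[0])
        lightest[0] += cost
        lightest[1].append(line)
    return workers

work = ["Factor=353000101,72,81", "Factor=153000349,75,77", "Factor=453000013,78,82"]
for total, lines in balance(work, 2):
    print(f"{total:.3f} GHz-days:", lines)
```

With a real credit formula plugged into `tf_ghzdays`, the same greedy loop reproduces the kind of per-worker totals shown in the quote.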

Last fiddled with by kriesel on 2019-01-09 at 16:49

2019-01-09, 16:34   #13
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·491 Posts

Quote:
 Originally Posted by chalsall Actually, the "GPUto72 individual goal bit levels" phrase is giving credit where it's not due. GPU72's targets are guided by James' "economic cross-over" analysis, which has been peer reviewed by many very knowledgeable people. The exact "optimal" TF'ing depth is a function of the range (candidate size) and the particular card's abilities (specifically, the "compute version"). For example, a RTX 2080 Ti (c.v. 7.5) should TF deeper than a GTX 580 (c.v. 2.0). Please keep in mind that James' analysis is based on comparing what will "clear" a candidate faster (using statistical heuristics) ***using the same kit*** running either mfaktc vs a CUDA LL'er. Note that some TF (slightly) beyond the optimal economic cross-over point because they just like finding factors, or can't be bothered to switch between the different software.
I go by exponent lookup at James' site generally. It takes the which-gpu variable out of it, and that's where I adopted the GPU[to]72 naming from.

My impression is that the difference between the Primenet TF target and the GPUto72 target there is 4 bits. For example, https://www.mersenne.ca/exponent/453000013 73-69=4, or https://www.mersenne.ca/exponent/53000039 82-78=4, or https://www.mersenne.ca/exponent/53000039 85-81=4. It's about 3 bits per exponent doubling (less than 3 at high values).
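The roughly-3-bits-per-doubling observation falls out of a toy cost model: primality-test cost grows on the order of p², while TF cost at bit level b grows like 2^b / p, so equating the two gives 2^b ∝ p³, i.e. b rises by 3 when p doubles. A quick check (the constant k and the exact model are illustrative assumptions, not James' analysis):

```python
import math

def crossover_bits(p, k=1.0):
    # Toy model: TF cost at depth b ~ k * 2**b / p; test cost ~ p**2.
    # Equating gives 2**b ~ p**3 / k, so b = log2(p**3 / k).
    return math.log2(p**3 / k)

b1 = crossover_bits(53_000_039)
b2 = crossover_bits(106_000_078)   # doubled exponent
print(b2 - b1)                     # exactly 3 bits per doubling in this model
```

The "less than 3 at high values" correction comes from terms this toy model ignores (the log factor in FFT cost, memory effects, and so on).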

Since the GPUto72 project is apparently successfully keeping cpus off TF, and doing little P-1, and there's active interest in P-1 and primality testing on NVIDIA and AMD, and I'm running a lot of P-1 tests in CUDAPm1, going to the higher GPUto72 TF levels seems to me to make sense.

The distinction that the optimal bit level might shift depending on _which_ gpu model is concerned is a useful one. I haven't wrestled sufficiently with the question of where an optimal lies or what optimal means or how many dimensions an optimal-description may have, when considering multiple work types on multiple models of gpus and cpus. I have the impression the GIMPS community "jury is still out on that one". The practical difference between the simpler single TF level expressed as GPU72 level on James' excellent useful site and the ideal optimal for any combination of cpu or gpu0 primality test, gpu1 P-1, gpu2 TF is probably small in percentage throughput terms.

For owners of gpus that are in some way not typical, such as RTX20xx or Intel igps where the TF/LL throughput ratio is significantly higher, or the really old NVIDIA cards where the SP/DP ratio is significantly lower, relying on the gpu-model-specific curves is likely more important.

Last fiddled with by kriesel on 2019-01-09 at 16:38

2019-01-09, 18:05   #14
Mark Rose

"/X\(‘-‘)/X\"
Jan 2013
Ͳօɾօղէօ

2²×5×139 Posts

Quote:
 Originally Posted by kriesel The distinction that the optimal bit level might shift depending on _which_ gpu model is concerned is a useful one. I haven't wrestled sufficiently with the question of where an optimal lies or what optimal means or how many dimensions an optimal-description may have, when considering multiple work types on multiple models of gpus and cpus. I have the impression the GIMPS community "jury is still out on that one". The practical difference between the simpler single TF level expressed as GPU72 level on James' excellent useful site and the ideal optimal for any combination of cpu or gpu0 primality test, gpu1 P-1, gpu2 TF is probably small in percentage throughput terms. For owners of gpus that are in some way not typical, such as RTX20xx or Intel igps where the TF/LL throughput ratio is significantly higher, or the really old NVIDIA cards where the SP/DP ratio is significantly lower, relying on the gpu-model-specific curves is likely more important.
RTX20xx will soon be very typical. But you are right in that the ratios are significantly different. For a 90M exponent, the crossover for my GTX 580s is 76 bits, for my GTX 1070s it's 77 bits, and for a RTX 20xx it's 78 bits.

In reality, I'm too lazy to switch software. So I TF a little higher than optimal on the 580s.
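Per-card crossovers like these can be estimated with the standard GIMPS heuristic: take one more TF bit level only while the expected testing time saved exceeds the time to run that level, where a factor in [2^b, 2^(b+1)] turns up with probability about 1/(b+1) and finding one saves two primality tests. A sketch with made-up throughput numbers (the `tf_rate` unit and `ll_ghzdays` values are assumptions, not benchmarks of any real card):

```python
def deepest_worthwhile_bit(p, tf_rate, ll_ghzdays, start_bit=70, tests=2):
    """Return the last TF bit level worth running under the classic heuristic.

    tf_rate:    card throughput in (2**b / p) work units per GHz-day (toy unit).
    ll_ghzdays: cost of one primality test on the same card.
    """
    b = start_bit
    while True:
        level_cost = (2**(b + 1) / p) / tf_rate          # cost of bit level b+1
        expected_saving = tests * ll_ghzdays / (b + 1)   # ~1/(b+1) factor chance
        if level_cost >= expected_saving:
            return b
        b += 1

# Faster TF throughput (e.g. a higher compute version) pushes the crossover deeper:
for rate in (1.0e14, 8.0e14):
    print(deepest_worthwhile_bit(90_000_000, tf_rate=rate, ll_ghzdays=350))
```

An 8x TF-throughput advantage shifts the crossover by 3 bits in this model (each level costs twice the last), which is the same direction as the 76/77/78 spread quoted for the GTX 580, GTX 1070, and RTX 20xx.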

2019-01-09, 18:11   #15
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·491 Posts

Quote:
 Originally Posted by Mark Rose RTX20xx will soon be very typical.
I used "typical" in the sense of whether this gpu model is like or unlike other models.
I agree that they will become popular / common, as measured in units sold over time and percentage of the deployed fleet.

There's no need to apologize for or defend time management.
Thanks for the 90M point of reference for bit levels on the various models.

Last fiddled with by kriesel on 2019-01-09 at 18:13

2019-01-09, 22:41   #16
chalsall
If I May

"Chris Halsall"
Sep 2002

8829₁₀ Posts

Quote:
 Originally Posted by kriesel I haven't wrestled sufficiently with the question of where an optimal lies or what optimal means or how many dimensions an optimal-description may have, when considering multiple work types on multiple models of gpus and cpus.
There is something known as "analysis paralysis".

Those who have been in the "game" a while know that making a decision and then moving forward is often better than over-thinking, and never moving. Yes, mistakes might be made, but one tends to learn from mistakes....

2019-01-09, 23:40   #17
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·491 Posts

Quote:
 Originally Posted by chalsall There is something known as "analysis paralysis". Those who have been in the "game" a while know that making a decision and then moving forward is often better than over-thinking, and never moving. Yes, mistakes might be made, but one tends to learn from mistakes....
That applies to the game of life in general.
Plan, do, observe, adjust. DO is essential.
Henry Ford didn't build an optimal automobile from the start; he fed an engine gas by teaspoon initially.

2019-01-10, 04:30   #18
potonono

Jun 2005
USA, IL

193 Posts

I had grabbed all the 152 bin candidates. I'll add individual exponents with more variety as those finish over the next week-ish.
2019-01-16, 03:46   #19
potonono

Jun 2005
USA, IL

193 Posts

I've grabbed a couple more exponents, but I'm curious about your expectations for some that the Primenet server won't hand out. An exponent like 100,000,471 is only trial factored up to 73 bits, but already has P-1 and two matching LL tests done. I assume that's why it can no longer be reserved for more work through the manual gpu assignment page. Is that an exponent you would still want taken up to 76 bits, or not, since it's already confirmed composite?

Edit: never mind, I see you indicated "without P-1 result or primality result" in the original request.

Last fiddled with by potonono on 2019-01-16 at 03:51
2019-02-03, 15:19   #20
AJ Alon

Oct 2018

2²×3 Posts

I'll donate some time as I'm finishing up a round of GPU72 DCTF. I've reserved the following from the 121 bin. Let me know if this makes sense.
Code:
Factor=N/A,121100117,72,77
Factor=N/A,121100171,72,77
Factor=N/A,121100219,72,77
Factor=N/A,121100233,72,77
Factor=N/A,121100269,72,77
Factor=N/A,121100351,72,77
Factor=N/A,121100383,72,77
Factor=N/A,121100407,72,77
Factor=N/A,121100411,72,77
Also, possibly not relevant, but I did recently take M421000049 to 82 bits (just exploring the much higher bit ranges...)
2019-02-08, 20:10   #21
AJ Alon

Oct 2018

2²·3 Posts

Reserved a few exponents in the 123 bin.
Code:
123449987,72,77
123449939,72,77
123449917,72,77
123449819,72,77
123449791,72,77
123449743,72,77
123449737,72,77
123449663,72,77
123449611,72,77
123449591,72,77
2019-03-07, 05:35   #22
AJ Alon

Oct 2018

12₁₀ Posts

Just reserved these in the 353 bin. Think I'll be good for a while in TF.
Code:
353000059,72,81
353000177,72,81
353000071,72,81

