mersenneforum.org > Great Internet Mersenne Prime Search > Data
Old 2019-01-09, 16:03   #12
kriesel

Quote: Originally Posted by Uncwilly
I suggest that you use James H's worktodo.txt balancer. Try to make each chunk posted as close as possible to the same GHz-days. Here is what it looks like as balanced as it can be:
Code:
[Worker #1]
Factor=353000101,72,81
Factor=453000013,78,82
Factor=453000029,71,82

[Worker #2]
Factor=153000349,75,77
Factor=253000937,76,79
Factor=203000117,73,79
Factor=303000227,70,80
Factor=403000069,71,81
Factor=353000047,77,81

[Worker #3]
Factor=153000277,75,77
Factor=253000079,74,79
Factor=203000101,73,79
Factor=303000119,70,80
Factor=403000067,71,81
This breaks down to:
• Worker #1 = 5,572.802 GHz-days
• Worker #2 = 4,489.192 GHz-days
• Worker #3 = 3,233.924 GHz-days

No one has to take a whole chunk; you can just reprocess what's left for the next batch.
How I had it ordered was in time order, assuming one gpu and one worker instance. (Or LaurV takes the front part and someone else does the rest later.) The first (lower) few are fast enough on a current-model gpu that it does not much matter what order they occur in. After that, there's benefit in one exponent per limit-exploration bin.

That ordering allows for considerable pipelining: a partial report of the TF lets P-1 testing get started, and then P-1 and TF run mostly in parallel, with a bit of stagger. Putting the hard, slow, high exponents first in the TF does not make sense to me: at those exponents the P-1 may fail already at some lower level, and so the TF becomes unneeded for CUDAPm1 testing. (They could still be run eventually on prime95, or on a gpu model that reaches a little higher than others in CUDAPm1; the high-water mark so far in CUDAPm1 v0.20 testing is on a GTX 1060 3GB, 431M < pmax < 432.6M, and 432.2M is looking promising with a little over a day to go.)
I forgot to mention: I didn't find good candidates at 103, 123, 133, or 143 in the first 1000 spans. But any assignment in the neighborhood would do. And if they are already TF'd to target, there's no TF to delegate before doing a P-1.
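
For anyone who wants to experiment with the balancing themselves, here's a minimal sketch of the greedy approach such a balancer can take. The cost model is an assumption (TF effort for one bit level b on exponent p taken as 2^b/p, in arbitrary units), not James' actual GHz-days formula, and the function names are mine:
Code:
# Minimal greedy (LPT) balancer sketch. Assumed cost model: one TF bit
# level b on exponent p costs ~2**b / p (arbitrary units, NOT the
# official GHz-days credit formula).
import heapq

def tf_cost(p, b1, b2):
    # Sum the assumed per-bit-level costs for levels b1+1 through b2.
    return sum(2.0 ** b / p for b in range(b1 + 1, b2 + 1))

def balance(lines, n_workers=3):
    # Parse "Factor=exponent,bit_from,bit_to" lines and cost them.
    jobs = sorted(((tf_cost(*map(int, l.split("=")[1].split(","))), l)
                   for l in lines), reverse=True)   # biggest jobs first
    workers = [(0.0, i, []) for i in range(n_workers)]
    heapq.heapify(workers)
    for cost, line in jobs:
        load, i, bucket = heapq.heappop(workers)    # lightest worker so far
        bucket.append(line)
        heapq.heappush(workers, (load + cost, i, bucket))
    return sorted(workers, key=lambda w: w[1])

entries = ["Factor=353000101,72,81", "Factor=453000013,78,82",
           "Factor=453000029,71,82", "Factor=153000349,75,77",
           "Factor=303000227,70,80", "Factor=403000069,71,81"]
for load, i, bucket in balance(entries):
    print(f"[Worker #{i + 1}]")
    for line in bucket:
        print(line)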

Old 2019-01-09, 16:34   #13
kriesel

Quote: Originally Posted by chalsall
Actually, the "GPUto72 individual goal bit levels" phrase is giving credit where it's not due.

GPU72's targets are guided by James' "economic cross-over" analysis, which has been peer reviewed by many very knowledgeable people.

The exact "optimal" TF'ing depth is a function of the range (candidate size) and the particular card's abilities (specifically, the "compute version"). For example, a RTX 2080 Ti (c.v. 7.5) should TF deeper than a GTX 580 (c.v. 2.0).

Please keep in mind that James' analysis is based on comparing what will "clear" a candidate faster (using statistical heuristics) ***using the same kit*** running either mfaktc vs a CUDA LL'er. Note that some TF (slightly) beyond the optimal economic cross-over point because they just like finding factors, or can't be bothered to switch between the different software.
I generally go by exponent lookup at James' site. It takes the which-gpu variable out of it, and that's where I adopted the GPU[to]72 naming from.

My impression is that the difference between the PrimeNet TF target and the GPUto72 target there is 4 bits. For example, https://www.mersenne.ca/exponent/453000013 (73-69=4), https://www.mersenne.ca/exponent/53000039 (82-78=4), or https://www.mersenne.ca/exponent/53000039 (85-81=4). The target rises by about 3 bits per exponent doubling (less than 3 at high values), roughly consistent with the cost of a TF bit level halving per doubling of the exponent while the cost of a primality test roughly quadruples.

Since the GPUto72 project is apparently succeeding at keeping cpus off TF while doing little P-1, and since there's active interest in P-1 and primality testing on NVIDIA and AMD and I'm running a lot of P-1 tests in CUDAPm1, going to the higher GPUto72 TF levels makes sense to me.

The point that the optimal bit level might shift depending on which gpu model is involved is a useful one. I haven't wrestled sufficiently with where the optimum lies, what optimal means, or how many dimensions an optimum description may have when considering multiple work types on multiple models of gpus and cpus. My impression is that the GIMPS community jury is still out on that one. The practical difference between the simpler single TF level expressed as the GPU72 level on James' excellent, useful site and the ideal optimum for any combination of cpu or gpu0 primality test, gpu1 P-1, and gpu2 TF is probably small in percentage-throughput terms.

For owners of gpus that are in some way not typical, such as RTX20xx or Intel igps where the TF/LL throughput ratio is significantly higher, or the really old NVIDIA cards where the SP/DP ratio is significantly lower, relying on the gpu-model-specific curves is likely more important.
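
To make the question concrete, here is a toy model of the economic cross-over, not James' actual analysis: assume the chance of a factor in bit level b is about 1/b, that a primality test costs on the order of p^2*log2(p), and that TF of one bit level costs 2^b/p, with k standing in for a particular card's TF-to-test throughput ratio. All constants are made up; only the shape of the trade-off is the point.
Code:
# Toy economic cross-over model (illustrative only; constants made up).
# Keep TF'ing one more bit level while the expected primality-test work
# saved (factor probability ~1/b, times two tests) exceeds the TF work.
from math import log2

def crossover_bits(p, k=3.0):
    test_cost = p * p * log2(p)       # one LL/PRP test, arbitrary units
    b = 60
    while (1.0 / b) * 2.0 * test_cost > k * 2.0 ** b / p:
        b += 1
    return b - 1                      # last bit level that still paid off

for p in (53_000_039, 90_000_001, 453_000_013):
    print(p, "->", crossover_bits(p), "bits")
With these made-up constants the model lands in the same neighborhood as the real-card numbers discussed in this thread, and it rises by roughly 3 bits per doubling of the exponent.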

Old 2019-01-09, 18:05   #14
Mark Rose

Quote: Originally Posted by kriesel
The point that the optimal bit level might shift depending on which gpu model is involved is a useful one. I haven't wrestled sufficiently with where the optimum lies, what optimal means, or how many dimensions an optimum description may have when considering multiple work types on multiple models of gpus and cpus. My impression is that the GIMPS community jury is still out on that one. The practical difference between the simpler single TF level expressed as the GPU72 level on James' excellent, useful site and the ideal optimum for any combination of cpu or gpu0 primality test, gpu1 P-1, and gpu2 TF is probably small in percentage-throughput terms.

For owners of gpus that are in some way not typical, such as RTX20xx or Intel igps where the TF/LL throughput ratio is significantly higher, or the really old NVIDIA cards where the SP/DP ratio is significantly lower, relying on the gpu-model-specific curves is likely more important.
RTX20xx will soon be very typical. But you are right that the ratios are significantly different. For a 90M exponent, the crossover for my GTX 580s is 76 bits, for my GTX 1070s it's 77 bits, and for an RTX 20xx it's 78 bits.

In reality, I'm too lazy to switch software. So I TF a little higher than optimal on the 580s.
Old 2019-01-09, 18:11   #15
kriesel

Quote: Originally Posted by Mark Rose
RTX20xx will soon be very typical.
I used "typical" in the sense of whether a gpu model is like or unlike other models.
I agree that they will become popular and common, as measured in units sold over time and percentage of the deployed fleet.

There's no need to apologize for or defend time management.
Thanks for the 90M point of reference for bit levels on the various models.

Old 2019-01-09, 22:41   #16
chalsall

Quote: Originally Posted by kriesel
I haven't wrestled sufficiently with where the optimum lies, what optimal means, or how many dimensions an optimum description may have when considering multiple work types on multiple models of gpus and cpus.
There is something known as "analysis paralysis".

Those who have been in the "game" a while know that making a decision and then moving forward is often better than over-thinking, and never moving. Yes, mistakes might be made, but one tends to learn from mistakes....
Old 2019-01-09, 23:40   #17
kriesel

Quote: Originally Posted by chalsall
There is something known as "analysis paralysis".

Those who have been in the "game" a while know that making a decision and then moving forward is often better than over-thinking, and never moving. Yes, mistakes might be made, but one tends to learn from mistakes....
That applies to the game of life in general.
Plan, do, observe, adjust. DO is essential.
Henry Ford didn't build an optimal automobile from the start; initially he fed gas to an engine by the teaspoonful.
Old 2019-01-10, 04:30   #18
potonono

I had grabbed all the 152 bin candidates. I'll add individual exponents with more variety as those finish over the next week-ish.
Old 2019-01-16, 03:46   #19
potonono

I've grabbed a couple more exponents, but I'm curious about your expectations for some that the PrimeNet server won't hand out. An exponent like 100,000,471 is only trial factored up to 73 bits, but already has P-1 and two matching LL tests done. I assume that's why it can no longer be reserved for more work through the manual gpu assignment page. Is that an exponent you would still want taken up to 76 bits, or not, since it's already confirmed composite?

Edit: never mind; I see you indicated "without P-1 result or primality result" in the original request.

Old 2019-02-03, 15:19   #20
AJ Alon

I'll donate some time as I'm finishing up a round of GPU72 DCTF. I've reserved the following from the 121 bin. Let me know if this makes sense.

Factor=N/A,121100117,72,77
Factor=N/A,121100171,72,77
Factor=N/A,121100219,72,77
Factor=N/A,121100233,72,77
Factor=N/A,121100269,72,77
Factor=N/A,121100351,72,77
Factor=N/A,121100383,72,77
Factor=N/A,121100407,72,77
Factor=N/A,121100411,72,77

Also, possibly not relevant, but I did recently take M421000049 to 82 bits (just exploring the much higher bit ranges...)
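
If it helps anyone else reserving a bin, lines like the above are mechanical to produce. A small sketch (the helper name is mine; it just assumes the Factor=N/A,exponent,bit_from,bit_to form used above):
Code:
# Emit worktodo lines in the Factor=N/A,exponent,bit_from,bit_to form
# used above, for a list of reserved exponents. Helper name is made up.
def worktodo_lines(exponents, bit_from=72, bit_to=77):
    return [f"Factor=N/A,{p},{bit_from},{bit_to}" for p in sorted(exponents)]

for line in worktodo_lines([121100117, 121100171, 121100219, 121100233]):
    print(line)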
Old 2019-02-08, 20:10   #21
AJ Alon

Reserved a few exponents in the 123 bin.
Code:
Factor=123449987,72,77
Factor=123449939,72,77
Factor=123449917,72,77
Factor=123449819,72,77
Factor=123449791,72,77
Factor=123449743,72,77
Factor=123449737,72,77
Factor=123449663,72,77
Factor=123449611,72,77
Factor=123449591,72,77
Old 2019-03-07, 05:35   #22
AJ Alon

Just reserved these in the 353 bin. Think I'll be good for a while in TF.
Code:
Factor=353000059,72,81
Factor=353000177,72,81
Factor=353000071,72,81