mersenneforum.org  

Old 2019-01-08, 02:51   #1
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3414₁₀ Posts
Deep dive TF

Hi, I'm looking for a few intrepid volunteers, or more than a few, to take some scattered but strategically placed exponents up to, or at least toward, the full GPUto72 individual goal bit levels. The exponents should be well distributed from 100 million to the mersenne.org upper limit of a billion. You can think of this as creating full-TF islands above the prevailing water line of bulk TF effort.

The main purpose is to provide some already TF-complete candidates for P-1 factoring software testing. Whoever does the TF gets the computing credit and credit for any factors found along the way. Reserving exponents is highly recommended, as is reasonably prompt completion.

Consider the exponents from 100 million to 101 million as a bin. Strategic TF would focus on the first 1000, that is, 100,000,000 to 100,001,000, and aim to complete TF on at least two exponents that have no P-1 or primality test done and no factor found by TF, so they still need P-1 and perhaps primality testing. Keeping them at a known location within the bin makes them easy to find via https://www.mersenne.org/manual_gpu_assignment/ or https://www.mersenne.org/report_exponent/
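
For anyone who wants to list the possible island exponents at the front of a bin before looking them up, here is a minimal sketch. It only relies on the fact that Mersenne exponents must be prime; the function names `is_prime` and `island_candidates` are made up for this example, and the script knows nothing about current TF, P-1 or primality status, so every candidate still has to be checked on mersenne.org.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Deterministic trial division; quick enough for exponents up to a billion."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def island_candidates(bin_millions: int, width: int = 1000) -> list[int]:
    """Prime exponents in the first `width` numbers of the bin at bin_millions * 10**6."""
    start = bin_millions * 1_000_000
    return [p for p in range(start, start + width + 1) if is_prime(p)]


if __name__ == "__main__":
    # Bin 100 means 100,000,000 to 100,001,000, as described above.
    for p in island_candidates(100):
        print(p)
```

Changing the argument to 153, 203 and so on gives the front of the other bins.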

It would be good to have multiple closely spaced bins, each containing an island of two or more full-depth TF exponents without a P-1 result or primality result. Having multiple closely spaced bins allows using one set of well-spaced bins to test CUDAPm1, another well-spaced set for prime95, another for gpuOwL PRP-1 or any future Preda P-1, etc. Also, occasionally software has an issue in a small range of exponents but is ok on either side of the trouble spot. (I've seen CUDAPm1 have trouble with one gpu but not another, even sometimes the same model, at 84M, 128M, and 171M.) Having more than one fully TF completed exponent per island is insurance against finding a stage 1 factor and so being unable to test stage 2, and the extra one could act as a spare in case a nearby island is a trouble spot for one of the applications. At the same time, staggering the bins a bit between applications or versions means a slightly wider distribution of exponents tested.
For example, and from now on giving bin identifications as millions (for example 100 instead of 100 million):
P95  owl  CUDAPm1 (left, right)
100  101  102  103
120  121  122  123
150  151  152  153
200  201  202  203
250  251  252  253
300  301  302  303
350  351  352  353
400  401  402  403
500  501  452  453
600  601
700  701
800  801
900  901

After running CUDAPm1 on a given gpu model on several widely spaced exponents (usually chosen about 50M or 100M apart so they plot nicely), I often find, as in CUDAPm1 v0.20, that some exponents cannot be run to completion on a given gpu or on any gpu. Then I start a binary search to see what the limits are. More closely spaced islands would be useful for that later. When I need to TF-qualify the exponents myself, it really slows down the testing of P-1 limits, since I'm using the same gpus for both.
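
The binary search for a limit can be sketched roughly as follows. This is an illustration only: `runs_ok` is a hypothetical stand-in for "a P-1 run at this exponent completed successfully", and the sketch assumes a single pass/fail boundary between the two starting points, which isolated trouble spots would break. In practice each midpoint would also have to be moved to a nearby prime exponent that is already TF complete.

```python
from typing import Callable


def highest_passing(lo: int, hi: int, runs_ok: Callable[[int], bool]) -> int:
    """Largest value for which runs_ok is True.

    Assumes runs_ok(lo) is True, runs_ok(hi) is False, and that there is
    exactly one pass/fail boundary between them.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if runs_ok(mid):
            lo = mid  # mid worked, so the limit is at or above mid
        else:
            hi = mid  # mid failed, so the limit is below mid
    return lo


if __name__ == "__main__":
    # Made-up limit, purely to exercise the search.
    pretend_limit = 432_500_000
    found = highest_passing(100_000_000, 1_000_000_000, lambda p: p <= pretend_limit)
    print(found)
```

Each halving needs one real run, so going from a 50M gap down to about 1M takes roughly six runs, which is why having islands already in place matters.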
I'm getting ready to start the testing and limit mapping of several gpu models on CUDAPm1 v0.22, and have started testing in prime95. Any helpful TF island building would be appreciated. The end result is tabulation of run times, plotting of run time scaling, and documentation of limits, NRP trends, software issues encountered, etc., as in https://www.mersenneforum.org/showthread.php?t=23389

Users like mikr and rudimeier have already done some of this deeper TF at the front of a million bin. That is very useful to me when prequalifying a few high exponents for P-1 software testing on gpu, since otherwise I have to finish them to gputo72 factoring goal levels myself. Thank you to the pioneers who have already done some of this, for example to or near primenet goal bit levels several years ago.

The higher ones will represent a considerable amount of total work per exponent. A few examples of computing effort per exponent to full GPUto72 TF depth:
https://www.mersenne.ca/exponent/101000117 114 GHz-days to go to full gputo72 bit level (76)
https://www.mersenne.ca/exponent/171000043 346 GHz-days to go to full gputo72 bit level (78)
https://www.mersenne.ca/exponent/371000039 1.2 THz-days to go to full gputo72 bit level (81)
https://www.mersenne.ca/exponent/919000001 8.4 THz-days to go to full gputo72 bit level (85)
https://www.mersenne.ca/exponent/999000061 15.6 THz-days to go to full gputo72 bit level (86)
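
To get a feel for where numbers like these come from, here is a rough estimator. It assumes the commonly quoted TF credit approximation of 0.016968 × 2^(b−48) × 1680 / p GHz-days for the bit level ending at b. Those constants are not stated anywhere in this thread, so treat them as an assumption, and only for bit levels in the range discussed here. The starting level of 74 bits in the demo is also an assumption, picked because it lands near the 114 GHz-days quoted for 101000117.

```python
def tf_ghz_days(exponent: int, bit_from: int, bit_to: int) -> float:
    """Estimated GHz-days to trial factor M(exponent) from 2^bit_from to 2^bit_to.

    Uses an assumed credit approximation; each bit level costs about twice
    the previous one, and cost falls in proportion to the exponent.
    """
    if bit_from > bit_to:
        raise ValueError("bit_from must not exceed bit_to")
    return sum(
        0.016968 * 1680 * 2 ** (b - 48) / exponent
        for b in range(bit_from + 1, bit_to + 1)
    )


if __name__ == "__main__":
    # Assumed starting level 74; gives about 113.6, close to the 114 quoted above.
    print(f"{tf_ghz_days(101000117, 74, 76):.1f} GHz-days")
    # The last bit level alone is about half of everything up to the goal.
    print(f"{tf_ghz_days(101000117, 75, 76):.1f} GHz-days for 75 to 76 only")
```

Because each level doubles the work, the final bit of any of these assignments costs about as much as all the earlier bits put together.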
Old 2019-01-08, 04:26   #2
potonono
 
Jun 2005
USA, IL

301₈ Posts

I can volunteer some TF, but I'm not sure about what bit levels any particular range should be taken to. Are the 'full GPUto72 individual goal bit levels' posted somewhere?
Old 2019-01-08, 05:19   #3
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

17×251 Posts

Quote:
Originally Posted by potonono View Post
I can volunteer some TF, but I'm not sure about what bit levels any particular range should be taken to. Are the 'full GPUto72 individual goal bit levels' posted somewhere?
I understand it to be the yellow boundary line here:
https://www.mersenne.ca/status/tf/0/0/1/0

Click on any line to drill down for finer limits.
Old 2019-01-08, 05:25   #4
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2³×3×17×19 Posts

James H had posted a chart a while back (based upon performance data). In addition, Chris mentioned that GPUs should do about 3 bits deeper than Prime95's default.
See this post of mine and James' response:
https://mersenneforum.org/showthread.php?p=389094
and
https://mersenneforum.org/showthread.php?p=490542
Old 2019-01-08, 05:50   #5
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×3×569 Posts

Quote:
Originally Posted by potonono View Post
I can volunteer some TF, but I'm not sure about what bit levels any particular range should be taken to. Are the 'full GPUto72 individual goal bit levels' posted somewhere?
Super! What I usually go by is individual lookups, since it gives both start and end level for the particular exponent, as well as whether LL, PRP, or P-1 have been run or assigned yet, for example:

https://www.mersenne.ca/exponent/101000117 after

https://www.mersenne.org/report_expo...1001000&full=1
Be careful about relying on mersenne.ca for current status, since it lags a bit until it syncs overnight from mersenne.org.

If you're just after the TF level to go up to, going by the red curve for first LL on charts like https://www.mersenne.ca/cudalucas.ph...=100&mmax=1000 is not bad. You can get to those by clicking on any gpu in the list at https://www.mersenne.ca/cudalucas.php, and the low and high exponent limits are 50M to 300M by default but can be adjusted as shown in the URL above.

Or, I suppose I could add a target TF column. Lots of choices.

TFH  P95  owl  CUDAPm1 (left, right)
76   100  101  102  103
77   120  121  122  123
77   150  151  152  153
79   200  201  202  203
79   250  251  252  253
80   300  301  302  303
81   350  351  352  353
81   400  401  402  403
82   500  501  452  453
83   600  601
84   700  701
85   800  801
85   900  901

Last fiddled with by kriesel on 2019-01-08 at 06:29
Old 2019-01-09, 04:27   #6
potonono
 
Jun 2005
USA, IL

193 Posts

Thanks for the links and list everyone. Yes, that makes sense.

Will it help your efforts more to work on any specific bins first, like smallest to largest, or just anything as available?
Old 2019-01-09, 05:27   #7
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

20262₈ Posts

Make a worktodo file (list of exponents with bitlevels, which I can summarily edit and paste to my rig) and pass it to me.
Old 2019-01-09, 11:58   #8
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×3×569 Posts

Quote:
Originally Posted by potonono View Post
Thanks for the links and list everyone. Yes, that makes sense.

Will it help your efforts more to work on any specific bins first, like smallest to largest, or just anything as available?
I suggest picking a column or portion of a column, describing it unambiguously in a post here so others can choose differently, and doing one exponent per bin, smallest first, then repeating for the second exponent per bin.
I usually go from small to large, in the first part of testing, because it gives a quick feel for scaling and more rapidly and efficiently explores limits.
TF in the same order seems like it would work well along with that.
Examples of description: "entire left cudapm1 column"; "gpu column up to 401"; "p95 column 400 to 900"; an actual exponent list would work too.
I am running testing on different applications on different gear in parallel: prime95 on cpus, gpuowl on AMD gpus, CUDAPm1 on NVIDIA gpus. So there is no particular priority between columns.
Old 2019-01-09, 12:26   #9
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

D56₁₆ Posts

Quote:
Originally Posted by LaurV View Post
Make a worktodo file (list of exponents with bitlevels, which I can summarily edit and paste to my rig) and pass it to me.
How about these, for the CUDAPm1 right column:
Factor=153000277,75,77
Factor=153000349,75,77
Factor=203000101,73,79
Factor=203000117,73,79

Factor=253000937,76,79
Factor=303000119,70,80
Factor=353000047,77,81
Factor=403000067,71,81
Factor=453000013,78,82

Factor=253000079,74,79
Factor=303000227,70,80
Factor=353000101,72,81
Factor=403000069,71,81
Factor=453000029,71,82
(Please reserve them to avoid duplications especially at 70 or 71 bit starting points)

Last fiddled with by kriesel on 2019-01-09 at 12:27
Old 2019-01-09, 15:49   #10
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2²×3²×5×7² Posts

Quote:
Originally Posted by Uncwilly View Post
James H had posted a chart a while back (based upon performance data). In addition, Chris mentioned that GPUs should do about 3 bits deeper than Prime95's default.
Actually, the "GPUto72 individual goal bit levels" phrase is giving credit where it's not due.

GPU72's targets are guided by James' "economic cross-over" analysis, which has been peer reviewed by many very knowledgeable people.

The exact "optimal" TF'ing depth is a function of the range (candidate size) and the particular card's abilities (specifically, the "compute version"). For example, an RTX 2080 Ti (c.v. 7.5) should TF deeper than a GTX 580 (c.v. 2.0).

Please keep in mind that James' analysis is based on comparing what will "clear" a candidate faster (using statistical heuristics) ***using the same kit*** running either mfaktc or a CUDA LL'er. Note that some people TF (slightly) beyond the optimal economic cross-over point because they just like finding factors, or can't be bothered to switch between the different software.
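
As a rough illustration of the cross-over idea (not James' actual analysis), the comparison can be sketched like this. It assumes the usual GIMPS rule of thumb that the chance of a factor between 2^X and 2^(X+1) is about 1/X, and that a factor saves two primality tests. The function name, the parameters and the hours in the demo are all made up, and both timings are meant to be measured on the same card.

```python
def worth_one_more_bit(bit_from: int, tf_hours: float, test_hours: float,
                       tests_saved: float = 2.0) -> bool:
    """True if TF from 2^bit_from to 2^(bit_from+1) is expected to pay off.

    tf_hours:    time this card needs for that one bit level (hypothetical input)
    test_hours:  time the same card needs for one primality test (hypothetical input)
    tests_saved: tests avoided if a factor is found (assumed 2: first test plus double check)
    """
    p_factor = 1.0 / bit_from  # heuristic chance of a factor in this bit level
    expected_hours_saved = p_factor * tests_saved * test_hours
    return tf_hours < expected_hours_saved


if __name__ == "__main__":
    # Made-up timings: 10 h for the bit level versus 1000 h per primality test.
    print(worth_one_more_bit(75, tf_hours=10.0, test_hours=1000.0))
    # Same test time, but the bit level now takes 30 h.
    print(worth_one_more_bit(75, tf_hours=30.0, test_hours=1000.0))
```

Since each extra bit doubles `tf_hours` while the expected saving barely changes, the answer flips from yes to no within a bit or two, which is why the optimal depth is fairly sharp for a given card and range. The sketch ignores that P-1 might have found the same factor.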
Old 2019-01-09, 15:53   #11
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

17110₈ Posts

I suggest that you use James H's worktodo.txt balancer. Try to make each chunk posted as close as possible to the same GHz-days. Here is what it looks like as balanced as it can be:
Code:
[Worker #1]
Factor=353000101,72,81
Factor=453000013,78,82
Factor=453000029,71,82

[Worker #2]
Factor=153000349,75,77
Factor=253000937,76,79
Factor=203000117,73,79
Factor=303000227,70,80
Factor=403000069,71,81
Factor=353000047,77,81

[Worker #3]
Factor=153000277,75,77
Factor=253000079,74,79
Factor=203000101,73,79
Factor=303000119,70,80
Factor=403000067,71,81
This breaks down to:
• Worker #1 = 5,572.802 GHz-days
• Worker #2 = 4,489.192 GHz-days
• Worker #3 = 3,233.924 GHz-days

No one has to buy a whole thing, you can just reprocess it for the next batch.
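
For anyone who wants to check a split like this themselves, here is a small sketch that parses the worktodo text above and totals the estimated effort per worker. It is not James H's balancer. The cost formula (0.016968 × 2^(b−48) × 1680 / p GHz-days per bit level) is an assumed approximation that is not stated in this thread, although it does reproduce the Worker #1 and Worker #3 totals above to within rounding. The names `tf_ghz_days` and `totals_per_worker` are made up for the example.

```python
from collections import defaultdict

WORKTODO = """\
[Worker #1]
Factor=353000101,72,81
Factor=453000013,78,82
Factor=453000029,71,82

[Worker #2]
Factor=153000349,75,77
Factor=253000937,76,79
Factor=203000117,73,79
Factor=303000227,70,80
Factor=403000069,71,81
Factor=353000047,77,81

[Worker #3]
Factor=153000277,75,77
Factor=253000079,74,79
Factor=203000101,73,79
Factor=303000119,70,80
Factor=403000067,71,81
"""


def tf_ghz_days(exponent: int, bit_from: int, bit_to: int) -> float:
    """Assumed credit approximation, summed over each bit level in the range."""
    return sum(
        0.016968 * 1680 * 2 ** (b - 48) / exponent
        for b in range(bit_from + 1, bit_to + 1)
    )


def totals_per_worker(text: str) -> dict[str, float]:
    """Sum estimated GHz-days of the Factor= lines under each [Worker #n] header."""
    totals: dict[str, float] = defaultdict(float)
    worker = "unassigned"  # used only if lines appear before any header
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            worker = line[1:-1]
        elif line.startswith("Factor="):
            exponent, bit_from, bit_to = (int(x) for x in line[len("Factor="):].split(","))
            totals[worker] += tf_ghz_days(exponent, bit_from, bit_to)
    return dict(totals)


if __name__ == "__main__":
    for name, ghz_days in totals_per_worker(WORKTODO).items():
        print(f"{name}: {ghz_days:,.3f} GHz-days")
```

The two 453M assignments to 82 bits are each over 2,000 GHz-days by this estimate, which is why three chunks cannot come out much more even than they did.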
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.