mersenneforum.org  

mersenneforum.org > Great Internet Mersenne Prime Search > PrimeNet > GPU to 72

Old 2020-01-30, 15:59   #4599
kracker
 
 
"Mr. Meeseeks"
Jan 2012
California, USA

2^3·271 Posts

Quote:
Originally Posted by chalsall
Strategic clearing out of abandoned assignments.

These are going to be recycled in mid-February, so your machines which work through the proxy (and have a reliable and predictable production rate) and mine have been clearing out as many as we can before then. Actually, next week I was going to stop giving your machines work to ensure all the assignments were completed before being recycled.
Ahh, I see. That makes sense - thanks for clearing that up!
Old 2020-01-31, 06:02   #4600
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by chalsall
Yeah... We really need to discuss this as a team, and figure out what's the best thing to do.

Basically, because of a certain individual, GIMPS LL throughput has more than tripled in the last three months (!). This is ***amazingly*** cool news! Thanks Ben!

However, this has completely messed with the goal of having all LL assignments "optionally" TF'ed, and P-1'ed (ideally "well", by P-1'ing "specialists").

Currently, it is "optimal" to TF to 77 "bits", but with the current TF'ing "firepower" we're only producing about 50 a day; to stay "steady-state" we would need to produce 900 a day!

We do have ~60,000 candidates already ready for LL assignment, but that's only going to last us two months. What I've currently got GPU72 doing is giving out work in such a way as we "chase" ahead of the Cat 2 assignments such that they are optimally TF'ed and P-1'ed. However, it won't take long until Cat 2 also gets into the 10xM ranges.
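A rough sketch of the arithmetic behind the "two months" figure. It assumes the ~60,000 ready candidates are drawn down at the 900-a-day steady-state rate while 77-bit TF adds back only about 50 a day, and that both rates stay constant. The function name and the constant-rate model are illustrative assumptions, not anything GPU72 actually runs.

```python
def runway_days(buffer_size, consumed_per_day, produced_per_day):
    """Days until a work buffer runs dry at constant rates (illustrative model)."""
    net_drain = consumed_per_day - produced_per_day
    if net_drain <= 0:
        # Production keeps up with consumption, so the buffer never empties.
        return float("inf")
    return buffer_size / net_drain


# Figures quoted in the post: ~60,000 ready, 900/day needed, ~50/day produced.
days = runway_days(60_000, 900, 50)
print(f"{days:.1f} days, about {days / 30:.1f} months")
```

Under those assumptions the buffer lasts roughly 70 days, which is in line with "only going to last us two months".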

Then, I don't know... Should we start releasing at 75, hoping to occasionally get to 76?

And/or, should we bring in work in the 10xM ranges, and start bringing them up (many are still only at 72 bits)?

I would really welcome suggestions as to what people want to see happen or think makes sense.

And, as always (but particularly now), if anyone has any GPU compute they could bring to bear, it would be much appreciated!

Thoughts?
Find a way to make more of the 95M-100M TF available to ordinary GIMPSters that are not GPU72 participants. I'm generally relegated to 100M+ and have recently been putting ~3 THzD/day into it, despite requesting lowest exponents from the manual assignments page. Concentrate the firepower close in front of the primality wavefront. Any significant TF work on exponents greater than ~1.2x the primality testing wavefront leading edge is kind of wasted, other than software testing and benchmarking.

I'm working through the following on my little fleet to help out a bit:

1) Thoroughly tuning mfaktc, upgrading to a build that supports gpusievesize up to 2047 Mib, and tuning again for that on most of my gpus. This squeezes up to 10% more out of existing gear, and almost all of them run multiple mfaktc instances in parallel for the last additional bit of throughput; they can be driven to 100% indicated gpu load in gpu-z or nvidia-smi (benchmarking results for several models were posted in the mfaktc thread). Any gpu model over ~100 GHzD/day seems to benefit a little from multiple TF instances.

2) Reactivating some older gpus I had lying idle, now that I have an open-frame 6-PCIe rig mostly up and running (Asrock H81 Pro BTC 2.0, a lowly i7-4790 with a single 8GB DIMM, also running prime95 PRPDC and only using a third of the system ram), with a nice big high-efficiency PS and ample cooling in the form of winter weather. The IGP refuses to take an OpenCL driver, and a couple of old gpus are not starting up currently.

3) Diverting a GTX1080Ti short term from another use to TF (3 instances in parallel, after the 2047-capable upgrade and serious tuning, get it to 99-100% gpu load), which takes its throughput to ~1.4 THzD/day, and shifting lesser gpus to TF also.

4) Getting ready for more incoming hardware.

5) Popping the covers off some old gpus to remove the dust/lint/felt buildup after the fan; even fixed-clock old Quadros have some sort of thermal protection, perhaps shutting down some cores. One was so clogged that I think it was interfering with the fan rotor; it had been removed from service and is now, after a cleaning, back in the fray.

6) Further development of my own multi-gpu-app management program (makes monitoring status and collecting results easier and more efficient, especially important when running 2 and 3 TF instances per gpu in a system)

This combined TF throughput is mostly just softening up the low end of the 100M bin a bit, outside of the GPU72 flow. Lately manual TF assignments direct from mersenne.org "lowest exponents" have dropped off from 75/76 bit assignments, to recently as low as 72/73. Occasionally I would get some 95M before GPU72 scooped them up again, but that hasn't happened in a while. I have some Quadro 2000s slogging through some 95M 75/76 at ~one a day each!

For the faster newer gpus, if mfaktc and mfakto were modified a bit more, to raise the max gpusievesize above 2047Mib (currently using a signed 32-bit variable to compute bit address) to perhaps 4095Mib (unsigned 32-bit), there appears to be a bit more gain yet to be had there; at least for GTX1080Ti and up. It's likely to matter more as faster gpus come out, judging by tests from a wide variety of gpu speeds. A percent here and there, times how many gpus? Probably the equivalent of adding whole gpus to the project.
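A small sketch of why 2047 Mib and 4095 Mib fall out as the natural ceilings. It assumes gpusievesize is expressed in units of 2^20 bits and that the sieve size in bits must itself fit in the integer type used for bit addressing. That is only a reading of the description above, not a statement about the actual mfaktc or mfakto source; the helper name is made up for illustration.

```python
MIB = 1 << 20            # bits per Mib (assumed unit of gpusievesize)
INT32_MAX = 2**31 - 1    # largest value of a signed 32-bit integer
UINT32_MAX = 2**32 - 1   # largest value of an unsigned 32-bit integer


def size_fits(sieve_size_mib, limit):
    """True if the sieve size, counted in bits, is representable within limit."""
    return sieve_size_mib * MIB <= limit


for size in (2047, 2048, 4095, 4096):
    print(f"{size} Mib: signed 32-bit ok = {size_fits(size, INT32_MAX)}, "
          f"unsigned 32-bit ok = {size_fits(size, UINT32_MAX)}")
```

In this model 2048 Mib is exactly 2^31 bits, one past the signed limit, and 4096 Mib is exactly 2^32 bits, one past the unsigned limit, hence 2047 and 4095 as the largest whole-Mib settings.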

Perhaps Ben could shift some of his horsepower from first primality tests to LLDC and PRPDC. Even 10% would help those a lot; 20+% to LLDC would be better, as it's lagging several years behind.

Last fiddled with by kriesel on 2020-01-31 at 06:09
Old 2020-01-31, 09:50   #4601
axn
 
 
Jun 2003

11732_8 Posts

Quote:
Originally Posted by kriesel
Perhaps Ben could shift some of his horsepower from first primality tests to LLDC and PRPDC.
That is the dumbest thing I've heard. It is equivalent to saying, "slow down, you might find a prime too quickly!"

Y'all are putting the cart before the horse. TF is supposed to help the project by accelerating the LL/PRP wavefront; and now that somebody has deployed LL resources to do just that, you want them to slow down?!

Just do TF 1 or 2 bits less than optimal and call it a day. That last bit has very negligible impact on project thruput compared to the previous bits. It will be a crying shame if, in the pursuit of mathematical optimality, you're letting many undersieved exponents thru to P-1 and Cat 3/4 testers.
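One way to put numbers on the "last bit" argument, as a sketch under two stated assumptions: each extra TF bit level costs about twice the previous one, and the chance of a factor between 2^b and 2^(b+1) is roughly 1/b (a common GIMPS rule of thumb). The function names and the 72-bit baseline are illustrative only.

```python
def tf_level_cost(bit, base_bit=72):
    """Relative cost of TF from 2**bit to 2**(bit+1); assumed to double per bit."""
    return 2.0 ** (bit - base_bit)


def factor_chance(bit):
    """Rule-of-thumb chance of a factor between 2**bit and 2**(bit+1): about 1/bit."""
    return 1.0 / bit


for bit in range(72, 77):
    cost = tf_level_cost(bit)
    chance = factor_chance(bit)
    print(f"{bit}->{bit + 1}: cost {cost:4.0f}x, chance {chance:.2%}, "
          f"chance per unit cost {chance / cost:.3%}")
```

In this model the 76->77 step alone costs more than all of 72->76 combined, for a chance of about 1 in 76, which is why dropping it frees so much TF capacity for so little loss.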
Old 2020-01-31, 11:08   #4602
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2^2·3·17·23 Posts

Quote:
Originally Posted by kriesel
Perhaps Ben could shift some of his horsepower from first primality tests to LLDC and PRPDC. Even 10% would help those a lot; 20+% to LLDC would be better, as it's lagging several years behind.
Ryan P is going great guns there
Old 2020-01-31, 11:53   #4603
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by kriesel
Find a way to make more of the 95M-100M TF available to ordinary GIMPSters that are not GPU72 participants. I'm generally relegated to 100M+ and have recently been putting ~3 THzD/day into it, despite requesting lowest exponents from the manual assignments page. Concentrate the firepower close in front of the primality wavefront.
The issue is that manual TF assignments from Primenet survive for six months, and so if not processed appropriately they risk being recycled and then being given to an LL'er sub-optimally TF'ed.

GPU72 very carefully targets its resources to "feed" the various wavefronts optimally, including Cats 3 and 4 to at least 75 bits and the P-1'ers to 77. Most of the Cat 3 and 4 assignments will be recycled, and can then be brought up to 77 before being given as a Cat 2 or lower.

Please note that Cat 3 and 4 are already in the 10xM ranges, and Cat 2 is about to enter there. So any work being done there will be "useful" quite quickly (particularly considering George's new assignment sort on TF depth clause).

Lastly, while I appreciate that ~3 THzD/D is impressive, please note that for the last month GPU72's participants have averaged a total of ~300 THzD/D.

I would argue that it's better to keep the disciplined targeted firepower working the way it is now. And, again, work in the 10xMs (ideally to 76 or 77) is needed right now.

Last fiddled with by chalsall on 2020-01-31 at 11:55 Reason: Smelling mistake.
Old 2020-01-31, 14:01   #4604
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by axn
That is the dumbest thing I've heard.
Balance is good. The lag between first-test and DC is around 8 years. If a mix of effort is LL 80% DC 20%, the lag will hold about constant. I feel reducing the lag would be good. The lag has been growing, before Delo joined; ten years ago the lag was about 6 years; 20 years ago, only about 3 years.

Outrunning the collective TF effort so some of the first-time primality testing is wasted on factorable candidates does not strike me as an ideal plan.

If one very well-funded user does essentially all of the first-primality testing, gets essentially all of the probability of the next prime discovery, and does not contribute in other areas too, other participants may begin to question whether their TF, P-1, and DC in support of that is worth their time and money. Participation is dropping. The number of users with results in the past year used to be over 7000; now it's below 6200 and seems to be steadily declining.
Quote:
Originally Posted by petrw1
Ryan P is going great guns there
Propper is listed in top producers as ~98% ECM, 1% DC, 1% other. Delo is 99% first-primality, 1% DC, 0% everything else. Not exactly balanced. All contributions are welcome. But the heaviest hitters are encouraged to consider how their choice of mix may affect the project, including other participants' responses.

Last fiddled with by kriesel on 2020-01-31 at 14:27
Old 2020-01-31, 14:14   #4605
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

10011000100111_2 Posts

Quote:
Originally Posted by kriesel
Outrunning the collective TF effort so some of the first-time primality testing is wasted on factorable candidates does not strike me as an ideal plan.
That isn't happening. And this is GIMPS, not GIMFS.
Old 2020-01-31, 14:19   #4606
storm5510
Random Account
 
 
Aug 2009

3644_8 Posts

Quote:
Originally Posted by chalsall
The issue is that manual TF assignments from Primenet survive for six months, and so if not processed appropriately they risk being recycled and then being given to an LL'er sub-optimally TF'ed...

...And, again, work in the 10xMs (ideally to 76 or 77) is needed right now.
"Survive for six months." This is way too long. Ten days would be plenty. Anyone who takes more than they can run in that period of time needs to cut back.

I believe many will TF to whatever level they feel is practical. Personally, I do not take on anything above 2^75. It comes down to the time spent versus the chance of finding a factor. 75's take an hour on my hardware in the 98M to 100M area. Anything beyond that is 20xx territory.
Old 2020-01-31, 14:31   #4607
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

10011000100111_2 Posts

Quote:
Originally Posted by storm5510
"Survive for six months." This is way too long. Ten days would be plenty. Anyone who takes more than they can run in that period of time needs to cut back.
This is a policy decision made by George, further exacerbated by the fact that TF and P-1 assignments are not constrained by the LL/DC assignment rules.

Quote:
Originally Posted by storm5510
I believe many will TF to whatever level they feel is practical. Personally, I do not take on anything above 2^75.
And that's perfectly fine. Your kit; your choice.
Old 2020-01-31, 14:52   #4608
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by storm5510
Ten days would be plenty.
Not in my opinion. I'm personally running over 30 manually queued and reported gpu application instances, and that's climbing over time as I add hardware and add instances per gpu for greater throughput. (Working on automating managing that small but growing herd.) Ten TF assignments 75/76 queued on a slow (Quadro2000) gpu is 11 days to complete. I reserve assignments in blocks of 10 or more per gpu instance, and try to avoid them ever running dry, so latency is likely to be more than two weeks; months occasionally is not out of the question. I do my best to avoid expiration.
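The latency arithmetic above can be sketched as follows, assuming a strictly first-in-first-out queue on one gpu instance and the quoted rate of ten 75/76 assignments in 11 days. The leftover count of 3 is a hypothetical value chosen to show how topping up before the queue runs dry stretches the wait; the function name is illustrative.

```python
def worst_case_latency(block_size, leftover, days_per_assignment):
    """Days until the last assignment of a newly reserved block completes.

    Assumes a FIFO queue: the new block waits behind `leftover` older
    assignments, and every assignment takes the same time.
    """
    return (leftover + block_size) * days_per_assignment


DAYS_PER_ASSIGNMENT = 11 / 10  # ten assignments in 11 days on a Quadro 2000

empty_queue = worst_case_latency(10, 0, DAYS_PER_ASSIGNMENT)
topped_up = worst_case_latency(10, 3, DAYS_PER_ASSIGNMENT)
print(f"block of 10 on an empty queue: {empty_queue:.1f} days")
print(f"block of 10 behind 3 leftovers: {topped_up:.1f} days")
```

Even in this simple model, a block reserved onto an empty queue already exceeds a ten-day expiry, and reserving a little early pushes the last result past two weeks.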

People do go on long vacations sometimes, or business travel, or get sick or injured, or have a term paper due or exam coming up, also.

Last fiddled with by kriesel on 2020-01-31 at 15:12
Old 2020-01-31, 15:31   #4609
Uncwilly
6809 > 6502
 
 
"""""""""""""""""""
Aug 2003
101×103 Posts

265A_16 Posts

Quote:
Originally Posted by kriesel
Not in my opinion. I'm personally running over 30 manually queued and reported gpu application instances
Why are you not using Misfit? Or a home-brewed script to handle this?

Would 2 months be ok with you? or 3 or 4?
Assignment recycling is important. Old TF assignments should be recycled ahead of the first time LL wave in enough time that they can all get done.

Ben may discourage some (since he depresses the chance of them finding a prime). But, overall there is more total throughput. And total number of users does fall in the months after the spike around a new prime discovery.

Last fiddled with by Uncwilly on 2020-01-31 at 15:31

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.