mersenneforum.org  

2011-06-28, 13:43   #199
Christenson

Quote:
Originally Posted by davieddy
Why is that?

David
I think it's because P-1, like LL-D, is *never* going to get a prize for finding M48. It only makes it easier for someone else to find it.

Therefore, not many run it...or, as Mr P-1 notes, run it well...leaving those of us who do run it well finding factors more quickly than we would by running LL. Remember: the total work to settle the status of any given large set of exponents is minimized when we are indifferent to which method settles each exponent.

P95:
If we get a 4x speedup on GPUs for LL or (eventually) P-1, and LL isn't all that memory-hungry, does it make sense to run multiple exponents in parallel on LL? [I doubt this will speed up P-1 much, given its memory intensiveness.] How bad is the CPU impact of running LL on CUDA?
2011-06-28, 13:54   #200
R.D. Silverman

Quote:
Originally Posted by Prime95
I can see merit to your argument that we should do more P-1 rather than more TF, but the GPUs are only 100x faster at TF. IIRC, they are 4x faster at LL. GPUs cannot do P-1, but if they could they'd also be only 4x faster (or less).
If they (GPU) can do LL, then they can do P-1; the iterations are
the same. They are modular multiplications mod M_p. And, as you say,
they would only be 4x faster.

The difference in speed between TF and LL on a GPU does mean that
it is better to spend more time on the former, but one gets fairly
rapid diminishing returns on TF. If there is no factor less than (say) 70 bits,
it is unlikely that there is one of 71 or 72 bits.....

One can indeed optimize the parameter selections, but the calculations
would be quite sensitive to the speed differences in the various pieces
of hardware.
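Silverman's point that the core loops coincide can be sketched in a few lines of Python (purely illustrative, nothing like real GIMPS code, which does these multiplications with floating-point FFTs): both the LL test and P-1 stage 1 are, at bottom, chains of multiplications mod M_p.

```python
from math import gcd

def sieve_primes(n):
    """All primes <= n, by a simple sieve."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = b"\x00" * len(flags[i * i :: i])
    return [i for i in range(2, n + 1) if flags[i]]

def ll_is_prime(p):
    """Lucas-Lehmer: for odd prime p, M_p = 2^p - 1 is prime iff
    s_{p-2} == 0, where s_0 = 4 and s_{k+1} = s_k^2 - 2 (mod M_p)."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m          # one squaring mod M_p per iteration
    return s == 0

def pminus1_stage1(p, B1):
    """P-1 stage 1 for M_p: raise 3 to the product of prime powers <= B1
    (times 2p, since any factor q of M_p has q = 2kp + 1), all mod M_p,
    then look for a factor via a gcd."""
    m = (1 << p) - 1
    x = pow(3, 2 * p, m)             # fold in the guaranteed 2p | q - 1
    for q in sieve_primes(B1):
        e = q
        while e * q <= B1:
            e *= q
        x = pow(x, e, m)             # again: nothing but multiplications mod M_p
    g = gcd(x - 1, m)
    return g if g > 1 else None
```

Both inner loops are chains of modular multiplications mod M_p, which is exactly the argument above: hardware that runs one can in principle run the other at roughly the same per-iteration cost. For instance, `pminus1_stage1(11, 2)` recovers the factor 23 of M_11, since 23 - 1 = 2·11 is already covered by the folded-in 2p.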
2011-06-28, 14:23   #201
davieddy

Quote:
Originally Posted by R.D. Silverman
If they (GPU) can do LL, then they can do P-1; the iterations are
the same. They are modular multiplications mod M_p. And, as you say,
they would only be 4x faster.

The difference in speed between TF and LL on a GPU does mean that
it is better to spend more time on the former, but one gets fairly
rapid diminishing returns on TF. If there is no factor less than (say) 70 bits,
it is unlikely that there is one of 71 or 72 bits.....

One can indeed optimize the parameter selections, but the calculations
would be quite sensitive to the speed differences in the various pieces
of hardware.
All becomes clear.

This is what a math argument is.

David
2011-06-28, 14:47   #202
davieddy

Quote:
Originally Posted by Christenson
I think it's because P-1, like LL-D, is *never* going to get a prize for finding M48. It only makes it easier for someone else to find it.
The question was intended as "rhetorical".

But since you have risen to the bait (like others)
I may as well point out that a double check LL might
find M48, but TF/P-1 won't.

David
2011-06-28, 15:00   #203
science_man_88

Quote:
Originally Posted by davieddy
The question was intended as "rhetorical".

But since you have risen to the bait (like others)
I may as well point out that a double check LL might
find M48, but TF/P-1 won't.

David
David, it's the first LL that finds it; the second confirms it. So LL-D will not find it, only confirm or deny it. Even I know this, and I'm usually the thickest-skulled person on these forums.
2011-06-28, 15:51   #204
fivemack

If someone finds a zero residue, a double-check will be run immediately on the fastest hardware available to confirm it.

What LL-D offers is the possibility of finding an exponent whose correct residue is zero, but whose first run, perhaps years earlier, hit a hardware error, so the Mersenne prime was missed.
2011-06-28, 16:29   #205
Christenson

Bob: It is certainly possible to get a GPU to do P-1. However, at the moment, the possibility is still theoretical -- you can't download the program just yet. And, as a student, I'm not quite ready to make it real.

I'm curious as to why LL is only 4x faster on a GPU...that is, comparatively, where are the bottlenecks on TF versus LL iterations, and why does that leave us with a 100x speedup on TF but only a 4x speedup on LL?
2011-06-28, 16:52   #206
R.D. Silverman

Quote:
Originally Posted by Christenson
Bob: It is certainly possible to get a GPU to do P-1. However, at the moment, the possibility is still theoretical -- you can't download the program just yet. And, as a student, I'm not quite ready to make it real.

I'm curious as to why LL is only 4x faster on a GPU...that is, comparatively, where are the bottlenecks on TF versus LL iterations, and why does that leave us with a 100x speedup on TF but only a 4x speedup on LL?
Memory bandwidth will be a bottleneck. TF takes only a few bytes of storage per processor. TF is also embarrassingly parallel; FFTs are not. There may be other problems as well.
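The storage contrast can be made concrete with a sketch (illustrative Python, not mfaktc): each TF candidate q = 2kp + 1 needs only q itself and one modular exponentiation, and distinct values of k are completely independent, while an FFT-based LL squaring needs the entire multi-million-digit residue at once.

```python
def tf_find_factor(p, max_bits):
    """Trial-factor M_p = 2^p - 1 up to 2^max_bits.
    Any factor q satisfies q = 2kp + 1 with q congruent to 1 or 7 mod 8,
    and q divides M_p exactly when 2^p == 1 (mod q).
    Per-candidate state: just k and q -- a few bytes, which is why TF
    scatters perfectly across thousands of GPU threads."""
    k = 1
    while True:
        q = 2 * k * p + 1
        if q.bit_length() > max_bits:
            return None              # no factor below 2^max_bits
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            return q
        k += 1                       # each k is independent: embarrassingly parallel
```

For example, `tf_find_factor(11, 24)` returns 23 and `tf_find_factor(29, 24)` returns 233, the smallest factors of M_11 and M_29. A real implementation also sieves candidates for small prime divisors before the powmod, and keeps max_bits well below p so the trivial divisor M_p itself is never reached.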
2011-06-28, 17:14   #207
cheesehead

Quote:
Originally Posted by Prime95
Quote:
Originally Posted by R.D. Silverman
I am suggesting that you can SAVE TIME by NOT running TF at all; just run P-1 with slightly higher bounds than are currently used.
Getting this thread back to the math/optimization problem....

P-1 and LL tests are done on the same caliber of Intel/AMD machine. The P-1 bounds are selected comparing the time it takes to run P-1 to the cost of running 2 LL tests times the chance of P-1 finding a factor. Barring any bugs in my understanding of P-1's chance of finding a factor or in my programming, increasing P-1 bounds would be a bad idea because that extra P-1 time would remove more candidates if it were used for LL testing instead.
Since Silverman's suggestion is to run P-1 without any preceding (server-assigned) TF, the P-1 bounds selection algorithm will find that the TF bit level for an exponent is in the low 60s (to which the past LMH projects had taken all exponents), rather than the high 60s or low 70s as it now usually faces. In that case, it will select higher bounds as optimal, because of the new possibility of finding factors in the high 60s to low 70s bits. Since the probability that P-1 will find a factor for such an exponent will be greater than it used to be, higher bounds are justified.

The extra P-1 time will be less than the skipped TF time, and thus a net savings, by Silverman's argument.

(Personally, though that seems reasonable, I'd be more comfortable seeing specific numbers and time-costs/savings for how many would-also-have-been-TF-found factors P-1 will find between the powers of 2 we've been covering with server-assigned TF. How many would P-1 with those higher B1/B2 bounds miss that TF to currently standard power-of-2 limits would have found?

I see that Silverman has anticipated me in post #172.)

This is simply the mirror image of your later statement, "When extra TF is done before P-1 has been performed, the extra TF reduces the P-1 bounds (because the chance of finding a factor is reduced)": with less TF done first, the chance of finding a factor increases, so the optimal bounds increase.

- - -

Quote:
Originally Posted by R.D. Silverman
Suppose you run P-1 to bounds B1/B2. Thus, we are sure there are no factors smaller than 2·B2·p + 1.
For exponents in the range 55M, I see these current factoring limits:
Code:
55000031,71,675000,29750000
55000181,71,660000,19140000
55000189,71,660000,19140000
55000207,71,660000,19140000
55000277,71,660000,19140000
 . . .
So, the first line shows that P-1 by itself with B1,B2 = 675000,29750000 eliminated all factors below 2 * 29750000 * 55000031 + 1 = 3272501844500001, which is below 2^52 and not very close to 2^71.

In order to guarantee finding factors up to 2^71 with P-1 alone, we'd have to use B2 = 2^71 / p = 42930580192487, over 1,000,000 times higher than the currently-chosen B2. Which P-1 routines can handle B2 ~ 2^46?

So the no-TF scheme might be more efficient at eliminating LL tests, but it would leave much lower power-of-2 limits to which one could say definitely there were no smaller factors. Indeed, since exponents in the 55M range were already TFed to at least 2^60 by LMH, a P-1 B2 bound of 2^60 / p = 20962197360, which is ~1000 times as large as our current customary B2, would be necessary just to guarantee the same level of completeness that we already have through LMH TF.
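The trade-off Prime95 describes in the quote above can be caricatured as a toy cost model (a hypothetical function with made-up numbers, for intuition only; the real bounds optimizer in Prime95 also models factor probabilities via smoothness estimates and memory constraints):

```python
def p1_worth_doing(prob_factor, p1_cost, ll_cost, ll_tests_saved=2.0):
    """Expected CPU saved by a P-1 attempt: the chance of finding a factor
    times the LL work it cancels (two tests, for a never-tested exponent),
    minus the P-1 time itself. All times in the same units (e.g. LL = 1.0).
    Bounds are worth raising while the marginal version of this stays positive."""
    return prob_factor * ll_tests_saved * ll_cost - p1_cost
```

For example, a P-1 run costing 2% of an LL test with a 4% chance of success nets about 0.06 LL-tests of expected savings per exponent. Extra prior TF lowers `prob_factor`, which is exactly why it lowers the optimal P-1 bounds, and skipping server-assigned TF raises it, which is why the optimizer would then pick higher bounds.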

2011-06-28, 18:22   #208
cheesehead

Oops. I used p instead of 2p in the divisions. Revised numbers are below.

Quote:
Originally Posted by cheesehead
In order to guarantee finding factors up to 2^71 with P-1 alone, we'd have to use B2 = 2^71 / p = 42930580192487, over 1,000,000 times higher than the currently-chosen B2. Which P-1 routines can handle B2 ~ 2^46?

So the no-TF scheme might be more efficient at eliminating LL tests, but it would leave much lower power-of-2 limits to which one could say definitely there were no smaller factors. Indeed, since exponents in the 55M range were already TFed to at least 2^60 by LMH, a P-1 B2 bound of 2^60 / p = 20962197360, which is ~1000 times as large as our current customary B2, would be necessary just to guarantee the same level of completeness that we already have through LMH TF.
In order to guarantee finding factors up to 2^71 with P-1 alone, we'd have to use B2 = 2^71 / 2p = 21465290096243, over 700,000 times as large as the currently-chosen B2. Which P-1 routines can handle B2 ~ 2^45?

So the no-TF scheme might be more efficient at eliminating LL tests, but it would leave much lower power-of-2 limits to which one could say definitely there were no smaller factors. Indeed, since exponents in the 55M range were already TFed to at least 2^60 by LMH, a P-1 B2 bound of 2^60 / 2p = 10481098679, which is over 300 times as large as our current customary B2, would be necessary just to guarantee the same level of completeness that we already have through LMH TF.

Completeness isn't everything, but eliminating server-assigned TF would dash the current GIMPS claim to comprehensive factor search up to levels of 2^70 and beyond. Would LMH or any other specialty group take on the project of extending TF on P-1ed-only-and-already-LLed exponents if doing so made no contribution to the search for Mersenne primes?
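The revised figures check out exactly in integer arithmetic; a quick Python sanity check, using p = 55000031 and B1/B2 from the first line of the table quoted in the previous post:

```python
# Sanity check of the revised numbers above, in exact integer arithmetic.
p = 55_000_031
B2 = 29_750_000

# A factor q of M_p has the form q = 2kp + 1, so P-1 with stage-2 bound B2
# reaches at best q = 2*B2*p + 1:
covered = 2 * B2 * p + 1
assert covered == 3_272_501_844_500_001 and covered < 2**52

# Stage-2 bound needed for that coverage to reach 2^71 (roughly 700,000x B2):
assert 2**71 // (2 * p) == 21_465_290_096_243

# ...and merely to match the 2^60 LMH trial-factoring floor (roughly 350x B2):
assert 2**60 // (2 * p) == 10_481_098_679
```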

2011-06-28, 18:33   #209
davieddy

Quote:
Originally Posted by science_man_88
David, it's the first LL that finds it; the second confirms it. So LL-D will not find it, only confirm or deny it. Even I know this, and I'm usually the thickest-skulled person on these forums.
Quote:
Originally Posted by fivemack
If someone finds a zero residue, a double-check will be run immediately on the fastest hardware available to confirm it.

What LL-D offers is the possibility of finding an exponent whose correct residue is zero, but whose first run, perhaps years earlier, hit a hardware error, so the Mersenne prime was missed.
See "Is this a bug...or...?" thread in the Primenet forum.

David
