mersenneforum.org optimal memory settings for the P-1 stage
 Register FAQ Search Today's Posts Mark Forums Read

2006-10-13, 20:02   #1
S485122

Sep 2006
Brussels, Belgium

1,571 Posts
optimal memory settings for the P-1 stage

I did a search through the FAQs, the help file and the forum and I still have some questions about the optimal amount of memory for the P-1 stage. In the help file there is a table with some obsolete values (it gives three values : for 6M, 10M and 33M.)

There is a thread where the optimum amount is said to be 5.5 times the number of millions of the exponent but I suppose this is not actual anymore because in the current readme files the text is missing :

Quote:
 Originally Posted by wackyeh George even states in the Prime95's README: From my own experience, using any more than about [desirable] ~5.5x the exponent (say 220MB for a exponent around 40M), it can significantly increase the amount of time to do the test. For example, 3000 iterations on a 42M exponent went from 500+ secs to around 450 secs. simply by reducing memory from 320MB used to 256MB used (42 x 5.5 = 231MB)
With the current version Prime95 does allocating the maximum amount of memory constitute the optimal setting ?

2006-10-13, 23:08   #2

"Richard B. Woods"
Aug 2002
Wisconsin USA

22·3·641 Posts

Quote:
 Originally Posted by Jacob Visser With the current version Prime95 does allocating the maximum amount of memory constitute the optimal setting ?
YES!

When reading about P-1 stage 2 memory allocations, it's sometimes important to distinguish between two different cases because the behavior differs between the two. But this is often difficult because most users aren't even aware that more than one case exists!

First case (the one to which most users, including wackeyh in your linked quote as far as I can tell, refer in their posts): the worktodo.ini line command is Test=nnnnnnnn... or Pfactor=nnnnnnnn...

In this case, prime95 calculates an optimum combination of B1 and B2 bounds, taking into account several factors including the Available Memory setting.

Example (where I'm guessing at roughly realistic values -- don't quote these as authoritative): For a given exponent such as 33500153, if Available Memory is the 8M default or other small value < 30M, prime95 may assign B1 = B2 = 505000, so that only Stage 1 is run. If Avail. Mem. = 60M, prime95 may decide on B1 = 380000, B2 = 9500000. For Avail. Mem. > 150M, prime95 may choose B1 = 325000, B2 = 21775000. The first choice (<30M) will take the shortest time, and has the smallest probability of finding a factor. The second choice (60M) will take longer, but has a noticeably higher chance of finding a factor. The third case (>150M) will take the longest time, but also has the highest chance of finding a factor.

Thus we can see why wackeyh writes that increasing the Available Memory will increase the time. What wackeyh leaves out, however, is that prime95 has calculated that the increased time is justified by the increased chance of finding a factor. (Perhaps wackeyh is even unaware that increasing the memory allows prime95 to have a greater chance of finding a factor in P-1.) My advice is to trust prime95's algorithm for choosing B1/B2 limits -- the more memory you give it, the better your result will be. Prime95's B1/B2 choosing algorithm takes more factors into account than you know of (unless you read the source code or my past posting where I list them :), and its choice will optimize GIMPS throughput!

Second case: The worktodo.ini command line is Pminus1=nnnnnnnn...

Here, the user explicitly specifies the B1 and B2 bounds. Then the only effect of the Available Memory figure is to limit the number of auxiliary work areas that prime95 can allocate during Stage 2 to speed up execution. Here, increasing Available Memory will decrease running time, the opposite of the effect it had in the first case! And if you give it a whole lot of memory (like a GB or more), it may even shift into a "higher gear" (The Brent-Suyama Extension) that increases its chance of finding a factor!

Last fiddled with by cheesehead on 2006-10-13 at 23:16

 2007-05-25, 12:03 #3 cheesehead     "Richard B. Woods" Aug 2002 Wisconsin USA 22×3×641 Posts Update: "P-1 RAM Vs ????" at http://mersenneforum.org/showthread.php?t=8251, where I explain another possible reason for misunderstanding. Note that my previous post here doesn't mention the possibility of system thrashing if "Available Memory" is set too high. Last fiddled with by cheesehead on 2007-05-25 at 12:09
 2007-05-25, 17:41 #4 crash893     Sep 2002 23·37 Posts wasnt there something else too like p-1 with 512 ram you had lets say 50% chance of finding a factor (i know its much lower) but with 1 gig you only had like a 60% and with 2 gigs you only had like a 65% chance something like that
2007-05-26, 04:53   #5

"Richard B. Woods"
Aug 2002
Wisconsin USA

22×3×641 Posts

Quote:
 Originally Posted by crash893 like p-1 with 512 ram you had lets say 50% chance of finding a factor (i know its much lower) but with 1 gig you only had like a 60% and with 2 gigs you only had like a 65% chance
Do you think something's wrong with that? (Other than the real percentages being much lower than 50%/60%/65% -- maybe substitute 1.0%/1.2%/1.3% ) If so, what numbers would you think were "normal"?

Increasing memory does not increase the chance of factor-finding in _linear proportion_.

(-- to be continued in a few minutes --)

Last fiddled with by cheesehead on 2007-05-26 at 05:21

 2007-05-26, 06:06 #6 cheesehead     "Richard B. Woods" Aug 2002 Wisconsin USA 1E0C16 Posts P-1 Stage 2 uses "Available Memory" to set up work areas for storing certain repeatedly-used intermediate computed values so that when they are needed again, they need not be re-computed from scratch but can be simply copied from a work area. So the first-order effect of increased memory is to speed up calculations. One specifies larger "Available Memory" in order to reach a given B2 bound more rapidly. Or ... (ways it's more commonly thought of) ... One can specify larger "Available Memory" in order to reach higher B2 bounds in the same amount of time ... or to reach even higher B2 bounds in an amount of time that is larger, but not as much larger as it would take at a lower "Available Memory". - - When Prime95 is choosing the B1/B2 combination that optimizes GIMPS throughput, it is only amounts of L-L time saved per time of P-1 run that are optimized, not space used to perform the P-1 calculation. Example: (Note: In this example, I made up the limit numbers, probabilities, and run times, to approximate realistic actual numbers I've seen in P-1 runs. Yeah, I chose numbers so that the results came out the way I wanted, but I think they're close enough to reality for illustration purposes.) Suppose that P-1 is choosing its own B1/B2 limits, and it's choosing among three alternatives: combination X (B1=590000,B2=590000 -- so only Stage 1) has a 1.0% chance of finding a factor, combination Y (B1=460000,B2=2100000) has a 1.3% chance of finding a factor, and combination Z (B1=410000,B2=3900000) has a 1.5% chance of finding a factor. Suppose that if "Available Memory" is 512M, combination X takes 10 hours, combination Y takes 15 hours, and combination Z takes 16 hours. The efficiency of each choice is proportional to the ratio (chance of finding a factor)/(time to find a factor), so that ratio calculation is the one I use in the following: Prime95 will prefer combination X because 1.0%/10 (.010/10) is greater than 1.3%/15 (.013/15) or 1.5%/16 (.015/16). Suppose that if "Available Memory" is 1024M, then combination X takes 10 hours, combination Y takes 11 hours, and combination Z takes 15 hours. Prime95 will prefer combination Y because .013/11 is greater than .010/10 or .015/15. Suppose that if "Available Memory" is 2048M, then combination X takes 10 hours, combination Y takes 10 hours, and combination Z takes 11 hours. Prime95 will prefer combination Z because .015/11 is greater than .010/10 or .013/10. So with 512M, P-1 has a 1.0% chance of finding a factor, with 1024M it has a 1.3% chance, and with 2048M it has a 1.5% chance. Last fiddled with by cheesehead on 2007-05-26 at 06:42
2007-05-26, 09:42   #7
axn

Jun 2003

469310 Posts

Quote:
 Originally Posted by cheesehead The efficiency of each choice is proportional to the ratio (chance of finding a factor)/(time to find a factor), so that ratio calculation is the one I use in the following: Prime95 will prefer combination X because 1.0%/10 (.010/10) is greater than 1.3%/15 (.013/15) or 1.5%/16 (.015/16).
No

Quote:
 Originally Posted by http://www.mersenne.org/math.htm So how does GIMPS intelligently choose B1 and B2? We use a variation of the formula used in trial factoring. We must maximize: chance_of_finding_factor * 2 * primality_test_cost - factoring_cost
You cannot define efficiency of P-1 without considering the cost of an LL test.

2007-05-26, 20:32   #8

"Richard B. Woods"
Aug 2002
Wisconsin USA

22·3·641 Posts

Quote:
Originally Posted by axn1
Quote:
 Originally Posted by cheesehead The efficiency of each choice is proportional to the ratio (chance of finding a factor)/(time to find a factor), so that ratio calculation is the one I use in the following: Prime95 will prefer combination X because 1.0%/10 (.010/10) is greater than 1.3%/15 (.013/15) or 1.5%/16 (.015/16).
No
You know, as I was writing that, my subconscious indicated that something was wrong, but I couldn't figure out what. I should have obeyed my impulse to add a note that I was uncertain.

Quote:
 You cannot define efficiency of P-1 without considering the cost of an LL test.

I was treating the cost of an LL test as an implicit multiplication constant (constant for a specific exponent, that is -- so that it would multiply each of the calculations in my example by exactly the same number and thus need not be explicitly included -- hence my "proportional" remark).

How would you rewrite my example, or at least part of it?

(It's quite possible that when I see your explanation, I'll think, "Oh, how could I have forgotten?" or "There it is, right in the middle of my earlier posting in this thread, but I got mixed-up!" I eagerly await your explanation.)

- - -

Added: Possible source of problem: I have used the term "efficiency" privately in certain calculations for a certain concept. It's possible that:

(a) "Efficiency" has a formal meaning in the context of real, public P-1 discussion that conflicts with my private definition, or

(b) I may have a "fuzzy" private definition of "efficiency" and have gone-wrong in using it here, or

(c) I may have two separate private definitions of "efficiency" for two different contexts and mixed them up here.

My private usage: One of my private definitions of "efficiency" in this context is:

(total L-L time saved by finding a factor)
/
(time spent in P-1 effort to find that factor)

Another (apparently equivalent, to me) is:

(Total L-L time expected to be saved by finding factors of a given set of Mnumbers with P-1, on average given fixed B1/B2/"Available Memory" for all Mnumbers)
/
(Total time spent in P-1 effort to find factors of a given set of Mnumbers with P-1, on average given fixed B1/B2/"Available Memory" for all Mnumbers)

Last fiddled with by cheesehead on 2007-05-26 at 21:02

2007-05-27, 05:08   #9
axn

Jun 2003

125516 Posts

Quote:
 Originally Posted by cheesehead My private usage: One of my private definitions of "efficiency" in this context is: (total L-L time saved by finding a factor) / (time spent in P-1 effort to find that factor)
As far as efficiency goes, this does /seem/ like a useful metric. And as you noted, with this metric, you can discount the LL cost as a constant multiplier.

However, as per the expression I have given, the correct metric is: (total L-L time saved by finding a factor) - (time spent in P-1 effort to find that factor) i.e. the net savings in the execution of an LL test. This is what we need to maximize. Here, one cannot discount the LL cost.

To illustrate: (contd in next post)

2007-05-27, 05:50   #10
axn

Jun 2003

111258 Posts

Let us consider your example scenario #1:
Quote:
 Originally Posted by cheesehead Suppose that P-1 is choosing its own B1/B2 limits, and it's choosing among three alternatives: combination X (B1=590000,B2=590000 -- so only Stage 1) has a 1.0% chance of finding a factor, combination Y (B1=460000,B2=2100000) has a 1.3% chance of finding a factor, and combination Z (B1=410000,B2=3900000) has a 1.5% chance of finding a factor. Suppose that if "Available Memory" is 512M, combination X takes 10 hours, combination Y takes 15 hours, and combination Z takes 16 hours.
Let $t$ be the LL cost (depending on the context, this could be 1 LL test or 2 LL tests).
Code:
P-1 level | Expected cost | Break even pt
----------+---------------+------------------------
no P-1    | t             |     N/A
combo X   | 0.990t + 10   | t > 10/0.010 = 1000.00 hrs
combo Y   | 0.987t + 15   | t > 15/0.013 = 1153.85 hrs
combo Z   | 0.985t + 16   | t > 16/0.015 = 1066.67 hrs
The table gives the break even pts for each of the P-1 levels as specified.

Now, the thing to note is that, given a sufficiently large $t$, a combo with a lower coefficient for $t$ /will/ be better. So let's calculate how large is sufficiently large.

Break even pt between Combo X & Y

$0.990t + 10 > 0.987t + 15$

or $t > (15-10)/(0.990-0.987) = 1666.67 hrs$

i.e, for tests longer than 1666.67 hrs, combo Y is preferred to X.

Similarly, X vs Z: $t > (16-10)/(0.990-0.985) = 1200 hrs$

And, Y vs Z: $t > (16-15)/(0.987-0.985) = 500 hrs$

To summarize, if $t$ <= 1000, no P-1. For 1000 < $t$ <= 1200, combo X. For $t$ > 1200, Combo Z

--------------
The same calculation can be repeated for the other scenarios.

2007-05-27, 10:33   #11

"Richard B. Woods"
Aug 2002
Wisconsin USA

22·3·641 Posts

Quote:
 Originally Posted by axn1 Here, one cannot discount the LL cost. < snip > Let http://mersenneforum.org/cgi-bin/mimetex.cgi?t be the LL cost < snip > The table gives the break even pts for each of the P-1 levels as specified. Now, the thing to note is that, given a sufficiently large $t$, a combo with a lower coefficient for $t$ /will/ be better. So let's calculate how large is sufficiently large.
No, no, that's not how Prime95 works.

It's not choosing an exponent (or choosing an L-L cost) to best utilize combination X, or Y, or Z. If one wants to analyze that way, one can, but that's not what the Prime95 software is doing.

When Prime95 is given the task of choosing a B1/B2 combination for doing P-1 on some exponent, it's selecting the combination that's most efficient to use on the given exponent (while also taking into consideration the specified "Available Memory", the CPU type, and the bit-level to which the exponent has been previously TFed). So the L-L cost is not a variable whose value can be chosen; it's a constant during any one search for an optimal B1/B2 combination.

In my example, combinations X, Y, and Z are just representing three out of hundreds or thousands of possible combinations that Prime95 examines each time it's choosing P-1 bounds for one single exponent.

Last fiddled with by cheesehead on 2007-05-27 at 11:17

 Similar Threads Thread Thread Starter Forum Replies Last Post Fred Software 5 2016-05-03 00:51 Uncwilly Lounge 5 2013-05-15 23:29 gamer30 Software 17 2012-08-23 20:02 Unregistered Information & Answers 4 2010-07-30 21:49 Discobadger Information & Answers 3 2009-04-03 11:48

All times are UTC. The time now is 00:54.

Wed Sep 23 00:54:12 UTC 2020 up 12 days, 22:05, 0 users, load averages: 2.06, 1.90, 1.82