Old 2004-09-19, 13:59   #23
garo
 

One last question: For these numbers GMP-ECM's default B2 is 2.5577e10 whereas mprime/Prime95 are of course limited to 4e9. I have started running these numbers with B2=4e10.

Can anyone tell me if it is better to stick with the default bounds, or should I go for the maximum of 1.8e11? And how will these curves be reported to George, since one curve with B2=4e10 is obviously worth more than a curve with B2=4e9?
Old 2004-09-19, 14:34   #24
garo
 

Three posts in a row!

A couple more questions.

Is the 5.1-beta for Athlon that is posted on the primenumbers Yahoo group site the fastest, or is there a faster build available?

Alex mentioned above:
Quote:
If you do stage 1 on two (or more) numbers simultaneously, you should split the residue modulo the individual numbers before feeding it to gmp-ecm. It saves both time and memory. The thread http://www.mersenneforum.org/showthread.php?t=2326 has more info.
But in the other thread he mentions that:

Quote:
You can do the stage 1 on M1574 and then stage 2 in GMP-ECM on M787 and P787 individually, but it's a little awkward. Produce the results.txt file with stage 1 residues as usual, then copy it to two files (i.e. results.m787 and results.p787). Get the remaining cofactors of M787 and P787 in decimal or hexadecimal form (GMP-ECM can parse either) and in a text editor, substitute in results.m787 the "N=0x....;" by "N=<cofactor of M787 here>;", similarly for results.p787. GMP-ECM will then reduce the stage 1 residue modulo the cofactor and all should work out nice.
which implies that, if the N values are properly broken down, the residue is reduced appropriately by GMP-ECM. This makes sense, as reducing the residue should be easy and a logical step for GMP-ECM to take every time.

So I'm assuming that the steps outlined two posts above are correct and sufficient.
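For anyone scripting this, here is a minimal sketch (Python) of the file-splitting step Alex describes. The cofactor strings are placeholders to be filled in with the actual decimal or hex cofactors, and it assumes the save file carries the "N=0x...;" field mentioned in the quote, one residue per line:

Code:
import re

# Placeholder cofactor strings -- substitute the real decimal (or hex)
# cofactors of M787 and P787 before running this.
COFACTORS = {
    "results.m787": "<cofactor of M787 here>",
    "results.p787": "<cofactor of P787 here>",
}

with open("results.txt") as f:
    text = f.read()

for outname, cofactor in COFACTORS.items():
    # Replace every "N=0x...;" field with "N=<cofactor>;", as described above.
    out = re.sub(r"N=0x[0-9A-Fa-f]+;", "N=" + cofactor + ";", text)
    with open(outname, "w") as g:
        g.write(out)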
Old 2004-09-19, 18:15   #25
akruppa
 

Quote:
2) GMP-ECM seems to be having trouble with N values expressed as hex, regardless of what other people have posted here. When I converted the hex value into decimal for N, it worked fine. Otherwise it complains about finding an "x", which comes from the "0x" before the hex value. Funnily enough, the residues can be expressed in hex without any problem whatsoever.
This is a problem of the "GMP-ECM5.1-beta" binary. It's more of a CVS snapshot than a beta and was made public without Paul Zimmermann's approval. Both Paul and I advise against using it. But since it's apparently the only Windows binary of GMP-ECM floating around out there, many people use it anyway. AFAIK, the MODMULN bug that was fixed in 5.0.3 is also present in "5.1-beta", beware.

Quote:
One last question: For these numbers GMP-ECM's default B2 is 2.5577e10 whereas mprime/Prime95 are of course limited to 4e9. I have started running these numbers with B2=4e10.
The default B2 choice aims to keep the time spent in stage 2 at about half that of stage 1. Since you use the much faster Prime95 for stage 1, you should use a lower B2. With B1=11M, 40G may even be a bit high.

Here are some (approximate) expected curve counts for B1=11M:
B2=1.1e9: 11000
B2=5e9: 8200
B2=10e9: 7100
B2=20e9: 6200
B2=40e9: 5500

If you do stage 2 with B2=40G, each curve has about twice as high a chance of finding a factor as with B2=1.1e9, so when reporting your effort, simply double the number of curves.
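In other words, reported effort can be rescaled through the expected-curve counts above; a quick sketch (Python, using exactly the table values, nothing else assumed):

Code:
# Expected curve counts for B1=11M, from the table above.
expected = {1.1e9: 11000, 5e9: 8200, 10e9: 7100, 20e9: 6200, 40e9: 5500}

def equivalent_curves(curves_run, b2_run, b2_ref=1.1e9):
    """Convert curves run at b2_run into equivalent curves at b2_ref."""
    return curves_run * expected[b2_ref] / expected[b2_run]

# 100 curves at B2=40e9 count as about 200 curves at B2=1.1e9.
print(equivalent_curves(100, 40e9))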

Quote:
which implies that if the N values are properly broken down the residue is reduced appropriately by GMP-ECM. This makes sense as reducing the residue should be easy and a logical step for GMP-ECM to take every time.
GMP-ECM does the mod reduction when it does arithmetic, but unfortunately not immediately when it reads residues from files. Sorry, that was an oversight on my part. I'm attaching a diff file that fixes that.

I'm not 100% certain that reading unreduced residues from a save file works correctly. In the case of base-2 reduction, which is what will typically be used when resuming Prime95 residues, I expect it does. If, however, the remaining cofactor is only ~70% of the full 2^n+-1 number, or for Aurifeuillian factors, MODMULN will be used, which may have problems. I did a test just now and it worked, but I haven't looked at the code hard enough yet to be sure it always will.

If you can arrange it at all, I'd recommend using a patched 5.0.3 for stage 2. I can't offer a Windows binary, though, lacking Windows.

Alex

EDIT: Duh. Of course the residue gets properly reduced when it's converted to an mpres_t type. No patch necessary, but I'll attach it anyway.
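The underlying arithmetic fact is simply that reduction modulo a divisor commutes with reduction modulo the full number, so reducing at the mpres_t conversion is enough. A toy check (Python, with a small stand-in modulus, not the real numbers):

Code:
# (X mod N) mod c == X mod c whenever c divides N, so a residue saved
# modulo the full 2^n-1 can safely be reduced modulo the cofactor later.
N = 2**11 - 1   # 2047 = 23 * 89, standing in for the full 2^n-1 number
c = 89          # a divisor of N, standing in for the remaining cofactor
for X in range(5 * N):
    assert (X % N) % c == X % c
print("reduction modulo a divisor commutes")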
Attached Files
File Type: zip resumediff.zip (392 Bytes, 173 views)

Last fiddled with by akruppa on 2004-09-19 at 18:22
Old 2004-09-19, 21:49   #26
garo
 

Thanks, Alex, for your detailed post. I used 5.1-beta because that was the only version for which I could find a compiled Athlon binary. I do not have the appropriate tools for compiling on my Athlon, so I took the easy way out.

I guess I will download them now. :sigh:

I have since found an Athlon-compiled version of ecm-5.0c, which I assume is the same as 5.0.3. For stage 2 on 2^1033-1 it was significantly slower than 5.1-beta. The reason, it seems, is that it chooses k=5 where 5.1-beta chooses k=4. As a result the B2' value is higher, and the d and dF values are different as well. The slowdown was about 16%, so it cannot be accounted for by the higher B2' alone.

I also used the test posted on the 5.0.3 page and it worked fine, so if that checks for the MODMULN bug, 5.1-beta does not have it.


On a 2.8 GHz P4, stage 1 on M4132 with mprime takes 440 seconds. I am running four separate stage 2s on an Athlon 1333 for 1033-, 1033+, 2066L and 2066M, and they take from 635 to 825 seconds depending on the size of the cofactor. So I should certainly reduce the B2 bounds. I will use the table you posted above, together with the time it takes to run stage 2 to each of these limits, to figure out the optimal B2 bound. It is likely to be less than even the default.

Also, for both 1033- and 1033+, GMP-ECM recognized that the cofactors were of these numbers and used the special multiplication routines. Your edit confirms my suspicion that the residues are indeed reduced.
Old 2004-09-20, 18:04   #27
garo
 

Ok! So I finally downloaded and compiled ECM 5.0.3 with GMP 4.1.3. I ran it on the P4 and found that k=3 or k=4 was usually optimal, and that ecm does not always choose the best k. But it is usually reasonably close.

I ran GMP-ECM for various values of B2. I found that 40e9 is indeed very high and that, according to the curves-required table Alex gave a few posts above, the best performance is actually at 1.1e9, i.e. B2=100*B1! I give the running times below for P1033, but since the cofactors are similar in size, the numbers, or at least the trends, are representative.

Code:
B2=1.1e9: 11000 *  33s =  363k secs
B2=  5e9:  8200 * 101s =  828k secs
B2= 10e9:  7100 * 152s = 1079k secs
B2= 20e9:  6200 * 257s = 1593k secs
B2= 40e9:  5500 * 496s = 2728k secs
Ah! I was thinking that this obviously points to using 1.1e9 as the best solution, but then I realized that I was not factoring in stage 1 times. Stage 1 with mprime takes 440 secs, but that works for 4 numbers, so it can be divided by 4 to give about 110 secs per curve. So the new running times are:

Code:
B2=1.1e9: 11000 * ( 33+110)s = 1573k secs
B2=  5e9:  8200 * (101+110)s = 1730k secs
B2= 10e9:  7100 * (152+110)s = 1860k secs
B2= 20e9:  6200 * (257+110)s = 2275k secs
B2= 40e9:  5500 * (496+110)s = 3333k secs
Hmm. So B2=1.1e9 is still the winner, though not by as much. I guess this result holds for B1=11e6. For B2=3.3e9: 9000 curves * (66+110)s = say 1584k secs. And B2=2.2e9: say 10000 curves * (53+110)s = 1630k secs.
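For reference, these totals are just curves * (stage 2 seconds + stage 1 share); a throwaway sketch (Python) with the timings measured above:

Code:
# Total expected time at each B2: curves * (stage 2 secs + stage 1 share).
# The stage 1 share is 440 s on the P4 split over four numbers = 110 s.
data = [  # (B2, expected curves, stage 2 seconds per curve)
    (1.1e9, 11000, 33),
    (5e9,    8200, 101),
    (10e9,   7100, 152),
    (20e9,   6200, 257),
    (40e9,   5500, 496),
]
stage1_share = 440 / 4

for b2, curves, t2 in data:
    total_ksecs = curves * (t2 + stage1_share) / 1000
    print("B2=%7.1e: %5d curves -> %4.0fk secs" % (b2, curves, total_ksecs))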

I'm just posting this so I can get a sanity check on these numbers from Geoff or Alex. Is it okay for B2=100*B1 to be the optimal bound? I thought GMP-ECM was more efficient for stage 2, but I guess the difference is not enough to justify increasing the bounds for such small numbers.

Oh yeah, BTW, I ran mprime for stage 2 on M4132 with the same bound B2=1.1e9, and it took 192 seconds, whereas all four GMP-ECM stage 2s together took 142 seconds. So it is still worth doing separate stage 2s on the numbers using GMP-ECM.
Old 2004-09-20, 18:33   #28
R.D. Silverman
 

Quote:
Originally Posted by garo
Ok! So I finally downloaded and compiled ECM 5.0.3 with GMP 4.1.3. I ran it on the P4 and found that k=3 or k=4 was usually optimal, and that ecm does not always choose the best k. But it is usually reasonably close.


<snip>

I'm just posting this so I can get a sanity check on these numbers from Geoff or Alex. Is it okay for B2=100*B1 to be the optimal bound? I thought GMP-ECM was more efficient for stage 2, but I guess the difference is not enough to justify increasing the bounds for such small numbers.
Query: When you say that "k = 3 or k = 4" was usually optimal, what do you mean? What are you optimizing? Won't the optimal k depend on the step 2 limit chosen? {I assume here that k is the number of partitions for step 2; one partitions the step 2 interval into pieces}

If anyone would bother to read my joint paper with Sam Wagstaff Jr., they would find out that in order to "maximize the probability of success per unit time", you should spend the same amount of time in step 2 as in step 1. Adjust your limits accordingly.

Before one can "optimize", one must specify what is being optimized.

Theoretically, with an "idealized FFT implementation of step 2", one could take the step 2 limit to (step 1 limit)^2 in an amount of time equal to that spent in step 1. This is the theoretical best bound via FFTs.

GMP-ECM does indeed use better algorithms for step 2 than the one used by Mprime.

Reading Peter Montgomery's dissertation would also be useful. He did a superb job.

Last fiddled with by smh on 2004-09-20 at 20:35 Reason: Fixed quotes
Old 2004-09-20, 19:04   #29
akruppa
 

I'm a little surprised, too; I'd have expected gmp-ecm to make a larger B2 worthwhile.

But then again, Prime95 is really awfully efficient on a P4, whereas all-integer GMP is not. M4132 makes excellent use of Prime95's length-192 transform (threshold 4355), and at a cofactor size of ~250 digits GMP really starts lagging behind Prime95's DWT.

For comparison: on an Athlon XP 3000+, stage 1 with mprime takes 730s, while stage 2 on the c255 of P1033 with gmp-ecm 5.0.3 takes 26.5s. The ratio is more than twice what you see on the P4. If everything were done on an Athlon, B2=5e9 or 10e9 would be optimal.
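Spelling the ratios out with the timings quoted so far (a sketch only; it assumes both stage 2 figures are at the same B2=1.1e9, which the posts don't state explicitly):

Code:
# Per-machine stage 1 / stage 2 time ratio, from timings quoted in this
# thread (stage 1 on M4132 shared over four numbers; stage 2 on the
# ~c255 cofactor, presumably at B2=1.1e9 in both cases).
timings = {
    "P4 2.8 (mprime s1 / gmp-ecm s2)":          (440.0, 33.0),
    "Athlon XP 3000+ (mprime s1 / 5.0.3 s2)":   (730.0, 26.5),
}
for machine, (t1, t2) in timings.items():
    print("%-42s t1/t2 = %5.1f" % (machine, t1 / t2))
# The Athlon ratio (~27.5) is more than twice the P4's (~13.3), which is
# why a larger B2 pays off if everything runs on the Athlon.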

If you can do stage 2 on an Athlon, a slightly larger B2 may become worthwhile. Otherwise just stick with B2=1.1e9; at least it saves a little time compared to Prime95.

Alex
Old 2004-09-20, 21:06   #30
garo
 

Bob, you are correct: it is bad form to say "optimal" without saying what I was optimizing. I was running gmp-ecm 5.0.3 on a P4 under Linux; for the same B2 bound I ran stage 2 with different k values (yes, the number of big steps) and observed how the running time varies as k varies.

The times I posted above should be taken with a slight pinch of salt, since for different k values the B2' and b2 values chosen are different. Here b2 is the size of each big step, and B2' is simply k*b2, chosen so that it exceeds the specified B2.

So yes, k does depend on the limit chosen for stage 2, and what's more, k affects the "actual" limit for stage 2 as well. Regardless, the differences in times for similar B2' values resulting from different choices of k are significant enough to say that the value of k can affect the efficiency of step 2.
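As a simplified picture of that relationship (a sketch only: GMP-ECM's actual choice of b2 also depends on dF and other internals, so this shows just the rounding shape):

Code:
import math

def effective_b2(b2_requested, k):
    """B2' = k * b2, with the step size b2 rounded up so that B2' >= B2."""
    b2_step = math.ceil(b2_requested / k)  # size of each big step (simplified)
    return k * b2_step

for k in (3, 4, 5):
    print("k=%d: B2'=%d" % (k, effective_b2(40e9, k)))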

I did skim through Montgomery's dissertation, and I have downloaded your paper, though I have not had time to read it yet; I will likely do that this weekend. The rule of spending the same time in stage 1 and stage 2 is something I did not know of. Alex had previously mentioned that the time spent in stage 2 should be half the time spent in stage 1. But since he said this in the thread linked to by geoff, where geoff was doing stage 1 with mprime on M1574 and then stage 2 with gmp-ecm on P787 and M787 separately, I assume he meant that the stage 2 time for either P787 or M787 alone should be half the total time spent on stage 1 for M1574.

I would be inclined to follow your advice, except that I am running stage 1 on a 2.8 GHz P4 with mprime and stage 2 on an Athlon 1333 with gmp-ecm, so it is a little hard to match the timings. Stage 1 with mprime runs approximately 3 times faster on the P4, whereas stage 2 with gmp-ecm is only 50% faster. As Alex also indicates in his post, that probably muddles things up a bit.

Until I have time to sit down and read your paper and PM's dissertation, I will probably go ahead and run a few curves such that the time spent in stages 1 and 2 is approximately the same, after taking into account the difference in efficiency of mprime and gmp-ecm on the P4 and the Athlon. At this point it seems that this is at B2=5e9 or so.
Old 2004-09-21, 10:44   #31
garo
 
A question

I just started using Richard McIntosh's Maple program to calculate the number of curves required. With p=45.5 (# of digits) and B1=11e6 I got the following numbers of curves:

B2=11e8: 10200
B2=22e8: 8657
B2=26e9: 5227 (GMP-ECM default)
B2=40e9: 4830

These numbers are lower than what Alex posted, and this presumably does not even take into account the Brent-Suyama extension.

I also cross-checked against the numbers in http://www.mersenneforum.org/showthread.php?t=2326 and found that the Maple program underestimates relative to what has been posted there as well. However, if the Brent-Suyama extension figures posted by Alex are taken into account, the Maple program overestimates.

Any clue as to why this is happening and what I should trust?

Last fiddled with by garo on 2004-09-21 at 10:47
Old 2004-09-21, 12:40   #32
R.D. Silverman
 

Quote:
Originally Posted by garo
I just started using Richard McIntosh's Maple program to calculate the number of curves required. With p=45.5 (# of digits) and B1=11e6 I got the following numbers of curves:

B2=11e8: 10200
B2=22e8: 8657
B2=26e9: 5227 (GMP-ECM default)
B2=40e9: 4830

These numbers are lower than what Alex posted, and this presumably does not even take into account the Brent-Suyama extension.

I also cross-checked against the numbers in http://www.mersenneforum.org/showthread.php?t=2326 and found that the Maple program underestimates relative to what has been posted there as well. However, if the Brent-Suyama extension figures posted by Alex are taken into account, the Maple program overestimates.

Any clue as to why this is happening and what I should trust?
Underestimating.
Overestimating.

By how much???

If you read my paper with Sam Wagstaff Jr., you will discover that the response surface [probability of success with given limits and multiple curves] is VERY shallow in the neighborhood of the optimum.

You are trying to estimate the number of curves with much too fine a precision. I would want multi-precision versions of Dickman's function before trying to estimate #curves with the degree of precision you (seemingly) want. The response surface is just too flat in the region of the optimum.

Furthermore, do you know how #curves gets computed? It is 1/(per curve probability). Now look at 1/(per curve probability - epsilon). #curves is quite sensitive to small changes in the estimated probability.

Furthermore, trying to compute #curves for 45.5 digits is just silly in my opinion. I presume that you mean sqrt(10)*10^44 for 45.5 digits?

I would use the GMP-ECM default. It does a good job balancing step 1/step 2 times.

Note also that the probability depends on whether curves are chosen with group order a priori divisible by 12, or whether curves are chosen so the order is divisible by 8 with prob = .5 and by 16 with prob = .5.
Old 2004-09-21, 12:57   #33
garo
 

Hi Bob,
Yes, I do know how the # of curves is calculated; I looked through the Maple code. Thanks for pointing out that I shouldn't try to estimate the number of curves with too fine a degree of precision.
The Maple program output is approximately 5% lower than the estimates posted by geoff and 5% higher than the estimates posted by Alex after the Brent-Suyama extension.

Also, if you read the post by philmoore where he posted the Maple program, you will find that it explains the reason for choosing 45.5 digits: it is indeed a way to estimate the probability of finding a factor of size sqrt(10)*10^44.

I'll hold my tongue (or rather my fingers) till I read your paper and have some time to think about it. But in the meantime, please try not to treat me just like those jokers who keep coming up with super-efficient new ways to prove primality.