#12
P90 years forever!
Aug 2002
Yeehaw, FL
7,537 Posts
OK. I'm seeing similar behavior ECMing M5000000. I'm not seeing it for M500000. This leads me to believe the estimates go bad when ECM data exceeds my 6MB L2 cache.
I'll investigate some more by profiling.
#13
P90 years forever!
Aug 2002
Yeehaw, FL
7537₁₀ Posts
Here's the deal:
ECM has two possible stage 2 implementations. The first does 2 FFTs (the same cost as one squaring) and one addition per stage 2 prime. It also does some modular inverses (the same cost as a GCD) and some muls and adds, depending on the number of temporaries that can be allocated. The second implementation does 4 FFTs and two additions per stage 2 prime.

For small ECM numbers, Prime95 does a fairly good job of estimating the time. It is probably off by ~10% in both stage 1 and stage 2 because it considers only the cost of the FFTs -- and the additions aren't free. A large number of temporaries can be allocated, meaning we don't do many GCDs, and they don't cost much anyway.

As numbers get larger, fewer temporaries can be allocated and GCDs get more costly. Consequently, stage 2 approaches twice as long as estimated. Furthermore, cache miss penalties get larger (you need triple the prefetch bandwidth to do a multiply with 2 sources and 1 destination, as opposed to an LL test's squaring, where the source and destination are the same). The extra stage 1 and 2 overhead is more like 20%. Finally, as numbers get very large, Prime95 switches to the 4 FFT stage 2, which costs exactly double what Prime95 is estimating.

The amount of memory you let Prime95 use, plus your CPU's cache sizes and miss penalties, make an exact formula near impossible. And for the Primenet server it is impossible, as it doesn't have nearly enough information available.

So, I'm considering something along these lines: for exponents below, say, 100,000, add a 10% overhead to the time estimate and Primenet credit. Between 100,000 and, say, 5,000,000, linearly increase the overhead from 10% to 20% and linearly increase the stage 2 estimate from 1x to 2x. It isn't perfect, but it would be better. For example, exponents >= 5M currently get 13 "units of credit" in stage 1 and 6 in stage 2. This would increase to (13+12)*1.2 -- in other words, from 19 to 30.

Last fiddled with by Prime95 on 2010-10-28 at 02:37
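The proposed adjustment above can be sketched as follows (a minimal sketch, not Prime95's actual code; the function name and argument names are hypothetical, while the 10%/20% overhead endpoints, the 1x/2x stage 2 ramp, and the 100,000 / 5,000,000 breakpoints come from the post):

```python
def adjusted_credit(exponent, stage1_credit, stage2_credit):
    """Scale ECM credit per the proposed scheme: below 100,000 add a
    flat 10% overhead; between 100,000 and 5,000,000 linearly ramp the
    overhead from 10% to 20% and the stage 2 multiplier from 1x to 2x;
    at and above 5,000,000 use 20% overhead and a 2x stage 2 multiplier."""
    lo, hi = 100_000, 5_000_000
    if exponent <= lo:
        frac = 0.0
    elif exponent >= hi:
        frac = 1.0
    else:
        frac = (exponent - lo) / (hi - lo)
    overhead = 1.10 + 0.10 * frac   # ramps 1.10 .. 1.20
    stage2_mult = 1.0 + 1.0 * frac  # ramps 1.0 .. 2.0
    return (stage1_credit + stage2_credit * stage2_mult) * overhead

# The post's example: a >= 5M exponent with 13 stage 1 units and
# 6 stage 2 units goes from 19 to (13 + 12) * 1.2 = 30 units.
print(adjusted_credit(5_000_000, 13, 6))  # 30.0
```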
#14
Aug 2002
North San Diego County
5·137 Posts
Looks like a rational solution to me. Thanks for taking the time to suss it out.
#15
May 2010
3²·7 Posts
Your explanation makes sense.
I had been doing some OCing and benchmarking with LL and noticed a significant (4%) increase in iteration time between 2 and 3 LL tasks running simultaneously. Dropping to 1 core improved my iteration time slightly, but not by much (1%). I figured the various threads were fighting over a common resource (memory or L3 cache). I was about to repeat the benchmarking using ECM tests to see if they too were fighting over memory, but didn't get around to it before you posted. All my previous posts were referring to a PC with saturated cores (1 ECM + x LL) on my K8 dual core and i7 quad. On with the bug search!
#16
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2²·3·17·23 Posts
Quote:
Just let me know when, or what version I need to upgrade to. Thanks
#17
|
P90 years forever!
Aug 2002
Yeehaw, FL
7,537 Posts
The server is now handing out the enhanced ECM cpu credit.
Past ECM credit has been bumped about 20%.
#18
|
Aug 2002
North San Diego County
685₁₀ Posts
Cool! Thanks!
#19
|
Dec 2007
Cleves, Germany
2·5·53 Posts
Quote:
New ... Rank 41/425, GHzd 283.8886, Count 644. How come I'm not exactly satisfied?
Last fiddled with by ckdo on 2010-11-06 at 05:49 |
#20
|
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2²·3·17·23 Posts
Quote:
To compare:

New: 5277187 NF-ECM 2010-11-06 05:24 45.0 3 curves, B1=50000, B2=5000000 0.7204
Old: 5277169 NF-ECM 2010-11-05 17:52 44.5 3 curves, B1=50000, B2=5000000 0.4501

Note: credit went from 0.4501 to 0.7204 (60% more in this case). Seems reasonable to me. Thanks
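As a quick sanity check on these two reports (a sketch only; the 13:6 stage 1 : stage 2 split is borrowed from post #13's >= 5M example, not taken from this report):

```python
# Observed credit change between the two NF-ECM results quoted above.
old_credit, new_credit = 0.4501, 0.7204
observed = new_credit / old_credit  # roughly 1.60x

# Hypothetical prediction: assume post #13's 13:6 stage 1 : stage 2
# split; for exponents >= 5M the new scheme doubles the stage 2
# portion and adds a 20% overhead to the whole.
s1, s2 = 13, 6
predicted = (s1 + 2 * s2) * 1.2 / (s1 + s2)  # 30/19, roughly 1.58x

print(f"observed {observed:.2f}x, predicted {predicted:.2f}x")
```

The two ratios land within a couple percent of each other, consistent with the "seems reasonable" verdict.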
#21
|
P90 years forever!
Aug 2002
Yeehaw, FL
1110101110001₂ Posts
Quote:
Code:
Pre-2009           651000   1.1813
1/1/9  - 4/1/9    1103000   1.2506
4/1/9  - 7/1/9     632344   1.1785
7/1/9  - 10/1/9    734200   1.1939
10/1/9 - 1/1/10    733926   1.1939
1/1/10 - 4/1/10    746563   1.1958
4/1/10 - 7/1/10    843000   1.2105
7/1/10 - 10/1/10   999628   1.2346
10/1/10 - pres.    909239   1.2207
#22
|
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
224268 Posts
Significantly less important, but maybe the following belongs in this thread:
The PRP estimated times are about half of what it really takes; for example, try:

Code:
[Worker #1]
PRP=2,10,249448,1,0,0,"3"
PRP=2,10,249447,1,0,0,"3"
PRP=2,10,249442,1,0,0,"3"
PRP=2,10,249435,1,0,0,"3"
PRP=2,10,249431,1,0,0,"3"

It takes about 7 minutes on a particular computer, while the Status page "promises" ~3.5 minutes.
Similar Threads
| Thread | Thread Starter | Forum | Replies | Last Post |
|---|---|---|---|---|
| Estimated relations Factmsieve | cimpresovec | Msieve | 21 | 2016-01-17 15:58 |
| Question about Estimated Days to Complete | Mark Rose | GPU to 72 | 5 | 2013-10-04 06:12 |
| Estimated completion dates | Yura | Software | 3 | 2012-11-13 19:45 |
| Time it takes to select polynomials for 154 digits | John5788 | Factoring | 23 | 2008-08-27 07:54 |
| Prime95 takes over machine???!!! | kwstone | Software | 4 | 2003-08-10 22:46 |