#45
Serpentine Vermin Jar
Jul 2014
5·677 Posts
Quote:
Now I took those 240 results, and across these same systems I can run 10 instances at a time without causing memory issues (I'm assuming 20 GB per instance). Each instance will process 24 of the results. In detail, it's 4 machines (2x6 core Xeons). 3 of the 4 had enough extra RAM to run 2 at once, and one of them was only using 50 GB out of 144 GB, so I'm giving it 4 to run at once. Too bad that machine has slightly slower CPUs (2.53 GHz instead of 3.47 GHz... oh well).
I just slightly tweaked the process I put together for having gmp-ecm do everything. Mostly I didn't want to click around a bunch to set affinity and priority, and I only had to modify it slightly to grab a particular results file.
Anyway, those are all going now. I can't remember what it was last time, but I think it was taking something like 90 minutes or so to run stage 2 on each result. I guess in 36 hours I should have all 240 curves finished with stage 2. BTW, I just let gmp-ecm figure out B2 (it's using B2=105101237217912).
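For anyone wanting to redo the planning arithmetic above for their own boxes, here is a minimal sketch. The function names are made up for illustration, and the 20 GB per instance and 90 minutes per result are just the estimates from the post, not measured values.

```python
import math

def stage2_plan(results, instances, minutes_per_result):
    """Return (results per instance, wall-clock hours) when instances run in parallel."""
    per_instance = math.ceil(results / instances)
    return per_instance, per_instance * minutes_per_result / 60

def fits_in_ram(used_gb, total_gb, instances, gb_per_instance):
    """True if the extra instances fit next to what the machine already uses."""
    return used_gb + instances * gb_per_instance <= total_gb

per_instance, hours = stage2_plan(results=240, instances=10, minutes_per_result=90)
print(f"{per_instance} results per instance, about {hours:.0f} hours")

# The 144 GB machine that was already using 50 GB, given 4 instances at 20 GB each.
print(fits_in_ram(used_gb=50, total_gb=144, instances=4, gb_per_instance=20))
```

This ignores any slowdown from running several instances on the same machine, so treat the hours as a lower bound.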

#46
Serpentine Vermin Jar
Jul 2014
5×677 Posts
Quote:
I guess I could try setting the affinity on multiple threads to different NUMA nodes, but I don't even know if that would help. When the process started up it didn't have an affinity, so the Windows scheduler (which is NUMA aware) probably didn't know to try to give it memory on those channels? One more reason the affinity setting *should* be baked into the EXE so it's defined at launch.
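Pinning an instance to one NUMA node comes down to picking a CPU bitmask. A minimal sketch of that calculation, assuming logical CPUs are numbered contiguously per node and hyper-threading is off (so 6 logical CPUs per node on a 2x6 core box); both assumptions need checking on the actual machine, and the function name is hypothetical.

```python
def node_affinity_mask(node, cpus_per_node):
    """Bitmask covering every logical CPU on one NUMA node.

    Assumes node 0 owns CPUs 0..n-1, node 1 owns CPUs n..2n-1, and so on.
    """
    return ((1 << cpus_per_node) - 1) << (node * cpus_per_node)

for node in range(2):
    print(f"node {node}: mask {node_affinity_mask(node, 6):#x}")
```

A hex mask of this kind is what affinity tools generally expect, for example the `/affinity` option of the Windows `start` command.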

#47
Serpentine Vermin Jar
Jul 2014
5·677 Posts
[QUOTE=Madpoo;401029]It occurred to me that my previous tests were just running one per machine. Doing 2 (or 4) on the same system may create memory contention well beyond what I saw before and slow down the individual threads. I guess I'll find out.[/QUOTE]
It seems like it might be just fine. On the faster 3.47 GHz systems it's still taking about 94 minutes to do stage 2 on each one, which is about the same as before when only one was running (this one has 2 going). I did manually change the affinity so each instance was on its own NUMA node, which may have been a good idea.
On the slower 2.53 GHz systems, which I didn't benchmark previously, it's doing each stage 2 in about 128 minutes.
2.53 GHz / 3.47 GHz = 0.73
94 minutes / 128 minutes = 0.73
So yeah, seems like it scales pretty evenly with CPU speed, everything else being equal (similar servers, similar mem speed, same Xeon class CPU, etc.)
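The same comparison as a quick sketch, in case someone wants to plug in their own clocks and timings. It only checks the simple "time scales inversely with clock speed" model against the two figures in the post; the variable names are illustrative.

```python
fast_ghz, slow_ghz = 3.47, 2.53
fast_minutes, slow_minutes = 94, 128

clock_ratio = slow_ghz / fast_ghz          # relative speed of the slow machine
time_ratio = fast_minutes / slow_minutes   # relative throughput actually observed

# Stage 2 time the slow machine would need if time scaled purely with clock speed.
predicted_slow_minutes = fast_minutes * fast_ghz / slow_ghz

print(f"clock ratio {clock_ratio:.2f}, time ratio {time_ratio:.2f}")
print(f"predicted {predicted_slow_minutes:.0f} min vs observed {slow_minutes} min")
```

The prediction lands within a minute or so of the observed 128 minutes, which is what "scales pretty evenly" amounts to.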

#48
"GIMFS"
Sep 2002
Oeiras, Portugal
1,571 Posts
Well, that's what I call a powerhouse! Those 240 curves will probably end up counting as more than 1000 at the default/current bounds set in the PrimeNet server.
Have you measured the time each curve is taking in S1? According to some posts in this thread, the S1:S2 ratio should be ~1:0.7. So if S2 is taking 94 minutes and S1 more than 94/0.7 = 134 minutes, you could try larger B2 values until you attain that ratio.

#49
"GIMFS"
Sep 2002
Oeiras, Portugal
1,571 Posts
Quote:
It's already becoming apparent, though, that for smaller exponents the difference is not as big as for larger ones, in line with what you wrote in your post.

#50
Serpentine Vermin Jar
Jul 2014
5×677 Posts
Quote:
The server that was taking 128 minutes in stage 2 with gmp-ecm was taking 5 hours, 45 minutes in stage 1 with P95. Sounds like you're saying that I should really be goosing up B2 until it takes around 4 hours in stage 2?
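The arithmetic behind that "around 4 hours" figure, as a rough sketch. The 1:0.7 ratio is the value quoted earlier in the thread and is taken here as an assumption, not something established; the function name is made up.

```python
def target_stage2_minutes(stage1_minutes, s2_per_s1=0.7):
    """Stage 2 time to aim for, given stage 1 time and an assumed S1:S2 ratio of 1:s2_per_s1."""
    return stage1_minutes * s2_per_s1

stage1_minutes = 5 * 60 + 45   # 5 hours 45 minutes in Prime95 stage 1
current_stage2_minutes = 128   # current gmp-ecm stage 2 time on that server

target = target_stage2_minutes(stage1_minutes)
print(f"target stage 2: {target:.0f} min (about {target / 60:.1f} h)")
print(f"room to grow: {target / current_stage2_minutes:.1f}x the current stage 2 time")
```

Passing `s2_per_s1=1.0` gives the target for the "equal time in both stages" rule instead.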

#51
"Curtis"
Feb 2005
Riverside, CA
3³·11·19 Posts
Quote:
Edit: It's less clear that increasing B2 without increasing memory (which increases the number of steps to finish stage 2, a parameter GMP-ECM calls "k") will prove more efficient. If you are memory-limited to this current stage 2 footprint, it may be that only a small increase in B2 is worthwhile.
B2 increases in large steps, each corresponding to a unit change in k. If your current test uses k=3, the next B2 would be 1/3 bigger, with k=4, for the same memory footprint. By default GMP-ECM uses k values 2 through 6, followed by a doubling of memory and a reset to k=2 for a bigger, more efficient work chunk. If you set maxmem={number too small for default k choice}, the program will stick to the smaller work-chunk size, increasing k beyond 6. This is usually less efficient, but experimentation is required (it depends on individual machine specs).
Last fiddled with by VBCurtis on 2015-04-28 at 04:23 Reason: added detail about k parameter
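To make the step sizes concrete, here is a small sketch of the "one more block at the same memory footprint" reasoning. It assumes B2 is simply proportional to k at a fixed work-chunk size, which is a simplification of how GMP-ECM actually picks B2; the function name is hypothetical.

```python
from fractions import Fraction

def b2_step_increase(k):
    """Fractional growth in B2 when going from k to k+1 chunks of the same size.

    Assumes B2 is proportional to k at a fixed memory footprint.
    """
    return Fraction(k + 1, k) - 1

# Default range described in the post: k from 2 through 6.
for k in range(2, 6):
    print(f"k={k} -> k={k + 1}: B2 grows by {b2_step_increase(k)}")
```

Under this model the steps get relatively smaller as k rises, so the coarsest jump is the first one, from k=2 to k=3.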

#52
Sep 2010
Scandinavia
3·5·41 Posts
I'm pretty sure B2 should be increased, even with a memory constraint.

#53
"Bob Silverman"
Nov 2003
North of Boston
2²×1,877 Posts
Quote:
Optimal ECM performance is obtained when one spends the same amount of TIME in step 1 and step 2.

#54
"GIMFS"
Sep 2002
Oeiras, Portugal
1,571 Posts
According to posts #16, 22 and 32 of this thread, written by someone who has apparently read your papers, the ratio is 1:0.7, hence my observation. I lack the math background to fully understand what's involved, so I trusted what seemed to come from a reliable source. I'm obviously happy to be corrected by someone as qualified in the subject as yourself.

#55
Einyen
Dec 2003
Denmark
19·181 Posts
Quote:
Code:
M1277   B1=110M   Prime95: 16 min     GMPECM: 27 min
M1277   B1=44M    Prime95: 6.5 min    GMPECM: 9.5 min
M2137   B1=44M    Prime95: 8.5 min    GMPECM: 29 min
M10061  B1=44M    Prime95: 34.5 min   GMPECM: 277 min
Last fiddled with by ATH on 2015-04-28 at 12:33
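Turning the timings above into ratios makes the trend easier to see. A minimal sketch using only the four rows from the table; the variable names are illustrative.

```python
# (number, B1, Prime95 minutes, GMP-ECM minutes), copied from the table above
timings = [
    ("M1277", "110M", 16.0, 27.0),
    ("M1277", "44M", 6.5, 9.5),
    ("M2137", "44M", 8.5, 29.0),
    ("M10061", "44M", 34.5, 277.0),
]

# How many times longer GMP-ECM took than Prime95 for each row.
slowdowns = {(name, b1): gmp / p95 for name, b1, p95, gmp in timings}

for (name, b1), ratio in slowdowns.items():
    print(f"{name:7s} B1={b1:5s} GMP-ECM / Prime95 = {ratio:.1f}x")
```

On these four data points the gap widens as the exponent grows, from under 2x at M1277 to roughly 8x at M10061.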