mersenneforum.org > Factoring Projects > GMP-ECM
Old 2016-07-13, 16:53   #12
fivemack
(loop (#_fork))
Feb 2006, Cambridge, England

The script as I shipped it will never recommend [1,x], because I had no figures for B1=1e6 in the file; and its timings for 140 digits will be extrapolations from data for 150..250. Did you replace all the timings with ones you measured yourself, or just add the ones for B1=1e6?

I would be quite inclined to remeasure ECM timings with various B1 on the inputs 10^140+13 and 10^130+1113, if you want to draw any conclusions for those smaller inputs.
Old 2016-07-13, 18:30   #13
henryzz
Just call me Henry
"David", Sep 2007, Cambridge (GMT/BST)

Quote:
Originally Posted by fivemack View Post
The script as I shipped it will never recommend [1,x], because I had no figures for B1=1e6 in the file; and its timings for 140 digits will be extrapolations from data for 150..250. Did you replace all the timings with ones you measured yourself, or just add the ones for B1=1e6?

I would be quite inclined to remeasure ECM timings with various B1 on the inputs 10^140+13 and 10^130+1113, if you want to draw any conclusions for those smaller inputs.
I ran some timings myself and added them for 1e6. The CPU I was running on did the 3e6 curves about 1.5x faster, so I multiplied the times by that factor. The results were roughly in line with yours. If you have an automated way of measuring, that would be better than mine. I interpolated a fair few (which I would guess the script does itself).
It would probably be worth running 1e6 on smaller numbers.
Ideally, the more levels of ECM we add the better. Once the initial timings are run, the rest is all automated. I think it would be interesting to show the aliquot sequence people how inefficient their ECM is. Would it be possible to show the chance that the number goes to NFS after all the ECM? It would be interesting to compare this with the value for the standard yafu ECM; I reckon there could be a fairly large difference.
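The "chance the number goes to NFS" could be estimated with the usual rule of thumb that ECM success is roughly Poisson: if N is the expected number of curves needed to find a factor of a given size at some B1, then after running n curves the probability of having missed such a factor is about exp(-n/N). A minimal sketch; the curve-count table is an illustrative placeholder of the right order of magnitude, not measured data:

```python
import math

# Illustrative (digits of factor) -> (B1, expected curve count) table.
# These values are placeholders of the order of the classic GMP-ECM
# recommendations, not measurements from this thread.
EXPECTED_CURVES = {35: (1_000_000, 1800), 40: (3_000_000, 5100)}

def prob_factor_missed(curves_run: int, expected_curves: int) -> float:
    """Probability that a factor of the target size was missed after
    curves_run curves, modelling each curve as an independent trial."""
    return math.exp(-curves_run / expected_curves)
```

For example, running exactly the expected number of curves for a factor size leaves roughly a 1/e ≈ 37% chance of having missed such a factor, which is why a "go to NFS now" probability is well defined for any partial ECM effort.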
There is more to be gained per number with larger numbers, but many more small numbers are done.
One useful feature would be automatically increasing/decreasing the number of curves added per run. For 1e6 and 3e6, 100 curves is too many, but by the time you get to 11e7 it can be too few.
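The batch-size idea could be as simple as scaling the curves-per-run to a target wall-clock time; a hypothetical sketch (the one-hour target and the clamping bounds are assumptions, not from the thread):

```python
def curves_per_run(per_curve_seconds: float,
                   target_seconds: float = 3600.0,
                   lo: int = 10, hi: int = 1000) -> int:
    """Pick a curve count so one run takes roughly target_seconds,
    clamped to sane bounds so tiny/huge B1 stay workable."""
    n = round(target_seconds / per_curve_seconds)
    return max(lo, min(hi, n))
```

With fast 1e6 curves this saturates at the upper bound, while slow 11e7 curves fall back to the lower bound, which is exactly the adjustment described above.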

edit: It may be worth making a script that people can run to generate all the timing and probability data. This would allow full flexibility in optimizing the ECM effort. There is no reason we need to be constrained by the traditional bounds; the optimal solution would have a different bound for each curve.

Last fiddled with by henryzz on 2016-07-13 at 18:44
Old 2016-07-16, 07:14   #14
yoyo
Oct 2006, Berlin, Germany

The current measurements for B1=29e8, without a memory limitation and with -maxmem 1800:

[Attachment 14647: B1_2900e6.PNG — runtime and memory measurements for B1=29e8]

Learnings from this:
- It should fit on a BOINC user's system, even more so if I split the work into phase 1 jobs and phase 2 jobs.
- Phase 2 runtimes doubled with -maxmem 1800. This will be even worse for larger composites and larger B1, so I will test larger B1 with -maxmem 12G.
Old 2016-07-16, 07:48   #15
cgy606
Feb 2012

Quote:
Originally Posted by yoyo View Post
The current measurements for B1=29e8, without a memory limitation and with -maxmem 1800:

Attachment 14647

Learnings from this:
- It should fit on a BOINC user's system, even more so if I split the work into phase 1 jobs and phase 2 jobs.
- Phase 2 runtimes doubled with -maxmem 1800. This will be even worse for larger composites and larger B1, so I will test larger B1 with -maxmem 12G.
Let me try to offer some ideas for your project. It sounds like you're trying to find very large factors (~70 digits) of certain composites, which should (as xilman said) have length > 225 digits (assuming a standard 4/13 ECM depth). Thus you shouldn't even look at run-times for composites with fewer digits than this (though for the sake of completeness you could test these B1 limits on smaller composites, to better extrapolate run-times for larger ones).

It sounds like, from previous posts, that the machinery used for this project is not "state of the art", i.e. perhaps 3-5 years old and limited to around 12GB of RAM per machine. It would be nice to know what kind of CPUs these machines have (and whether they have GPUs of any sort), as you could probably divide the workload among this hardware to speed up your project. Personally, I wouldn't even attempt B1 = 2.9G until I had run something like 70K curves at B1 = 850M, which is a hell of a lot! But let's assume you have done this level of work on some composites. I wouldn't be so concerned with the time difference between one thread using 16GB of RAM in stage 2 versus 1.8GB of RAM per thread in stage 2. The reason is that you will need to run so many curves that it is better to think about maximizing the parallelization of your problem than the speed per curve.

To illustrate, I'll assume that your 5-year-old computers have 16GB of RAM and Intel dual-core processors with hyper-threading (which is pretty common CPU technology from 5 years ago) and 64-bit architecture. If we take a 250-digit composite, stage 2 using 12GB of RAM (assume 11GB for ecm and another 1GB for transient CPU activity) will take 1.73h, while it will take 4.66h at 1.8GB. Now, in reality you can use up to 4GB per thread, so this time might come down. However, these numbers assume a single thread, which is all you can run on your 16GB machine at the default 11GB per curve in stage 2. But if you use only 4GB per thread, you can run 4 curves at once (which is possible on a hyper-threaded dual-core processor). Running 4 curves at once is about 50% slower per thread than running a single curve (based on my empirical observations), so the effective speed-up from using less RAM per curve is 4/1.5 = 2.67x. When you take into account that reducing the RAM per curve from 11GB to 2.75GB should make each curve take about 2 times longer (I noticed this when comparing stage 2 run-times at B1 = 3M and B1 = 260M on a C200 with default memory and with memory reduced by a factor of 4), this results in a net speed-up of about 35%. Not bad for rationing your RAM usage.
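The back-of-envelope arithmetic above can be checked mechanically; the inputs are the numbers quoted in the post (2.67/2 ≈ 1.33, i.e. roughly a one-third net gain, which the post rounds to ~35%):

```python
# Reproduce the speedup arithmetic from the post above.
single_curve_slowdown = 1.5    # 4 curves at once: each curve ~50% slower per thread
curves_in_parallel = 4
parallel_speedup = curves_in_parallel / single_curve_slowdown  # throughput gain

low_mem_slowdown = 2.0         # stage 2 at ~2.75GB instead of 11GB: ~2x slower
net_speedup = parallel_speedup / low_mem_slowdown

print(round(parallel_speedup, 2), round(net_speedup, 2))  # 2.67 1.33
```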

Of course, if you have a GPU you can realize even higher speed-ups (compared to just using CPUs), as you can let the GPUs do hundreds or thousands of stage 1 curves at high B1 values, save the files, and let your CPUs do stage 2. I think that is the way to go, but ultimately you know what hardware is available to you, and hopefully you can find the optimal strategy to get your desired factors. Best of luck to you!

Last fiddled with by cgy606 on 2016-07-16 at 07:58
Old 2016-07-16, 08:33   #16
yoyo
Oct 2006, Berlin, Germany

Thank you for your hints.

Yes, it is an older system with 16GB of RAM (this one: http://universeathome.pl/universe/sh...p?hostid=41792). The main purpose of my tests is to learn how runtime and memory increase with composite length and larger B1. Afterwards I will derive formulas runtime = f(C, B1) and memory = f(C, B1), which I need as predictions so I can send BOINC workunits to my users while avoiding sending workunits to systems which cannot handle them.
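One simple way to derive such formulas is to fit a power law, runtime ≈ a·C^b·B1^c, by least squares in log space. A sketch; the (C, B1, time) triples below are synthetic placeholders with known exponents, and yoyo's real measurements would replace them:

```python
import numpy as np

# Fit runtime ≈ a * C^b * B1^c via  log t = log a + b*log C + c*log B1.
C  = np.array([200.0, 250.0, 300.0, 350.0])   # composite size in digits
B1 = np.array([85e7, 85e7, 29e8, 29e8])       # stage 1 bound
t  = 2e-9 * C**2.0 * B1**1.0                  # fake timings, exponents b=2, c=1

A = np.column_stack([np.ones_like(C), np.log(C), np.log(B1)])
coef, *_ = np.linalg.lstsq(A, np.log(t), rcond=None)
log_a, b, c = coef

def predict_runtime(C_digits: float, B1_val: float) -> float:
    """Predicted runtime for a new (composite size, B1) pair."""
    return float(np.exp(log_a) * C_digits**b * B1_val**c)
```

The same three-line fit works for memory = f(C, B1); with real data the fitted exponents tell you directly how costs scale as the project moves to larger composites and bounds.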

I checked the active hosts. More than 200 of them have more than 20GB of RAM.

I currently run ecm up to B1=850M with -maxmem 1800. 1.8G because many users have 2G per core, and 2G is the maximum a 32-bit system can handle.

yoyo
Old 2016-07-18, 19:57   #17
pinhodecarlos
"Carlos Pinho", Oct 2011, Milton Keynes, UK

yoyo,

Do you think yoyo@home could mount a large ECM effort on the Cunningham base 2 numbers (with B1 ~ 2^32)?
Perhaps we might get Bruce to indicate the total ECM work he has performed on these numbers before making a decision. EPFL ran 19000 curves with B1 = 10^9 on all of the 2^n+1, n odd, composites.

Carlos and Silverman
Old 2016-07-18, 20:21   #18
yoyo
Oct 2006, Berlin, Germany

Currently I tend towards making 2 BOINC apps, one for phase 1 and one for phase 2.
I want to limit phase 2 to 10G of RAM usage, and not much less than that, because a lower limit increases the runtime and phase 2 seems to have no checkpoints.

With this at least B1=2.9e9 up to C400 is possible.

Tests for large B1 are still running.
Old 2016-12-04, 21:43   #19
yoyo
Oct 2006, Berlin, Germany

Which command line options do I have to use to run only phase 1, and only phase 2, for a given composite and B1?
Old 2016-12-04, 22:16   #20
amphoria
"Dave", Sep 2005, UK

Quote:
Originally Posted by yoyo View Post
Which command line options do I have to use to run only phase 1, and only phase 2, for a given composite and B1?
For stage 1 only, set B2=1 and create a stage 1 save file with -savea.

For stage 2, start with -resume and the stage 1 save file.
Old 2016-12-05, 11:29   #21
yoyo
Oct 2006, Berlin, Germany

If stage 1 is done on the GPU of a single computer, and stage 2 on multiple computers, can the stage 1 save file then also be used on the many computers which run stage 2?
Old 2016-12-05, 11:56   #22
fivemack
(loop (#_fork))
Feb 2006, Cambridge, England

Quote:
Originally Posted by yoyo View Post
If stage 1 is done on the GPU of a single computer, and stage 2 on multiple computers, can the stage 1 save file then also be used on the many computers which run stage 2?
Yes; in theory you can run each line from the stage 1 save file on a separate computer. I have often sliced up the file and run stage 2 on 48 cores.
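Since each line of the save file is an independent stage-1 residue, slicing it amounts to splitting the file by lines. A hypothetical helper; the file names and the resume command in the final comment are illustrative, not a tested invocation:

```python
from pathlib import Path

def slice_save_file(path: str, n_chunks: int) -> list[str]:
    """Split a GMP-ECM stage 1 save file into n_chunks files, one per
    stage 2 worker; each line is an independent stage 1 residue."""
    lines = Path(path).read_text().splitlines(keepends=True)
    chunk_paths = []
    for i in range(n_chunks):
        out = Path(f"{path}.part{i}")
        out.write_text("".join(lines[i::n_chunks]))  # round-robin assignment
        chunk_paths.append(str(out))
    return chunk_paths

# Each worker would then resume stage 2 on its slice, along the lines of:
#   ecm -resume save.part0 -maxmem 10000 <B1>
```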

Last fiddled with by fivemack on 2016-12-05 at 11:56