mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GMP-ECM (https://www.mersenneforum.org/forumdisplay.php?f=55)
-   -   ECM for CUDA GPUs in latest GMP-ECM ? (https://www.mersenneforum.org/showthread.php?t=16480)

Batalov 2012-02-25 20:10

Small factors will be found with very many sigma values - and it doesn't matter if random or sequential values are used for sigma.
Prefactor your input number (with yafu, for example) until it will become interesting to ECM it.

xilman 2012-02-26 10:14

[QUOTE=Cyril;290764]Hey

As said above the more recent GPU-ECM program is in the gpu_ecm subdirectory (and NOT gpu_ecm_cc13 even for cards of compute compatibility 1.3).

The NB_DIGITS stuff is still highly experimental and will most of the time either crash or return wrong results. Only the 1024 arithmetic (the default case) is, for now, working. But every feedbacks from experiments with NB_DIGITS are welcome.

Cyril[/QUOTE]Hi Cyril, welcome aboard!

Thanks for the warnng. I'd not seen any difficulties myself with halving NB_DIGITS but it's good to know that I'm on dangerous ground.

Paul

rcv 2012-03-29 02:35

Can someone answer a theory question on the ECM algorithm, please.

I know it is normal to run curves one level at a time to remove small factors. But does the presence of small factors inhibit higher level curves from finding big factors.

Practically speaking, if gmp_ecm runs at the same speed for all numbers, regardless of whether they are 769 bits or 1018 bits, then could I just run a set of curves at B1=260e6, without bothering to remove the small factors.

For example, if I run (2^1009+1) through the program, at B1=260000000, nearly every curve will find factors. Are those small factors merely noise or do the small factors prevent the algorithm from finding factors of the big C244?

I'm talking ONLY stage 1, here.

Thanks in advance.

firejuggler 2012-03-29 03:18

the factor here is *time* if 1 curve take 10 sec at B1=11e3, it will take 1000 sec at B1=1e6 ( or close). I would rather do 200 curve at B1 = 11e3 and find *all* the 15 digits factor than 3 at B1=1e6 to find them. and with a large B1 you are more prone to find a *composite* factor you will have to ... factor

rcv 2012-03-29 04:04

[QUOTE=firejuggler;294599]the factor here is *time* ...[/QUOTE]
Thank you. I understand all that.

But, I'm really asking a deeper question about the theoretical behavior of the ECM algorithm.

jyb 2012-03-29 04:20

[QUOTE=rcv;294605]Thank you. I understand all that.

But, I'm really asking a deeper question about the theoretical behavior of the ECM algorithm.[/QUOTE]

I believe firejuggler's answer did address at least one aspect of the theoretical behavior you're talking about. If you run ECM with a high B1 when there are small factors, those small factors don't in any way prevent the algorithm from finding larger factors. But if it does find a large factor, it will probably also find the small factors, so the result of the algorithm will be a composite number which is the product of all the prime factors found.

In other words, it will find large factors with the same probability (per curve) as if there were no small factors. But it may not give them to you in a form you want.

And of course, as firejuggler pointed out, the total time you take to find factors will likely be much greater than if you first ran curves with lower B1 to find and divide out the small factors. But subject to your hypothetical ("if gmp_ecm runs at the same speed for all numbers, regardless of whether they are 769 bits or 1018 bits"), then it will find the large factors just as quickly whether or not the small factors are there. It just won't find them alone.

Does that answer your question adequately, or are you looking for something else?

rcv 2012-03-29 12:15

[QUOTE=jyb;294608]... it will find large factors with the same probability (per curve) as if there were no small factors. ...

Does that answer your question adequately, or are you looking for something else?[/QUOTE]
Thank you. That's *exactly* what I was asking.

xilman 2012-04-06 06:49

Result
 
[code]********** Factor found in step 1: 123606794672656155230910235277199616421317
Found probable prime factor of 42 digits: 123606794672656155230910235277199616421317
Probable prime cofactor 14866190468551593309306643619217939222105550143777577617032523708180871799254017557065876729519990261631832681060060230768948753357462444734554678109987149185755720805584735539302523326567649552404524657646651599763316335399327139435288373515331 has 245 digits
Factor found with (d*2^32) mod N = 1001097
[/code]The input number was the 286-digit cofactor of 296*11^296+1, aka GC(11,296). The factor was found on the 1098'th curve of a batch 1120.

Two things are especially noteworthy. You don't often see p42 factors found in stage 1, even with B1 as high as 110M. The other can be seen in this snippet if you know what you are looking for.
[code]time GPU: 209474.766s
time CPU: 716.798s (0.001+716.737+0.060)
Throughput: 0.005

real 3491m28.677s
user 8m59.778s
sys 3m4.908s
[/code]This was my first production run with gpu_ecm incorporating a patch provided by Rocke Verser (rcv of this parish) which replaced the old busy-wait behaviour. System cpu time fell from an expected 50+ hours to a little over three minutes. Very impressive in my view. Thanks Rocke.

Paul

R.D. Silverman 2012-04-06 13:40

[QUOTE=rcv;294605]Thank you. I understand all that.

But, I'm really asking a deeper question about the theoretical behavior of the ECM algorithm.[/QUOTE]

Read my joint paper with Sam Wagstaff: A Practical Analysis of ECM.

rcv 2012-04-06 22:45

[QUOTE=xilman;295541]Thanks Rocke.[/QUOTE]
You're welcome. It's not that I did anything great. It's more that NVIDIA's CUDA drivers are disrespectful of the CPU. When trying to keep the GPU busy, special care must be taken to prevent them from wasting an entire core in spin-loops. [I've now used the technique on three separate CUDA programs with significant reductions in CPU wastage in each case.]

BTW, Cyril also has the patch, and hints that it will be in a future release of his code.

[QUOTE=R.D. Silverman;295562]Read my joint paper with Sam Wagstaff: A Practical Analysis of ECM.[/QUOTE]
Thanks for the reference! For others who need a more complete reference, the paper was published in Mathematics of Computation, July 1993, v61, n203, pp 445-462. The paper is presently available online at [URL]http://www.ams.org/journals/mcom/1993-61-203/S0025-5718-1993-1122078-7/[/URL]

While a little dated (the largest ECM factors discovered at the time of its writing were 38 digits), I would recommend the paper to those interested in a derivation of the various parameters used by GMP-ECM and other factoring software.

While the paper is interesting, I fail to see that it directly answers my question. Table 3, for example, gives L, the expected number of curves to find a factor of a given size. However, the assumptions that lead to that derivation are not true in the theoretical question I was asking.

As with other papers and articles I have read concerning ECM, a basic assumption is that you factor out the small factors as you find them. My question was more "...but what happens if you don't remove the small factors?"

While not referring to my specific theoretical question, the second Remark on page 460 seems to acknowledge that analysis is difficult when the number being factored "does not have the same expected smoothness properties of some other arbitrary integer of about the same size."

I will reread the paper in more depth over the next few days.

debrouxl 2012-04-07 15:49

In SVN HEAD, the CPU usage was greatly reduced (thanks to rcv's patch, I guess), and the PRINT_REMAINING_ITER define is interesting for making the program's progress more observable - great :smile:


I've just made a trivial patch for adding timestamps to the output of GPU-ECM, which makes it easier (for me, at least) to estimate when the job will finish.
Here it is, in case someone else is interested:
[code]
Index: cudakernel.cu
===================================================================
--- cudakernel.cu (révision 1908)
+++ cudakernel.cu (copie de travail)
@@ -191,7 +191,14 @@
if (j < 1000000) jmod = 100000;
if (j < 100000) jmod = 10000;
if (j % jmod == 0)
- printf("%lu iterations to go\n", j);
+ {
+ char buf[32];
+ time_t curtime = time(NULL);
+
+ strcpy(buf, ctime(&curtime));
+ *(strchr(buf, '\n')) = 0;
+ printf("%s %lu iterations to go\n", buf, j);
+ }
#endif
}

[/code]
(unsurprisingly, the timestamping code is basically the same as in msieve)


All times are UTC. The time now is 04:22.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2023, Jelsoft Enterprises Ltd.