[QUOTE=ET_;330241]I've been thinking about a CUDA program for P-1 factoring for quite a while, and I think many other Mersennaries have too.
First of all, note that I have only limited knowledge of the math involved, but I'm willing to overcome that limitation by studying under the guidance of more informed people, and eventually start coding something with their help. [/QUOTE]
Read: P. Montgomery & R.D. Silverman, "An FFT Extension to the P-1 Factoring Algorithm", Math. Comp. It is available on the net. It should tell you everything you need to know.
[QUOTE] I'd like to gather ideas about how such a program should be designed. Some questions will be trivial, others maybe deeper, but all of them will be collected in this thread. Some naive subjects to talk about:
- Parallelization of tasks
- Limitations due to the memory of the GPU (how far can we go with 0.5, 1, 2, 3 or 6 GB of memory?)
- Description of steps 1 and 2 (from MersenneWiki I got a grasp of it, but a discussion would explain more). [/QUOTE]
It discusses all of this.
[QUOTE] - Is a parallel Montgomery multiplication algorithm out of the question for such an algorithm? [/QUOTE]
It would be OK for step 1. It is not relevant to step 2. |
[QUOTE=R.D. Silverman;331247]Read:
P. Montgomery & R.D. Silverman, "An FFT Extension to the P-1 Factoring Algorithm", Math. Comp. It is available on the net. It should tell you everything you need to know. It discusses all of this. It would be OK for step 1. It is not relevant to step 2.[/QUOTE]
Thank you very much for the information! I'm going to study that paper and will eventually come back with more questions.
Luigi |
Output from proto-CUDA-P-1:
[CODE]
filbert@filbert:~/Build/cudapm1-0.00$ ./CUDA-P-1 1594559
Starting Stage 1 P-1, M1594559, B1 = 1, fft length = 96K
Stage 1 complete, estimated total time = 0:00
3189119 is a factor of M1594559
[/CODE]
Of course, with B1 = 1 it's easy. |
[QUOTE=owftheevil;331380]Output from proto-CUDA-P-1:
[CODE]
filbert@filbert:~/Build/cudapm1-0.00$ ./CUDA-P-1 1594559
Starting Stage 1 P-1, M1594559, B1 = 1, fft length = 96K
Stage 1 complete, estimated total time = 0:00
3189119 is a factor of M1594559
[/CODE]
Of course, with B1 = 1 it's easy.[/QUOTE]
It found a factor with B1=1 in "no time"! :et_: Obviously it had k=1, but what the hell, it works!!! :smile:
Luigi |
Well, it only has to do 21 squarings and a few multiplications by 3 with a 96K FFT.
The reported time has one-second resolution, so anything under a second shows up as 0:00. |
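For anyone following along, here is a minimal Python sketch of why B1 = 1 needs only 21 squarings. It assumes the usual P-1 setup for Mersenne numbers (base 3, stage 1 exponent E = 2p times every prime power up to B1); the function names are hypothetical and this is not taken from the CUDA-P-1 source.

```python
from math import gcd


def small_primes(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross out multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def stage1_exponent(p, b1):
    """E = 2*p * product of the largest prime powers q^k <= b1.

    The factor 2*p is included because every factor of M_p has the
    form 2*k*p + 1 (assumed convention, not read from CUDA-P-1).
    """
    e = 2 * p
    for q in small_primes(b1):
        qk = q
        while qk * q <= b1:
            qk *= q
        e *= qk
    return e


def pm1_stage1(p, b1, base=3):
    """Return gcd(base^E - 1, M_p); a result > 1 means a factor was found."""
    m = (1 << p) - 1
    x = pow(base, stage1_exponent(p, b1), m)
    return gcd(x - 1, m)
```

With B1 = 1 the exponent is just 2p. For p = 1594559 that is 3189118, a 22-bit number, so left-to-right binary exponentiation does 21 squarings, matching the count above. On a toy case, `pm1_stage1(11, 1)` returns 23, the k = 1 factor of M11 = 2047.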
[QUOTE=henryzz;331226]Run the following code through pfgw:
[CODE]
SCRIPT
DIM expo, 31
DIM base, 3
DIM max_n, 100
DIM min_n, 1
OPENFILEAPP r_file,results.txt
DIM n, min_n
DIM result, 0
DIM mers, 2^expo-1
DIMS rstr
LABEL next_n
POWMOD result,base,n,mers
WRITE r_file,result
SET n, n+1
IF (n<=max_n) THEN GOTO next_n
LABEL end
[/CODE][/QUOTE]
Thanks for the script. |
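If pfgw is not at hand, the same reference values can be produced with a few lines of Python. This is only a sketch of what the script appears to compute (base^n mod 2^expo - 1 for n from min_n to max_n); the function name and defaults are mine, and it returns a list instead of appending to results.txt.

```python
def powmod_table(expo=31, base=3, min_n=1, max_n=100):
    """Return [base^n mod (2^expo - 1)] for n = min_n .. max_n."""
    mers = (1 << expo) - 1
    return [pow(base, n, mers) for n in range(min_n, max_n + 1)]


if __name__ == "__main__":
    # One residue per line, like the WRITE statement in the script.
    for value in powmod_table():
        print(value)
```

Comparing a GPU implementation's residues against a table like this for a small exponent is a cheap way to catch multiplication or carry bugs before moving to real FFT sizes.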
Stage 1 seems to be working OK.
[CODE]
filbert@filbert:~/Build/cudapm1-0.00$ ./CUDALucas 60981299 -f 3360k
Starting Stage 1 P-1, M60981299, B1 = 580000, fft length = 3360K
Stage 1 complete, estimated total time = 1:34:14
7124551590389340568394253966215081 is a factor of M60981299
[/CODE]
To put the elapsed time in context, this is on a 570 that completes a 28xxxxxx double check in 21:00--21:30 hours. |
[QUOTE=owftheevil;331625]Stage 1 seems to be working OK.
[CODE]
filbert@filbert:~/Build/cudapm1-0.00$ ./CUDALucas 60981299 -f 3360k
Starting Stage 1 P-1, M60981299, B1 = 580000, fft length = 3360K
Stage 1 complete, estimated total time = 1:34:14
7124551590389340568394253966215081 is a factor of M60981299
[/CODE]
To put the elapsed time in context, this is on a 570 that completes a 28xxxxxx double check in 21:00--21:30 hours.[/QUOTE]
KUDOS! :bow:
Should you need a tester with a GTX580, I'm here to help. :smile:
I still can't understand the "estimated total time" (1:34:14) thingie. Do you mean that the code completes stage 1 at B1=580,000 in 1 hour and 34 minutes, and the DC assignment in 21 hours with CuLu?
Which version of CuFFT are you using?
Luigi |
[QUOTE=ET_;331649]KUDOS! :bow:[/QUOTE]
+1!!! :smile: [QUOTE=ET_;331649]Should you need a tester with a GTX580, I'm here to help.[/QUOTE] Ditto. I only have a 560, but it is the 2GB version, so when you need some Stage 2 testing.... :wink: |
[QUOTE=ET_;331649]KUDOS! :bow:
Should you need a tester with a GTX580, I'm here to help. :smile:
I still can't understand the "estimated total time" (1:34:14) thingie. Do you mean that the code completes stage 1 at B1=580,000 in 1 hour and 34 minutes, and the DC assignment in 21 hours with CuLu?
Which version of CuFFT are you using?
Luigi[/QUOTE]
Stage 1 of a ~61M exponent in 1:34:14, and a DC of a ~28M exponent in 21:00--21:30 hours.
I am intrigued by how fast this would run small numbers up to a much higher B1. How much does the exponent affect the runtime? Could you try a sub-1000-bit exponent? Maybe a range of different-size exponents would be appropriate. |
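On the question of how the exponent affects runtime, a rough way to separate the two effects: the number of stage 1 squarings is set almost entirely by B1, while the exponent mainly determines the FFT length and hence the cost of each squaring. The Python sketch below estimates the squaring count under the assumption that stage 1 raises the base to E = 2p times every prime power up to B1; the names are hypothetical and nothing here is measured from CUDA-P-1.

```python
from math import log2


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def stage1_squarings(p, b1):
    """Estimate the squarings in stage 1 as floor(log2(E)).

    E = 2*p * product of the largest prime powers q^k <= b1
    (assumed convention). Uses floating-point logs, so treat the
    result as an estimate rather than an exact count.
    """
    bits = log2(2 * p)
    for q in primes_up_to(b1):
        qk = q
        while qk * q <= b1:
            qk *= q
        bits += log2(qk)
    return int(bits)
```

For B1 = 580000 this comes out near 1.44 * B1, roughly 837,000 squarings, and swapping a ~61M exponent for a ~1M one changes the count by only a handful. So for a fixed B1, a much smaller exponent should run faster mostly because each squaring uses a shorter FFT.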
1 h 34 min? That's 5 to 7 times faster than an ordinary CPU. Nice.
|