#1
"Jason Goatcher"
Mar 2005
3×7×167 Posts |
I would like to learn the math behind the factoring programs. I am totally ignorant of the way things work, so please forgive the following question if it is stupid.
I'm hoping the factoring algorithms can be learned by expanding one's knowledge of algebra. Does this idea have any credibility? My education goes up to college algebra, and I'm hoping there are books that can teach me the math of factoring, assuming my guess about factoring relating to algebra is correct. If I'm totally off my rocker, just tell me and I'll shut up. (Actually, now that I think about it, it's possible I'm correct AND totally off my rocker. Now where did I put my meds?)
#2
Bamboozled!
"Paul"
May 2003
Down not across
2²·3·983 Posts
Quote:
There certainly are books out there. I would recommend learning number theory, rather than factorization per se, at least for a start. You need a grounding in number theory to get anywhere, and books specializing in factorization usually assume at least a basic knowledge of number theory.

My personal favourite introductory book is Niven, Zuckerman & Montgomery, An Introduction to the Theory of Numbers; I've the 5th edition in paperback. There are several other good introductory books; other folk here may post their recommendations, and Amazon will prove a valuable resource.

As for factoring in particular, the ones I'd recommend are Riesel's Prime Numbers and Computer Methods for Factorization and Crandall & Pomerance's Prime Numbers: A Computational Perspective.

You should probably learn some computer science if you want to get anywhere implementing the algorithms. If the names "Knuth" and "TAOCP" don't mean anything to you, research them immediately! Beware: you may be shocked by the price of these books if purchased new.

I'm pretty sure I've typed material like this before. Perhaps it should be in a FAQ.

Paul
#3
Jul 2005
40₁₀ Posts
Quote:
I've recently taught myself the Verilog hardware description language and have implemented Field Programmable Gate Array (FPGA) circuits for Fermat's factorization method, Pollard rho, and Lenstra's elliptic curve method (ECM). Fermat's method is the only design actually running on a real FPGA at present (it's factoring the RSA-704 challenge number), because the ECM design for RSA-704 is too large to fit into any FPGA that I can afford.

Meanwhile I have some time on my hands and am looking for some other project with which to entertain myself. I have a hunch that the FFT portion of the Prime95 code would be an ideal candidate for an FPGA-based hardware accelerator board (if one hasn't been designed and built already). I have both Spartan3 and ProAsic3E FPGA development boards to work with, but no idea where to obtain the Prime95 source code.

By the way, does anyone have any idea how many eons I should expect to wait for the Lattice FPGA running Fermat's method at 1.16 microseconds per iteration on the RSA-704 challenge number to find the solution?

Ron
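For readers unfamiliar with the algorithm being discussed: Fermat's method searches for an a such that a² − N is a perfect square b², giving N = (a − b)(a + b). A minimal software sketch of the idea (plain Python here, not Ron's Verilog, and making no attempt at his hardware optimizations):

```python
from math import isqrt

def fermat_factor(n):
    """Fermat's method: find a with a*a - n a perfect square,
    so that n = (a - b) * (a + b). Assumes n is odd and composite."""
    a = isqrt(n)
    if a * a < n:
        a += 1                      # start at ceil(sqrt(n))
    while True:
        diff = a * a - n
        b = isqrt(diff)
        if b * b == diff:           # diff is a perfect square
            return a - b, a + b
        a += 1

# Fast only when the two factors are close together:
print(fermat_factor(5959))  # (59, 101)
```

The iteration count grows with the gap between the factors, which is why the question of how long RSA-704 would take at 1.16 µs per iteration is the crux of the thread.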
#4
Tribal Bullet
Oct 2004
110111101101₂ Posts
Quote:
You could get started by checking out the many papers on building an FFT into dedicated hardware, then estimate what it would take to convert one of those algorithms to use double-precision floating point, and use that to estimate how much parallelism you would need at ~100 MHz to beat a Pentium at 3000 MHz.

The above sounds cynical, but I'd actually like to know the answers to those questions. Did the references I pointed out previously help at all?

jasonp
#5
Sep 2002
2×331 Posts
Quote:
Is that a fixed time or an average? Does the time depend on how close the difference is to the square root of RSA704? What percentage of differences (i.e. potential squares) do you evaluate (do a square root on)?

Later today I'll try to figure out the maximum amount of time it will take. Of course the minimum could be in the next week, day, hour, minute or second.

Dan
#6
Sep 2002
2×331 Posts
I figure you only need to evaluate 1/4 of the differences: the ones that are 9 mod 16 (i.e. whose low four bits, x AND 15, equal 9).
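The reason such a filter works is that squares can only take a few residues mod 16, so one bitwise AND rejects most candidate differences before the expensive square-root test. A quick check of the idea (note the specific residue class 9 that Dan cites depends on N mod 16; the sketch below just shows the general filter):

```python
# Squares mod 16 fall in a small set, so a candidate difference
# d = a*a - n can be rejected with a single AND before the
# expensive integer square-root test.
squares_mod_16 = {(x * x) & 15 for x in range(16)}

def may_be_square(d):
    """Cheap filter: d can only be a perfect square if its low
    four bits (d AND 15) form a square residue mod 16."""
    return (d & 15) in squares_mod_16

print(sorted(squares_mod_16))                   # [0, 1, 4, 9]
print(may_be_square(441), may_be_square(282))   # True False
```

Since only 4 of the 16 residues are squares, at most 1/4 of differences survive the filter, matching the figure in the post.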
#7
Jul 2005
2³·5 Posts
Quote:
Quote:
I have ample textbooks describing the FFT multiplication method step by step, but I'm more interested in finding out exactly where and how I might incorporate the technique to improve the speed of Prime95 (or other algorithms) that use a software FFT multiplier. In other words, I only want to carve out a very small, specific module or subroutine within the greater Prime95 program that would benefit from FPGA implementation, rather than trying to implement the entire thing in an FPGA essentially from scratch. That also has the advantage of giving me immediate and accurate performance numbers to compare against the existing software implementation.

Quote:
I can get some timing figures from the Verilog simulator, but not for any realistic data, because the simulator is incredibly slow compared to real time. It's also more than just a matter of how much parallelism would be needed: keep in mind that unlike a conventional general-purpose sequential computer with a predetermined bus width, an FPGA is a totally blank slate that lets the designer implement not only a particular algorithm but an entire computer architecture to suit the algorithm, rather than the other way around. It's almost like comparing apples and oranges. For example, the bus width on my ECM design is parameterizable at compile/synthesis time, so I can very easily set it to whatever width is needed. That way everything is always "single-precision", even if the single-precision numbers are 704 bits wide.

The references you supplied in a previous thread were very interesting, but I got bogged down in all the implementation details.

In order for an application to be suitable for FPGA implementation, the dataset the FPGA works with at any given time must be relatively small and local, and the task should be computationally intensive; otherwise data-transfer bandwidth issues would erase any potential gains from using an FPGA. I would like to find some small portion of Prime95 that is reasonably well isolated (encapsulated) from the rest of the program and that has the qualities mentioned above.

Ron
#8
Jul 2005
50₈ Posts
Quote:
Quote:
Quote:
Quote:
Ron
#9
Tribal Bullet
Oct 2004
5·23·31 Posts
Quote:
The last time I looked, opencores had Verilog for an IEEE floating point unit but not for an FFT. Admittedly that was years ago.

Quote:

http://lettuce.edsc.ulst.ac.uk/gimps/

Their most recent source collection is for v24.14 (last August).

From memory, the core inner loops in prime95 implement a fairly standard radix-4 real-valued FFT. All the fun seems to depend on using the macro in a way that fully exploits caches, write buffers, prefetching and other things that dedicated hardware doesn't have to worry about :) p4notes.doc is also a very interesting read.

jasonp

Last fiddled with by jasonp on 2006-06-05 at 13:09
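For anyone following along who hasn't seen FFT-based multiplication before, here is the textbook version of the idea in a few lines of Python/NumPy. To be clear about what this is not: Prime95's core loops use a weighted irrational-base transform working modulo 2^p − 1, which this plain base-256 convolution sketch does not attempt to reproduce.

```python
import numpy as np

def fft_mul(a, b):
    """Multiply two positive integers via an FFT convolution of
    their base-256 digits (schoolbook FFT multiplication, not
    Prime95's weighted transform)."""
    xa = np.frombuffer(a.to_bytes((a.bit_length() + 7) // 8, "little"),
                       dtype=np.uint8)
    xb = np.frombuffer(b.to_bytes((b.bit_length() + 7) // 8, "little"),
                       dtype=np.uint8)
    n = 1 << (len(xa) + len(xb)).bit_length()   # pad to a power of two
    prod = np.fft.irfft(np.fft.rfft(xa, n) * np.fft.rfft(xb, n), n)
    digits = np.rint(prod).astype(np.int64)     # round away FFT noise
    result, shift = 0, 0
    for d in digits:                            # carry propagation, base 256
        result += int(d) << shift
        shift += 8
    return result

print(fft_mul(123456789, 987654321) == 123456789 * 987654321)  # True
```

The floating-point rounding step is exactly why the hardware discussion above keeps coming back to double-precision: the transform must be accurate enough that `rint` recovers the true integer convolution values.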
#10
"Bob Silverman"
Nov 2003
North of Boston
11101100011012 Posts |
Quote:
But trying to do RSA704 with ECM, Pollard rho, or Fermat's method (except to test the hardware) will get you labelled as a kook in a hurry. These methods are hopeless for this number.

Implementing the lattice sieve on FPGAs is a VERY VERY hard problem. I have strong doubts whether they can be used effectively. And what do you mean by "per-iteration" for the lattice sieve??? The time will be dependent on factor base size. If you mean a single lattice point addition every 1.16 usec (along with all the overhead), then I am afraid that this will not be competitive with current PC-based implementations.

My laptop, for example (1.5 GHz Pentium M), does an entire special-q value on a 14M x 7M array (98 million lattice points) using a factor base of 1.8 million primes per polynomial in just 55 seconds. This is twice as fast as your 1.16 usec result. The time includes all the overhead to do the lattice reduction, compute the start points, do the sieving, and compute the actual factorizations of the smooth norms.

My lattice sieve code could do the sieving for RSA704 in "about" 8 months on a collection of "about" 400 Pentium-4 PCs (at about 3 GHz each) [this estimate is within a factor of 2]. Dealing with the matrix would be a MAJOR problem.
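The "twice as fast" comparison is simple arithmetic from the figures quoted in the post, and is easy to verify:

```python
# Figures quoted in the post: 98 million lattice points per
# special-q, 55 seconds per special-q on a 1.5 GHz Pentium M.
points_per_special_q = 98_000_000
seconds_per_special_q = 55

us_per_point = seconds_per_special_q / points_per_special_q * 1e6
print(round(us_per_point, 2))          # ~0.56 us per lattice point
print(round(1.16 / us_per_point, 2))   # ~2.07x faster than 1.16 us/iteration
```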
#11
Sep 2002
2×331 Posts
Maximum iterations, with the bigger of the two factors of RSA704 being 2^352 - 1:
17675749973855171424998162026114852226112656948184937770657347111317090767408538639881686924493222509658

Years:
649728432126397408326294393435914916923044910255993098776919748305569032188566456963227775

Last fiddled with by dsouza123 on 2006-06-05 at 14:54
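Leaving Dan's exact totals aside (which I have not re-derived here), converting an iteration count at a fixed 1.16 µs per iteration into wall-clock time is a one-line calculation. A small helper, shown with an illustrative input far smaller than anything RSA-704-sized:

```python
def iterations_to_years(iterations, secs_per_iter=1.16e-6):
    """Rough wall-clock estimate: an iteration count at a fixed
    per-iteration time, converted to Julian years (365.25 days)."""
    return iterations * secs_per_iter / (365.25 * 24 * 3600)

# A trillion iterations at 1.16 us each is only about two weeks:
print(round(iterations_to_years(10**12) * 365.25, 1), "days")  # 13.4 days
```

Fed a ~10^105 iteration count like the one above, the same formula yields the astronomical year counts in the post.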
Similar Threads

| Thread | Thread Starter | Forum | Replies | Last Post |
|---|---|---|---|---|
| Links to factoring programs | smh | Factoring | 86 | 2022-05-13 18:08 |
| Links to Factoring Programs | rogue | Factoring | 32 | 2009-09-17 11:40 |
| factoring programs | henryzz | Factoring | 6 | 2007-09-19 13:47 |
| looking for Fermat factoring programs | ixfd64 | Factoring | 1 | 2005-09-08 12:13 |
| any good GNFS factoring programs? | ixfd64 | Factoring | 1 | 2004-04-27 09:41 |