I thought George said somewhere that Prime95 uses all-integer arithmetic, or something like that
[QUOTE=cheesehead;151440]As you correctly point out, the claim in post #1 that "... any project that works by multiplying large numbers (LLR, GIMPS) must use fast Fourier transforms (FFTs) implemented in double precision (DP) floating point" needs to be amended. It is, of course, quite possible to use integers entirely in the Fourier transforms. The statement as it stands is especially misleading in a FAQ about computations performed by hardware other than general-purpose CPUs.[/quote]
[QUOTE=jasonp;133411][b]Q: What is this 'Number Theoretic Transform' that the threads I dutifully read above mention?[/b] A: After explaining at great length why you have to have double precision floating point to do this FFT stuff, I'll backpedal and say that floating point arithmetic is not needed to compute the equivalent of an FFT multiply. You can instead use modular math in the finite field of elements modulo a prime integer. This sounds great, but instead of floating-point multiplications you must use modular multiplications of integers. There are many tricks available to make this fast, but they all require [i]32-bit or 64-bit integer multiplication that produces a 64-bit or 128-bit product[/i]. This unfortunately is not an operation needed by graphics cards or most other special processors, so it's not implemented in dedicated hardware, and so it's not fast. In fact, we use floating point because [i]most[/i] processors, general-purpose or special-purpose, don't care much about integer multiply throughput but care a great deal about making floating point fast. We go where the hardware is, and NTTs are not at present a good substitute for the way things are done now.[/quote] The OP is rather more comprehensive than you realize.
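For concreteness, the modular-math approach the answer describes can be sketched in a few dozen lines. This is only an illustration, nowhere near an optimized implementation: it multiplies two integers by splitting them into base-256 digits, convolving with a number-theoretic transform modulo the commonly used NTT prime 998244353 (my choice, not anything from the thread), and propagating carries at the end. No floating point appears anywhere.

```python
MOD = 998244353  # prime with MOD - 1 divisible by 2^23, so power-of-two NTTs work
ROOT = 3         # primitive root of MOD

def ntt(a, invert=False):
    """In-place iterative Cooley-Tukey transform over Z/MOD."""
    n = len(a)
    # bit-reversal permutation
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w = pow(w, MOD - 2, MOD)      # modular inverse of the root
        for i in range(0, n, length):
            wn = 1
            for k in range(i, i + length // 2):
                u = a[k]
                v = a[k + length // 2] * wn % MOD
                a[k] = (u + v) % MOD
                a[k + length // 2] = (u - v) % MOD
                wn = wn * w % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        for i in range(n):
            a[i] = a[i] * n_inv % MOD

def ntt_multiply(x, y):
    """Multiply two nonnegative integers via an NTT convolution."""
    # base-256 digits keep every convolution sum comfortably below MOD
    xs = [(x >> (8 * i)) & 0xFF for i in range((x.bit_length() + 7) // 8 or 1)]
    ys = [(y >> (8 * i)) & 0xFF for i in range((y.bit_length() + 7) // 8 or 1)]
    n = 1
    while n < len(xs) + len(ys):
        n <<= 1
    fx = xs + [0] * (n - len(xs))
    fy = ys + [0] * (n - len(ys))
    ntt(fx)
    ntt(fy)
    fz = [a * b % MOD for a, b in zip(fx, fy)]
    ntt(fz, invert=True)
    # carry propagation: reassemble the integer from convolution digits
    result, shift = 0, 0
    for d in fz:
        result += d << shift
        shift += 8
    return result
```

The point of the answer still stands in this sketch: the inner loop lives on `a * b % MOD`, i.e. an integer multiply producing a double-width product followed by a reduction, which is exactly the operation most special-purpose hardware does not accelerate.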
[QUOTE=henryzz;151444]I thought George said somewhere that Prime95 uses all-integer arithmetic, or something like that[/QUOTE]
Only for Trial Factoring. LL (and P-1 and ECM) use DP floating-point FFTs. However, Prime95's world-beating speed isn't just about its use of floating point. It took years of hand-tuning assembly code to optimize memory prefetch and whatnot before the code got to where it is now. It would be really interesting to benchmark and compare _all_ the versions of P95 on all the CPUs (past and present) :smile:
[QUOTE=henryzz;151444]I thought George said somewhere that Prime95 uses all-integer arithmetic, or something like that[/QUOTE]
The very, very first version in 1995 - never released to the public - used all-integer arithmetic. Then I learned about the IBDWT and have been using floating point ever since.
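As an illustration of what an all-integer Lucas-Lehmer implementation can look like (a minimal Python sketch, not George's 1995 code, which used its own bignum routines), note that even the reduction modulo 2^p - 1 needs no division: the high bits just fold back onto the low bits with shifts and adds.

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test for M_p = 2^p - 1, integer arithmetic only.

    Reduction mod 2^p - 1 uses the identity
    n mod (2^p - 1) == (n >> p) + (n & (2^p - 1)),  applied until n fits in p bits.
    """
    if p == 2:
        return True
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = s * s - 2
        if s < 0:          # only possible if s was 0 or 1
            s += m
        while s >> p:      # fold the high bits back down
            s = (s >> p) + (s & m)
        if s == m:         # m is congruent to 0 mod m
            s = 0
    return s == 0
```

Python's arbitrary-precision integers hide the actual multiplication strategy, of course; the squaring `s * s` is precisely the step that an FFT or NTT would accelerate for the exponents GIMPS cares about.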
I wonder what it was I was thinking about, then? :smile:
There are parts of George's code that are all-integer and are still used, e.g. for Generalized Fermat PRP in openpfgw. If you try
[FONT=Courier New][B]> pfgw -f0 -q"10^8192+1"[/B][/FONT] you will see the message [FONT=Courier New][B]Switching to special GFN DWT logic[/B][/FONT] and the calculations are done in GWIntegers. I haven't read too deeply into the source, though. Note that this acts as a kludge, and most people never used this part of the source, because they would enter 10^(2^23)+1 and then gwPRP_GFN is not called (the code expects TWO numbers, with no additional parsing). It is also [I]not[/I] faster than other methods, unless you stretch the trick and realize that you can change the base until gwPRP_GFN() [I]is[/I] actually faster (at least within the framework of pfgw). Here's how I actually checked GFN(10,23): [B][FONT=Courier New]> pfgw -q"10000^2097152+1"[/FONT][/B] Geoff Reynolds used LR (afaik) for a similar task (GFN(12,22), which I DC'd) and the speeds were comparable (but only if you use a raised base, 12^4 in this case). Anyway, the point is just that some things can be computed in all-integer arithmetic, not just in theory (and not in spite of common sense). For me it was fun, because I could run it on an old PC which I wouldn't use for anything else - it ran for over a month. Here's some homework for someone: implement an integer GFN PRP test on a GPU and try GFN(10,24) and GFN(12,24), all right? Good luck! --Serge
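For reference, the check underneath all of this is just a Fermat PRP test. A minimal all-integer sketch in Python (this is not pfgw's GWInteger code, just the same mathematical test done with the built-in three-argument `pow`, which is pure integer modular exponentiation):

```python
def is_fermat_prp(n, base=3):
    """Fermat PRP test: n passes if base^(n-1) ≡ 1 (mod n).

    pow() with three arguments is modular exponentiation done
    entirely in integer arithmetic -- no floating point anywhere.
    """
    return pow(base, n - 1, n) == 1

# Sanity check on a small generalized Fermat number: GFN(10,1) = 101 is prime.
print(is_fermat_prp(101))  # → True

# The number from the post can be fed in exactly the same way, e.g.
# is_fermat_prp(10**8192 + 1), though it takes a while at that size.
```

Whether such a test is *fast* is a separate question, which is exactly the raised-base trick Serge describes: the cost depends on how the multiplications inside `pow` are carried out, not on the test itself.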
[quote=axn;151493]The OP is rather more comprehensive than you realize.[/quote]... and, therefore, if the not-necessarily-DP-FP information is not to be moved up into the offending paragraph to replace the misleadingly unconditional declarative statement, then at least there should be a correction and note added, such as:
"... any project that works by multiplying large numbers (LLR, GIMPS) [strike]must[/strike] [B]will usually[/B] use fast Fourier transforms (FFTs) implemented in double precision (DP) floating point, [B]but for some alternatives see the question "What is this 'Number Theoretic Transform' that the threads I dutifully read above mention?" below.[/B]" so that the same realization will not escape others. :-)
[QUOTE=cheesehead;151623]at least there should be a correction and note added, such as:
[/QUOTE] Note added. Sheesh. PS: Henry, in the early days there was a lot of discussion on the GIMPS mailing lists about using NTTs to perform LL tests; perhaps you read that discussion in an archive somewhere?
[quote=jasonp;151652]Note added.
Sheesh. PS: Henry, in the early days there was a lot of discussion on the GIMPS mailing lists about using NTTs to perform LL tests; perhaps you read that discussion in an archive somewhere?[/quote] No, it wasn't that. It was something I read in the last couple of months, possibly in this thread or a similar thread.
[quote]Q: What is this 'Number Theoretic Transform' that the threads I dutifully read above mention?
A: After explaining at great length why you have to have double precision floating point to do this FFT stuff, I'll backpedal and say that floating point arithmetic is not needed to compute the equivalent of an FFT multiply. You can instead use modular math in the finite field of elements modulo a prime integer. This sounds great, but instead of floating-point multiplications you must use modular multiplications of integers. There are many tricks available to make this fast, but they all require 32-bit or 64-bit integer multiplication that produces a 64-bit or 128-bit product. This unfortunately is not an operation needed by graphics cards or most other special processors, so it's not implemented in dedicated hardware, and so it's not fast. In fact, we use floating point because most processors, general-purpose or special-purpose, don't care much about integer multiply throughput but care a great deal about making floating point fast. We go where the hardware is, and NTTs are not at present a good substitute for the way things are done now.[/quote] I'm not sure how long this "answer" has been here, but it's simply wrong; the AMD-based LLR program I mentioned an hour or so ago in another thread uses this method. Saying it isn't fast based on opinion is even more of a religious belief than many Christians' belief that Christ died and rose again. There's NO evidence that the "integer transforms are slower" belief is true. It's simply people parroting each other, though I'll admit that they're parroting for the same reason I'm parroting, the reputation of the person they got the information from.
[quote=jasong;153353]It's simply people parroting each other, though I'll admit that they're parroting for the same reason [I][B]I'm parroting[/B][/I], the reputation of the person they got the information from.[/quote]
This is a fine example of a [I]belief[/I] that you have, based on nothing other than your own experience. Did you ever try the [URL="http://en.wikipedia.org/wiki/Scientific_method"]scientific method[/URL]? Or can you only retell [I]your friend's[/I] stories?