![]() |
![]() |
#1 |
"Åke Tilander"
Apr 2011
Sandviken, Sweden
2·283 Posts |
![]()
In the source code of Prime95 ver. 26.6 in the file gwnum.h I find the following lines:
#define MAX_PRIME 79300000L /* Maximum number of x87 bits */ #define MAX_PRIME_SSE2 596000000L /* SSE2 bit limit */ #define MAX_FFTLEN 4194304L /* 4M FFT max for x87 */ #define MAX_FFTLEN_SSE2 33554432L /* 32M FFT max for SSE2 */ To the best of my understanding this means that if you have on old computer using an old floating-point utility (X87) the size of the FFTs are limited to 4000K (4194304) and if you have a modern computer using SSE2 the size of the FFTs are limited to 32M (33554432). As a concequence of this you cannot test exponents larger then 73.3M with the X87 computer and not larger then 596M with the modern computer. Question 1: Is the limit of the FFT sizes to 32M due to hardware/software limitations? Or is it just because George have not yet written FFTs for larger sizes (due to obvious reasons)? Question 2: I have tried to find a table of all the FFT sizes used by Prime95, but I have not found one. I would be happy if someone would publish such a table or point me to where I can find one. |
![]() |
![]() |
![]() |
#2 |
"Vincent"
Apr 2010
Over the rainbow
55048 Posts |
![]()
As I understand it, George didn't go beyond 32M because of the non-nessecity-in-my-lifetime prospect. but i could be wrong.
|
![]() |
![]() |
![]() |
#3 |
"Oliver"
Mar 2005
Germany
2·557 Posts |
![]()
I *guess* the fastest non SSE2 CPU which can be used for Prime95 is the Athlon Thunderbird (up to 1400MHz) which was released mid 2000! (nearly 12 years old). According to the benchmark page it takes roughly half a second per iteration for a 4M FFT. The max exponent for a 4M X87 FFT in Prime95 is (a prime just below) 79.300.000. So a single test of this size would need roughly 40 million seconds to complete (1 year an three month). I *guess* nobody really needs bigger FFTs for X87, right?
|
![]() |
![]() |
![]() |
#4 |
"Åke Tilander"
Apr 2011
Sandviken, Sweden
10668 Posts |
![]()
Yes TheJudger and firejuggler I fully agree with you. This is what I was referring to when I wrote:
still I am curious to know if it could be easily done or if there is any hardware/software limitations which cannot easily be overcome? |
![]() |
![]() |
![]() |
#5 |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
2·32·7·53 Posts |
![]()
While there are some fundamental limits on the mximum FFT length that a 53-bit mantissa can allow, we are nowhere near such a limit yet. If you want larger FFT length then just start bugging George for it or write it yourself.
Last fiddled with by retina on 2012-04-22 at 12:01 |
![]() |
![]() |
![]() |
#6 |
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
![]()
The FFT table can be found in mult.asm, though when I attempted to read it, ...well, I couldn't. (Also, wouldn't 4M be 4096K?)
|
![]() |
![]() |
![]() |
#7 | |
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
13×773 Posts |
![]() Quote:
I think there's a good chance that someone may need to work enough to understand the existing block structure of George's code (without understanding all the details of the code) and faithfully reproduce it for some higher FFT size, e.g. make 8M code from comparing 2M and 4M code branches, ..or 6M from 3M. In contrast, to create something new (like the 2304K code before George really wrote it) would be much harder. |
|
![]() |
![]() |
![]() |
#8 | |
∂2ω=0
Sep 2002
República de California
5·2,351 Posts |
![]() Quote:
Code:
case 2304 : /* 2.25M = 2304K */ switch(radix_set) { case 0 : numrad = 6; radix_vec[0] = 9; radix_vec[1] = 8; radix_vec[2] = 8; radix_vec[3] = 8; radix_vec[4] = 16; radix_vec[5] = 16; break; case 1 : numrad = 5; radix_vec[0] = 18; radix_vec[1] = 16; radix_vec[2] = 16; radix_vec[3] = 16; radix_vec[4] = 16; break; case 2 : numrad = 5; radix_vec[0] = 36; radix_vec[1] = 8; radix_vec[2] = 16; radix_vec[3] = 16; radix_vec[4] = 16; break; case 3 : numrad = 5; radix_vec[0] = 9; radix_vec[1] = 16; radix_vec[2] = 16; radix_vec[3] = 16; radix_vec[4] = 32; break; case 4 : numrad = 5; radix_vec[0] = 9; radix_vec[1] = 16; radix_vec[2] = 16; radix_vec[3] = 32; radix_vec[4] = 16; break; case 5 : numrad = 4; radix_vec[0] = 36; radix_vec[1] = 32; radix_vec[2] = 32; radix_vec[3] = 32; break; default : if(nradices){*nradices = 6;} return ERR_RADIXSET_UNAVAILABLE; }; break; |
|
![]() |
![]() |
![]() |
#9 |
P90 years forever!
Aug 2002
Yeehaw, FL
3×11×13×19 Posts |
![]() |
![]() |
![]() |
![]() |
#10 |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
2×32×7×53 Posts |
![]() |
![]() |
![]() |
![]() |
#11 | ||
"Åke Tilander"
Apr 2011
Sandviken, Sweden
56610 Posts |
![]() Quote:
Quote:
So if I ever would like to try to LL a Billion digit exponent I would need larger FFTs. Yes I admit its whimsical. With current fastest hardware and the latest versions of Prime95 using AVX I think a LL-test of a Billion digit exponent would not take "more" then 10 years or so. That would at least be within my lifetime I hope. Last fiddled with by aketilander on 2012-04-22 at 21:49 Reason: Comment could be misunderstood |
||
![]() |
![]() |
![]() |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Force FFT-size to be used | kruoli | Software | 4 | 2017-11-17 18:14 |
Pi(x) value for x at 10^16 size | edorajh | Computer Science & Computational Number Theory | 6 | 2017-03-08 20:28 |
Size optimization | Sleepy | Msieve | 14 | 2011-10-20 10:27 |
Exponent Size Gap | TimSorbet | PrimeNet | 8 | 2007-03-25 07:29 |
FFT-Size | andi314 | Lounge | 14 | 2007-01-22 00:21 |