#34 | |
Aug 2003
Europe
2×97 Posts |
Quote:
Under the section "Files > ProvingPrograms" there are three files: one for Windows and two for Linux. |
#35 | |
Banned
"Luigi"
Aug 2002
Team Italia
3·1,619 Posts |
Quote:
Luigi |
#36 |
Sep 2002
Database er0rr
118F16 Posts |
Wait a little longer and we will see "llr3" (or whatever it will be called), which is 30% faster than llr2 because it is based on George's latest IBDWT library.
#37 | |
Banned
"Luigi"
Aug 2002
Team Italia
113718 Posts |
Quote:
Luigi |
#39 |
Sep 2002
2·131 Posts |
I just downloaded and tried the new LLR4P. It works great!
It starts and stops easily. On my machine it is 3 times faster at n=800,000 on k=253. Thank you so much. Joss |
#40 |
Mar 2004
3·11 Posts |
On my NON-SSE2 box, the new LLR was much slower. I notice that it tends to select the largest FFT (for N around 700,000) so it took twice the time that the old LLR would have for that number.
#41 | |
Feb 2003
27·3·5 Posts |
Quote:
The new formula for choosing the FFT length works well for k<32 but tends to overestimate at larger values of k. There is still no "optimal" formula for guessing the right FFT length. If the new non-SSE2 LLR is slower than the old one on your k, then I suggest you use the old LLR. -- Thomas |
#42 | |
Mar 2004
3·11 Posts |
Quote:
I've stuck with the old one for now on this machine. Even for large N, like 121, the SSE2 version is still much faster. |
#43 |
Sep 2002
Database er0rr
5·29·31 Posts |
Thomas Ritschel sent me some LLR timings for 321search on his AMD Opteron computer. They show where the FFT size changes: at each change the time per iteration jumps, so the LLR tests take longer.
Opteron 2GHz

| n | Mers. | Proth | used | t/iter. |
|---|---|---|---|---|
| 1200000 | 65536 | 131072 | 65536 | 3.035 ms |
| 1273062 | 65536 | 131072 | 65536 | 3.035 ms * |
| 1273063 | 65536 | 131072 | 81920 | 4.055 ms * |
| 1300000 | 65536 | 131072 | 81920 | 4.055 ms |
| 1400000 | 81920 | 163840 | 81920 | 4.055 ms |
| 1500000 | 81920 | 163840 | 81920 | 4.055 ms |
| 1583078 | 81920 | 163840 | 81920 | 4.055 ms * |
| 1583079 | 81920 | 163840 | 98304 | 5.145 ms * |
| 1600000 | 81920 | 163840 | 98304 | 5.145 ms |
| 1700000 | 98304 | 196608 | 98304 | 5.145 ms |
| 1800000 | 98304 | 196608 | 98304 | 5.145 ms |
| 1888094 | 98304 | 196608 | 98304 | 5.145 ms * |
| 1888095 | 98304 | 196608 | 114688 | 6.350 ms * |
| 1900000 | 98304 | 196608 | 114688 | 6.350 ms |
| 2000000 | 114688 | 229376 | 114688 | 6.350 ms |
| 2100000 | 114688 | 229376 | 114688 | 6.350 ms |
| 2196110 | 114688 | 229376 | 114688 | 6.350 ms * |
| 2196111 | 114688 | 229376 | 131072 | 7.285 ms * |
| 2200000 | 114688 | 229376 | 131072 | 7.285 ms |
| 2300000 | 131072 | 262144 | 131072 | 7.285 ms |
| 2400000 | 131072 | 262144 | 131072 | 7.285 ms |
| 2500000 | 131072 | 262144 | 131072 | 7.285 ms |
| 2510126 | 131072 | 262144 | 131072 | 7.285 ms * |
| 2510127 | 131072 | 262144 | 163840 | 9.385 ms * |
| 2600000 | 131072 | 262144 | 163840 | 9.385 ms |
| 2700000 | 163840 | 327680 | 163840 | 9.385 ms |
| 2800000 | 163840 | 327680 | 163840 | 9.385 ms |
| 2900000 | 163840 | 327680 | 163840 | 9.385 ms |
| 3000000 | 163840 | 327680 | 163840 | 9.385 ms |
| 3100000 | 163840 | 327680 | 163840 | 9.385 ms |
| 3121158 | 163840 | 327680 | 163840 | 9.385 ms * |
| 3121159 | 163840 | 327680 | 196608 | 11.195 ms * |
| 3200000 | 163840 | 327680 | 196608 | 11.195 ms |
| 3300000 | 196608 | 393216 | 196608 | 11.195 ms |
| 3400000 | 196608 | 393216 | 196608 | 11.195 ms |
| 3500000 | 196608 | 393216 | 196608 | 11.195 ms |
| 3600000 | 196608 | 393216 | 196608 | 11.195 ms |
| 3700000 | 196608 | 393216 | 196608 | 11.195 ms |
| 3719190 | 196608 | 393216 | 196608 | 11.195 ms * |
| 3719191 | 196608 | 393216 | 229376 | 13.490 ms * |
| 3800000 | 196608 | 458752 | 229376 | 13.490 ms |
| 3900000 | 229376 | 458752 | 229376 | 13.490 ms |
| 4000000 | 229376 | 458752 | 229376 | 13.490 ms |
| 4100000 | 229376 | 458752 | 229376 | 13.490 ms |
| 4200000 | 229376 | 458752 | 229376 | 13.490 ms |
| 4300000 | 229376 | 458752 | 229376 | 13.490 ms |
| 4330220 | 229376 | 458752 | 229376 | 13.490 ms * |
| 4330223 | 229376 | 458752 | 262144 | 15.085 ms * |
| 4400000 | 229376 | 524288 | 262144 | 15.085 ms |
| 4500000 | 229376 | 524288 | 262144 | 15.085 ms |
| 4600000 | 262144 | 524288 | 262144 | 15.085 ms |
| 4700000 | 262144 | 524288 | 262144 | 15.085 ms |
| 4800000 | 262144 | 524288 | 262144 | 15.085 ms |
| 4900000 | 262144 | 524288 | 262144 | 15.085 ms |
| 4950254 | 262144 | 524288 | 262144 | 15.085 ms * |
| 4950255 | 262144 | 524288 | 327680 | 20.395 ms * |
| 5000000 | 262144 | 655360 | 327680 | 20.395 ms |

Last fiddled with by paulunderwood on 2004-09-05 at 08:59 |
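Since the time per iteration is flat between two starred rows, a rough figure for a whole test is n times the per-iteration time. The sketch below is only an illustration under stated assumptions: it assumes an LLR test of k·2^n−1 needs about n iterations and ignores start-up cost, the names `OPTERON_STEPS`, `lookup` and `estimated_hours` are made up here, and the numbers are copied from the "used" and "t/iter." columns of the Opteron table above.

```python
from bisect import bisect_right

# (first n using this FFT length, FFT length "used", ms per iteration),
# copied from the starred rows of the Opteron 2GHz table.
# The first entry is where the table starts, not a real breakpoint.
# The table jumps from 4330220 to 4330223, so the exact position of
# that breakpoint is uncertain by a couple of units.
OPTERON_STEPS = [
    (1200000, 65536, 3.035),
    (1273063, 81920, 4.055),
    (1583079, 98304, 5.145),
    (1888095, 114688, 6.350),
    (2196111, 131072, 7.285),
    (2510127, 163840, 9.385),
    (3121159, 196608, 11.195),
    (3719191, 229376, 13.490),
    (4330223, 262144, 15.085),
    (4950255, 327680, 20.395),
]
TABLE_MAX_N = 5000000  # last n covered by the table


def lookup(n):
    """Return (fft_length, ms_per_iteration) for exponent n on the Opteron."""
    if n < OPTERON_STEPS[0][0] or n > TABLE_MAX_N:
        raise ValueError("n is outside the range covered by the table")
    starts = [step[0] for step in OPTERON_STEPS]
    _, fft_length, ms = OPTERON_STEPS[bisect_right(starts, n) - 1]
    return fft_length, ms


def estimated_hours(n):
    """Rough test time: about n iterations at the tabulated ms per iteration."""
    _, ms = lookup(n)
    return n * ms / 1000.0 / 3600.0


if __name__ == "__main__":
    for n in (1273062, 1273063):
        fft_length, ms = lookup(n)
        print(n, fft_length, ms, round(estimated_hours(n), 2))
```

For the two rows on either side of the first breakpoint this gives roughly 1.07 hours at n=1273062 and 1.43 hours at n=1273063 on that machine, which is the slowdown the starred rows mark.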
#44 |
Sep 2002
Database er0rr
5·29·31 Posts |
Thomas Ritschel pointed out to me that the FFT size change occurs at a different "n" on Athlons than on P4s. When we reach the ranges where they differ, in a couple of months, it will make sense to have "Athlon only 321 files". The first set of files that should be tested by Athlons only is:
1888000-1926000

The complete table for Athlons is:

Athlon 2GHz

| n | Mers. | used | t/iter. |
|---|---|---|---|
| 1200000 | 65536 | 65536 | 3.660 ms |
| 1299062 | 65536 | 65536 | 3.660 ms * |
| 1299063 | 65536 | 81920 | 4.720 ms * (+26000) |
| 1300000 | 65536 | 81920 | 4.720 ms |
| 1400000 | 81920 | 81920 | 4.720 ms |
| 1500000 | 81920 | 81920 | 4.720 ms |
| 1600000 | 81920 | 81920 | 4.720 ms |
| 1610078 | 81920 | 81920 | 4.720 ms * |
| 1610079 | 98304 | 98304 | 5.825 ms * (+27000) |
| 1700000 | 98304 | 98304 | 5.825 ms |
| 1800000 | 98304 | 98304 | 5.825 ms |
| 1900000 | 98304 | 98304 | 5.825 ms |
| 1925094 | 98304 | 98304 | 5.825 ms * |
| 1925095 | 98304 | 114688 | 6.980 ms * (+37000) |
| 2000000 | 98304 | 114688 | 6.980 ms |
| 2100000 | 114688 | 114688 | 6.980 ms |
| 2200000 | 114688 | 114688 | 6.980 ms |
| 2244110 | 114688 | 114688 | 6.980 ms * |
| 2244111 | 131072 | 131072 | 7.840 ms * (+48000) |
| 2300000 | 131072 | 131072 | 7.840 ms |
| 2400000 | 131072 | 131072 | 7.840 ms |
| 2500000 | 131072 | 131072 | 7.840 ms |
| 2560126 | 131072 | 131072 | 7.840 ms * |
| 2560127 | 131072 | 163840 | 10.020 ms * (+50000) |
| 2600000 | 131072 | 163840 | 10.020 ms |
| 2700000 | 163840 | 163840 | 10.020 ms |
| 2800000 | 163840 | 163840 | 10.020 ms |
| 2900000 | 163840 | 163840 | 10.020 ms |
| 3000000 | 163840 | 163840 | 10.020 ms |
| 3100000 | 163840 | 163840 | 10.020 ms |
| 3171158 | 163840 | 163840 | 10.020 ms * |
| 3171159 | 163840 | 196608 | 11.950 ms * (+50000) |
| 3200000 | 163840 | 196608 | 11.950 ms |
| 3300000 | 163840 | 196608 | 11.950 ms |
| 3400000 | 196608 | 196608 | 11.950 ms |
| 3500000 | 196608 | 196608 | 11.950 ms |
| 3600000 | 196608 | 196608 | 11.950 ms |
| 3700000 | 196608 | 196608 | 11.950 ms |
| 3789190 | 196608 | 196608 | 11.950 ms * |
| 3789191 | 196608 | 229376 | 14.450 ms * (+70000) |
| 3800000 | 196608 | 229376 | 14.450 ms |
| 3900000 | 196608 | 229376 | 14.450 ms |
| 4000000 | 229376 | 229376 | 14.450 ms |
| 4100000 | 229376 | 229376 | 14.450 ms |
| 4200000 | 229376 | 229376 | 14.450 ms |
| 4300000 | 229376 | 229376 | 14.450 ms |
| 4400000 | 229376 | 229376 | 14.450 ms |
| 4420220 | 229376 | 229376 | 14.450 ms * |
| 4420223 | 229376 | 262144 | 15.880 ms * (+90000) |
| 4500000 | 229376 | 262144 | 15.880 ms |
| 4600000 | 229376 | 262144 | 15.880 ms |
| 4700000 | 262144 | 262144 | 15.880 ms |
| 4800000 | 262144 | 262144 | 15.880 ms |
| 4900000 | 262144 | 262144 | 15.880 ms |
| 5000000 | 262144 | 262144 | 15.880 ms |
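The "(+N)" notes in the Athlon table appear to be the distance between each Athlon breakpoint and the matching breakpoint in the Opteron table from the previous post. A minimal sketch of that arithmetic follows. It assumes the Opteron breakpoints stand in for the P4 ones, which the thread does not state outright, and the names `OPTERON_BREAKS`, `ATHLON_BREAKS` and `athlon_only_ranges` are invented for the example.

```python
# First n at which each larger FFT length is used, copied from the starred
# rows of the two tables. The Opteron table has one further breakpoint
# (4950255) with no Athlon counterpart below n=5000000, so it is left out.
OPTERON_BREAKS = [1273063, 1583079, 1888095, 2196111,
                  2510127, 3121159, 3719191, 4330223]
ATHLON_BREAKS = [1299063, 1610079, 1925095, 2244111,
                 2560127, 3171159, 3789191, 4420223]


def athlon_only_ranges():
    """Return (first_n, last_n, width) for each stretch of n where the
    Athlon still uses the smaller FFT but the Opteron has already switched."""
    return [(opteron, athlon - 1, athlon - opteron)
            for opteron, athlon in zip(OPTERON_BREAKS, ATHLON_BREAKS)]


if __name__ == "__main__":
    for first_n, last_n, width in athlon_only_ranges():
        print(f"{first_n}-{last_n}  (+{width})")
```

The widths reproduce the (+26000) through (+90000) notes, and the third range, 1888095 to 1925094, lines up with the 1888000-1926000 file range named above as the first one to reserve for Athlons.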