mersenneforum.org (https://www.mersenneforum.org/index.php)
-   3*2^n-1 Search (https://www.mersenneforum.org/forumdisplay.php?f=14)
-   -   DWT (https://www.mersenneforum.org/showthread.php?t=620)

BotXXX 2004-08-02 11:57

[QUOTE=ET_]I'll download it as soon as it is released![/QUOTE]

It is already available in the files section of [url]http://groups.yahoo.com/group/primenumbers[/url]

Under the section "Files > ProvingPrograms" there are three files. One for Windows and two for Linux.

ET_ 2004-08-02 12:55

[QUOTE=BotXXX]It is already available in the files section of [url]http://groups.yahoo.com/group/primenumbers[/url]

Under the section "Files > ProvingPrograms" there are three files. One for Windows and two for Linux.[/QUOTE]

So I'll wait until someone uploads the files to mersenne.org/gimps

:help:

Luigi

paulunderwood 2004-08-02 13:09

Wait a little more and we will see "llr3" (or whatever it will be called) which is 30% faster than llr2 because it is based on George's latest IBDWT library. :shock:

ET_ 2004-08-02 13:31

[QUOTE=paulunderwood]Wait a little more and we will see "llr3" (or whatever it will be called) which is 30% faster than llr2 because it is based on George's latest IBDWT library. :shock:[/QUOTE]

Sure I will! Thank you! :bow:

Luigi

paulunderwood 2004-08-02 18:10

Latest binaries can be found by following the links [URL=http://groups.yahoo.com/group/primenumbers/message/15184]here[/URL]

Happy hunting :devil:

jocelynl 2004-08-02 21:01

I just downloaded and tried the new LLR4P. It works great!!!
It starts and stops easily. On my machine it is 3 times faster at n=800,000 on k=253. Thank you so much. :grin:

Joss

justinsane 2004-08-02 23:26

On my NON-SSE2 box, the new LLR was much slower. I notice that it tends to select the largest FFT (for N around 700,000) so it took twice the time that the old LLR would have for that number.

Thomas11 2004-08-03 08:29

[QUOTE=justinsane]On my NON-SSE2 box, the new LLR was much slower. I notice that it tends to select the largest FFT (for N around 700,000) so it took twice the time that the old LLR would have for that number.[/QUOTE]

You are testing k=121, right? Around n=700,000 (or maybe n=730,000) I would expect a lot of FFT length switching with the old LLR, wouldn't you?
The new formula for choosing the FFT length works well for k<32 but tends to overestimate at larger values of k. There is still no "optimal" formula for guessing the right FFT length. If the new non-SSE2 LLR is slower than the old one on your k, then I suggest using the old LLR.

-- Thomas

justinsane 2004-08-04 03:46

[QUOTE=Thomas11]You are testing k=121, right? Around n=700,000 (or maybe n=730,000) I would expect a lot of FFT length switching with the old LLR, wouldn't you?
The new formula for choosing the FFT length works well for k<32 but tends to overestimate at larger values of k. There is still no "optimal" formula for guessing the right FFT length. If the new non-SSE2 LLR is slower than the old one on your k, then I suggest using the old LLR.

-- Thomas[/QUOTE]


I've stuck with the old one for now on this machine. Even for large k, like 121, the SSE2 version is still much faster.

paulunderwood 2004-09-05 08:56

timings
 
Thomas Ritschel sent me some LLR timings for 321search on his AMD Opteron computer. These show where the FFT sizes change, which is where we notice a slowdown in iteration timings and the LLR tests take longer.

Opteron 2GHz
------------

      n   Mers.   Proth    used    t/iter.
------------------------------------------
1200000   65536  131072   65536   3.035 ms
1273062   65536  131072   65536   3.035 ms *
1273063   65536  131072   81920   4.055 ms *
1300000   65536  131072   81920   4.055 ms
1400000   81920  163840   81920   4.055 ms
1500000   81920  163840   81920   4.055 ms
1583078   81920  163840   81920   4.055 ms *
1583079   81920  163840   98304   5.145 ms *
1600000   81920  163840   98304   5.145 ms
1700000   98304  196608   98304   5.145 ms
1800000   98304  196608   98304   5.145 ms
1888094   98304  196608   98304   5.145 ms *
1888095   98304  196608  114688   6.350 ms *
1900000   98304  196608  114688   6.350 ms
2000000  114688  229376  114688   6.350 ms
2100000  114688  229376  114688   6.350 ms
2196110  114688  229376  114688   6.350 ms *
2196111  114688  229376  131072   7.285 ms *
2200000  114688  229376  131072   7.285 ms
2300000  131072  262144  131072   7.285 ms
2400000  131072  262144  131072   7.285 ms
2500000  131072  262144  131072   7.285 ms
2510126  131072  262144  131072   7.285 ms *
2510127  131072  262144  163840   9.385 ms *
2600000  131072  262144  163840   9.385 ms
2700000  163840  327680  163840   9.385 ms
2800000  163840  327680  163840   9.385 ms
2900000  163840  327680  163840   9.385 ms
3000000  163840  327680  163840   9.385 ms
3100000  163840  327680  163840   9.385 ms
3121158  163840  327680  163840   9.385 ms *
3121159  163840  327680  196608  11.195 ms *
3200000  163840  327680  196608  11.195 ms
3300000  196608  393216  196608  11.195 ms
3400000  196608  393216  196608  11.195 ms
3500000  196608  393216  196608  11.195 ms
3600000  196608  393216  196608  11.195 ms
3700000  196608  393216  196608  11.195 ms
3719190  196608  393216  196608  11.195 ms *
3719191  196608  393216  229376  13.490 ms *
3800000  196608  458752  229376  13.490 ms
3900000  229376  458752  229376  13.490 ms
4000000  229376  458752  229376  13.490 ms
4100000  229376  458752  229376  13.490 ms
4200000  229376  458752  229376  13.490 ms
4300000  229376  458752  229376  13.490 ms
4330220  229376  458752  229376  13.490 ms *
4330223  229376  458752  262144  15.085 ms *
4400000  229376  524288  262144  15.085 ms
4500000  229376  524288  262144  15.085 ms
4600000  262144  524288  262144  15.085 ms
4700000  262144  524288  262144  15.085 ms
4800000  262144  524288  262144  15.085 ms
4900000  262144  524288  262144  15.085 ms
4950254  262144  524288  262144  15.085 ms *
4950255  262144  524288  327680  20.395 ms *
5000000  262144  655360  327680  20.395 ms
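
To put the starred rows of the Opteron table in perspective, here is a minimal Python sketch that turns each FFT size change into a per-iteration slowdown factor and a rough time per test. The change points and timings are copied from the table; the names (OPTERON_CHANGES, estimated_hours, jump_report) are made up for illustration, and the estimate assumes an LLR test on k*2^n-1 takes about n iterations, so treat the hours as ballpark figures only.

```python
# FFT change points copied from the Opteron table above. Each entry is
# (last n at the old FFT length, ms/iter there,
#  first n at the new FFT length, ms/iter there).
OPTERON_CHANGES = [
    (1273062, 3.035, 1273063, 4.055),
    (1583078, 4.055, 1583079, 5.145),
    (1888094, 5.145, 1888095, 6.350),
    (2196110, 6.350, 2196111, 7.285),
    (2510126, 7.285, 2510127, 9.385),
    (3121158, 9.385, 3121159, 11.195),
    (3719190, 11.195, 3719191, 13.490),
    (4330220, 13.490, 4330223, 15.085),
    (4950254, 15.085, 4950255, 20.395),
]


def estimated_hours(n, ms_per_iter):
    """Rough hours for one LLR test, ASSUMING about n iterations per test."""
    return n * ms_per_iter / 1000.0 / 3600.0


def jump_report(changes):
    """Return (first n at new FFT, slowdown factor, hours before, hours after)."""
    report = []
    for n_before, t_before, n_after, t_after in changes:
        report.append((
            n_after,
            t_after / t_before,
            estimated_hours(n_before, t_before),
            estimated_hours(n_after, t_after),
        ))
    return report


if __name__ == "__main__":
    for n, factor, before, after in jump_report(OPTERON_CHANGES):
        print(f"n={n}: x{factor:.2f} per iteration, "
              f"{before:.2f} h -> {after:.2f} h per test")
```

The factor is simply the ratio of the two t/iter. values on either side of a change, so it says nothing about other machines or other k.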

paulunderwood 2005-03-22 17:22

Thomas Ritschel pointed out to me that the FFT size change occurs at a different "n" for Athlons than for P4s. When we reach that point in a couple of months, it will make sense to have "Athlon only 321 files". The first set of files that should be tested by Athlons only is:

[B]1888000-1926000[/B]

The complete table for [B]Athlons[/B] is:

Athlon 2GHz
-----------

      n   Mers.    used    t/iter.
----------------------------------
1200000   65536   65536   3.660 ms
1299062   65536   65536   3.660 ms *
1299063   65536   81920   4.720 ms * (+26000)
1300000   65536   81920   4.720 ms
1400000   81920   81920   4.720 ms
1500000   81920   81920   4.720 ms
1600000   81920   81920   4.720 ms
1610078   81920   81920   4.720 ms *
1610079   98304   98304   5.825 ms * (+27000)
1700000   98304   98304   5.825 ms
1800000   98304   98304   5.825 ms
1900000   98304   98304   5.825 ms
1925094   98304   98304   5.825 ms *
1925095   98304  114688   6.980 ms * (+37000)
2000000   98304  114688   6.980 ms
2100000  114688  114688   6.980 ms
2200000  114688  114688   6.980 ms
2244110  114688  114688   6.980 ms *
2244111  131072  131072   7.840 ms * (+48000)
2300000  131072  131072   7.840 ms
2400000  131072  131072   7.840 ms
2500000  131072  131072   7.840 ms
2560126  131072  131072   7.840 ms *
2560127  131072  163840  10.020 ms * (+50000)
2600000  131072  163840  10.020 ms
2700000  163840  163840  10.020 ms
2800000  163840  163840  10.020 ms
2900000  163840  163840  10.020 ms
3000000  163840  163840  10.020 ms
3100000  163840  163840  10.020 ms
3171158  163840  163840  10.020 ms *
3171159  163840  196608  11.950 ms * (+50000)
3200000  163840  196608  11.950 ms
3300000  163840  196608  11.950 ms
3400000  196608  196608  11.950 ms
3500000  196608  196608  11.950 ms
3600000  196608  196608  11.950 ms
3700000  196608  196608  11.950 ms
3789190  196608  196608  11.950 ms *
3789191  196608  229376  14.450 ms * (+70000)
3800000  196608  229376  14.450 ms
3900000  196608  229376  14.450 ms
4000000  229376  229376  14.450 ms
4100000  229376  229376  14.450 ms
4200000  229376  229376  14.450 ms
4300000  229376  229376  14.450 ms
4400000  229376  229376  14.450 ms
4420220  229376  229376  14.450 ms *
4420223  229376  262144  15.880 ms * (+90000)
4500000  229376  262144  15.880 ms
4600000  229376  262144  15.880 ms
4700000  262144  262144  15.880 ms
4800000  262144  262144  15.880 ms
4900000  262144  262144  15.880 ms
5000000  262144  262144  15.880 ms
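
The bracketed offsets in the Athlon table match the gap between each Athlon change point and the corresponding change point in the Opteron table posted earlier in the thread. Here is a small Python sketch of how the "Athlon only" ranges could be derived from the two lists. It assumes the earlier change points are the ones that apply to the non-Athlon machines, which the posts do not state outright; the names OTHER_FIRST_N, ATHLON_FIRST_N and athlon_only_ranges are hypothetical.

```python
# First n at each new FFT length, copied from the starred rows of the
# two tables. ASSUMPTION: the earlier (Opteron) change points stand in
# for the machines that switch FFT size sooner than the Athlons.
OTHER_FIRST_N = [1273063, 1583079, 1888095, 2196111, 2510127,
                 3121159, 3719191, 4330223, 4950255]
ATHLON_FIRST_N = [1299063, 1610079, 1925095, 2244111, 2560127,
                  3171159, 3789191, 4420223]


def athlon_only_ranges(other, athlon):
    """Return (start, end, offset) for each stretch of n where the Athlon
    still uses the smaller FFT. zip() stops at the shorter list, so the
    last Opteron change point (no Athlon counterpart listed) is ignored."""
    return [(o, a - 1, a - o) for o, a in zip(other, athlon)]


if __name__ == "__main__":
    for start, end, offset in athlon_only_ranges(OTHER_FIRST_N, ATHLON_FIRST_N):
        print(f"{start}-{end} (+{offset})")
```

The third range this gives, 1888095-1925094, lines up with the 1888000-1926000 file range once rounded to file boundaries.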


All times are UTC.