Beta version of PRP
There is a new version of PRP available for testing. It only works on SSE2 machines and should be much faster. It would be great if some of you could test it out - especially important would be a few retests to make sure it generates matching residues.
Windows only for now: [url]ftp://mersenne.org/gimps/prp3.zip[/url] |
I will do some tests with some of the PSP numbers (at first with some low n).
My first observation was that it was slower than prp.exe: PRP.exe gave me 0.92 ms per bit, PRP3.exe gives 1.234 ms per bit for the same test. Correction: now it sometimes gives 0.573 ms per bit. The runtime is not stable: it moves between 0.57 and 1.2 ms (most of the time >1.0 ms). The software runs on a P4 HT 3.0GHz with WinXP. There is another DC project running in parallel (eulernet). The same configuration runs with prp.exe without the times changing. Lars |
One additional question: which residue is the one I should use to compare with prp.exe, "Res64" or "old64"?
Lars |
old64 should match the old prp.exe. The new res64 is compatible with PFGW.
|
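For anyone comparing residues by hand, here is a minimal sketch of a Fermat probable-prime test on k*2^n±1. It assumes a plain base-3 Fermat test and reports the low 64 bits of the final residue. The thread does not say how prp3 derives "old64" versus "Res64", so the helper `fermat_prp` and its output format are hypothetical and only illustrate the general idea.

```python
def fermat_prp(k: int, n: int, c: int, base: int = 3):
    """Fermat test on N = k*2^n + c, where c is +1 or -1.

    Hypothetical helper for illustration; not prp3's actual code.
    Returns (is_probable_prime, low 64 bits of the residue as hex).
    """
    N = k * (1 << n) + c
    # pow() with a modulus does the repeated modular squarings for us.
    residue = pow(base, N - 1, N)
    # Assumption: a 64-bit residue is just the low 64 bits of the result.
    low64 = residue & ((1 << 64) - 1)
    return residue == 1, f"{low64:016X}"


# 3*2^6-1 = 191 is prime, so the residue is 1.
print(fermat_prp(3, 6, -1))
# 3*2^5-1 = 95 = 5*19 is composite and fails the base-3 test.
print(fermat_prp(3, 5, -1))
```

Two programs that agree on the full residue will agree on any 64-bit slice of it, which is why a short residue is enough for double checks.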
Yep found that out too. :redface:
My next report on PRP3. I made 20 tests with an FFT length of 16K. All residues are identical to the original residues I found in the PSP database. (The original calculation was done by a user who used llr.exe.) I also tested whether there are problems when I stop PRP3.exe in the middle of a calculation and have found none yet. The only problem I have is that it is not continuously fast. Most of the time it is slower than prp.exe on my system. Lars |
OK, I think I found a reason for the different speeds.
It is too warm here, so the PC (a notebook) slows down to protect itself from overheating. This seems to be the case, as at the moment prp.exe also shows different speeds, which normally does not happen. Lars |
I've found two bugs so far:
The first is a minor one. If the sieve depth is greater than 2^32, then the value stored in the header line of the output file doesn't match the one from the input file, e.g. 37572480860218 -> 106954810.
The second one is a bit more serious. I'm getting a ROUND OFF error while testing 3*2^234760-1:
[Fri Jul 30 12:11:36 2004] Bit: 234759/234762, ERROR: ROUND OFF (0.5) > 0.40
Possible hardware failure, consult the readme file. Continuing from last save file.
[Fri Jul 30 12:16:22 2004] Bit: 234759/234762, ERROR: ROUND OFF (0.5) > 0.40
Possible hardware failure, consult the readme file. Continuing from last save file.
It seems that the FFT length (12K) may be chosen too small in this case, since this machine never caused a "possible hardware failure" before. Nevertheless it's very, very fast and I haven't found any non-matching residues so far. Great job, George! -- Thomas |
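The numbers in the first bug report are consistent with the sieve depth being cut down to an unsigned 32-bit value somewhere on the way to the output file. A quick check, as a sketch only: `truncate_u32` is a hypothetical helper, and where prp3 actually loses the upper bits is not known from this thread.

```python
def truncate_u32(value: int) -> int:
    """Keep only the low 32 bits, as storing into a 32-bit unsigned field would."""
    return value & 0xFFFFFFFF


sieve_depth = 37572480860218       # value from the input file
print(truncate_u32(sieve_depth))   # 106954810, the value seen in the output file
```

In other words, 37572480860218 = 8748 * 2^32 + 106954810, so only the remainder survives.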
Newest Test report.
1. I have run 140 double checks for PSP numbers with an FFT of 16K. All residues are OK.
2. I have run 6 tests with FFT 96K. All tests are OK also.
3. At the moment I am running the PSP k/n pairs that have been found prime. The first 6 tests are OK. I will report on the rest later.
The new software is really fast and I have noticed no bugs so far. Lars |
All 11 PSP primes are also found with prp3.exe.
I had no crashes and no other errors yet. If you want me to test something special let me know. Lars |
[QUOTE=Thomas11]
The second one is a bit serious. I'm getting a ROUND OFF error while testing 3*2^234760-1: [/QUOTE] Thanks for discovering this one. It is very troublesome. I recommend not using the new prp3 until I can figure out what is causing this. |
Further study of this naughty case reveals no problem in the FFT code. Unfortunately, the FFT data for the number being squared does not exhibit the "random" properties that generate low round-off errors.
I'm talking to the PFGW folks about possible workarounds. They have seen this many times in the past. Since prp3 does the last 50 squarings with round-off checking enabled, I think it is safe to use. If you get this error in the last few iterations, I'll bet you have a probable prime, and the next rev of prp3 will backtrack and handle it properly. |
I've uploaded a new prp3 that fixes a few minor bugs and avoids the problems with non-random data by doing special-but-slower multiplications for the last 25 iterations.
Let me know if you find any problems. |
I also downloaded and tested the new version of prp. For me, it is about 3 times faster! (19*2^531k+1; 32k FFT)
Congratulations, George! Just one thing seems somewhat strange (for k*2^exp+1). If I use k=19, I can go beyond exp=595000 and it still uses the 32K FFT. If I use k=10000 or so, it already switches from the 32K FFT to the 48K FFT below exp=447000. With k=3, exponents up to 638000 are possible with the 32K FFT. For k*2^exp-1 we have more FFT lengths: k=19 uses the 32K FFT up to 601000, k=3 up to 645000 and k=10000 up to 453000. In the minus mode, more FFT sizes are available (28K and 40K instead of only 24K and 40K besides 32K). Is it normal that the thresholds of the FFT sizes depend so greatly on the size of k (the mantissa)? If we use k=3 instead of 10000, the difference is just 13 bits. |
This is all normal. In my modified-Percival IBDWT, log2(k)/2 bits are required in each FFT word.
So let's say the 32K FFT can handle 20 bits per FFT word. If k=1, n can be 32K*20 or 640K. If k = 8191, then log2(k)/2 = 6.5 bits. That leaves 13.5 bits per word. So now you can only handle n up to 32K*13.5 = 432K. |
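The arithmetic above can be sketched in a few lines. This is only an illustration of the rule of thumb: the 20 bits per FFT word is the example figure from the post, not prp3's real limit, and `max_exponent` is a hypothetical name. The thresholds reported earlier in the thread differ somewhat from what this simple model gives.

```python
import math


def max_exponent(fft_len: int, k: int, bits_per_word: float = 20.0) -> int:
    """Rough largest n for k*2^n+-1 at a given FFT length.

    Assumes each FFT word loses log2(k)/2 bits to the multiplier k,
    as described in the post; bits_per_word is an illustrative value.
    """
    usable_bits = bits_per_word - math.log2(k) / 2
    return int(fft_len * usable_bits)


K = 1024
print(max_exponent(32 * K, 1) // K)      # 640, i.e. n up to 640K
print(max_exponent(32 * K, 8191) // K)   # 432, i.e. n up to 432K
```

Going from k=3 to k=10000 costs roughly (log2(10000) - log2(3)) / 2, or about 5.9 bits in every one of the 32K words, which is why the threshold moves so far even though k itself is only a dozen or so bits longer.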
Warning: A bug has been uncovered for k values around 7 to 9 digits. A fix is underway.
|
The fixed prp3 can now be downloaded.
You can get the versions from:
Windows: [url]ftp://mersenne.org/gimps/prp3.zip[/url]
Linux: [url]ftp://mersenne.org/gimps/prp3.tgz[/url]
The Linux version is untested; I do not have Linux running on any P4s here. |