|
|
#353 |
|
May 2005
2^3×7×29 Posts |
Comparison of linux.x86-64 binaries. Some mixed results, overall minor improvement.
Last fiddled with by Cruelty on 2007-07-13 at 21:29 |
|
|
|
|
|
#354 |
|
Mar 2003
New Zealand
13·89 Posts |
These versions reduce the limit for the x86-64 SSE2 mulmod to 2^51. Testing against GMP revealed that it could fail for some values of p just below 2^52.
This affected all x86-64 binaries from sr1sieve version 1.0.0 to 1.1.7 and from sr5sieve version 1.4.11 to 1.5.15. x86 and ppc64 builds are not affected. |
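The kind of cross-check described above can be sketched roughly as follows. This is only an illustration, not the sr1sieve code: `mulmod_float` and `count_mismatches` are hypothetical names, the quotient estimate is an assumed double-precision scheme, and Python's exact integers stand in for GMP as the reference.

```python
import random

MASK64 = (1 << 64) - 1

def mulmod_float(a, b, p):
    """Hypothetical float-assisted a*b mod p (not the actual SSE2 routine)."""
    # Estimate the quotient in double precision; it may be off by a small amount.
    q = int(float(a) * float(b) / float(p))
    # Keep only the low 64 bits of a*b - q*p, as a 64-bit register would.
    r = (a * b - q * p) & MASK64
    if r >= 1 << 63:  # reinterpret as a signed 64-bit value
        r -= 1 << 64
    # Correct for the quotient estimate being slightly too large or too small.
    while r < 0:
        r += p
    while r >= p:
        r -= p
    return r

def count_mismatches(bits, trials=20000, seed=1):
    """Compare mulmod_float with exact arithmetic for odd p just below 2^bits."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        p = (1 << bits) - 1 - 2 * rng.randrange(1 << 20)  # odd, just below 2^bits
        a = rng.randrange(p)
        b = rng.randrange(p)
        if mulmod_float(a, b, p) != a * b % p:
            bad += 1
    return bad
```

Calling `count_mismatches(51)` exercises moduli just below the new 2^51 limit. The sketch uses correction loops, so it stays exact; the posts do not say what caused the real routine to fail near 2^52, so no attempt is made to reproduce that here.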
|
|
|
|
|
#355 |
|
May 2005
2^3·7·29 Posts |
I haven't reached 100T yet so this "bug" does not affect me fortunately...
BTW: Geoff, since you are now a proud owner of a C2D, can we expect some further fine-tuning of sr(x)sieve?
|
|
|
|
|
|
#356 |
|
Mar 2003
New Zealand
10010000101₂ Posts |
These versions fix a memory allocation bug that could cause the program to abort at the end of a sieve range, or a memory leak if there were multiple ranges queued up in the work file.
No work needs to be repeated, as all results for the range would have been written to file before the abort. The affected builds were:
Windows: sr1sieve versions 1.0.25 - 1.1.6, sr5sieve versions 1.4.42 - 1.5.15.
OS X: sr1sieve versions 1.0.25 - 1.1.8, sr5sieve versions 1.4.42 - 1.5.16.
The bug didn't affect the Linux builds. Thanks rogue for finding it. Almost certainly :-). |
|
|
|
|
|
#357 |
|
May 2005
3130₈ Posts |
Does it mean that sr2sieve was not affected?
|
|
|
|
|
|
#358 |
|
Mar 2003
New Zealand
1157₁₀ Posts |
|
|
|
|
|
|
#359 |
|
Mar 2003
New Zealand
13×89 Posts |
A whole month without a new version :-)
I have mainly been working on gcwsieve lately, but I am learning some things that I should be able to apply to sr5sieve. In particular I am beginning to understand why the performance of the modular multiplications is so much better on Core 2 CPUs than others, and maybe some things I can do to improve the situation for the Athlon 64. Here are some plans:
1. Make 64-bit Windows binaries. Once this is done for gcwsieve then it will not be too much harder to do for sr5sieve.
2. Rewrite the x86-64 baby-steps and giant-steps methods completely in assembly. This will allow the hashtable code to be interleaved with the mulmod code, which should help hide the higher instruction latencies of the Athlon 64.
3. Make a library of `streaming modular arithmetic' functions, including a thread-safe version.
4. Implement the SPH algorithm for sr1sieve, and maybe sr2sieve. This will probably be version 2.x.x.
5. Make a multi-threaded version, maybe with a client/server setup for farms. This might be version 3.x.x.
Last fiddled with by geoff on 2007-09-07 at 03:59 Reason: Not reading before posting |
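For readers unfamiliar with the baby-steps giant-steps method mentioned in plan 2, here is a minimal Python sketch of the general technique. It is not the sr5sieve implementation: the function name `bsgs` is hypothetical, a Python dict plays the role of the hashtable, and the built-in `pow` plays the role of the mulmod code. It assumes p is prime and does not divide b.

```python
from math import isqrt

def bsgs(b, target, p):
    """Smallest n >= 0 with b^n ≡ target (mod p), or None if there is none."""
    m = isqrt(p - 1) + 1
    # Baby steps: store b^j for 0 <= j < m, keeping the smallest j per residue.
    table = {}
    cur = 1
    for j in range(m):
        table.setdefault(cur, j)
        cur = cur * b % p
    # Giant steps: multiply the target by b^(-m) until it lands in the table.
    factor = pow(b, -m, p)  # modular inverse power, needs Python 3.8+
    gamma = target % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None
```

As a usage example, p divides k*b^n+1 exactly when b^n ≡ -1/k (mod p), so `bsgs(2, (-pow(5, -1, 101)) % 101, 101)` looks for an n with 101 dividing 5*2^n+1. Each loop iteration here is one table operation plus one modular multiplication, which is the pairing the plan proposes to interleave in assembly.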
|
|
|
|
|
#360 |
|
May 2005
2^3·7·29 Posts |
Thanks for the update, Geoff.
All the changes you mentioned will be highly appreciated. Some more feature requests from me:
|
|
|
|
|
|
#361 |
|
Mar 2003
New Zealand
13·89 Posts |
This version just adds some code to reset the FPU precision before use. It shouldn't be necessary on a properly functioning system, but it doesn't take any extra time to do.
There is a small improvement to the x86-64 mulmod, about 1% faster on Core 2 CPUs. Otherwise no need to upgrade. |
|
|
|
|
|
#362 |
|
Mar 2003
New Zealand
2205₈ Posts |
This version fixes a bug introduced in version 1.4.27 that caused an error message at startup if the --pmax switch was used without the --pmin switch when the sieve file contained the start of the sieve range.
Version 1.6.x will be able to sieve b^n+/-k for use on the Dual-Sierpinski problem. Thanks Phil Moore for pointing out how little change was needed to make this work. There is an experimental version 1.6.0 for those interested. Hopefully in a future version the code will be integrated into the standard sr2sieve binary so that b^n+/-k and k*b^n+/-1 can be sieved together. |
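One way to see why the two forms are so close, shown for the + sign only: for a prime p not dividing k, both reduce to finding n with b^n congruent to a fixed residue, and only that residue differs. The sketch below is an illustration under that assumption, not the actual 1.6.0 change; `dlog_target` and `hits` are hypothetical names.

```python
def dlog_target(k, p, form):
    """Residue t such that b^n ≡ t (mod p) exactly when p divides the term.

    form "k*b^n+1": k*b^n + 1 ≡ 0  =>  b^n ≡ -1/k (mod p)
    form "b^n+k":   b^n + k ≡ 0    =>  b^n ≡ -k   (mod p)
    Assumes p is prime and does not divide k.
    """
    if form == "k*b^n+1":
        return (-pow(k, -1, p)) % p  # modular inverse, needs Python 3.8+
    if form == "b^n+k":
        return (-k) % p
    raise ValueError(form)

def hits(k, b, p, form, nmax):
    """Brute-force list of n <= nmax whose term is divisible by p."""
    t = dlog_target(k, p, form)
    return [n for n in range(nmax + 1) if pow(b, n, p) == t]
```

For example, `hits(5, 2, 101, "b^n+k", 200)` lists the n up to 200 for which 101 divides 2^n+5. For the - sign the target simply changes sign in both cases.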
|
|
|
|
|
#363 |
|
Mar 2003
New Zealand
485₁₆ Posts |
This version works with both forms k*b^n+/-1 and b^n+/-k together in the same sieve.
It doesn't appear to be any slower than 1.5.x when sieving k*b^n+/-1 alone, but if anyone notices a slowdown on their own hardware, please let me know. (There are a number of extra branches in the code, but when all sequences in the sieve have the same form the branches are predictable). |
|
|
|