mersenneforum.org  

Old 2007-06-29, 02:55   #342
geoff
 

Quote:
Originally Posted by Cruelty
Attached are benchmarks for all variants of 1.5.10.

Thanks for those benchmarks. It is a pity the -int versions are so much slower, but at least I know not to spend any more time on that idea :-)
Old 2007-07-02, 02:50   #343
geoff
 
sr5sieve 1.5.12

The x86-64 binary now has two separate code paths: for factors up to 2^52 it uses SSE2, and for factors between 2^52 and 2^62 it uses the (slower) x87 FPU. The choice of code path is automatic, but for testing purposes the --no-sse2 switch will force it to use the FPU code path.
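
As a rough illustration of the selection rule just described, here is a small Python sketch that maps a candidate factor p to a code path. The name choose_code_path, the no_sse2 parameter and the treatment of p exactly equal to 2^52 are assumptions made for this example; sr5sieve itself is not written this way.

```python
SSE2_LIMIT = 2**52  # upper bound for the SSE2 path, as stated in the post
X87_LIMIT = 2**62   # upper bound for the x87 FPU path, as stated in the post


def choose_code_path(p, no_sse2=False):
    """Return "sse2" or "x87" for a candidate factor p (hypothetical helper).

    Assumption: p == 2**52 is still handled by SSE2. The post only says
    "up to 2^52", so the exact boundary is a guess.
    """
    if p > X87_LIMIT:
        raise ValueError("factor is beyond 2^62; neither code path applies")
    if no_sse2 or p > SSE2_LIMIT:
        # Either forced by the switch or too large for the SSE2 path.
        return "x87"
    return "sse2"


if __name__ == "__main__":
    for p in (2**40, 2**52 + 1, 2**61):
        print(p, choose_code_path(p), choose_code_path(p, no_sse2=True))
```

With no_sse2=True every factor goes to the x87 path, which is what makes the switch useful for testing that path on small factors.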

This is mainly for the benefit of the RieselSieve project, which at the current rate of sieving could be crossing the 2^52 boundary in a few months. Any testing before then will be much appreciated.

There have been a number of small improvements to the ppc-64 build since 1.5.10; these just need testing to make sure no new bugs have crept in.

There haven't been any significant changes to the x86 builds since 1.5.6.

This version contains most of the changes I wanted to make in 1.5.x, and I will mainly concentrate on fixing bugs from now on.
Old 2007-07-02, 17:52   #344
Cruelty
 

1.5.12 is slower than 1.5.10 by ~13%
Attached file: 1510vs1512.txt (2.7 KB)

Last fiddled with by Cruelty on 2007-07-02 at 17:53
Old 2007-07-05, 03:40   #345
geoff
 

Quote:
Originally Posted by Cruelty
1.5.12 is slower than 1.5.10 by ~13%

I can't figure out how this could happen, as there has been no change to the SSE2 code since 1.5.10. The only possibility I can think of is that somehow the wrong code path is being used. Could you try running with the --no-sse2 switch to test the other code path?

The one area in which 1.5.12 should be much slower is verifying the factors it finds, but this should only have a noticeable effect when hundreds of factors per second are being found.
Old 2007-07-05, 18:00   #346
Cruelty
 

There is virtually no change when using the "--no-sse2" switch, so it has to be something different.
Attached file: bench.txt (4.1 KB)
Old 2007-07-05, 22:55   #347
geoff
 
sr5sieve 1.5.13

This version fixes the bug in 1.5.12 that caused the x86-64 binary to always use the non-SSE2 code path, even when it said it was using the SSE2 path :-).
Old 2007-07-06, 06:02   #348
Cruelty
 

Here is a comparison of 1.5.10 and 1.5.13. What is "BSGS range..."?
Attached file: 1510vs1513.txt (2.7 KB)
Old 2007-07-07, 00:02   #349
geoff
 

Quote:
Originally Posted by Cruelty
Here is a comparison of 1.5.10 and 1.5.13. What is "BSGS range..."?

BSGS range: 92*91 - 1042*8

This means that in the extreme cases it might do 92 baby steps and 91 giant steps, or 1042 baby steps and 8 giant steps.
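
For readers who have not met baby-step giant-step (BSGS) before, here is a minimal Python sketch of the textbook algorithm with the number of baby steps and giant steps passed in explicitly. It solves base^n = target (mod p) for a single target. How sr5sieve organises its steps internally is not shown in this thread, so the function bsgs and the sample values p = 10007 and n = 1234 are assumptions chosen only for illustration.

```python
def bsgs(base, target, p, baby, giant):
    """Find the smallest n in [0, baby*giant) with base**n % p == target % p.

    Textbook baby-step giant-step; returns None if no such n exists.
    Requires base to be invertible modulo p (true for prime p not dividing base).
    """
    # Baby steps: remember the smallest j for each value base**j, 0 <= j < baby.
    table = {}
    x = 1
    for j in range(baby):
        table.setdefault(x, j)
        x = x * base % p

    # Giant steps: multiply the target by base**(-baby) each time and look
    # it up in the table. pow with a negative exponent needs Python 3.8+.
    stride = pow(base, -baby, p)
    y = target % p
    for i in range(giant):
        if y in table:
            return i * baby + table[y]
        y = y * stride % p
    return None


if __name__ == "__main__":
    p, base = 10007, 5            # sample prime and base, chosen arbitrarily
    target = pow(base, 1234, p)
    # The two extremes quoted in the post: 92*91 and 1042*8.
    for baby, giant in ((92, 91), (1042, 8)):
        n = bsgs(base, target, p, baby, giant)
        print(baby, giant, n, pow(base, n, p) == target)
```

Both extremes quoted above cover a similar number of exponents (92*91 = 8372 and 1042*8 = 8336). They differ in how the work is split between building the table and performing lookups.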
Old 2007-07-09, 22:56   #350
geoff
 
sr5sieve 1.5.14, sr1sieve 1.1.6

In 1.5.14 I have improved the sse2/8 and sse2/16 methods a little: they now avoid doing more than 4 extra mulmods. This increases the number of branches in the code, but most of them are predictable, and since it works out a bit faster on my P4, I assume newer machines with better branch prediction or shorter pipelines will not suffer. Here are the times for my P4:
Code:
                         19k SoB.dat    68k riesel.dat    237k sr5data.txt
                         -----------    --------------    ----------------
sr2sieve-intel 1.5.6     425 kp/s       223 kp/s           98 kp/s
sr2sieve-intel 1.5.14    455 kp/s       229 kp/s           99 kp/s

sr1sieve 1.1.6 is updated with most of the recent changes for ppc64 and x86-64, except that it is still limited to a sieve depth of 2^52 on x86-64. I don't plan to extend that unless someone has a real need for it.
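
A note on the 2^52 figure: an IEEE double has a 53-bit significand, while the x87 extended format has a 64-bit one, which is presumably why the SSE2 path stops near 2^52 and the x87 path can go further. The Python sketch below shows the general floating-point mulmod idea under that assumption. It is not geoff's code, and the name mulmod_double is made up for this example.

```python
import random


def mulmod_double(a, b, p):
    """Compute a*b mod p via a double-precision quotient estimate.

    Intended for 0 <= a, b < p < 2**52, where a, b and p are all exactly
    representable as doubles. Illustrative only.
    """
    # Estimate floor(a*b/p) in floating point. The product is rounded, so
    # the estimate can be off by a small amount in either direction.
    q = int(float(a) * float(b) / float(p))
    # Python integers are exact. In C the same subtraction would typically
    # be done on the low 64 bits only, since the true remainder is small.
    r = a * b - q * p
    # Correct the remainder for any error in the quotient estimate.
    while r < 0:
        r += p
    while r >= p:
        r -= p
    return r


if __name__ == "__main__":
    random.seed(1)
    p = 2**52 - 47  # arbitrary modulus just below the limit, not necessarily prime
    for _ in range(5):
        a, b = random.randrange(p), random.randrange(p)
        print(mulmod_double(a, b, p) == a * b % p)
```

The correction loops make the result exact whatever the size of the estimate error; the point of keeping p below 2^52 is that the error stays small enough for a fixed, tiny number of corrections.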
Old 2007-07-10, 22:10   #351
Cruelty
 

Attached is a comparison for the linux.x86-64 binaries.
Virtually no change for sr2sieve, and a 2.6% improvement for sr1sieve.
Attached file: bench.txt (4.9 KB)
Old 2007-07-12, 22:59   #352
geoff
 
sr5sieve 1.5.15

This version extends the improvements in 1.5.14 to the non-SSE2 x86 mulmod code. I hope it'll be a little faster on newer machines. It is only fractionally faster on my P3, but better on my P4 (tested with SSE2 disabled):
Code:
                         19k SoB.dat    68k riesel.dat    237k sr5data.txt
                         -----------    --------------    ----------------
1.5.14 --no-sse2          260 kp/s       142 kp/s           69 kp/s
1.5.15 --no-sse2          286 kp/s       151 kp/s           71 kp/s
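
To put the table above in relative terms, the gains can be worked out with a few lines of Python. The helper percent_gain is hypothetical; the kp/s figures are copied from the table.

```python
def percent_gain(old_kps, new_kps):
    """Relative speed change from old to new, in percent (hypothetical helper)."""
    return (new_kps - old_kps) / old_kps * 100.0


# (1.5.14 --no-sse2, 1.5.15 --no-sse2) in kp/s, taken from the table above.
timings = {
    "19k SoB.dat": (260, 286),
    "68k riesel.dat": (142, 151),
    "237k sr5data.txt": (69, 71),
}

if __name__ == "__main__":
    for name, (old, new) in timings.items():
        print(f"{name}: {percent_gain(old, new):+.1f}%")
```

This gives roughly 10%, 6% and 3% for the three files, so the improvement shrinks as the number of sequences in the sieve file grows.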

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.