mersenneforum.org > Prime Search Projects > Sierpinski/Riesel Base 5
2007-10-05, 00:14   #375
KriZp
Feb 2007

135₁₀ Posts

Heating season is coming up, so I have a good reason to dust off the old BP6 dual Celeron.
Did some benchmarks running Fedora 7:
Code:
sr2sieve 1.6.5          89376 p/sec
sr2sieve 1.4.42 intel   83988 p/sec
sr2sieve 1.4.42 amd     83667 p/sec
sr2sieve 1.5.20 intel   88587 p/sec
sr2sieve 1.5.20 amd     89993 p/sec
sr2sieve 1.6.5          89376 p/sec
sr2sieve 1.6.6          89397 p/sec
JJsieveCMOV6.exe        76 kp/s (through wine)
Verbose (-vv) output is in the attached file for anyone interested.
Attached Files
File Type: zip Sievebenchmarks.txt.zip (2.0 KB, 100 views)
2007-10-08, 01:31   #376
geoff
Mar 2003
New Zealand

13·89 Posts
sr2sieve/sr5sieve 1.6.7

In this version I optimised the hashtable code a bit.
2007-10-09, 20:52   #377
Cruelty
May 2005

3130₈ Posts

Attached is a comparison between the sr1 and sr2 sieves on Linux x86-64: roughly a 10% speed increase for sr2 and about 2% for sr1.
Attached Files
File Type: txt bench.txt (5.0 KB, 105 views)
2007-10-13, 22:22   #378
geoff
Mar 2003
New Zealand

2205₈ Posts
sr2sieve/sr5sieve 1.6.9

This version has a new giant step method for x86-64; it is my first attempt to combine the mulmods and hashtable lookups in one pass. It is a little faster on Core 2, but untested on Athlon 64.

The idea behind doing the hashtable lookups at the same time as the mulmods is to give the Athlon 64 CPU something useful to do while it waits for the high-latency integer/floating point conversions to finish (these are not a bottleneck on the Core 2). The problem is that the hashtable code contains a lot of branches, and the branches are only predictable when the hashtable density is low, so it is not really clear whether it will pay off in practice.
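
For readers unfamiliar with the terms, the combination described here can be pictured with a small baby-step giant-step sketch. This is not sr2sieve's code, which is hand-tuned C and assembly; it is a minimal Python illustration in which a plain dict stands in for the hashtable, and the name `bsgs_log` and its parameters are hypothetical. It only shows the shape of the loop: each giant step performs one lookup and one mulmod in the same pass. The latency-hiding benefit itself cannot be reproduced in Python.

```python
from math import isqrt


def bsgs_log(b, target, p, n_max):
    """Return the smallest n in [0, n_max) with b**n % p == target % p, else None.

    Hypothetical helper for illustration only. Assumes p is prime,
    b is not divisible by p, and n_max >= 1.
    """
    m = isqrt(n_max) + 1  # m*m > n_max, so n = i*m + j covers the whole range

    # Baby steps: one mulmod followed by one hashtable insertion per step.
    table = {}
    x = 1
    for j in range(m):
        table.setdefault(x, j)  # keep the smallest j for each residue
        x = x * b % p           # mulmod

    # Giant steps: one hashtable lookup and one mulmod in the same pass.
    factor = pow(b, -m, p)      # b^(-m) mod p (needs Python 3.8+)
    gamma = target % p
    for i in range(m):
        j = table.get(gamma)    # hashtable lookup
        if j is not None:
            n = i * m + j
            return n if n < n_max else None
        gamma = gamma * factor % p  # mulmod
    return None


if __name__ == "__main__":
    p = 101
    target = pow(2, 37, p)
    n = bsgs_log(2, target, p, 100)
    print(n, pow(2, n, p) == target)
```

In a sieve, the same search would presumably be used to find n with k*b^n congruent to -c modulo p; the branchy part geoff mentions corresponds to the `table.get` call, which in a real open hashtable becomes a chain of comparisons.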
2007-10-14, 02:09   #379
KriZp
Feb 2007

135₁₀ Posts

1.6.9:
p=688770037701469, 1070697 p/sec, 29 factors, 99.1% cpu, 16748 sec/factor
[root@Athlon64 ~]# 1008239 p/sec, 27 factors, 58.0% done, ETA 17 Oct 01:40

A good 15% improvement on my Opteron and A64, thanks! More than 1 Mp/s on each core for the first time.
2007-10-16, 05:38   #380
geoff
Mar 2003
New Zealand

10010000101₂ Posts

Quote:
Originally Posted by KriZp
1.6.9:
p=688770037701469, 1070697 p/sec, 29 factors, 99.1% cpu, 16748 sec/factor
[root@Athlon64 ~]# 1008239 p/sec, 27 factors, 58.0% done, ETA 17 Oct 01:40

A good 15% improvement on my Opteron and A64, thanks! More than 1 Mp/s on each core for the first time.
Great! With a bit of luck we should get a similar improvement when the baby step mulmods are combined with the hashtable insertions. I plan to make these changes for the 32-bit versions as well, but the gains will probably be smaller unless I can figure out a way to employ SSE2 for the hashtable operations.
2007-10-16, 15:36   #381
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)

3·2,083 Posts

In light of the recent sieve file update, I was wondering: if you cached the Legendre symbol tables by using the -c option the first time you ran sr5sieve, do you have to delete the sr5cache.bin file and run with the -c switch again to regenerate it when the sieve file is updated? Or does the sieve file have no effect on sr5cache.bin, so that it need not be regenerated when a new sieve file comes out?
2007-10-18, 05:16   #382
geoff
Mar 2003
New Zealand

13·89 Posts

Quote:
Originally Posted by Anonymous
In light of the recent sieve file update, I was wondering: if you cached the Legendre symbol tables by using the -c option the first time you ran sr5sieve, do you have to delete the sr5cache.bin file and run with the -c switch again to regenerate it when the sieve file is updated? Or does the sieve file have no effect on sr5cache.bin, so that it need not be regenerated when a new sieve file comes out?
It is not necessary to regenerate the cache file, unless you want to save a little disk space.

The cache file stores information about each k,c pair. This information doesn't change when terms are deleted from the sieve file. Regenerating the cache file will just remove the redundant entries for those k,c that have been removed from the sieve file.

A note for those sieving with SoB.dat and riesel.dat: If you run `sr2sieve -rs -c' once it will generate a combined cache file that can be used for either SoB.dat or riesel.dat. (Stop sieving with ctrl-c once it has been generated).
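
To picture why a stale cache is harmless, here is a minimal sketch that models the cache as a mapping keyed by (k, c). The real sr5cache.bin format is not described in this thread, so the functions `build_cache` and `prune_cache`, the placeholder table contents, and the example k,c pairs are all hypothetical.

```python
def build_cache(pairs):
    """Map each (k, c) pair to a placeholder standing in for its cached data."""
    return {kc: f"data for k={kc[0]}, c={kc[1]}" for kc in pairs}


def prune_cache(cache, sieve_pairs):
    """Drop entries whose (k, c) no longer appears in the sieve file."""
    return {kc: data for kc, data in cache.items() if kc in sieve_pairs}


# Made-up k,c pairs present when the cache was first generated.
old_pairs = {(3, 1), (5, -1), (7, 1)}
cache = build_cache(old_pairs)

# A later sieve file with one sequence removed (for example after a prime is found).
new_pairs = {(3, 1), (7, 1)}

# Every remaining pair still finds its entry, so the old cache stays usable.
still_usable = all(kc in cache for kc in new_pairs)

# Regenerating only removes the redundant entry, saving a little space.
pruned = prune_cache(cache, new_pairs)
print(still_usable, len(cache), len(pruned))
```

The entry for each remaining pair is identical before and after pruning, which matches the point that the cached information does not change when terms are deleted.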
2007-10-18, 14:18   #383
mdettweiler
A Sunny Moo
Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
Originally Posted by geoff
It is not necessary to regenerate the cache file, unless you want to save a little disk space.

The cache file stores information about each k,c pair. This information doesn't change when terms are deleted from the sieve file. Regenerating the cache file will just remove the redundant entries for those k,c that have been removed from the sieve file.

A note for those sieving with SoB.dat and riesel.dat: If you run `sr2sieve -rs -c' once it will generate a combined cache file that can be used for either SoB.dat or riesel.dat. (Stop sieving with ctrl-c once it has been generated).
Okay, thanks! I'll keep that in mind next time a new dat file comes out (or if a new prime is found, in which case I'll remove it manually).
2007-10-19, 22:02   #384
geoff
Mar 2003
New Zealand

10010000101₂ Posts
sr1sieve 1.2.0

This version results from a lot of cut-and-paste of the latest sr5sieve code, and so could contain bugs. Use the latest 1.1.x version instead if you have problems. New since version 1.1.12:

* -e switch reports speeds in elapsed time instead of cpu time.

* Single x86 executable with separate AMD and Intel code paths: --amd or --intel switches can be used to override the automatic code path selection.

* x86-64 executable can now sieve to p=2^62. The x87 FPU will be used for p > 2^51. The --no-sse2 switch forces use of the x87 FPU for p < 2^51.

* New hashtable code should benefit all x86/x86-64 machines. New giant steps method (new/4) and baby steps method (gen/6) for x86-64 should benefit Athlon 64.
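
The 2^51 and 2^62 limits in the third item are consistent with how much precision the floating point unit offers for estimating the quotient in a modular multiplication: an SSE2 double has a 53-bit significand, while the x87 extended format has 64 bits. The following is a rough Python sketch of that general technique, not sr1sieve's actual assembly. The name `mulmod_fp` is hypothetical, and Python floats (IEEE doubles) stand in for the SSE2 path only.

```python
MASK64 = (1 << 64) - 1


def mulmod_fp(a, b, p):
    """Return a*b % p for 0 <= a, b < p < 2**51 using a float quotient estimate.

    Illustration only: the quotient floor(a*b/p) is estimated in double
    precision, then the remainder is recovered from the low 64 bits of
    a*b - q*p, as wrapping integer registers would provide.
    """
    q = int(float(a) * float(b) / float(p))  # estimate, may be off by one
    r = (a * b - q * p) & MASK64             # low 64 bits of the difference
    if r >> 63:                              # reinterpret as signed 64-bit
        r -= 1 << 64
    while r < 0:                             # fix an overestimated quotient
        r += p
    while r >= p:                            # fix an underestimated quotient
        r -= p
    return r


if __name__ == "__main__":
    p = (1 << 51) - 129
    a, b = 123456789012345, p - 2
    print(mulmod_fp(a, b, p) == a * b % p)
```

For p below 2^51 the rounding error in the double-precision estimate stays under one half, so the quotient is off by at most one and the correction loops run at most once. For larger p that bound no longer holds with doubles, which is presumably why the x87 path takes over.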
2007-10-20, 20:49   #385
BlisteringSheep
Oct 2006
On a Suzuki Boulevard C90

2×3×41 Posts

Geoff, there's a compilation error in mulmod-ppc64.c, line 8: undeclared variable 'p'. I think it's just a copy-and-paste slip, where the function parameter is named 'b' and should be 'p' instead. At least that's the change I made. :)

FYI, here's some startup info for v1.1.12 and v1.2.0 on a 970MP:

Code:
sr1sieve 1.1.12 -- A sieve for one sequence k*b^n+/-1.
L1 data cache 32Kb (default), L2 cache 1024Kb (detected).
Read 141012 terms for 5*2^n-1 from NewPGen file `5sheep_840.txt'.
Split 1 base 2 sequence into 61 base 2^180 subsequences.
Using 0 Kb for Legendre symbol tables.   
BSGS range: 133*132 - 1033*17.
Using 16 Kb for the baby-steps giant-steps hashtable, maximum density 0.13.
Best time for baby step method gen/1: 164.
Best time for baby step method gen/2: 182.
Best time for baby step method gen/4: 155.
Best time for baby step method gen/8: 152.
Best time for giant step method gen/1: 159.
Best time for giant step method gen/2: 153.
Best time for giant step method gen/4: 152.
Best time for giant step method gen/8: 150.
Baby step method gen/8, giant step method gen/8.
Using 512Kb for the Sieve of Eratosthenes bitmap.
Code:
sr1sieve 1.2.0 -- A sieve for one sequence k*b^n+/-1.
Compiled on Oct 20 2007 with GCC version 4.1.1.
L1 data cache 32Kb (default), L2 cache 1024Kb (detected).
Read 141012 terms for 5*2^n-1 from NewPGen file `5sheep_840.txt'.
Split 1 base 2 sequence into 61 base 2^180 subsequences.
Using 0 Kb for Legendre symbol tables.   
BSGS range: 133*132 - 1033*17.
Using 16 Kb for the baby-steps giant-steps hashtable, maximum density 0.13.
Best time for baby step method gen/2: 188.
Best time for baby step method gen/4: 162.
Best time for baby step method gen/8: 154.
Best time for giant step method gen/2: 146.
Best time for giant step method gen/4: 141.
Best time for giant step method gen/8: 139.
Baby step method gen/8, giant step method gen/8.
Using 512Kb for the Sieve of Eratosthenes bitmap.
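
Both logs show the program timing several baby step and giant step variants and then settling on one of each. A minimal sketch of that kind of selection, using the 1.2.0 timings above, might look like the following. The name `pick_fastest` is hypothetical, and the rule that the lowest "best time" wins is inferred from the logs rather than taken from the source code.

```python
def pick_fastest(timings):
    """Return the method name with the lowest measured time."""
    return min(timings, key=timings.get)


# Timings copied from the sr1sieve 1.2.0 startup log above.
baby = {"gen/2": 188, "gen/4": 162, "gen/8": 154}
giant = {"gen/2": 146, "gen/4": 141, "gen/8": 139}

print(f"Baby step method {pick_fastest(baby)}, "
      f"giant step method {pick_fastest(giant)}.")
```

This selects gen/8 for both, matching the choice reported in the log; the same rule also reproduces the gen/8 choices in the 1.1.12 log.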
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.