#309 |
|
May 2005
2³×7×29 Posts |
|
|
|
|
|
|
#310 |
|
May 2005
3130₈ Posts |
P3-M @ 1GHz - amd binary is slower
Generally version 1.5.2 is ~3% faster than 1.4.2.
Code:
sr2sieve-intel -vv
sr2sieve 1.5.2 -- A sieve for multiple sequences k*b^n+/-1.
L1 data cache 16Kb (detected), L2 cache 512Kb (detected).
Read 493896 terms for 9 sequences from ABCD format file `sr2data.txt'.
Split 9 base 2 sequences into 649 base 2^180 subsequences.
Loaded Legendre symbol lookup tables for 9 sequences from `sr2cache.bin'.
Using 16 Kb for the baby-steps giant-steps hashtable, maximum density 0.23.
Best time for baby step method gen/2: 62836.
Best time for baby step method gen/4: 62120.
Best time for baby step method gen/8: 57971.
Best time for baby step method gen/1: 73323.
Best time for giant step method gen/2: 39837.
Best time for giant step method gen/4: 39845.
Best time for giant step method gen/8: 37217.
Best time for giant step method gen/1: 44362.
Best time for ladder method gen/2: 5197.
Best time for ladder method gen/4: 4614.
Best time for ladder method gen/8: 4866.
Best time for ladder method gen/1: 8042.
Best time for ladder method add/1: 12124.
Using baby step method gen/8, giant step method gen/8, ladder method gen/4.
Resuming from checkpoint pmin=3955028652461 in `checkpoint.txt'.
Using 256Kb for the Sieve of Eratosthenes bitmap.
Expecting to find factors for about 4896.56 terms in this range.
sr2sieve started: 1000000 <= n <= 1999997, 3955028652461 <= p <= 4000000000000
p=3955047920347, 319607 p/sec, 4474 factors, 95.50% done, 667 sec/factor
Last fiddled with by Cruelty on 2007-05-14 at 08:31 |
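For anyone reading the benchmark lines in that log: the "Using ..." line is consistent with simply taking the lowest time in each group. The sketch below is not code from sr2sieve. It only parses the "Best time for" lines quoted above and picks the minimum per step kind; the function name `pick_fastest` and the regular expression are made up for illustration.

```python
import re

# "Best time for" lines copied from the sr2sieve 1.5.2 log above.
LOG = """\
Best time for baby step method gen/2: 62836.
Best time for baby step method gen/4: 62120.
Best time for baby step method gen/8: 57971.
Best time for baby step method gen/1: 73323.
Best time for giant step method gen/2: 39837.
Best time for giant step method gen/4: 39845.
Best time for giant step method gen/8: 37217.
Best time for giant step method gen/1: 44362.
Best time for ladder method gen/2: 5197.
Best time for ladder method gen/4: 4614.
Best time for ladder method gen/8: 4866.
Best time for ladder method gen/1: 8042.
Best time for ladder method add/1: 12124.
"""

# Hypothetical pattern: step kind, method name, reported time.
PATTERN = re.compile(r"Best time for (.+?) method ([a-z0-9]+/\d+): (\d+)\.")


def pick_fastest(log_text):
    """Return {step kind: (method, time)} holding the lowest time per kind."""
    best = {}
    for kind, method, ticks in PATTERN.findall(log_text):
        ticks = int(ticks)
        if kind not in best or ticks < best[kind][1]:
            best[kind] = (method, ticks)
    return best


if __name__ == "__main__":
    for kind, (method, ticks) in pick_fastest(LOG).items():
        print(f"{kind}: {method} ({ticks})")
```

On this log the minimum per group agrees with the methods the sieve reports using (gen/8, gen/8, gen/4).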
|
|
|
|
|
#311 |
|
May 2005
2³×7×29 Posts |
I have upgraded linux64.sr1sieve on one of my machines from 1.0.23 to 1.1.0 and I get an error when trying to run it. Any ideas what's going on? It's a C2D E4300 CPU @ 2.4GHz.
Code:
sr1sieve 1.1.0 -- A sieve for one sequence k*b^n+/-1.
L1 data cache 32Kb (detected), L2 cache 2048Kb (detected).
Read 89136 terms for 4*3^n-1 from NewPGen file `k=4_b=3.txt'.
Split 1 base 3 sequence into 32 base 3^90 subsequences.
Using 0 Kb for Legendre symbol tables.
Using 8 Kb for the baby-steps giant-steps hashtable, maximum density 0.20.
Best time for baby step method gen/2: 20322.
Best time for baby step method gen/4: 17064.
Best time for baby step method gen/1: 23553.
Best time for giant step method gen/2: 12087.
Best time for giant step method gen/4: 13131.
Best time for giant step method gen/1: 16704.
Using baby step method gen/4, giant step method gen/2.
Using 1024Kb for the Sieve of Eratosthenes bitmap.
Expecting to find factors for about 1461.46 terms.
sr1sieve started: 200013 <= n <= 1999957, 6121454326643 <= p <= 10000000000000
./linux.bat: line 1: 23295 Segmentation fault (core dumped) ./sr1sieve -i k=4_b=3.txt -o ready.txt -f factors.txt --pmax 10e12 -vv --save 15
Last fiddled with by Cruelty on 2007-05-14 at 20:12 |
|
|
|
|
|
#312 | |
|
Mar 2003
New Zealand
13×89 Posts |
Quote:
Also, could you try running it with the command line switches -Bgen/1 -Ggen/1 and see whether it still segfaults? |
|
|
|
|
|
|
#313 |
|
May 2005
2³·7·29 Posts |
Manually setting -Bgen/x to anything other than -Bgen/1 causes a segfault. With -Bgen/1 I can use any value for -Ggen/x without any error.
As my Linux knowledge is somewhat limited, I don't understand what you mean by "core file".
|
|
|
|
|
|
#314 | ||
|
Mar 2003
New Zealand
10010000101₂ Posts |
Quote:
Also in these versions, benchmarks are run twice and the times taken from the second run. This should help ensure that everything is in cache when the times are taken.
Quote:
I don't need the core file now, unless you get another segfault. |
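The "run twice, time the second run" idea described in this post can be sketched roughly as follows. This is only an illustration of the technique, not the sieve's benchmark code: the names `timed_second_run` and `choose_method` and the injectable `timer` parameter are assumptions, and in Python the warm-up call does not control CPU caches the way it does in native code.

```python
import time


def timed_second_run(fn, timer=time.perf_counter):
    """Call fn twice and return the elapsed time of the second call only."""
    fn()                  # warm-up run: result ignored, timing not recorded
    start = timer()
    fn()                  # measured run
    return timer() - start


def choose_method(candidates, timer=time.perf_counter):
    """Given {name: callable}, return the name with the lowest second-run time."""
    times = {name: timed_second_run(fn, timer) for name, fn in candidates.items()}
    return min(times, key=times.get)


if __name__ == "__main__":
    # Two stand-in workloads; real candidates would be the gen/1, gen/2, ... routines.
    candidates = {
        "small": lambda: sum(range(1_000)),
        "large": lambda: sum(range(200_000)),
    }
    print(choose_method(candidates))
```

Passing the timer in as a parameter is just a convenience so the timing logic can be checked with a fake clock.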
||
|
|
|
|
|
#315 |
|
May 2005
2³×7×29 Posts |
C2D E4300 @ 2.4GHz using sr1sieve.linux.x86-64 v1.1.1: speed increase of ~40% (6.1M vs 8.6M).
|
|
|
|
|
|
#316 |
|
I quite division it
"Chris"
Feb 2005
England
2077₁₀ Posts |
sr1sieve
C2D E4300 @ a very hot 3013 MHz, Windows: 1% speed increase over the version from a couple of weeks ago. It correctly chose sse2/16 for baby steps and sse2/8 for giant steps. Is it possible to have sse2/32, sse2/64 etc. to further tweak it? (Baby steps in this case.)
Last fiddled with by Flatlander on 2007-05-18 at 16:07 Reason: more details |
|
|
|
|
|
#317 |
|
May 2005
2³·7·29 Posts |
BTW: under x86-64 Linux there are no sse2 methods to choose from for the C2D CPU. Are they only available on 32-bit systems?
|
|
|
|
|
|
#318 | ||
|
Mar 2003
New Zealand
13·89 Posts |
Quote:
Currently the 64-bit code uses SSE2 for the floating point operations and general registers for the integer operations (because the SSE2 instruction set lacks any SIMD equivalent of the imulq instruction). I plan to add routines that use the FPU instead of SSE2; this will be slower but will allow sieving beyond p=2^52 as an option. The 32-bit code uses the FPU for floating point operations (not ideal, but the 32-bit SSE2 instruction set lacks the vital cvtsi2sdq and cvtsd2siq instructions) and SSE2 for integer operations, where available.
Quote:
On the other hand there is a small benefit simply from unrolling the loops on machines with a large L1 code cache, but there are other ideas I will try first. In particular it should be possible to interleave the hashtable insert/lookup code with the mulmod code. The 64-bit code doesn't yet make proper use of the packed data SSE2 instructions (it uses two mulsd instructions instead of one mulpd), so I will also try improving that. |
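To illustrate the mix of floating point and integer work described above, here is a rough sketch of a mulmod that estimates the quotient in double precision and takes the remainder from the low 64 bits of integer arithmetic. It is not the sr1sieve routine. It assumes the p=2^52 limit comes from the precision of doubles, and the name `mulmod_fp`, the masking with `MASK64` (standing in for the wrap-around of a 64-bit imulq), and the correction loops are all choices made for this example.

```python
MASK64 = (1 << 64) - 1


def mulmod_fp(a, b, p):
    """Return a*b mod p for 0 <= a, b < p < 2**52.

    The quotient is estimated with double-precision arithmetic; the
    remainder is then recovered from the low 64 bits of a*b - q*p.
    """
    if not (0 <= a < p and 0 <= b < p and p < 2**52):
        raise ValueError("requires 0 <= a, b < p < 2**52")
    inv = 1.0 / p                          # reciprocal, computed once per p in practice
    q = int(float(a) * float(b) * inv)     # approximate quotient, may be off by a little
    r = (a * b - q * p) & MASK64           # only the low 64 bits are needed
    if r >= 1 << 63:                       # reinterpret as a signed 64-bit value
        r -= 1 << 64
    while r < 0:                           # correct an overestimated quotient
        r += p
    while r >= p:                          # correct an underestimated quotient
        r -= p
    return r


if __name__ == "__main__":
    p = 3955047920347                      # a p value taken from the log earlier in the thread
    a, b = 123456789, 987654321
    print(mulmod_fp(a, b, p) == (a * b) % p)
```

Because the estimated quotient is only ever wrong by a small amount for p below 2^52, the true remainder fits easily in a signed 64-bit value and the correction loops run at most a few times.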
||
|
|
|
|
|
#319 |
|
Mar 2003
New Zealand
13·89 Posts |
Could you send me the results of running with the -vv option on this machine? I have only implemented the gen/2 and gen/4 methods. More improvements may be possible, but without a machine to test on it involves a lot of guesswork (or a much better understanding of the processor architecture than I have :-).
|
|
|
|