|
|
#320 |
|
May 2005
3130₈ Posts |
Voila
Code:
sr1sieve 1.1.1 -- A sieve for one sequence k*b^n+/-1.
L1 data cache 32Kb (detected), L2 cache 2048Kb (detected).
Read 87511 terms for 4*3^n-1 from NewPGen file `k=4_b=3.txt'.
Split 1 base 3 sequence into 32 base 3^90 subsequences.
Using 0 Kb for Legendre symbol tables.
Using 8 Kb for the baby-steps giant-steps hashtable, maximum density 0.20.
Best time for baby step method gen/2: 20196.
Best time for baby step method gen/4: 17109.
Best time for baby step method gen/1: 23940.
Best time for giant step method gen/2: 11529.
Best time for giant step method gen/4: 11682.
Best time for giant step method gen/1: 16101.
Using baby step method gen/4, giant step method gen/2.
Using 1024Kb for the Sieve of Eratosthenes bitmap.
Expecting to find factors for about 1810.45 terms.
sr1sieve started: 200013 <= n <= 1999957, 10613401255319 <= p <= 20000000000000
p=10613920959167, 8678654 p/sec, 0 factors, 0.01% done, ETA 03 Jun 20:31
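The "Using ..." line in this log is consistent with picking, for each step type, the variant with the lowest benchmark time. A minimal Python sketch of that reading follows; the selection rule is inferred from this one log rather than taken from the sr1sieve source, and `pick_methods` is a hypothetical helper name.

```python
import re

# Timing lines copied from the log in this post.
LOG = """\
Best time for baby step method gen/2: 20196.
Best time for baby step method gen/4: 17109.
Best time for baby step method gen/1: 23940.
Best time for giant step method gen/2: 11529.
Best time for giant step method gen/4: 11682.
Best time for giant step method gen/1: 16101.
"""

TIMING = re.compile(r"Best time for (baby|giant) step method (\S+): (\d+)\.")


def pick_methods(log_text):
    """Return {step_type: method} with the lowest reported time.

    Assumption: the sieve keeps the fastest variant per step type.
    """
    best = {}
    for step, method, time in TIMING.findall(log_text):
        t = int(time)
        if step not in best or t < best[step][1]:
            best[step] = (method, t)
    return {step: method for step, (method, _) in best.items()}


if __name__ == "__main__":
    print(pick_methods(LOG))
```

For the timings above this gives gen/4 for the baby steps and gen/2 for the giant steps, matching the line the program printed.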
|
|
|
|
|
#321 |
|
Mar 2003
New Zealand
13×89 Posts |
These versions fix a serious bug in the Linux/x86-64 build.
The giant steps method gen/4 did not work correctly in sr5sieve versions 1.5.1-1.5.3 or sr1sieve versions 1.1.0-1.1.1. To check whether your work was affected, run with the -v switch to get a message like `Using baby step method gen/4, giant step method gen/4, ladder method gen/4.' If the giant step method is reported as gen/4 then the results are invalid. (The bug does not affect the baby steps or ladder methods.)
I have uploaded a test/benchmark program mulmodk8.zip for the x86-64 code here. It would be helpful if someone could run this on Core2 and AMD64 machines and post the results here. It can be run as ./mulmodk8 and if no error message is printed then all is well.
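For anyone with saved -v output, the check described above can be scripted. This is only a sketch in Python: it assumes the log contains a line in the format quoted in this post, and the function names are hypothetical. It says nothing about which build produced the log, so it only matters for the affected Linux/x86-64 binaries.

```python
import re

# Matches the giant step part of the "Using baby step method ..." line.
GIANT_STEP = re.compile(r"giant step method (gen/\d+)")


def giant_step_method(log_text):
    """Return the giant step method named in the log, or None if absent."""
    match = GIANT_STEP.search(log_text)
    return match.group(1) if match else None


def run_is_affected(log_text):
    """True if the log reports gen/4 for the giant steps."""
    return giant_step_method(log_text) == "gen/4"


if __name__ == "__main__":
    bad = "Using baby step method gen/4, giant step method gen/4, ladder method gen/4."
    ok = "Using baby step method gen/4, giant step method gen/2."
    print(run_is_affected(bad), run_is_affected(ok))
```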
|
|
|
|
|
#322 |
|
"Jason Goatcher"
Mar 2005
DB3₁₆ Posts |
Does it affect Linux 64-bit sr2sieve-amd?
I'm going to run Eon until I get an answer. Since I expect to know the answer to my question within 24 hours, I'm going to keep my reservations. |
|
|
|
|
|
#323 |
|
Mar 2003
New Zealand
13·89 Posts |
sr5sieve-amd and sr5sieve-intel are 32-bit, so they are not affected even if you are running them on a 64-bit machine.
Only the binaries from the following archives are affected:
sr1sieve-1.1.0-linux-x86_64.tar.gz
sr1sieve-1.1.1-linux-x86_64.tar.gz
sr2sieve-1.5.1-linux-x86_64.tar.gz
sr2sieve-1.5.2-linux-x86_64.tar.gz
sr2sieve-1.5.3-linux-x86_64.tar.gz
sr5sieve-1.5.1-linux-x86_64.tar.gz
sr5sieve-1.5.2-linux-x86_64.tar.gz
sr5sieve-1.5.3-linux-x86_64.tar.gz
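If you keep the downloaded archives around, a quick way to check them against this list is sketched below in Python. The archive names are the ones listed in this post; `archive_is_affected` is a hypothetical helper, and the check only looks at the file name, not the contents.

```python
import os

# Archive names exactly as listed in this post.
AFFECTED_ARCHIVES = frozenset({
    "sr1sieve-1.1.0-linux-x86_64.tar.gz",
    "sr1sieve-1.1.1-linux-x86_64.tar.gz",
    "sr2sieve-1.5.1-linux-x86_64.tar.gz",
    "sr2sieve-1.5.2-linux-x86_64.tar.gz",
    "sr2sieve-1.5.3-linux-x86_64.tar.gz",
    "sr5sieve-1.5.1-linux-x86_64.tar.gz",
    "sr5sieve-1.5.2-linux-x86_64.tar.gz",
    "sr5sieve-1.5.3-linux-x86_64.tar.gz",
})


def archive_is_affected(path):
    """True if the file name (ignoring any directory) is on the list."""
    return os.path.basename(path) in AFFECTED_ARCHIVES


if __name__ == "__main__":
    # "downloads/" is just an example directory.
    print(archive_is_affected("downloads/sr1sieve-1.1.1-linux-x86_64.tar.gz"))
```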
|
|
|
|
|
#324 |
|
May 2005
2³·7·29 Posts |
That is bad news.
What is the nature of the error in previous x86-64 builds? Are some factors missed, or are some reported that are not actually factors? I would like to evaluate how much work I have to repeat... |
|
|
|
|
|
#325 |
|
May 2005
2³·7·29 Posts |
OK, the last backup I have is from May 12-th
Here is a benchmark for C2D E4300 @ 2.4GHz
Code:
./mulmodk8
length = 1000, iterations = 100000, b = 2, p = 4503599627370449:
Code  Vec     Rate       RDTSC
----  ---  -------  ----------
cmov    1  225.212  1751122971
gen     1  240.370  1551858453
gena    1  238.080  1533973842
genb    1  252.509  1441971999
genc    1  249.984  1445902668
gen     2  384.592   954202257
gen     4  438.570   832425624
gena    4  438.570   839892636
|
|
|
|
|
#326 | ||
|
Mar 2003
New Zealand
485₁₆ Posts |
Quote:
If the program found plenty of factors in a range without any error then it is very unlikely to have been affected by the bug (i.e. the gen/4 method was not being used for the giant steps).
edit: No incorrect factors would be reported without stopping with an error message. You don't have to worry that a prime has been wrongly removed from the sieve.
Quote:
Last fiddled with by geoff on 2007-05-25 at 02:24 |
||
|
|
|
|
|
#327 |
|
May 2005
3130₈ Posts |
I didn't have any such errors... Anyway, I will rerun the entire range and compare the results afterwards.
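A rough Python sketch of such a comparison is below. It assumes each factor is recorded on a line of the form `p | k*b^n+c` (for example `11 | 4*3^1-1`); that format and all function names here are assumptions, so adjust the pattern to whatever your factors file actually contains. It also rechecks each factor directly, since p divides k*b^n+c exactly when k*b^n+c is 0 mod p.

```python
import re

# Assumed line format: "p | k*b^n+c", e.g. "11 | 4*3^1-1".
FACTOR_LINE = re.compile(r"^(\d+) \| (\d+)\*(\d+)\^(\d+)([+-]\d+)$")


def parse_factor(line):
    """Parse one factor line into a tuple (p, k, b, n, c)."""
    match = FACTOR_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised factor line: %r" % line)
    return tuple(int(field) for field in match.groups())


def is_factor(p, k, b, n, c):
    """True if p divides k*b^n+c, using modular exponentiation."""
    return (k * pow(b, n, p) + c) % p == 0


def compare_runs(old_lines, new_lines):
    """Return (missed, extra): factors only in the rerun, and only in the old run."""
    old = {parse_factor(line) for line in old_lines if line.strip()}
    new = {parse_factor(line) for line in new_lines if line.strip()}
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    # Toy data for 4*3^n-1, not real sieve output.
    old_run = ["11 | 4*3^1-1"]
    rerun = ["11 | 4*3^1-1", "5 | 4*3^2-1"]
    missed, extra = compare_runs(old_run, rerun)
    print("missed by old run:", missed)
    print("only in old run:", extra)
    print("all rerun factors valid:", all(is_factor(*parse_factor(l)) for l in rerun))
```

Anything in the first list is a factor the old run failed to find; the second list should stay empty if the old run reported no wrong factors.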
|
|
|
|
|
|
#328 |
|
Mar 2003
New Zealand
13×89 Posts |
These versions improve the mulmod code for x86-64, based on Cruelty's C2D benchmark.
There are a few minor changes for the x86 version, but nothing that will be noticed in most cases. |
|
|
|
|
|
#329 | |
|
May 2005
2³×7×29 Posts |
Quote:
Last fiddled with by Cruelty on 2007-05-28 at 19:34 |
|
|
|
|
|
|
#330 |
|
Mar 2003
New Zealand
13·89 Posts |
These versions have improvements to the 32-bit SSE2 mulmod and powmod code. The main change is the use of 8-byte SSE2 reads to match the 8-byte FPU writes in tight loops, which is probably of most benefit to P4. The code also now interleaves 4 integer multiplications where the previous code interleaved 2 (so the sse2/16 method does 4x4 multiplications instead of 8x2), which should benefit other SSE2-capable machines too, I hope.
Also I realized that the compiler options for the *-amd binaries have never been right: the SSE2 code path was using the Athlon64 instruction set but Athlon32 scheduling. I don't know how much difference it makes, but it is fixed to use Athlon64 scheduling now. (The *-intel binaries use i686 scheduling for the SSE2 code path because GCC doesn't know about Core2 yet and the P4 scheduling is slower, even on a P4.)
Here are benchmarks at p=100e12 for my P4 (2.9GHz, 8K L1, 512K L2):
Code:
                        19k SoB.dat  68k riesel.dat  237k sr5data.txt
                        -----------  --------------  ----------------
sr2sieve-intel 1.4.42      377 kp/s        194 kp/s           85 kp/s
sr2sieve-intel 1.5.6       425 kp/s        223 kp/s           98 kp/s
|
|
|
|