|
|
#518 | |
|
"Mark"
Apr 2003
Between here and the
11·577 Posts |
Quote:
I found the error and committed a change to sourceforge. I have updated srsieve2.7z over at sourceforge as well.
Last fiddled with by rogue on 2021-01-29 at 18:37 |
|
|
|
|
|
|
#519 |
|
"Alexander"
Nov 2008
The Alamo City
688₁₀ Posts |
The current SVN version fails to run on Kubuntu 20.04:
Code:
$ ./srsieve2 -W "3" -n "50e3" -N "230e3" -P "1e9" -o 't17_b2.prp' -f B -s "37803*2^n-1"
srsieve2 v1.5, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with generic logic for p >= 3
Sieve started: 3 < p < 1e9 with 180001 terms (50000 < n < 230000, k*2^n+c) (expecting 170458 factors)
Sieving one sequence where abs(c) = 1 for p >= 37803
Split 1 base 2 sequence into 94 base 2^180 sequences.
malloc(): corrupted top size
Aborted (core dumped)
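To make the log above concrete, here is a brute-force sketch of what "a factor of a term" means for this sequence. It is only an illustration: the function name `terms_with_factor` is hypothetical, and this is not how srsieve2 searches (a real sieve uses far faster discrete-log methods). The term count matches the 180001 terms reported in the log.

```python
def terms_with_factor(k, b, c, n_min, n_max, p):
    """Return every n in [n_min, n_max] for which p divides k*b^n + c (brute force)."""
    hits = []
    power = pow(b, n_min, p)            # b^n_min mod p
    for n in range(n_min, n_max + 1):
        if (k * power + c) % p == 0:    # p divides this term
            hits.append(n)
        power = (power * b) % p         # advance to b^(n+1) mod p
    return hits

# Term count for the run above: n from 50000 to 230000 inclusive.
term_count = 230000 - 50000 + 1

# Which terms of 37803*2^n-1 does the small prime p = 5 remove?
hits = terms_with_factor(37803, 2, -1, 50000, 230000, 5)
```

For p = 5 the hits are exactly the n with n ≡ 1 (mod 4), so a single small prime already removes a quarter of the terms.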
|
|
|
|
|
#520 | |
|
"Mark"
Apr 2003
Between here and the
11×577 Posts |
Quote:
|
|
|
|
|
|
|
#521 |
|
"Mark"
Apr 2003
Between here and the
11·577 Posts |
I found and fixed the problem. The changes are committed to sourceforge.
|
|
|
|
|
|
#522 |
|
"Mark"
Apr 2003
Between here and the
11·577 Posts |
There seems to be an issue with the Legendre lookup table. If you do not use -l, then it will miss factors. It should be easy to track down, but one never knows. Note that -l disables the building of the Legendre lookup tables. It is enabled by default.
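For readers wondering what a Legendre lookup buys, here is a minimal sketch of the standard quadratic-residue test, assuming p is an odd prime that does not divide k, b, or c. It is not taken from the srsieve2 source, and the function names are hypothetical. If p divides k*b^n+c, then k*b^n ≡ -c (mod p); for even n this forces -c*k to be a square mod p, and for odd n it forces -c*k*b to be a square mod p. When neither holds, p can be skipped for the whole sequence.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)     # r is 0, 1, or p-1
    return -1 if r == p - 1 else r

def parities_possible(k, b, c, p):
    """Return (even_ok, odd_ok): whether p could divide k*b^n + c for even / odd n.

    This is a necessary condition only; passing the test does not
    guarantee that some n in the sieve range is actually divisible by p.
    """
    even_ok = legendre(-c * k, p) == 1
    odd_ok = legendre(-c * k * b, p) == 1
    return even_ok, odd_ok

# 37803*2^n-1: p = 5 can only divide odd-n terms, p = 7 can divide none.
print(parities_possible(37803, 2, -1, 5))
print(parities_possible(37803, 2, -1, 7))
```

A precomputed table would answer the same question without a modular exponentiation per prime, which is presumably why a bug in it shows up as missed factors rather than as a crash.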
|
|
|
|
|
|
#523 |
|
"Mark"
Apr 2003
Between here and the
11×577 Posts |
This is now fixed.
|
|
|
|
|
|
#524 |
|
"Mark"
Apr 2003
Between here and the
18CB₁₆ Posts |
BTW, now with this change the speed of srsieve2 (for CisOne logic) is within 5% of the speed of sr1sieve (with x86 asm) and about 10% faster than the speed of sr1sieve (with no x86 asm). By "within" I mean that sometimes it is faster and sometimes it is slower. The speed difference appears to be one of cache usage and CPU load on the machine overall. Note this was only tested with a single sequence so it is possible that other sequences will yield different results.
I will have to play around with unrolling some of the loops in srsieve2 to see if I can do better, but right now I'm pleased to see that it is performing so well, considering it didn't look so good earlier this week. My intention is to post a build after I track down the issue with the CisOne logic in srsieve2cl.
|
|
|
|
|
#525 |
|
"Mark"
Apr 2003
Between here and the
14313₈ Posts |
Great news! I have tracked down and squashed the known bugs in srsieve2 and srsieve2cl. I have some benchmarks to share.
The CPU is an Intel i78-8550H at 2.6 GHz and the GPU is an NVIDIA Quadro P3200. I was running no other CPU/GPU intensive processes during this test. All runs yielded the same set of factors. I sieved 37803*2^n-1 for n from 5e4 to 25e4 up to 1e6. I then ran the file thru srsieve2, srsieve2cl, and sr1sieve, taking the average of 5 runs. Here are the results:
Code:
srsieve2 -i b2_n.in -P1e10                            504
srsieve2 -i b2_n.in -P1e10 -l                         647
srsieve2cl -i b2_n.in -P1e10                          355
srsieve2cl -i b2_n.in -P1e10 -l                       353
srsieve2cl -i b2_n.in -P1e10 -g100                    221
srsieve2cl -i b2_n.in -P1e10 -g100 -l                 210
srsieve2cl -i b2_n.in -P1e10 -g1000                   184
srsieve2cl -i b2_n.in -P1e10 -g1000 -l                183
sr1sieve -i b2_n.in -P1e10 -ffact.out (asm)           460
sr1sieve -i b2_n.in -P1e10 -ffact.out -x (asm)        562
sr1sieve -i b2_n.in -P1e10 -ffact.out (no asm)        455
sr1sieve -i b2_n.in -P1e10 -ffact.out -x (no asm)     549
srsieve2cl with -g1000 clearly beats out everything else. With -g1000 it uses less than 500 MB of GPU memory (per Windows Task Manager). It will be interesting to see this run on lower-end GPUs to see how they compare.
So with this report, mtsieve 2.1.6 is now released. Here are the changes:
Code:
framework:
Add largestPrimeTested parameter to NotifyAppToRebuild() as the app cannot rely
on accurately determining that value.
srsieve2, srsieve2cl: version 1.5
Fixed remaining known issues with CisOne logic (sequences where abs(c) = 1) for
a single CisOne sequence (sr1sieve).
Added OpenCL code for CisOne logic.
Added Legendre table lookups for CisOne logic.
|
|
|
|
|
|
#526 |
|
Romulan Interpreter
Jun 2011
Thailand
10010110001010₂ Posts |
|
|
|
|
|
|
#527 |
|
Dec 2011
After milion nines:)
5×17² Posts |
Does srsieve2cl with -g1000 kill sr1sieve in speed?
|
|
|
|
|
|
#528 |
|
"Mark"
Apr 2003
Between here and the
14313₈ Posts |
Based upon the single sequence I tested on the hardware I described, srsieve2cl with -g1000 is more than twice as fast as sr1sieve. With -g100 it is slightly more than twice as fast as sr1sieve. With a higher value for -g, it could possibly be 3x faster, but that is on this hardware.
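The ratios quoted here can be checked against the figures in post #525. The sketch below assumes those figures are elapsed times (lower is better) and uses sr1sieve with asm as the baseline; the units are whatever that post reported.

```python
# Timings copied from the benchmark table in post #525.
timings = {
    "sr1sieve (asm)": 460,
    "srsieve2": 504,
    "srsieve2cl": 355,
    "srsieve2cl -g100": 221,
    "srsieve2cl -g1000": 184,
}

baseline = timings["sr1sieve (asm)"]

# Speedup relative to sr1sieve: values above 1 mean faster than the baseline.
speedups = {name: round(baseline / t, 2) for name, t in timings.items()}

for name, s in speedups.items():
    print(f"{name:20s} {s:.2f}x")
```

On these numbers -g1000 comes out at 2.5x and -g100 at about 2.08x, which matches "more than twice" and "slightly more than twice" for this one sequence on this hardware.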
|
|
|
|