2021-01-29, 18:09  #518
"Mark"
Apr 2003
Between here and the
3·7·13·23 Posts 
Quote:
I found the error and committed a change to SourceForge. I have updated srsieve2.7z over at SourceForge as well.
Last fiddled with by rogue on 2021-01-29 at 18:37

2021-02-02, 08:59  #519
"Alexander"
Nov 2008
The Alamo City
2·3·5·19 Posts 
The current SVN version fails to run on Kubuntu 20.04:
Code:
$ ./srsieve2 -W "3" -n "50e3" -N "230e3" -P "1e9" -o 't17_b2.prp' -f B -s "37803*2^n-1"
srsieve2 v1.5, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with generic logic for p >= 3
Sieve started: 3 < p < 1e9 with 180001 terms (50000 < n < 230000, k*2^n+c) (expecting 170458 factors)
Sieving one sequence where abs(c) = 1 for p >= 37803
Split 1 base 2 sequence into 94 base 2^180 sequences.
malloc(): corrupted top size
Aborted (core dumped)
2021-02-02, 13:21  #520
"Mark"
Apr 2003
Between here and the
1100010000111_{2} Posts 
Quote:


2021-02-02, 15:47  #521
"Mark"
Apr 2003
Between here and the
1887_{16} Posts 
I found and fixed the problem. The changes are committed to sourceforge.

2021-02-02, 19:24  #522
"Mark"
Apr 2003
Between here and the
1100010000111_{2} Posts 
There seems to be an issue with the Legendre lookup table. If you do not use -l, then it will miss factors. It should be easy to track down, but one never knows. Note that -l disables the building of the Legendre lookup tables. The tables are built by default.
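For readers unfamiliar with why a Legendre lookup can affect which factors are found, here is a minimal Python sketch of the general idea. It is not taken from srsieve2's source; the function names `legendre` and `possible_parities` are hypothetical, and the sketch assumes a sequence of the form k*2^n-1. If an odd prime p divides k*2^n-1, then k must be a quadratic residue mod p when n is even, and 2k must be one when n is odd, so a prime can be skipped for parities that fail the test. A bug in such a filter would discard primes that do divide a term, which would show up as missed factors.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is always 0, 1, or p - 1


def possible_parities(k, p):
    """Parities of n (0 = even, 1 = odd) for which p could divide k*2^n - 1.

    Illustration only: assumes p is an odd prime and the base is 2.
    """
    parities = set()
    if legendre(k, p) == 1:
        parities.add(0)  # n even: 2^n is a square, so k must be a square mod p
    if legendre(2 * k, p) == 1:
        parities.add(1)  # n odd: 2^n is 2 times a square, so 2k must be a square
    return parities


if __name__ == "__main__":
    k = 37803  # the k value used elsewhere in this thread
    for p in (7, 11, 13, 17):
        print(p, sorted(possible_parities(k, p)))
```

An empty set for a given p means that prime can never divide any term of the sequence, so a sieve can skip it entirely.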

2021-02-03, 14:07  #523
"Mark"
Apr 2003
Between here and the
14207_{8} Posts 
This is now fixed.

2021-02-03, 15:30  #524
"Mark"
Apr 2003
Between here and the
3·7·13·23 Posts 
BTW, now with this change the speed of srsieve2 (for CisOne logic) is within 5% of the speed of sr1sieve (with x86 asm) and about 10% faster than the speed of sr1sieve (with no x86 asm). By "within" I mean that sometimes it is faster and sometimes it is slower. The speed difference appears to be one of cache usage and CPU load on the machine overall. Note this was only tested with a single sequence so it is possible that other sequences will yield different results.
I will have to play around with unrolling some of the loops in srsieve2 to see if I can do better, but right now I'm pleased to see that it is performing so well, considering it didn't look so good earlier this week. My intention is to post a build after I track down the issue with the CisOne logic in srsieve2cl.
2021-02-04, 01:40  #525
"Mark"
Apr 2003
Between here and the
3·7·13·23 Posts 
Great news! I have tracked down and squashed the known bugs in srsieve2 and srsieve2cl. I have some benchmarks to share.
The CPU is an Intel i7-8850H at 2.6 GHz and the GPU is an NVIDIA Quadro P3200. I was running no other CPU/GPU intensive processes during this test. All runs yielded the same set of factors. I sieved 37803*2^n-1 for n from 5e4 to 25e4 up to p = 1e6. I then ran the file through sr1sieve, srsieve2, and srsieve2cl, taking the average of 5 runs. Here are the results:
Code:
srsieve2 -i b2_n.in -P1e10                          504
srsieve2 -i b2_n.in -P1e10 -l                       647
srsieve2cl -i b2_n.in -P1e10                        355
srsieve2cl -i b2_n.in -P1e10 -l                     353
srsieve2cl -i b2_n.in -P1e10 -g100                  221
srsieve2cl -i b2_n.in -P1e10 -g100 -l               210
srsieve2cl -i b2_n.in -P1e10 -g1000                 184
srsieve2cl -i b2_n.in -P1e10 -g1000 -l              183
sr1sieve -i b2_n.in -P1e10 -ffact.out (asm)         460
sr1sieve -i b2_n.in -P1e10 -ffact.out -x (asm)      562
sr1sieve -i b2_n.in -P1e10 -ffact.out (no asm)      455
sr1sieve -i b2_n.in -P1e10 -ffact.out -x (no asm)   549
It is clear that srsieve2cl with -g1000 beats out everything else. With -g1000 it uses less than 500 MB of GPU memory (per Windows Task Manager). It will be interesting to see this run on lower-end GPUs to see how they compare.
So with this report, mtsieve 2.1.6 is now released. Here are the changes:
Code:
framework:
   Add largestPrimeTested parameter to NotifyAppToRebuild() as the app cannot rely on accurately determining that value.
srsieve2, srsieve2cl: version 1.5
   Fixed remaining known issues with CisOne logic (sequences where abs(c) = 1) for a single CisOne sequence (sr1sieve).
   Added OpenCL code for CisOne logic.
   Added Legendre table lookups for CisOne logic.
2021-02-04, 02:05  #526
Romulan Interpreter
Jun 2011
Thailand
3^{3}×347 Posts 

2021-02-04, 12:25  #527
Dec 2011
After milion nines:)
10110001011_{2} Posts 
Does srsieve2cl with -g1000 kill sr1sieve in speed?

2021-02-04, 13:23  #528
"Mark"
Apr 2003
Between here and the
3×7×13×23 Posts 
Based upon the single sequence I tested and the hardware specs I provided, srsieve2cl with -g1000 is more than twice as fast as sr1sieve. With -g100 it is slightly more than twice as fast as sr1sieve. With a higher value for -g, it could possibly be 3x faster, but that is on this hardware.
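To make the "more than twice as fast" figures easy to check, the short Python sketch below recomputes the ratios from the timings reported in post #525. The dictionary keys are just labels chosen for this example, and it assumes the reported numbers are run times where smaller means faster (the post does not state the unit).

```python
# Timings copied from the benchmark table in post #525.
# Assumption: these are run times, so a smaller value means a faster run.
timings = {
    "srsieve2": 504,
    "srsieve2cl": 355,
    "srsieve2cl_g100": 221,
    "srsieve2cl_g1000": 184,
    "sr1sieve_asm": 460,
    "sr1sieve_no_asm": 455,
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    return timings[baseline] / timings[candidate]


if __name__ == "__main__":
    for name in ("srsieve2cl_g100", "srsieve2cl_g1000"):
        ratio = speedup("sr1sieve_asm", name)
        print(f"{name}: {ratio:.2f}x vs sr1sieve (asm)")
```

On these numbers the ratios come out to about 2.08x for the g100 run and 2.50x for the g1000 run against sr1sieve with asm, consistent with the description above for this one sequence on this hardware.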
