#133 |
"Mark"
Apr 2003
Between here and the
14674₈ Posts |
Give this a spin for the range you are sieving. It works for 7-digit n, and all factors are verified. It should be much faster than what you are using. The OpenCL code has to wait. The calls to mpz_nextprime() have a significant impact on performance as p grows.
Last fiddled with by rogue on 2022-01-11 at 16:31 |
#134 |
Mar 2006
2×7×37 Posts |
Thanks for this! I was able to quickly finish sieving with primes from 1e10 to 1e11 with this. I'm now sieving with primes from 1e11 to 1e12 which looks like it will finish on the 19th.
#135 |
"Mark"
Apr 2003
Between here and the
1100110111100₂ Posts |
I hope to work on the OpenCL code between tomorrow and Monday. Fortunately for me Monday is a company holiday (MLK).
#136 |
"Mark"
Apr 2003
Between here and the
2²×3³×61 Posts |
Happy MLK day (to the Americans out there).
Attached are smsieve and smsievecl (for the GPU). These support Sm(n) for n from 100000 to 9999999. Based on my testing, the GPU kernel for 6-digit n is about 50x faster than the CPU code, and the GPU kernel for 7-digit n is about 20x faster. All factors are verified by the code, although you can still use -O to output a file that you can use to verify externally with pfgw.
On my system, smsievecl -ism7.in -p1e8 -P2e8 -Ofact_gpu.out -g16 -M150 completed in 615 seconds, finding 6594 factors. smsieve -ism7.in -p1e8 -P2e8 -Ofact_cpu.out will need to run another 3 hours to finish. So far there are no discrepancies.
If you choose to run smsievecl, you might want to adjust -g (default is 10) and -S (default is 200000). -S indicates how many n are tested per GPU call; if -S is too high you might experience a lag in screen updates. -g determines how many primes are tested per GPU call.
All code is committed to SVN on SourceForge. Outside of makefile changes, it should build on OS X and Linux. There will be an official release soon. Have fun!
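For anyone who wants an independent check of a reported factor without pfgw, here is a minimal Python sketch. It assumes Sm(n) means the decimal concatenation of 1, 2, ..., n, and it is not the verification code inside smsieve; the function names smarandache_mod and is_factor are made up for illustration.

```python
def smarandache_mod(n: int, p: int) -> int:
    """Return Sm(n) mod p, assuming Sm(n) is the decimal
    concatenation of the integers 1, 2, ..., n."""
    residue = 0
    shift = 10          # 10**(number of decimal digits of k)
    next_boundary = 10  # smallest k that needs one more digit
    for k in range(1, n + 1):
        if k == next_boundary:
            shift *= 10
            next_boundary *= 10
        # Appending k to the running number multiplies it by
        # 10**digits(k) and adds k; reduce mod p at every step.
        residue = (residue * (shift % p) + k) % p
    return residue


def is_factor(n: int, p: int) -> bool:
    """True if p divides Sm(n) under the assumption above."""
    return smarandache_mod(n, p) == 0


if __name__ == "__main__":
    # Sm(3) = 123 = 3 * 41, so both 3 and 41 should be reported as factors.
    print(is_factor(3, 3), is_factor(3, 41))
```

The loop is linear in n, so a single check for a 7-digit n takes a few seconds in plain Python. That is fine for spot-checking lines from a -O factor file but far too slow for sieving.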
#137 | |
Jun 2012
Boulder, CO
369₁₀ Posts |
Quote:
Code:
$ ./smsievecl -ism_in.txt -p1e8 -P1e11 -Ofact_gpu.out -g128 -M150 -o smar.txt
smsievecl v1.0, a program to find factors of Smarandache numbers
GPU primes per worker is 3538944
Sieve started: 1e8 < p < 1e11 with 13547 terms (1000039 <= n <= 1999879) (expecting 3694 factors)
p=3294959417, 2.496M p/sec, 723 factors found at 6.719 f/sec (last 1 min), ...
p=96774231461, 3.266M p/sec, 2423 factors found at 1.036 f/sec (last 20 min), 96.7% done.
Sieve completed at p=100088593453.
CPU time: 1252.91 sec. (41.45 sieving) (0.99 cores) GPU time: 1159.14 sec.
11106 terms written to smar.txt
Primes tested: 4115791872. Factors found: 2441. Remaining terms: 11106. Time: 1257.51 seconds.
Segmentation fault (core dumped)
#138 |
"Mark"
Apr 2003
Between here and the
2²×3³×61 Posts |
Nice speed! That is about 10x faster than my GPU. It might be faster with a higher value for -g. Have you tried that? What is the comparable speed on your CPU?
I don't get a segfault on Windows. At least the output files were written. I will have to see if I can reproduce it on OS X. It is probably a simple fix.
Last fiddled with by rogue on 2022-01-18 at 13:46 |
#139 | |
Jun 2012
Boulder, CO
3²×41 Posts |
Quote:
Code:
./smsievecl -ismar.txt -p1e11 -P1e13 -Ofact_gpu.out -g128 -M150 -o smar2.txt
#140 |
Mar 2006
518₁₀ Posts |
After sieving 1e6 to 2e6 with primes up to 1e11 I had 11107 candidates remaining. Ryan has one less than me because his sieve went a little over 1e11 finding an extra factor and removing that one extra candidate. So, cpu and gpu results match up to 1e11.
My cpu sieving from 1e11 to 1e12 finally finished.
Code:
smsieve -i sm7_01e6_02e6_ns1e11_c11107.txt -O sm7_01e6_02e6_s1e11_1e12_factors.txt -o sm7_01e6_02e6_ns1e12.txt -W 10 -p 1e11 -P 1e12
smsieve v1.0, a program to find factors of Smarandache numbers
Sieve started: 1e11 < p < 1e12 with 11107 terms (1000129 <= n <= 1999801) (expecting 926 factors)
p=100104257543, 68.67K p/sec, 1 factors found at 605 sec per factor (last 1 mi
p=100210311557, 68.38K p/sec, 4 factors found at 303 sec per factor (last 2 mi
...
p=999832895063, 72.48K p/sec, 908 factors found at 5940 sec per factor (last 7
p=999955146917, 72.48K p/sec, 908 factors found at 5949 sec per factor (last 7199 min), 100.0% done. ETC 2022-01-18 20:19
Sieve completed at p=1000000000063.
CPU time: 4646505.62 sec. (9664.12 sieving) (9.84 cores)
10199 terms written to sm7_01e6_02e6_ns1e12.txt
Primes tested: 33489857208. Factors found: 908. Remaining terms: 10199. Time: 472406.38 seconds.
Last fiddled with by WraithX on 2022-01-19 at 04:37 |
#141 |
Jun 2012
Boulder, CO
3²×41 Posts |
Down to 9465 terms now. Once again, it wrote the output files, then segfaulted. Factors-found and remaining terms attached.
Code:
p=9991084976051, 3.741M p/sec, 1641 factors found at 108 sec per factor
p=9997864667647, 3.741M p/sec, 1641 factors found at 108 sec per factor (last 1513 min), 99.9% done. ETC 2022-01-19 14:39
Sieve completed at p=10000089230221.
CPU time: 91882.82 sec. (5024.92 sieving) (0.99 cores) GPU time: 85979.74 sec.
9465 terms written to smar2.txt
Primes tested: 341950464000. Factors found: 1641. Remaining terms: 9465. Time: 92312.05 seconds.
Segmentation fault (core dumped)
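The term counts in the runs so far can be cross-checked with a few lines of Python. This is only a bookkeeping sketch: it assumes every factor found removes exactly one term from the sieve, and all the numbers are copied from the summary lines posted in this thread. The labels and the helper name check_run are made up for illustration.

```python
# (label, terms at start, factors found, remaining terms reported)
RUNS = [
    ("GPU, 1e8 < p < 1e11", 13547, 2441, 11106),
    ("CPU, 1e11 < p < 1e12", 11107, 908, 10199),
    ("GPU, 1e11 < p < 1e13", 11106, 1641, 9465),
]


def check_run(start: int, factors: int, reported: int) -> bool:
    """Assume one term is removed per factor found and compare
    the resulting count with the reported remaining terms."""
    return start - factors == reported


if __name__ == "__main__":
    for label, start, factors, reported in RUNS:
        status = "ok" if check_run(start, factors, reported) else "MISMATCH"
        print(f"{label}: {start} - {factors} = {start - factors} "
              f"(reported {reported}) {status}")
```

All three runs are consistent under that assumption. The one-term difference between the CPU and GPU starting counts (11107 versus 11106) is the extra factor the GPU run found slightly above 1e11.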
#142 | |
"Mark"
Apr 2003
Between here and the
1100110111100₂ Posts |
Quote:
As for tracking down the segfault, can you build with debug=yes (in the makefile) and run via gdb to find the line of code causing it? You can use a very small range of primes, e.g. 1e9 to 2e9, and see if you can reproduce it.
#143 | |
Mar 2006
2·7·37 Posts |
Quote:
Code:
(gdb) r
Starting program: \smsieve_r165_debug\smsievecl.exe -ism_in.txt -p1e9 -P2e9 -Ofact_gpu.out -g128 -M150 -o smar.txt
[New Thread 10200.0xf34]
[New Thread 10200.0x1004]
[New Thread 10200.0x2938]
[New Thread 10200.0x2ab8]
[New Thread 10200.0x21b0]
[New Thread 10200.0x18c4]
[New Thread 10200.0x2b04]
[New Thread 10200.0x2a04]
[New Thread 10200.0x2890]
smsievecl v1.0, a program to find factors of Smarandache numbers
[New Thread 10200.0x1780]
GPU primes per worker is 262144
Sieve started: 1e9 < p < 2e9 with 13547 terms (1000039 <= n <= 1999879) (expecting 438 factors)
p=1169060701, 134.7K p/sec, 94 factors found at 1.23 sec per factor (last 1 mi
p=1339306091, 134.0K p/sec, 179 factors found at 1.30 sec per factor (last 2 m
p=1516121771, 135.2K p/sec, 250 factors found at 1.40 sec per factor (last 3 m
p=1693928333, 135.8K p/sec, 322 factors found at 1.46 sec per factor (last 4 m
p=1872633797, 136.1K p/sec, 401 factors found at 1.46 sec per factor (last 5 m[Thread 10200.0x1780 exited with code 0]
Sieve completed at p=2001565063.
CPU time: 344.03 sec. (13451671603788.20 sieving) (0.99 cores) GPU time: 326.07 sec.
13110 terms written to smar.txt
Primes tested: 47448064. Factors found: 437. Remaining terms: 13110. Time: 347.18 seconds.
Thread 1 received signal SIGSEGV, Segmentation fault.
0x0000000000407d66 in xfree (memoryPtr=0x77300c0) at core/main.cpp:276
276 allocatedSize = *(size_t *) (currentPtr + 8);
(gdb)