mersenneforum.org  

Old 2023-06-19, 22:25   #1255
Citrix

Code:
twinsieve.exe -W1 -k2 -K7001 -n7001 -b2 -p3 -P1e11 -fA -t1
twinsieve v1.6.1, a program to find factors of k*b^n+1/-1 numbers for fixed b and n and variable k
Sieve started: 3 <= p <= 1e11 with 3500 terms (3 < k < 7001, k*2^7001) (expecting 3348 factors)
Increasing worksize to 1600000 since each chunk is tested in less than a second
Increasing worksize to 40000000 since each chunk is tested in less than a second
  p=35243280493, 25.13M p/sec, 3499 factors found at 84.38 f/sec (last 1 min), 35.2% done. ETC 2023-06-19 17:25
  p=69548741681, 23.96M p/sec, 3499 factors found at 53.02 f/sec (last 2 min), 69.5% done. ETC 2023-06-19 17:25
Old 2023-06-19, 22:53   #1256
gd_barnes

Quote:
Originally Posted by Citrix View Post
Code:
twinsieve.exe -W1 -k2 -K7001 -n7001 -b2 -p3 -P1e11 -fA -t1
twinsieve v1.6.1, a program to find factors of k*b^n+1/-1 numbers for fixed b and n and variable k
Sieve started: 3 <= p <= 1e11 with 3500 terms (3 < k < 7001, k*2^7001) (expecting 3348 factors)
Increasing worksize to 1600000 since each chunk is tested in less than a second
Increasing worksize to 40000000 since each chunk is tested in less than a second
  p=35243280493, 25.13M p/sec, 3499 factors found at 84.38 f/sec (last 1 min), 35.2% done. ETC 2023-06-19 17:25
  p=69548741681, 23.96M p/sec, 3499 factors found at 53.02 f/sec (last 2 min), 69.5% done. ETC 2023-06-19 17:25
It is down to the final term remaining. I suspect it may have something to do with not wanting to remove that final term.

To confirm this, you could expand the k range to -k2 -K100000. If that term no longer shows, then that is the problem.

To confirm it further, you could even test a narrower k-range such as 2 to 4000. If this is the problem, you should end up with a different single term remaining since the original one was k>4000.
Old 2023-06-19, 23:03   #1257
Citrix

Quote:
Originally Posted by gd_barnes View Post
It is down to the final term remaining. I suspect it may have something to do with not wanting to remove that final term.

To confirm this, you could expand the k range to -k2 -K100000. If that term no longer shows, then that is the problem.

To confirm it further, you could even test a narrower k-range such as 2 to 4000. If this is the problem, you should end up with a different single term remaining since the original one was k>4000.
The problem is that it is removing numbers that are not composite and leaving composites in the output file.
4879*2^7001+-1 is the remaining term, and it has small factors.

I believe this bug was introduced in the latest version, the one in which even k values are removed for base 2. The program is still finding factors for even k values.
Code:
3 | 4810*2^7001+1
3 | 4816*2^7001+1
3 | 4822*2^7001+1
3 | 4828*2^7001+1
3 | 4852*2^7001+1
3 | 4858*2^7001+1
3 | 4870*2^7001+1
3 | 4876*2^7001+1
3 | 4888*2^7001+1
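
The listed factors are easy to check independently with modular exponentiation. A minimal Python sketch follows; the helper name `divides` is only for illustration and is not part of twinsieve.

```python
def divides(p: int, k: int, b: int, n: int, c: int) -> bool:
    """Return True if p divides k*b^n + c, without building the huge number."""
    return (k * pow(b, n, p) + c) % p == 0

# Even k values from the factor list above, all reported as divisible by 3.
even_ks = [4810, 4816, 4822, 4828, 4852, 4858, 4870, 4876, 4888]
for k in even_ks:
    print(k, divides(3, k, 2, 7001, +1))

# The term left in the output file: check the +1 and -1 sides against 3.
print(divides(3, 4879, 2, 7001, +1), divides(3, 4879, 2, 7001, -1))
```

Since 2^7001 ≡ 2 (mod 3), 3 divides k*2^7001+1 exactly when k ≡ 1 (mod 3). That holds for every k in the list and also for k=4879, so the +1 side of the remaining term is divisible by 3.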

Last fiddled with by Citrix on 2023-06-19 at 23:11
Old 2023-06-19, 23:52   #1258
rogue

I now know what the problem is with twinsieve. I will fix it.
Old 2023-06-20, 18:21   #1259
rogue

I have committed the change for twinsieve. Regarding srsieve2, I have been able to reproduce the issue reliably, which will make it easier for me to debug. sr2sieve finds the factors, but srsieve2 does not (when using Legendre tables). I do not know how long it will take me to find and fix the issue.

One thing I did not expect from this test was that srsieve2 would be 30x faster than sr2sieve for these five sequences of S777. I'm sure some of that is because sr2sieve was built without the asm code and with -O2, while srsieve2 was built with -O3 but has no asm either. This is probably not true for higher p, but I have no explanation for why srsieve2 is that much faster. In other comparison tests I have run, sr2sieve was faster than srsieve2.
Old 2023-06-22, 03:42   #1260
ryanp
xyyxsievecl behavior

I'm doing some simple tests with xyyxsievecl, and would like to understand the output:

Code:
$ ./xyyxsievecl -g 16 -G 16 -P 1e10 -x 300e3 -X 301e3 -y 150e3 -Y 151e3 -s '+' -o out.txt -M 1e6
xyyxsieve v2.0, a program to find factors numbers of the form x^y+y^x or x^y-y^x
Quick elimination of terms info (in order of check):
    0 because y >= x
    501001 because the term is even
    95143 because x and y have a common divisor
GPU primes per worker is 442368
Sieve started: 3 <= p <= 1e10 with 405857 terms (300000 <= x <= 301000, 150000 <= y <= 151000) (expecting 386492 factors)
  p=0, 0.000 p/sec, 97708 factors found at 342.2 f/sec (last 1 min)
  p=0, 0.000 p/sec, 102860 factors found at 91.61 f/sec (last 2 min)
  p=0, 0.000 p/sec, 195274 factors found at 100.6 f/sec (last 3 min)
  p=0, 0.000 p/sec, 284963 factors found at 101.6 f/sec (last 4 min)
  p=20952199, 4.358K p/sec, 376301 factors found at 89.79 f/sec (last 5 min)
  p=6462271, 7.261K p/sec, 384344 factors found at 70.84 f/sec (last 6 min)
  p=6462271, 15.55K p/sec, 384620 factors found at 59.50 f/sec (last 7 min)
  p=36131591, 19.96K p/sec, 384992 factors found at 54.15 f/sec (last 8 min), 0.3% done. ETC 2023-06-23 17:00
  p=51694691, 22.58K p/sec, 385334 factors found at 44.99 f/sec (last 9 min)
  p=51694691, 26.85K p/sec, 385643 factors found at 40.45 f/sec (last 10 min), 0.5% done. ETC 2023-06-23 12:16
Why does p stay at 0 for a while, then suddenly jump up, drop back down, etc (and occasionally remain stuck for a while at the same value)? Is this expected behavior, something to do with GPU threading, or a legitimate bug?

(Note: this is on an A100 GPU, and the overall speed isn't remotely close to what I'd expect.)
Old 2023-06-22, 11:49   #1261
kruoli

Yes, it seems to be related to threading. Usually, you should not start a fresh sieve with the ...cl variants; instead, start with the non-cl variant up to p=1e9 (for example) and then let the ...cl variant take over.

You should set -g to 108 (the number of SMs) and leave out -G to begin with.

Edit: Setting -M to such a large number is usually a sign that something is not being used as intended. You should be able to leave it out when doing it the way I described above.

Last fiddled with by kruoli on 2023-06-22 at 12:13 Reason: Addition. Correction.
Old 2023-06-22, 12:59   #1262
kruoli

Code:
$ ./xyyxsieve -P 1e8 -x 300e3 -X 301e3 -y 150e3 -Y 151e3 -s '+' -o ryanp.txt
xyyxsieve v1.8.1, a program to find factors numbers of the form x^y+y^x or x^y-y^x
Quick elimination of terms info (in order of check):
    0 because y >= x
    501001 because the term is even
    95143 because x and y have a common divisor
Sieve started: 3 < p < 1e8 with 405857 terms (300000 <= x <= 301000, 150000 <= y <= 151000) (expecting 381651 factors)
Decreasing worksize to 2528 since each chunk needs more than 5 seconds to test
Increasing worksize to 10112 since each chunk is tested in less than a second
Decreasing worksize to 5056 since each chunk needs more than 5 seconds to test
  p=55030691, 3.592K p/sec, 383736 factors found at 1.879 f/sec (last 1 min), 55.0% done. ETC 2023-06-22 14:55
Increasing worksize to 20224 since each chunk is tested in less than a second
Decreasing worksize to 10112 since each chunk needs more than 5 seconds to test
  p=99688909, 3.769K p/sec, 384391 factors found at 0.49 sec per factor (last 20 min), 99.6% done. ETC 2023-06-22 14:53
Sieve completed at p=100000037.
CPU time: 1533.52 sec. (0.04 sieving) (1.00 cores)
21464 terms written to ryanp.txt
Primes tested: 5761456.  Factors found: 384393.  Remaining terms: 21464.  Time: 1528.32 seconds.
$ ./xyyxsievecl -P 1e10 -i ryanp.txt -o ryanp2.txt -g 84 -H -O ryanp.factors
xyyxsieve v1.8.1, a program to find factors numbers of the form x^y+y^x or x^y-y^x
GPU primes per worker is 1806336
Sieve started: 100000037 < p < 1e10 with 21464 terms (300000 <= x <= 301000, 150001 <= y <= 151000) (expecting 4292 factors)
  p=9928036067, 332.5K p/sec, 4244 factors found at 1.582 f/sec (last 22 min), 99.2% done. ETC 2023-06-22 15:18
Sieve completed at p=10011193811.
CPU time: 1354.18 sec. (3.49 sieving) (1.00 cores) GPU time: 1345.37 sec.
17212 terms written to ryanp2.txt
Primes tested: 449777664.  Factors found: 4252.  Remaining terms: 17212.  Time: 1349.22 seconds.
Much better, no?
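
The "expecting N factors" figures in the logs in this thread appear consistent with the usual Mertens-style heuristic: sieving from pmin to pmax is expected to remove a fraction 1 - ln(pmin)/ln(pmax) of the terms. That the programs actually compute it this way is an assumption based only on the numbers matching; `expected_factors` is a hypothetical helper, not a function from the sievers' source.

```python
import math

def expected_factors(terms: int, pmin: float, pmax: float) -> int:
    """Heuristic count of terms removed when sieving from pmin to pmax."""
    return int(terms * (1.0 - math.log(pmin) / math.log(pmax)))

# (terms, pmin, pmax) taken from the "Sieve started" lines in this thread.
runs = [
    (3500, 3, 1e11),            # twinsieve run
    (405857, 3, 1e10),          # xyyxsievecl started from scratch
    (405857, 3, 1e8),           # xyyxsieve on the CPU
    (21464, 100000037, 1e10),   # xyyxsievecl continuing the CPU sieve
]
for terms, pmin, pmax in runs:
    print(terms, pmin, pmax, expected_factors(terms, pmin, pmax))
```

With these inputs the sketch gives 3348, 386492, 381651, and 4292, the same values the logs report.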

Last fiddled with by kruoli on 2023-06-22 at 13:24 Reason: More recent data.
Old 2023-06-22, 13:01   #1263
rogue

Quote:
Originally Posted by ryanp View Post
I'm doing some simple tests with xyyxsievecl, and would like to understand the output:

Why does p stay at 0 for a while, then suddenly jump up, drop back down, etc (and occasionally remain stuck for a while at the same value)? Is this expected behavior, something to do with GPU threading, or a legitimate bug?

(Note: this is on an A100 GPU, and the overall speed isn't remotely close to what I'd expect.)
p is supposed to represent the largest p for which we know that all smaller p have been tested.

If I had to guess, some workers have no work, so the largest p they have tested is still 0. I can adjust the code to skip workers with no work when determining p.

I am more concerned that p decreased in the middle of the run. That is definitely a bug.
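
The idle-worker guess can be illustrated with a small Python sketch. It assumes the reported p is the minimum of each worker's largest tested prime; the `Worker` class and both functions are hypothetical and do not come from the actual xyyxsievecl source.

```python
from dataclasses import dataclass

@dataclass
class Worker:
    largest_p_tested: int  # 0 until the worker finishes its first chunk
    has_work: bool         # False if the worker was never handed a chunk

def reported_p_naive(workers: list[Worker]) -> int:
    """Minimum over all workers: one idle worker pins the result at 0."""
    return min(w.largest_p_tested for w in workers)

def reported_p_skip_idle(workers: list[Worker]) -> int:
    """Minimum over workers that have work; 0 if none do."""
    busy = [w.largest_p_tested for w in workers if w.has_work]
    return min(busy) if busy else 0

workers = [Worker(20952199, True), Worker(6462271, True), Worker(0, False)]
print(reported_p_naive(workers), reported_p_skip_idle(workers))
```

Skipping idle workers only addresses p being stuck at 0. It does not explain p decreasing in the middle of a run.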
Old 2023-06-22, 17:04   #1264
ryanp

Another bug... after a while, it crashes with:

Code:
  p=4685491913, 19.71K p/sec, 3838 factors found at 1.78 sec per factor (last 56 
p=4685491913, 19.37K p/sec, 3892 factors found at 1.78 sec per factor (last 57 
p=4737217409, 19.69K p/sec, 3982 factors found at 1.77 sec per factor (last 58 
p=4737217409, 19.36K p/sec, 4026 factors found at 1.78 sec per factor 
Fatal Error:  Something is wrong.  Counted terms (272597) != expected terms (276672)
Old 2023-06-22, 17:17   #1265
kruoli

Is this with the same command line? Or are you using something with -W or -G?
All times are UTC.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
