mersenneforum.org  

2019-10-31, 22:30   #1
MrRepunit
Mar 2011
Germany

gr-mfaktc: a CUDA program for generalized repunits prefactoring

Hi,
I have finally completed the generalized repunit version of mfaktc.

Changes compared to mfaktc-0.21:
- Implemented factoring of generalized repunits
- Removed Barrett and 72 bit kernels
- Removed Wagstaff related stuff
- Added 64 bit kernels
- Compiling with the more-classes flag seems to be slightly faster, thus it is switched on
- All bases >= 2 are allowed; the program might crash if the base is larger than roughly 100,000
- Implemented special cases for bases 2, 3, 5, 6, 7, 8, 10, 11, 12
- Lowered the lower limit for exponents from 100,000 to 50,000

The zip file contains the source code and executables for Linux and Windows (both 64 bit).

First, check that it runs correctly by running the self-test:
Code:
./gr-mfaktc.exe -st
This takes a few minutes and should give output similar to the following at the end:
Code:
Selftest statistics
  number of tests           31127
  successfull tests         31127

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |      0  |      0
  64bit_mul32        |   4633  |      0
  75bit_mul32        |   5712  |      0
  95bit_mul32        |   5918  |      0
  64bit_mul32_gs     |   4190  |      0
  75bit_mul32_gs     |   5248  |      0
  95bit_mul32_gs     |   5426  |      0

 selftest PASSED!
To trial-factor a single number from the command line (arguments: base, exponent, bit_min, bit_max), run
Code:
./gr-mfaktc.exe -tf 23 3300019 1 60
Example Output:
Code:
got assignment: base=23 exp=3300019 bit_min=1 bit_max=60 (0.05 GHz-days)
Starting trial factoring R[23]3300019 from 2^1 to 2^60 (0.05 GHz-days)
 k_min =  0
 k_max =  174684070698
Using GPU kernel "64bit_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve  | Exp        Base        bit-range
Oct 31 21:57 |    6   0.1% |  0.009    n.a. |    232.71    22837  | 3300019    23          1:60
R[23]3300019 has a factor: 39600229
Oct 31 21:57 | 1347  29.1% |  0.008    n.a. |    261.80    22837  | 3300019    23          1:60
R[23]3300019 has a factor: 1021252834106707
Oct 31 21:57 | 4619 100.0% |  0.011    n.a. |    190.40    22837  | 3300019    23          1:60
found 2 factors for R[23]3300019 from 2^ 1 to 2^60 [mfaktc 0.21 64bit_mul32_gs]
tf(): total time spent: 19.370s
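
The two reported factors can be sanity-checked independently of the GPU. The following is a minimal Python sketch, not part of gr-mfaktc: the function name `repunit_factor_info` is hypothetical, and the class count of 4620 is an assumption based on the more-classes build mentioned above. It relies on two facts: a prime q that does not divide base - 1 divides R[base]n = (base^n - 1)/(base - 1) exactly when base^n ≡ 1 (mod q), and for a prime exponent p such factors have the form q = 2kp + 1.

```python
def repunit_factor_info(base, exponent, q, num_classes=4620):
    """Return (is_factor, k, k_class) for a candidate prime factor q of R[base]exponent.

    num_classes=4620 is an assumption (more-classes build); it is only used
    to show which class the candidate's k value would fall into.
    """
    # q divides (base**exponent - 1) / (base - 1) when base**exponent == 1 (mod q)
    # and q does not divide base - 1.
    is_factor = pow(base, exponent, q) == 1 and (base - 1) % q != 0
    # For a prime exponent p, such factors have the form q = 2*k*p + 1.
    k, remainder = divmod(q - 1, 2 * exponent)
    if remainder != 0:
        return is_factor, None, None
    return is_factor, k, k % num_classes


if __name__ == "__main__":
    for q in (39600229, 1021252834106707):
        print(q, repunit_factor_info(23, 3300019, q))
```

The k classes computed this way can be compared with the "class" column printed just before each "has a factor" line in the output above.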
Alternatively, run it without parameters; it then reads its assignments from the worktodo.txt file:
Code:
Factor=bla,66362159,64,68
Factor=bla,base=17,1055167,1,64
The bla string is optional. The first line has no base= field, so it defaults to base=10.
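
To make the line format concrete, here is a small Python sketch of how such lines could be read. It is an illustration only and not the parser used by gr-mfaktc; the function name `parse_worktodo_line` and the returned dictionary keys are hypothetical. It assumes the optional label (bla) is non-numeric and that the remaining fields are exponent, bit_min and bit_max, in that order.

```python
def parse_worktodo_line(line, default_base=10):
    """Parse one 'Factor=' line into a dict (illustrative sketch only)."""
    line = line.strip()
    prefix = "Factor="
    if not line.startswith(prefix):
        raise ValueError("not a Factor= line: %r" % line)

    base = default_base
    numbers = []
    for field in line[len(prefix):].split(","):
        field = field.strip()
        if field.startswith("base="):
            base = int(field[len("base="):])
        elif field.isdigit():
            numbers.append(int(field))
        # Anything else (e.g. "bla") is treated as the optional label and ignored.

    if len(numbers) != 3:
        raise ValueError("expected exponent, bit_min, bit_max in %r" % line)
    exponent, bit_min, bit_max = numbers
    return {"base": base, "exponent": exponent,
            "bit_min": bit_min, "bit_max": bit_max}


if __name__ == "__main__":
    print(parse_worktodo_line("Factor=bla,66362159,64,68"))
    print(parse_worktodo_line("Factor=bla,base=17,1055167,1,64"))
```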
I attached the compiled versions of gr-mfaktc for Linux and Windows (both 64 bit).

Executables are compiled with
Code:
NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_60,code=sm_60 # CC 6.0 GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.1 GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70 # CC 7.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_75,code=sm_75 # CC 7.5 GPUs will use this code
I am using gr-mfaktc to find factors of base 10 repunits as the presieving step for the PRP tests. Recently the search reached the 4,000,000-digit milestone, but so far no new prime has been found (after R270343). Help is always welcome; PM me if you want to join the search.

Let me know if there are any issues. Have fun finding new factors.
Cheers,
Danilo
Attached Files
File Type: zip gr-mfaktc-0.21.zip (967.0 KB, 512 views)

2019-11-01, 08:13   #2
lalera
Jul 2003

hi,
the win64 version of gr-mfaktc does not work
all selftests failed

2019-11-01, 20:30   #3
MrRepunit
Mar 2011
Germany

Quote:
Originally Posted by lalera
hi,
the win64 version of gr-mfaktc does not work
all selftests failed

What GPU do you use? On my own system (GeForce GTX 1080, compute capability 6.1, CUDA 10.1) it works without issues.
I know that a GTX 1660 had problems running the previous mfaktc-repunit version, which used exactly the same compilation settings as gr-mfaktc; I have not found the cause yet. There might be an issue with my compilation setup or with the drivers for compute capabilities > 6.1. The same GPU could run mfaktc (https://download.mersenne.ca/mfaktc/...in.cuda100.zip) without issues.

Maybe somebody else could try to compile it for Windows and upload the binary here?
I have no idea what the issue could be, since I cannot test it myself on Turing cards.

2019-11-01, 20:44   #4
lalera
Jul 2003
hi,
i do use win10 x64 (1903) with a gtx1660ti
drivers 426.00 with cuda 10.1

2019-11-01, 21:39   #5
lalera
Jul 2003

hi,
i tried another machine with win10 x64 gtx1050ti nvidia driver 419.67
all selftests passed

2019-11-01, 21:54   #6
MrRepunit
Mar 2011
Germany

Quote:
Originally Posted by lalera
hi,
i tried another machine with win10 x64 gtx1050ti nvidia driver 419.67
all selftests passed

Okay, so there seems to be a pattern. Let's wait a bit; maybe we can isolate the issue with more feedback.

2019-11-02, 23:19   #7
storm5510
Random Account
Aug 2009
Oceanus Procellarum

I tried this on two machines. An older one running Windows 7 Pro x64 with a GTX 750Ti and a newer one running Windows 10 Pro x64, v1903, using a GTX 1080.

Using the version of mfaktc I had, the 1080 would run around 1050 GHz-d/day. This one ran at 730 GHz-d/day, more or less. This was with base 2.

It seems that specifying the base in the worktodo file would be problematic for PrimeNet and GPUto72. Perhaps it would be better to specify the base in the configuration file? I would not think many would be changing the base very often, or at all, given this project's goal of finding Mersenne prime numbers.

2019-11-03, 05:39   #8
Citrix
Jun 2003

3 questions:
1) Were you able to figure out the Legendre/Jacobi symbols to filter primes for all bases?
2) Is there a reason the app would crash for large bases >100,000?
3) Are negative bases also supported, for example 10^n+1?


Thanks.

Last fiddled with by Citrix on 2019-11-03 at 06:33

2019-11-03, 11:39   #9
MrRepunit
Mar 2011
Germany

Quote:
Originally Posted by storm5510
I tried this on two machines. An older one running Windows 7 Pro x64 with a GTX 750Ti and a newer one running Windows 10 Pro x64, v1903, using a GTX 1080.

Using the version of mfaktc I had, the 1080 would run around 1050 GHz-d/day. This one ran at 730 GHz-d/day, more or less. This was with base 2.

It seems that specifying the base in the worktodo file would be problematic for PrimeNet and GPUto72. Perhaps it would be better to specify the base in the configuration file? I would not think many would be changing the base very often, or at all, given this project's goal of finding Mersenne prime numbers.


When I started the fork of mfaktc I was only considering base 10 repunits, so I had to 'deoptimize' some code. I removed the Barrett kernels as they seemed unsuited to base 10 and to more general bases. I needed to generalize some methods that were using the faster shl instruction (optional_mul). The 64 bit kernel was also implemented to speed up lower exponents; Mersenne numbers are already factored far beyond this point. With that in mind, the current version is definitely not optimal for factoring Mersenne numbers; it focuses on other bases and smaller exponents.
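
The remark about the shl instruction can be illustrated with left-to-right binary exponentiation. The sketch below is plain Python and is not the CUDA kernel code; the function name `powmod_left_to_right` is hypothetical. It only shows the idea: after each squaring, the optional multiplication by the base is a one-bit shift when the base is 2, but needs a real multiplication for any other base.

```python
def powmod_left_to_right(base, exponent, modulus):
    """Compute base**exponent % modulus by scanning exponent bits from the top."""
    result = 1
    for bit in bin(exponent)[2:]:
        # Always square.
        result = (result * result) % modulus
        # Optionally multiply by the base, depending on the current exponent bit.
        if bit == "1":
            if base == 2:
                # Base 2: the multiplication is just a left shift by one bit.
                result = (result << 1) % modulus
            else:
                # General base: a full multiplication is required.
                result = (result * base) % modulus
    return result


if __name__ == "__main__":
    q = 39600229
    print(powmod_left_to_right(23, 3300019, q) == pow(23, 3300019, q))
```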

Reading the default base from the configuration file is a good idea; I will implement this soon.

However, I am not sure if gr-mfaktc should be a complete replacement for mfaktc (it is still the project from TheJudger), or if it should be thought of as an orthogonal project that puts the focus on general repunits. I could certainly try to start from scratch again and cherry-pick my changes while leaving the Mersenne & Wagstaff number stuff mainly untouched and just add more functionality. Maybe TheJudger has some thoughts about this...

2019-11-03, 11:55   #10
MrRepunit
Mar 2011
Germany

Quote:
Originally Posted by Citrix
1) Were you able to figure out the Legendre/Jacobi symbols to filter primes for all bases?
I figured out the symbols for bases 3, 5, 6, 7, 8, 10, 11 and 12. For all other bases, all remaining possible numbers are simply tested. Look for the methods 'class_needed_<base>' in mfaktc.c. I can certainly try to write it up here in this wiki if there is interest.
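
As background for the symbol-based filtering: if an odd prime q divides R[b]n for an odd n, then b^n ≡ 1 (mod q), so b ≡ (b^((n+1)/2))^2 (mod q) is a quadratic residue and the Legendre symbol (b/q) equals 1. The Python sketch below shows this necessary condition only; it is not the code of the 'class_needed_<base>' methods. The helper names `jacobi` and `allowed_residues` are hypothetical, and the moduli 8 (base 2) and 40 (base 10) are chosen on the assumption that the symbol is periodic with those moduli.

```python
from math import gcd


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of two using the rule for (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swap and flip the sign if both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def allowed_residues(base, modulus):
    """Odd residues r coprime to modulus with (base/r) == 1."""
    return [r for r in range(1, modulus, 2)
            if gcd(r, modulus) == 1 and jacobi(base, r) == 1]


if __name__ == "__main__":
    print(allowed_residues(2, 8))    # candidate residues mod 8 for base 2
    print(allowed_residues(10, 40))  # candidate residues mod 40 for base 10
```

Candidates q = 2kp + 1 whose residue is not in the allowed list cannot be factors, which is what allows whole classes of k to be skipped.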


Quote:
Originally Posted by Citrix
2) Is there a reason the app would crash for large bases >100,000?
I have not checked in detail yet; I will do this when I have a bit more time.



Quote:
Originally Posted by Citrix
3) Are negative bases also supported, for example 10^n+1?
Not yet, I have to look into the Wagstaff code and try to generalize this. Hopefully this is not too complicated, except maybe for the Legendre/Jacobi symbols.

Last fiddled with by MrRepunit on 2019-11-03 at 11:55

2019-11-03, 13:34   #11
storm5510
Random Account
Aug 2009
Oceanus Procellarum

Quote:
Originally Posted by MrRepunit
When I started the fork of mfaktc I was only considering base 10 repunits, so I had to 'deoptimize' some code. I removed the Barrett kernels as they seemed unsuited to base 10 and to more general bases. I needed to generalize some methods that were using the faster shl instruction (optional_mul). The 64 bit kernel was also implemented to speed up lower exponents; Mersenne numbers are already factored far beyond this point. With that in mind, the current version is definitely not optimal for factoring Mersenne numbers; it focuses on other bases and smaller exponents.

Reading the default base from the configuration file is a good idea; I will implement this soon.

However, I am not sure if gr-mfaktc should be a complete replacement for mfaktc (it is still the project from TheJudger), or if it should be thought of as an orthogonal project that puts the focus on general repunits. I could certainly try to start from scratch again and cherry-pick my changes while leaving the Mersenne & Wagstaff number stuff mainly untouched and just add more functionality. Maybe TheJudger has some thoughts about this...
Thank you for the reply! If anyone wants to run base 2, having the setting in the configuration file will remove any ambiguity, and assignments, as presented by PrimeNet, would run without any modifications.

Your project goes in a different direction so I would not fret much over base two. I find the possibility of being able to run different bases quite interesting.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
