2019-10-31, 22:30  #1
Mar 2011
Germany
97 Posts 
grmfaktc: a CUDA program for generalized repunits prefactoring
Hi,
I have finally completed the generalized repunit version of mfaktc.
Changes compared to mfaktc 0.21:
- implemented factoring of generalized repunits
- removed the Barrett and 72 bit kernels
- removed Wagstaff related code
- added 64 bit kernels
- compiling with the moreclasses flag seems to be slightly faster, so it is switched on
- all bases >= 2 are allowed; the program might crash if the base is larger than roughly 100,000
- implemented special cases for bases 2, 3, 5, 6, 7, 8, 10, 11, 12
- lowered the lower limit for exponents from 100,000 to 50,000
The zip file contains the source code and executables for Linux and Windows (both 64 bit). Check first that it runs correctly.
Code:
./grmfaktc.exe -st
Code:
Selftest statistics
  number of tests           31127
  successfull tests         31127

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |       0 |      0
  64bit_mul32        |    4633 |      0
  75bit_mul32        |    5712 |      0
  95bit_mul32        |    5918 |      0
  64bit_mul32_gs     |    4190 |      0
  75bit_mul32_gs     |    5248 |      0
  95bit_mul32_gs     |    5426 |      0

selftest PASSED!
Code:
./grmfaktc.exe -tf 23 3300019 1 60
Code:
got assignment: base=23 exp=3300019 bit_min=1 bit_max=60 (0.05 GHz-days)
Starting trial factoring R[23]3300019 from 2^1 to 2^60 (0.05 GHz-days)
 k_min = 0
 k_max = 174684070698
Using GPU kernel "64bit_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve |     Exp  Base  bitrange
Oct 31 21:57 |     6   0.1% |  0.009    n.a. |    232.71    22837 | 3300019    23      1:60
R[23]3300019 has a factor: 39600229
Oct 31 21:57 |  1347  29.1% |  0.008    n.a. |    261.80    22837 | 3300019    23      1:60
R[23]3300019 has a factor: 1021252834106707
Oct 31 21:57 |  4619 100.0% |  0.011    n.a. |    190.40    22837 | 3300019    23      1:60
found 2 factors for R[23]3300019 from 2^ 1 to 2^60 [mfaktc 0.21 64bit_mul32_gs]
tf(): total time spent: 19.370s
Code:
Factor=bla,66362159,64,68
Factor=bla,base=17,1055167,1,64
I attached the compiled versions of grmfaktc for Linux and Windows (both 64 bit). The executables are compiled with
Code:
NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_60,code=sm_60 # CC 6.0 GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.1 GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70 # CC 7.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_75,code=sm_75 # CC 7.5 GPUs will use this code
Let me know if there are any issues. Have fun finding new factors.
Cheers,
Danilo
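The two factors reported in the log above can be checked independently of the GPU code. The following is a minimal Python sketch, assuming that R[b]p denotes the repunit (b^p - 1)/(b - 1) and that candidate factors for a prime exponent p have the form q = 2kp + 1, as in mfaktc for Mersenne numbers. The function names are illustrative and are not part of grmfaktc.

```python
def divides_repunit(base: int, exp: int, q: int) -> bool:
    """Return True if q divides R = (base**exp - 1) // (base - 1)."""
    # q divides R exactly when q*(base-1) divides base**exp - 1,
    # so the test is done modulo q*(base-1) without building R itself.
    m = q * (base - 1)
    return (pow(base, exp, m) - 1) % m == 0


def k_of(exp: int, q: int):
    """Return k if q == 2*k*exp + 1, otherwise None."""
    k, r = divmod(q - 1, 2 * exp)
    return k if r == 0 else None


if __name__ == "__main__":
    base, exp = 23, 3300019
    for q in (39600229, 1021252834106707):
        print(q, "k =", k_of(exp, q), "divides:", divides_repunit(base, exp, q))
    # Largest k for which 2*k*exp + 1 stays below 2**60
    print("k_max =", (1 << 60) // (2 * exp))
```

Both factors come out as 2kp + 1 (k = 6 and k = 154734387), and the last line reproduces the k_max value printed in the log.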
2019-11-01, 08:13  #2
Jul 2003
2×307 Posts 
hi,
the win64 version of grmfaktc does not work: all selftests failed.
2019-11-01, 20:30  #3
Mar 2011
Germany
97 Posts 
Quote:
What GPU do you use? On my own system (GeForce GTX 1080, compute capability 6.1, CUDA 10.1) it works without issues. I know that a GTX 1660 had problems running the previous mfaktcrepunit version, which used exactly the same compilation settings as grmfaktc; I have not found the cause yet. There might be an issue with my compilation setup or with the drivers for compute capabilities > 6.1. The same GPU could run mfaktc (https://download.mersenne.ca/mfaktc/...in.cuda100.zip) without issues. Maybe somebody else could try to compile it for Windows and upload the binary here? I have no idea what the issue could be, and I cannot test it myself on Turing cards.

2019-11-01, 20:44  #4
Jul 2003
2·307 Posts 
hi,
I use Win10 x64 (1903) with a GTX 1660 Ti, driver 426.00, and CUDA 10.1.
2019-11-01, 21:39  #5
Jul 2003
2×307 Posts 
hi,
I tried another machine with Win10 x64, a GTX 1050 Ti and NVIDIA driver 419.67: all selftests passed.
2019-11-01, 21:54  #6
Mar 2011
Germany
97 Posts 

2019-11-02, 23:19  #7
Random Account
Aug 2009
Not U. + S.A.
2^{3}·3^{3}·11 Posts 
I tried this on two machines. An older one running Windows 7 Pro x64 with a GTX 750Ti and a newer one running Windows 10 Pro x64, v1903, using a GTX 1080.
Using the version of mfaktc I had, the 1080 would run at around 1050 GHz-d/day. This one ran at roughly 730 GHz-d/day. This was with base 2. It seems that specifying the base in the worktodo file would be problematic for PrimeNet and GPUto72. Perhaps it would be better to specify the base in the configuration file? I would not expect many users to change the base very often, or at all, given this project's goal of finding Mersenne prime numbers.
2019-11-03, 05:39  #8
Jun 2003
37·43 Posts 
3 questions
1) Were you able to figure out the Legendre/Jacobi symbols to filter primes for all bases?
2) Is there a reason the app would crash for large bases > 100,000?
3) Are negative bases also supported, for example 10^n+1?
Thanks.
Last fiddled with by Citrix on 2019-11-03 at 06:33
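On question 1, one necessary condition holds for every base, whatever grmfaktc ends up doing internally. If p is odd and a prime q satisfies b^p ≡ 1 (mod q), then b ≡ b^(p+1) = (b^((p+1)/2))^2 (mod q), so b is a quadratic residue modulo q and the Jacobi symbol (b/q) equals 1. For b = 2 this is the familiar q ≡ ±1 (mod 8) rule. The sketch below implements the standard binary Jacobi algorithm; `passes_residue_filter` is a hypothetical helper name, and none of this is taken from the grmfaktc source.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for a positive odd n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a:
        # Pull out factors of two: (2/n) = -1 exactly when n = 3 or 5 (mod 8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: the sign flips when both are 3 (mod 4).
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def passes_residue_filter(base: int, q: int) -> bool:
    """Necessary (not sufficient) condition for q to divide base**p - 1, p odd."""
    return jacobi(base, q) == 1


if __name__ == "__main__":
    # The first factor from the opening post passes the filter for base 23.
    print(passes_residue_filter(23, 39600229))
```

A candidate q = 2kp + 1 that fails this filter can be discarded before any modular exponentiation is done.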
2019-11-03, 11:39  #9
Mar 2011
Germany
97 Posts 
Quote:
When I started the fork of mfaktc I was only considering base 10 repunits, so I had to 'deoptimize' some code. I removed the Barrett kernels because they seemed unsuited to base 10 and to more general bases. I also had to generalize some methods that were using the faster shl instruction (optional_mul). The 64 bit kernel was added to speed up lower exponents; Mersenne numbers are already factored far beyond this point. Given all that, the current version is definitely not optimal for factoring Mersenne numbers; it focuses on other bases and smaller exponents. Reading the default base from the configuration file is a good idea, and I will implement this soon. However, I am not sure whether grmfaktc should be a complete replacement for mfaktc (it is still TheJudger's project), or whether it should be thought of as an orthogonal project that puts the focus on general repunits. I could certainly start from scratch again and cherry-pick my changes while leaving the Mersenne and Wagstaff code mainly untouched, just adding more functionality. Maybe TheJudger has some thoughts about this...
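The default-base idea could work roughly as follows. This is only a sketch: the two `Factor=` line shapes are the ones shown in the first post, while the fallback behaviour, the `Assignment` type and the `default_base` parameter are assumptions made for illustration and do not describe how grmfaktc actually parses its input.

```python
from typing import NamedTuple


class Assignment(NamedTuple):
    base: int
    exp: int
    bit_min: int
    bit_max: int


def parse_factor_line(line: str, default_base: int) -> Assignment:
    """Parse 'Factor=<key>,[base=<b>,]<exp>,<bit_min>,<bit_max>'."""
    keyword, _, rest = line.strip().partition("=")
    if keyword != "Factor":
        raise ValueError(f"not a Factor line: {line!r}")
    fields = rest.split(",")
    base = default_base  # assumed fallback when the line carries no base
    numbers = []
    for field in fields[1:]:  # fields[0] is the assignment key ("bla" above)
        if field.startswith("base="):
            base = int(field[len("base="):])
        else:
            numbers.append(int(field))
    if len(numbers) != 3:
        raise ValueError(f"expected exp,bit_min,bit_max in {line!r}")
    if base < 2:
        raise ValueError("base must be >= 2")
    return Assignment(base, *numbers)


if __name__ == "__main__":
    # The default base 10 is chosen arbitrarily for this example.
    print(parse_factor_line("Factor=bla,66362159,64,68", default_base=10))
    print(parse_factor_line("Factor=bla,base=17,1055167,1,64", default_base=10))
```

With this layout, worktodo lines in the plain mfaktc shape stay valid, and an explicit `base=` field overrides the configured default only where it is needed.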

2019-11-03, 11:55  #10
Mar 2011
Germany
97 Posts 
Quote:
Quote:
Not yet, I have to look into the Wagstaff code and try to generalize it. Hopefully this is not too complicated, except maybe for the Legendre/Jacobi symbols.
Last fiddled with by MrRepunit on 2019-11-03 at 11:55

2019-11-03, 13:34  #11
Random Account
Aug 2009
Not U. + S.A.
2^{3}·3^{3}·11 Posts 
Quote:
Your project goes in a different direction so I would not fret much over base two. I find the possibility of being able to run different bases quite interesting. 
