mersenneforum.org  

mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > GPU Computing

Old 2018-07-27, 21:57   #2828
GP2
 

If anyone here has used mfaktc to search for factors of Wagstaff numbers (2^p + 1)/3 and if you have conserved log files or lists of factors that you're willing to share, let me know.

I've already asked in Wagstaff-related threads, and tried contacting some of the folks who were active when earlier searches were being done around 2013. I just thought I'd inquire here too.
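For anyone cross-checking a shared factor list: a candidate prime factor q of a Wagstaff number (2^p + 1)/3 can be verified with a one-line modular test, since for q ≠ 3 we have q | (2^p + 1)/3 exactly when 2^p ≡ -1 (mod q). A minimal Python sketch (the example values are mine, chosen small):

```python
def divides_wagstaff(q: int, p: int) -> bool:
    """True if the prime q divides the Wagstaff number (2^p + 1)/3.

    For q != 3, q divides (2^p + 1)/3 exactly when 2^p == -1 (mod q).
    """
    return q != 3 and pow(2, p, q) == q - 1

# Small worked example: 59 divides (2^29 + 1)/3 = 178956971.
print(divides_wagstaff(59, 29))   # True
print((2**29 + 1) // 3 % 59)      # 0
```

(For odd prime p, any such factor q satisfies ord_q(2) = 2p and hence q = 2kp + 1, which is the form a trial-factoring build like mfaktc's -DWAGSTAFF variant searches over.)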
Old 2018-07-28, 22:06   #2829
GP2
 

Quote:
Originally Posted by TheJudger View Post
Good news: CUDA 9.2.88 seems to have fixed the issue on Volta architecture!
I am experimenting with one GPU of a Tesla V100-SXM2-16GB (this is a p3.2xlarge instance on Amazon AWS cloud with Deep Learning Base AMI).

Same specs as you listed for the Tesla V100-PCIE-16GB except a slightly faster clock rate:
Code:
  clock rate (CUDA cores)   1530MHz
The instance can be configured to use CUDA 9.2.88 by setting the /usr/local/cuda symbolic link.

mfaktc passes all the Mersenne self tests.

However, when I compile an alternate version with -DWAGSTAFF added to CFLAGS, it fails all the Wagstaff self tests.

Did you try the Wagstaff self tests on your V100 and do they work for you?

Is anything more needed to create a Wagstaff version, other than adding the -DWAGSTAFF flag in CFLAGS?

The compilation uses gcc 4.8.
Old 2018-07-29, 21:10   #2830
TheJudger
 

Hello,

Quote:
Originally Posted by GP2 View Post
mfaktc passes all the Mersenne self tests.

However, when I compile an alternate version with -DWAGSTAFF added to CFLAGS, it fails all the Wagstaff self tests.

Did you try the Wagstaff self tests on your V100 and do they work for you?
No, I didn't try.

Quote:
Originally Posted by GP2 View Post
Is anything more needed to create a Wagstaff version, other than adding the -DWAGSTAFF flag in CFLAGS?
No, that should be enough. Will look at this later. Thanks for reporting.

Oliver
Old 2018-08-02, 20:17   #2831
kriesel
 
Comments in worktodo

While looking for something else, I stumbled across this:
The source of parse.c for CUDAPm1 indicates that #, \\, or / are comment characters, marking the rest of a worktodo line as a comment.

I've confirmed by testing in mfaktc that \\ worked; # and / did not work in my test, which placed them mostly at the beginnings of records. The line numbers in any warning messages told me which markers did or did not work.

As far as I recall, this capability is not (yet) documented in readme.txt.

Last fiddled with by kriesel on 2018-08-02 at 20:38
Old 2018-08-06, 13:58   #2832
preda
 

I have a question, maybe somebody knows: how do the mfaktc and mfakto codebases compare?

I think at some point in history, mfakto was inspired by mfaktc. But how have they diverged in the intervening years? Do they now have different capabilities, or different self-test data sets?

(aside from targeting different platforms, CUDA vs. OpenCL).
Old 2018-08-06, 14:43   #2833
kriesel
 

Quote:
Originally Posted by preda View Post
I have a question, maybe somebody knows: how do the mfaktc and mfakto codebases compare?

I think at some point in history, mfakto was inspired by mfaktc. But how have they diverged in the intervening years? Do they now have different capabilities, or different self-test data sets?

(aside from targeting different platforms, CUDA vs. OpenCL).
Yes, mfaktc preceded mfakto. Some features developed in mfakto were added to mfaktc later (worktodoadd as I recall).

Per http://www.mersenneforum.org/showpos...91&postcount=2
Mfaktc max bit depth 95, mfakto 92. Minimum exponent may vary.

Comparing their respective readme files and bug and wish lists may show some other differences.
Mfaktc bug and wish list http://www.mersenneforum.org/showpos...21&postcount=3
Mfakto bug and wish list http://www.mersenneforum.org/showpos...37&postcount=3

Some client management software supports mfaktc or mfakto, typically not both. http://www.mersenneforum.org/showpos...92&postcount=3

(All the above, and more, are periodically updated in place, as part of the mersenne-gpu-computing-oriented reference material I've been accumulating at http://www.mersenneforum.org/forumdisplay.php?f=154)

And of course, there's comparing the source code in the portions that are not CUDA- or OpenCL-specific.

Mfaktc self-test: tests multiple kernels per test case.
Code:
########## testcase 1/2867 ##########
...

Selftest statistics
  number of tests           26192
  successfull tests         26192

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |      0  |      0
  71bit_mul24        |   2586  |      0
  75bit_mul32        |   2682  |      0
  95bit_mul32        |   2867  |      0
  barrett76_mul32    |   1096  |      0
  barrett77_mul32    |   1114  |      0
  barrett79_mul32    |   1153  |      0
  barrett87_mul32    |   1066  |      0
  barrett88_mul32    |   1069  |      0
  barrett92_mul32    |   1084  |      0
  75bit_mul32_gs     |   2420  |      0
  95bit_mul32_gs     |   2597  |      0
  barrett76_mul32_gs |   1079  |      0
  barrett77_mul32_gs |   1096  |      0
  barrett79_mul32_gs |   1130  |      0
  barrett87_mul32_gs |   1044  |      0
  barrett88_mul32_gs |   1047  |      0
  barrett92_mul32_gs |   1062  |      0

selftest PASSED!
Mfakto short self-test (runs every time I launch mfakto to do factoring):
Code:
Started a simple selftest ...
######### testcase 1/30 (M1031831[63-64]) #########
######### testcase 2/30 (M51332417[68-69]) #########
######### testcase 3/30 (M50896831[69-70]) #########
######### testcase 4/30 (M50979079[70-71]) #########
######### testcase 5/30 (M51232133[71-72]) #########
######### testcase 6/30 (M50830523[71-72]) #########
######### testcase 7/30 (M50752613[72-73]) #########
######### testcase 8/30 (M51507913[72-73]) #########
######### testcase 9/30 (M51916901[73-74]) #########
######### testcase 10/30 (M51157933[74-75]) #########
######### testcase 11/30 (M51308501[75-76]) #########
######### testcase 12/30 (M51671491[75-76]) #########
######### testcase 13/30 (M50805581[77-78]) #########
######### testcase 14/30 (M51157429[78-79]) #########
######### testcase 15/30 (M51406151[78-79]) #########
######### testcase 16/30 (M51478381[79-80]) #########
######### testcase 17/30 (M51350527[80-81]) #########
######### testcase 18/30 (M53061139[80-81]) #########
######### testcase 19/30 (M48629519[81-82]) #########
######### testcase 20/30 (M51752893[83-84]) #########
######### testcase 21/30 (M51760133[83-84]) #########
######### testcase 22/30 (M51090757[84-85]) #########
######### testcase 23/30 (M51050171[84-85]) #########
######### testcase 24/30 (M50989481[86-87]) #########
######### testcase 25/30 (M50856937[86-87]) #########
######### testcase 26/30 (M53065231[88-89]) #########
######### testcase 27/30 (M3321929777[63-64]) #########
######### testcase 28/30 (M3321930841[63-64]) #########
######### testcase 29/30 (M55069117[64-65]) #########
######### testcase 30/30 (M45448679[81-82]) #########
Selftest statistics                                    
  number of tests           30
  successful tests          30
Mfakto -st:
Code:
######### testcase 1/34071 (M67094119[81-82]) #########
...
######### testcase 34071/34071 (M112404491[91-92]) #########
Starting trial factoring M112404491 from 2^91 to 2^92 (4461450.54GHz-days)
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jan 16 20:34 | 1848   0.1% |  0.124    n.a. |      n.a.    81206    0.00%
M112404491 has a factor: 3941616367695054034124905537 (91.670846 bits, 2992945.937358 GHz-d)
found 1 factor for M112404491 from 2^91 to 2^92 [mfakto 0.15pre6-Win cl_barrett32_92_gs_2]
selftest for M112404491 passed (cl_barrett32_92_gs)!
tf(): total time spent:  0.124s

Selftest statistics                                    
  number of tests           34026
  successful tests          34026

selftest PASSED!

Last fiddled with by kriesel on 2018-08-06 at 15:06
Old 2018-08-07, 14:34   #2834
preda
 

I have been playing with some GPU sieving code, similar to the GPU sieve used by mfaktc and mfakto.

The sieve works in the usual way: for each prime P from a set of primes, compute the initial "bit-to-clear" for a given exponent E and K (q = 2*E*K+1), and then mark off bits at every P step starting with the bit-to-clear.
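The bit-to-clear step can be modeled in a few lines of plain Python (a sketch only, not the actual mfaktc/mfakto kernel; the constants below are arbitrary illustrative values). For a sieve prime P with gcd(2E, P) = 1, solving 2*E*k + 1 ≡ 0 (mod P) gives k ≡ -(2E)^(-1) (mod P), and the bit-to-clear is the offset of the first such k in the window:

```python
def bit_to_clear(E: int, P: int, k_start: int) -> int:
    """Offset i of the first k = k_start + i with P dividing q = 2*E*k + 1."""
    inv = pow(2 * E, -1, P)        # modular inverse; requires gcd(2E, P) == 1
    k0 = (-inv) % P                # residue class of k that makes q divisible by P
    return (k0 - k_start) % P

E, P, k_start = 82589933, 11, 10**6   # illustrative values only
i = bit_to_clear(E, P, k_start)
q = 2 * E * (k_start + i) + 1
assert q % P == 0                               # the marked bit's q is divisible by P
assert all((2 * E * (k_start + j) + 1) % P      # ...and no earlier bit in the window is
           for j in range(i))
```

From that first offset, every P-th bit in the window is then marked off, exactly as described above.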

Is there some (mathematical) reason for the number of survivors of this kind of sieve to slightly decrease as K grows? In other words, are there slightly fewer candidates surviving the sieve when the bit-level of K grows?

(I know that the number of actual primes does decrease as K grows, but is this fact reflected at all when sieving with the technique above?)
Old 2018-08-07, 16:22   #2835
axn
 

Quote:
Originally Posted by preda View Post
Is there some (mathematical) reason for the number of survivors of this kind of sieve to slightly decrease as K grows? In other words, are there slightly fewer candidates surviving the sieve when the bit-level of K grows?

(I know that the number of actual primes does decrease as K grows, but is this fact reflected at all when sieving with the technique above?)
Not if you keep the set of sieving primes the same. If you increase the sieving primes, that will result in fewer survivors.

This is assuming you're sieving with fewer primes than sqrt(candidate).
Old 2018-08-07, 21:14   #2836
preda
 

Quote:
Originally Posted by axn View Post
Not if you keep the set of sieving primes the same. If you increase the sieving primes, that will result in fewer survivors.

This is assuming you're sieving with fewer primes than sqrt(candidate).
That's what I thought. I need to find the bug that produces the observed behavior, then.
Old 2018-08-08, 02:10   #2837
preda
 

Quote:
Originally Posted by axn View Post
This is assuming you're sieving with fewer primes than sqrt(candidate).
Is the prime magnitude limit sqrt(q) = sqrt(2*e*k+1), or sqrt(k)?

If it's sqrt(k), this may be it. If I sieve with primes up to 2^23 at exponent 2^28, then TF under 75 bits would have slightly reduced filtering.
Old 2018-08-08, 02:49   #2838
axn
 

Quote:
Originally Posted by preda View Post
Is sqrt(q=2*e*k+1), or sqrt(k) the prime magnitude limit?

If it's sqrt(k), this may be it. If I sieve with primes up to 2^23, exponent 2^28, then TF under 75bits would have slightly reduced filtering.
It is the first one. If you're sieving for 75 bits (i.e. 2^74 to 2^75), then as long as you're using primes < 2^37 (and 2^23 is well under that), you'll be sieving out a constant-ish proportion of the candidates. There will be variations, but more or less the same fraction will be left if you sieve any range from 2^64 and above.
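This is easy to check empirically with a toy version of the sieve (plain Python sketch with illustrative constants; not the actual class-based GPU sieve). With a fixed small prime set, the surviving fraction of a k window is essentially the same whether the window sits at small or very large k:

```python
def surviving_fraction(E, k_start, n, primes):
    """Fraction of k in [k_start, k_start + n) with 2*E*k + 1 not divisible
    by any of the sieve primes."""
    alive = bytearray([1]) * n
    for P in primes:
        if (2 * E) % P == 0:
            continue                            # can't sieve with a prime dividing 2E
        k0 = (-pow(2 * E, -1, P)) % P           # residue class with P | 2*E*k + 1
        for i in range((k0 - k_start) % P, n, P):
            alive[i] = 0
    return sum(alive) / n

E = 82589933                                    # illustrative exponent
primes = [3, 5, 7, 11, 13, 17, 19]
low  = surviving_fraction(E, 1,      200_000, primes)
high = surviving_fraction(E, 10**12, 200_000, primes)
print(low, high)    # nearly identical, roughly prod(1 - 1/P) over the sieve primes
```

If the observed survivor count still drifts with the bit level under a fixed prime set, that points to a bug in the bit-to-clear or stride computation rather than to the math.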

Can you provide some stats as to the pattern you're observing (of the fraction of survivors)?