If anyone here has used mfaktc to search for factors of Wagstaff numbers (2^p + 1)/3 and has kept log files or lists of factors that you're willing to share, please let me know.
I've already asked in Wagstaff-related threads, and tried contacting some of the folks who were active when earlier searches were being done around 2013. I just thought I'd inquire here too. |
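For anyone who wants to sanity-check a shared factor list before importing it, here is a minimal sketch in Python. It is not part of mfaktc; the function name divides_wagstaff is made up for illustration, and it assumes p is an odd prime and q is a prime greater than 3. Under those assumptions q divides (2^p + 1)/3 exactly when 2^p is congruent to -1 mod q.

```python
def divides_wagstaff(p: int, q: int) -> bool:
    """Return True if q divides the Wagstaff number (2^p + 1)/3.

    Assumes p is an odd prime and q is a prime > 3 (hypothetical helper,
    not taken from mfaktc). For such q, dividing (2^p + 1)/3 is the same
    as dividing 2^p + 1, i.e. 2^p == -1 (mod q).
    """
    return q > 3 and pow(2, p, q) == q - 1


# (2^29 + 1)/3 = 178956971 = 59 * 3033169
print(divides_wagstaff(29, 59))        # 59 = 2*1*29 + 1
print(divides_wagstaff(29, 3033169))   # 3033169 = 2*52296*29 + 1
print(divides_wagstaff(11, 23))        # 23 divides 2^11 - 1, not 2^11 + 1
```

Both factors in the example have the form 2kp + 1, the same candidate form that trial factoring uses for Mersenne numbers.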
[QUOTE=TheJudger;490784]Good news: CUDA 9.2.88 seems to have fixed the issue on Volta architecture![/QUOTE]
I am experimenting with one GPU of a Tesla V100-SXM2-16GB (this is a p3.2xlarge instance on the Amazon AWS cloud with the Deep Learning Base AMI). It has the same specs as you listed for the Tesla V100-PCIE-16GB, except for a slightly faster clock rate: [CODE] clock rate (CUDA cores) 1530MHz [/CODE] The instance can be configured to use CUDA 9.2.88 by setting the symbolic link /usr/local/cuda.
mfaktc passes all the Mersenne self tests. However, when I compile an alternate version with -DWAGSTAFF added to CFLAGS, it fails all the Wagstaff self tests. Did you try the Wagstaff self tests on your V100 and do they work for you? Is anything more needed to create a Wagstaff version, other than adding the -DWAGSTAFF flag in CFLAGS? The compilation uses gcc 4.8. |
Hello,
[QUOTE=GP2;492679]mfaktc passes all the Mersenne self tests. However, when I compile an alternate version with -DWAGSTAFF added to CFLAGS, it fails all the Wagstaff self tests. Did you try the Wagstaff self tests on your V100 and do they work for you?[/QUOTE] No, I didn't try.
[QUOTE=GP2;492679]Is anything more needed to create a Wagstaff version, other than adding the -DWAGSTAFF flag in CFLAGS?[/QUOTE] No, that should be enough. I will look at this later. Thanks for reporting.
Oliver |
comments in worktodo
While looking for something else, I stumbled across this:
The source of parse.c for CUDAPm1 indicates that # or \\ or / are comment characters, marking the rest of a worktodo line as a comment. I've confirmed by test in mfaktc that \\ worked; # and / did not work in my test, which placed them mostly at the beginnings of records. I could tell which did or did not work from the line numbers in the warning messages. As far as I recall, this capability is not (yet) described in the readme.txt. |
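A minimal sketch of what such comment stripping might look like, written in Python rather than the C of parse.c. This is not the actual mfaktc or CUDAPm1 code: the function name strip_comment and the sample worktodo line are hypothetical, and the default marker is only the \\ that the test above found to work in mfaktc.

```python
def strip_comment(line: str, markers=("\\\\",)) -> str:
    """Drop everything from the earliest comment marker to end of line.

    Hypothetical helper. The default marker is two backslash characters
    (written "\\\\" as a Python literal); other markers such as "#" can be
    passed in explicitly.
    """
    cut = len(line)
    for marker in markers:
        pos = line.find(marker)
        if pos != -1:
            cut = min(cut, pos)   # keep only the text before the first marker
    return line[:cut].strip()


print(strip_comment("Factor=66362159,64,68 \\\\ first batch"))
print(repr(strip_comment("\\\\ whole-line comment")))
```

Matching a marker anywhere in the line is a choice made for this sketch; the test described above mostly placed markers at the start of a record.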
I have a question, maybe somebody knows, how do the mfaktc and mfakto codebases compare?
I think at some point in history, mfakto was inspired by mfaktc. But in the intervening years, how did they diverge? Do they now have any different capabilities, or different self-test data sets (aside from targeting different platforms, CUDA vs. OpenCL)? |
[QUOTE=preda;493277]I have a question, maybe somebody knows, how do the mfaktc and mfakto codebases compare?
I think at some point in history, mfakto was inspired by mfaktc. But in the intervening years, how did they diverge? Do they now have any different capabilities, or different self-test data sets (aside from targeting different platforms, CUDA vs. OpenCL)?[/QUOTE] Yes, mfaktc preceded mfakto. Some features developed in mfakto were added to mfaktc later (worktodoadd as I recall).
Per [URL]http://www.mersenneforum.org/showpost.php?p=488291&postcount=2[/URL], the maximum bit depth is 95 for mfaktc and 92 for mfakto. The minimum exponent may vary.
Comparing their respective readme files and bug and wish lists may show some other differences.
Mfaktc bug and wish list: [URL]http://www.mersenneforum.org/showpost.php?p=488521&postcount=3[/URL]
Mfakto bug and wish list: [URL]http://www.mersenneforum.org/showpost.php?p=488637&postcount=3[/URL]
Some client management software supports mfaktc or mfakto, typically not both. [URL]http://www.mersenneforum.org/showpost.php?p=488292&postcount=3[/URL]
(All the above, and more, are periodically updated in place, as part of the Mersenne GPU computing reference material I've been accumulating at [URL]http://www.mersenneforum.org/forumdisplay.php?f=154[/URL])
And of course, there's comparing the source code in the portions that are not CUDA or OpenCL specific.
Mfaktc self-test: tests multiple kernels per testcase [CODE]########## testcase 1/2867 ########## ...
Selftest statistics
  number of tests           26192
  successfull tests         26192

  kernel             | success |   fail
  -------------------+---------+-------
  UNKNOWN kernel     |       0 |      0
  71bit_mul24        |    2586 |      0
  75bit_mul32        |    2682 |      0
  95bit_mul32        |    2867 |      0
  barrett76_mul32    |    1096 |      0
  barrett77_mul32    |    1114 |      0
  barrett79_mul32    |    1153 |      0
  barrett87_mul32    |    1066 |      0
  barrett88_mul32    |    1069 |      0
  barrett92_mul32    |    1084 |      0
  75bit_mul32_gs     |    2420 |      0
  95bit_mul32_gs     |    2597 |      0
  barrett76_mul32_gs |    1079 |      0
  barrett77_mul32_gs |    1096 |      0
  barrett79_mul32_gs |    1130 |      0
  barrett87_mul32_gs |    1044 |      0
  barrett88_mul32_gs |    1047 |      0
  barrett92_mul32_gs |    1062 |      0

selftest PASSED!
[/CODE]
Mfakto short self-test (runs every time I launch mfakto to do factoring):
[CODE]Started a simple selftest ...
######### testcase 1/30 (M1031831[63-64]) #########
######### testcase 2/30 (M51332417[68-69]) #########
######### testcase 3/30 (M50896831[69-70]) #########
######### testcase 4/30 (M50979079[70-71]) #########
######### testcase 5/30 (M51232133[71-72]) #########
######### testcase 6/30 (M50830523[71-72]) #########
######### testcase 7/30 (M50752613[72-73]) #########
######### testcase 8/30 (M51507913[72-73]) #########
######### testcase 9/30 (M51916901[73-74]) #########
######### testcase 10/30 (M51157933[74-75]) #########
######### testcase 11/30 (M51308501[75-76]) #########
######### testcase 12/30 (M51671491[75-76]) #########
######### testcase 13/30 (M50805581[77-78]) #########
######### testcase 14/30 (M51157429[78-79]) #########
######### testcase 15/30 (M51406151[78-79]) #########
######### testcase 16/30 (M51478381[79-80]) #########
######### testcase 17/30 (M51350527[80-81]) #########
######### testcase 18/30 (M53061139[80-81]) #########
######### testcase 19/30 (M48629519[81-82]) #########
######### testcase 20/30 (M51752893[83-84]) #########
######### testcase 21/30 (M51760133[83-84]) #########
######### testcase 22/30 (M51090757[84-85]) #########
######### testcase 23/30 (M51050171[84-85]) #########
######### testcase 24/30 (M50989481[86-87]) #########
######### testcase 25/30 (M50856937[86-87]) #########
######### testcase 26/30 (M53065231[88-89]) #########
######### testcase 27/30 (M3321929777[63-64]) #########
######### testcase 28/30 (M3321930841[63-64]) #########
######### testcase 29/30 (M55069117[64-65]) #########
######### testcase 30/30 (M45448679[81-82]) #########
Selftest statistics
  number of tests           30
  successful tests          30
[/CODE]
Mfakto -st:
[CODE]######### testcase 1/34071 (M67094119[81-82]) #########
...
######### testcase 34071/34071 (M112404491[91-92]) #########
Starting trial factoring M112404491 from 2^91 to 2^92 (4461450.54GHz-days)
Date   Time  | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jan 16 20:34 |  1848  0.1% |  0.124    n.a. |      n.a.    81206    0.00%
M112404491 has a factor: 3941616367695054034124905537 (91.670846 bits, 2992945.937358 GHz-d)
found 1 factor for M112404491 from 2^91 to 2^92 [mfakto 0.15pre6-Win cl_barrett32_92_gs_2]
selftest for M112404491 passed (cl_barrett32_92_gs)!
tf(): total time spent: 0.124s
Selftest statistics
  number of tests           34026
  successful tests          34026

selftest PASSED!
[/CODE] |
I have been playing with some GPU sieving code, similar to the GPU sieve used by mfaktc and mfakto.
The sieve works in the usual way: for each prime P from a set of primes, compute the initial "bit-to-clear" for a given exponent E and K (q = 2*E*K+1), and then mark off bits at every P step starting with the bit-to-clear. Is there some (mathematical) reason for the number of survivors of this kind of sieve to slightly decrease as K grows? In other words, are there slightly fewer candidates surviving the sieve when the bit-level of K grows? (I know that the number of actual primes does decrease as K grows, but is this fact reflected at all when sieving with the technique above?) |
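The procedure described above can be sketched on the CPU as follows. This is a plain Python illustration, not the mfaktc or mfakto GPU sieve: the function name sieve_candidates and its parameters are hypothetical, it ignores the class and mod-8 filtering that the real programs apply, and it assumes every q in the window is much larger than the sieving primes.

```python
def sieve_candidates(E, k0, n, primes):
    """Sieve k in [k0, k0 + n) for q = 2*E*k + 1.

    Returns a list of n booleans; True means q has no factor in `primes`.
    Hypothetical sketch, assuming q is far larger than any sieving prime.
    """
    alive = [True] * n
    for P in primes:
        if (2 * E) % P == 0:
            continue  # q = 2*E*k + 1 is then 1 (mod P), never divisible by P
        # q == 0 (mod P)  <=>  k == -(2E)^(-1) (mod P)
        k_bad = (-pow(2 * E, -1, P)) % P
        first = (k_bad - k0) % P        # initial "bit-to-clear" for this window
        for i in range(first, n, P):    # then clear every P-th bit
            alive[i] = False
    return alive


bits = sieve_candidates(51332417, 10**9, 2000, [3, 5, 7, 11, 13, 17, 19, 23])
print(sum(bits), "of", len(bits), "candidates survive")
```

The modular inverse uses the three-argument pow with exponent -1, which needs Python 3.8 or later.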
[QUOTE=preda;493362]Is there some (mathematical) reason for the number of survivors of this kind of sieve to slightly decrease as K grows? In other words, are there slightly fewer candidates surviving the sieve when the bit-level of K grows?
(I know that the number of actual primes does decrease as K grows, but is this fact reflected at all when sieving with the technique above?)[/QUOTE] Not if you keep the set of sieving primes the same. If you increase the sieving primes, that will result in fewer survivors. This is assuming you're sieving with fewer primes than sqrt(candidate). |
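One way to check this claim numerically: over any window of k whose length equals the product of the sieving primes, each prime P removes exactly one residue class of k, so the survivor count is the product of (P - 1) wherever the window starts. The sketch below is a brute-force illustration under assumed values (a tiny prime set and an arbitrary exponent coprime to it), not the real sieve; count_survivors is a hypothetical name.

```python
from math import prod


def count_survivors(E, k0, n, primes):
    """Count k in [k0, k0 + n) for which q = 2*E*k + 1 has no factor in primes."""
    return sum(
        all((2 * E * k + 1) % P != 0 for P in primes)
        for k in range(k0, k0 + n)
    )


primes = [3, 5, 7, 11, 13]               # small fixed sieving set (illustrative)
E = 51332417                             # arbitrary exponent coprime to the primes
period = prod(primes)                    # window length: 15015
expected = prod(P - 1 for P in primes)   # 2*4*6*10*12 survivors per window

low = count_survivors(E, 1, period, primes)
high = count_survivors(E, 2**46, period, primes)
print(low, high, expected)
```

If a sieve with a fixed prime set shows the survivor fraction drifting with K, that points at the bit-to-clear computation rather than at the mathematics.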
[QUOTE=axn;493371]Not if you keep the set of sieving primes the same. If you increase the sieving primes, that will result in fewer survivors.
This is assuming you're sieving with fewer primes than sqrt(candidate).[/QUOTE] That's what I thought. I need to find the bug that produces the observed behavior, then. |
[QUOTE=axn;493371]This is assuming you're sieving with fewer primes than sqrt(candidate).[/QUOTE]
Is sqrt(q=2*e*k+1), or sqrt(k) the prime magnitude limit? If it's sqrt(k), this may be it. If I sieve with primes up to 2^23, exponent 2^28, then TF under 75bits would have slightly reduced filtering. |
[QUOTE=preda;493400]Is sqrt(q=2*e*k+1), or sqrt(k) the prime magnitude limit?
If it's sqrt(k), this may be it. If I sieve with primes up to 2^23, exponent 2^28, then TF under 75bits would have slightly reduced filtering.[/QUOTE] It is the first one. If you're sieving for 75 bits (i.e., from 2^74 to 2^75), then as long as you're using primes < 2^37 (and 2^23 is well under that), you'll be sieving out a roughly constant proportion of the candidates. There will be variations, but more or less the same fraction will be left if you sieve any range from 2^64 upward. Can you provide some stats on the pattern you're observing (the fraction of survivors)? |
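To put numbers on the two limits being compared, here is the arithmetic for the figures mentioned in the exchange (exponent e around 2^28, candidates q just under 2^75). Rounding everything to powers of two is a simplification made here for illustration.

```latex
\begin{aligned}
q = 2ek + 1 < 2^{75},\quad e \approx 2^{28}
  \quad&\Longrightarrow\quad k \approx \frac{2^{75}}{2 \cdot 2^{28}} = 2^{46},\\
\sqrt{q} < 2^{37.5},
  \quad&\phantom{\Longrightarrow}\quad \sqrt{k} \approx 2^{23}.
\end{aligned}
```

So sieving primes up to 2^23 sit right at sqrt(k) for this range but far below sqrt(q), and sqrt(q) is the limit that matters. Below it, the expected surviving fraction is the product of (1 - 1/p) over the sieving primes p that do not divide 2e, which does not depend on k.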