mersenneforum.org  

Old 2013-03-11, 07:15   #2234
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

2·3·31·47 Posts

We have had this discussion a few times in the past; I still remember the last attempt. It turns out that one would need a good way to eliminate the algebraic and intrinsic factors first, and only then do the sieving. It is not difficult to change mfaktc if one doesn't care about missing some factors and the goal is just to find some other factors, in order or not. If someone is going to make a new mfaktc, I would suggest introducing a new flag, like "-allowcomposite" or whatever, so that the program does not check whether the exponent is composite when the flag is present (but still checks that it is odd, otherwise we have trouble with 3,5 (mod 8), and more classes to parse, different logic). Modifying the program to always allow composite exponents could result in futile work when someone makes a typo, for example. What I want to say is that it is better to keep the primality check in the program's default options, but allow odd composites if some "special" flag is present. For these odd composites, the program would work exactly the same way as it does for prime exponents. Of course, it will miss the algebraic factors.
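
The flag behaviour proposed in this post could look roughly like the sketch below. It is a Python illustration only, not mfaktc's actual code (mfaktc is written in C/CUDA). The names `check_exponent` and `allow_composite` are hypothetical and stand in for the suggested "-allowcomposite" switch.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for a sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def check_exponent(exponent: int, allow_composite: bool = False) -> bool:
    """Return True if the exponent should be accepted for trial factoring.

    Hypothetical policy following the post: the exponent must always be
    odd; it must also be prime unless the special flag is given.
    """
    if exponent < 3 or exponent % 2 == 0:
        return False            # even exponents would need different class logic
    if allow_composite:
        return True             # odd composite accepted; algebraic factors are missed
    return is_prime(exponent)   # default: guards against typos in the exponent
```

With these assumptions, `check_exponent(1275)` is rejected by default, while `check_exponent(1275, allow_composite=True)` is accepted; an even exponent is rejected either way.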

Last fiddled with by LaurV on 2013-03-11 at 07:29
Old 2013-03-11, 07:23   #2235
ixfd64
Bemusing Prompter
 
 
"Danny"
Dec 2002
California

100100000000₂ Posts

Quote:
Originally Posted by akruppa View Post
Correct, this is of no value to GIMPS as a prime search project. Some people, including me, sometimes do look for factors of large 2^n-1 numbers, and it would be awesome to be able to use GPUs for the job. If factoring composite exponents is permitted, I think it should be done with minimal developer effort, even if that neglects some relatively easy optimizations that could be done, precisely because factoring such numbers is not the main purpose of mfaktc.
True, there is no harm in giving mfaktc the ability to TF composite exponents. Perhaps there could be a switch that makes it skip the primality check.
Old 2013-03-11, 10:18   #2236
akruppa
 
 
"Nancy"
Aug 2002
Alexandria

2,467 Posts

Quote:
Originally Posted by ixfd64 View Post
True, there is no harm in giving mfaktc the ability to TF composite exponents. Perhaps there could be a switch that makes it skip the primality check.
I'm not familiar with the mfaktc code at all, or I could probably make the changes myself. If the changes to allow composite exponents are implemented, maybe I can add some code to skip useless classes according to quadratic character, if such changes are well-localized.
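
One way to picture skipping classes is the sketch below. It assumes candidates of the form q = 2·k·n + 1 for an odd exponent n, split into classes by k mod 420 (or 4620). A class is dropped when none of its candidates can be a prime factor: either q is not ±1 (mod 8), the condition that 2 is a quadratic residue mod q, or q is divisible by one of the small primes dividing the class count. The function name `surviving_classes` is made up for illustration, and this is not taken from mfaktc's source.

```python
def surviving_classes(exponent, num_classes=420):
    """Return the class indices c (k ≡ c mod num_classes) worth sieving.

    Assumes num_classes is a multiple of 4, so that q mod 8 depends
    only on the class index.
    """
    small_primes = [p for p in (3, 5, 7, 11) if num_classes % p == 0]
    keep = []
    for c in range(num_classes):
        q = 2 * c * exponent + 1
        if q % 8 not in (1, 7):
            continue  # 2 would not be a square mod q
        if any(q % p == 0 for p in small_primes):
            continue  # every candidate in this class has a small factor
        keep.append(c)
    return keep
```

For the prime exponent 1277 this keeps 96 of 420 classes (960 of 4620). For the odd composite 1275 = 3·5²·17 it keeps 180 of 420, because candidates 2·k·1275 + 1 are never divisible by 3 or 5, so those primes remove no classes.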
Old 2013-03-11, 17:48   #2237
TheJudger
 
 
"Oliver"
Mar 2005
Germany

3³×41 Posts

Removing the check for a prime exponent is easy, and the classes stuff is easy too, as long as we keep 420/4620 classes. But the (CPU) sieve needs to be reworked: currently there is no code to remove primes from the sieve base, but this would be needed for composite exponents, otherwise there will be an endless loop in the offset calculation. Anyway, it seems feasible if someone wants to do it.
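
Why a sieve prime that divides the exponent is a problem can be seen in a small sketch. For a sieve prime r, the sieve needs the first k at which r divides q = 2·k·n + 1. If r divides n, then q ≡ 1 (mod r) for every k, so a search for that offset never terminates and r has to be dropped from the sieve base. The function below is a hypothetical Python illustration of the idea (it steps k by 1 and ignores classes), not mfaktc's actual offset code.

```python
def sieve_offset(exponent, k_start, r):
    """Smallest j >= 0 such that r divides 2*(k_start + j)*exponent + 1.

    r is assumed to be an odd prime. Returns None when r divides the
    exponent, because then no candidate is ever divisible by r.
    """
    if exponent % r == 0:
        return None
    # Solve 2*k*exponent ≡ -1 (mod r) for k using a modular inverse
    # (three-argument pow with exponent -1 needs Python 3.8 or later).
    k_hit = (-pow(2 * exponent, -1, r)) % r
    return (k_hit - k_start) % r
```

For a prime exponent larger than every sieve prime the `None` case cannot occur, which would explain why the existing sieve does not need it.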

Oliver
Old 2013-03-11, 18:30   #2238
akruppa
 
 
"Nancy"
Aug 2002
Alexandria

2,467 Posts

Btw, what is the relationship between mfaktc and mmff? I haven't kept up with developments, and digging through a 100-page thread seems a daunting task... is one a superset of the other?
Old 2013-03-11, 18:43   #2239
ixfd64
Bemusing Prompter
 
 
"Danny"
Dec 2002
California

2⁸·3² Posts

Quote:
Originally Posted by akruppa View Post
Btw, what is the relationship between mfaktc and mmff? I haven't kept up with developments, and digging through a 100-page thread seems a daunting task... is one a superset of the other?
mfaktc is for TF'ing GIMPS-class numbers, and mmff is for double Mersenne and Fermat numbers. The mmff source code is largely based on that of mfaktc.

Last fiddled with by ixfd64 on 2013-03-11 at 18:43
Old 2013-03-12, 00:11   #2240
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

13132₈ Posts

Quote:
Originally Posted by ixfd64 View Post
mfaktc is for TF'ing GIMPS-class numbers, and mmff is for double Mersenne and Fermat numbers. The mmff source code is largely based on that of mfaktc.
Although I think the gpu sieving code now in mfaktc was originally in mmff. Just to be confusing.
Old 2013-03-17, 17:03   #2241
TheJudger
 
 
"Oliver"
Mar 2005
Germany

3³×41 Posts

So finally I got my hands on a Titan (temporarily), too.
It seems that I underestimated the boost clock. With stock clock rates the Titan is the fastest GPU for mfaktc, but it wins only by a small margin over the old GTX 580.

There might be a very small performance increase for the Titan once I test the new funnel shift instruction. The barrett_{76,77,79} kernels don't make use of multiword shifts except for the initialization, but barrett_{87,88,92} do a multiword shift in each iteration.
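
What a funnel shift buys for a multiword shift can be emulated in a few lines. The sketch below is Python, not kernel code, and the function names are made up for illustration. It assumes the instruction takes two 32-bit words as one 64-bit value hi:lo, shifts it left, and returns the upper 32 bits.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift_left(lo, hi, shift):
    """Upper 32 bits of the 64-bit value hi:lo shifted left by shift (0..31)."""
    shift &= 31
    return ((((hi << 32) | lo) << shift) >> 32) & MASK32


def shift_left_words(words, shift):
    """Shift a little-endian list of 32-bit words left by shift bits (0..31).

    Each output word is one funnel shift of a word and its lower
    neighbour; bits shifted out of the top word are dropped.
    """
    out = []
    prev = 0  # the word below the least significant one is zero
    for w in words:
        out.append(funnel_shift_left(prev, w, shift))
        prev = w
    return out
```

Without such an instruction, each output word would take two shifts and an OR, so a kernel that does a multiword shift in every iteration has more to gain than one that only shifts during initialization.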

Oliver
Old 2013-03-17, 17:08   #2242
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

2×37×41 Posts

Quote:
Originally Posted by TheJudger View Post
So finally I got my hands on a Titan (temporarily), too.
Would you mind sending me a benchmark? I'd feel better about my mfaktc performance chart ratios if I had more than 1 benchmark to go on.
Old 2013-03-20, 20:00   #2243
TheJudger
 
 
"Oliver"
Mar 2005
Germany

3³×41 Posts

Yepp, I was right: the funnel shift gives a small advantage.

A quick hack using stock mfaktc 0.20 code and barrett_87 for testing on a Tesla K20 (CUDA 5.0)
Code:
base                             300.8 GHzd/d
added code generation for sm_35  298.1 GHzd/d
using funnel shift in barrett_87 308.9 GHzd/d
Using funnel shift in the initialization phase causes a very small slowdown!
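
To put the three figures above in relative terms, here is a small calculation; the helper name `percent_change` is just for illustration.

```python
def percent_change(new, old):
    """Relative change of new versus old, in percent."""
    return (new - old) / old * 100.0


base = 300.8    # GHzd/d, stock mfaktc 0.20
sm_35 = 298.1   # GHzd/d, with code generation for sm_35 added
funnel = 308.9  # GHzd/d, with funnel shift in barrett_87

print(f"sm_35 vs base:  {percent_change(sm_35, base):+.2f}%")
print(f"funnel vs base: {percent_change(funnel, base):+.2f}%")
```

That is roughly a 0.9% loss from the sm_35 code generation alone and a net gain of about 2.7% over the base build once the funnel shift is used.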

So barrett_87 now beats barrett_77; only barrett_76 is faster on GK110.
For the current TF wavefront the impact is even lower because we do TF to 2^73 there. But hey, it is an improvement...

Oliver
Old 2013-03-23, 02:16   #2244
Xyzzy
 
 
"Mike"
Aug 2002

17034₈ Posts

We have been messing around with the extremely confusing "EVGA Precision X" software. There are so many options it is ridiculous!

Anyways, we were messing with the memory clock setting. By default on our GTX690 it is 1502.3MHz. We lowered this to 1252.8MHz and the performance did not change!

The temperatures and voltages do not change, either.

Does this make sense?

Would running the memory slower like that make it less likely to have a flipped bit?

FWIW, the GPU clock is 1058.2MHz. The performance changes a lot when we mess around with that!

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.