mersenneforum.org  

Old 2010-05-06, 13:10   #188
ET_
Banned
 
 
"Luigi"
Aug 2002
Team Italia

61×79 Posts
Default

Quote:
Originally Posted by TheJudger View Post
Hi Luigi,



just to be sure: did mfaktc find all the already known factors within the ranges, too?

About the triple check: you noticed that one of the three new factors just appeared with Factor5. Is the triple check still running, or didn't the others come up? (False positives?)

Oliver
I can confirm that all factors discovered by OBD were found by mfaktc (range 1-69 or 1-71). There are still 5 factors not checked because they are above 71 bits. The three new factors found for the first time by mfaktc have been rediscovered by Factor5.
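For readers who want to repeat this kind of cross-check, here is a minimal Python sketch. It is not the code of mfaktc, OBD, or Factor5; the function names are made up for illustration. It relies only on the fact that f divides 2^p - 1 exactly when 2^p mod f equals 1, and it uses small textbook factors (M11 = 23 × 89, M67 = 193707721 × 761838257287) rather than any factor from this thread.

```python
def is_factor_of_mersenne(p: int, f: int) -> bool:
    """Return True if f divides 2**p - 1 (checked via modular exponentiation)."""
    return f > 1 and pow(2, p, f) == 1


def bit_level(f: int) -> int:
    """Number of bits in f, i.e. the n with 2**(n-1) <= f < 2**n."""
    return f.bit_length()


def within_tf_range(f: int, max_bits: int = 71) -> bool:
    """True if a run up to 2**max_bits should have been able to find f."""
    return bit_level(f) <= max_bits


# Well-known small examples, used here only as stand-ins for real results.
known = [(11, 23), (11, 89), (67, 193707721), (67, 761838257287)]
for p, f in known:
    print(p, f, is_factor_of_mersenne(p, f), bit_level(f), within_tf_range(f))
```

A factor that reports a bit level above 71 would fall into the "not checked" group mentioned above.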

Total success.

Luigi
Old 2010-05-06, 13:18   #189
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11×101 Posts
Default

Quote:
Originally Posted by ET_ View Post
I can confirm that all factors discovered by OBD were found by mfaktc (range 1-69 or 1-71). There are still 5 factors not checked because they are above 71 bits. The three new factors found for the first time by mfaktc have been rediscovered by Factor5.

Total success.

Luigi
Yeah, good news!
To be honest: I was a little bit afraid, but now I'm happy!

Oliver
Old 2010-05-06, 13:19   #190
ET_
Banned
 
 
"Luigi"
Aug 2002
Team Italia

61·79 Posts
Default

Quote:
Originally Posted by TheJudger View Post
How fast is it compared to Factor5?
AFAIK Factor5 is using GMP functions which allow _MUCH_ bigger factor limits so the comparison is not 100% fair...

Oliver
mfaktc is about 60-80 times faster than Factor5 in the range 1-71 of 3,321,xxx,xxx exponents.

It should be at least 5-10 times faster than Prime95 (I will do some benchmarking tonight).

As for the comparison, I have some improvements planned for Factor5_64 bits, but they only take advantage of integer k and GCD.

You are right, I wrote Factor5 just to play with very big exponents (like M41234123412341, which I took up to 82.3 bits and Ernst Mayer to 85, finding a nice big factor). As there was no software ready for that, I didn't optimize it for best efficiency, preferring versatility. :guilty smile:
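For context on the "integer k" mentioned above: any factor q of M_p (p an odd prime) has the form q = 2kp + 1 and satisfies q ≡ 1 or 7 (mod 8). The Python sketch below walks k upward and tests each candidate with a modular power. It only illustrates the idea and is not how Factor5 or mfaktc are written; real programs also sieve the candidates by small primes before testing them, which this sketch skips, and the function name is hypothetical.

```python
from typing import Iterator


def trial_factor(p: int, max_bits: int) -> Iterator[int]:
    """Yield divisors q = 2*k*p + 1 of 2**p - 1 with q < 2**max_bits.

    Assumes p is an odd prime. Divisors may be composite, because no
    primality test or sieve is applied to the candidates.
    """
    limit = 1 << max_bits
    k = 1
    while True:
        q = 2 * k * p + 1
        if q >= limit:
            return
        # Cheap filter first: factors of M_p are 1 or 7 mod 8.
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            yield q
        k += 1


print(list(trial_factor(11, 8)))   # divisors of M11 below 2^8
print(list(trial_factor(29, 12)))  # divisors of M29 below 2^12
```

Python integers are arbitrary precision, so the same loop works for very large exponents, just slowly; fixed-width arithmetic is one of the trade-offs behind the speed gap discussed in this thread.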

Luigi
Old 2010-05-06, 13:32   #191
axn
 
 
Jun 2003

11735₈ Posts
Default

Quote:
Originally Posted by ET_ View Post
mfaktc is about 60-80 times faster than Factor5 in the range 1-71 of 3,321,xxx,xxx exponents.

It should be at least 5-10 times faster than Prime95 (I will do some benchmarking tonight).
What is your GPU?
Old 2010-05-06, 13:42   #192
ET_
Banned
 
 
"Luigi"
Aug 2002
Team Italia

61·79 Posts
Default

Quote:
Originally Posted by axn View Post
What is your GPU?
GTX 275@1404 MHz, a G200 series.

I will test my 9500M GS @ 950 MHz as soon as the mfaktc code for Windows is released...

Luigi
Old 2010-05-10, 13:29   #193
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11·101 Posts
Default

Hi,

I had access to a GTX 480. The mfaktc code works without changes, but I needed to adjust the compile script. Without code changes the performance is ~50% higher than on my GTX 275. This is a little lower than expected, but it's fast anyway.
Using 2 cores of a Core i7 750 (by starting 2 instances of mfaktc 0.07-pre2) it can take two exponents (M115.xxx.xxx) from 2^63 to 2^71 in ~2h 15m. This means ~21 per day!
The siever of 0.07 is a little bit faster compared to 0.06 (at least on Core i7 series, untested on other CPU types).
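The "~21 per day" figure follows directly from the numbers above, as this short Python sketch shows. The second function rests on an assumption of mine, not a measurement from this thread: that the work in a bit level is proportional to the number of candidates in it, which doubles with every level.

```python
def exponents_per_day(n_parallel: int, hours_per_batch: float) -> float:
    """Throughput when n_parallel exponents finish every hours_per_batch hours."""
    return n_parallel * 24 / hours_per_batch


def level_share(lo_bits: int, hi_bits: int, level: int) -> float:
    """Share of all candidates between 2**lo_bits and 2**hi_bits that lie in
    the single bit level 2**level .. 2**(level + 1).

    Assumes candidate count (and hence work) is proportional to interval width.
    """
    return 2**level / (2**hi_bits - 2**lo_bits)


# Two instances, one exponent each, ~2h 15m per pair.
print(f"{exponents_per_day(2, 2 + 15 / 60):.1f} exponents per day")

# Under the stated assumption, the last level (2^70 to 2^71) is about half the run.
print(f"{level_share(63, 71, 70):.3f}")
```

So under that assumption, stopping at 2^70 instead of 2^71 would roughly double the number of exponents handled per day.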

Oliver

Last fiddled with by TheJudger on 2010-05-10 at 13:31
Old 2010-05-10, 23:46   #194
kjaget
 
 
Jun 2005

129₁₀ Posts
Default

Here's my attempt at a 64-bit windows port. It has successfully found a few factors that I found with earlier versions, but I wouldn't say it's heavily tested.

If anyone wants a 32-bit version I can do that as well. Nvidia is out to annoy me (you can't have both 32 and 64 bit versions of the CUDA tools installed at the same time) but it's not a huge effort to switch back and forth.

The ZIP file includes source as well as an EXE. The source is mainly there for TheJudger to look at (and see how badly I mangled it), but others are welcome to fix any bugs they find.

Report any problems or questions here in the thread.

mfaktc-0.06-win.zip

Last fiddled with by kjaget on 2010-05-11 at 00:01
Old 2010-05-11, 00:01   #195
kjaget
 
 
Jun 2005

3·43 Posts
For the truly brave

Here's a version which includes two changes I've been working on in parallel with the main bit of work from TheJudger.

First off, you can specify how many streams (GPU threads) to spawn in the ini file. I added this because in my testing, I had better results with 3-5 threads instead of the 2 hardcoded in the current version. I think this is an issue specific to Win7, but others may find it useful.

The second addition is code which tries to minimize the execution time per TF class; turn it on by setting SievePrimesAdjust=2 in the ini file. This differs from the current code, which adjusts to keep the average wait time in an optimum range. On my system, I see a noticeable improvement using this approach, and I'm curious whether it helps on other systems.

The code is a delta off the code I posted previously and also includes a Windows exe. I don't have a Linux system to build on, sorry. If the Win porting code I added is OK it should build cleanly, but I'm not willing to guarantee that's 100% true.

Use this at your own risk. I've done some testing but certainly not enough to say it's ready for prime time. I'm posting it mostly to get another (few) sets of eyes to look at the code but if anyone wants to run it against some known factors that would be great as well.

mfaktc-0.06-hack.zip
Old 2010-05-11, 03:25   #196
kjaget
 
 
Jun 2005

3·43 Posts
Default

OK, already found a small problem with the hacked version. For the new way of changing sieve primes it was starting at the wrong value (picking the value from the ini file rather than searching starting from the middle of the full range from min to max).
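To make the idea concrete, here is a minimal Python sketch of one way such a tuner could work: start in the middle of the allowed range and hill-climb towards the setting with the smallest measured time per class. This is not the code in the zip files; the function name, the `time_per_class` callback, and the step-halving rule are all assumptions made for illustration, and the timing function in the usage lines is a fake stand-in.

```python
from typing import Callable


def tune_sieve_primes(time_per_class: Callable[[int], float],
                      lo: int, hi: int, rounds: int = 32) -> int:
    """Search [lo, hi] for the value with the smallest measured time per class."""
    best = (lo + hi) // 2              # start from the middle of the full range
    best_time = time_per_class(best)
    step = max((hi - lo) // 4, 1)
    direction = 1
    for _ in range(rounds):
        candidate = min(hi, max(lo, best + direction * step))
        t = time_per_class(candidate)
        if t < best_time:
            best, best_time = candidate, t   # improvement: keep going this way
        else:
            direction = -direction           # no improvement: turn around
            step = max(step // 2, 1)         # and take smaller steps
    return best


# Fake, noise-free timing curve with its minimum at 30000 (illustration only).
fake_timer = lambda sp: (sp - 30000) ** 2 / 1e6 + 5.0
print(tune_sieve_primes(fake_timer, 5000, 100000))
```

Real timings are noisy, so a practical version would probably average several classes per measurement before deciding which way to move.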

Try this instead of the last one.

mfaktc-0.06-hack-2.zip
Old 2010-05-11, 08:02   #197
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11·101 Posts
Default

Hi kjaget,

thank you for your modifications / hints / tests.
I'll take a look at them later.
I've played around with the number of GPU streams, too, but on my system there was no difference between 2 and 4 streams. Since you say it is useful on Windows, I'll add it. I screwed up the code path without USE_ASYNC_COPY in 0.06. (I knew about this but didn't mention it here or in the readme.) I think I'll remove this part completely (not needed except for some integrated GPUs?).

Oliver
Old 2010-05-11, 08:51   #198
ET_
Banned
 
 
"Luigi"
Aug 2002
Team Italia

61×79 Posts
Default

I just tried the hack-2 version on Windows 7 64-bit.

It complains about cudart.dll.

I have cudart64_30_14.dll and cudart32_30_14.dll on my system. No cudart.dll.

Luigi

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.