mersenneforum.org

mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

nucleon 2010-11-16 22:32

While on this one:

Factor=F2D758D07E3AEF797F7F4FDC22FD64D4,79235809,65,66

It was displaying these:
class 3767: 12.85M candidates in 186ms (69.06M/sec) (avg. wait: 124usec)
class 3776: 12.85M candidates in 187ms (68.69M/sec) (avg. wait: 126usec)
class 3779: 12.85M candidates in 186ms (69.06M/sec) (avg. wait: 125usec)
class 3780: 12.85M candidates in 190ms (67.61M/sec) (avg. wait: 119usec)
class 3791: 12.85M candidates in 187ms (68.69M/sec) (avg. wait: 126usec)
class 3795: 12.85M candidates in 187ms (68.69M/sec) (avg. wait: 122usec)
class 3800: 12.85M candidates in 186ms (69.06M/sec) (avg. wait: 120usec)

Default settings

My CPU is an i7-930 (not overclocked). I'm also running prime95 on the CPU: 1x LL, 2x DC, 1x P-1. I've set mfaktc's affinity so that it runs on the HT sibling of the core doing the P-1 work, i.e. CPU6 is running P-1 and CPU7 is running mfaktc.

The M/sec figure does jump when I turn off prime95.

-- Craig
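
The M/sec figure in each status line above is just the candidate count divided by the elapsed time. Below is a minimal Python sketch that parses one such line and recomputes the rate. The regular expression is written only against the lines quoted in this thread (it is not taken from mfaktc itself), and the function names are made up for illustration.

```python
import re

# Pattern for one status line as quoted in this thread (assumed format,
# derived only from the sample lines above).
LINE_RE = re.compile(
    r"class\s+(\d+):\s+([\d.]+)M candidates in (\d+)ms \(([\d.]+)M/sec\)"
)

def parse_line(line):
    """Return (class_id, candidates_in_millions, milliseconds, reported_rate)."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a class status line: " + line)
    return int(m.group(1)), float(m.group(2)), int(m.group(3)), float(m.group(4))

def recomputed_rate(line):
    """Millions of candidates per second, from the count and the elapsed time."""
    _, mcand, ms, _ = parse_line(line)
    return mcand / (ms / 1000.0)

sample = "class 3767: 12.85M candidates in 186ms (69.06M/sec) (avg. wait: 124usec)"
print(round(recomputed_rate(sample), 2))
```

The recomputed value lands within a few hundredths of the printed 69.06M/sec; the small gap is presumably because the candidate count is displayed rounded to two decimals.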

Karl M Johnson 2010-11-16 22:37

[QUOTE=TheJudger;237341]A GTX 460 should provide similar speed to a GTX 465, perhaps a little bit slower[/QUOTE]
An sm_21 GPU will never be faster than an sm_20 GPU unless it has 50% more shaders than the sm_20 (at the same clocks).

amphoria 2010-11-16 23:26

[QUOTE=nucleon;237375]
My CPU is an i7-930 (not overclocked). I'm also running prime95 on the CPU: 1x LL, 2x DC, 1x P-1. I've set mfaktc's affinity so that it runs on the HT sibling of the core doing the P-1 work, i.e. CPU6 is running P-1 and CPU7 is running mfaktc.

The M/sec figure does jump when I turn off prime95.
[/QUOTE]

mfaktc does require a real physical core (i.e. non-HT) as well as the GPU; otherwise performance drops off, as you have found.
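
To make the "real physical core" point concrete, here is a small Python sketch that works out which physical cores have no busy hyper-thread. It assumes logical CPUs 2n and 2n+1 are the two hyper-threads of physical core n, which matches the CPU6/CPU7 pairing described above but is not guaranteed on every system. The prime95 worker placements are hypothetical; only CPU6 (P-1) and CPU7 (mfaktc) come from the post.

```python
def physical_core(logical_cpu):
    # ASSUMPTION: logical CPUs 2n and 2n+1 share physical core n.
    return logical_cpu // 2

def free_physical_cores(busy_logical_cpus, n_logical=8):
    """Physical cores on which no hyper-thread is already busy."""
    busy_cores = {physical_core(c) for c in busy_logical_cpus}
    return [core for core in range(n_logical // 2) if core not in busy_cores]

def affinity_mask(logical_cpu):
    """Bit mask with only the given logical CPU set."""
    return 1 << logical_cpu

# Hypothetical placements: one prime95 worker per physical core.
four_workers = [0, 2, 4, 6]
three_workers = [0, 2, 4]

print(free_physical_cores(four_workers))   # no physical core left for mfaktc
print(free_physical_cores(three_workers))  # core 3 (CPU6 and CPU7) is free
print(hex(affinity_mask(7)))               # mask selecting CPU7 only
```

With four workers every physical core is already shared, which is the situation in the quoted post; with three workers one whole core is left for mfaktc.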

TheJudger 2010-11-17 11:41

Hi Craig,

As amphoria already mentioned: if you have a decent GPU, each instance of mfaktc needs one CPU core. And even a full core of your i7 930 might struggle to keep your GTX 460 busy all the time.
You could try higher bit levels, too. mfaktc works best on "long running assignments" :wink:

Karl: 50% more shaders at the same clock is too much for mfaktc. It doesn't take full advantage of ILP (cc 2.1), but the extra shaders are not totally useless.

Oliver
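
The "50% more shaders" figure and Oliver's reply can be put into a deliberately simplified model. A cc 2.0 multiprocessor has 32 cores and a cc 2.1 multiprocessor has 48, but the extra 16 are only kept busy when a warp offers a second independent instruction (ILP). The Python sketch below treats the share of cycles where that happens as a single parameter, `ilp_fraction`. That parameter is an illustrative assumption, not a measured property of mfaktc, and the model ignores everything else (memory, clocks, instruction mix).

```python
# Cores per multiprocessor for the two compute capabilities discussed here.
CORES_PER_SM = {"2.0": 32, "2.1": 48}
# Lanes per SM assumed busy when no independent instruction can be paired.
BASELINE_LANES = 32

def busy_lanes_per_sm(cc, ilp_fraction):
    """Average busy lanes per SM in this simplified model (ilp_fraction in 0..1)."""
    if not 0.0 <= ilp_fraction <= 1.0:
        raise ValueError("ilp_fraction must be between 0 and 1")
    extra = CORES_PER_SM[cc] - BASELINE_LANES
    return BASELINE_LANES + extra * ilp_fraction

def break_even_shader_ratio(ilp_fraction):
    """How many times more shaders a cc 2.1 GPU needs to match cc 2.0 at equal clocks."""
    per_core_20 = busy_lanes_per_sm("2.0", ilp_fraction) / CORES_PER_SM["2.0"]
    per_core_21 = busy_lanes_per_sm("2.1", ilp_fraction) / CORES_PER_SM["2.1"]
    return per_core_20 / per_core_21

for f in (0.0, 0.5, 1.0):
    print(f, break_even_shader_ratio(f))
```

With no ILP at all the model gives Karl's factor of 1.5; any partial use of ILP pulls the break-even point below that, which is the sense of "too much" in the reply above.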

Karl M Johnson 2010-11-17 14:15

Yep, those GPUs are not totally useless. They're cheap. And NV will probably have to make a dual-GPU card out of them, to be realistic.

nucleon 2010-11-17 14:57

Thanks guys. So it sounds like I have to drop prime95 from using 4 cores down to 3 on my machine (i7 930), to fully utilize the GPU.

By my rough calculations, my machine generates more GHz-days with 3 cores on prime95 + a full core & GPU than with 3.5 cores on prime95 + 0.5 core & GPU (rough figures).

When I have more time, I'll look to re-jig prime95 back down to only 3 cores in action.

-- Craig

nucleon 2010-11-20 23:56

I've reduced prime95 down to 3 workers, freeing up a core.

So now I get these timings:

Factor=3740F5667A00CCE5F5829843BC2B6921,78524917,69,70


class 812: 203.69M candidates in 1535ms (132.69M/sec) (avg. wait: 151usec)
class 815: 203.69M candidates in 1510ms (134.89M/sec) (avg. wait: 148usec)
class 819: 203.69M candidates in 1514ms (134.53M/sec) (avg. wait: 146usec)


-- Craig
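
For readers wondering where the class numbers and the much larger per-class counts come from: the Factor= lines in this thread read as assignment id, exponent, lower bit level, upper bit level, and candidate factors of 2^p - 1 have the form q = 2kp + 1. The Python sketch below assumes mfaktc's default of 4620 classes (k taken modulo 4 * 3 * 5 * 7 * 11) and that a class is skipped when every q in it is divisible by 3, 5, 7 or 11 or is not 1 or 7 modulo 8. Function names are hypothetical.

```python
NUM_CLASSES = 4620  # 4 * 3 * 5 * 7 * 11; ASSUMPTION: mfaktc's default class count

def parse_factor_line(line):
    """Split 'Factor=<id>,<exponent>,<bit_min>,<bit_max>' into its fields."""
    key, _, rest = line.partition("=")
    if key != "Factor":
        raise ValueError("not a Factor= line: " + line)
    assignment_id, exponent, bit_min, bit_max = rest.split(",")
    return assignment_id, int(exponent), int(bit_min), int(bit_max)

def class_is_tested(exponent, c):
    """True if q = 2*k*p + 1 with k % 4620 == c can be a factor of 2^p - 1."""
    q = 2 * c * exponent + 1          # same residues as any k in the class
    if q % 8 not in (1, 7):
        return False
    return all(q % small != 0 for small in (3, 5, 7, 11))

def raw_candidates_per_class(exponent, bit_min, bit_max):
    """Count of k per class with 2^bit_min <= 2*k*p + 1 < 2^bit_max, before sieving."""
    k_min = (1 << bit_min) // (2 * exponent)
    k_max = (1 << bit_max) // (2 * exponent)
    return (k_max - k_min) // NUM_CLASSES

_, p, lo, hi = parse_factor_line("Factor=3740F5667A00CCE5F5829843BC2B6921,78524917,69,70")
print(raw_candidates_per_class(p, lo, hi) / 1e6)             # millions, before sieving
print(sum(class_is_tested(p, c) for c in range(NUM_CLASSES)))  # classes actually run
print([c for c in range(3767, 3801) if class_is_tested(79235809, c)])
```

Under these assumptions the last line reproduces the class numbers shown for the earlier 65-to-66-bit run, and each extra bit level doubles the per-class count, which is why the 69-to-70-bit classes take so much longer.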

TheJudger 2010-11-23 17:18

Hi Craig,

I think your setup works as expected. I would guess that SievePrimes drops down to 5000, right? This indicates that mfaktc still runs CPU-limited on your system. But this should be OK; don't move "too many" CPU cores from LL and P-1 to TF, there is enough TF being done on PrimeNet.

Oliver
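
The SievePrimes guess can be sanity-checked against the numbers Craig posted. The Python sketch below assumes that SievePrimes is the number of primes used for CPU sieving, starting at 13 (taking 2, 3, 5, 7 and 11 as already handled by a split into 4620 classes), and that each sieve prime r removes 1/r of the remaining candidates. Those semantics are my reading, not something stated in this thread, and the helper names are made up.

```python
def first_primes(count):
    """Return the first `count` primes using a sieve of Eratosthenes."""
    limit = 100
    while True:
        sieve = bytearray([1]) * (limit + 1)
        sieve[0:2] = b"\x00\x00"
        for n in range(2, int(limit ** 0.5) + 1):
            if sieve[n]:
                # Clear all multiples of n from n*n upward.
                sieve[n * n :: n] = bytearray(len(range(n * n, limit + 1, n)))
        primes = [n for n in range(limit + 1) if sieve[n]]
        if len(primes) >= count:
            return primes[:count]
        limit *= 2  # not enough primes yet, retry with a larger table

def surviving_fraction(sieve_primes):
    """Share of a class's candidates left after sieving with that many primes from 13 up."""
    primes = first_primes(sieve_primes + 5)[5:]  # skip 2, 3, 5, 7, 11
    fraction = 1.0
    for r in primes:
        fraction *= 1.0 - 1.0 / r
    return fraction

# Unsieved candidates per class for exponent 78524917, 69 to 70 bits (assumes 4620 classes).
raw_per_class = (2 ** 69) / (2 * 78524917) / 4620
observed = 203.69e6 / raw_per_class   # 203.69M is the count printed per class

print(round(surviving_fraction(5000), 4), round(observed, 4))
```

Both values come out at roughly a quarter, so under these assumptions the posted candidate counts look consistent with SievePrimes sitting at 5000. A larger SievePrimes would hand fewer candidates to the GPU at the cost of more CPU work per class.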

vsuite 2010-11-28 14:42

[QUOTE=amphoria;236817]A version compiled for Windows can be downloaded from here:

[url]http://www.sendspace.com/file/obirpl[/url][/QUOTE]

Hi, is it possible to compile for 32-bit Windows too? Thanks.

amphoria 2010-11-28 21:23

[QUOTE=vsuite;239037]Hi, is it possible to compile for 32-bit Windows too? Thanks.[/QUOTE]

I tried once but couldn't get it to compile.

TheJudger 2010-11-29 14:48

Hi Dave,

Any idea what's wrong with your 32-bit Windows build?
From time to time I run some tests with a 32-bit Linux build on my system: it works fine, just a little bit slower in the CPU part (the preselection of the candidates (sieving) runs 33% slower).

Regards,
Oliver

