mersenneforum.org  

Old 2011-12-10, 00:13   #1409
TheJudger
 
"Oliver"
Mar 2005
Germany

11·101 Posts

Quote:
Originally Posted by TheJudger View Post
Factory overclocked GTX 560Ti (1701MHz), barrett79 kernel, raw GPU speed (without sieving), M66362159 from 2^69 to 2^70
Code:
                 |  CUDA 3.2 | CUDA 4.1-RC1
mfaktc 0.17      | 260.94M/s |    261.93M/s
mfaktc 0.18-pre9 | 260.80M/s |    258.97M/s
Factory overclocked GTX 560Ti (1701MHz), barrett79 kernel, raw GPU speed (without sieving), M66362159 from 2^69 to 2^70
Code:
                  |  CUDA 3.2 | CUDA 4.1-RC2
mfaktc 0.17       | 260.94M/s |    261.93M/s
mfaktc 0.18-pre10 | 260.80M/s |    265.39M/s
A little bit better than before but there are no changes in the code of the barrett79 kernel from -pre9 to -pre10...

Factory overclocked GTX 560Ti (1701MHz), barrett92 kernel, raw GPU speed (without sieving), M3321932839 from 2^79 to 2^80
Code:
                  | CUDA 4.1-RC2
mfaktc 0.17       |    170.62M/s
mfaktc 0.18-pre10 |    173.32M/s
A little bit faster, too. But the difference between compute capability 2.0 and 2.1 increases further...

Oliver
Old 2011-12-19, 23:26   #1410
TheJudger
 
"Oliver"
Mar 2005
Germany

2127₈ Posts
mfaktc 0.18

Hello!

http://www.mersenneforum.org/mfaktc/mfaktc-0.18.tar.gz
http://www.mersenneforum.org/mfaktc/mfaktc-0.18.win.zip
http://www.mersenneforum.org/mfaktc/...linux64.tar.gz

The executables need at least a CUDA 4.0 capable driver (270 series driver or newer). The Windows zip archive contains both the 32-bit and the 64-bit version. I'll upload new executables once CUDA 4.1 is publicly available. The sources should compile with older CUDA versions, too, but the result might be slower. CUDA 4.1 will give another performance improvement for the Barrett-based kernels on compute capability 2.x GPUs (especially on 2.0).

Compared to mfaktc 0.17 there are "more than usual" minor changes. Highlights from the Changelog.txt:
  • autoadjustment of SievePrimes is now less dependent on the grid size and
    absolute speed. Instead of measuring the absolute (average) time waited
    per processing block (grid size), the relative time spent waiting
    for the GPU is now calculated. In the per-class output "avg. wait" is
    replaced by "CPU wait".
  • new command-line option: "-v" (verbosity) lets the user decide how much
    information is printed
    (suggested by aspen on www.mersenneforum.org)
  • "has a factor" result lines now contain information (program name,
    versions, bitlevel, ...). James Heinrich is working on this on the server
    side. This should give more accurate credits for "has a factor" results
    from the primenet server once this is fully implemented.
  • mfaktc no longer refuses to load a checkpoint file from a Linux version
    with a Windows version of mfaktc and vice versa. Of course mfaktc still
    refuses to load checkpoint files from other versions than itself
    (identical version string!)
  • added a (simple) signal handler (captures SIGINT and SIGTERM).
    1st ^C: mfaktc will exit after the currently processed class is finished.
    2nd ^C: mfaktc will stop immediately
  • added a minimum delay between two checkpoint file writes. The user can set
    the delay in mfaktc.ini (CheckpointDelay).
  • added a new code path to barrett79_mul32 and barrett92_mul32 kernels, CUDA
    >= 4.1 features multiply-add with carry for compute capability >= 2.0.
    On my GTX 470 (compute capability 2.0) this yields up to 15% for
    barrett92_mul32 and up to 7% for barrett79_mul32 extra throughput.
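
The two-stage ^C behaviour from the signal-handler item above can be sketched roughly as follows. This is a Python illustration, not mfaktc's actual C code; `StopController`, `run_classes` and `process_class` are hypothetical names made up for the example. The only thing taken from the changelog is the behaviour: the first SIGINT/SIGTERM lets the current class finish, the second one stops immediately.

```python
import signal


class StopController:
    """Two-stage stop: first signal asks for a graceful stop, second aborts."""

    def __init__(self):
        self.signals_seen = 0

    def handle(self, signum, frame):
        # Signature matches what signal.signal() expects from a handler.
        self.signals_seen += 1
        if self.signals_seen >= 2:
            # Second ^C: stop immediately.
            raise SystemExit(1)
        # First ^C: only record the request; the main loop checks it
        # after the class that is currently being processed.

    @property
    def stop_requested(self):
        return self.signals_seen >= 1

    def install(self):
        # Capture the same two signals the changelog mentions.
        signal.signal(signal.SIGINT, self.handle)
        signal.signal(signal.SIGTERM, self.handle)


def run_classes(class_ids, process_class, controller):
    """Process classes in order; stop after the current one if asked to."""
    finished = []
    for class_id in class_ids:
        process_class(class_id)
        finished.append(class_id)
        if controller.stop_requested:
            break
    return finished
```

In a real program one would call `controller.install()` once at startup and write a checkpoint before leaving the loop; both are left out here to keep the sketch short.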

As usual: finish your current assignments with your current version and update afterwards; mfaktc 0.18 will refuse foreign checkpoint files.
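
For anyone wondering what "refuse foreign checkpoint files" and the CheckpointDelay setting amount to, here is a minimal Python sketch under stated assumptions. The function and class names are hypothetical, the real checkpoint file format is not shown, and the version strings in the usage note are just examples. The two rules modelled are the ones from the changelog: a checkpoint is only accepted if its version string is identical to the running program's, and checkpoint writes are spaced by a minimum delay.

```python
import time


def checkpoint_is_usable(checkpoint_version, running_version):
    """Accept a checkpoint only if the version strings are identical.

    The operating system is deliberately not part of the comparison, so a
    checkpoint written on Linux is accepted on Windows and vice versa.
    """
    return checkpoint_version == running_version


class CheckpointThrottle:
    """Allow a checkpoint write only if the minimum delay has passed."""

    def __init__(self, min_delay_s, clock=time.monotonic):
        self.min_delay_s = min_delay_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_write = None

    def should_write(self):
        now = self.clock()
        if self.last_write is None or now - self.last_write >= self.min_delay_s:
            self.last_write = now
            return True
        return False
```

With this, `checkpoint_is_usable("0.17", "0.18")` is false, which is why finishing the current assignment before updating avoids losing work.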

Oliver
Old 2011-12-20, 00:43   #1411
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

10158₁₀ Posts
Kudos!

Many thanks, sir! I am impatient for my current assignments to finish so that I can put this version into service.
Old 2011-12-20, 01:34   #1412
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Would you mind posting the .dll/.so files on the mfaktc mirror? I'd rather not have to download the whole CUDA environment...

Last fiddled with by Dubslow on 2011-12-20 at 01:34
Old 2011-12-20, 03:47   #1413
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

5·1,931 Posts

Quote:
Originally Posted by TheJudger View Post
...mfaktc 0.18...
Output file (results.txt) customizable from the ini file? (including the path, for collecting all the results from all running processes of mfaktc in a single file).
Old 2011-12-20, 04:04   #1414
diamonddave
 
Feb 2004

2408 Posts

Quote:
Originally Posted by TheJudger View Post
• "has a factor" result lines now contain information (program name,
versions, bitlevel, ...) James Heinrich is working on this on the server
side. This should give more accurate credits for "has a factor" results
from the primenet server once this is fully implemented.
Many thanks! Can't wait to test this feature with a new exponent!
Old 2011-12-20, 05:05   #1415
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts

The new version seems to be working well. At least, there have been no problems reported.
Old 2011-12-20, 11:13   #1416
TheJudger
 
"Oliver"
Mar 2005
Germany

11·101 Posts

Quote:
Originally Posted by Dubslow View Post
Would you mind posting the .dll/.so files on the mfaktc mirror? I'd rather not have to download the whole CUDA environment...
They are included in the archives for the executables, aren't they?

Quote:
Originally Posted by LaurV View Post
Output file (results.txt) customizable from the ini file? (including the path, for collecting all the results from all running processes of mfaktc in a single file).
Well, I'm still unsure about this feature. Personally I don't like it but it seems that you and some others want it. Bdot (mfakto) tries to convince me, too.

So I guess I'll add this for 0.19?

Oliver
Old 2011-12-20, 13:55   #1417
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

11×311 Posts

Quote:
Originally Posted by TheJudger View Post
Well, I'm still unsure about this feature. Personally I don't like it but it seems that you and some others want it. Bdot (mfakto) tries to convince me, too.
I also think it would be good to have as a configurable option. Naturally you'll need to lock the file for writing for the split second it takes to write the result line so two instances don't try and write at the same time.
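
A minimal sketch of the locking idea, assuming a POSIX system (Python's `fcntl` module is not available on Windows, where something like `msvcrt.locking` would be needed instead). It is written in Python for brevity rather than in mfaktc's own C, and `append_result_line` is a hypothetical helper name, not anything from mfaktc.

```python
import fcntl
import os


def append_result_line(path, line):
    """Append one result line while holding an exclusive lock on the file."""
    with open(path, "a", encoding="utf-8") as f:
        # Blocks until no other instance holds the lock.
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            f.write(line.rstrip("\n") + "\n")
            f.flush()
            os.fsync(f.fileno())   # push the line to disk before unlocking
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```

Each instance would call this once per result, so the lock is only held for the split second the write takes.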

Along the same lines, a unified worktodo.txt would also be nice, perhaps split into [Worker #1], [Worker #2], etc sections. This is of course a little more work than a configurable results.txt, but lets us just deal with one in and one out for each machine, in a format that's already familiar to us from Prime95.
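
To make the sectioned layout concrete, here is a rough Python sketch of splitting such a file into per-worker lists. The `[Worker #N]` headers follow the Prime95 convention mentioned above; `split_worktodo` is a hypothetical name, and the assignment lines are treated as opaque text, so nothing here depends on mfaktc's actual worktodo syntax.

```python
import re

SECTION_RE = re.compile(r"^\[Worker #(\d+)\]$")


def split_worktodo(text):
    """Return {worker_number: [assignment lines]} for a sectioned worktodo."""
    workers = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                      # ignore blank lines
        match = SECTION_RE.match(line)
        if match:
            current = int(match.group(1))
            workers.setdefault(current, [])
        elif current is None:
            raise ValueError("assignment before first [Worker #N] section")
        else:
            workers[current].append(line)
    return workers
```

Each mfaktc instance (or each GPU) would then only look at the list for its own worker number.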

Even better would be to optimize/thread the sieving such that we'd only ever need to run a single mfaktc instance (sieving would spread across as many CPU cores as needed to feed the GPU(s)). But that's a whole other set of complications for a much later release.
Old 2011-12-20, 14:47   #1418
Chuck
 
May 2011
Orange Park, FL

15658 Posts

Great! Thanks for the update. I've got two instances running now.
Old 2011-12-20, 15:07   #1419
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts
.17 vs .18

This was rather a quick test, showing the difference between mfaktc .17 and .18. V.18 did eventually drop to SievePrimes 5000, though the time didn't really change that much.

EDIT: These were run with the same exponent in single instances.
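
For context, a rough Python sketch of how a relative "CPU wait" figure could drive SievePrimes, along the lines of the 0.18 changelog. Only the idea comes from the changelog (use the fraction of time the CPU spends waiting for the GPU instead of an absolute wait time). The thresholds, step size and both bounds are assumptions invented for illustration, not mfaktc's real values, and the function names are hypothetical.

```python
def cpu_wait_fraction(wait_time_s, total_time_s):
    """Share of a class's wall time that the CPU spent waiting for the GPU."""
    if total_time_s <= 0:
        raise ValueError("total_time_s must be positive")
    return wait_time_s / total_time_s


def adjust_sieve_primes(current, wait_fraction,
                        low=0.02, high=0.05, step=0.125,
                        minimum=5000, maximum=200000):
    """Nudge SievePrimes up or down based on the relative CPU wait.

    All default values here are illustrative assumptions.
    """
    if wait_fraction > high:
        # CPU has idle time: sieve more so the GPU gets fewer candidates.
        new_value = current * (1 + step)
    elif wait_fraction < low:
        # CPU never waits, so it is the bottleneck: sieve less.
        new_value = current * (1 - step)
    else:
        new_value = current
    return int(min(maximum, max(minimum, new_value)))
```

Under these assumptions a setup where the CPU almost never waits keeps stepping down until it hits the lower bound and stays there.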
Attached image: mfaktc_17-18.jpg (72.3 KB)

Last fiddled with by kladner on 2011-12-20 at 15:09

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.