mersenneforum.org

mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

TheJudger 2014-05-04 11:32

Hi James,

the good news is that there is a high chance that you're using the correct CUDA runtimes. The bad news is that I've no clue what's wrong there.

Oliver

James Heinrich 2014-05-04 11:57

[QUOTE=TheJudger;372610]the good news is that there is a high chance that you're using the correct CUDA runtimes. The bad news is that I've no clue what's wrong there.[/QUOTE]They're the DLLs that I've been using for years, and are the ones you distributed with mfaktc. :smile:

I am also confused, which is why I posted here. :ermm:

TheJudger 2014-05-12 17:57

OK, thanks to a friend of mine (not a GIMPS participant) I could get my hands on a Maxwell GPU. He bought a GTX 750Ti for his PC and then (instead of driving home to put his new GPU in his own system) he visited me and gave me the GPU for a quick test. :smile: Obviously I didn't spend time on optimizing the code (I'm not even sure whether any Maxwell-specific optimizations are possible/feasible in mfaktc).

For those who don't know what a (CUDA) multiprocessor (in NVIDIA terminology) is: stop reading here! When I say Maxwell I'm talking about CC 5.0 and observations made on a GTX 750Ti.

Talking about mfaktc:[LIST][*]performance per multiprocessor per clock seems to be a little bit [B]below[/B] my Kepler (GTX 680 aka GK104)[*]while browsing the [URL="http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#maximize-instruction-throughput"]CUDA documentation[/URL] I got worried about the integer multiplication throughput on Maxwell: is there no [I]native[/I] integer multiplication on Maxwell?[*]comparing chips of the same die size, I *guess* 28nm Maxwell wins by a small margin over 28nm Kepler[*]when talking about performance per watt, I *guess* 28nm Maxwell chips are the first choice[/LIST]
Oliver

P.S. I can't wait for the [I]big[/I] Maxwells, no matter if 28nm or 20nm.
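
To make the concern about a non-native integer multiply concrete, here is a minimal Python sketch of how a full 32x32-bit product can be assembled from 16x16-bit partial products. It only illustrates the arithmetic; the function name is hypothetical, and it makes no claim about the instruction sequence the CUDA compiler actually emits for CC 5.0.

```python
from typing import Tuple

MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mul32_from_16bit_parts(a: int, b: int) -> Tuple[int, int]:
    """Return (hi, lo) 32-bit halves of a*b using only 16x16-bit multiplies.

    Hypothetical helper for illustration; not taken from mfaktc.
    """
    if not (0 <= a <= MASK32 and 0 <= b <= MASK32):
        raise ValueError("operands must fit in 32 bits")

    # Split each operand into its low and high 16-bit halves.
    a_lo, a_hi = a & MASK16, a >> 16
    b_lo, b_hi = b & MASK16, b >> 16

    # Four partial products, each at most 32 bits wide.
    p_ll = a_lo * b_lo
    p_lh = a_lo * b_hi
    p_hl = a_hi * b_lo
    p_hh = a_hi * b_hi

    # Recombine with the appropriate shifts: a*b = p_hh*2^32 + (p_lh+p_hl)*2^16 + p_ll.
    product = p_ll + ((p_lh + p_hl) << 16) + (p_hh << 32)
    return (product >> 32) & MASK32, product & MASK32
```

One full product costs four narrow multiplies plus shifts and adds here, which is why a multiply built from several instructions would matter for a multiplication-heavy program such as a trial factoring kernel.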

blip 2014-05-13 05:36

[CODE]May 13 07:08 | 604 13.4% | 1.700 23m33s | 1157.78 82485 n.a.%
ERROR: cudaGetLastError() returned 73: an illegal instruction was encountered


Some background info:

mfaktc v0.20 (64bit built)

Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 25000
SievePrimesAdjust 1
SievePrimesMin 5000
SievePrimesMax 100000
NumStreams 3
CPUStreams 3
GridSize 3
GPU Sieving enabled
GPUSievePrimes 82486
GPUSieveSize 64Mi bits
GPUSieveProcessSize 16Ki bits
WorkFile worktodo.txt
Checkpoints enabled
CheckpointDelay 30s
Stages enabled
StopAfterFactor bitlevel
PrintMode full
V5UserID blip
ComputerID evclient004
AllowSleep no
TimeStampInResults no

CUDA version info
binary compiled for CUDA 6.0
CUDA runtime version 6.0
CUDA driver version 6.0

CUDA device info
name GeForce GTX 590
compute capability 2.0
maximum threads per block 1024
number of multiprocessors 16 (512 shader cores)
clock rate 1215MHz

Automatic parameters
threads per grid 1048576

running a simple selftest...
Selftest statistics
number of tests 92
successfull tests 92

selftest PASSED!

got assignment: exp=87475907 bit_min=73 bit_max=74 (21.87 GHz-days)
Starting trial factoring M87475907 from 2^73 to 2^74 (21.87 GHz-days)
k_min = 53984767285680
k_max = 107969534579839
Using GPU kernel "barrett76_mul32_gs"

found a valid checkpoint file!
last finished class was: 384
found 0 factor(s) already[/CODE]
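
The `k_min` and `k_max` values in the log above can be reproduced with a few lines of Python. Any factor of the Mersenne number M(p) = 2^p - 1 has the form q = 2kp + 1, so a bit range for q translates into a range for k. The sketch below assumes that `k_min` is rounded down to a multiple of the class count and that MORE_CLASSES means 4620 classes; both are assumptions that happen to match this log, not a statement about mfaktc's source code. The function name is hypothetical.

```python
def k_range(exponent, bit_min, bit_max, num_classes=4620):
    """Map the factor size range 2^bit_min .. 2^bit_max to a range of k.

    Candidate factors of 2^exponent - 1 have the form q = 2*k*exponent + 1.
    num_classes=4620 and the rounding of k_min are assumptions (see lead-in).
    """
    two_p = 2 * exponent
    k_min = (1 << bit_min) // two_p
    k_min -= k_min % num_classes  # assumed: align to the start of a class cycle
    k_max = (1 << bit_max) // two_p
    return k_min, k_max


if __name__ == "__main__":
    # Values taken from the assignment line in the log.
    print(k_range(87475907, 73, 74))  # (53984767285680, 107969534579839)
```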

blip 2014-05-13 05:56

GPU hangs after restart at self-test.
After switching jobs with the second GPU, both continue.

TheJudger 2014-05-13 18:59

blip: might it be defective hardware? It runs for a while and then throws a "cudaGetLastError() returned 73: an illegal instruction was encountered"... There is no JIT compiling in mfaktc; after startup everything is static.

Oliver

kracker 2014-05-14 19:03

So, do you think it is a step up or down from Kepler?

firejuggler 2014-05-14 20:13

or a sidestep?

James Heinrich 2014-05-27 12:36

[QUOTE=James Heinrich;372508]However, I broke mfaktc. While it has been running fine on 335.23 since that came out (early March), mfaktc won't start now[/QUOTE]The good news is that drivers v337.88 released today fixed the problem.

TheJudger 2014-09-19 19:53

Seems we have a new high score for energy-efficient trial factoring:

Stock/reference GTX 980
[CODE] ./mfaktc.exe -tf 66362159 72 73
mfaktc v0.21-pre6 (64bit built)
[...]
CUDA device info
name GeForce GTX 980
compute capability 5.2
maximum threads per block 1024
number of mutliprocessors 16 (unknown number of shader cores)
clock rate 1215MHz
[...]
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Sep 19 21:52 | 256 5.6% | 2.391 36m06s | 542.54 82485 n.a.%
Sep 19 21:52 | 261 5.7% | 2.373 35m48s | 546.66 82485 n.a.%
Sep 19 21:52 | 264 5.8% | 2.379 35m51s | 545.28 82485 n.a.%
Sep 19 21:52 | 265 5.9% | 2.380 35m49s | 545.05 82485 n.a.%
Sep 19 21:52 | 276 6.0% | 2.377 35m44s | 545.74 82485 n.a.%
[...]
[/CODE]

Oliver
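
The columns in these status lines are related by simple arithmetic, sketched below in Python. The sketch assumes that 960 of the 4620 classes are actually tested (an assumption, although it is consistent with the numbers logged in this thread), and all function names are hypothetical. The last helper normalizes throughput the way it was compared earlier in the thread, per multiprocessor per clock.

```python
SECONDS_PER_DAY = 86400


def remaining_seconds(pct_done, sec_per_class, classes_to_test=960):
    """Estimate the remaining run time from the 'Pct' and 'time' columns.

    classes_to_test=960 is an assumption, not read from the log.
    """
    remaining_classes = classes_to_test * (1.0 - pct_done / 100.0)
    return remaining_classes * sec_per_class


def ghz_days_per_day(assignment_ghz_days, sec_per_class, classes_to_test=960):
    """Throughput if one assignment takes classes_to_test * sec_per_class seconds."""
    run_seconds = classes_to_test * sec_per_class
    return assignment_ghz_days * SECONDS_PER_DAY / run_seconds


def per_sm_per_mhz(rate, multiprocessors, clock_mhz):
    """Normalize a GHz-d/day rate by multiprocessor count and clock rate."""
    return rate / (multiprocessors * clock_mhz)


if __name__ == "__main__":
    # GTX 980 line: 5.6% done at 2.391 s per class.
    eta = remaining_seconds(5.6, 2.391)
    print(f"ETA: {int(eta // 60)}m{int(eta % 60):02d}s")

    # GTX 590 log earlier in the thread: 21.87 GHz-days at 1.700 s per class.
    print(f"{ghz_days_per_day(21.87, 1.700):.2f} GHz-d/day")

    # GTX 980: 16 multiprocessors at 1215 MHz, 542.54 GHz-d/day.
    print(f"{per_sm_per_mhz(542.54, 16, 1215):.5f} GHz-d/day per SM per MHz")
```

With the logged inputs, the first two estimates land within rounding error of the ETA and GHz-d/day columns printed by mfaktc, which is the only evidence here for the 960-class assumption.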

Mark Rose 2014-09-19 20:00

What exactly determines/affects mfaktc performance on a given GPU?


All times are UTC.