Hi James,
the good news is that there is a high chance that you're using the correct CUDA runtimes. The bad news is that I've no clue what's wrong there. Oliver |
[QUOTE=TheJudger;372610]the good news is that there is a high chance that you're using the correct CUDA runtimes. The bad news is that I've no clue what's wrong there.[/QUOTE]They're the DLLs that I've been using for years, and are the ones you distributed with mfaktc. :smile:
I am also confused, which is why I posted here. :ermm: |
OK, thanks to a friend of mine (not a GIMPS participant) I could get my hands on a Maxwell GPU. He bought a GTX 750Ti for his PC and then (instead of driving home to put his new GPU in his own system) he visited me and gave me the GPU for a quick test. :smile: Obviously I didn't spend time on optimizing the code (I'm not even sure whether any Maxwell-specific optimizations are possible/feasible in mfaktc).
For those who don't know what a (CUDA) multiprocessor (in NVIDIA speak) is: stop reading here! When I say Maxwell I'm talking about CC 5.0 and observations made on a GTX 750Ti. Talking about mfaktc:[LIST][*]performance per multiprocessor per clock seems to be a little bit [B]below[/B] my Kepler (GTX 680 aka GK104)[*]while browsing the [URL="http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#maximize-instruction-throughput"]CUDA documentation[/URL] I became worried about the integer multiplication throughput on Maxwell: is there no [I]native[/I] integer multiplication on Maxwell?[*]comparing chips of the same die size I *guess* 28nm Maxwell wins by a small margin compared to 28nm Kepler[*]when talking about performance per watt I *guess* 28nm Maxwell chips are the first choice[/LIST] Oliver P.S. I can't wait for the [I]big[/I] Maxwells, no matter if 28nm or 20nm. |
[CODE]May 13 07:08 | 604 13.4% | 1.700 23m33s | 1157.78 82485 n.a.%
ERROR: cudaGetLastError() returned 73: an illegal instruction was encountered

Some background info:

mfaktc v0.20 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  WorkFile                  worktodo.txt
  Checkpoints               enabled
  CheckpointDelay           30s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  blip
  ComputerID                evclient004
  AllowSleep                no
  TimeStampInResults        no

CUDA version info
  binary compiled for CUDA  6.0
  CUDA runtime version      6.0
  CUDA driver version       6.0

CUDA device info
  name                      GeForce GTX 590
  compute capability        2.0
  maximum threads per block 1024
  number of multiprocessors 16 (512 shader cores)
  clock rate                1215MHz

Automatic parameters
  threads per grid          1048576

running a simple selftest...
Selftest statistics
  number of tests           92
  successfull tests         92

selftest PASSED!

got assignment: exp=87475907 bit_min=73 bit_max=74 (21.87 GHz-days)
Starting trial factoring M87475907 from 2^73 to 2^74 (21.87 GHz-days)
  k_min = 53984767285680
  k_max = 107969534579839
Using GPU kernel "barrett76_mul32_gs"

found a valid checkpoint file!
  last finished class was: 384
  found 0 factor(s) already[/CODE] |
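For readers wondering where the k_min and k_max values in the log come from, here is a minimal sketch in Python. It assumes the usual form of Mersenne factor candidates, q = 2·k·p + 1, and that k_min is rounded down to a multiple of the class count (taken here to be 4620 with MORE_CLASSES enabled). The helper name `k_range` is hypothetical and is not taken from the mfaktc source. With these assumptions the arithmetic matches both numbers printed in the log.

```python
# Sketch only: reproduces the k range printed in the log above.
# Assumption: factor candidates of M(p) = 2^p - 1 have the form q = 2*k*p + 1.
# Assumption: MORE_CLASSES means 4620 classes (4 * 3 * 5 * 7 * 11).
NUM_CLASSES = 4620


def k_range(exp, bit_min, bit_max, num_classes=NUM_CLASSES):
    """Return (k_min, k_max) covering 2^bit_min .. 2^bit_max (hypothetical helper)."""
    k_min = (1 << bit_min) // (2 * exp)
    k_min -= k_min % num_classes  # round down to a class boundary
    k_max = (1 << bit_max) // (2 * exp)
    return k_min, k_max


if __name__ == "__main__":
    # Assignment from the log: exp=87475907 bit_min=73 bit_max=74
    print(k_range(87475907, 73, 74))
```

Python integers are arbitrary precision, so the 2^74 shift needs no special handling here; the GPU code has to do the same arithmetic in multi-word integers.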
GPU hangs after restart at self-test.
After switching jobs with the second GPU, both continue. |
blib: might be defective hardware? It runs for a while and then throws a "cudaGetLastError() returned 73: an illegal instruction was encountered"... there is no JIT-compiling in mfaktc, after startup everything is static.
Oliver |
So, do you think it is a step up or down from Kepler?
|
or a sidestep?
|
[QUOTE=James Heinrich;372508]However, I broke mfaktc. While it has been running fine on 335.23 since that came out (early March), mfaktc won't start now[/QUOTE]The good news is that drivers v337.88 released today fixed the problem.
|
Seems we have a new high score for energy-efficient trial factoring:
Stock/reference GTX 980
[CODE]
./mfaktc.exe -tf 66362159 72 73
mfaktc v0.21-pre6 (64bit built)
[...]
CUDA device info
  name                      GeForce GTX 980
  compute capability        5.2
  maximum threads per block 1024
  number of mutliprocessors 16 (unknown number of shader cores)
  clock rate                1215MHz
[...]
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Sep 19 21:52 |  256   5.6% |  2.391  36m06s |    542.54    82485    n.a.%
Sep 19 21:52 |  261   5.7% |  2.373  35m48s |    546.66    82485    n.a.%
Sep 19 21:52 |  264   5.8% |  2.379  35m51s |    545.28    82485    n.a.%
Sep 19 21:52 |  265   5.9% |  2.380  35m49s |    545.05    82485    n.a.%
Sep 19 21:52 |  276   6.0% |  2.377  35m44s |    545.74    82485    n.a.%
[...]
[/CODE]
Oliver |
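To compare cards on the "performance per multiprocessor per clock" basis mentioned earlier in the thread, the GTX 980 figures can be normalized as in the rough sketch below. The rates, multiprocessor count and clock are copied from the log; the helper name `per_mp_per_ghz` is made up for illustration. It also assumes the GPU actually ran at the reported 1215MHz, which may not hold if the card boosts or throttles.

```python
# Sketch only: normalize mfaktc throughput by multiprocessor count and clock.
# All input values are copied from the GTX 980 log above.
rates = [542.54, 546.66, 545.28, 545.05, 545.74]  # GHz-d/day, five log lines
multiprocessors = 16
clock_mhz = 1215  # as reported by mfaktc; the real running clock may differ


def per_mp_per_ghz(rate, mps, clock_mhz):
    """GHz-d/day per multiprocessor per GHz of clock (hypothetical helper)."""
    return rate / (mps * clock_mhz / 1000.0)


mean_rate = sum(rates) / len(rates)
normalized = per_mp_per_ghz(mean_rate, multiprocessors, clock_mhz)

if __name__ == "__main__":
    print(f"{mean_rate:.2f} GHz-d/day -> {normalized:.2f} per MP per GHz")
```

The same function applied to another card's log gives a figure that can be compared directly, as long as both runs used the same exponent and bit range.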
What exactly determines/affects mfaktc performance on a given GPU?
|