mersenneforum.org  

2018-10-28, 22:00   #2949
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2×2,341 Posts

Quote:
Originally Posted by TheJudger View Post
Just uploaded a new set of binaries of mfaktc 0.21 for Windows using CUDA 10.0. If you're already running mfaktc 0.21 using an older CUDA version there is no need to upgrade; the mfaktc sources are unmodified.
These CUDA 10.0 binaries are compiled for compute_30 (Kepler), compute_35 (Kepler Update), compute_50 (Maxwell), compute_60 (Pascal), compute_70 (Volta) and compute_75 (Turing).
No support for compute_20 (Fermi) or even older cards, and only 64-bit binaries - the main purpose of these binaries is Volta and Turing GPUs. For the latter there are only 64-bit drivers available, so the decision was easy.

Happy factor hunting!
Oliver

P.S. I had no access to Volta and Turing GPUs running Windows - if someone has such a GPU running on Windows, please run the full selftest (e.g. mfaktc*.exe -st) for all 4 binaries and report the results. Thank you!

I have results for binary: mfaktc-win-64.exe

PASSED!
Attachment: Self_Test 2080Ti from petrw1.png (25.0 KB, ID 19186)

2018-10-28, 22:57   #2950
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2·2,341 Posts
HOWEVER....

When I ran an actual factor assignment …

First the good news (I think): I was getting about 3,800 GHz-days/day.

But after about a minute of running, my screen pixelated like this (mfaktc 1).

Then the PC crashed with this STOPCODE (mfaktc 2).
Attachments: mfaktc 1.png (782.5 KB, ID 19188); mfaktc 2.png (360.4 KB, ID 19189)

Last fiddled with by petrw1 on 2018-10-28 at 22:58

2018-10-29, 19:40   #2951
TheJudger
"Oliver"
Mar 2005
Germany
11·101 Posts

Hello,

is this behaviour repeatable? At first glance the pixelated screen looks like too much OC or a HW failure to me, but those are not the only options. Did you run any other workloads?

Keep in mind that the selftest checks the software itself; it is not a stress test for the hardware and doesn't put much load on the GPU.

Oliver

2018-10-29, 20:40   #2952
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
4682₁₀ Posts

Quote:
Originally Posted by TheJudger View Post
Hello,

is this behaviour repeatable? At first glance the pixelated screen looks like too much OC or a HW failure to me, but those are not the only options. Did you run any other workloads?

Keep in mind that the selftest checks the software itself; it is not a stress test for the hardware and doesn't put much load on the GPU.

Oliver

Repeated 3 times the same evening.
After the second crash I tried to install the driver that came with the new monitor (grasping at straws).

I haven't tried any OC of the GPU...just stock settings.
I did notice that the GhzDays/Day varied quite a bit from about 2,500 to 3,800 over that minute before the crash.

Should I try other exponent ranges or bit levels?
Are there any config parameters that might be useful to try?
Could it be a driver issue???

I just didn't know where to start....you (mfaktc issue) or NVIDIA (GPU issue) or my tech support (CPU/MB/RAM issue).

In case it is relevant (though unlikely): it was built with 4x8GB RAM but Windows only sees 24GB.

Thx

2018-10-29, 21:31   #2953
TheJudger
"Oliver"
Mar 2005
Germany
11×101 Posts

Hi,

I recommend testing other software in this case. I know that mfaktc in non-selftest mode easily hits the power target on an RTX 2080 Ti, so it is likely similar on other Turing cards. Maybe try something like FurMark to stress your GPU really hard.
24 out of 32 GiB looks like one memory module isn't detected; tools like CPU-Z may give a hint which one.
I would go step by step:
1. fix memory detection
2. run memtest and/or Prime95 torture test
3. put some load on the GPU
Just the usual "how to test my system".

Oliver
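
The memory observation above is plain arithmetic and can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical and the layout of four 8 GiB modules is an assumption (the post only states 24 detected out of 32 GiB). It does not query any real hardware.

```python
from itertools import combinations


def undetected_slot_candidates(module_sizes_gib, detected_gib):
    """Return every set of slot indices whose combined size equals the shortfall."""
    shortfall = sum(module_sizes_gib) - detected_gib
    if shortfall <= 0:
        return []  # nothing is missing
    slots = range(len(module_sizes_gib))
    return [
        combo
        for count in range(1, len(module_sizes_gib) + 1)
        for combo in combinations(slots, count)
        if sum(module_sizes_gib[i] for i in combo) == shortfall
    ]


# Assumed layout: four 8 GiB modules, of which only 24 GiB is detected.
print(undetected_slot_candidates([8, 8, 8, 8], 24))
```

With equal-sized modules every single slot is an equally plausible candidate, so the totals alone cannot say which module is missing. That is why a per-slot view such as the one CPU-Z provides is the next step.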

2018-10-29, 23:31   #2954
ATH
Einyen
Dec 2003
Denmark
7×11×41 Posts

Launch GPU-Z or MSI Afterburner before you start mfaktc and watch the temperature and fan speed; it could be overheating. In Afterburner you can set a manual fan curve based on temperature; make sure the fan speed is at 100% at around 80 °C or lower.
https://www.techpowerup.com/gpuz/
https://www.msi.com/page/afterburner
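
For a scriptable way to watch the same values, here is a minimal Python sketch that polls `nvidia-smi`. It assumes an NVIDIA driver install with `nvidia-smi` on the PATH. The helper names and the 2-second interval are made up for illustration, and the 80 °C limit is simply the figure from the advice above, not a vendor specification.

```python
import subprocess

# Field names as listed by `nvidia-smi --help-query-gpu`.
FIELDS = ["timestamp", "temperature.gpu", "fan.speed", "clocks.sm", "power.draw"]


def build_command(interval_s=2):
    """Build an nvidia-smi command that prints one CSV sample every interval_s seconds."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
        "-l", str(interval_s),
    ]


def parse_sample(line):
    """Turn one CSV output line into a dict keyed by field name."""
    values = [value.strip() for value in line.split(",")]
    return dict(zip(FIELDS, values))


def too_hot(sample, limit_c=80.0):
    """True if the reported GPU temperature is at or above limit_c."""
    return float(sample["temperature.gpu"]) >= limit_c


def watch(limit_c=80.0, interval_s=2):
    """Print samples until interrupted, flagging those at or above the limit."""
    with subprocess.Popen(build_command(interval_s),
                          stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            sample = parse_sample(line)
            flag = "  <-- at or above limit" if too_hot(sample, limit_c) else ""
            print(sample, flag)
```

Calling `watch()` in one terminal while mfaktc runs in another should show whether the temperature climbs toward the limit in the minute before a crash. Stop it with Ctrl+C.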

2018-10-30, 00:56   #2955
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3·7·257 Posts

Quote:
Originally Posted by ATH View Post
Launch GPU-Z or MSI Afterburner before you start mfaktc and watch the temperature and fan speed; it could be overheating. In Afterburner you can set a manual fan curve based on temperature; make sure the fan speed is at 100% at around 80 °C or lower.
https://www.techpowerup.com/gpuz/
https://www.msi.com/page/afterburner

Amen to that, plus GPU-Z can log to a file. It looks something like this:
Code:
        Date        , GPU Core Clock [MHz] , GPU Memory Clock [MHz] , GPU Load [%] , Memory Usage (Dedicated) [MB] , CPU Temperature [°C] , System Memory Used [MB] ,
2018-10-29 19:35:36 ,              448.9   ,                478.8   ,          0   ,                         141   ,               72.0   ,                  3329   ,
2018-10-29 19:35:38 ,              499.3   ,                532.6   ,          0   ,                         139   ,               70.0   ,                  3330   ,
2018-10-29 19:35:41 ,              499.3   ,                532.6   ,          5   ,                         161   ,               64.0   ,                  3352   ,
2018-10-29 19:35:43 ,              499.3   ,                532.6   ,          2   ,                         163   ,               71.0   ,                  3355   ,
2018-10-29 19:35:46 ,              499.3   ,                532.6   ,          3   ,                         161   ,               72.0   ,                  3357   ,
2018-10-29 19:35:48 ,              499.3   ,                532.6   ,          1   ,                         157   ,               72.0   ,                  3350   ,
2018-10-29 19:35:51 ,              510.0   ,                544.0   ,          3   ,                         159   ,               68.0   ,                  3341   ,
2018-10-29 19:35:53 ,              510.0   ,                544.0   ,          5   ,                         157   ,               70.0   ,                  3336   ,
2018-10-29 19:35:56 ,              510.0   ,                544.0   ,          3   ,                         156   ,               74.0   ,                  3337   ,
2018-10-29 19:35:58 ,              510.0   ,                544.0   ,          1   ,                         157   ,               74.0   ,                  3338   ,
(example above is from a nearly idle Intel Arrandale IGP)
The significant variation in GHz-days/day could indicate high temperature causing the clock to throttle back; on older models throttling causes 50% reductions.
Another good app is HWMonitor, from CPUID, which will indicate and log various parameters. And there's nvidia-smi, which also has logging capability.
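
A log in this layout can be scanned for throttling with a short script. The Python sketch below assumes the format shown in the sample above (comma-separated, space-padded columns, a trailing comma on every line, a `Date` column). Other GPU-Z versions may log differently. The function names and the 20% drop threshold are arbitrary choices for illustration.

```python
def parse_sensor_log(text):
    """Parse comma-separated, space-padded sensor log text into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if fields and fields[-1] == "":
            fields.pop()          # every line ends with a trailing comma
        if not fields:
            continue              # skip blank lines
        if header is None:
            header = fields       # the first non-blank line names the columns
        elif len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


def clock_drops(rows, column="GPU Core Clock [MHz]", drop_fraction=0.2):
    """Return timestamps of samples whose clock is well below the peak clock."""
    clocks = [float(row[column]) for row in rows]
    if not clocks:
        return []
    threshold = max(clocks) * (1.0 - drop_fraction)
    return [row["Date"] for row, clock in zip(rows, clocks) if clock < threshold]
```

To use it, read the log file into a string, pass it to `parse_sensor_log`, and hand the rows to `clock_drops`. Timestamps that come back are candidates for thermal or power throttling and can be compared against the temperature column for the same rows.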

2018-10-30, 03:21   #2956
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2·2,341 Posts
My tech support was kind enough to test GPU

Even though I didn't buy it from them, he ran FurMark on their own test machine and got the same "artifacting" I was getting running mfaktc.

He said the card was faulty and I convinced NVIDIA support of the same, so they are going to replace it (oh well….tick tick)

Also they got the RAM fixed...it now recognizes all 32GB at 3600.

So I ran a benchmark but was surprised that the timings it sent to Prime95 are about the same as my son's i7-6700 with stock RAM.
Attachment: Benchmarks.jpg (79.9 KB, ID 19191)

2018-10-30, 15:05   #2957
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013
5562₈ Posts

https://www.theinquirer.net/inquirer...n-high-numbers

2018-10-30, 18:34   #2958
TheJudger
"Oliver"
Mar 2005
Germany
2127₈ Posts

Quote:
Originally Posted by petrw1 View Post
Even though I didn't buy it from them, he ran FurMark on their own test machine and got the same "artifacting" I was getting running mfaktc.

He said the card was faulty and I convinced NVIDIA support of the same, so they are going to replace it (oh well….tick tick)

Thank you for your follow-up report! I have the feeling that some people feel ashamed (for no reason) when their hardware is faulty and don't report back.

Oliver

2018-10-30, 19:36   #2959
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5397₁₀ Posts

Quote:
Originally Posted by TheJudger View Post
Thank you for your follow-up report! I have the feeling that some people feel ashamed (for no reason) when their hardware is faulty and don't report back.

Oliver

And thanks much for the report and the various follow-ups. I had been contemplating an RTX 2080, which I see in Mark's posted link is also affected. Hopefully NVIDIA gets things straightened out soon.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.