mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

moebius 2018-01-13 00:15

[QUOTE=TheJudger;475599]Does the card really overheat or is it just bad (broken) hardware?
Oliver[/QUOTE]

I solved the problem as follows: the core clock and memory clock are now downclocked by 100 MHz, to the values of a non-overclocked GTX 560 Ti.

There have been no more error messages since then.

Thank you for the support.

kladner 2018-01-13 16:20

[QUOTE=moebius;477410]I solved the problem as follows: the core clock and memory clock are now downclocked by 100 MHz, to the values of a non-overclocked GTX 560 Ti.

There have been no more error messages since then.

Thank you for the support.[/QUOTE]
You might be able to reduce temps a bit more by setting the memory clock much lower. This will not impact mfaktc performance. I run the memory on both my cards 500-700 MHz below normal.

storm5510 2018-01-13 17:44

[QUOTE=moebius;475609]...The temperatures sometimes rise over 100°C, and no, I don't think the card is defective; I run CUDALucas on it as well... also for LL double checks...[/QUOTE]

100°C is pushing the envelope pretty hard. My old GTX 480 runs around 91°C under a heavy load such as [I]mfaktc[/I]; with [I]CUDALucas[/I] and [I]CUDAPm1[/I] it stays in the upper 80s. I have run it with "SieveOnGPU" disabled, which cuts the heat and power consumption. Of course, doing this reduces the GHz-d/day by nearly half.
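For anyone who wants to flip that setting without editing the file by hand, here is a minimal sketch. It assumes mfaktc.ini uses plain `Key=Value` lines with `#` for comments; the helper name `set_ini_option` and the file handling are illustrative, not part of mfaktc.

```python
import re


def set_ini_option(ini_text: str, key: str, value: str) -> str:
    """Return ini_text with `key=value` set.

    Replaces the first uncommented `key=...` line if one exists,
    otherwise appends a new line at the end.
    """
    # Match only lines that start with the key (optionally indented),
    # so commented-out lines such as "# SieveOnGPU=1" are left alone.
    pattern = re.compile(rf"^([ \t]*){re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key}={value}"
    if pattern.search(ini_text):
        return pattern.sub(lambda m: m.group(1) + new_line, ini_text, count=1)
    # Key not present: append it, adding a newline first if needed.
    sep = "" if ini_text.endswith("\n") or not ini_text else "\n"
    return f"{ini_text}{sep}{new_line}\n"


if __name__ == "__main__":
    # Assumes mfaktc.ini sits in the current directory.
    with open("mfaktc.ini", encoding="utf-8") as f:
        text = f.read()
    with open("mfaktc.ini", "w", encoding="utf-8") as f:
        f.write(set_ini_option(text, "SieveOnGPU", "0"))
```

Setting the value back to 1 re-enables GPU sieving. Keep a backup of the original file before running anything like this.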

kriesel 2018-01-13 18:46

[QUOTE=moebius;475609]Yes, it's because of overheating. The temperatures sometimes rise over 100°C, and no, I don't think the card is defective; I run CUDALucas on it as well... also for LL double checks.
At a certain temperature, the graphics card (CUDA) driver simply crashes. That's all, not so dramatic...

[URL="https://www.mersenne.org/report_exponent/?exp_lo=44714303&full=1"]44714303[/URL][/QUOTE]

According to the geforce.com specification sheets, the maximum GPU temperature is 105C for the GTX 480. The Quadro 4000 is also 105C and the Quadro 2000 is 102C; the GTX 1070 and GTX 1060 are 94C and the GTX 1050 Ti is 97C. All my GPUs run with at least 9C of temperature margin, including GTX 480s in adjacent slots. Some have 30C or more of margin. Cooler electronics tend to live longer.

The maximum for the 560 Ti is 99C, or 97C for the limited edition. [URL]https://www.geforce.com/hardware/desktop-gpus/geforce-gtx-560ti/specifications[/URL].
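As a rough illustration of the margin idea, here is a minimal sketch that uses only the limits quoted in this post. The table and function names are made up for the example, and the observed temperatures would come from whatever monitoring tool you use.

```python
# Maximum GPU temperatures in degrees C, as quoted in this post.
MAX_TEMP_C = {
    "GTX 480": 105,
    "Quadro 4000": 105,
    "Quadro 2000": 102,
    "GTX 1070": 94,
    "GTX 1060": 94,
    "GTX 1050 Ti": 97,
    "GTX 560 Ti": 99,
    "GTX 560 Ti (limited edition)": 97,
}


def thermal_margin(model: str, observed_c: float) -> float:
    """Degrees C between the observed temperature and the quoted maximum.

    A negative result means the card is running above its rated limit.
    """
    return MAX_TEMP_C[model] - observed_c


if __name__ == "__main__":
    # Temperatures mentioned earlier in the thread.
    print(thermal_margin("GTX 480", 91))      # 14 C of margin
    print(thermal_margin("GTX 560 Ti", 100))  # -1, i.e. over the limit
```

By this measure a 560 Ti touching 100C has no margin left at all.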

Memory controller loads tend to be around 60% for LL or P-1, and only around 1% for TF, so throttling memory back considerably for TF should have little impact on throughput.

(All operating values on my hardware, obtained from GPU-Z)
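If you would rather log these values from a script than read them off GPU-Z, a sketch along these lines may work. It assumes `nvidia-smi` is on the PATH and that the driver reports these query fields for your card; older GeForce cards often return "[Not Supported]" for some of them, which the parser turns into `None`. Note that `utilization.memory` only roughly corresponds to the memory controller load shown by GPU-Z. The function names are illustrative.

```python
import subprocess

# Fields requested from nvidia-smi, in output column order.
QUERY_FIELDS = [
    "temperature.gpu",
    "utilization.gpu",
    "utilization.memory",
    "clocks.sm",
    "clocks.mem",
]


def parse_smi_line(line: str) -> dict:
    """Parse one CSV line of nvidia-smi output into a field -> value dict.

    Values that are not numeric (e.g. "[Not Supported]") become None.
    """
    result = {}
    for field, raw in zip(QUERY_FIELDS, line.split(",")):
        try:
            result[field] = float(raw.strip())
        except ValueError:
            result[field] = None
    return result


def read_gpus() -> list:
    """Query every GPU once and return one dict per card."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [parse_smi_line(line) for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    for index, gpu in enumerate(read_gpus()):
        print(index, gpu)
```

Running it once under TF and once under LL or P-1 should show the difference in memory utilization, if the card reports that field.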

ATH 2018-01-13 21:42

You should run MSI Afterburner and make sure the GPU fan is running 100% to keep the temperature down as much as possible.

kriesel 2018-01-14 00:11

[QUOTE=ATH;477472]You should run MSI Afterburner and make sure the GPU fan is running 100% to keep the temperature down as much as possible.[/QUOTE]

Case ventilation should also be checked. A well-ventilated case will handle multiple GPUs and 500W of GPU power without them reaching 100C. High PCB temperatures might indicate poor case ventilation. The fans could be fine, yet insufficient clearance, pet hair, or something else could still be cutting air flow.

I found an older system running, though not well, with only one of its 3 fans operating. (One looked like it had caught fire!)

storm5510 2018-01-15 19:45

[QUOTE=ATH;477472]You should run MSI Afterburner and make sure the GPU fan is running 100% to keep the temperature down as much as possible.[/QUOTE]

I used this on my GTX 480 a few times. The fan on it, at 100%, sounded like a siren. 82% to 85% worked for me. My case has a lot of ventilation. It makes a difference.

Rodrigo 2018-01-30 05:06

How to adjust GPU memory clock in MSI Afterburner?
 
A few days ago one of my GPUs, a GeForce GT 630, completed a TF assignment overnight, no problems. In the morning I fed it a new set of exponents and went out of the office.

As this is a secondary system, I pay little attention to it except around the time when I anticipate it'll be finishing up a TF batch. So several days later I wiggled the mouse to wake up the display -- and nothing happened. The screen didn't come back after hitting any keys on the keyboard, either.

Eventually I realized that the PC was awake but not sending anything to the monitor. After a reboot and some tests, I discovered that the 630, which had been working just fine until the end of the last TF run, now could no longer run MFAKTC for more than a couple of minutes before it reached 100C and cr*pped out, requiring a reboot. Opening the PC case (for more airflow) didn't help.

The fan does spin but its speed tops out at 90%.

Now I'm trying to fiddle with the MFAKTC settings and the GPU clocks in Afterburner (version 4.4.2). Disabling SieveOnGPU allowed the card to run a little longer before going blink.

Regarding Afterburner, I could use it to dial down the [B]core[/B] clock from the default 810 MHz to 710 MHz, and that helped to slow down the process a little more, but ultimately the card is still tickling 100C, at which point only a reboot would bring back the display.

And so here's the issue. I can lower the [B]memory[/B] clock from the default 533 MHz, but -- unlike the core clock -- as soon as I start MFAKTC it jumps right back up to 533. I can't seem to find a way to make any other (lower) setting stick. Yes, I do click on "Apply" after trying to change the clock.

Why does this work with the core clock, but not the memory clock? How do I change the memory clock setting in MSI Afterburner?

Rodrigo 2018-01-30 06:27

Addendum to the above post:

I also tried dusting the inside of the PC case. Then I removed the GPU and gave it a good dose of compressed air. These steps didn't help the graphics card's situation.

Maybe it's simply time to replace that card?

Mark Rose 2018-01-30 11:55

Given how little TF your card will produce, my suggestion would be not to use it.

If you do replace it, I'd look for a GTX 1050. It should be supported by your system. Also, the more expensive cards are ridiculously priced right now.

James Heinrich 2018-01-30 13:27

If it's getting to 100C that quickly, quite probably the GPU fan isn't spinning (either at all, or at the appropriate speed). Less likely are things like the heatsink becoming detached from the GPU and other mechanical failures. In any case, replacing the GPU wouldn't be a bad idea.
The GTX 1050 will give you 250% relative performance for 115% power usage.
[url]http://www.mersenne.ca/mfaktc.php?filter=gt+630|gtx+1050[/url]
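Put another way, those two percentages work out to roughly 2.2 times the throughput per watt. A small sketch of the arithmetic, using only the figures above (the function name is illustrative):

```python
def relative_efficiency(rel_perf_pct: float, rel_power_pct: float) -> float:
    """Throughput per watt of the new card relative to the old one.

    Both arguments are percentages of the old card's values.
    """
    return rel_perf_pct / rel_power_pct


if __name__ == "__main__":
    # GTX 1050 vs GT 630: 250% performance at 115% power.
    print(round(relative_efficiency(250, 115), 2))  # 2.17
```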


All times are UTC.