mersenneforum.org  

2014-02-20, 16:44   #23
chris2be8
Sep 2009
2078₁₀ Posts

Quote:
Originally Posted by blip
Code:
|   0  GeForce GTX 590     Off  | 0000:03:00.0     N/A |                  N/A |
|  0%   91C  N/A     N/A /  N/A |      153MB /  1535MB |     N/A      Default |
(quite hot, I know...)

Why is the fan for GPU 0 stopped even though it's at 91C? Have a look to see whether it's turning. If you are lucky, something is blocking it and it will start turning once you free it.

Chris
2014-02-20, 18:12   #24
Xyzzy
"Mike"
Aug 2002
10000000100000₂ Posts

Quote:
Why is the fan for GPU 0 stopped even though it's at 91C?

It might be that the 590 is two GPUs on one card, with one fan?
2014-02-20, 21:52   #25
blip
Jan 2014
2×73 Posts

That is correct: It has only one fan.
2014-02-21, 17:49   #26
chris2be8
Sep 2009
2·1,039 Posts

OK, that explains it.

My GTX 560 Ti has two fans in the body, but nvidia-smi only reports one speed. I assume they both turn at the same speed.

Chris
2014-02-22, 03:21   #27
TheMawn
May 2013
East. Always East.
1727₁₀ Posts

Quote:
Originally Posted by chris2be8
OK, that explains it.

My GTX 560 Ti has two fans in the body, but nvidia-smi only reports one speed. I assume they both turn at the same speed.

Chris

Likely the voltage / PWM is fed to both, but RPM is only taken from one fan.

A fairly common doodad for builders is a case-fan Y-splitter cable (especially with CPU heatsinks designed for two fans), and some bad manufacturers didn't think to cut the rotation-speed wire on one of the ends. Unless the rotations are in perfect synchronization (more or less impossible), the RPMs add up and cause some serious confusion for the motherboard and the user, who see a fan spinning at 4000 RPM.

And then the good manufacturers who remove the third pin get flak from the self-proclaimed gurus about having missing pins.

(Lots of dumb mistakes like that on Newegg from "5/5 tech knowledge" users... my favourite is a guy who "tested hundreds of fans in his life and has a masters in fluid dynamics" complaining that there's no arrow showing you the direction of flow.)
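
To put numbers on the doubled reading: PC fans conventionally emit two tachometer pulses per revolution, and the motherboard turns the pulse rate into an RPM figure. A minimal sketch, assuming two fans that really spin at 2000 RPM, share one tach wire, and never overlap their pulses (the function name and the numbers are made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: convert a tachometer pulse rate (pulses per minute)
# to RPM, assuming the usual 2 pulses per revolution.
pulses_to_rpm() {
  echo $(( $1 / 2 ))
}

fan_rpm=2000                        # assumed true speed of each fan
pulses_one=$(( fan_rpm * 2 ))       # pulses per minute from one fan
pulses_both=$(( pulses_one * 2 ))   # both fans pulsing on the same wire

echo "one fan reads:  $(pulses_to_rpm "$pulses_one") RPM"
echo "two fans read:  $(pulses_to_rpm "$pulses_both") RPM"
```

With these assumed numbers the second line reports 4000 RPM, even though neither fan turns faster than 2000 RPM.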
2014-03-13, 06:45   #28
blip
Jan 2014
2·73 Posts

I am still having the issue with CUDALucas stalling on this machine after a while (several hours).

mfaktc is running fine on the other chip of the same card (the GTX 590 has two chips).

Swapping processes between chips does not help.

Current setup:

Linux 3.11.0-18-generic (Ubuntu)

| NVIDIA-SMI 334.21 Driver Version: 334.21 |

Both mfaktc and CUDALucas were compiled with:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2013 NVIDIA Corporation
Built on Sat_Jan_25_17:33:19_PST_2014
Cuda compilation tools, release 6.0, V6.0.1
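
For anyone comparing setups, the version lines above can be collected in one go. A small sketch using only `uname`, `nvidia-smi`, and `nvcc --version`; the function name `report_setup` is made up here, and tools that are not installed are simply skipped:

```shell
#!/bin/sh
# Print kernel, driver, and CUDA compiler versions.
# Tools that are not on the PATH are skipped silently.
report_setup() {
  uname -sr
  if command -v nvidia-smi >/dev/null 2>&1; then
    # The banner line of nvidia-smi carries the driver version.
    nvidia-smi | grep "Driver Version" || true
  fi
  if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
  fi
}

report_setup
```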
2014-04-16, 05:33   #29
preben s
"Preben Soeberg"
Nov 2013
Thailand
2³ Posts

I have had this problem for a long time, and I have a good hint about what triggers it.

I have a very bad connection to the Internet, and the problem occurred "only" when Chromium or Firefox (and to some extent my whole computer; it helped to throw out the crappy kmail) became unresponsive because of the failing connection.

I got new phone lines installed and have had no problems since then.

I made a script to run mfaktc under:


Code:
#!/usr/bin/sh

./mfaktc&
B=`nvidia-smi -q -d TEMPERATURE|grep Gpu|cut -c39,40`
while true; do
  A=$B; sleep 30
  B=`nvidia-smi -q -d TEMPERATURE|grep Gpu|cut -c39,40`
  C=`expr \( $A - $B \)`
  if  [ $C -ge 3 -o $B -lt 50 ]; then 
    kill `ps -e |grep mfaktc|grep -v sh|cut -d\  -f1`
    wait
    ./mfaktc&
  fi
done

It catches the stalled state and restarts mfaktc quickly.
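
The restart condition in the script above can be pulled out into a small predicate, which makes the two thresholds easier to try with made-up readings. This is only a sketch: `looks_stalled` is a hypothetical name, and the limits (a drop of 3 C or more between samples, or a reading below 50 C) are copied from the script.

```shell
#!/bin/sh
# Returns 0 (true) when two consecutive readings suggest a stall:
# the temperature fell by 3 C or more, or it is below 50 C.
# Arguments: previous reading, current reading (whole degrees C).
looks_stalled() {
  prev=$1
  cur=$2
  [ $(( prev - cur )) -ge 3 ] || [ "$cur" -lt 50 ]
}

# Made-up sample readings, not measurements.
looks_stalled 75 74 && echo "75 -> 74: stalled" || echo "75 -> 74: ok"
looks_stalled 75 71 && echo "75 -> 71: stalled" || echo "75 -> 71: ok"
looks_stalled 49 49 && echo "49 -> 49: stalled" || echo "49 -> 49: ok"
```

The first pair counts as fine; the second trips the "dropped by 3 C" rule and the third trips the "below 50 C" rule.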
2014-04-19, 14:33   #30
preben s
"Preben Soeberg"
Nov 2013
Thailand
8₁₆ Posts

Sorry, I posted the code above without testing enough.
This time it is better tested.

Code:
#!/usr/bin/sh

# If this script is named other than mfaktc.sh,
# the script below should be changed accordingly.

# It is assumed that the only process named mfaktc is
# the one started by this script.

# The temperature limits used here seem to work fine
# down to 30 seconds sleep time at 35 C ambient
# temperature. Tested on a GTX 780 @ 1006 MHz.

# cut and tr were chosen for leanness.
# awk (gawk) could be used instead of cut/tr.
# sed could also be used instead of tr.

# "man kill" states that kill can be used to find the
# pid of a named process. If somebody can make that work,
# the script could be simplified a little.


echo `date` > start.log
./mfaktc&
B=`nvidia-smi -q -d TEMPERATURE|grep Gpu|cut -c39,40`
while true; do
  A=$B; sleep 120
  B=`nvidia-smi -q -d TEMPERATURE|grep Gpu|cut -c39,40`
  C=`expr \( $A - $B \)`
  if  [ $C -ge 3 -o $B -lt 50 ]; then
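    # Look up the pid of mfaktc and kill it if it is still there.
    PID=`ps -e |grep mfaktc|grep -v sh|cut -c1-5|tr ' ' '0'`
    if [ -n "$PID" ]; then kill $PID; sleep 2; fi
    # Look again in case the first kill did not take effect.
    PID=`ps -e |grep mfaktc|grep -v sh|cut -c1-5|tr ' ' '0'`
    if [ -n "$PID" ]; then kill $PID; fi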
    wait
    echo `date` > start.log
    ./mfaktc&
  fi
done
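
On the pid question raised in the comments: the shell already records the pid of the last background job in `$!`, so the `ps`/`grep`/`cut` lookup can be avoided. Below is a sketch of that variant, not tested against mfaktc itself. The names `read_temp`, `start_worker`, `restart_worker`, and `supervise` are made up, and it assumes an nvidia-smi that supports `--query-gpu=temperature.gpu` and that GPU 0 is the one to watch; if the installed driver lacks that option, the `grep Gpu|cut` pipeline from the script above would have to stay.

```shell
#!/bin/sh
# Sketch: track the worker by the pid the shell reports in $!
# instead of searching the output of ps.

WORKER=${WORKER:-./mfaktc}   # program to supervise (assumed path)

# Print the temperature of GPU 0 as a bare number.
# Assumes nvidia-smi supports the --query-gpu interface.
read_temp() {
  nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits -i 0
}

# Start the worker in the background and remember its pid.
start_worker() {
  date > start.log
  $WORKER &
  WORKER_PID=$!
}

# Stop the remembered worker, reap it, and start a fresh one.
restart_worker() {
  kill "$WORKER_PID" 2>/dev/null || true
  wait "$WORKER_PID" 2>/dev/null || true
  start_worker
}

# Main loop with the same thresholds as the script above.
supervise() {
  start_worker
  B=$(read_temp)
  while true; do
    A=$B; sleep 120
    B=$(read_temp)
    if [ $(( A - B )) -ge 3 ] || [ "$B" -lt 50 ]; then
      restart_worker
    fi
  done
}

# Only loop when called with "run", so the functions can be tried on their own.
if [ "${1:-}" = "run" ]; then supervise; fi
```

Saved under a name of your choice (say `supervise.sh`, purely as an example), it would be started with `sh supervise.sh run`. Because only the remembered pid is killed, the assumption that no other process is named mfaktc is no longer needed.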

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.