#1 |
(loop (#_fork))
Feb 2006
Cambridge, England
13·491 Posts |
After a few automatic upgrades and a reboot, nvidia-smi is telling me
Code:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Presumably I need to get the driver rebuilt against the new kernel version, but I can't see how to do that.
#2 |
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
3²×17×61 Posts |
I've had the same problem for years while I was using my home workstation for both work and CUDA. (I don't anymore; it's easier to run CUDA computations in the cloud and keep the home computer lightly loaded... and in Windows so the kids can use it.)
What I've gathered is that NVIDIA makes this non-automatable deliberately. I still have to type 'accept' every time I install:
Code:
# ssh into a new EC2 node
sudo yum -y update
sudo yum -y install tcsh wget bc perl unzip gcc gcc-c++ openssh-clients diffutils gmp-devel kernel-devel-`uname -r`
wget http://us.download.nvidia.com/XFree86/Linux-x86_64/358.16/NVIDIA-Linux-x86_64-358.16.run
sudo sh ./NVIDIA-Linux-x86_64-358.16.run
#3 |
Jul 2003
So Cal
81516 Posts |
#4 |
"/X\(β-β)/X\"
Jan 2013
29·101 Posts |
Are you using dkms? It's used to recompile modules on kernel upgrades.
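If the driver was registered with DKMS, the rebuild can also be triggered by hand. The sketch below is a dry run by default and only prints what it would do. The `DRY_RUN` switch and the `run` and `rebuild_for_kernel` helpers are illustrative names, and the module name `nvidia` is an assumption (some distribution packages use a versioned name). It also assumes the headers for the target kernel are already installed.

```shell
#!/bin/sh
# Sketch: rebuild DKMS-managed modules (e.g. the NVIDIA driver) for one kernel.
# DRY_RUN, run and rebuild_for_kernel are illustrative names, not part of dkms.

DRY_RUN="${DRY_RUN:-1}"   # default: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rebuild_for_kernel() {
    kernel="$1"
    # Show which modules DKMS knows about and which kernels they are built for.
    run dkms status
    # Build and install every registered module that is missing for this kernel.
    run sudo dkms autoinstall -k "$kernel"
    # Try to load the module; the name "nvidia" is an assumption.
    run sudo modprobe nvidia
}

# Use the kernel given as the first argument, else the running kernel.
rebuild_for_kernel "${1:-$(uname -r)}"
```

Running it with `DRY_RUN=0` executes the commands for real. If `dkms status` does not list the NVIDIA module at all, the driver was probably installed outside DKMS and this will not help.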
#5 |
Bamboozled!
"πΊππ·π·π"
May 2003
Down not across
2²·3·883 Posts |
#6 |
(loop (#_fork))
Feb 2006
Cambridge, England
13·491 Posts |
I believe I'm using dkms, but all I see it doing is deleting old versions of the module when I do apt-get autoremove to clean up the huge pile of old kernels filling my unreasonably small /boot partition.
#7 |
(loop (#_fork))
Feb 2006
Cambridge, England
13·491 Posts |
Don't you find running CUDA computations on the cloud expensive? I'm paying probably £200/year for electricity for the GTX580, though I suppose a g2.2xlarge at spot price is 5p/hour so that's only a factor of two.
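The "factor of two" can be checked with a few lines of shell arithmetic. This assumes the instance runs around the clock all year and uses only the two prices quoted in the post; the variable names are illustrative.

```shell
#!/bin/sh
# Yearly cost of an always-on spot instance versus home electricity.
# Prices are the figures quoted in the post, not current ones.

spot_pence_per_hour=5
home_pounds_per_year=200
hours_per_year=$((24 * 365))

cloud_pence_per_year=$((spot_pence_per_hour * hours_per_year))
cloud_pounds_per_year=$((cloud_pence_per_year / 100))

echo "cloud: GBP ${cloud_pounds_per_year}/year, home: GBP ${home_pounds_per_year}/year"
# Ratio to one decimal place; awk is used because sh has no floating point.
awk -v c="$cloud_pounds_per_year" -v h="$home_pounds_per_year" \
    'BEGIN { printf "ratio: %.1f\n", c / h }'
```

That gives 8760 hours, so about £438/year for the instance against £200/year for electricity, a ratio of roughly 2.2.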
#8 |
(loop (#_fork))
Feb 2006
Cambridge, England
13×491 Posts |
Installing on a fresh machine is basically fine.
But I'm now in a situation where nvidia-smi can't find the device, and downloading the .deb and doing 'sudo apt-get install cuda' just tells me 'cuda is already the newest version'. sudo apt-get purge cuda; sudo apt-get install cuda also does very little.
Code:
sudo apt-get remove cuda
sudo apt-get install cuda
I'll try again using the run-file that NVIDIA ship; after a reboot (I really would prefer a solution with no reboots - this is a compute node, I aim to have twelve gnfs-lasieve jobs running 24/365) I get a new, exciting, unhelpful message:
Code:
pumpkin@pumpkin:~$ nvidia-smi
Failed to initialize NVML: GPU access blocked by the operating system
#9 |
(loop (#_fork))
Feb 2006
Cambridge, England
13·491 Posts |
On further examination, the card has fallen off the PCIe bus entirely: lspci | grep -i nv returns nothing. Maybe Ubuntu is not to blame.
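The same check can be made a little more informative by also looking at which kernel driver, if any, is bound to the card. The sketch below parses `lspci -nnk` style output from stdin. `classify_gpu` and its three result strings are illustrative names, and matching on "VGA compatible controller" or "3D controller" is an assumption about how the card is listed.

```shell
#!/bin/sh
# Sketch: is the fault below the driver (card missing from the PCI bus)
# or above it (card visible, but no kernel driver bound)?
# classify_gpu is an illustrative helper name.

classify_gpu() {
    # Reads `lspci -nnk` style output on stdin.
    awk '
        # A device entry starts at column 0; detail lines are indented.
        /^[0-9a-f]/ { in_gpu = 0 }
        /^[0-9a-f]/ && tolower($0) ~ /nvidia/ &&
            /VGA compatible controller|3D controller/ { found = 1; in_gpu = 1 }
        in_gpu && /Kernel driver in use:/ { driver = $NF }
        END {
            if (!found)            print "not-on-bus"
            else if (driver == "") print "no-driver-bound"
            else                   print "driver-bound: " driver
        }'
}
```

Typical use is `lspci -nnk | classify_gpu`. A result of `not-on-bus` matches the situation described above and points at hardware, slot or power rather than at the driver installation.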
#10 | |
Sep 2009
2005₁₀ Posts |
You only need to install expect on the system you are connecting to the new node from; it can automate responses in an SSH session.
Chris