mersenneforum.org CUDA driver disappeared after patch installation + kernel update

2012-12-28, 04:06   #12
Dubslow

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
 Originally Posted by chalsall Actually, that's not really bizarre. The nVidia drivers are proprietary code. So you are supposed to download them yourself each and every time. Then run the script to recompile the driver against your current kernel. Welcome to freedom... Even though you paid for all the hardware, you still have to jump through hoops to run said hardware using free software....
Hey, speak for yourself. Using the repository mentioned above, I've not had one ounce of problems with the driver (Ubuntu 11.04). apt downloads the updates whenever they're available (which admittedly is never now that 11.04 is unsupported) and they work just fine, reboot or not.

However, before I found the repository, their installation scripts never worked on Ubuntu, and I was much more frustrated then.

2012-12-28, 04:15   #13
chalsall
If I May

"Chris Halsall"
Sep 2002

9,479 Posts

Quote:
 Originally Posted by Dubslow Hey, speak for yourself. Using the repository mentioned above, I've not had one ounce of problems with the driver (Ubuntu 11.04). apt downloads the updates whenever they're available (which admittedly is never now that 11.04 is unsupported) and they work just fine, reboot or not.
But... Technically you're breaking the contract.

Every time the kernel gets updated you are supposed to recompile the nVidia drivers yourself.

It's stupid, I agree. It's also the law (which is also stupid).

2012-12-28, 05:00   #14
Dubslow

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1110000110101₂ Posts

Quote:
 Originally Posted by chalsall But... Technically you're breaking the contract. Every time the kernel gets updated you are supposed to recompile the nVidia drivers yourself. It's stupid, I agree. It's also the law (which is also stupid).
Really? What a silly stipulation. (I assume by "recompile" you mean reinstall with the shell script they provide?)

I guess that anyone using that repository is breaking the contract then.

Also note that Canonical also supplies the drivers in a "partner" repository, and since those are from nVidia directly, that must have a different license.

*shrug*

2012-12-28, 23:52   #15
chalsall
If I May

"Chris Halsall"
Sep 2002

9,479 Posts

Quote:
 Originally Posted by Dubslow Really? What a silly stipulation. (I assume by "recompile" you mean reinstall with the shell script they provide?)
Yes. You "sh NVIDIA-Linux-x86_64-310.19.run", for example, as root, and trust (hope? (pray?)) that they don't do anything nasty, or that they haven't been hacked. Interestingly, the binary script files they provide are never signed with a key...

Quote:
 Originally Posted by Dubslow I guess that anyone using that repository is breaking the contract then.
Probably. Please see as to the current restrictions.

It *is* possible that the rpm/deb file downloads the current nVidia driver and runs it. But I doubt it. And for some reason the repo maintainers are always behind the ball on kernel upgrades and nVidia driver releases...

Quote:
 Originally Posted by Dubslow Also note that Canonical also supplies the drivers in a "partner" repository, and since those are from nVidia directly, that must have a different license. *shrug*
But as Richard Stallman pointed out recently, Ubuntu isn't Free. Even though it's free.

It's complicated....

2012-12-29, 01:19   #16
TheJudger

"Oliver"
Mar 2005
Germany

11·101 Posts

Hi!

Quote:
 Originally Posted by Graff Just tried another reboot. This time ran the nvidia-smi -a command with sudo. Normal output resulted! Was able to get mfaktc running. Will now try this on the other machine. Same thing, no joy until I ran sudo nvidia-smi -a. Gareth
Quote:
 Originally Posted by Dubslow Huh, yes that is really bizarre behavior. Another thing to try is sudo apt-get upgrade, though I'm not sure that would help. I have no idea why the drivers seem to disappear, or why sudo nvidia-smi -a would fix it (but not without the sudo).
I've an idea why 'sudo nvidia-smi -a' fixes the problem.
I guess you either have no X running or X is configured not to use the nvidia driver. X is privileged to load kernel modules; a normal user is not. When you run 'nvidia-smi -a', some software notices that it should load the nvidia kernel module. When running 'nvidia-smi -a' as a normal user you're not allowed to... when you do the same as a privileged user (e.g. root) it is possible to load the module. Same for mfaktc: running as a user, no automagic module loading... but when you start it as root it would...
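To see which state a machine is in, a small diagnostic along these lines may help. It is only a sketch: the helper names module_loaded and nvidia_status are made up for illustration, and it assumes the module shows up as "nvidia" in /proc/modules and that the control node is /dev/nvidiactl.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a module name appears as the first field
# of a /proc/modules-style listing (exact match, so nvidia_uvm != nvidia).
module_loaded() {
    local name="$1" listing="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$listing" 2>/dev/null
}

# Hypothetical helper: print one status line for the kernel module and one
# for the control device node. Both paths can be overridden for testing.
nvidia_status() {
    local listing="${1:-/proc/modules}" node="${2:-/dev/nvidiactl}"
    if module_loaded nvidia "$listing"; then
        echo "module: loaded"
    else
        echo "module: not loaded"
    fi
    if [ -e "$node" ]; then
        echo "node: present"
    else
        echo "node: missing"
    fi
}

# Read-only check; needs no root privileges.
nvidia_status
```

If this reports "module: not loaded" as a normal user and "module: loaded" after a sudo nvidia-smi -a, that would be consistent with the explanation above.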

Solutions (pick one of them):
• configure udev
• load X (with nvidia modules)
• write a small startup script which loads the modules

I would choose the first option; this is tested on Ubuntu 12.04. Create these two files as root:
/etc/udev/rules.d/86-nvidia.rules:
Code:
SUBSYSTEM=="module", KERNEL=="nvidia", RUN+="/lib/udev/nvidia.sh"
/lib/udev/nvidia.sh:
Code:
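#!/bin/bash
# Create the NVIDIA device nodes (character devices, major number 195).

# Control device: minor number 255.
mknod -m 666 /dev/nvidiactl c 195 255
#chown root:root /dev/nvidiactl

for DEV in {0..7} # one for each GPU
do
  # Per-GPU device: minor number equals the GPU index.
  mknod -m 666 /dev/nvidia${DEV} c 195 ${DEV}
#  chown root:root /dev/nvidia${DEV}
done
Then make the script executable:
Code:
chmod +x /lib/udev/nvidia.sh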
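For comparison, the third option (a small startup script) could look roughly like the sketch below. The function name ensure_module and the LOADER override are made up for illustration; the real work is a plain modprobe call, which has to run as root. Note that loading the module on its own may not create the /dev/nvidia* nodes, so the mknod calls from nvidia.sh would probably still be needed.

```shell
#!/bin/bash
# Hypothetical boot-time helper: load a kernel module unless it is already
# listed in /proc/modules. LOADER can be overridden so the logic can be
# exercised without root; by default it is modprobe.
ensure_module() {
    local name="$1" listing="${2:-/proc/modules}" loader="${LOADER:-modprobe}"
    # The trailing space keeps "nvidia" from matching e.g. "nvidia_uvm".
    if grep -q "^${name} " "$listing" 2>/dev/null; then
        echo "${name}: already loaded"
        return 0
    fi
    if "$loader" "$name"; then
        echo "${name}: loaded"
    else
        echo "${name}: failed to load" >&2
        return 1
    fi
}

# Example call for a startup script run as root (left commented out here):
# ensure_module nvidia
```

Where to hook this in (for example /etc/rc.local) depends on the distribution and is not something I have tested.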

Oliver

2013-03-14, 11:57   #17
Graff

Jul 2006
USA (UT-5) via UK (UT)

EC₁₆ Posts

Quote:
 Originally Posted by henryzz Just the last one should do. The first command was adding a repository. The second was downloading the lists from that repository.
After replacing the power supply in one of my machines, along with
a replacement GPU, mfaktc ran quite happily on both GPUs. One
process ran out of work, so I repopulated worktodo.txt and tried to
rerun mfaktc.exe. Got the following error:

NVIDIA: API mismatch: the NVIDIA kernel module has version 304.64,
but this NVIDIA driver component has version 304.84. Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
Failed to initialize NVML: Unknown Error

(The other GPU is continuing to run and is not going to run out of work
any time soon.)

I tried running "sudo nvidia-smi -a" and got:

NVIDIA: API mismatch: the NVIDIA kernel module has version 304.64,
but this NVIDIA driver component has version 304.84. Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
Failed to initialize NVML: Unknown Error

Presumably some update applied in the <12 hours that the machine
has been running broke something in the NVIDIA set up. (The machine
had been off-line for a couple of weeks, so there were a lot of
updates to apply.)

Any thoughts?

Gareth

2013-03-14, 23:44   #18
Graff

Jul 2006
USA (UT-5) via UK (UT)

354₈ Posts

Quote:
 Originally Posted by Graff Any thoughts? Gareth
Fixed it. A simple reboot fixed the problem. The last patch install
was apparently the problem and it did not say that a reboot was
necessary.
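
For anyone hitting the same message, a rough way to tell whether a reboot is pending is to compare the version of the module that is currently loaded with the version installed on disk. This is only a sketch under a few assumptions: the helper names are made up, the module is assumed to be called "nvidia" (some packages may use a different name), and /sys/module/nvidia/version is assumed to be populated.

```shell
#!/bin/bash
# Version of the nvidia module currently running in the kernel.
# The path is overridable for testing; empty output if it cannot be read.
loaded_version() { cat "${1:-/sys/module/nvidia/version}" 2>/dev/null; }

# Version of the nvidia module installed on disk, as reported by modinfo.
ondisk_version() { modinfo -F version nvidia 2>/dev/null; }

# Hypothetical helper: succeed only if both versions are known and differ.
reboot_needed() {
    local loaded="$1" ondisk="$2"
    [ -n "$loaded" ] && [ -n "$ondisk" ] && [ "$loaded" != "$ondisk" ]
}

if reboot_needed "$(loaded_version)" "$(ondisk_version)"; then
    echo "loaded and installed nvidia modules differ: reboot or reload the module"
fi
```

In my case the two numbers would presumably have been 304.64 (loaded) and 304.84 (installed).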

Gareth


All times are UTC.
