New user - nvidia drivers



SunshineSue
7th December 2017, 02:21 PM
I'm a new fedora linux user. I've been using it maybe 2 months.

I installed Fedora 26 from DVD on my new laptop with a nvidia Geforce 960M card. I also have windows 10 so I use UEFI. I couldn't get the nouveau driver to work so I started with getting the intel onboard graphics working. I put an entry in my grub file to do that. It's been running well the past 2 months so I thought it was time to get the nvidia driver installed.

That was harder because I had to figure out how to sign the kernel module. I made a pair of keys and ran a command to import a key and I put the key password in the UEFI window that popped up when I was first booting my computer.

I thought my keys were all set so I ran the .run installer from the Nvidia web site. The installer asked for both my public and my private key.

It finished without any errors so I thought the module was both installed and signed.

When I try to turn on the module, I get this error:

[root@localhost ~]# modprobe nvidia-drm
modprobe: ERROR: could not insert 'nvidia_drm': Required key not available

When I try to import my public key again, it tells me it's already imported:

[root@localhost ~]# mokutil --import pubkey.der
SKIP: pubkey.der is already enrolled

I kind of understand the two key system. It works like PGP email. There's a public key and a private key and one signs my module and the other opens it once it's been signed.

I tried running the nvidia run file again but now it doesn't ask for my keys to sign the module. I don't know how to manually sign it. I tried digging around in /usr/src/kernels but the modules aren't built there. I'm not sure how to find the compiled module to re-sign it.

So now I'm stuck...what part did I mess up and how do I fix it? I tried looking for answers but most suggest turning off secure boot in the bios which breaks my windows 10 install.

Note: Edited to try to make what I did and the question more clear.

HaydnH
7th December 2017, 04:12 PM
You need to sign the module with your key. That is the command you're missing:



/usr/src/kernels/$(uname -r)/scripts/sign-file sha256 key.priv key.der /path/to/nvidia/module

SunshineSue
7th December 2017, 06:08 PM
Thank you for answering so quickly. Manually signing the module still caused the same error.

[root@localhost ~]# /usr/src/kernels/$(uname -r)/scripts/sign-file sha256 priv.key pubkey.der /usr/lib/modules/4.13.16-202.fc26.x86_64/extra/nvidia-drm.ko
[root@localhost ~]# modprobe nvidia-drm
modprobe: ERROR: could not insert 'nvidia_drm': Required key not available

Edit: Is there a log file I can look at for the signing or for modprobe that might give more information? Is there a way to confirm that I imported my key properly into the UEFI database?

amiga
8th December 2017, 02:40 AM
There is more than one NVidia kernel module. I have three loaded and there are four in the kernel package. The actual driver is nvidia.ko, the largest module. nvidia-modeset.ko and nvidia-drm.ko are used for kernel modesetting. nvidia-drm.ko just happens to be the one the kernel loads first, because it tries to do modesetting during the boot process. They all have to be signed.


$ lsmod | grep nvidia
nvidia_drm 53080 1
nvidia_modeset 790163 4 nvidia_drm
nvidia 11912556 58 nvidia_modeset

$ pwd
/usr/lib/modules/3.10.0-514.2.2.el7.x86_64/extra
$ du -h *
92K nvidia-drm.ko
16M nvidia.ko
1.1M nvidia-modeset.ko
1.2M nvidia-uvm.ko


Did you sign all of them? The nvidia-drm.ko module depends on nvidia-modeset.ko, which depends on the actual driver, nvidia.ko. All three have to be signed in order to load nvidia-drm.ko. Even if nvidia-drm.ko is signed properly, loading it will fail if the modules it depends on are not also signed. That could be what is behind the error message you are getting.


$ modinfo nvidia-drm.ko
filename: /usr/lib/modules/3.10.0-514.2.2.el7.x86_64/extra/nvidia-drm.ko
version: 375.20
supported: external
license: MIT
...
depends: drm,drm_kms_helper,nvidia-modeset
...
$ modinfo nvidia-modeset.ko
filename: /usr/lib/modules/3.10.0-514.2.2.el7.x86_64/extra/nvidia-modeset.ko
version: 375.20
supported: external
license: NVIDIA
...
depends: nvidia
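
To see at a glance which of these modules already carry a signature, one rough check is to look for the marker string that the kernel's sign-file tool appends to the end of a signed module. This is only a sketch: it assumes uncompressed .ko files, and has_signature and report_signatures are names made up here for illustration. It shows that some signature is attached, not that it was made with the key enrolled in the firmware.

```shell
# Return success if the file ends with the kernel's module signature marker.
# The marker is 28 bytes long, including its trailing newline.
has_signature() {
    tail -c 28 "$1" | grep -q '~Module signature appended~'
}

# Print one line per nvidia*.ko in the given directory.
report_signatures() {
    for mod in "$1"/nvidia*.ko; do
        [ -e "$mod" ] || continue          # no matching files at all
        if has_signature "$mod"; then
            echo "${mod##*/}: signature appended"
        else
            echo "${mod##*/}: NO signature"
        fi
    done
}

# Example:
# report_signatures "/usr/lib/modules/$(uname -r)/extra"
```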



I saw a page once where someone used a bash for loop to sign all of the modules in this directory.
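
A loop of that kind might look like the sketch below. It assumes the key file names used earlier in this thread (priv.key and pubkey.der) and the sign-file path from HaydnH's post. The function name sign_nvidia_modules and the SIGN_FILE override are hypothetical, added only so the loop can be tried against a test directory before touching real modules.

```shell
# Sign every nvidia*.ko in a directory with the same key pair.
# $1 = module directory, $2 = private key, $3 = public key (DER)
sign_nvidia_modules() {
    dir=$1
    priv=$2
    pub=$3
    # SIGN_FILE is an assumed override; by default use the kernel's own script.
    sign_file=${SIGN_FILE:-/usr/src/kernels/$(uname -r)/scripts/sign-file}
    for mod in "$dir"/nvidia*.ko; do
        [ -e "$mod" ] || continue          # glob matched nothing
        if "$sign_file" sha256 "$priv" "$pub" "$mod"; then
            echo "signed: $mod"
        else
            echo "FAILED: $mod" >&2
        fi
    done
}

# Example (as root):
# sign_nvidia_modules "/usr/lib/modules/$(uname -r)/extra" priv.key pubkey.der
```

Signing all of them in one pass avoids the situation where nvidia-drm.ko is signed but nvidia.ko or nvidia-modeset.ko is not.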

smr54
8th December 2017, 04:09 AM
You might have better luck with the rpmfusion version, which does all the work for you.
I believe the maintainer is on these forums. (I think he's still the maintainer, if he sees this, he will confirm or correct, I hope)
https://rpmfusion.org/Howto/NVIDIA?highlight=%28CategoryHowto%29

leigh123linux
8th December 2017, 07:11 AM
We are working on automatic signing, but it probably won't happen before F28.

https://bugzilla.redhat.com/show_bug.cgi?id=1454824

smr54
8th December 2017, 07:19 PM
Thanks, Leigh. I wasn't sure if you were still involved with it.

SunshineSue
10th December 2017, 04:56 PM
You might have better luck with the rpmfusion version, which does all the work for you.
I believe the maintainer is on these forums. (I think he's still the maintainer, if he sees this, he will confirm or correct, I hope)
https://rpmfusion.org/Howto/NVIDIA?highlight=%28CategoryHowto%29

Thank you for answering. I saw the response that rpmfusion won't handle signing until later.

I used the .run file from the Nvidia web site and it had some sort of automated key signing. I don't know if it worked but the .run installer asked if I wanted to sign the modules and asked for my public and private key so it at least "tried" :D.






Did you sign all of them ? The nvidia-drm.ko module depends on nvidia-modeset.ko which depends on the actual driver nvidia.ko itself. All three would have to be signed in order to load nvidia-drm.ko. Even if nvidia-drm.ko was signed properly the loading operation would fail if its dependent modules were not also signed. That could be what is happening in the error message you are getting.

I saw a page once where someone used a bash for loop to sign all of the modules in this directory.

Oh. I didn't realize that I needed to sign all the nvidia modules. I'll look through the extras directory and I'll try that.

Half of me wants to chuck Windows 10 out the window (unfortunately I can't - need it for work) and the other half is enjoying this because I'm learning so much about how Linux works.

Thanks for taking the time to answer my questions. I'll be back later to either mark the thread fixed or ask more questions.

amiga
11th December 2017, 02:27 AM
Oh. I didn't realize that I needed to sign all the nvidia modules. I'll look through the extras directory and I'll try that.

If you don't sign the actual 16 MB driver module, nothing will work anyway, even if you don't use modesetting.

SunshineSue
12th December 2017, 01:58 PM
I signed all the nvidia modules in "extras" and then I tested it with modprobe which completed silently.

Then I rebooted. Now I can't boot into Fedora. I can only boot into Windows 10. Fedora hangs when booting.

A start job is running for Hold until boot process finishes up (with a time counter afterwards)

I tried booting into run level 3 and I get a login prompt. When I login I can't "startx".

Most of /var/log/Xorg.0.log looks fine (to my newbie eyes). Stuff is just loading. I see it trying to load nvidia and glx.

Edit: I have a pastebin of the log : https://pastebin.com/Xs3vweXJ

amiga
12th December 2017, 11:01 PM
Did you blacklist the open source nouveau driver at boot time? To use the nvidia driver, nouveau has to be blacklisted so that it doesn't load. Otherwise the two drivers will conflict.


GRUB_CMDLINE_LINUX="rhgb rd.driver.blacklist=nouveau ...

You need to add rd.driver.blacklist=nouveau to the GRUB_CMDLINE_LINUX variable in the file /etc/default/grub. Then you would regenerate your grub.cfg with grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg.

Also check that the nvidia installer installed a file in /etc/modprobe.d/. The NVidia .run file does this. If this file exists and the kernel parameter above to blacklist nouveau exists nouveau should not load.


$ cat /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
# generated by nvidia-installer
blacklist nouveau
options nouveau modeset=0
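
Both conditions can be checked from a script. This is a minimal sketch that assumes the parameter and the modprobe.d layout shown above. has_kernel_param and modprobe_blacklists are illustrative helper names, not existing tools.

```shell
# Succeed if the exact parameter appears as a word on the given command line.
# $1 = parameter, e.g. rd.driver.blacklist=nouveau   $2 = command line text
has_kernel_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Succeed if any *.conf file in the directory contains "blacklist <module>".
# $1 = module name   $2 = directory, normally /etc/modprobe.d
modprobe_blacklists() {
    grep -qs "^[[:space:]]*blacklist[[:space:]][[:space:]]*$1[[:space:]]*\$" "$2"/*.conf
}

# Example, run after rebooting:
# has_kernel_param rd.driver.blacklist=nouveau "$(cat /proc/cmdline)" || echo "kernel parameter missing"
# modprobe_blacklists nouveau /etc/modprobe.d || echo "no blacklist file"
```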

SunshineSue
13th December 2017, 08:37 PM
Did you blacklist the open source nouveau drivers at boot time ? In order to use the nvidia driver the nouveau driver needs to be blacklisted so it doesn't load. If not there will be a conflict.

Yes, at the top of the pastebin log is my grub command line parameters:
Kernel command line: BOOT_IMAGE=/vmlinuz-4.13.16-202.fc26.x86_64 root=UUID=25237c23-9df6-4a51-b080-fdef636e779c ro rd.driver.blacklist=nouveau nvidia.modeset=0 nouveau.modeset=0 intel.modeset=1 gfxpayload=vga=normal rd.lvm.lv=fedora/swap rd.lvm.lv=fedora/root rhgb quiet LANG=en_US.UTF-8 3


Also check that the nvidia installer installed a file in /etc/modprobe.d/. The NVidia .run file does this. If this file exists and the kernel parameter above to blacklist nouveau exists nouveau should not load.

It had not so I put it in manually. No change.

The only slight weirdness I can see is that one of my grub parameters shows as "nvidia.modeset=0".

I have "nvidia.modeset=1" in my /etc/default/grub file and the grub.cfg file shows _both_ versions (nvidia.modeset=0 and nvidia.modeset=1) and "1" is the last entry (which I think is supposed to take precedence over the "0" setting?)

Something, "somewhere" in my system is forcing nvidia.modeset=0. It's not much to go on but maybe an experienced person can make sense of that?

Edit: I'd like to be told how to turn off the nvidia loading so I can at least go back to booting with my intel driver when I'm not trying to troubleshoot.

amiga
13th December 2017, 11:27 PM
The only slight weirdness I can see is that one of my grub parameters shows as "nvidia.modeset=0".

There is more weirdness than that. You have intel.modeset=1 in your parameters. If you are trying to use your NVidia card why are you still turning on modeset for the Intel GPU ?


BOOT_IMAGE=/vmlinuz-4.13.16-202.fc26.x86_64 root=UUID=25237c23-9df6-4a51-b080-fdef636e779c ro rd.driver.blacklist=nouveau nvidia.modeset=0 nouveau.modeset=0 intel.modeset=1 gfxpayload=vga=normal rd.lvm.lv=fedora/swap rd.lvm.lv=fedora/root rhgb quiet LANG=en_US.UTF-8 3


I have "nvidia.modeset=1" in my /etc/default/grub file and the grub.cfg file shows _both_ versions (nvidia.modeset=0 and nvidia.modeset=1) and "1" is the last entry (which I think is supposed to take precedence over the "0" setting?)

I don't see nvidia.modeset=1 in what you posted. I only see intel.modeset=1.

Please post your /etc/default/grub file or at least the GRUB_CMDLINE_LINUX variable.

SunshineSue
14th December 2017, 09:47 AM
There is more weirdness than that. You have intel.modeset=1 in your parameters. If you are trying to use your NVidia card why are you still turning on modeset for the Intel GPU ?

Because I haven't taken it out yet. This is behaving as expected at least for now.

Should I remove the intel modeset setting?


I don't see nvidia.modeset=1 in what you posted. I only see intel.modeset=1.

You seem to have misunderstood what I said. You won't see nvidia.modeset=1. You'll see nvidia.modeset=0

BOOT_IMAGE=/vmlinuz-4.13.16-202.fc26.x86_64 root=UUID=25237c23-9df6-4a51-b080-fdef636e779c ro rd.driver.blacklist=nouveau nvidia.modeset=0 nouveau.modeset=0 intel.modeset=1 gfxpayload=vga=normal rd.lvm.lv=fedora/swap rd.lvm.lv=fedora/root rhgb quiet LANG=en_US.UTF-8 3

There should be nvidia.modeset=1 but instead it's nvidia.modeset=0


Please post your /etc/default/grub file or at least the GRUB_CMDLINE_LINUX variable.

It would be very helpful if I could be told how to shut off whatever's trying to load my nvidia drivers so I can at least get into linux using my intel drivers.

Edit: here is /etc/default/grub:


GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
##GRUB_CMDLINE_LINUX="nomodeset rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
GRUB_CMDLINE_LINUX="rd.driver.blacklist=nouveau nouveau.modeset=0 nvidia.modeset=1 intel.modeset=1 gfxpayload=vga=normal rd.lvm.lv=fedora/swap rd.lvm.lv=fedora/root"


Here is just grub_cmdline_linux:
GRUB_CMDLINE_LINUX="rd.driver.blacklist=nouveau nouveau.modeset=0 nvidia.modeset=1 intel.modeset=1 gfxpayload=vga=normal rd.lvm.lv=fedora/swap rd.lvm.lv=fedora/root"

Here is a line from my grub.cfg file:
linuxefi /vmlinuz-4.13.16-202.fc26.x86_64 root=UUID=25237c23-9df6-4a51-b080-fdef636e779c ro rd.driver.blacklist=nouveau nvidia.modeset=0 nouveau.modeset=0 nvidia.modeset=1 intel.modeset=1 gfxpayload=vga=normal rd.lvm.lv=fedora/swap rd.lvm.lv=fedora/root
initrdefi /initramfs-4.13.16-202.fc26.x86_64.img

In grub_cmdline_linux I set "nvidia.modeset=1"

In grub.cfg it shows "nvidia.modeset=0" and after it "nvidia.modeset=1"

In the list of boot variables in my pastebin file of the log it shows "nvidia.modeset=0"

I try to set nvidia.modeset=1 and something in my computer changes it to nvidia.modeset=0 along the way. I don't know what's doing it. This may be a clue to what's going on, or it may be unrelated but I noticed it happening and it seems like it might be useful to mention it.
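
One way to compare the two places side by side is to list every value a parameter receives on a given command line. The sketch below is only an illustration, and param_values is a made-up helper name. Note that the command line in the log ends with "rhgb quiet LANG=en_US.UTF-8 3" while the grub.cfg line quoted above does not, which may mean the entry that actually booted is not the one quoted (that is an assumption, not something the log proves).

```shell
# Print each value assigned to a parameter, one per line, in order of appearance.
# $1 = parameter name, e.g. nvidia.modeset   $2 = command line text
param_values() {
    printf '%s\n' "$2" | tr -s ' \t' '\n\n' |
        awk -F= -v k="$1" '$1 == k { print substr($0, length(k) + 2) }'
}

# Example:
# param_values nvidia.modeset "$(cat /proc/cmdline)"
# grep linuxefi /boot/efi/EFI/fedora/grub.cfg | while read -r line; do
#     param_values nvidia.modeset "$line"
# done
```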

amiga
14th December 2017, 04:02 PM
Because I haven't taken it out yet. This is behaving as expected at least for now.
Should I remove the intel modeset setting?

If you are trying to use the NVidia driver you shouldn't have the modesets for two different drivers enabled. You may also have to blacklist the intel driver just as you did with nouveau. I don't have an Intel GPU in my i7-2600. I only have my GTX550. I have never had to do this but you may have to. I recommend blacklisting the intel driver as well as setting its modeset to 0.

There are two things you may not realize that will make your life easier. The first is that even though grub.cfg is auto-generated you can create any custom entries you want in the /etc/grub.d/40_custom file and they will be copied to grub.cfg. If you create your own entries in this file all of the problems you are having below will go away as you can edit the entry as you want. Simply copy the existing entry in grub.cfg to this file and edit it to your liking.
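
As a rough sketch of that copy step, the helper below prints the first menuentry block from a grub.cfg so it can be appended to /etc/grub.d/40_custom and then edited by hand. first_menuentry is a hypothetical name, and the sketch assumes each entry starts with "menuentry" at the beginning of a line and ends with a closing brace on a line of its own.

```shell
# Print the first menuentry block (from "menuentry" to its closing "}").
# $1 = path to grub.cfg
first_menuentry() {
    awk '/^menuentry /{p = 1} p {print} p && /^}/ {exit}' "$1"
}

# Example (as root):
# first_menuentry /boot/efi/EFI/fedora/grub.cfg >> /etc/grub.d/40_custom
# ...edit the kernel parameters in /etc/grub.d/40_custom, then:
# grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
```

It is worth changing the title of the copied entry so it can be told apart from the auto-generated one in the boot menu.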


In grub_cmdline_linux I set "nvidia.modeset=1"
In grub.cfg it shows "nvidia.modeset=0" and after it "nvidia.modeset=1"
In the list of boot variables in my pastebin file of the log it shows "nvidia.modeset=0"

I try to set nvidia.modeset=1 and something in my computer changes it to nvidia.modeset=0 along the way. I don't know what's doing it. This may be a clue to what's going on, or it may be unrelated but I noticed it happening and it seems like it might be useful to mention it.


If you create your own grub entries in /etc/grub.d/40_custom all of these problems go away.

The second thing is that you don't need modesetting at all to boot. This is just to get a smaller font during boot and for virtual consoles. You can simply use nomodeset for now.

SunshineSue
19th December 2017, 05:17 PM
If you are trying to use the NVidia driver you shouldn't have the modesets for two different drivers enabled. You may also have to blacklist the intel driver just as you did with nouveau. I don't have an Intel GPU in my i7-2600. I only have my GTX550. I have never had to do this but you may have to. I recommend blacklisting the intel driver as well as setting its modeset to 0.

<snip>

The second thing is that you don't need modesetting at all to boot. This is just to get a smaller font during boot and for virtual consoles. You can simply use nomodeset for now.

I set the intel modeset to 0 and set nomodeset at the end of the grub line. I didn't blacklist the intel driver because I found other discussions that explained that it doesn't stop the nvidia module from loading and blacklisting it can cause problems.

It still wouldn't boot.

I found the "lsmod" command and here is the nvidia output from it:



nvidia_drm 45056 0
nvidia_modeset 843776 1 nvidia_drm
nvidia 13119488 1 nvidia_modeset
drm_kms_helper 159744 3 nouveau,i915,nvidia_drm
drm 352256 6 nouveau,i915,ttm,nvidia_drm,drm_kms_helper


Nouveau is in there. Does that mean it wasn't blacklisted properly?

Also, here are the last few lines of the Xorg log file:


[ 109.602] (II) NVIDIA GLX Module 384.98 Thu Oct 26 14:35:55 PDT 2017
[ 109.602] (II) LoadModule: "nvidia"
[ 109.602] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[ 109.602] (II) Module nvidia: vendor="NVIDIA Corporation"
[ 109.602] compiled for 4.0.2, module version = 1.0.0
[ 109.602] Module class: X.Org Video Driver
[ 109.602] (II) NVIDIA dlloader X Driver 384.98 Thu Oct 26 14:06:45 PDT 2017
[ 109.602] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[ 109.602] xf86EnableIOPorts: failed to set IOPL for I/O (Operation not permitted)
[ 109.602] (II) systemd-logind: releasing fd for 226:1
[ 109.603] (EE) No devices detected.
[ 109.603] (EE)
Fatal server error:
[ 109.603] (EE) no screens found(EE)
[ 109.603] (EE)


This line looks like an error: xf86EnableIOPorts: failed to set IOPL for I/O (Operation not permitted)

I decided to search for the error and found this discussion: https://bbs.archlinux.org/viewtopic.php?id=230139

The resolution was to install Bumblebee for the two video chips (Intel and Nvidia).

I'm going to try this but first I have to be able to boot into the graphical mode. My wireless card won't work in text mode. I'm posting this now so that I don't forget what I did/what I was thinking.

I've asked twice how to stop the nvidia drivers from loading so I can boot using my intel chip but no one answered. I'm going to try to uninstall the nvidia run file and see if that works.

I'll edit the post when I'm done.

amiga
19th December 2017, 10:45 PM
Nouveau is in there. Does that mean it wasn't blacklisted properly?

Possibly. I don't have it when I search for nvidia. To be sure you need to search for nouveau in the left hand column.


lsmod | grep nouveau
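
A plain grep matches nouveau anywhere on a line, including the last column, which lists the modules using the one named on the left. Either position means the module is loaded. The sketch below checks both columns explicitly. module_loaded is an illustrative name, and it reads lsmod-style text from standard input.

```shell
# Succeed if the module is named in the first column or in a "Used by" list.
# $1 = module name; lsmod output is read from stdin.
module_loaded() {
    awk -v m="$1" '
        $1 == m { found = 1 }
        {
            n = split($4, users, ",")
            for (i = 1; i <= n; i++) if (users[i] == m) found = 1
        }
        END { exit !found }
    '
}

# Example:
# lsmod | module_loaded nouveau && echo "nouveau is loaded"
```

Fed the lsmod lines posted earlier in the thread, it reports nouveau as loaded, because nouveau is listed as a user of drm_kms_helper and drm.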


This line looks like an error: xf86EnableIOPorts: failed to set IOPL for I/O (Operation not permitted)

This could be a permission issue. Is your user in the video group ?


groups $(whoami)

The video group should be in the list.


I've asked twice how to stop the nvidia drivers from loading so I can boot using my intel chip but no one answered. I'm going to try to uninstall the nvidia run file and see if that works.

The .run file overwrites some standard graphic files in standard locations. You will need to use dnf to reinstall some libraries that were overwritten.


## Fedora 27/26/25/24/23/22 ##
dnf reinstall xorg-\* mesa\*

leigh123linux
19th December 2017, 11:32 PM
Possibly. I don't have it when I search for nvidia. To be sure you need to search for nouveau in the left hand column.


lsmod | grep nouveau





The .run file overwrites some standard graphic files in standard locations. You will need to use dnf to reinstall some libraries that were overwritten.


## Fedora 27/26/25/24/23/22 ##
dnf reinstall xorg-\* mesa\*


You missed libglvnd* and egl-wayland

SunshineSue
20th December 2017, 03:52 AM
My laptop is an optimus laptop and I didn't realize I needed special software to do the switching. That was my newbie mistake.

I scrubbed everything I could remember doing and removed the Nvidia run installation. I also walked through all the steps to get rid of nouveau, again.

Then I followed the directions in the Fedora wiki to install Bumblebee. Bumblebee has an option to sign kernel modules. I had to manually sign bbswitch.

It's much better, but there are still problems. I have to start in the terminal, then I startx, and then I start bumblebeed. If I don't do it in that order the laptop hangs.

Still, the nvidia driver is installed, loads, and I can use optirun to call it for the programs that need it.

Thanks to everyone for all the help.

Edit: I'm not sure if my bumblebee issues should go in a new thread or I should stay in this one?

leigh123linux
20th December 2017, 09:23 AM
You could use PRIME instead of bumblebee if you're not worried about power consumption.

https://rpmfusion.org/Howto/Optimus

amiga
20th December 2017, 05:22 PM
It should be mentioned that only a few packages are altered by the NVidia .run file. You could run a loop such as the following to find them.


for f in $(dnf list installed --quiet xorg-\* mesa\* libglvnd\* 2>&1 | tail -n +3 | awk '{print $1;}'); do rpm --query $f > /dev/null && (rpm --verify $f || echo package corrupted: $f); done


In my case, after using the NVidia .run file, only four packages (three mesa packages and one xorg package) needed to be re-installed out of the 56 graphics library packages in total. None of the libglvnd* packages were corrupted, and I don't have egl-wayland installed as I use Xorg.


S.5....T. /usr/lib64/libEGL.so.1.0.0
package corrupted: mesa-libEGL.x86_64
S.5....T. /usr/lib64/libGL.so.1.2.0
package corrupted: mesa-libGL.x86_64
S.5....T. /usr/lib64/libGLESv2.so.2.0.0
package corrupted: mesa-libGLES.x86_64
....L.... /usr/lib64/xorg/modules/extensions/libglx.so
missing /usr/lib64/xorg/modules/libglamoregl.so
package corrupted: xorg-x11-server-Xorg.x86_64
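
The column of flags at the start of each rpm --verify line says which attribute of the file no longer matches the package database. A small sketch to spell the letters out, following the flag letters described in the rpm man page; decode_verify_flags is an illustrative name.

```shell
# Translate an rpm --verify flag string such as "S.5....T." into words.
decode_verify_flags() {
    flags=$1
    out=""
    while [ -n "$flags" ]; do
        ch=${flags%"${flags#?}"}   # first character of what is left
        flags=${flags#?}           # drop it
        case $ch in
            S) out="$out size" ;;
            M) out="$out mode" ;;
            5) out="$out digest" ;;
            D) out="$out device" ;;
            L) out="$out symlink" ;;
            U) out="$out user" ;;
            G) out="$out group" ;;
            T) out="$out mtime" ;;
            P) out="$out capabilities" ;;
        esac
    done
    echo "${out# }"
}

# Example:
# decode_verify_flags 'S.5....T.'    # prints: size digest mtime
```

So the mesa libraries above differ in size, checksum and modification time, which is consistent with the files having been replaced.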

You could then re-install only these few packages by modifying the previous loop or just run dnf reinstall manually for each if there are only 3-4.


for f in $(dnf list installed --quiet xorg-\* mesa\* libglvnd\* 2>&1 | tail -n +3 | awk '{print $1;}'); do rpm --query $f > /dev/null && (rpm --verify $f || dnf reinstall $f); done


You missed libglvnd* and egl-wayland

I was looking at this guide which didn't mention these.

https://www.if-not-true-then-false.com/2015/fedora-nvidia-guide/3/

In my case I have the libglvnd* libraries installed on my system but the NVidia .run file did not touch them. The guide likely does not tell you to re-install packages that are never modified by the NVidia .run file. As I run Xorg on CentOS 7 I have no Wayland libraries, but perhaps the NVidia .run file does not modify these either. Others may know.