LinuxQuestions.org
Old 04-16-2007, 02:32 PM   #1
htedrom
Member
 
Registered: May 2005
Posts: 30

Rep: Reputation: 15
2in1 problem thread. (nvidia kernel module vs X module, and strange workbug phenom)


hey all...got two delicious problems for you all today.

First one should be a cinch.

1: on boot,

(EE) NVIDIA(0): Failed to initialize the NVIDIA kernel module! Please ensure
(EE) NVIDIA(0): that there is a supported NVIDIA GPU in this system, and
(EE) NVIDIA(0): that the NVIDIA device files have been created properly.
(EE) NVIDIA(0): Please consult the NVIDIA README for details.
(EE) NVIDIA(0): *** Aborting ***

somewhere else, i found some logging that says the kernel module VERSION is different from the X module version, and that that's bad. i'll post that as soon as i find where it is...

anyway, the quick fix is easy enough. all i do is kill gdm, `rmmod nvidia`, and restart gdm and presto, it starts, with the right driver...`nvidia` gets loaded.

the kernel module installed is the one from the nvidia binary driver. i'm sure i just have to get the right module loaded at boot, i just don't know how to do it.
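
As a rough sketch, the quick fix described above could be wrapped in a small script. The `/etc/init.d/gdm` path, the `run` helper and the `DRY_RUN` switch are assumptions made for illustration; by default the function only prints the commands it would run, and it needs root once `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of the manual workaround: stop gdm, unload the stale module, start gdm.
# Assumption: gdm is controlled through /etc/init.d/gdm on this system.

# run: print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

reload_nvidia() {
  run /etc/init.d/gdm stop &&
  run rmmod nvidia &&
  run /etc/init.d/gdm start
}

# Real run (as root):  DRY_RUN=0 reload_nvidia
```

This only automates the workaround; it does not explain why the wrong module is loaded at boot in the first place.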

one more weird thing...although i can run glx apps without X crashing, and beryl works just fine, glxgears only gets maybe 15 fps, and some glx apps don't do so hot (scorched3d crashes frequently, some screensavers really choke up). is that just a linux-drivers-blow thing, or is my card not being fully used?

OKAY, problem 2 is a bit more obscure, i'll be amazed if anyone can help me with it.

2: sometimes when i'm just happily minding my own beeswax (surfing the web, listening to music...nothing cpu intensive) the CPU starts, out of nowhere, working on something furiously. so furiously, in fact, that X completely stops responding and i can't get to any other terminals. the one time i did get to a terminal, the computer was so busy it wouldn't even run my "top" command. eventually i just have to cut the juice.

i have a cpu applet in my panel, and it shoots up to 100% and stays there, till it eventually stops showing new data, the computer is so busy. worth noting is that about half of the load is "IOWait" as opposed to actual work. anyway here's a list of apps that i'm usually running when it happens...unfortunately i'm usually running enough apps at one time that i can't really pin it down.

gnome-terminal
gedit
amarok
beryl 0.2
firefox 2
bittornado
evolution
liferea
vlc 0.8.6
nautilus

also to note is that
a. it's not the http cache cleaner
b. i'm usually using a lot of my ram, and usually around 50% of my paging file.
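
To put a number on point b, a small helper along these lines might work. It assumes the usual `/proc/meminfo` layout with `SwapTotal:` and `SwapFree:` lines in kB; the function name `swap_used_pct` is made up for this example.

```shell
#!/bin/sh
# Sketch: print the percentage of swap in use, reading /proc/meminfo-style
# text on stdin. Prints 0 when no swap is configured.
swap_used_pct() {
  awk '
    /^SwapTotal:/ { total = $2 }
    /^SwapFree:/  { free = $2 }
    END {
      if (total > 0) printf "%d\n", (total - free) * 100 / total
      else print 0
    }'
}

# Usage on a live system:
#   swap_used_pct < /proc/meminfo
```

If this figure climbs quickly just before a freeze, heavy swapping would be one possible (unconfirmed) explanation for the high IOWait.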


okay, thanks for any responses to my problems, let me know if there's any more info i can provide, especially with the second one i really don't know what to post.

here's some system info:

amd64, x2 4200 (dual core)
1gig or so ram
ubuntu edgy 6.10, kernel 2.6.17-11-generic
nvidia 7600GT using driver 1.0-9755 (for amd64)
xorg 7.1
gnome 2.16

thanks again

 
Old 04-17-2007, 01:19 PM   #2
AzrielMacKay
Member
 
Registered: Jul 2001
Location: Moody, AL
Distribution: Debian and Kubuntu
Posts: 249

Rep: Reputation: 30
you need to include more details on your first problem like: which distro are you using? what method did you use to install the drivers?
 
Old 04-18-2007, 12:17 AM   #3
htedrom
Member
 
Registered: May 2005
Posts: 30

Original Poster
Rep: Reputation: 15
the distro i include at the bottom of the post, and i used the nvidia installer for both the driver and kernel mod.
 
Old 04-18-2007, 03:57 AM   #4
samstar
Member
 
Registered: Apr 2007
Distribution: suse 10.2
Posts: 324

Rep: Reputation: 31
Hi,

Could you include your xorg.conf file?

When you used nvidia-installer, were there any error messages? You can check the nvidia installer log in /var/log (normally /var/log/nvidia-installer.log). If there are errors, please post them.

Also, what size is your swap partition?

Sam
 
Old 04-18-2007, 02:42 PM   #5
htedrom
Member
 
Registered: May 2005
Posts: 30

Original Poster
Rep: Reputation: 15
hey...swap size is 500meg. nvidia installer worked without any errors, but i found that mismatched version message in the installer log. posted the whole end of the log:

Code:
  NVIDIA: left KBUILD.
-> done.
-> Kernel module compilation complete.
-> Kernel messages:
   [  134.678476] NVRM: API mismatch: the client has the version 1.0-9755, but
   [  134.678478] NVRM: this kernel module has the version 1.0-8776.  Please
   [  134.678479] NVRM: make sure that this kernel module and all NVIDIA driver
   [  134.678480] NVRM: components have the same version.
   [  138.646395] eth0: no IPv6 routers present
   [  138.764159] NVRM: API mismatch: the client has the version 1.0-9755, but
   [  138.764161] NVRM: this kernel module has the version 1.0-8776.  Please
   [  138.764162] NVRM: make sure that this kernel module and all NVIDIA driver
   [  138.764164] NVRM: components have the same version.
   [  142.851234] NVRM: API mismatch: the client has the version 1.0-9755, but
   [  142.851236] NVRM: this kernel module has the version 1.0-8776.  Please
   [  142.851238] NVRM: make sure that this kernel module and all NVIDIA driver
   [  142.851239] NVRM: components have the same version.
   [  147.319218] Bluetooth: Core ver 2.8
   [  147.319224] NET: Registered protocol family 31
   [  147.319226] Bluetooth: HCI device and connection manager initialized
   [  147.319246] Bluetooth: HCI socket layer initialized
   [  147.367872] Bluetooth: L2CAP ver 2.8
   [  147.367877] Bluetooth: L2CAP socket layer initialized
   [  147.423885] Bluetooth: RFCOMM socket layer initialized
   [  147.423904] Bluetooth: RFCOMM TTY layer initialized
   [  147.423906] Bluetooth: RFCOMM ver 1.7
   [  176.448744] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC5] -> GSI 16
   (level, low) -> IRQ 50
   [  176.449040] PCI: Setting latency timer of device 0000:02:00.0 to 64
   [  176.449320] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  1.0-9755  Mon
   Feb 26 23:16:31 PST 2007
-> Installing both new and classic TLS OpenGL libraries.
-> Installing both new and classic TLS 32bit OpenGL libraries.
-> Install NVIDIA's 32-bit compatibility OpenGL libraries? (Answer: Yes)
-> Parsing log file:
-> done.
-> Validating previous installation:
-> done.
-> Uninstalling NVIDIA Accelerated Graphics Driver for Linux-x86_64
   (1.0-9755):
-> done.
-> Uninstallation of existing driver: NVIDIA Accelerated Graphics Driver for
   Linux-x86_64 (1.0-9755) is complete.
-> Searching for conflicting X files:
-> done.
-> Searching for conflicting OpenGL files:
-> done.
-> Installing 'NVIDIA Accelerated Graphics Driver for Linux-x86_64'
   (1.0-9755):
   executing: '/sbin/ldconfig'...
   executing: '/sbin/depmod -aq'...
-> done.
-> Driver file installation is complete.
-> Running post-install sanity check:
-> done.
-> Post-install sanity check passed.
-> Shared memory test passed.
-> Running runtime sanity check:
-> done.
-> Runtime sanity check passed.
-> Would you like to run the nvidia-xconfig utility to automatically update
   your X configuration file so that the NVIDIA X driver will be used when you
   restart X?  Any pre-existing X configuration file will be backed up.
   (Answer: No)
-> Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64
   (version: 1.0-9755) is now complete.  Please update your XF86Config or
   xorg.conf file as appropriate; see the file
   /usr/share/doc/NVIDIA_GLX-1.0/README.txt for details.

hope that helps
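
To pull just the two version numbers out of kernel messages like the ones above, something like this sketch could be used. It assumes the NVRM lines are worded exactly as in the log shown; `nvrm_versions` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read kernel log text on stdin and print the driver version reported
# by the client (X driver / GL libs) and by the loaded kernel module.
nvrm_versions() {
  sed -n \
    -e 's/.*NVRM: API mismatch: the client has the version \([^,]*\),.*/client=\1/p' \
    -e 's/.*NVRM: this kernel module has the version \([^ ]*\)\..*/module=\1/p' |
    sort -u
}

# Usage on a live system:
#   dmesg | nvrm_versions
```

If the two values differ, the module loaded at boot is not the one the installer just built.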
 
Old 04-18-2007, 02:46 PM   #6
htedrom
Member
 
Registered: May 2005
Posts: 30

Original Poster
Rep: Reputation: 15
sorry, forgot my xorg. here it is. (edit) i should mention that the x log says everything in here is okay, and i only get errors when it loads the nvidia module.(/edit)

Code:
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 1.0  (buildmeister@builder26)  Fri Dec 15 10:40:27 PST 2006

# /etc/X11/xorg.conf (xorg X Window System server configuration file)
#
# This file was generated by dexconf, the Debian X Configuration tool, using
# values from the debconf database.
#
# Edit this file with caution, and see the /etc/X11/xorg.conf manual page.
# (Type "man /etc/X11/xorg.conf" at the shell prompt.)
#
# This file is automatically updated on xserver-xorg package upgrades *only*
# if it has not been modified since the last upgrade of the xserver-xorg
# package.
#
# If you have edited this file but would like it to be automatically updated
# again, run the following command:
#   sudo dpkg-reconfigure -phigh xserver-xorg

Section "ServerLayout"
    Identifier     "Default Layout"
    Screen         "Default Screen" 0 0
    InputDevice    "Generic Keyboard"
    InputDevice    "Configured Mouse"
EndSection

Section "Files"

# path to defoma fonts
    FontPath        "/usr/share/fonts/X11/misc"
    FontPath        "/usr/share/fonts/X11/cyrillic"
    FontPath        "/usr/share/fonts/X11/100dpi/:unscaled"
    FontPath        "/usr/share/fonts/X11/75dpi/:unscaled"
    FontPath        "/usr/share/fonts/X11/Type1"
    FontPath        "/usr/share/fonts/X11/100dpi"
    FontPath        "/usr/share/fonts/X11/75dpi"
    FontPath        "/usr/share/fonts/X11/misc"
    FontPath        "/var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType"
EndSection

Section "Module"
    Load           "i2c"
    Load           "bitmap"
    Load           "ddc"
    Load           "extmod"
    Load           "freetype"
    Load           "glx"
    Load           "int10"
    Load           "type1"
    Load           "vbe"
EndSection

Section "InputDevice"
    Identifier     "Generic Keyboard"
    Driver         "kbd"
    Option         "CoreKeyboard"
    Option         "XkbRules" "xorg"
    Option         "XkbModel" "pc105"
    Option         "XkbLayout" "us"
    Option         "XkbOptions" "lv3:ralt_switch"
EndSection

Section "InputDevice"
    Identifier     "Configured Mouse"
    Driver         "mouse"
    Option         "CorePointer"
    Option         "Device" "/dev/input/mice"
#	Option		"Protocol"		"ExplorerPS/2"
    Option         "Protocol" "ImPS/2"
    Option         "ZAxisMapping" "4 5"
    Option         "Emulate3Buttons" "true"
EndSection

Section "Monitor"
    Identifier     "Acer AL1916W"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "NVIDIA GeForce 7600GT"
    Driver         "nvidia"

    # additions for beryl
    Option         "DisableGLXRootClipping" "True" 
    Option         "XvmcUsesTextures" "true" 
    Option         "AllowGLXWithComposite" "true" 
    Option         "Coolbits" "1" 
    Option         "RenderAccel" "true" 
    Option         "NoLogo" "true" 

EndSection

Section "Screen"
    Identifier     "Default Screen"
    Device         "NVIDIA GeForce 7600GT"
    Monitor        "Acer AL1916W"
    DefaultDepth    24

    # Compiz addition
    Option         "AddARGBGLXVisuals" "True"

    SubSection     "Display"
        Depth       1
        Modes       "1280x1024" "1024x768" "832x624" "800x600" "640x480"
    EndSubSection
    SubSection     "Display"
        Depth       4
        Modes       "1280x1024" "1024x768" "832x624" "800x600" "640x480"
    EndSubSection
    SubSection     "Display"
        Depth       8
        Modes       "1280x1024" "1024x768" "832x624" "800x600" "640x480"
    EndSubSection
    SubSection     "Display"
        Depth       15
        Modes       "1280x1024" "1024x768" "832x624" "800x600" "640x480"
    EndSubSection
    SubSection     "Display"
        Depth       16
        Modes       "1280x1024" "1024x768" "832x624" "800x600" "640x480"
    EndSubSection
    SubSection     "Display"
        Depth       24
        Modes      "1440x1024" "1280x1024" "1024x768" "832x624" "800x600" "640x480"
    EndSubSection
EndSection

Section "Extensions"
    Option         "Composite" "Enable"
EndSection

Last edited by htedrom; 04-18-2007 at 02:52 PM.
 
Old 04-18-2007, 04:43 PM   #7
samstar
Member
 
Registered: Apr 2007
Distribution: suse 10.2
Posts: 324

Rep: Reputation: 31
Do you remember how you installed the binary driver?

Did you tell it to update (nvidia-installer --update) or did you download a new one from nvidia?

And do you remember if you installed the 1.0-8776 driver, or if slackware did it for you? (I don't know much about slackware. If they have a different method of installing nvidia proprietary software, it may conflict with nvidia's install method)

It might be best to uninstall the nvidia drivers completely before updating them.

I read somewhere that nvidia drivers need to be patched in order to install them successfully on a 2.6.17 machine. Let's hope it doesn't come to that. The thread I read it on is below. It's for suse, though, not slackware.

http://forums.suselinuxsupport.de/in...d=176917&st=0&

When you re-install the nvidia drivers, try it this way first:

sh NVIDIA-Linux-x86-1.0-9755-pkg1.run --kernel-source-path=/pathto/linux-source

Where "/pathto/linux-source" is the path to your 2.6.17-11-generic kernel source files. Since you're on amd64, substitute the x86_64 installer you downloaded for the x86 file name above.
 
Old 04-18-2007, 05:55 PM   #8
rg.viza
Member
 
Registered: Aug 2006
Posts: 74

Rep: Reputation: 15
Quote:
Originally Posted by htedrom
hey all...got two delicious problems for you all today.

First one should be a cinch.

1: on boot,

(EE) NVIDIA(0): Failed to initialize the NVIDIA kernel module! Please ensure
(EE) NVIDIA(0): that there is a supported NVIDIA GPU in this system, and
(EE) NVIDIA(0): that the NVIDIA device files have been created properly.
(EE) NVIDIA(0): Please consult the NVIDIA README for details.
(EE) NVIDIA(0): *** Aborting ***
1. Try loading the nvidia driver as the last thing you do during startup sequence, in fact, put the insmod in rc.local to see if that helps. It looks like the driver's loading before the card is fully initialized and /dev entry created, evidenced by:
(EE) NVIDIA(0): that the NVIDIA device files have been created properly.

They are talking about the entries in /dev (such as /dev/nvidiactl and /dev/nvidia0) that the driver uses to communicate with the hardware.

If you load it as the last thing in the startup sequence, this issue should go away.

2. Freezing is a really vague problem. I'd unhook every piece of hardware you own, except what's necessary for the system to run, and see if it still happens. If it does, kill every last service you don't need for the system to run. If it still hangs then, I'd bet on hardware, like a spotty switch, NIC, loose cable, hard drive dying etc. I/O issues are the most common cause of stuff like this. This will happen with a fubar'd CD in the drive, a bad or loose drive cable, intermittent network cable, etc. IOWAIT is "I just sent data to a device and I'm waiting for it to respond with something I understand". This will hang your computer like a former dictator if the driver doesn't get an ACK.
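
For a rough number on IOWait, the first `cpu` line of `/proc/stat` can be read directly. This sketch assumes the standard field order (user, nice, system, idle, iowait, ...); the figure is cumulative since boot, so two samples taken a few seconds apart would be needed for a current rate. `iowait_pct` is a made-up name.

```shell
#!/bin/sh
# Sketch: read /proc/stat-style text on stdin and print the share of CPU time
# spent in iowait since boot, as a whole-number percentage.
iowait_pct() {
  awk '$1 == "cpu" {
    total = 0
    for (i = 2; i <= NF; i++) total += $i   # sum every time counter
    if (total > 0) printf "%d\n", $6 * 100 / total   # $6 is iowait
  }'
}

# Usage on a live system:
#   iowait_pct < /proc/stat
```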

-Viz
 
Old 04-18-2007, 09:46 PM   #9
htedrom
Member
 
Registered: May 2005
Posts: 30

Original Poster
Rep: Reputation: 15
haha okay, i'll cut my comp off...nothing but rice crackers and skim milk. hopefully it is just a bunk cd or something.

as for loading the driver...i thought that the driver doesn't load until X does? you mean load the kernel module last? i think the kernel module is getting loaded on runlevel 2... but my bash is a little rusty. this is S20nvidia_kernel (S20 is ubuntu speak for "enabled" i think)

/etc/rc2.d/S20nvidia_kernel:
Code:
#!/bin/sh

PATH=/sbin:/usr/sbin:/usr/local/sbin:/bin:/usr/bin:/usr/local/bin

# How many cards?
[ -r /etc/default/nvidia-kernel ] && . /etc/default/nvidia-kernel

# test if anything is requested
if [ -z "$NVIDIA_CARDS" ] || [ "$NVIDIA_CARDS" -lt 1 ]; then
  # Nothing to do but exit.
  exit 0
fi  
    
make_nodes () {
  if ! [ -e /dev/nvidiactl ]; then
    mknod -m 0660 /dev/nvidiactl c 195 255
    chgrp video /dev/nvidiactl
  fi
  for i in $(seq 0 $(($NVIDIA_CARDS - 1))); do
    if ! [ -e /dev/nvidia$i ]; then
      mknod -m 0660 /dev/nvidia$i c 195 $i
      chgrp video /dev/nvidia$i
    fi
  done
}
					
case "$1" in
  start|restart|reload|force-reload)
      make_nodes
      ;;

  stop) 
     :
     ;;
							    
   *)
     echo "Usage: /etc/init.d/nvidia-kernel {start|stop|restart|reload|force-reload}"
     exit 1
     ;;
esac
	        
exit 0
nvidia isn't in /etc/modules, and i'm not really sure what to do with /etc/modprobe.d/nvidia-kernel-nkc:
Code:
alias char-major-195* nvidia
how about disabling the S20nvidia-kernel, and adding "/etc/init.d/nvidia-kernel start" to my rc.local? would that work? i'm pissing in the dark.
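
On the S20 prefix: in a SysV-style rcN.d directory the first letter says whether the script is started (S) or stopped (K) on entering that runlevel, and the two digits set the order, so it is more than a plain "enabled" flag. A small sketch to decode a link name, with `describe_rc_link` being a hypothetical helper:

```shell
#!/bin/sh
# Sketch: decode a SysV rc link name such as /etc/rc2.d/S20nvidia_kernel into
# its action (start/stop), two-digit ordering number and service name.
describe_rc_link() {
  name=$(basename "$1")
  case "$name" in
    S[0-9][0-9]*) action=start ;;
    K[0-9][0-9]*) action=stop ;;
    *) echo "not an rc link: $name" >&2; return 1 ;;
  esac
  order=$(printf '%s' "$name" | cut -c2-3)
  service=$(printf '%s' "$name" | cut -c4-)
  echo "$action order=$order service=$service"
}

# Usage:
#   describe_rc_link /etc/rc2.d/S20nvidia_kernel
```

Note that the script quoted above only creates the /dev nodes; it never loads a kernel module itself, so moving it to rc.local would not change which module gets loaded.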

samstar: i'm running ubuntu, not slack...hopefully that's not my prob

anyway i'll start messing around till i hear back
 
Old 04-19-2007, 10:42 AM   #10
rg.viza
Member
 
Registered: Aug 2006
Posts: 74

Rep: Reputation: 15
For x to start the video driver needs to be loaded. You specify the driver name in your xorg.conf but it needs to be loaded already for x to start.

If the driver's not loaded it will fail. This is why you should set your box to boot up to console and start x manually using startx when using vendor provided video drivers. Less hassle when you need to upgrade the kernel and your video driver dies.

If you boot up to shell you just need to re run the nvidia installation script, as opposed to waiting for things to fail so you can drop to a shell. It's just easier on your constitution.

There's more ways than one to skin this cat. This is just how I've been doing it since nvidia first put out a driver for linux. It's hard for old habits to die.

It's an issue that you deal with so infrequently it really doesn't matter as long as your box works.
 
Old 04-19-2007, 04:40 PM   #11
htedrom
Member
 
Registered: May 2005
Posts: 30

Original Poster
Rep: Reputation: 15
ahh I see...I didn't realize it had to be loaded beforehand.

yea, when i used to run deb I'd start X by hand, but since switching to ubuntu i've just left the default boot...anyway, i've done the changes, i'll let you know what happens.
 
Old 04-24-2007, 11:09 AM   #12
htedrom
Member
 
Registered: May 2005
Posts: 30

Original Poster
Rep: Reputation: 15
okay sorry it's been a while since i posted, been trying stuff out.

new news:
booting X last/booting X by hand doesn't make a difference

I noticed that the module that gets loaded during boot is actually smaller (nearly half the size) than the one that X loads after I remove the boot one. where can I change the module that loads at boot? for that matter, what do I even change it to? both modules have the same name. are they in fact two different modules, or is there something fishier afoot?
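
One way to check whether there really are two different files with the same name is to search the module tree and compare sizes. This is only a sketch: `list_module_copies` is a made-up name, and the default search path of /lib/modules/$(uname -r) is an assumption about where both copies live.

```shell
#!/bin/sh
# Sketch: list every copy of <module>.ko under a directory, one per line,
# as "size-in-bytes path", sorted by path.
list_module_copies() {
  mod=$1
  dir=${2:-/lib/modules/$(uname -r)}
  find "$dir" -type f -name "$mod.ko" | sort | while IFS= read -r f; do
    printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
  done
}

# Usage on a live system:
#   list_module_copies nvidia
```

More than one line of output would mean two nvidia.ko files are installed, and the sizes should show which one matches the module loaded at boot.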
 
Old 05-21-2007, 05:10 PM   #13
MikeOfAustin
Member
 
Registered: Apr 2007
Location: texas
Distribution: mandriva 2007.0 / edgy
Posts: 63

Rep: Reputation: 15
Did you solve this issue? I'm having this same problem now too. I have to manually rmmod nvidia, modprobe nvidia, then restart the GDM, before I can get into X.
 
Old 05-22-2007, 04:05 PM   #14
htedrom
Member
 
Registered: May 2005
Posts: 30

Original Poster
Rep: Reputation: 15
no, still haven't solved it. let me know if you do.
 
Old 05-22-2007, 04:21 PM   #15
Hern_28
Member
 
Registered: Mar 2007
Location: North Carolina
Distribution: Slackware 12.0, Gentoo, LFS, Debian, Kubuntu.
Posts: 906

Rep: Reputation: 38
Had similar problem.

It was after one of those automatic updates. Had to re-install the NVIDIA driver to get it to work. Lemme know if that fixes it on your systems too.
 
  

