Hard Locking F10 on New Server. Help!



Insomune
26th February 2009, 09:19 PM
Well, I guess it starts with me doing the stupid updates that are supposed to add stability to F10. Instead, my record 11 days of uptime is now just a distant memory.

So, here is the problem:
My server is hard locking after roughly 24 hours of uptime (the time between lockups varies), for no reason that I can figure out. During regular uptime everything runs smoothly: my virtual server for win32, gproftpd, VNC, and Samba serving 18 computers.

I'm stuck and don't know where to start. I cannot roll back any of the updates because there were about 300 in total. I was downloading and installing them 4-5 at a time to make sure they wouldn't crash my system; none did in the short term, it seems.

A brief description of the hardware:

8 cores, dual Xeons
16 GB RAM
Some old-ass 7000 series ATI card (32 MB, just for video, seeing as this is strictly an active file server for our internal network)
Quad-port Intel gigabit server-grade NIC
3ware 9690
8 TB RAID 6 setup
640 GB RAID 1 for the OS (both arrays off the 9690 RAID card)
3U Supermicro rackmount (16 bay)
Supermicro X7DC model motherboard

____________________________________________________________________________________

It hard locks on any screen too, not just when it's sitting on the desktop or when I've manually password locked the screen.

I was having some problems with the NIC before: frequent disconnections, and it never used all 4 Ethernet ports (only eth0 or eth1 or eth2 or eth3 at any one time, never more than one).
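One way to see which ports are actually carrying traffic, and whether any are logging errors or drops, is to read the kernel's per-interface counters from /proc/net/dev. The following is only a rough sketch: the function names `parse_net_dev` and `idle_interfaces` are made up for illustration, and it assumes the ports show up with an `eth` prefix as described above.

```python
import os


def parse_net_dev(text):
    """Parse /proc/net/dev text into {interface: {counter_name: value}}."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:  # 8 receive columns followed by 8 transmit columns
            continue
        values = [int(f) for f in fields[:16]]
        stats[name.strip()] = {
            "rx_bytes": values[0], "rx_errs": values[2], "rx_drop": values[3],
            "tx_bytes": values[8], "tx_errs": values[10], "tx_drop": values[11],
        }
    return stats


def idle_interfaces(stats, prefix="eth"):
    """Return interfaces with the given prefix that have moved no bytes at all."""
    return sorted(
        name for name, s in stats.items()
        if name.startswith(prefix) and s["rx_bytes"] == 0 and s["tx_bytes"] == 0
    )


if __name__ == "__main__":
    if os.path.exists("/proc/net/dev"):
        with open("/proc/net/dev") as f:
            current = parse_net_dev(f.read())
        for name in sorted(current):
            print(name, current[name])
        print("Idle ports:", idle_interfaces(current))
```

Running it a few times over a day and comparing the error and drop counters would show whether the disconnections line up with one particular port.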

I am also sometimes having boot problems where my RAID card does not initialize in time; the machine goes through the whole startup and then shows:
'Operating System not found'
A reboot usually fixes it, but sometimes it takes a few before it even detects F10. I've already modded the file for the SCSI wait command, which got me past the logical volume error when booting, but now I'm running into this. (This is not the main problem; I'm just trying to shed light so someone might be better able to help me.)


Basically that's it. I'd love to be able to run my server for at least a month of uptime. I just built a whole room for ventilation/cooling so it's nice and cool; I just need to get past this technical barrier.

Thanks for taking the time to read this,

Insomune, Vancouver.

Insomune
27th February 2009, 10:00 PM
bump


nomodeset doesn't stop it from hard locking.


I'm thinking it has something to do with the Xorg video driver or the Plasma workspace? I could be wrong...

Insomune
28th February 2009, 12:36 AM
Update:

Happening more frequently now. Hard locking on any screen. I've read almost every post about the Xorg drivers and other people's screens hard locking. I've reinstalled fresh, done the updates, and run with and without them. Still not seeing a solution. I'm sure it has something to do with either my ATI card or my quad-port NIC that keeps dropping.

Again, I'd be grateful for any insight, as rebooting every half day is ridiculous.
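Since a hard lock usually leaves nothing in the logs, it can help to at least pin down when each lockup happened. The sketch below is only an illustration and assumes something (a cron job, say) appends one `YYYY-MM-DD HH:MM:SS` timestamp per minute to a hypothetical `heartbeat.log`; the file name and the helpers `parse_heartbeats` and `find_gaps` are not from any existing tool.

```python
from datetime import datetime, timedelta


def parse_heartbeats(lines):
    """Turn 'YYYY-MM-DD HH:MM:SS' lines into datetime objects, skipping blank lines."""
    stamps = []
    for line in lines:
        line = line.strip()
        if line:
            stamps.append(datetime.strptime(line, "%Y-%m-%d %H:%M:%S"))
    return stamps


def find_gaps(timestamps, max_gap=timedelta(minutes=5)):
    """Return (last_seen, next_seen) pairs where the heartbeat went silent for longer than max_gap."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


if __name__ == "__main__":
    import os
    path = "heartbeat.log"  # hypothetical file name, adjust to wherever the heartbeat is written
    if os.path.exists(path):
        with open(path) as f:
            for last_seen, next_seen in find_gaps(parse_heartbeats(f)):
                print("Silent from %s until %s" % (last_seen, next_seen))
```

Each `last_seen` value is the last minute the box was known to be alive, which can then be compared against cron jobs, backups, or NIC drops around that time.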


On another note, I'm now getting this error a few times a day, and it only goes away after a reboot:



Error Type: <type 'exceptions.TypeError'>
Error Value: rpmdb open failed
File : /usr/share/PackageKit/helpers/yum/yumBackend.py, line 2314, in <module>
    main()
File : /usr/share/PackageKit/helpers/yum/yumBackend.py, line 2310, in main
    backend = PackageKitYumBackend('', lock=True)
File : /usr/share/PackageKit/helpers/yum/yumBackend.py, line 182, in __init__
    self.yumbase = PackageKitYumBase(self)
File : /usr/share/PackageKit/helpers/yum/yumBackend.py, line 2253, in __init__
    self.repos.confirm_func = self._repo_gpg_confirm
File : /usr/lib/python2.5/site-packages/yum/__init__.py, line 589, in <lambda>
    repos = property(fget=lambda self: self._getRepos(),
File : /usr/lib/python2.5/site-packages/yum/__init__.py, line 395, in _getRepos
    self._getConfig() # touch the config class first
File : /usr/lib/python2.5/site-packages/yum/__init__.py, line 192, in _getConfig
    self._conf = config.readMainConfig(startupconf)
File : /usr/lib/python2.5/site-packages/yum/config.py, line 774, in readMainConfig
    yumvars['releasever'] = _getsysver(startupconf.installroot, startupconf.distroverpkg)
File : /usr/lib/python2.5/site-packages/yum/config.py, line 844, in _getsysver
    idx = ts.dbMatch('provides', distroverpkg)
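An "rpmdb open failed" error like this is often caused by leftover Berkeley DB environment files (`__db.001` and so on) in /var/lib/rpm after yum or rpm was interrupted, which a hard lock could easily do. That is an assumption about this machine, not a confirmed diagnosis. A minimal sketch for listing those files, with `find_stale_rpmdb_files` being a made-up helper name:

```python
import glob
import os


def find_stale_rpmdb_files(rpm_dir="/var/lib/rpm"):
    """Return the sorted paths of Berkeley DB environment files (__db.*) in rpm_dir."""
    return sorted(glob.glob(os.path.join(rpm_dir, "__db.*")))


if __name__ == "__main__":
    leftovers = find_stale_rpmdb_files()
    if leftovers:
        print("Possibly stale rpmdb environment files:")
        for path in leftovers:
            print("  " + path)
    else:
        print("No __db.* files found.")
```

If files are listed and no yum, rpm, or PackageKit process is running, the usual repair is to remove them and run `rpm --rebuilddb` as root. The files also exist while rpm is legitimately in use, so their presence alone does not prove anything.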

Insomune
25th March 2009, 09:21 PM
Ran memtest for 24 hours, no defective RAM.

nada.

tgentry
25th March 2009, 09:52 PM
I know this may sound dumb since you are talking about running a server, but do you have any of the desktop effects on? If so, turn them completely off. This is of course referring to the KDE desktop.

Insomune
25th March 2009, 10:27 PM
Yes, it happens in both KDE and GNOME, desktop effects or not. I had the server stable at 11 days before I updated some of the stuff. It's either the Xorg drivers for my 32 MB ATI 7xxx series card or something wrong with the quad-port Intel NIC... firmware/driver?

I know tons of people are having hard locking problems with the F10 kernel; there are hundreds of posts with different solutions for different people. Unfortunately, I'm still looking for mine.

kaos77
27th March 2009, 01:36 PM
I'd suggest using something more suitable for running a server, like CentOS if you're comfortable with RPM-based distros, or Debian or BSD if they're an option. Fedora isn't exactly known for its stability. My view of Fedora is that it's more cutting edge: it carries the latest software releases so that Red Hat can test them and decide whether they're ready to go into the commercial offerings.

If you're looking for long uptimes, and seeing as how you've put this much money into your project already, go with CentOS. You're not going to get the level of prettiness that Fedora offers, but it's a server; you're not going to be looking at it every day. I've got a few CentOS servers that have been running well over a year without any reboots, and I know of other CentOS servers that have been up much longer.