F8 Kernel Segfault in init script

mjurgens
21st April 2008, 04:57 AM
I was running FC6 and yum upgraded to F8. The latest FC6 kernel version worked fine. Now that I am on F8 I can not run an F8 kernel since they always fail to boot. I am still running the latest FC6 kernel (everything else is F8). Every new F8 kernel I install has the same problem. I've extracted the initrd files and have tried a few things. If I do clean F8 installs it works every time (as you would expect) so I have some legacy problem coming from my FC6 upgrade. I have tried PAE and non-PAE kernels with the same result.

My FC6 upgrade was direct to F8 but I have tried FC6 -> F7 -> F8 as well with no luck.

The attached screenshot shows the error.
"init[1]: segfault at 00000006 eip 00000006 esp bf87a7e8 error 4"

I have now created a cut down F8 virtual machine under Vmware and copied the faulty initrd file to it. The fault is now reproducible within the VM.
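
For reference, the "error 4" in the segfault line quoted above is the x86 page-fault error code. A minimal sketch of decoding its three low bits follows; the function name decode_pf_error is made up for illustration, and higher bits are deliberately ignored.

```shell
#!/bin/sh
# Decode the low three bits of an x86 page-fault error code, as printed
# in kernel "segfault at ... error N" messages.
# decode_pf_error is a hypothetical helper name, not a real tool.
decode_pf_error() {
    code=$1
    # bit 2: fault happened in user mode (1) or kernel mode (0)
    if [ $(( code & 4 )) -ne 0 ]; then mode=user; else mode=kernel; fi
    # bit 1: access was a write (1) or a read (0)
    if [ $(( code & 2 )) -ne 0 ]; then access=write; else access=read; fi
    # bit 0: protection violation (1) or page not present (0)
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    echo "$mode-mode $access, $cause"
}

decode_pf_error 4   # prints: user-mode read, page not present
```

With the fault address and eip both at 00000006, this reads as init jumping to a bogus address, which is at least consistent with a mismatched binary or library rather than a bad line in the script.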

I did find that if I downgraded nash and mkinitrd to an FC6 version and then installed F8 kernels that they would boot. But I have no idea why mkinitrd and nash from F8 create an initrd for me that is not bootable.

I have posted the faulty initrd file at http://www.edcint.co.nz/initrd-2.6.24.4-64.fc8.img

I'd love some suggestions as to what I should be trying/looking for etc

Thanks

mjurgens
16th June 2008, 01:01 AM
A downgrade of mkinitrd and nash to the latest FC6 versions gets further in the boot process but gives the following error (I suspect that this would never work any way due to dependencies but it does appear to show that changing mkinitrd fixes the initial fault)

JEO
16th June 2008, 07:53 AM
It sounds like you have some package that mkinitrd or nash depend on that didn't get successfully updated.

The command to see the requirements of the mkinitrd package for example:
rpm -q --requires mkinitrd

and you will have a list of what it requires and you can check each of them for what version it is and update any that are still fedora 6.

For example, if it shows a requirement for libglib-2.0.so.0 you can issue the command:
rpm -q --whatprovides libglib-2.0.so.0

and the answer is the glib2-2.14.6-1.fc8 package

mjurgens
16th June 2008, 08:43 AM
"It sounds like you have some package that mkinitrd or nash depend on that didn't get successfully updated."


Didn't find anything. Everything is FC7 or F8. Wrote a script to do all the --whatprovides on the results of the "rpm -q --requires mkinitrd". Output below. Not sure about the "no package provides " results.

bash-3.2-20.fc8
bash-3.2-20.fc8
bash-3.2-20.fc8
module-init-tools-3.4-2.fc8
util-linux-ng-2.13.1-1.fc8
no package provides config
coreutils-6.9-13.fc8
cpio-2.9-5.fc8
udev-118-1.fc8
device-mapper-1.02.22-1.fc8
dmraid-1.0.0.rc14-4.fc8
e2fsprogs-1.40.4-1.fc8
filesystem-2.4.11-1.fc8
coreutils-6.9-13.fc8
findutils-4.2.31-2.fc8
glib2-2.14.6-1.fc8
grep-2.5.1-57.fc7
gzip-1.3.12-4.fc8
initscripts-8.60-1
nash-6.0.19-4.fc8
e2fsprogs-libs-1.40.4-1.fc8
glibc-2.7-2
device-mapper-libs-1.02.22-1.fc8
libdhcp-1.27-4.fc8
libdhcp4client-3.0.6-12.fc8
libdhcp6client-0.10-51.fc8
glib2-2.14.6-1.fc8
nash-6.0.19-4.fc8
libnl-1.0-0.15.pre8.git20071218.fc8
parted-1.8.6-10.fc8
popt-1.13-1.fc8
libselinux-2.0.43-1.fc8
libselinux-2.0.43-1.fc8
libsepol-2.0.15-1.fc8
libsepol-2.0.15-1.fc8
e2fsprogs-libs-1.40.4-1.fc8
lvm2-2.02.28-1.fc8
mdadm-2.6.2-5.fc8
mktemp-1.5-25.fc7
util-linux-ng-2.13.1-1.fc8
nash-6.0.19-4.fc8
no package provides rpmlib
no package provides rpmlib
no package provides rtld
tar-1.17-7.fc8
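
The lookup described above can be sketched as a small shell function. This is not the script that produced the listing; req_capabilities and resolve_requires are hypothetical names. It passes the whole first field of each requirement (for example config(mkinitrd) or rtld(GNU_HASH)) to --whatprovides, on the assumption that the "no package provides config" and "no package provides rtld" lines came from cutting the name at the parenthesis. rpmlib(...) entries are skipped because they describe features of rpm itself rather than anything a package provides.

```shell
#!/bin/sh
# Reduce `rpm -q --requires PKG` output (on stdin) to unique capability
# names. The full first field is kept, parentheses included.
req_capabilities() {
    awk '{ print $1 }' | grep -v '^rpmlib(' | LC_ALL=C sort -u
}

# Print the package that provides each requirement of package $1.
# Hypothetical wrapper; needs a working rpm database.
resolve_requires() {
    rpm -q --requires "$1" | req_capabilities | while read -r cap; do
        rpm -q --whatprovides "$cap"
    done | LC_ALL=C sort -u
}
```

Running "resolve_requires mkinitrd" and then "resolve_requires nash" would give one de-duplicated package list per tool.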

JEO
16th June 2008, 04:25 PM
What are the contents of /etc/fstab and /boot/grub/grub.conf and /etc/modprobe.conf? All of these files are used by mkinitrd.

JEO
16th June 2008, 04:32 PM
Also try the command "rpm -qva |grep fc6" to look for old packages and try "yum list extras" to list extra packages not in the repositories.

mjurgens
17th June 2008, 12:24 AM
"What are the contents of /etc/fstab and /boot/grub/grub.conf and /etc/modprobe.conf? All of these files are used by mkinitrd."

==> /etc/fstab <==
LABEL=/ / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
none /dev/pts devpts gid=5,mode=620 0 0
none /dev/shm tmpfs defaults 0 0
none /proc proc defaults 0 0
none /sys sysfs defaults 0 0
/dev/vg1/lv0 /export/shared ext3 suid,dev,exec 0 0
/dev/hdb4 /export/shared2 xfs suid,dev,exec 0 0
/dev/vg2/lv1 /export/shared1 xfs suid,dev,exec 0 0
LABEL=SWAP1 swap swap defaults 0 0
/export/shared/dl/iso/Fedora-8-dvd-i386/Fedora-8-i386-DVD.iso /var/www/html/ks/mounts/Fedora-8-i386-DVD iso9660 suid,dev,ro,mode=444,loop,exec 0 0
/export/shared/dl/iso/Fedora-9-i386-DVD/Fedora-9-i386-DVD.iso /var/www/html/ks/mounts/Fedora-9-i386-DVD iso9660 suid,dev,ro,mode=444,loop,exec 0 0
# gold:/export /export/shared1 nfs suid,dev,exec 0 0

==> /boot/grub/grub.conf <==
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/Volume00/LogVol00
# initrd /initrd-version.img
#boot=/dev/hda
default 3
timeout 5
splashimage=(hd0,0)/grub/splash.xpm.gz
title Fedora (2.6.25.4-10.fc8)
root (hd0,0)
kernel /vmlinuz-2.6.25.4-10.fc8 ro root=LABEL=/ rhgb quiet irqpoll
initrd /initrd-2.6.25.4-10.fc8.img
title Fedora (2.6.25.4-10.fc8PAE)
root (hd0,0)
kernel /vmlinuz-2.6.25.4-10.fc8PAE ro root=LABEL=/ rhgb quiet irqpoll
initrd /initrd-2.6.25.4-10.fc8PAE.img
title Fedora (2.6.22.14-72.fc6)
root (hd0,0)
kernel /vmlinuz-2.6.22.14-72.fc6 ro root=LABEL=/ rhgb quiet irqpoll
initrd /initrd-2.6.22.14-72.fc6.img
title Fedora (2.6.22.14-72.fc6PAE)
root (hd0,0)
kernel /vmlinuz-2.6.22.14-72.fc6PAE ro root=LABEL=/ rhgb quiet irqpoll
initrd /initrd-2.6.22.14-72.fc6PAE.img

==> /etc/modprobe.conf <==
install snd-intel8x0 /sbin/modprobe --ignore-install snd-intel8x0 && /usr/sbin/alsactl restore >/dev/null 2>&1 || :
alias usb-controller uhci-hcd
alias net-pf-10 off
alias usb-controller1 ehci-hcd
alias scsi_hostadapter ata_piix

# --- BEGIN: Generated by ALSACONF, do not edit. ---
# --- ALSACONF version 1.0.10rc3 ---
# --- END: Generated by ALSACONF, do not edit. ---
alias dev24174 r8169
alias dev14102 e100
alias dev3375 3c59x
options snd-hda-intel index=0
remove snd-hda-intel { /usr/sbin/alsactl store 0 >/dev/null 2>&1 || : ; }; /sbin/modprobe -r --ignore-remove snd-hda-intel
alias eth1 sky2
alias snd-card-0 snd-hda-intel
options snd-card-0 index=0
alias eth0 r8169

mjurgens
17th June 2008, 12:29 AM
"Also try the command 'rpm -qva |grep fc6' to look for old packages and try 'yum list extras' to list extra packages not in the repositories."

I'm not sure how useful 'yum list extras' is, since it gives stuff like:
vim-enhanced.i386 2:7.1.211-1.fc8 installed

Anyway, the "rpm -qva |grep fc6" shows lots of stuff. I'll work through it and try to remove what is not needed (lots of it looks like it is related to my mythtv installation):

a52dec-0.7.4-8.fc6.rf
avahi-0.6.16-1.fc6
dmidecode-2.7-1.26.1.fc6
enchant-1.3.0-1.fc6
faac-1.25-2.fc6.rf
faad2-2.5-7.fc6.at
fftw2-double-2.1.5-13.fc6.at
fping-2.4b2-7.fc6
jakarta-commons-cli-1.0-6jpp_10.fc6
kernel-2.6.22.14-72.fc6
kernel-devel-2.6.22.14-72.fc6
kernel-PAE-2.6.22.14-72.fc6
kernel-PAE-devel-2.6.22.14-72.fc6
keyutils-libs-1.2-2.fc6
keyutils-libs-devel-1.2-2.fc6
lame-3.97-1.fc6.rf
libavc1394-0.5.3-1.fc6
libavformat51-0.4.9-23_r8743.fc6.at
libdc1394_control13-1.1.0-6.fc6.at
libds-1.5.4-1.2.fc6.rf
libdvbpsi-0.1.5-2.fc6.rf
libdvdcss-1.2.9-3.fc6.at
libfame-0.9.1-12.fc6.rf
liblzo1-1.08-3.fc6.at
libmad-0.15.1b-4.fc6.rf
libquicktime0-0.9.10-18.fc6.at
libsmi-0.4.5-2.fc6
mjpegtools-1.9.1-14_cvs20061009.fc6.at
mplayer-fonts-1.1-3.fc6.rf
perl-Class-MethodMaker-2.08-6.fc6.at
perl-DateTime-Format-Builder-0.7807-4.fc6
perl-IO-stringy-2.110-8.fc6.at
perl-Lingua-Preferred-0.2.4-3.fc6.at
perl-Locale-Hebrew-1.04-2.fc6.at
perl-SOAP-Lite-0.69-5.fc6.at
perl-Term-ProgressBar-2.09-2.fc6
perl-Tk-TableMatrix-1.2-17.fc6.at
perl-Unicode-UTF8simple-1.06-2.fc6.at
perl-XML-Validator-Schema-1.08-1.2.fc6.rf
rng-utils-2.0-1.14.1.fc6
tux-3.2.18-9.fc6
x264-0.0.0-0.3.20061214.fc6.rf
xorg-x11-filesystem-7.1-2.fc6
xvidcore-1.1.2-1.fc6.rf

JEO
17th June 2008, 04:31 AM
In /etc/fstab you have a line /dev/hdb4 /export/shared2 xfs suid,dev,exec 0 0
All of the old hd(x) devices are now sd(x) under Fedora 8, so it's possible that could cause problems. Try commenting it out.

JEO
17th June 2008, 04:38 AM
Not all the fc6 packages are wrong for Fedora 8. You can ignore the following as I have them too.
enchant-1.3.0-1.fc6
xorg-x11-filesystem-7.1-2.fc6
rng-utils-2.0-1.14.1.fc6
libavc1394-0.5.3-1.fc6
dmidecode-2.7-1.26.1.fc6
keyutils-libs-1.2-2.fc6

JEO
17th June 2008, 04:47 AM
I would look at the following after excluding .at, .rf and kernels:
avahi-0.6.16-1.fc6
fping-2.4b2-7.fc6
jakarta-commons-cli-1.0-6jpp_10.fc6
libsmi-0.4.5-2.fc6
perl-DateTime-Format-Builder-0.7807-4.fc6
perl-Term-ProgressBar-2.09-2.fc6
tux-3.2.18-9.fc6
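
The first pass of that filtering (dropping the third-party .at and .rf builds and the kernels) can be sketched as a text filter over the package list. fc6_leftovers is a hypothetical name, and the filter only looks at package name suffixes and prefixes, so it says nothing about whether a leftover package is actually harmful.

```shell
#!/bin/sh
# Read a package list (one name-version-release per line) on stdin and
# keep only stock FC6 builds: names ending in ".fc6" (so ".fc6.at" and
# ".fc6.rf" repository builds drop out), minus the kernel packages.
fc6_leftovers() {
    grep '\.fc6$' | grep -v '^kernel'
}
```

A typical call would be "rpm -qa | fc6_leftovers"; the result still includes packages that are legitimately unchanged since FC6.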

JEO
17th June 2008, 04:54 AM
"I'm not sure how useful the 'yum list extras' since it gives stuff like:
vim-enhanced.i386 2:7.1.211-1.fc8 installed"

Try a yum update command first. Here is the output of my yum list extras:
Extra Packages
jre.i586 1.6.0_03-fcs installed
kernel.i686 2.6.24.7-92.fc8 installed

mjurgens
19th June 2008, 01:58 AM
"In /etc/fstab you have a line /dev/hdb4 /export/shared2 xfs suid,dev,exec 0 0
All of the old hd(x) devices are now sd(x) under Fedora 8, so it's possible that could cause problems. Try commenting it out."

The device names changes are part of Fedora 9. I boot off /dev/md0 which is made up of /dev/hda1 and /dev/hdb1

JEO
19th June 2008, 02:18 AM
Well, that sounds like it would be the problem. Try modifying the RAID to use sd(x)1 and sd(y)1. You can type blkid to see a list of the available Fedora 9 devices.

mjurgens
22nd June 2008, 10:58 AM
"Well that sounds like that would be the problem. Try modifying the raid to use sd(x)1 and sd(y)1 you can type blkid to see a list of the available fedora 9 devices."

I am on Fedora 8 - the /dev/hd devices don't go away until Fedora 9

mjurgens
22nd June 2008, 11:06 AM
"I would look at the following after excluding .at, .rf and kernels.
avahi-0.6.16-1.fc6
fping-2.4b2-7.fc6
jakarta-commons-cli-1.0-6jpp_10.fc6
libsmi-0.4.5-2.fc6
perl-DateTime-Format-Builder-0.7807-4.fc6
perl-Term-ProgressBar-2.09-2.fc6
tux-3.2.18-9.fc6"

All of these packages have way too many dependencies to remove. For example, avahi has about 105 dependencies. fping is used for my nagios installation. jakarta is used by azureus.

I removed libsmi.

I wouldn't mind betting I have some perl scripts that use perl-DateTime-Format-Builder.
perl-Term-ProgressBar is used by mythtv
I have already removed tux

JEO
22nd June 2008, 11:31 AM
"I am on Fedora 8 - the /dev/hd devices don't go away until Fedora 9"

You said you were booting the FC6 kernel. The kernel is the place where the device name changes have taken place. The old IDE drivers have been replaced by the new LIBATA drivers. There are no /dev/hda and /dev/hdb in Fedora 8 kernels. Boot from the F8 installation media and choose rescue mode, and do an ls /dev/hd* and an ls /dev/sd* and also a blkid command to see.

JEO
22nd June 2008, 11:41 AM
I think it is unlikely that any of those remaining fc6 packages is the cause of your mkinitrd segfault during booting problem.

stevea
22nd June 2008, 08:29 PM
The kernel starts the "init" program from the initial RAM file system (initrd), but in your case init fails with a segfault.

You can unpack the initrd like this (in an empty directory so it doesn't splatter):
gunzip </root/Download/initrd-2.6.24.4-64.fc8.img | cpio -iv
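
The unpacking command above can be wrapped so that it always extracts into a directory of its own. This is only a sketch: unpack_initrd is a hypothetical name, and the image is assumed to be a gzip-compressed cpio archive as in this thread. Run as a normal user, cpio will complain about the device nodes under ./dev, which need root to create.

```shell
#!/bin/sh
# Unpack a gzip-compressed cpio initrd image ($1) into directory $2,
# creating the directory if needed so nothing lands in the current one.
unpack_initrd() {
    img=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # -i extract, -d create leading directories, -m keep file mtimes
    gunzip -c "$img" | (cd "$dest" && cpio -idm --quiet)
}
```

For example: unpack_initrd /root/Download/initrd-2.6.24.4-64.fc8.img /tmp/initrd-bad
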
This is what your initrd looks like:


[stevea@nidula foo]# find .
.
./usr
./usr/lib
./usr/lib/libnl.so.1
./usr/lib/libdhcp6client-0.10.so.0
./usr/lib/libnash.so.6.0.19
./usr/lib/libnl.so.1.0-pre8
./usr/lib/libparted-1.8.so.6.0.0
./usr/lib/libdhcp4client-3.0.6.so.0
./usr/lib/libdhcp.so.1
./usr/lib/libparted-1.8.so.6
./usr/lib/libbdevid.so.6.0.19
./init
./sbin
./dev
./dev/tty1
./dev/zero
./dev/tty
./dev/ram1
./dev/tty7
./dev/ttyS0
./dev/ram
./dev/ttyS2
./dev/tty6
./dev/tty9
./dev/systty
./dev/tty3
./dev/mapper
./dev/tty8
./dev/ram0
./dev/tty2
./dev/console
./dev/tty4
./dev/tty5
./dev/tty10
./dev/tty0
./dev/rtc
./dev/tty11
./dev/ttyS3
./dev/ttyS1
./dev/null
./dev/ptmx
./dev/tty12
./lib
./lib/mbcache.ko
./lib/libuuid.so.1.2
./lib/raid1.ko
./lib/async_tx.ko
./lib/sd_mod.ko
./lib/uhci-hcd.ko
./lib/ld-linux.so.2
./lib/firmware
./lib/libcrypto.so.6
./lib/libuuid.so.1
./lib/raid456.ko
./lib/libdl-2.7.so
./lib/libc.so.6
./lib/ata_generic.ko
./lib/pata_jmicron.ko
./lib/pata_acpi.ko
./lib/libgcc_s.so.1
./lib/i686
./lib/i686/nosegneg
./lib/i686/nosegneg/libm-2.7.so
./lib/i686/nosegneg/libc-2.7.so
./lib/ext3.ko
./lib/libz.so.1
./lib/libata.ko
./lib/async_memcpy.ko
./lib/libglib-2.0.so.0
./lib/async_xor.ko
./lib/libblkid.so.1.0
./lib/libdl.so.2
./lib/libselinux.so.1
./lib/ehci-hcd.ko
./lib/libglib-2.0.so.0.1400.6
./lib/jbd.ko
./lib/libblkid.so.1
./lib/libresolv.so.2
./lib/libgcc_s-4.1.2-20070925.so.1
./lib/libpopt.so.0
./lib/libz.so.1.2.3
./lib/libresolv-2.7.so
./lib/libm.so.6
./lib/scsi_mod.ko
./lib/libdevmapper.so.1.02
./lib/xor.ko
./lib/libcrypto.so.0.9.8b
./lib/ld-2.7.so
./lib/libsepol.so.1
./lib/ohci-hcd.ko
./lib/ata_piix.ko
./lib/scsi_wait_scan.ko
./lib/libpopt.so.0.0.0
./etc
./etc/mdadm.conf
./etc/ld.so.cache
./etc/ld.so.conf
./etc/ld.so.conf.d
./etc/ld.so.conf.d/mysql-i386.conf
./etc/ld.so.conf.d/qt-i386.conf
./etc/ld.so.conf.d/opensync-32.conf
./sys
./sysroot
./proc
./bin
./bin/rmmod
./bin/modprobe
./bin/insmod
./bin/nash
./bin/mdadm


The ./init program is the one the kernel tries to start and it's a nash script:

#!/bin/nash

mount -t proc /proc /proc
setquiet
echo Mounting proc filesystem
echo Mounting sysfs filesystem
mount -t sysfs /sys /sys
...


The script dies before the first "echo" so either nash, mount, or echo caused the segfault in nash. ... most likely this nash or a support library is broken.


[stevea@nidula foo]# ldd ./bin/nash
linux-gate.so.1 => (0x00110000)
libnash.so.6.0.19 => /usr/lib/libnash.so.6.0.19 (0x006ee000)
libbdevid.so.6.0.19 => /usr/lib/libbdevid.so.6.0.19 (0x006ca000)
libdevmapper.so.1.02 => /lib/libdevmapper.so.1.02 (0x00981000)
libparted-1.8.so.6 => /usr/lib/libparted-1.8.so.6 (0x00c9f000)
libblkid.so.1 => /lib/libblkid.so.1 (0x006be000)
libselinux.so.1 => /lib/libselinux.so.1 (0x00a5e000)
libsepol.so.1 => /lib/libsepol.so.1 (0x009cd000)
libuuid.so.1 => /lib/libuuid.so.1 (0x00963000)
libpopt.so.0 => /lib/libpopt.so.0 (0x00168000)
libresolv.so.2 => /lib/libresolv.so.2 (0x00abb000)
libdl.so.2 => /lib/libdl.so.2 (0x006b7000)
libdhcp.so.1 => /usr/lib/libdhcp.so.1 (0x00173000)
libnl.so.1 => /usr/lib/libnl.so.1 (0x00111000)
libdhcp4client-3.0.6.so.0 => /usr/lib/libdhcp4client-3.0.6.so.0 (0x002f9000)
libdhcp6client-0.10.so.0 => /usr/lib/libdhcp6client-0.10.so.0 (0x00816000)
libglib-2.0.so.0 => /lib/libglib-2.0.so.0 (0x065c2000)
libm.so.6 => /lib/libm.so.6 (0x0068c000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00a86000)
libc.so.6 => /lib/libc.so.6 (0x00531000)
/lib/ld-linux.so.2 (0x00512000)
libcrypto.so.6 => /lib/libcrypto.so.6 (0x03afa000)
libz.so.1 => /lib/libz.so.1 (0x006d9000)


So ldd says all the shared libraries for nash are present on my F8 laptop. I unpacked a working initrd for F8 here and the nash binary is different. (I have nash-6.0.19-4.fc8).
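
Note that running ldd like this resolves the libraries against the host system, not against the copies inside the initrd. A rough sketch of checking the unpacked tree itself follows; missing_in_tree is a hypothetical name, and it only tests that each path exists under the tree, not that the file is the right version or intact.

```shell
#!/bin/sh
# Read `ldd` output on stdin and print every resolved library path that
# does not exist under the unpacked initrd root given as $1.
missing_in_tree() {
    root=$1
    # "name => /path (addr)" lines carry the path in field 3;
    # the dynamic loader line "/lib/ld-linux.so.2 (addr)" carries it in field 1.
    awk '$2 == "=>" && $3 ~ /^\// { print $3 } $1 ~ /^\// { print $1 }' |
    while read -r lib; do
        [ -e "$root$lib" ] || echo "$lib"
    done
}
```

Usage, from the parent of the unpacked directory foo: ldd foo/bin/nash | missing_in_tree foo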

The lists of libraries in our initrds are identical except ....
Your initrd has
usr/lib/libnl.so.1.0-pre8
while mine has
usr/lib/libnl.so.1.1

The "pre8" is odd and may be the problem.
It's a network utility library - so it seems unlikely, but hey - who knows! Nash may use something innocuous from libnl and hit a snag.

You should definitely install libnl-1.1-1.fc8 and try again.
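
A comparison like the one above, between the file lists of two unpacked initrds, can be sketched with find and comm. The names tree_list and initrd_list_diff are hypothetical, and the sketch compares file names only, not contents.

```shell
#!/bin/sh
# Print a sorted list of every path under directory $1.
tree_list() {
    (cd "$1" && find . | LC_ALL=C sort)
}

# Show the paths that exist in only one of two unpacked initrd trees.
initrd_list_diff() {
    a=$(mktemp)
    b=$(mktemp)
    tree_list "$1" > "$a"
    tree_list "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b" | sed 's/^/only in first: /'
    LC_ALL=C comm -13 "$a" "$b" | sed 's/^/only in second: /'
    rm -f "$a" "$b"
}
```

Called as "initrd_list_diff bad-initrd good-initrd", a difference such as libnl.so.1.0-pre8 versus libnl.so.1.1 would show up as one line of each kind.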

stevea
22nd June 2008, 08:57 PM
FWIW *all* of your libraries and all the binaries I've tested, like insmod and modprobe, differ from mine. I am not surprised that some do (there have been many updates since the original F8), but I am concerned that all do. I think you have some old FC6 libraries or tools in the mix, or an incompatible library change (as libnl *might* be).

[edit]

Since I haven't said so before - I think upgrading Fedora in place is a marginal policy. Yes, it normally works, but you are lugging around a trunk full of potential troubles with any legacy executables and binaries.

JEO
22nd June 2008, 09:56 PM
"FWIW *all* of your libraries and all the binaries like insmod an modprobe I've tested differ from mine."

Could prelink have something to do with this?

JEO
22nd June 2008, 10:16 PM
"Your initrd has
usr/lib/libnl.so.1.0-pre8
while mine has
usr/lib/libnl.so.1.1"

The libnl package that comes with the original Fedora 8 Live CD is libnl-1.0-0.10.pre5.4 and contains /usr/lib/libnl.so.1.0-pre5, which is older. So yours is fully updated, whereas his might be from a respin or something.

mjurgens
23rd June 2008, 12:50 AM
"I am on Fedora 8 - the /dev/hd devices don't go away until Fedora 9"

"You said you were booting the FC6 kernel. The kernel is the place where the device name changes have taken place. The old IDE drivers have been replaced by the new LIBATA drivers. There are no /dev/hda and /dev/hdb in Fedora 8 kernels. Boot from the F8 installation media and choose rescue mode, and do an ls /dev/hd* and an ls /dev/sd* and also a blkid command to see."

I see what you are saying.
I will try this as soon as I can (couple of days).

mjurgens
23rd June 2008, 12:53 AM
"Since I haven't said so before - I think upgrading Fedora is a marginal policy. Yes it normally works but you are lugging around a trunk full of potential troubles with any legacy executables and binaries."

I agree. It's just weighing up the work of
1. doing an in-place upgrade vs
2. reinstalling the OS and the hundreds of apps/scripts/configs I have built up over the years.

In the past I have chosen item 1 and resolved any issues.

I will do a yum upgrade and also check out some of the other items you have mentioned.

Thanks for the suggestions

mjurgens
23rd June 2008, 02:30 AM
"I see what you are saying.
I will try this as soon as I can (couple of days)."

I've reviewed the Fedora 8 release notes and I think you are referring to the fact that partitions must be labelled. See http://docs.fedoraproject.org/release-notes/f8/en_US/sn-Installer.html#sn-label-disk-partitions.

I have done this for the boot partitions already. I actually boot from a /dev/md device made up of /dev/hd devices.
/dev/md0: UUID="f5879d23-70c6-4f0c-aec0-a941213a22ef" SEC_TYPE="ext2" TYPE="ext3" LABEL="/boot"
/dev/hda1: UUID="f5879d23-70c6-4f0c-aec0-a941213a22ef" SEC_TYPE="ext2" TYPE="ext3" LABEL="/boot"
/dev/hda3: UUID="2b13e65c-b9d6-41e6-9923-b5e9e1700c2d" SEC_TYPE="ext2" TYPE="ext3" LABEL="/"
/dev/md2: LABEL="/" UUID="2b13e65c-b9d6-41e6-9923-b5e9e1700c2d" SEC_TYPE="ext2" TYPE="ext3"
/dev/md1: TYPE="swap" LABEL="SWAP1" UUID="546f4aeb-2411-4b4a-aa1c-5aaadacdc242"
/dev/hda2: TYPE="swap" LABEL="SWAP1" UUID="546f4aeb-2411-4b4a-aa1c-5aaadacdc242"
/dev/hdb3: UUID="2b13e65c-b9d6-41e6-9923-b5e9e1700c2d" SEC_TYPE="ext2" TYPE="ext3" LABEL="/"
/dev/hdb2: TYPE="swap" LABEL="SWAP1" UUID="546f4aeb-2411-4b4a-aa1c-5aaadacdc242"
/dev/hdb1: UUID="f5879d23-70c6-4f0c-aec0-a941213a22ef" SEC_TYPE="ext2" TYPE="ext3" LABEL="/boot"
/dev/hdb4: UUID="d263c21f-056b-4e4b-9c97-52b36c516229" TYPE="xfs"

The only one that is missing a label is the xfs partition on /dev/hdb4. I will remove this from the /etc/fstab file and try again.
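
Spotting the unlabelled entries in blkid output like the listing above can be sketched as a one-line filter. unlabelled_devices is a hypothetical name, and the sketch assumes the "device: KEY="value" ..." line format shown here.

```shell
#!/bin/sh
# Read blkid output on stdin and print the device names of the entries
# that carry no LABEL="..." field. The leading space in the pattern
# avoids matching other keys that merely end in LABEL.
unlabelled_devices() {
    grep -v ' LABEL="' | cut -d: -f1
}
```

On the listing above, "blkid | unlabelled_devices" would print only /dev/hdb4.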

I will also still boot from the rescue DVD and take a look at the output of the blkid command.

When I get to Fedora 9 I will lose /dev/hd devices. See http://docs.fedoraproject.org/release-notes/f9/en_US/sn-Installer.html#sn-IDE-devices

stevea
23rd June 2008, 03:12 AM
"I've reviewed the Fedora 8 release notes and I think you are referring to the fact that partitions must be labelled. ....
When I get to Fedora 9 I will lose /dev/hd devices. See ...."

This confuses me too. Libata generally replaced the ide driver at FC7 (not F9). The issue with using devnames vs labels vs UUIDs ONLY applies once the disk file system is mounted - it has nothing to do w/ the current problem.
==
FWIW I keep careful notes about installing a distro, and as a result it only takes ~45 minutes of face-time to install a new Fedora distro + extras.

JEO
23rd June 2008, 04:48 AM
I see in the unpacked initrd there is a file called /etc/mdadm.conf This is the file that I suspect contains the device names for the raid devices.

Edit:
Ok I looked, it just contains:

DEVICE partitions
MAILADDR root@localhost

mjurgens
23rd June 2008, 10:00 AM
"There are no /dev/hda and /dev/hdb in Fedora 8 kernels. Boot from the F8 installation media and choose rescue mode, and do an ls /dev/hd* and an ls /dev/sd* and also a blkid command to see."

I think this is sort of right. I booted into rescue mode and there were no /dev/hd* devices. However, I could still see /dev/md* devices and mount them (they are the IDE drives I boot off). blkid also returned no hd* devices (as expected) but did return the /dev/md* devices with the expected labels.

My take on this is that /dev/hd* might be a red herring.
I removed /dev/hdb4 from the /etc/fstab and rebuilt the f8 kernel with the same bad result.

I have yum upgraded now and reinstalled the latest f8 kernel.
This cleaned up the libnl issue raised earlier.
My new init (which still fails) is at http://www.edcint.co.nz/initrd-2.6.25.6-27.fc8PAE.img

I've also lodged a bug against mkinitrd for f8. See https://bugzilla.redhat.com/show_bug.cgi?id=451576

mjurgens
25th June 2008, 03:06 AM
How about this:
I wrote a script to compare all the files in the initrd against the matching files in the system it came from. It does a diff on the files to find exactly where each one came from and tries to find the rpm that owns it. Have a look at the attached report.

I am most concerned about the sections highlighted red and yellow. They appear to be related to the glibc package which is a major component.
The files in the green sections have no match but are not as much of a concern.
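
The attached report and the script behind it are not reproduced in this archive. As a rough sketch of the same idea, the function below compares each file in an unpacked initrd with the file at the same relative path on the system. compare_initrd_to_system is a hypothetical name, and the same-path assumption is a simplification: mkinitrd copies some files (kernel modules, for instance) from other locations, so those will show as NOMATCH here.

```shell
#!/bin/sh
# Compare every regular file under the unpacked initrd ($1) with the
# file at the same relative path under the system root ($2, default /).
# Prints SAME, DIFFERS or NOMATCH followed by the relative path.
compare_initrd_to_system() {
    initrd_root=$1
    system_root=${2:-/}
    (cd "$initrd_root" && find . -type f) | sed 's|^\./||' | LC_ALL=C sort |
    while read -r rel; do
        sys="${system_root%/}/$rel"
        if [ ! -e "$sys" ]; then
            echo "NOMATCH $rel"
        elif cmp -s "$initrd_root/$rel" "$sys"; then
            echo "SAME $rel"
        else
            echo "DIFFERS $rel"
        fi
    done
}
```

For the SAME and DIFFERS entries, "rpm -qf /path" on the system file then names the owning package, if there is one.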

ll /lib/i686/nosegneg
total 2104
drwxr-xr-x 2 root root 4096 2008-02-18 16:43 .
drwxr-xr-x 4 root root 4096 2008-02-18 18:06 ..
-rwxr-xr-x 1 root root 1701336 2007-10-18 18:48 libc-2.7.so
lrwxrwxrwx 1 root root 11 2008-02-18 16:43 libc.so.6 -> libc-2.7.so
-rwxr-xr-x 1 root root 208308 2007-10-18 18:48 libm-2.7.so
lrwxrwxrwx 1 root root 11 2008-02-18 16:43 libm.so.6 -> libm-2.7.so
-rwxr-xr-x 1 root root 129472 2007-10-18 18:48 libpthread-2.7.so
lrwxrwxrwx 1 root root 17 2008-02-18 16:43 libpthread.so.0 -> libpthread-2.7.so
-rwxr-xr-x 1 root root 46024 2007-10-18 18:48 librt-2.7.so
lrwxrwxrwx 1 root root 12 2008-02-18 16:43 librt.so.1 -> librt-2.7.so
-rwxr-xr-x 1 root root 33716 2007-10-18 18:48 libthread_db-1.0.so
lrwxrwxrwx 1 root root 19 2008-02-18 16:43 libthread_db.so.1 -> libthread_db-1.0.so

None of the files in /lib/i686/nosegneg are owned by any RPM. I think they might be related to Xen - which I do not use.
I think I will remove the files in /lib/i686/nosegneg directory and try again.
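
Before removing anything, the unowned files can be listed from the output of rpm -qf. This is a sketch only: unowned_files is a hypothetical name, and it relies on the "file ... is not owned by any package" wording that rpm prints in this thread.

```shell
#!/bin/sh
# Read `rpm -qf` output on stdin and print only the paths that rpm
# reports as not owned by any package.
unowned_files() {
    sed -n 's/^file \(.*\) is not owned by any package$/\1/p'
}
```

For example: rpm -qf /lib/i686/nosegneg/* 2>&1 | unowned_files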

JEO
25th June 2008, 04:50 AM
rpm -q --whatprovides /lib/i686/nosegneg/libc-2.7.so
glibc-2.7-2

What do you get when you issue
rpm -qv glibc
?

mjurgens
25th June 2008, 07:07 AM
"rpm -q --whatprovides /lib/i686/nosegneg/libc-2.7.so
glibc-2.7-2

What do you get when you issue
rpm -qv glibc
?"

I get:
rpm -q --whatprovides /lib/i686/nosegneg/libc-2.7.so
file /lib/i686/nosegneg/libc-2.7.so is not owned by any package (Same as my attached report gave)

rpm -qv glibc
glibc-2.7-2

JEO
25th June 2008, 10:34 AM
What I would do is manually download the package from a mirror and force it to update itself. It might just be that your rpm database is corrupted but why take chances when you have that segfault problem?

rpm -Uvh --force (filename.rpm)

mjurgens
25th June 2008, 11:09 AM
Major progress. I renamed /lib/i686/nosegneg to /lib/i686/nosegneg.old. Reinstalled the latest f8 kernel and rebooted. It actually started the init script this time and then I got to:
"mdadm: /dev/md2 not identified in config file" - which just happens to be my root partition. It then failed to switchroot.
I rebooted on the fc6 kernel and ran
mdadm --detail --scan >> /etc/mdadm.conf
so that it now looks like:
DEVICE partitions
MAILADDR root@localhost
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=d6ed8775:c6c35fc5:e63d5aca:cdb18416
ARRAY /dev/md3 level=raid5 num-devices=4 UUID=7202561b:f2bd943c:30754545:235bb205
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=92ed2b0e:fcdd4ea1:62a44e43:d2576a42
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=624234e1:e79aad4d:fe6cc5ef:1ac02e84

And now it boots into Fedora 8 kernel.
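
A quick way to check for this condition before rebuilding an initrd is to test whether the root array has an ARRAY line in mdadm.conf. The sketch below assumes the one-line "ARRAY /dev/mdN ..." format shown above; array_declared is a hypothetical name.

```shell
#!/bin/sh
# Succeed if mdadm.conf ($2, default /etc/mdadm.conf) contains an ARRAY
# line for the md device given as $1, for example /dev/md2.
array_declared() {
    dev=$1
    conf=${2:-/etc/mdadm.conf}
    grep -Eq "^[[:space:]]*ARRAY[[:space:]]+$dev([[:space:]]|\$)" "$conf"
}
```

For example: array_declared /dev/md2 || echo "root array missing from mdadm.conf"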
I am now left with 2 more minor problems:
1. "Unable to access resume device (LABEL=SWAP1)" - but it still mounts /dev/md1 as swap
2. I would like to be able to simplify my mdadm.conf file so that I do not explicitly have to specify the array details. I want it simple again - like DEVICE partitions. I will have to have a play around with it.

Thanks to JEO and Stevea for all their ideas