booting from /dev/md0 sw raid1 device panics



komanek
28th March 2006, 06:05 PM
Hi all,

Please, do you think this may be a bug, or is it my fault?

Thanks a lot,

David Komanek




Description of problem:

booting from /dev/md0 sw raid1 device panics with the following messages:
mount: could not find filesystem '/dev/root'
setuproot: moving /dev failed: No such file or directory
setuproot: error mounting /proc failed: No such file or directory
setuproot: error mounting /sys failed: No such file or directory
switchroot: mount failed: No such file or directory
Kernel panic: not syncing: Attempted to kill init!


Version-Release number of selected component (if applicable):

kernel 2.6.15-1.2054_FC5
mkinitrd: version 5.0.32
mdadm - v2.3.1 - 6 February 2006
grub (GNU GRUB 0.97)

How reproducible:

migrating boot+root partition from single-disk device to the software raid1

Steps to Reproduce:
1. booting sata-disk:
   Device Boot  Start    End      Blocks  Id  System
/dev/sda1   *       1  36970  296961493+  83  Linux  ==> /
/dev/sda2       36971  38390   11406150   83  Linux  ==> /test
/dev/sda3       38391  38913    4200997+  82  Linux swap / Solaris

2. second (identical) disk prepared for raid1
   Device Boot  Start    End      Blocks  Id  System
/dev/sdb1   *       1  36970  296961493+  fd  Linux raid autodetect
/dev/sdb2       36971  38390   11406150   83  Linux
/dev/sdb3       38391  38913    4200997+  82  Linux swap / Solaris

3. creating array in degraded mode:
mdadm --create /dev/md0 -l 1 -n 2 missing /dev/sdb1
mkfs -t ext3 /dev/md0 # the same results I have for ext2
mount /dev/md0 /foo
tar c / | tar xvC /foo
adjusting /etc/fstab, /etc/grub.conf, running grub-install to be sure
optional (has no effect for me): using mkinitrd --preload=raid1 ....

4. reboot with root=/dev/md0

Actual results:

won't boot, produces following messages:

mount: could not find filesystem '/dev/root'
setuproot: moving /dev failed: No such file or directory
setuproot: error mounting /proc failed: No such file or directory
setuproot: error mounting /sys failed: No such file or directory
switchroot: mount failed: No such file or directory
Kernel panic: not syncing: Attempted to kill init!


Expected results:

working system

Additional info:

this procedure of migration to raid1 on boot device worked in FC4 without problems
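
One thing worth double-checking in step 3 is that the fstab inside the copied tree really names /dev/md0 as the root device. A minimal sketch, assuming the array is mounted at /foo as above; the helper name fstab_root_device is made up for illustration and is not part of any Fedora tool:

```shell
#!/bin/sh
# fstab_root_device is a hypothetical helper, not a standard command.
# It prints the device that the given fstab mounts on "/", skipping comment lines.
fstab_root_device() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1 }' "$1"
}

# Example (assumes the new array is mounted at /foo):
#   fstab_root_device /foo/etc/fstab
# For this migration the answer should be /dev/md0, not /dev/sda1 or a LABEL= entry.
```

If it prints anything other than /dev/md0, an initrd built against that fstab may still look for the old root device.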

lvanek
28th March 2006, 06:49 PM
I'm probably out of my depth here, but on my FC5 system I have a separate boot partition (/boot), which I set up as /dev/md0 across the two drives in question at install time. I also set up /dev/md1 for swap and /dev/md2 (mounted on /) for the remaining space.

Doesn't your sda1 also need to be marked as "Linux raid autodetect" if it's part of the sda1, sdb1 mirror?

komanek
28th March 2006, 07:03 PM
[QUOTE]I'm probably out of my depth here, but on my FC5 system I have a separate boot partition (/boot), which I set up as /dev/md0 across the two drives in question at install time. I also set up /dev/md1 for swap and /dev/md2 (mounted on /) for the remaining space.

Doesn't your sda1 also need to be marked as "Linux raid autodetect" if it's part of the sda1, sdb1 mirror?[/QUOTE]


Yeah, in the final step, yes. I had an FC4 installation for about half a year. Last week I upgraded to FC5, and now I have bought a second disk to get some redundancy, so I need to migrate from /dev/sda1 to /dev/md0. That is why the array is in a degraded state for now. I did the same some time ago on another computer with FC4 and everything was fine.

I have /boot as a subdirectory of the root partition because this is a workstation, so there is no need for lots of separate partitions for /boot, /var and so on. I don't think this should cause problems: grub knows where to boot from on /dev/sda, so it should manage the same on /dev/md0. Or maybe not?

I get the same results whether I set the root device on the kernel line to /dev/sdb1 or /dev/md0, and it also seems to be independent of the root (hd0,0) vs. root (hd1,0) line. Here is my grub.conf file:

default=0
timeout=5
splashimage=(hd0,0)/boot/grub/splash.xpm.gz
hiddenmenu
title Fedora Core (2.6.15-1.2054_FC5smp)
root (hd0,0)
kernel /boot/vmlinuz-2.6.15-1.2054_FC5smp ro root=/dev/sda1 rhgb quiet
initrd /boot/initrd-2.6.15-1.2054_FC5smp.img
title Fedora Core-up (2.6.15-1.2054_FC5)
root (hd0,0)
kernel /boot/vmlinuz-2.6.15-1.2054_FC5 ro root=/dev/sda1 rhgb quiet
initrd /boot/initrd-2.6.15-1.2054_FC5.img




And here is how I changed it, none of which works:
title Fedora Core (2.6.15-1.2054_FC5smp)
root (hd0,0)
kernel /boot/vmlinuz-2.6.15-1.2054_FC5smp ro root=/dev/md0 rhgb quiet
initrd /boot/initrd-2.6.15-1.2054_FC5smp.img

OR

title Fedora Core (2.6.15-1.2054_FC5smp)
root (hd1,0)
kernel /boot/vmlinuz-2.6.15-1.2054_FC5smp ro root=/dev/md0 rhgb quiet
initrd /boot/initrd-2.6.15-1.2054_FC5smp.img

OR

title Fedora Core (2.6.15-1.2054_FC5smp)
root (hd1,0)
kernel /boot/vmlinuz-2.6.15-1.2054_FC5smp ro root=/dev/sdb1 rhgb quiet
initrd /boot/initrd-2.6.15-1.2054_FC5smp.img


Please note that if I boot from /dev/sda1, I can mount /dev/md0 at a given mountpoint with no problems.

David
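
With several variants of grub.conf in play, it can help to list which root= argument each kernel line actually passes. A rough sketch only; grub_root_args is a made-up helper name, and it assumes the GRUB legacy grub.conf layout shown above:

```shell
#!/bin/sh
# grub_root_args is a hypothetical helper, not part of GRUB.
# For every "kernel" line in a GRUB legacy config it prints the kernel image
# followed by the root= argument given on that line. Commented lines are skipped
# because their first field is "#", not "kernel".
grub_root_args() {
    awk '$1 == "kernel" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^root=/) print $2, $i
    }' "$1"
}

# Example:
#   grub_root_args /etc/grub.conf
```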

bachtiar
30th March 2006, 07:56 AM
I am experiencing exactly the same problem in FC5 (2.6.15-1.2054_FC5).

If I boot into FC4 (2.6.14-1.1656_FC4), the kernel detects the array and makes md0 available. With FC5, it seems to skip the autodetect stage somehow and panics when mounting the root filesystem (root=/dev/md0).

I have created md0 as raid-1 from two partitions, both of them are linux raid autodetect. Not quite sure, but I think I don't see any "md: Autodetecting RAID arrays" printed.

However on a different machine, the same kernel version detects an array normally (in that case, md0 is not the root device, so I guess that doesn't count).

komanek
30th March 2006, 09:29 AM
[QUOTE]However on a different machine, the same kernel version detects an array normally (in that case, md0 is not the root device, so I guess that doesn't count).[/QUOTE]

Yes, I also think the problem is only with the root device (or possibly the boot device? I have /boot on the root partition), or maybe with grub itself. Or could it be a SATA-specific problem? Mounting /dev/md0 manually or via fstab at another mountpoint works fine for me, too.

I suspect the kernel is not finding any initrd at all, since I made a new one with the raid1 module and it made no difference.

bachtiar
30th March 2006, 09:32 AM
FYI - problem solved after a mkinitrd.

The old initrd didn't include the kernel modules needed for running RAID. This was a side effect of the upgrade: at the time the kernel was installed, mkinitrd didn't know it should include the raid modules.

komanek
30th March 2006, 09:35 AM
[QUOTE]FYI - problem solved after a mkinitrd.

The old initrd didn't include the kernel modules needed for running RAID. This was a side effect of the upgrade: at the time the kernel was installed, mkinitrd didn't know it should include the raid modules.[/QUOTE]

Hm, interesting that it is not working for me, but I am at least glad you solved it. Would you mind posting your grub.conf and menu.lst? I am probably unable to see my own stupidity archived in these files on my machine :-) Thanks.

bachtiar
31st March 2006, 03:10 PM
There's nothing special about my grub.conf:

default=0
timeout=3
hiddenmenu
title Fedora Core 5 (2.6.15-1.2054_FC5)
root (hd0,0)
kernel /boot/vmlinuz-2.6.15-1.2054_FC5 ro root=/dev/md0
initrd /boot/initrd-2.6.15-1.2054_FC5.img

Are you seeing "md: autodetecting raid arrays" messages at boot? If not, reading Documentation/md.txt from kernel source may help you, as well as "cat initrd.img | gzip -dc | cpio -idvm" (look for raid1.ko in lib/).
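
If you only want to know whether raid1.ko is in there, the image can be listed instead of unpacked. A minimal sketch, assuming a gzip-compressed cpio initrd as in the pipeline above; initrd_listing and listing_has_raid1 are made-up helper names:

```shell
#!/bin/sh
# Both helper names are hypothetical, chosen for this example only.

# List the files inside a gzip-compressed cpio initrd without extracting them.
initrd_listing() {
    gzip -dc "$1" | cpio -it 2>/dev/null
}

# Read a file listing on stdin; exit status 0 if it contains raid1.ko.
listing_has_raid1() {
    grep -Eq '(^|/)raid1\.ko$'
}

# Example:
#   initrd_listing /boot/initrd-2.6.15-1.2054_FC5.img | listing_has_raid1 \
#       && echo "raid1.ko present" || echo "raid1.ko missing"
```

Splitting the listing from the check keeps the check usable on a saved listing, too.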

lvanek
1st April 2006, 12:50 AM
As another data point, here is mine:

# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/md2
# initrd /initrd-version.img
#boot=/dev/md0
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
#hiddenmenu
title Fedora Core (2.6.16-1.2080_FC5)
root (hd0,0)
kernel /vmlinuz-2.6.16-1.2080_FC5 ro root=/dev/md2
initrd /initrd-2.6.16-1.2080_FC5.img
title Fedora Core (2.6.15-1.2054_FC5)
root (hd0,0)
kernel /vmlinuz-2.6.15-1.2054_FC5 ro root=/dev/md2
initrd /initrd-2.6.15-1.2054_FC5.img

komanek
3rd April 2006, 02:26 PM
Thank you for your configuration files and all the other comments. The problem is now solved, here is how:

- upgraded from 2.6.15 to 2.16.16 kernel (the new one is also FC5 official)
- mounted /dev/md0 as /foo and ran mkinitrd with --fstab=/foo/etc/fstab (where /dev/md0 is listed as the root+boot partition)
- followed the standard procedure up to the partition sync
- after the sync, ran mkinitrd again
- ran grub setup on both disks
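
For anyone following along, the steps above might look roughly like the plan below. This is only a sketch of one reading of them, not the exact commands that were run: the helper print_raid_boot_plan is made up, it only prints commands and executes nothing, and the initrd path, the /dev/sda and /dev/sdb device names and the grub shell lines are all assumptions to adapt.

```shell
#!/bin/sh
# print_raid_boot_plan is a hypothetical helper. It PRINTS a plan and runs nothing.
# $1 = kernel version, $2 = initrd image path (both supplied by the caller).
# Assumed: new array mounted at /foo, old root on /dev/sda1, array is /dev/md0.
print_raid_boot_plan() {
    kver=$1
    img=$2
    cat <<EOF
# 1. build the initrd against the fstab that names /dev/md0 as root
mkinitrd -f --fstab=/foo/etc/fstab $img $kver
# 2. once booted from the degraded array: set /dev/sda1 to type fd, add it, watch the sync
mdadm /dev/md0 --add /dev/sda1
cat /proc/mdstat
# 3. after the sync has finished, build the initrd again
mkinitrd -f $img $kver
# 4. in the grub shell, once per disk (shown for /dev/sda, repeat with /dev/sdb)
#    device (hd0) /dev/sda
#    root (hd0,0)
#    setup (hd0)
EOF
}

# Example:
#   print_raid_boot_plan 2.6.16-1.2080_FC5 /boot/initrd-2.6.16-1.2080_FC5.img
```

Printing the plan instead of running it is deliberate, since every step here needs root and some of them rewrite boot sectors or array membership.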

All the problems seem to have vanished. The only thing I still do not know is how to boot from the rescue CD, should I ever need it, without destroying the array again.

So, thank you again.

komanek
3rd April 2006, 02:26 PM
2.6.16, not 2.16.16, of course, sorry for the typo

lvanek
3rd April 2006, 06:08 PM
I have had the need to "rescue" a couple of times after my RAID 1 arrays were set up. Nothing about FC5 caused it, just me fooling around.

I put CD #1 in and typed "linux rescue" at the install boot prompt, then chrooted as instructed. Make whatever edits you like, then exit. Pull the CD and reboot.

I did not have any issues with broken arrays doing this. Our setups are not exactly the same, but hopefully you won't either. Check with cat /proc/mdstat. If any pieces are missing, you can hot-add them back in with mdadm --add /dev/md*** /dev/****
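
To spot missing pieces quickly, something like the following can pull the degraded arrays out of /proc/mdstat. A minimal sketch: degraded_arrays is a made-up helper name, and it assumes the usual mdstat status field such as [UU] or [_U], where an underscore marks a missing member.

```shell
#!/bin/sh
# degraded_arrays is a hypothetical helper, not part of mdadm.
# It prints the name of every md array whose status field (e.g. [_U]) contains "_".
# Reads /proc/mdstat by default; pass a file name to check saved output instead.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/                   { name = $1 }
        name != "" && /\[[U_]*_[U_]*\]/ { print name }
    ' "${1:-/proc/mdstat}"
}

# Example:
#   degraded_arrays
```

Each array it prints is a candidate for the mdadm --add command above.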