[SOLVED] Raid 1 /home not booting



multipitch
25th July 2012, 09:16 PM
I'm not sure whether this is a software issue or a hardware issue.

My computer is set up as follows:
128 GB SSD (/dev/sda) for everything but /home
2x 2TB HDD (/dev/sdb and /dev/sdc) with Raid 1 used for /home

I turned it on a day or two ago and it won't boot properly.
I don't recall making any major changes on the previous boot.

The boot hangs with the following (transcribed from a photo of the screen!):

[ TIME ] Timed out waiting for device dev-disk-by\x2duuid-e4e34dac\x2d1eec\x2d42c2\x2dba9b\x2dae06357236c7.device...
[DEVICE] Dependency failed for /home
...and a load more [DEVICE] lines

Below is the output of a few relevant commands:

cat /proc/mdstat

Personalities : [raid1]
md0 : active raid1 sdb1[0] sdc1[1]
1952766840 blocks super 1.2 [2/2] [UU]
bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>

fdisk -l

Disk /dev/sda: 128.0 GB, 128035676160 bytes
255 heads, 63 sectors/track, 15566 cylinders, total 250069680 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e1fd1

Device Boot Start End Blocks Id System
/dev/sda1 * 2048 1026047 512000 83 Linux
/dev/sda2 1026048 33794047 16384000 82 Linux swap / Solaris
/dev/sda3 33794048 250068991 108137472 83 Linux

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdb1 * 1 3907029167 1953514583+ ee GPT
Partition 1 does not start on physical sector boundary.

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdc1 * 1 3907029167 1953514583+ ee GPT
Partition 1 does not start on physical sector boundary.

Disk /dev/sdd: 3995 MB, 3995074560 bytes
128 heads, 15 sectors/track, 4064 cylinders, total 7802880 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc3072e18

Device Boot Start End Blocks Id System
/dev/sdd1 * 5304 7802879 3898788 c W95 FAT32 (LBA)

Disk /dev/md0: 1999.6 GB, 1999633244160 bytes
2 heads, 4 sectors/track, 488191710 cylinders, total 3905533680 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

cat /etc/mdadm.conf

# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=302d59a4:67761eb8:0a2923b7:fa4a0763

mdadm --detail /dev/md0

/dev/md0:
Version : 1.2
Creation Time : Fri Jun 22 06:53:22 2012
Raid Level : raid1
Array Size : 1952766840 (1862.30 GiB 1999.63 GB)
Used Dev Size : 1952766840 (1862.30 GiB 1999.63 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Thu Jul 26 18:40:12 2012
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Name : leviathan.multipitch:0 (local to host leviathan.multipitch)
UUID : 302d59a4:67761eb8:0a2923b7:fa4a0763
Events : 8080

Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1

I have absolutely no idea what to do, and I'm reluctant to try anything in case I lose my data (most of it is backed up, but it would be a major pain to reconstitute it all).

---------- Post added at 08:16 PM ---------- Previous post was at 06:00 PM ----------

Also, here's the output of:
cat /etc/fstab


#
# /etc/fstab
# Created by anaconda on Fri Jun 22 06:53:52 2012
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=815f73e5-35e4-4269-ba7e-f42a5972ef47 / ext4 defaults,noatime,discard 1 1
UUID=c7ad2129-0b27-419b-9283-573bff5b2b96 /boot ext3 defaults,noatime,discard 1 2
UUID=e4e34dac-1eec-42c2-ba9b-ae06357236c7 /home ext4 defaults 1 2
UUID=851089ba-4064-412c-94fc-289102c5aa73 swap swap defaults 0 0
none /tmp tmpfs size=10% 0 0


I'm using an Asus P8Z77-V Pro motherboard with the SSD in one of the Intel SATA 6G ports and the two RAIDed HDDs in the two AsMedia SATA 6G ports. I mention this because I had trouble getting these ports to play nice when installing the system.
As a check, I plugged the two HDDs into two of the Intel SATA 3G ports, in case the problem was with the AsMedia controller, but it seemed to make no difference.

Any help would be appreciated!

rtguille
26th July 2012, 12:55 AM
Can you comment out the /home entry in fstab?
Reboot, and try to mount it manually.

If there are no problems, then unmount it and run fsck on the device.
Uncomment the fstab entry and test it without rebooting.

If the fstab entry works (it does not seem to be bad), then reboot and boot normally.
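The steps above could be sketched roughly as follows. This is only an outline, not a tested recovery procedure: the helper name comment_out_mount and the backup path /etc/fstab.bak are made up for illustration, and /dev/md0 and /home are taken from the first post.

```shell
#!/bin/sh
# Print an fstab with the entry for one mount point commented out.
# Usage: comment_out_mount FSTAB_FILE MOUNT_POINT
# (hypothetical helper; it only writes to stdout and never edits the file)
comment_out_mount() {
    awk -v mp="$2" '
        # Active (non-comment) line whose second field is the mount point
        $1 !~ /^#/ && $2 == mp { print "#" $0; next }
        { print }
    ' "$1"
}

# Assumed workflow, run as root:
#   cp /etc/fstab /etc/fstab.bak
#   comment_out_mount /etc/fstab.bak /home > /etc/fstab
#   reboot
#   mount /dev/md0 /home     # manual mount test
#   umount /home
#   fsck /dev/md0            # check the filesystem while it is unmounted
```

Working from a backup copy means the original fstab can be restored by copying /etc/fstab.bak back if anything looks wrong.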

Just in case, use blkid to display all UUIDs and make sure the UUID in fstab is the one reported for the RAID device (/dev/md0).
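A minimal sketch of that UUID check, assuming the array is /dev/md0 as in the first post. The function names unit_to_uuid and fstab_uuid_for are made up for illustration. Note that the UUID in fstab is the filesystem UUID, which is a different identifier from the array UUID shown by mdadm --detail.

```shell
#!/bin/sh
# Recover the UUID from a systemd device unit name such as
#   dev-disk-by\x2duuid-e4e34dac\x2d...\x2dae06357236c7.device
# (\x2d is how systemd writes a literal "-" inside a unit name)
unit_to_uuid() {
    printf '%s\n' "$1" |
        sed -e 's/^dev-disk-by\\x2duuid-//' -e 's/\.device$//' -e 's/\\x2d/-/g'
}

# Print the UUID that an fstab file uses for a given mount point.
# Usage: fstab_uuid_for FSTAB_FILE MOUNT_POINT
fstab_uuid_for() {
    awk -v mp="$2" '$1 ~ /^UUID=/ && $2 == mp { sub(/^UUID=/, "", $1); print $1 }' "$1"
}

# Assumed usage, as root:
#   want=$(fstab_uuid_for /etc/fstab /home)
#   have=$(blkid -s UUID -o value /dev/md0)
#   [ "$want" = "$have" ] && echo "fstab matches /dev/md0" || echo "UUID mismatch"
```

If the two values match, the device named in the boot timeout message is the filesystem on the array, and the fstab entry itself is probably not the problem.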
If the issue happens again:

* post the /proc/cmdline contents.
* say whether you reached single-user/emergency mode or had to boot from alternate media (I assume you reached emergency mode)
* are there pending updates? Did you update the system between the last time it worked and the first time it showed the issue?
* check the /var/log/messages files; if there is a problem, some useful information might have been logged.

george_toolan
26th July 2012, 11:33 AM
Can you read the smart values for the drives?


smartctl -a /dev/sdb

smartctl -a /dev/sdc

Are both drives recognized by the host adapter when you start your computer?

And don't use the AsMedia SATA ports unless you have to ;-)
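For a quick first look before reading the full -a output, something like the sketch below might help. It assumes the usual ATA wording of the smartctl -H health line, and smart_passed is a made-up helper name. A PASSED result does not rule out a failing drive, so the attribute table from smartctl -a is still worth reading.

```shell
#!/bin/sh
# Read `smartctl -H` output on stdin; succeed only if the drive's
# overall health self-assessment is reported as PASSED.
# (hypothetical helper; the matched text assumes ATA-style output)
smart_passed() {
    grep -q 'self-assessment test result: PASSED'
}

# Assumed usage, as root, for the two RAID members in this thread:
#   for d in /dev/sdb /dev/sdc; do
#       smartctl -H "$d" | smart_passed || echo "$d: look at smartctl -a $d"
#   done
```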

multipitch
27th July 2012, 09:04 AM
Thanks for the help, rtguille.

I commented out the relevant line from /etc/fstab
I then rebooted and ran fsck /dev/md0

It took about 2-3 hours, with lots of numbers rapidly flashing across the screen.
It stopped halfway, prompting me regarding fixing errors (I just held down the y key - there were a lot of prompts!), then it continued and I had to do the same at the end of the process.
I presume this means it found errors on both hard drives.

I uncommented the relevant line in /etc/fstab and rebooted and it booted fine.
It was quite late by that stage so I haven't had a chance to see if all my files are as they should be. I'll have a look when I get home this evening.

I'm still not sure what might have caused the problem.
I don't recall doing any major updates or experiencing a crash on the boot previous to the issue.
As george_toolan suggested, it might be the AsMedia SATA chip, so I'm going to put the drives on the slower Intel SATA ports. Spinning disks won't max out SATA 3G anyway.