Fedora Linux Support Community & Resources Center

  #1  
10th April 2009, 04:36 PM
savage
RAID 5 - Need Help - I'm an idiot!

Ok, I have a RAID-5 array consisting of 5x500GB SATA disks... sda-sde.

I've been shifting the hard disks around to make space between them (for cooling), and being an absolute moron, forgot to check that all disks were wired in correctly, and booted the array with a disk missing power - oops!

So sdc is now "removed". Then, to add to the fun, sdd died, though not completely: it throws I/O errors but is accessible again after a power cycle. I've been fitting the disks into self-cooled caddies, and I suspect they could have caused the I/O errors, as they're not simple pass-throughs, so I'm rebuilding now without the caddies - 53% so far.

I've tried re-assembling the array, with --force, adding sdc back into the mix, but it gets to about 58% rebuilding, then sdd gets upset and it all goes belly up.

I can access the data on the array while it's rebuilding, and all essential data is backed up, but I'd rather not lose the other 1TB of data.

Anyone got any suggestions on this? other than taking myself outside and shooting myself?

I've been thinking about trying to do a dd if=/dev/sdd of=/dev/newdisk, and then trying to rebuild the array with a new disk for sdd.
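
A rough sketch of that dd idea, assuming GNU dd. The clone_disk helper and the scratch files are made up for illustration; on real hardware the two arguments would be the failing disk and its replacement, and getting them the wrong way round destroys the source. conv=noerror carries on past read errors, and conv=sync zero-pads any short read so the data after it keeps its offset.

```shell
#!/bin/sh
# clone_disk SRC DST: copy SRC to DST block by block (hypothetical helper).
# conv=noerror keeps going after a read error; conv=sync pads a short
# block with zeros so later data stays at the same offset on the copy.
clone_disk() {
    dd if="$1" of="$2" bs=64k conv=noerror,sync 2>/dev/null
}

# Harmless demonstration on scratch files instead of real disks.
src=$(mktemp) && dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=64k count=16 2>/dev/null
clone_disk "$src" "$dst"
cmp -s "$src" "$dst" && echo "clone is identical"
rm -f "$src" "$dst"
```

A purpose-built tool such as GNU ddrescue, if it is installed, retries bad areas and keeps a map of what it could not read, which plain dd does not.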

Any magical bodges like this that could get my array back to life would be greatly appreciated.

Thanks, a feeling rather dense Savage
_____________

Update 1: Removing the caddies didn't help. I'm currently using dd to make a clone of sdd, and will then try to re-assemble the array with the new disk. Is there a way to fsck a "Linux RAID" partition?
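
On the fsck question: the filesystem lives on the assembled md device, not on the individual "Linux RAID" partitions, so fsck is normally pointed at the array, while a member partition is inspected with mdadm --examine. A small sketch of that decision, keyed on the TYPE string that blkid reports. The helper name is hypothetical, and the type strings are the ones current util-linux prints as far as I know, so treat them as an assumption.

```shell
#!/bin/sh
# check_tool_for_type TYPE: suggest a read-only inspection command for a
# block device, given the TYPE string reported by blkid (assumed values).
check_tool_for_type() {
    case "$1" in
        linux_raid_member) echo "mdadm --examine" ;;  # RAID member: read its superblock
        ext2|ext3|ext4)    echo "e2fsck -n" ;;        # filesystem: check without changing anything
        *)                 echo "unknown" ;;
    esac
}

check_tool_for_type linux_raid_member
check_tool_for_type ext3
```

On a real system the argument could come from something like blkid -o value -s TYPE /dev/sdd1, where /dev/sdd1 stands in for whichever member partition is in doubt.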

Last edited by savage; 10th April 2009 at 05:48 PM.
  #2  
11th April 2009, 05:36 AM
stevea
I think your plan (dd the entire image and use the new disk) is a wise choice. IIRC the disk ident is from a software generated UUID - so I think it's a clean solution.

Please post if this works (or not), and also ...
how long is the rebuild taking ?
  #3  
11th April 2009, 03:35 PM
savage
Thanks for the reply. I have successfully dd'd the disk and it didn't seem to error, but mdadm won't let me rebuild the array with the copy because it doesn't have a superblock, which has really thrown me, as I thought dd would copy EVERYTHING!
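
One possible explanation, offered as a guess rather than a diagnosis: if the array used the old version-0.90 metadata (an assumption, the thread never says which version), the superblock sits near the end of each member device rather than at the start. mdadm looks for it at an offset worked out from the size of the device it is given, so a copy that lands on a device or partition of a different size can contain the superblock bytes but at a place mdadm never checks. The arithmetic, with a made-up helper name and made-up sizes:

```shell
#!/bin/sh
# md090_sb_offset SECTORS: where a version-0.90 superblock is expected,
# in 512-byte sectors. Round the device size down to a multiple of
# 128 sectors (64 KiB), then step back one more 64 KiB block.
md090_sb_offset() {
    echo $(( ($1 & ~127) - 128 ))
}

# Two hypothetical device sizes (in sectors) that differ only slightly
# give different expected superblock locations.
md090_sb_offset 976773168
md090_sb_offset 976771055
```

On a live system, blockdev --getsz gives a device's size in 512-byte sectors, and mdadm --examine on the copied device or partition shows whether a superblock is actually found there.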

So anyway I've got another 2x500GB disks (they were destined to replace faulty disks), but I am using them as temporary storage and currently backing up data to those. I am confused, as I haven't had any kind of error retrieving data from the array running with 4 disks (including sdd, which fails consistently during rebuilds).

Once I've got the bulk of the stuff I don't want to lose, I'm going to try to add a new disk as sdc (the one I removed power from), hopefully writing a new superblock, and then 'dd' sdd onto it again.

I've seen another post on linuxquestions.org that suggests that using 'mdadm --create' won't shaft the data, but will write a superblock, but I wanted backups before trying that.

A normal rebuild of this array takes a good 3-6 hours.

As much as this does suck, my confidence levels with mdadm and RAID have gone through the roof

Like I say, all the essential stuff is safe, but there's a lot of movies and music on there that I'm trying to save too -- they're ripped from family DVDs, so no issue re-ripping them, but I'd rather avoid that if I can (most of my DVDs have been boxed for years).

I'll keep this thread updated with what happens, once I've got as much off as possible, I'll be free to go to town on it and try any and everything.

Savage

Update: I spoke too soon, the array fell over and then I got read errors copying some movies. I'll re-assemble and skip onto the next set of directories.

Last edited by savage; 11th April 2009 at 04:31 PM.
  #4  
11th April 2009, 04:39 PM
savage
Another question, I am getting really confused.

If the disk can be copied entirely without error, that says to me the disk itself is OK, but when rebuilding/copying data I get errors.

Is it possible for an ext3 file system error to trip up mdadm and make it think the disk is at fault?

----

I am well and truly beyond confused now. smartctl says that the disk is fine. fsck on the md1 array running with 4 disks reports it's fine. Rebuilding still failed.

The only good thing to come of this, is that sdc that I just tried adding should now have a valid superblock, so I'm dd'ing sdd onto that again, it'll take a few hours, but hopefully work. Then with any luck I can recover the array with that disk.

All in all, I really don't understand what's going on here, if the disk is fine, and the file system is fine, where's the problem!?

Last edited by savage; 11th April 2009 at 10:01 PM.
  #5  
12th April 2009, 06:32 PM
savage
Well in the end I was defeated, but I did manage to recover roughly 90% of the data, and important stuff was backed up.

For anyone else who accidentally drops 2 disks out of a RAID-5 array, don't panic, it's not that bad to fix (provided you don't have satanic hard disks like me):
Code:
mdadm --assemble /dev/mdX --force
--force is required because the disks are flagged as failed or removed; it tells mdadm to treat them as working again so the array can start.

If it complains the superblock is missing, you can apparently re-create the array without damaging data*, as long as the level, number of disks, chunk size, metadata version and disk order all match the original. Adding --assume-clean stops mdadm from resyncing over the existing contents:
Code:
mdadm --create /dev/mdX -l5 -n5 --assume-clean /dev/sda1 /dev/sdb1...
Creating the array also starts it, so there's no separate assemble step.

* I didn't try this, as no matter what I did, it always failed around 53%.
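
Pulling the forced-assembly steps into one place, here is a cautious sketch rather than a tested recipe. It only prints the commands unless DRY_RUN=0 is set. /dev/md1 is the array named earlier in the thread; the member partition names are assumed from the create example above, and the run and recover_plan helpers are made up for illustration.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
ARRAY=/dev/md1
# Assumed member partitions; adjust to the real layout.
MEMBERS="/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_plan() {
    # 1. Record what each member's superblock says before changing anything.
    for dev in $MEMBERS; do
        run mdadm --examine "$dev"
    done
    # 2. Stop any half-assembled array.
    run mdadm --stop "$ARRAY"
    # 3. Force assembly from the listed members.
    run mdadm --assemble --force "$ARRAY" $MEMBERS
    # 4. See what state the array came up in.
    run cat /proc/mdstat
    run mdadm --detail "$ARRAY"
}

recover_plan
```

Saving the --examine output first means there is a record of the original layout if a later step goes wrong.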
  #6  
12th April 2009, 11:23 PM
stevea
I greatly doubt (unsure) that an ext3 error could glitch your rebuild.
Glad you recovered all the important stuff, but 2 failures on a 5x RAID5 is beyond my experience.

As Meatloaf sang, "four out of five ain't bad". 3 of five is a problem.
  #7  
13th April 2009, 05:33 PM
savage
Quote:
Originally Posted by stevea
I greatly doubt (unsure) that an ext3 error could glitch your rebuild.
Glad you recovered all the important stuff, but 2 failures on a 5x RAID5 is beyond my experience.

As Meatloaf sang, "four out of five ain't bad". 3 of five is a problem.

I know big corps like to blame the technology, but ultimately it was my fault. I was putting the disks into self-cooled caddies and missed the power on one, dropping the array to 4 disks, and while that disk was being re-added, another disk failed. If I hadn't missed the power, it wouldn't have been a problem.

As for the disk that failed, I have no idea. I moved it into this PC, reformatted it and left it creating a file from /dev/zero; it didn't error or crash, it just filled up.

I've been running a home server since 2005 (when I got into Linux), and that was my first major disaster with it. I came out pretty well, and am amazed by Linux for it.

I restored user documents from backup to the new array last night. As soon as I did, I restarted a few services that had been failing, and everything just clicked and was back running. I was massively impressed; 2 days ago I didn't expect to come out of this with just a few scratches.

Initially that "(Filesystem Recovery #): " prompt was intimidating, but soon became my best friend, and a relaxed environment where I could experiment.

I've come out of this a lot more confident about RAID and mdadm, and loving Linux even more.

Quote:
Originally Posted by Ralph Waldo Emerson
Bad times have a scientific value. These are occasions a good learner would not miss.