Reviews, Rants & Things That Make You Scream: The place for you to submit reviews of all those applications you use with Fedora. The Devs probably aren't listening, but sometimes you've just GOT to blow off steam or sing its praises.

16th September 2012, 06:19 AM
Registered User
Join Date: May 2004
Location: Mexico City, Mexico
Age: 35
Posts: 4,418

btrfs trashed my system
I know I was playing with fire by running btrfs on my root partition ever since Fedora 15... And it worked fine until fairly recently, actually!
I started to get some weird messages at boot, and the system would occasionally hang with the HDD spinning like crazy. But today it finally became clear that either the filesystem had become corrupt or the latest kernel did not like my F15-formatted btrfs root partition... I feared my HDD was failing, but that was not the case (the other partitions on it were fine)... It all happened rather fast, going from a couple of messages about btrfs errors and inconsistencies to outright kernel panics when trying to even mount the partition from a rescue environment!
Good thing my important data is backed up and stored in a separate /home partition on another physical disk formatted as ext4. Still, I have to reinstall a whole bunch of applications and that is taking too long... I now see there is a reason why btrfs hasn't become the standard filesystem... I do think that fsck.btrfs might be at fault here, as it might have corrupted the partition at some point (I've seen it in action a few times... power failures due to storms down here recently). Plus, btrfs apparently didn't like suspend-resume cycles (I tend to suspend my system rather than turn it off, for faster startup).
__________________
If ain't broken, don't fix it! :eek:
If can be improved, go for it! :cool:
FedoraForum Community forums lurker.
Fedora user since RHL 5.2 :cool:
Systems: Laptop, Main System, Netbook.

16th September 2012, 06:22 AM
Registered User
Join Date: Aug 2012
Location: Australia
Posts: 803

Re: btrfs trashed my system
btrfs should be much improved in F18; I don't know about the fsck, though.

16th September 2012, 06:45 AM
Registered User
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,302

Re: btrfs trashed my system
I don't believe btrfs will be "much improved" in F18. F17 btrfs is near the current rev.
I think you should run
/sbin/btrfsck
over that drive (unmounted) and see what it says. There hasn't been a fsck.btrfs for some time.
Also consider ...
btrfs scrub start /dev/whatever # see man page for options
You should be scrubbing & defragging periodically for the current btrfs, IIRC.
__________________
None are more hopelessly enslaved than those who falsely believe they are free.
Johann Wolfgang von Goethe
Last edited by stevea; 16th September 2012 at 06:50 AM.

16th September 2012, 06:48 AM
Registered User
Join Date: May 2004
Location: Mexico City, Mexico
Age: 35
Posts: 4,418

Re: btrfs trashed my system
Will take that into account, Stevea. But I ended up reinstalling and reformatting back to ext4 for the time being. One thing I can say, though: apparently btrfs did not like being suspended at all, as the hangs would occur when I suspended the machine rather than turning it off. I still have a laptop formatted as btrfs with Fedora 17 on it, so I'll use it as a testbed.

16th September 2012, 06:54 PM
Registered User
Join Date: Jul 2009
Posts: 46

Re: btrfs trashed my system
I had something similar happen to me after a btrfs install of Fedora 16. Basically killed my hard drive. Luckily it was a crappy 80GB backup drive, but I'm very hesitant to install anything but ext4 now.

16th September 2012, 08:03 PM
Registered User
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,302

Re: btrfs trashed my system
First - good to see you posting again Thetargos. It's been a while.
btrfs is marked experimental for a reason, but frankly I haven't seen anything fishy w/ btrfs in over a year, and I have a (well backed up) partition as btrfs in daily use.
I'd really like to hear how the suspend problem can be reproduced, if possible.
btrfs doesn't kill drives, Shady. All the block I/O goes through the same drivers regardless of the filesystem type. At most btrfs could cause more I/O and more seeks, but that's not my experience (actually measured) wrt btrfs. I think you are blaming btrfs for a coincidental drive failure.
Last edited by stevea; 16th September 2012 at 08:07 PM.

16th September 2012, 08:40 PM
Registered User
Join Date: May 2004
Location: Mexico City, Mexico
Age: 35
Posts: 4,418

Well, on occasion it happened when resuming, and then it started to happen more often, until it died.
My suspicion is that the drive may be starting to fail, and btrfs is more sensitive to that than ext4. I'm in the process of replacing the drive, hopefully with an SSD, and formatting it as btrfs... when I get it. Sadly, they are still a bit expensive.

16th September 2012, 09:27 PM
Registered User
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,302

Re: btrfs trashed my system
Yeah SSDs aren't getting cheaper at any noticeable rate, tho' they are getting faster. You can get a ~120GB 2011 era SSD for ~$90USD on sale, but the really good performers are still over $1USD/GB.
If the drive has problems, does smartctl show it?

17th September 2012, 01:04 AM
Registered User
Join Date: May 2004
Location: Mexico City, Mexico
Age: 35
Posts: 4,418

I'll test that with my three drives and see.
---------- Post added at 07:04 PM ---------- Previous post was at 05:32 PM ----------
smartctl doesn't report problems, I will try to run a full set of tests.
Code:
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.5.3-1.fc17.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.9
Device Model: ST3808110AS
Serial Number: 5LR4A3D2
Firmware Version: 3.AAH
User Capacity: 80.026.361.856 bytes [80,0 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sun Sep 16 19:04:02 2012 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
                                          was completed without error.
                                          Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 0)     The previous self-test routine completed
                                          without error or no self-test has ever
                                          been run.
Total time to complete Offline
data collection:                 ( 430)   seconds.
Offline data collection
capabilities:                    (0x5b)   SMART execute Offline immediate.
                                          Auto Offline data collection on/off support.
                                          Suspend Offline collection upon new
                                          command.
                                          Offline surface scan supported.
                                          Self-test supported.
                                          No Conveyance Self-test supported.
                                          Selective Self-test supported.
SMART capabilities:              (0x0003) Saves SMART data before entering
                                          power-saving mode.
                                          Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
                                          General Purpose Logging supported.
Short self-test routine
recommended polling time:        ( 1)     minutes.
Extended self-test routine
recommended polling time:        ( 27)    minutes.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   085   006    Pre-fail  Always       -       141785371
  3 Spin_Up_Time            0x0003   097   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1727
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       328139453
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       6691
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   020    Old_age   Always       -       2530
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   065   046   045    Old_age   Always       -       35 (Min/Max 25/35)
194 Temperature_Celsius     0x0022   035   054   000    Old_age   Always       -       35 (0 17 0 0 0)
195 Hardware_ECC_Recovered  0x001a   067   045   000    Old_age   Always       -       184716999
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       345
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

17th September 2012, 02:40 AM
Registered User
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,302

Re: btrfs trashed my system
Quote:
|
199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 345
|
Bad data comm between the drive and the controller. Might mean a bad data or power cable, or a bad/overloaded PSU.

17th September 2012, 10:04 PM
Registered User
Join Date: May 2004
Location: Mexico City, Mexico
Age: 35
Posts: 4,418

Re: btrfs trashed my system
Hmm... Will have to test, then. I have a 500 W CoolerMaster PSU, the drive in question is attached to a SATA power connector straight off the PSU. I just changed the data cable with the same result, will have to try another branch with a converter... If not, a PSU change is in order.
---------- Post added at 04:04 PM ---------- Previous post was at 03:39 PM ----------
Looks like btrfs is less tolerant to hardware failure than ext4... in the long run, data gets lost, though, so btrfs spotted the problem earlier, it would seem.

19th September 2012, 07:28 AM
Registered User
Join Date: Apr 2006
Location: Ohio, USA
Posts: 8,302

Re: btrfs trashed my system
The UDMA_CRC_Error_Count value is cumulative: it may increase, but it won't decrease. The point is to stop it from increasing.
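Since the counter only ever goes up, the useful check is whether it has grown between two readings. Here is a minimal Python sketch of that comparison. It assumes the attribute rows look like the smartctl output posted earlier in this thread (ID first, raw value in the tenth column); the function names and the later reading of 352 are made up for illustration.

```python
def parse_smart_attributes(text):
    """Parse smartctl-style attribute rows into {attribute_name: raw_value}."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row ("ID# ...") and other text are skipped.
        if len(fields) >= 10 and fields[0].isdigit():
            name, raw = fields[1], fields[9]
            if raw.isdigit():
                attrs[name] = int(raw)
    return attrs


def crc_errors_still_growing(before, after, name="UDMA_CRC_Error_Count"):
    """Return True if the cumulative counter rose between two snapshots."""
    return parse_smart_attributes(after)[name] > parse_smart_attributes(before)[name]


# Row quoted in this thread, followed by two hypothetical later readings.
before = "199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 345"
after_ok = "199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 345"
after_bad = "199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 352"

print(crc_errors_still_growing(before, after_ok))   # counter unchanged
print(crc_errors_still_growing(before, after_bad))  # counter rose
```

If the count is unchanged after swapping a cable or moving to another power branch, that change probably addressed the cause; if it keeps rising, keep looking.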
Quote:
|
Looks like btrfs is less tolerant to hardware failure than ext4... in the long run, data gets lost, though, so btrfs spotted the problem earlier, it would seem.
|
Well, all filesystems do the same basic thing: they generate block I/O requests which pass through the block manager and into the hardware driver, and the queued results come back. The filesystem doesn't do squat with the hardware - that's managed at lower levels.
So if you use the btrfs scrub you can detect latent disk errors. By default btrfs has duplicate file metadata, and it does a checksum on every block, so it may detect errors (via checksum) sooner and it may recover better due to the metadata duplication. But whether one filesystem causes or trips across a particular block flaw earlier is more or less arbitrary.
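To illustrate the checksum-plus-duplicate idea, here is a toy Python sketch. It is not how btrfs is implemented: the class is hypothetical, and zlib.crc32 stands in for the crc32c checksum that btrfs uses by default. It only shows why a stored checksum lets a read notice a damaged copy, and why a second copy lets it recover.

```python
import zlib


class MirroredBlock:
    """Toy model: two copies of a block plus a checksum recorded at write time."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.checksum = zlib.crc32(data)  # stand-in for crc32c

    def read(self) -> bytes:
        # Return the first copy whose checksum still matches the stored one.
        for copy in self.copies:
            if zlib.crc32(bytes(copy)) == self.checksum:
                return bytes(copy)
        raise IOError("all copies failed checksum verification")


block = MirroredBlock(b"inode table fragment")
block.copies[0][3] ^= 0xFF   # simulate silent corruption of the first copy
recovered = block.read()     # checksum mismatch on copy 0, falls back to copy 1
print(recovered == b"inode table fragment")
```

A filesystem without per-block checksums would hand back the damaged first copy without noticing, which is one way the same hardware fault can show up at different times on different filesystems.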
This says your disk was always able to correct all the hundreds of millions of raw & ecc errors.
Quote:
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
|
So the *probable* cause of any disk spew in the logs is that it was recently getting the UDMA_CRC interface errors. That's a guess, but ....

22nd September 2012, 10:47 PM
Registered User
Join Date: Sep 2009
Location: Teetering between the edge of insanity and the border of all that's weird
Posts: 100

Re: btrfs trashed my system
BTRFS was my choice for a dual-thumbdrive install of Fedora 17 because of the compression, and the first time around the filesystem died with no hope of recovery. Now it's working fine, but the moral of the story is: don't install your OS on BTRFS unless you regularly make complete backups.
__________________
Often the only way to do a job right is to do a laughable job at it.
My advice is generally cheap and saturated with laziness, but it might work, or I wouldn't have posted it.