Having had the joy of trying to recover data from a drive that had been formatted and was now failing, I thought I would document what I did and provide a guide for those searching for how to do this, especially on Fedora.
To set the scene for this guide I will first summarise my situation.
- One laptop that had Windows installed.
- Windows stopped booting and so Linux was installed (it was intended that the Windows partition would be shrunk but this failed for some reason).
- After some months of use the disk began to fail.
- After mounting the disk from a live CD, the only files needed (photos) could not be found on the Windows partition.
The first thing to know is that deleting a file does not really remove its data from the hard drive; only the filesystem's record of the file is removed. Unless the area of the hard drive that held the data has since had new data written to it, the original is still there.
Even when a drive is re-formatted/partitioned the data can still be there (which is why a hard drive should never be discarded without wiping it completely). Even installing a new OS on top of an old one will leave some of the data recoverable.
A word of warning: once you have decided that data has been lost and needs to be recovered, the first step is to power down the drive/machine so that nothing more is written to it. You can then start the machine again from a live CD or, if it is not an OS drive, mount the drive read-only.
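Before touching the drive it can be worth confirming that none of its partitions are mounted read-write. The following is only a sketch: rw_mounts is a helper name I made up, and it assumes a Linux system where /proc/mounts lists the device, mount point, filesystem type and mount options in that order.

```shell
# rw_mounts is a hypothetical helper, not a standard command.
# It prints the mount point of every partition of DEVICE that is
# currently mounted read-write. The second argument is the mounts
# table to inspect and defaults to /proc/mounts (Linux only).
rw_mounts() {
    awk -v dev="$1" '
        # field 1 = device, field 2 = mount point, field 4 = options
        index($1, dev) == 1 && $4 ~ /(^|,)rw(,|$)/ { print $2 }
    ' "${2:-/proc/mounts}"
}

# Example: rw_mounts /dev/sda
# No output means nothing on /dev/sda is mounted read-write.
```

If anything is listed, unmount it or remount it read-only before going any further.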
Regardless of whether data has been accidentally deleted, a partition corrupted or the disk is dying, I would always suggest making an image of the drive and working from that. In the case of a dying disk this is a necessity. For the other two cases you can get away with working on the disk directly, but it's always nice to test with an image first, especially when trying to reconstruct a partition.
Imaging a drive/filesystem (assumed damaged)
There are a few options for imaging a drive; all of them are variations or improvements on dd.
- dd - simple low-level copying
- dd_rescue - an evolution of dd designed to handle bad sectors.
- GNU ddrescue - disk imaging tool that "copies data from one file or block device to another, trying hard to rescue data in case of read errors."
dd is a tool designed for low-level copying of raw data; as such it may be used to copy one device to another. One flaw with it is that it cannot handle bad blocks of data. To get around this two other programs were created, dd_rescue and GNU ddrescue. GNU ddrescue is the better and faster of the two programs, so I will only focus on using ddrescue.
Warning - dd is also known as "death and destruction" since if you use it incorrectly it can wipe your hard drive and leave the data unrecoverable.
ddrescue
To install ddrescue issue the following command from the terminal.
Code:
su -c 'yum install ddrescue'
ddrescue is used with the following parameters
Code:
ddrescue [options] in_device out_device [logfile]
Whilst a logfile is optional, it is highly recommended that you use one; without it you will have to start from scratch if you run out of space or the process is killed. If the device has bad sectors then a logfile is essential.
In this guide it is assumed that the disk /dev/sda has bad sectors and needs imaging. We shall also assume that it is going to be imaged to a folder called 'recovery' on a USB hard drive that is mounted at '/media/usbdrive'. Do not forget that the drive you are imaging to must have at least as much free space as the size of the original device.
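The free-space requirement can be checked before starting. This is a minimal sketch rather than anything ddrescue provides: enough_space is a made-up helper name, and the usage line assumes that blockdev (from util-linux) is available and run as root to read the size of /dev/sda.

```shell
# enough_space is a hypothetical helper, not part of ddrescue.
# It succeeds (exit status 0) if DIR has at least NEED_BYTES free.
enough_space() {
    need_bytes=$1
    dir=$2
    # df -P prints one line per filesystem in POSIX format;
    # -k reports sizes in 1024-byte blocks. Column 4 is "available".
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$(($avail_kb * 1024))" -ge "$need_bytes" ]
}

# Example (as root, because blockdev reads the device):
#   enough_space "$(blockdev --getsize64 /dev/sda)" /media/usbdrive/recovery \
#       && echo "enough room for the image"
```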
Data recovery is divided into two parts, because if the drive is failing we wish to minimise use of the drive. We first copy only the parts of the drive that are healthy, and then go back to copy the faulty parts (a process that requires intensive disk usage). The end result is to maximise the amount of data copied before the drive dies.
The first pass is done by running ddrescue with the '-n' flag, which tells ddrescue not to split or retry the damaged sections of the disk.
Code:
su -c 'ddrescue -n /dev/sda /media/usbdrive/recovery/sda.img /media/usbdrive/recovery/sda.log'
Once this has finished we can go back and try to recover the data in blocks with bad sectors. Here we use two flags, '-d' to access the disk directly and bypass the kernel cache, and '-r', the maximum number of retries.
Note that the output file and logfile given should be the same as in the previous command, as ddrescue will read the locations of the bad blocks from the logfile and write whatever it manages to recover into the matching positions in the image.
Code:
su -c 'ddrescue -d -r3 /dev/sda /media/usbdrive/recovery/sda.img /media/usbdrive/recovery/sda.log'
This command can be run again with more retries, but 3 is the recommended number.
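To judge whether another pass is worthwhile, the logfile can be totalled up. The sketch below is my own helper (summarise_log is a made-up name), and it assumes the logfile layout described in the ddrescue manual: comment lines starting with '#', one line giving the current position, then lines of "position size status" in hexadecimal, where a status of '+' means the block was rescued.

```shell
# summarise_log is a hypothetical helper, not part of ddrescue.
# It prints the number of bytes marked as rescued ('+') and the
# number of bytes in every other state (bad, untried and so on).
summarise_log() {
    rescued=0
    remaining=0
    first=yes
    while read -r pos size status rest; do
        case "$pos" in
            ''|'#'*) continue ;;        # skip blank and comment lines
        esac
        if [ "$first" = yes ]; then
            first=no                    # first data line is the current position
            continue
        fi
        case "$status" in
            '+') rescued=$((rescued + $size)) ;;
            *)   remaining=$((remaining + $size)) ;;
        esac
    done < "$1"
    echo "rescued=$rescued remaining=$remaining"
}

# Example: summarise_log /media/usbdrive/recovery/sda.log
```

If the remaining figure stops shrinking between runs, further retries are unlikely to help.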
Extracting files from the recovered image
There are many options for extracting files from a disk. We shall only discuss a few here.
- foremost - This recovers files based on their headers and footers and is filesystem independent.
- Scalpel - This is a fast file carver similar to foremost.
- Photorec - Another filesystem independent recovery program. Despite the name it does much more than recover photos.
In my experience not all of these programs will find the same files, so it is worth trying them all. Scalpel and foremost will also be able to find fragments of files. Scalpel was by far the fastest, but did not retrieve the meta-data for jpegs so was not as useful to me (see 'Cleaning up').
Foremost
Foremost can recover files from an image or a device, and the syntax for both is the same. The files will be recovered to a directory that you specify so it is best to create one.
Foremost is not installed by default so we install it by doing:
Code:
su -c 'yum install foremost'
Then create a directory to recover the files to.
Code:
mkdir /media/usbdrive/recovery/foremost
Whilst there are several options, only two are really needed: '-i' for the input device/image and '-o' for the output directory. You can also tell foremost which files to recover. In my case, I only needed jpg so the flag was added for that (-t jpg).
foremost comes preconfigured to recover the following types of files: jpg, gif, png, bmp, avi, exe, mpg, mp4, wav, riff, mat, wmv, mov, pdf, ole (this will grab any file using the OLE file structure, which includes PowerPoint, Word, Excel, Access, and StarWriter), doc, zip
To use foremost do:
Code:
su -c 'foremost -t jpg -i /media/usbdrive/recovery/sda.img -o /media/usbdrive/recovery/foremost'
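Once foremost has finished, a quick tally of what it found helps when comparing it against the other tools. This is just a sketch with a made-up helper name (count_by_extension); it assumes the recovered files are written somewhere under the output directory with their type as the file extension, and it will also count any report file foremost leaves there.

```shell
# count_by_extension is a hypothetical helper, not part of foremost.
# It prints "extension=count" for every file extension found under DIR.
count_by_extension() {
    find "$1" -type f -name '*.*' |   # only files that have an extension
        sed 's/.*\.//' |              # keep the text after the last dot
        sort | uniq -c |              # count each extension
        awk '{ print $2 "=" $1 }'
}

# Example: count_by_extension /media/usbdrive/recovery/foremost
```

The same function can be pointed at the scalpel or photorec output directories.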
Scalpel
Scalpel is a fast file carver. It works much like foremost by looking at file headers and matching them to a list of known types. By default in Fedora all file types are enabled. It is best to edit this to just the types you want. This can be done by commenting out the appropriate lines in /etc/scalpel/scalpel.conf
To install it type:
Code:
su -c 'yum install scalpel'
Scalpel takes the input device/image as an argument and the output directory as an option. So first we create a directory to dump the output to:
Code:
mkdir /media/usbdrive/recovery/scalpel
and then set it to work
Code:
su -c 'scalpel /media/usbdrive/recovery/sda.img -o /media/usbdrive/recovery/scalpel'
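Rather than commenting out lines in /etc/scalpel/scalpel.conf by hand, a trimmed copy of the file can be generated. This is only a sketch: keep_only_type is a made-up helper, it assumes that each rule in the file starts with the file extension as its first field, and the output file name is just an example. Scalpel can be pointed at the copy with its '-c' option.

```shell
# keep_only_type is a hypothetical helper, not part of scalpel.
# It reads a scalpel configuration on standard input and writes a copy
# to standard output in which every rule except those for TYPE is
# commented out. Existing comments and blank lines pass through as-is.
keep_only_type() {
    awk -v keep="$1" '
        /^[ \t]*(#|$)/ { print; next }   # already a comment or blank line
        $1 == keep     { print; next }   # rule for the wanted file type
                       { print "#" $0 }  # disable every other rule
    '
}

# Example:
#   keep_only_type jpg < /etc/scalpel/scalpel.conf > /media/usbdrive/recovery/scalpel-jpg.conf
#   su -c 'scalpel -c /media/usbdrive/recovery/scalpel-jpg.conf /media/usbdrive/recovery/sda.img -o /media/usbdrive/recovery/scalpel'
```

Working on a copy leaves the system-wide configuration untouched. Remember that scalpel wants an empty output directory for each run.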
Photorec
The final program is photorec. It was designed to recover photos from digital cameras, but has been extended to cover hard disks and many more file types. It is part of the testdisk package so it is installed by doing
Code:
su -c 'yum install testdisk'
Photorec is initiated from the command line but it is interactive and will prompt you with what to do. Start it by doing:
Code:
photorec /media/usbdrive/recovery/sda.img
For the time being I shall skip providing instructions for photorec as I think it is clear, but if people want instructions I can add them.