Fedora Linux Support Community & Resources Center

#1
20th June 2012, 10:38 AM
deadeyese
Registered User
Join Date: May 2012
Location: UK
Posts: 8
Copy gets stuck / freezes

Hi all, I've had a bit of a search around but can't really find any information relating to my problem.

I am attempting to perform a backup of data using the cp command in Fedora 16 x64 Desktop edition. However, the data transfer stops for no apparent reason and freezes on a file. The only option is to cancel the process and attempt to start again.

I am copying from a Windows NTFS partition (I have the drive piggy-backed on my Fedora system at the moment) to a software RAID 5 volume. I initially attempted to do this over the network and blamed the fault on a possible Samba issue; however, with the drive mounted locally the error still persists.

I have the drive mounted using the options "-o defaults,mand". The mand option seemed to help the transfer get a bit further, but it still gets stuck.

Anyone have any ideas? Is this an issue with NTFS compatibility? I don't think so, since it also occurs over the network. Or is this an error with the RAID volume and its ability to cope with large data changes? I have checked /proc/mdstat and all appears fine.

I'm sorry it's so long-winded, but hopefully someone can help. If you need any more information, just ask.

#2
20th June 2012, 11:52 AM
george_toolan
Registered User
Join Date: Dec 2006
Posts: 1,718
Re: Copy gets stuck / freezes

Copying a large number of files shouldn't be a problem.

Does it always stop on the same file?

Please check /var/log/messages for errors.

Can you try a different cable which connects the drive to the system?

You should also check the SMART values of the drive. It probably has a bad sector or something like that.

What kind of drive is this? It is also possible that it is overheating when you try to copy the whole thing.

Code:
smartctl -a /dev/sdX
where X is the letter of your NTFS drive (for example, /dev/sdb).

#3
20th June 2012, 05:07 PM
deadeyese
Re: Copy gets stuck / freezes

Hi, thanks for the reply

It was stopping on the same file until I added the mand option. After that it got further and usually, though not always, stopped around the same file. There were no errors in /var/log/messages, which is what had me a little stumped. I ran several SMART scans before posting (forgot to mention that) and they all passed short self-tests with drive temperatures below 35C (well within tolerance).

The drives are Samsung F2 1.5TB drives

However, I believe I have found the issue. A closer look showed that although the RAID array initially appeared OK, it was in fact stuck trying to perform a resync. A system restart and an increase in the resync speed limit appear to have cleared the errors for now, as data transfer is in progress (once the resync completed 5 hours later...).
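
For anyone hitting the same thing, here is a rough sketch of how one might check whether an md array is busy resyncing rather than trusting the [UUUU] line alone. The md_busy helper is a made-up name for illustration, md0 is just the array name used in this thread, and the grep pattern assumes the usual /proc/mdstat progress-line layout.

```shell
# md_busy: hypothetical helper, not part of mdadm.
# Reads /proc/mdstat-style text on stdin and succeeds if it contains a
# resync, recovery, check or reshape line (progress, DELAYED or PENDING).
md_busy() {
    grep -Eq '(resync|recovery|check|reshape) *='
}

if [ -r /proc/mdstat ]; then
    if md_busy < /proc/mdstat; then
        echo "an md array is resyncing or waiting to resync"
    else
        echo "no resync in progress"
    fi
fi

# The per-array state is also exposed in sysfs (md0 assumed here).
if [ -r /sys/block/md0/md/sync_action ]; then
    cat /sys/block/md0/md/sync_action   # e.g. idle, resync, recover, check
fi
```

On a machine without md arrays both checks are simply skipped.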

I'll post back if anything happens again or if this was indeed the fault.


##### EDIT #####

Ok, the error is still present. The RAID array is doing nothing this time, its status is idle and therefore ready for data to be read / written. The cp command has just stopped on a file and there is no disk activity. The previous files that were getting stuck have been copied successfully without even a hint of a problem.

Last edited by deadeyese; 20th June 2012 at 06:17 PM. Reason: New Information

#4
20th June 2012, 06:35 PM
george_toolan
Re: Copy gets stuck / freezes

You should try a long self test on the drives (especially the ntfs source drive). This will take a couple of hours since it's trying to read all sectors on the drives, but it should be able to detect or re-map any bad sectors. The short self test just tries to read some of the sectors.
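
A small sketch of what that looks like on the command line. The helper only prints the smartctl commands so it is safe to paste; smart_long_cmds is an invented name and /dev/sdX is a placeholder for the real device. Both smartctl options are standard smartmontools ones.

```shell
# smart_long_cmds DEVICE...: hypothetical helper that prints, but does not
# run, the smartctl commands for an extended self-test on each device.
smart_long_cmds() {
    for dev in "$@"; do
        echo "smartctl -t long $dev"       # start the extended (long) self-test
        echo "smartctl -l selftest $dev"   # show the self-test log once it has finished
    done
}

smart_long_cmds /dev/sdX
```

Run the first command as root, wait for the time smartctl estimates, then run the second to read the result.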

#5
20th June 2012, 06:59 PM
deadeyese
Re: Copy gets stuck / freezes

Agreed, a long self test would be a better option and I'll likely run one overnight on all the drives.

However, some further information on this: it appears that cp isn't actually getting stuck. I've switched to rsync based on some information I found elsewhere. With its ability to show progress I can see that the transfer is taking place, but at glacial speeds (20 kB/s instead of 200+ MB/s). Testing read speeds with "dd if=/dev/sdX of=/dev/null bs=1M count=1024" shows that both the NTFS drive and the RAID array are reading correctly.

From this I take it there is an issue with writing to the RAID array, correct? Also, I guess a resync doesn't help to avoid bad sectors. These tests are going to take forever...

#6
20th June 2012, 09:34 PM
george_toolan
Re: Copy gets stuck / freezes

Some more tests to keep you busy:

You should check your memory with Memtest86+. I hear bad memory can affect RAID 5s.

What about the write speed of the RAID?

Code:
dd if=/dev/zero of=/myraid/somelargefile bs=1M count=1024
then try to increase the file size.

You can monitor drive speeds with a system monitor like gkrellm.

So what exactly is it doing when you say it gets "stuck"? To check try something like

Code:
strace -p `pidof cp`
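
Before attaching strace it can help to look at the process state first. A minimal sketch, assuming a ps that accepts -o stat= and -p (procps on Fedora does); proc_state is a made-up helper name.

```shell
# proc_state PID: hypothetical helper that prints the ps state column.
# A state beginning with "D" means uninterruptible sleep, which usually
# means the process is blocked inside the kernel waiting on I/O.
proc_state() {
    ps -o stat= -p "$1" | tr -d ' '
}

# Demo on the current shell; for the stuck copy you would pass
# "$(pidof -s cp)" instead of "$$".
proc_state "$$"
```

If the state starts with D, strace may print little or nothing until the process wakes up, which is itself a useful hint.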

#7
20th June 2012, 10:31 PM
deadeyese
Re: Copy gets stuck / freezes

Well, another update :P

Long SMART test has found nothing on the NTFS drive which is good. Still waiting for the larger drives in the RAID array.

However, doing the speed test for writing (didn't consider writing to a file - still learning :P), the NTFS drive came back with 84.8 MB/s, which is a little slower but definitely acceptable. The RAID array, however, has been going for 5 minutes, which I attribute to the aforementioned glacial speeds.

Obviously something is going wrong with the array. Why would it have 200+ MB/s throughput one minute and then stop?

As for cp, obviously the command isn't at fault - it was still processing, just incredibly slowly because of the RAID array, but without a progress indicator (which rsync has) I couldn't easily tell.


EDIT:

btw, output from /proc/mdstat

Code:
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] 
md0 : active raid5 sdd1[0] sde1[2] sdc1[1] sdf1[3]
      4395412224 blocks level 5, 256k chunk, algorithm 2 [4/4] [UUUU]
      
unused devices: <none>
and the progress of dd on md0 so far (which is terrible):
Code:
dd if=/dev/zero of=/share/tempfile bs=1M count=1024
29+0 records in
29+0 records out
30408704 bytes (30 MB) copied, 1478.61 s, 20.6 kB/s
30+0 records in
30+0 records out
31457280 bytes (31 MB) copied, 1529.81 s, 20.6 kB/s
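
As a sanity check on those figures, the rate dd reports is just bytes divided by elapsed seconds. The sketch below (helper names are invented, they are not part of dd) redoes that arithmetic and estimates how long the rest of the 1 GiB file would take at the same average rate.

```shell
# Hypothetical helpers for illustration; dd itself only prints the summary line.

# rate_kBps BYTES SECONDS -> throughput in kB/s (1 kB = 1000 bytes), one decimal
rate_kBps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000 }'
}

# eta_hours TOTAL_BYTES BYTES_DONE SECONDS -> hours left at the average rate so far
eta_hours() {
    awk -v t="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (t - b) * s / b / 3600 }'
}

rate_kBps 30408704 1478.61              # first sample above: 20.6
eta_hours 1073741824 30408704 1478.61   # rest of the 1 GiB file: 14.1 (hours)
```

In other words, a write that took under 3 seconds when the array was healthy would need more than half a day at this rate.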

EDIT2:

Ok, the long self-tests on the hard drives in the RAID array have completed with no errors. So they appear to be healthy, but obviously the array as a whole has some issues.

I should note that I've made some modifications: /proc/sys/dev/raid/speed_limit_min has been increased to 150000 and /sys/block/md0/md/stripe_cache_size has been increased to 8192. The previous values were 1000 and 256 respectively.
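
A side note on what that stripe_cache_size change costs in RAM. As I understand the kernel md documentation, the stripe cache uses roughly page size x number of member disks x stripe_cache_size bytes. The helper below is only a sketch of that arithmetic; the function name is invented and the 4096-byte page size is an assumption (check with getconf PAGESIZE).

```shell
# stripe_cache_mib ENTRIES DISKS [PAGE_BYTES]
# Hypothetical helper: approximate stripe cache memory in MiB,
# using page_size * nr_disks * stripe_cache_size.
stripe_cache_mib() {
    echo $(( $1 * $2 * ${3:-4096} / 1024 / 1024 ))
}

stripe_cache_mib 256 4     # old value on the 4-disk array: 4
stripe_cache_mib 8192 4    # new value: 128

# Show the live settings if this machine has them (paths from the post above).
for f in /proc/sys/dev/raid/speed_limit_min /sys/block/md0/md/stripe_cache_size; do
    if [ -r "$f" ]; then
        printf '%s = %s\n' "$f" "$(cat "$f")"
    fi
done
```

So 8192 entries on four disks ties up about 128 MiB, which is affordable on a machine with a few GB of RAM.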

The cache increase did raise the write speed from 20 kB/s to about 8 MB/s for a minute or so, then it fell again.

Just in case it's important, the RAID 5 volume is formatted as ext3.

Last edited by deadeyese; 21st June 2012 at 12:54 AM. Reason: More Information

#8
21st June 2012, 04:07 AM
DBelton
Administrator
Join Date: Aug 2009
Posts: 6,612
Re: Copy gets stuck / freezes

There were some Samsung drives that had firmware problems, which would cause errors if smartctl queried the drive while the drive was writing data. (smartctl queries the drive every so often automatically)

Check to see if there is a firmware update for your drives.

#9
21st June 2012, 10:02 AM
george_toolan
Re: Copy gets stuck / freezes

This problem only affects Samsung F4 drives and smartctl will tell you about it:

Code:
==> WARNING: Using smartmontools or hdparm with this
drive may result in data loss due to a firmware bug.
****** THIS DRIVE MAY OR MAY NOT BE AFFECTED! ******
Buggy and fixed firmware report same version number!
See the following web pages for details:
http://www.samsung.com/global/business/hdd/faqView.do?b2b_bbs_msg_id=386
http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks
You should check your cables and find out why your RAID is resyncing.

During a resync it could be really slow.

How much memory do you have?

The explanation for high transfer speeds (200 MiB/sec) at the beginning of a transfer is rather simple: the data is buffered in memory first and later written to the disk(s). Of course this only works for small amounts of data. When you try to transfer a large number of files, the speed will then drop to the actual write speed of your drive(s).
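
One way to keep that memory buffering from flattering a dd write test is to make dd flush before it reports. A minimal sketch, assuming GNU dd (conv=fdatasync is a GNU coreutils option); the write_test name and the temporary file are placeholders, and on the real array the target would be something like /share/largefile.

```shell
# write_test FILE MIB: hypothetical wrapper around dd.
# conv=fdatasync makes dd sync the file data to disk before printing its
# summary, so the reported rate includes the time to actually write it out.
write_test() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fdatasync
}

# Small demo against a temporary file so it is safe to run anywhere.
tmpfile=$(mktemp)
write_test "$tmpfile" 4
ls -l "$tmpfile"
rm -f "$tmpfile"
```

With the flush included, a 1 GiB test should report something much closer to what the disks can sustain than a figure measured while the data is still sitting in RAM.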

#10
21st June 2012, 10:42 AM
deadeyese
Re: Copy gets stuck / freezes

The RAID array was resyncing because of an unclean shutdown. Several processes got stuck and I was unable to end them in any manner that I am aware of (kill, kill -9, pkill etc). In the end it was a hard reset which obviously upset the array.

However, my tests occurred once the resync had completed so this shouldn't have affected the performance.

As for SMART, there is no warning like the one you describe about the Samsung drive, but I will check for updates. After a restart, the array is performing much better. SMART is only run from the command line; the daemon is not running in the background, so that should rule this out.

Code:
dd if=/dev/zero of=/share/largefile bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.87954 s, 373 MB/s
Unfortunately, a subsequent test didn't go as well. I attempted to use the same parameters except with a count of 8192. It seemed to be working, then stopped.


(There is supposed to be an image above this text but it isn't showing up :s - here is the link gkrellm screenshot)

Also, I have 4GB DDR3 at 1,333 MHz and 4GB of Swap on my main drive where the OS is located - this is a Corsair ForceGT SSD. I am also running an Intel Core i7 2600K at stock settings - I know the specs are overkill for what is essentially a server to me, but I got the CPU cheap from a friend at Intel :P In any case, it shouldn't be the bottleneck.

Time for a memory test and firmware update it seems.

Thanks again for all your help!


##### EDIT #####

Ok, lots of memory tests later, there is something wrong... For one thing, the two modules were running in single channel (the manual for the MoBo and the QVL contradict each other on which DIMMs are on the same channel). But the two modules together cause 1 or 2 errors in Memtest86+, while each module by itself has no errors, so a very weird error. The set I have (2x 2GB) isn't on the QVL BUT the code for a single stick is? Very strange. In any case, I will more than likely be investing in new memory to see if it fixes these issues.

Once I get it, I'll let you guys know the results.


##### EDIT 2 #####

Ok, new memory purchased, installed and tested using memtest86+ and no errors (phew)

Had to wait a further 4.5 hours for the array to resync due to unclean shutdown because of stuck processes (in D state). Finally tested and it appears to be better, hopefully I won't suffer any more errors.

Code:
dd if=/dev/zero of=/share/largefile bs=1M count=8192
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 66.3043 s, 130 MB/s
Thanks again for all your help. Topic may have gone off on a bit of a tangent but hopefully the issue is resolved and documented for someone else :P

Ok, well that was short-lived. The problem is back. Did a transfer of maybe 60-80GB and then suddenly the speed dropped to below 1 MB/s. This just doesn't make any sense: the hard drives are clean with no bad sectors and the memory is brand new and clean.

Every time this happens, cp seems to enter a process state of 'D' (uninterruptible sleep), which apparently means it is waiting on I/O. All the drives are still responding though, albeit the RAID array is incredibly slow. This I believe is because of the HDDs I'm using. I found an article that was performing benchmarking including multi-process writes - the Samsung F2 drives performed terribly at around 6-8 MB/s, which is somewhat near what I'm experiencing. SO, the big question is why does cp keep ending up hanging on I/O?
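
To see what is stuck, something like this lists every process in uninterruptible sleep along with the kernel function it is waiting in (the wchan column). A sketch assuming the procps ps that ships with Fedora; only_d_state is a made-up helper name.

```shell
# only_d_state: hypothetical filter that keeps rows whose second column
# (the ps STAT field) starts with "D", i.e. uninterruptible sleep.
only_d_state() {
    awk '$2 ~ /^D/'
}

# PID, state, kernel wait channel and command name for every process.
# The trailing "=" on each -o suppresses the header line.
ps -e -o pid= -o stat= -o wchan= -o comm= | only_d_state
```

No output means nothing is in D state at that moment; running it a few times while the copy is crawling shows whether cp (or a kernel md thread) keeps turning up.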

This post is getting pretty long now but I figure it's better to edit than create a new post. Hopefully someone has some idea. I'm currently trying to perform a backup due to a systems failure on another machine, so this fault has completely messed everything up over the past few days.

Just in case this is helpful

Last edited by deadeyese; 22nd June 2012 at 12:30 AM.

#11
23rd June 2012, 01:23 PM
deadeyese
Re: Copy gets stuck / freezes

Sorry to bump, but any ideas anyone?

#12
24th June 2012, 03:57 PM
deadeyese
Re: Copy gets stuck / freezes

Ok, I have managed to find out some more information. This bugzilla report outlines my issue and it appears to be a problem within the Fedora 16 Kernel.

As such, I wiped my OS and re-installed Fedora 15 and magically everything is working perfectly again. Just transferred 470GiB across my network to the RAID array without error.

Thanks for the support from george_toolan in helping me narrow down the fault, and hopefully someone else will find this info somewhat useful.