Second Appeal for Help (Kernel upgrade makes one Windows share inaccessible)



billquinn
24th December 2008, 02:11 AM
Three or four weeks ago, I started a new thread and asked some questions. I checked again tonight, and though it got 100 views, there were zero responses. Perhaps it slipped through the cracks because of the right people not being online at the right time.

So as not to double post, I've deleted that thread. However, the issue is still very present, and I have not been able to solve it. It is affecting my ability to remain up to date with the latest kernel. Just tonight I did a yum upgrade, and another new kernel was installed, 2.6.27.7-53.fc9.i686. This is the third kernel that is giving me this problem. It's a great mystery, because all I have to do is go back to 2.6.26.6-79.fc9.i686, and the problem disappears.

Here's the problem:

(1) I have a number of VFAT Windows partitions on my disk drive. All of them can be accessed except one! df lists the partition that can't be accessed as /dev/sda6. sda2 and sda5 can be accessed with no problem. (I don't know why there is no sda3 or sda4.)

(2) I have another computer connected through a router which is a pure Linux machine (no Windows partitions). After mounting its drive using nfs, I can't access that drive either.

(3) The problem partition on the main computer, /dev/sda6, cannot be accessed whether I try to access it from a terminal using ls, from File Browser, or from an application like OpenOffice. The same thing happens in all cases: the process hangs. In the terminal window, CTRL-C does not abort the attempt, but the terminal window can be closed using the "X" in the upper right-hand corner. File Browser hangs so badly that the "X" does not close it: I must use the "force-quit" popup window.

(4) After any of the attempts described in (3), System->Shut Down->Restart does not work properly. The screen clears normally, but the command-line screen with "localhost1 login:" that usually appears for only moments now stays indefinitely. The only way to restart the system is to use the restart button on the computer case.

All these problems simply disappear when I reboot using the old kernel. Does anyone have any ideas?

Thanks.

Bill

scottro
24th December 2008, 02:47 AM
I'm afraid I can't help, but I've edited the thread title. That gives it a much better chance of being seen by someone with the ability to help.
(Note that threads with titles like help and other generic type things are more likely to be overlooked.)

stlouis
24th December 2008, 04:09 PM
What happens when you try to mount/unmount your Windows partition manually? Does it give you any errors?

Can you post the output of fdisk -l?

Use the following command to mount the troubled Windows Partition:

mount -t vfat -o iocharset=utf8,umask=000 /dev/<your drive> /media/<mount point>

Does that make a difference?


Are your logs telling you anything? Have you checked root's mail to see if you are getting any errors in the reports it is sending you?
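
For the log check, here is a minimal sketch of one way to narrow the output down. It assumes the kernel messages are available through dmesg and, as on a stock Fedora install, in /var/log/messages. The function name filter_mount_messages and the keyword list are made up for illustration and may need adjusting.

```shell
#!/bin/sh
# Keep only the log lines that mention the problem device or the
# filesystems involved. The keyword list is an assumption: "fat" also
# matches "vfat" (and unrelated words such as "fatal").
filter_mount_messages() {
    grep -i -E 'sda6|fat|nfs'
}

# Usage (as root, on the main computer):
#   dmesg | filter_mount_messages
#   filter_mount_messages < /var/log/messages
```

Running it right after a failed access attempt should keep the relevant lines near the end of the output.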

David Becker
24th December 2008, 04:14 PM
(1) I have a number of VFAT Windows partitions on my disk drive. All of them can be accessed except one! df lists the partition that can't be accessed as /dev/sda6. sda2 and sda5 can be accessed with no problem. (I don't know why there is no sda3 or sda4.)

Show us your partition table:

fdisk -l /dev/sda

You've probably mounted an extended partition while you should only mount primary and logical partitions.

David
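
To check this against the fdisk listing, something like the following could pick out the extended partitions so they can be compared with the devices in /etc/fstab. This is only a sketch: list_extended_partitions is a hypothetical helper, and it assumes the classic fdisk -l column layout (Device, Boot, Start, End, Blocks, Id, System).

```shell
#!/bin/sh
# Print the device names that "fdisk -l" reports as extended partitions
# (partition type Id 5, f or 85). Reads fdisk output on standard input.
list_extended_partitions() {
    awk '$1 ~ /^\/dev\// {
        # The Boot column holds "*" only for the active partition, which
        # shifts the Id column one field to the right.
        id = ($2 == "*") ? $6 : $5
        if (id == "5" || id == "f" || id == "85") print $1
    }'
}

# Usage (as root):
#   fdisk -l /dev/sda | list_extended_partitions
```

Any device this prints should not appear in the first column of /etc/fstab.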

billquinn
24th December 2008, 05:20 PM
Thanks St.Louis and David!

Here's the fstab file:

/dev/VolGroup00/LogVol00                   /            ext3    defaults        1 1
UUID=99881b3e-77e1-43f5-affa-afd24010c13f  /boot        ext3    defaults        1 2
tmpfs                                      /dev/shm     tmpfs   defaults        0 0
devpts                                     /dev/pts     devpts  gid=5,mode=620  0 0
sysfs                                      /sys         sysfs   defaults        0 0
proc                                       /proc        proc    defaults        0 0
/dev/VolGroup00/LogVol01                   swap         swap    defaults        0 0
/dev/sda1                                  /mnt/DriveC  auto    umask=0000      0 0
/dev/sda2                                  /mnt/DriveG  auto    umask=0000      0 0
/dev/sda5                                  /mnt/DriveD  auto    umask=0000      0 0
/dev/sda6                                  /mnt/DriveE  auto    umask=0000      0 0

I've been using this same file since about FC4, so I doubt that it is the problem. I do mount the drive from my second computer (a Linux-only machine) manually. And, of course, all this works fine just by going back to kernel 2.6.26.6-79.fc9.i686. However, I'll try the manual mount you suggest for the vfat partition on my main computer, if you still think that will help find the problem. I'd have to reboot to get the newest kernel running in order to put it to the test.

Here's the result of df:

Filesystem                       1K-blocks     Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00   16223664  8170372   7229180  54% /
/dev/sda7                           194442    32119    152284  18% /boot
tmpfs                               517380       84    517296   1% /dev/shm
/dev/sda1                          1027856    11888   1015968   2% /mnt/DriveC
/dev/sda2                          6132864  3829760   2303104  63% /mnt/DriveG
/dev/sda5                          2047968   670112   1377856  33% /mnt/DriveD
/dev/sda6                          2047968  1930400    117568  95% /mnt/DriveE
/dev/sr0                            622954   622954         0 100% /media/SerwayPSE7e_V2_1

Finally, here's the result of fdisk -l /dev/sda:

Disk /dev/sda: 30.7 GB, 30758289408 bytes
255 heads, 63 sectors/track, 3739 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x33243323

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         128     1028128+   6  FAT16
/dev/sda2             129         893     6144862+  1b  Hidden W95 FAT32
/dev/sda3             894        3739    22860495    5  Extended
/dev/sda5             894        1148     2048256    6  FAT16
/dev/sda6            1149        1403     2048256    6  FAT16
/dev/sda7            1404        1428      200781   83  Linux
/dev/sda8            1429        3739    18563076   8e  Linux LVM

Thanks again.
Bill

billquinn
29th December 2008, 03:02 PM
A new kernel appeared with my yum upgrade today--kernel 2.6.27.9-73. Same problem persists. However, as Stlouis suggested, I tried to mount /dev/sda6 manually while running the new kernel.

I first had to umount /dev/sda6 because fstab mounts it at boot up. It would not umount. The message said it was busy. I assume this is because it is also mounted onto the file system of my second computer (it is always mounted onto that computer; the partition, however, is physically on the disk drive on my main computer that is giving me all this trouble).

I rebooted into my old reliable kernel 2.6.26.6-79. Trying to umount /dev/sda6 produced the same message: it was busy. So the two kernels are consistent in this respect.

However, the annoying difference between the kernels still remains: with kernel 2.6.26.6-79 I can access the partition from my main computer (which I have always been able to do), but with all the new kernels I cannot!!!

I still have been able to make no sense of all this.

billquinn
29th December 2008, 05:32 PM
As a follow up to my previous post:

I turned off both computers, then rebooted only my main computer that has the /dev/sda6 partition--using the newest kernel. NOW I am able to access that partition!!

Next I booted up the second computer, which mounts /dev/sda6 onto its file system. That computer also has an fstab file to do this mounting at boot up. With the new kernel running on the main computer, booting the second computer did not work properly. At the nfs step, it was unable to mount /dev/sda6. Nevertheless, once it finished booting, I could no longer access /dev/sda6 on the main computer, which had just been working before booting the second computer.

????????????????

Could it be that these new kernels are not handling nfs correctly? I must sound like a stuck record, but it bears repeating: none of these problems occur when the main computer is running kernel 2.6.26.6-79.

stlouis
29th December 2008, 05:35 PM
OK... We need to get some sort of error, so we can narrow down exactly where your problem lies.

First things first now... Modify your "fstab mounts" so that /dev/sda6 does NOT get mounted at all...
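
One cautious way to do that is to comment the line out rather than delete it. The sketch below is only an illustration: disable_fstab_entry is a hypothetical helper that works on a copy of the file, and the /tmp/fstab.new path is an arbitrary choice.

```shell
#!/bin/sh
# Comment out the fstab line for one device. Reads fstab text on standard
# input and writes the edited text to standard output; nothing is changed
# in place.
disable_fstab_entry() {
    device=$1
    awk -v dev="$device" '$1 == dev { print "#" $0; next } { print }'
}

# Usage (as root; review the result before copying it over /etc/fstab):
#   disable_fstab_entry /dev/sda6 < /etc/fstab > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
```

Keeping the original line as a comment makes it easy to restore once the testing is done.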


*** Check your logs for anything at all related to the mounting of /dev/sda6... What do they say, if anything?


Once you modified your "fstab mounts", boot your server, then manually try to mount the drive... What error do you get?

After you attempt a mount, what Exit Code are you left with? To find this, simply type the following:

echo $?


You can look at the MAN page for mount to see what the Error Code corresponds to.

*** If you mount with the "-f" option, this will go through the process, but NOT actually mount... It will fake it...

*** You may also have to specify that it is a FAT16 table, and override the auto-detection of this... CHECK MAN PAGE for SPECIFICS...
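
To save a trip to the man page, here is a rough sketch that turns the exit code into text. The meanings are the ones listed in the Linux mount man page, where the code is a bit mask and several conditions can be combined. The function name describe_mount_status is made up for this example.

```shell
#!/bin/sh
# Decode the exit status of mount(8). The status is a bit mask, so more
# than one line may be printed for a single value.
describe_mount_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "success"
        return 0
    fi
    for entry in \
        "1:incorrect invocation or permissions" \
        "2:system error (out of memory, cannot fork, no more loop devices)" \
        "4:internal mount bug" \
        "8:user interrupt" \
        "16:problems writing or locking /etc/mtab" \
        "32:mount failure" \
        "64:some mount succeeded"
    do
        bit=${entry%%:*}    # number before the first colon
        text=${entry#*:}    # description after the first colon
        if [ $((status & bit)) -ne 0 ]; then
            echo "$text"
        fi
    done
}

# Usage (as root):
#   mount -t vfat /dev/sda6 /mnt/DriveE
#   describe_mount_status $?
```

The call has to come immediately after mount, because $? is overwritten by every later command.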


Let me know the results of this.

billquinn
29th December 2008, 08:30 PM
The steps I followed:

(1) With the old kernel (the one that works) running on the main computer, I booted the second computer. Its fstab file mounted /dev/sda6 onto its file system successfully, as always in the past (because the "good" kernel was running on the main computer).

(2) I removed the line from the fstab file on the main computer that mounts /dev/sda6 onto its file system.

(3) I rebooted the main computer with the new kernel.

(4) Then I manually mounted /dev/sda6 as follows:

mount -t vfat /dev/sda6 /mnt/DriveE

The error code returned was 0 ("success").

(5) Lo and behold! I am able to access /dev/sda6 on the main computer!

(6) I went to the second computer to see if it could still access /dev/sda6. It could not. Message: "stale NFS file handle."

(7) I rebooted second computer. Boot process switched to details with an nfs error: "Stale NFS file handle." This surprised me, because the main computer had the partition successfully mounted at the time the second computer was rebooted. In the past, the second computer always mounted this partition if booted when the main computer was on.

(8) I'm not sure what log files I should check or where they are found.

stlouis
29th December 2008, 09:47 PM
OK, it looks like we're making some progress here. Just to RECAP:

1) You are now able to manually mount the "problem" partition using the new Kernel, that would NOT previously auto-mount via the /etc/fstab

2) After manually mounting the partition (with new Kernel), you still cannot access the mounted partition from a Remote PC via an NFS Share.

*** We will deal with auto-mounting via the /etc/fstab, after we figure out the NFS issue...

This type of error message is seen when a file or directory that was opened by an NFS client is removed, renamed, or replaced.

To fix this problem, the NFS file handles must be renegotiated. Try one of these on the client machine:

a) Unmount and remount the file system. On systems whose mount has an overlay option (-O on Solaris; on Linux, -O means something else), that option may be needed.

From the Solaris mount man page:
-O Overlay mount. Allow the file system to be
mounted over an existing mount point, making
the underlying file system inaccessible. If a
mount is attempted on a pre-existing mount point
without setting this flag, the mount will fail,
producing the error "device busy".

b) Kill or restart the process trying to use the nonexistent files.

c) Create another mount point and access the files from the new mount point.

d) RESTART NFS Client

e) Reboot the client having problems


*** This may help you some, if the above does NOT work...

http://www.cyberciti.biz/tips/nfs-stale-file-handle-error-and-solution.html
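
As a rough sketch of option (a) on a Linux client: the helper names run and remount_nfs_share are made up for this example, and the server name mainpc and both paths are placeholders, not values taken from this thread. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remount an NFS share on the client so the file handles are renegotiated.
remount_nfs_share() {
    server_export=$1   # e.g. mainpc:/mnt/DriveE (placeholder)
    mount_point=$2     # e.g. /mnt/DriveE on the client (placeholder)
    # A lazy unmount detaches the mount point even when the stale handle
    # makes a normal umount report "busy".
    run umount -l "$mount_point"
    run mount -t nfs "$server_export" "$mount_point"
}

# Usage (as root, on the client):
#   DRY_RUN=1; remount_nfs_share mainpc:/mnt/DriveE /mnt/DriveE   # preview
#   DRY_RUN=0; remount_nfs_share mainpc:/mnt/DriveE /mnt/DriveE   # for real
```

The preview run is worth doing first, to confirm the export name and mount point are the intended ones.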


hope this helps

billquinn
29th December 2008, 11:24 PM
In attempting to get the partition mounted on the second computer, I at first tried all of your suggestions with the main computer still running with the manually-mounted partition. None worked. Then I shut the second computer off. After doing that, I rebooted the main computer and again manually mounted the partition (when I did this the first time, the second computer was on). Then I turned the second computer back on and rebooted it. The boot process went through with no nfs errors. However, it did not come up correctly. After logging in, the icons on the desktop did not appear. The restart and shutdown from the menus did not work correctly either.

So we are still left with several mysteries:

(1) NFS on the second computer cannot deal with the mounted vfat partition on the main computer's disk drive when the new kernel is running on the main computer.

(2) When the new kernel is running, the fstab file on the main computer cannot properly mount the partition.

This is all very strange. I've never had anything like this happen before, and I've been using this same setup between the two computers for four or five releases of Fedora.

One other fact might be pertinent. The second computer is still running FC4. Its kernel is 2.6.17-1.2142_FC4.