Fedora 10 NFS root boot and RedHat nash 6.0.71-4.fc10
FedoraForum.org - Fedora Support Forums and Community
  1. #1
    Join Date
    May 2009
    Posts
    15

    Fedora 10 NFS root boot and RedHat nash 6.0.71-4.fc10

    Hi all.

    Within the last couple of weeks I installed Fedora 10 on three of my Pentium 4 computers.

    I have been running Windows XP on them while I have been in school for the last 7 years, but they have also had Fedora 3 and 4 running on them at various times.


    Background:

    In the past, when Fedora was still RedHat, I had versions 6 through 8 running on all of my machines, which were a mix of 486s, Pentium IIs, and Pentium IIIs. They could all boot over Ethernet (10Base-T) from a single server (an old 486) acting as a simple file server using DHCP, TFTP, and NFS. They could all run the complete RedHat suite of programs without needing any additional disk space. I used them as a small cluster running pvm, mosix, or mpi to speed up certain jobs and spread the workload by running tasks concurrently or in parallel on all of the machines at once.

    My Fedora 4 box runs dhcpd to configure the clients' network interfaces, and it serves various kernels and initrd images to the clients over Ethernet using tftp and PXE boot.
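
    A small sketch of how a per-client PXE configuration file might be named, assuming PXELINUX is the boot loader (the post only says "pxe boot", so this is an assumption). Among other names, PXELINUX looks under pxelinux.cfg/ for a file named after the client's IPv4 address written as eight uppercase hex digits. The function name ip_to_hex is made up for illustration.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to the 8-digit uppercase hex name
# that PXELINUX tries under pxelinux.cfg/.
# The body runs in a subshell, so the IFS change does not leak out.
ip_to_hex() (
    IFS=.
    # Split the address on dots into four positional parameters.
    set -- $1
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
)

ip_to_hex 192.168.50.12   # prints C0A8320C
```

    With that name, each client could get its own kernel, initrd, and boot parameters without touching the other clients' files.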

    Fedora 10 initially complained that there was a security problem with the dhcp protocol and failed to configure it, but I see that a new update just came in which might have fixed the problem.

    I don't know what cobblerd does, but it looks like it has something to do with dhcp protocols. I seem to be having problems with the help screens in gnome finding the correct links to the documentation pages. They almost all say that the page has been moved or is not found. This is true because I have found the pages externally in new or different directories than the ones described in the help page links.

    On one of the Fedora 10 boxes I copied the complete root partition to the /tftpboot/Fedora10 directory. This serves a dual purpose as a backup to my primary system, and as a read only file server for my nfs clients.

    Each of the clients has its own individual directory structure, where I hard linked the files in places like /bin and /sbin to the Fedora10 files and left mount points for /lib and /usr. The /etc, /home, /tmp, and /var directories are unique and writeable for each client. Each client tree is installed at /tftpboot/192.168.50.X, where X is a number from 1 to 30 identifying the client.
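
    The layout described above can be sketched roughly as follows. This is only an illustration run in a scratch directory with stub files; the real trees would live under /tftpboot, and the exact set of hard-linked directories is whatever the poster actually chose. Note that hard links only work when the shared tree and the client tree are on the same filesystem.

```shell
#!/bin/sh
# Demo in a scratch directory; nothing under /tftpboot is touched.
demo=$(mktemp -d)
shared="$demo/Fedora10"
client="$demo/192.168.50.12"

# Stand-in for the shared read-only tree (stub files, not real binaries).
mkdir -p "$shared/bin" "$shared/sbin"
echo 'stub' > "$shared/bin/sh"
echo 'stub' > "$shared/sbin/init"

# Build one client tree.
for d in bin sbin; do
    mkdir -p "$client/$d"
    # ln without -s creates hard links, so no file data is duplicated.
    ln "$shared/$d"/* "$client/$d/"
done
# Empty mount points for the directories mounted from the shared tree.
mkdir -p "$client/lib" "$client/usr"
# Per-client writable directories.
mkdir -p "$client/etc" "$client/home" "$client/tmp" "$client/var"

# The second field of ls -l is the link count; a hard-linked file shows 2.
links=$(ls -l "$client/bin/sh" | awk '{print $2}')
echo "link count: $links"

rm -rf "$demo"
```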

    The problem:

    It appears that the nfsroot options are not enabled in the vanilla Fedora 10 kernels. I thought that I downloaded all of the development packages and kernel source at installation time, but when I attempted to compile the kernel, the compile crapped out with an error related to "missing syscalls."

    So I downloaded the same kernel version (2.6.27.21) from kernel.org, and compiled that with the nfsroot options turned on. The problem I have now is figuring out the right boot parameters, and mkinitrd options to connect to and mount the remote file server disk.

    It appears that there are currently some ongoing issues or development cycles with NFSv4 and with the kernel's dhcp and bootp processes. It is not documented in the kernel documentation (nfs or bootparams), but the Fedora 4 pxe setup configures an nfs root disk with the kernel parameter

    method=nfs:192.168.50.8:/tftpboot/192.168.50.12 ip=dhcp

    There is no root=anything kernel configuration parameter, and I believe the nfsroot= parameter is now deprecated.
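
    For comparison, the kernel's own nfsroot documentation for 2.6-series kernels describes the parameters root=/dev/nfs, nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>], and ip=. The sketch below just assembles such a command line from the addresses used in this thread. It only takes effect when the NFS client and IP autoconfiguration are built into the kernel, and the helper name nfsroot_cmdline is made up for illustration.

```shell
#!/bin/sh
# Build a kernel command line for an NFS root mounted by the kernel itself.
# Parameter syntax per the kernel's nfsroot documentation:
#   root=/dev/nfs nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>] ip=dhcp
nfsroot_cmdline() {
    server=$1
    export_dir=$2
    printf 'root=/dev/nfs nfsroot=%s:%s ip=dhcp\n' "$server" "$export_dir"
}

nfsroot_cmdline 192.168.50.8 /tftpboot/192.168.50.12
```

    The resulting string would go on the kernel line of the boot loader configuration in place of the method=nfs: parameter.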

    With this option (method=nfs), and a straight initrd startup script, I can see mount and unmount requests appear in /var/log/messages on the remote server when booting the test client. However, bin/nash aborts with some cryptic output messages, or goes into some kind of infinite loop trying to mount / remount the remote file system. According to the RedHat nash documentation, the mount command no longer supports nfs mounts. So I tried loading mount.nfs into the bin directory of initrd, and replacing the "mount /sysroot" with a mount.nfs variant, but that didn't work either.

    I notice just before the mount commands occur in the init script in the initrd file, there is a command passed to nash to configure the network. The kernel has already configured the network parameters previously with a dhcp probe, but the command in the initrd script apparently overrides the probe.

    network --device eth0 --bootproto none --ip 192.168.50.8 --netmask 255.255.255.224 --gateway 192.168.50.1 --domain "TfJC files.csupomona.edu" --dns 192.168.1.1

    This command appears to be an undocumented feature of /sbin/nash. I have been unable to find any reference to it in any nash documentation.
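
    As a side check on the network command above, the netmask 255.255.255.224 can be worked out by hand: it has 27 one-bits, which leaves 5 host bits, so 32 addresses of which 30 are usable. That matches the client numbering of 1 to 30 used for the /tftpboot directories. A minimal sketch of the arithmetic (function names are made up for illustration):

```shell
#!/bin/sh
# Count the set bits in a dotted-quad netmask to get the prefix length.
# The body runs in a subshell, so IFS and the loop variables stay local.
mask_to_prefix() (
    IFS=.
    set -- $1
    bits=0
    for octet in "$@"; do
        while [ "$octet" -gt 0 ]; do
            bits=$((bits + (octet & 1)))
            octet=$((octet >> 1))
        done
    done
    echo "$bits"
)

# Usable host addresses for a prefix length (network and broadcast excluded).
usable_hosts() {
    echo $(( (1 << (32 - $1)) - 2 ))
}

prefix=$(mask_to_prefix 255.255.255.224)
echo "prefix /$prefix, usable hosts: $(usable_hosts "$prefix")"
# prints: prefix /27, usable hosts: 30
```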

    I suspect that there is some development going on with this somewhere, but am unable to locate the development source for nash either.

    I believe I have the source code for nfsutils-1.1.6, kernel-2.6.27.21 source, and some variations of binutils, but I can't locate the source for /bin/nash. I see that a new nash, version 6.0.71-4.fc10, has come in from the updates, but its man page is still dated Aug 2, 2004.

    I have collected all of the console logs and placed them on my server in addition to various init scripts. You can find everything here:

    http://www.csupomona.edu/~cthompson1/badboots/

    I have tried various combinations of mkinitrd parameters, and removing / replacing / modifying the init script file manually with various results.

    Any help is welcome, especially in locating /bin/nash source code.

  2. #2
    After looking at my old linux-2.2.15-2.5.0 bootup scripts, it dawned on me that the kernel shouldn't need any initrd scripts or /bin/nash at all to mount the nfsroot file. It should all be handled by the kernel....

    Any necessary module loading can be done with modprobes after /sbin/init has started up!

    I'll dig into the kernel a little more.

  3. #3

    got's me mystified

    Ok. This is bugging me now. I've been messing around with the boot parameters again. I know that I had the root=/dev/nfs in this startup...

    http://www.csupomona.edu/~cthompson1...od=nfs,ip=dhcp)

    ... But, it doesn't show up in the log "Kernel parameters=" for some reason.

    This is the same kernel I am using now, but it is not mounting the nfsroot files. Shoot... can't remember which one of these things was sending mount requests to the server.

    Anyway, I've dumped a bunch more console logs into the web site with different kernel parameters.

    Nothing is working here now.

    I wonder if it has something to do with the device file mknod /dev/nfs c 0 255. I think I stuck this in one or some of the initrd files, and on the server.... But devices.txt says that /dev/nfs now is a soft link to socksys ( mknod socksys c 30 0 ) ...

    I'm getting tired. Gonna sleep on it a while.

  4. #4

    Fedora 10 NFS root boot and RedHat nash 6.0.71-4.fc10

    Here are those links again... Looks like something broke on the last post.

    These are the bootups that connected to the nfs server:

    http://www.csupomona.edu/~cthompson1...log.root=0:255
    http://www.csupomona.edu/~cthompson1...od=nfs,ip=dhcp)
    http://www.csupomona.edu/~cthompson1...log.method=nfs
    http://www.csupomona.edu/~cthompson1...s(noblankspace)

    Verified from /var/log/messages.

    http://www.csupomona.edu/~cthompson1...essages-12only

    What I am looking for here are mounts that are proximate in time, slightly earlier than the consolelog modified time, and after the boot time adjusted for TZ and GMT. The consolelog modify time lags because I had to manually remove gobs of blank space spit into the logs by /bin/nash.
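
    Picking one client's mount requests out of the server log can be sketched with a fixed-string grep. The log line format and the host name "server" in the sample below are assumptions for demonstration, not lines copied from the poster's /var/log/messages, and mounts_from is a made-up helper name.

```shell
#!/bin/sh
# Print the log lines that record a mount request from one client.
# Assumes lines of the form "... authenticated mount request from <ip>:<port> ...".
mounts_from() {
    client=$1
    logfile=$2
    # -F matches a fixed string, so the dots in the IP are literal.
    # Including "authenticated " keeps "unmount request" lines out.
    grep -F "authenticated mount request from $client:" "$logfile"
}

# Hypothetical sample log, for demonstration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
May 10 21:14:02 server mountd[2211]: authenticated mount request from 192.168.50.12:1014 for /tftpboot/192.168.50.12 (/tftpboot/192.168.50.12)
May 10 21:14:09 server mountd[2211]: authenticated unmount request from 192.168.50.12:1016 for /tftpboot/192.168.50.12 (/tftpboot/192.168.50.12)
May 10 21:20:41 server mountd[2211]: authenticated mount request from 192.168.50.11:1009 for /tftpboot/192.168.50.11 (/tftpboot/192.168.50.11)
EOF

mounts_from 192.168.50.12 "$sample"
rm -f "$sample"
```

    The timestamps on the matching lines can then be compared against each console log's boot time.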

    Here is a log that I missed that still has all of the white space in it.

    http://www.csupomona.edu/~cthompson1...nfsroot,ip=off

    The funny thing here is that all of the mounts on the server occurred when the client had an initrd script loaded??? None of the *.noinitrd (kernel only) bootups made a successful mount.

    All of the files I've collected to date are stored here:

    http://www.csupomona.edu/~cthompson1/badboots/

    Hope these links stay in one piece this time!

    p.s.
    I found these nash locations on my computer. I'm going to try a yum local install to see if there is any source here.

    http://www.csupomona.edu/~cthompson1...nash.locations

    Good Luck!

  5. #5

    Fedora 10 NFS root boot and RedHat nash 6.0.71-4.fc10.abw

    ok.

    Guess we don't like paren's here.

    I'll rename 'em with brackets instead and see if that works.

  6. #6

    nash source

    Found the nash source

    git://git.fedoraproject.org/git/hosted/mkinitrd

  7. #7

    Nash source

    And by the way... It has modules to mount nfs v3 and v4...

    Stay tuned!

  8. #8

    Almost Solved - Fedora 10 NFS root boot and RedHat nash 6.0.71-4.fc10

    If I had bothered to look at my kernel .config file, I would have immediately noticed that the fs / nfs options had the "m" option selected, which compiles them as modules.... That is why I needed the initrd script to make a connection. To nfs root boot a kernel without initrd means you can't have the code floating around in an un-installed module.

    http://www.csupomona.edu/~cthompson1...iguration-file

    I thought I had selected "compile into kernel," but missed the dot at the top.

    "Make xconfig" should look like this when done correctly, and the mount root nfs option will magically appear. Make sure you have check marks, and not dots in the select boxes.



    Once I selected this and recompiled the kernel, I removed the initrd file from the boot parameters, and "oh happy days" received a mount request on the server.
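
    The same check can be made from the command line instead of reading xconfig's check marks. This is a rough sketch, assuming the usual option names CONFIG_IP_PNP, CONFIG_IP_PNP_DHCP, CONFIG_NFS_FS, and CONFIG_ROOT_NFS; the network card driver would also have to be built in, which this does not check. The function name and the sample fragment are made up for illustration.

```shell
#!/bin/sh
# Report whether a kernel .config has the options an initrd-less NFS root
# needs built in (=y). A value of =m (module) is not enough without an initrd.
check_nfsroot_config() {
    config=$1
    status=0
    for opt in CONFIG_IP_PNP CONFIG_IP_PNP_DHCP CONFIG_NFS_FS CONFIG_ROOT_NFS; do
        if grep -q "^$opt=y\$" "$config"; then
            echo "$opt: built in"
        else
            echo "$opt: NOT built in"
            status=1
        fi
    done
    return $status
}

# Hypothetical config fragment reproducing the mistake described above:
# NFS client support selected as a module, so ROOT_NFS never appears.
bad=$(mktemp)
printf '%s\n' 'CONFIG_IP_PNP=y' 'CONFIG_IP_PNP_DHCP=y' 'CONFIG_NFS_FS=m' > "$bad"
check_nfsroot_config "$bad" || echo "this kernel needs an initrd to mount an NFS root"
rm -f "$bad"
```

    Pointing the function at the real .config in the kernel source directory before running make would catch the "dot instead of check mark" mistake early.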

    Unfortunately that is as far as I got. A message was displayed that said something like "warning no tty available". I'll try to post some more consolelogs later if anyone is interested.

    This is some kind of problem booting up directly from the disk without the help of an initrd file, and it looks like rc.sysinit or the kernel or /sbin/init maybe forgets to load some important modules or device files before mounting nfs drives.

    This is something I need to work out. Lots of /etc/rc.d/init.d/xxx scripts depend on the network being up first... which by the way already is, because of the nfsroot capability. eth0, route and all that is already configured.


    If you compile NFS as modules, RedHat nash has all the code necessary to load the modules, configure the network, and make the nfs client request to the server. You may still need the "root file system on NFS" option selected, which seems to require that nfs client support be built into the kernel rather than compiled as a module.

    After messing around with the bin/nash init script in the initrd file, I was able to mount the root directory on the server and get a bash prompt in single user mode.

    The latest init script is at the end of this post. My mkinitrd program seems to have left a couple of things out.

    First I changed the --ip value in the network command from the server's ip address to the client's ip address.

    That didn't help.

    Second I added a couple of mknod's to create /dev/nfs and /dev/nfsd, and changed the mount command to the normal format "mount -t nfs 192.168.50.8:/tftpboot/192.168.50.12 /sysroot"... Don't know what these do, or which ones are actually necessary, but this is when I got /sbin/init to run and start initializing the operating system to give me a bash prompt.

    Everything would probably work now, but I have elected to leave /usr empty and mount it later using nfs as read-only. This is obviously giving me headaches now, because there is a bunch of stuff that /sbin/init wants from there, and it won't mount until lockd, or rpc.statd is up.

    So I can't figure out why I can mount the root file system, but not the secondaries. The next step is to see if nash can do the job again before the switchroot command is executed.
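
    One possibility for the secondary mounts, offered as a hedged sketch rather than a confirmed fix: the standard NFS mount option nolock keeps file locks local, so the mount does not depend on lockd or rpc.statd being up. The export path /tftpboot/Fedora10/usr is an assumption based on the layout described earlier, the helper name is made up, and the thread does not establish whether nash's built-in mount accepts this option. The function only prints the command; it does not run it.

```shell
#!/bin/sh
# Print (not run) a read-only NFS mount command that skips NLM locking.
# "nolock" is a standard NFS mount option; with it the client does not
# need lockd or rpc.statd running at mount time.
usr_mount_cmd() {
    server=$1
    export_dir=$2
    target=$3
    echo "mount -t nfs -o ro,nolock $server:$export_dir $target"
}

# /tftpboot/Fedora10/usr is an assumed export path, not taken from the thread.
usr_mount_cmd 192.168.50.8 /tftpboot/Fedora10/usr /usr
```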

    This is the latest init script that worked with /bin/nash.

    echo starting nash 0.0.0.2 changed network --ip 192.168.50.12 added /dev/{nfs,nfsd} changed mount -t nfs
    mount -t proc /proc /proc
    setquiet
    echo Mounting proc filesystem
    echo Mounting sysfs filesystem
    mount -t sysfs /sys /sys
    echo Creating /dev
    mount -o mode=0755 -t tmpfs /dev /dev
    mkdir /dev/pts
    mount -t devpts -o gid=5,mode=620 /dev/pts /dev/pts
    mkdir /dev/shm
    mkdir /dev/mapper
    echo Creating initial device nodes
    mknod /dev/nfs c 0 255
    mknod /dev/nfsd c 3 0

    mknod /dev/null c 1 3
    mknod /dev/zero c 1 5
    mknod /dev/systty c 4 0
    mknod /dev/tty c 5 0
    mknod /dev/console c 5 1
    mknod /dev/ptmx c 5 2
    mknod /dev/fb c 29 0
    mknod /dev/tty0 c 4 0
    mknod /dev/tty1 c 4 1
    mknod /dev/tty2 c 4 2
    mknod /dev/tty3 c 4 3
    mknod /dev/tty4 c 4 4
    mknod /dev/tty5 c 4 5
    mknod /dev/tty6 c 4 6
    mknod /dev/tty7 c 4 7
    mknod /dev/tty8 c 4 8
    mknod /dev/tty9 c 4 9
    mknod /dev/tty10 c 4 10
    mknod /dev/tty11 c 4 11
    mknod /dev/tty12 c 4 12
    mknod /dev/ttyS0 c 4 64
    mknod /dev/ttyS1 c 4 65
    mknod /dev/ttyS2 c 4 66
    mknod /dev/ttyS3 c 4 67
    /lib/udev/console_init tty0
    daemonize --ignore-missing /bin/plymouthd
    plymouth --show-splash
    echo Setting up hotplug.
    hotplug
    echo Creating block device nodes.
    mkblkdevs
    echo Creating character device nodes.
    mkchardevs
    echo Bringing up eth0
    echo network --device eth0 --bootproto none --ip 192.168.50.12 --netmask 255.255.255.224 --gateway 192.168.50.1 --domain "TfJC files.csupomona.edu" --dns 192.168.1.1
    network --device eth0 --bootproto none --ip 192.168.50.12 --netmask 255.255.255.224 --gateway 192.168.50.1 --domain "TfJC files.csupomona.edu" --dns 192.168.1.1
    mkblkdevs
    resume /swapfile
    echo Creating root device.
    echo mkrootdev -t nfs -o defaults,ro 192.168.50.8:/tftpboot/192.168.50.12
    mkrootdev -t nfs -o defaults,ro 192.168.50.8:/tftpboot/192.168.50.12
    echo Mounting root filesystem.
    echo mount -t nfs 192.168.50.8:/tftpboot/192.168.50.12 /sysroot
    mount -t nfs 192.168.50.8:/tftpboot/192.168.50.12 /sysroot
    echo cond -ne 0 plymouth --hide-splash
    cond -ne 0 plymouth --hide-splash
    echo Setting up other filesystems.
    echo setuproot
    setuproot
    echo loadpolicy
    loadpolicy
    echo plymouth --newroot=/sysroot
    plymouth --newroot=/sysroot
    echo Switching to new root and running init.
    echo switchroot
    switchroot
    echo Booting has failed.
    sleep -1
