The order in which the protocols are used (from memory; I don't have a sniffable sun handy anymore):

  1. rarp
  2. tftp
  3. bootparams
  4. nfs
  5. bootparams (yeah, it uses bootparams more than once)

Troubleshooting procedure with a standard sniffer (this is the easy way, when it works):

  1. Start up a sniffer, like ethereal or snoop. Sniff on the ethernet address if you want to see rarp packets too. If you don't care about rarp packets, go ahead and sniff on the hostname.
  2. Boot the sun diskless.
  3. Watch what comes up in the sniffer.
  4. 99 times out of 100, the last thing exchanged between the sun diskless client and the boot server is the thing you need to troubleshoot. Then proceed as below for just that protocol.
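
The "last thing exchanged" rule can be sketched in a few lines of Python. This is only an illustration: it assumes you have already reduced the capture to a list of protocol names in the order they appeared, and the function name is made up for this example. The boot order is the one listed above, which is from memory.

```python
# Boot order as listed above (recalled from memory in these notes).
BOOT_ORDER = ["rarp", "tftp", "bootparams", "nfs", "bootparams"]


def stage_to_troubleshoot(observed):
    """Return (step_number, protocol) for the last boot step seen in a capture.

    `observed` is a sequence of protocol names in capture order; repeats are
    fine, since each step usually involves many packets.
    Returns None when no boot-related protocol was seen at all.
    """
    step = -1
    for proto in observed:
        proto = proto.lower()
        if step >= 0 and proto == BOOT_ORDER[step]:
            continue  # more packets from the step we are already in
        if step + 1 < len(BOOT_ORDER) and proto == BOOT_ORDER[step + 1]:
            step += 1  # the client moved on to the next step
        # anything else (arp, unrelated traffic) is ignored
    if step < 0:
        return None
    return step + 1, BOOT_ORDER[step]
```

If this returns (2, "tftp"), the client got as far as tftp and no further, so tftp is the place to start looking. A None result means the client and boot server never talked where the sniffer could see it.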

The decaying usefulness of sniffers

Switches have made sniffers a lot less useful, as they only repeat packets on "needed" interfaces. You can still see broadcast packets on the subnet, which is sometimes enough, sometimes not. It can also help to start up a sniffer on the boot server itself, as the two endpoints of the conversation should see the packets they exchange. ettercap is a sniffer that claims to be able to get around switches by using arp poisoning, but it's never worked as advertised for me. Finally, sometimes you can dig up an old hub (not a switch) and insert the hub into the network above the diskless client, with a laptop hung off the same hub. You can then sniff well from the laptop.

Without any sniffer at all next to the boot client

  1. If the machine knows its IP address, it probably rarp'd successfully.
  2. If the machine counts up in hex for a while, it probably tftp'd at least some of what it needs. If it moves on to another step after that (like the spinner), it's probably tftp'd fine.
  3. If the machine knows a pathname to a root filesystem, it's probably getting bootparams at least sometimes.
  4. If you get kernel boot messages (like you would see if booting from a hard disk), you're getting NFS.
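
The checklist above amounts to "find the furthest symptom you saw, then suspect the next protocol". Here is a rough Python sketch of that reasoning. The symptom names are invented labels for this example, not anything the sun prints.

```python
# (symptom label, protocol it suggests is working), in boot order.
# The labels are hypothetical names for the four observations listed above.
SYMPTOMS = [
    ("knows_ip_address", "rarp"),
    ("counts_up_in_hex", "tftp"),
    ("knows_root_path", "bootparams"),
    ("kernel_boot_messages", "nfs"),
]


def next_suspect(seen):
    """Return the protocol to troubleshoot first, or None if all symptoms appeared.

    `seen` is a set of symptom labels. The furthest symptom observed is taken
    as evidence that everything up to it probably worked.
    """
    furthest = -1
    for index, (symptom, _proto) in enumerate(SYMPTOMS):
        if symptom in seen:
            furthest = index
    if furthest + 1 < len(SYMPTOMS):
        return SYMPTOMS[furthest + 1][1]
    return None
```

For example, a client that knows its IP address but never counts up in hex gives next_suspect({"knows_ip_address"}), which points at tftp.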

Troubleshooting rarp

Start up rarpd -d or in.rarpd -d, then boot the diskless client. If it says nothing, the boot client and boot server aren't talking. If it outputs errors, take a guess at what they mean and proceed accordingly. If you have a sniffer, make sure the ethernet address sent in the rarp packet matches the ethernet address you expected. If rarpd -d says it can't find your host, check /etc/ethers for the host's entry, and check for duplicates in /etc/ethers. rarp is a broadcast. It's easy to forward rarp through a cisco, at least for cisco people.
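
Checking /etc/ethers for duplicates by eye gets tedious on a big file, partly because the same address can be written with or without leading zeros. A minimal Python sketch of the check follows. It assumes the usual "ethernet-address hostname" layout with "#" comments; the function names are made up for this example.

```python
from collections import defaultdict


def normalize_mac(mac):
    """Turn 8:0:20:a:b:c and 08:00:20:0A:0B:0C into the same lowercase form."""
    parts = mac.strip().split(":")
    if len(parts) != 6:
        raise ValueError(f"not an ethernet address: {mac!r}")
    return ":".join(f"{int(part, 16):02x}" for part in parts)


def find_duplicates(ethers_text):
    """Return (dup_macs, dup_hosts): dicts mapping each duplicated key to its line numbers."""
    by_mac, by_host = defaultdict(list), defaultdict(list)
    for lineno, line in enumerate(ethers_text.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or incomplete line
        try:
            mac = normalize_mac(fields[0])
        except ValueError:
            continue  # skip anything that is not address + hostname
        by_mac[mac].append(lineno)
        by_host[fields[1].lower()].append(lineno)
    dup_macs = {mac: lines for mac, lines in by_mac.items() if len(lines) > 1}
    dup_hosts = {host: lines for host, lines in by_host.items() if len(lines) > 1}
    return dup_macs, dup_hosts
```

Feed it the contents of /etc/ethers, for instance find_duplicates(open("/etc/ethers").read()), and look at any line numbers that come back.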

Troubleshooting tftp

Change inetd/xinetd to start up tftpd with the -l option, boot the sun, then check syslog for messages. Sometimes I truss/strace/trace/par the process to get errors without having to change inetd/xinetd. Hint: tell the syscall tracer to follow forks (the option is usually -f), and then trace inetd/xinetd; this works better on a quiet system than on a busy one. If you have a sniffer, check that the filename being asked for is available on the boot server, and that it looks as expected: it should be the IP address in hex, followed by ".karch" for older suns. If you have sufficiently recent sun firmware, you can specify a tftp server; see "help boot". With "the workaround for the bpgetfile problem" we use at UCI, the client will be talking to autoinst.nacs.uci.edu, not the on-subnet tftp server; for OS revisions without the workaround, it should still try to use the on-subnet tftp server via broadcast. It's easy to forward tftp through a cisco, at least for cisco people.
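
To know what filename to look for, you can compute the hex form of the client's IP address. A small Python sketch is below. The uppercase hex digits and uppercase architecture suffix are assumptions based on how these files commonly appear on boot servers, and the function name is made up here; compare against what your sniffer or tftpd log actually shows.

```python
import ipaddress


def tftp_boot_filename(ip, karch=None):
    """Return the boot filename a client with this IPv4 address is expected to request.

    `karch` is the kernel architecture (for example "sun4c") for older suns
    that append ".karch"; leave it as None for clients that ask for the bare
    hex name. Uppercase output is an assumption, not something from these notes.
    """
    hexname = f"{int(ipaddress.IPv4Address(ip)):08X}"
    if karch:
        return f"{hexname}.{karch.upper()}"
    return hexname
```

For a client at 192.168.1.5, this gives C0A80105, or C0A80105.SUN4C if you pass karch="sun4c". The address here is just an example.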

Troubleshooting bootparams

Start up rpc.bootparamd -d. If you see no output during a boot, then the client and boot server aren't talking. If you see output, interpret the messages and act accordingly. Some or all of this is done by broadcast. It's hard to forward bootparams through a cisco: the cisco doc claims it works, but an experienced cisco guy had a hard time getting it to work.
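
When interpreting bootparamd's messages, it helps to know what the server is supposed to answer for a given client. The Python sketch below parses a flat bootparams file into a dictionary. It assumes the common "client key=value key=value" layout with backslash line continuations and "#" comments, and it knows nothing about NIS; the function name is invented for this example.

```python
def parse_bootparams(text):
    """Parse bootparams-style text into {client: {key: value}}."""
    # First join backslash-continued lines into single logical lines.
    logical, pending = [], ""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        logical.append(pending + line)
        pending = ""
    if pending:
        logical.append(pending)

    table = {}
    for line in logical:
        fields = line.split()
        if not fields:
            continue
        params = {}
        for field in fields[1:]:
            key, sep, value = field.partition("=")
            if sep:  # ignore anything that is not key=value
                params[key] = value
        table[fields[0]] = params
    return table
```

With that, parse_bootparams(open("/etc/bootparams").read()).get("yourclient") shows the root and swap entries on file for the client, which you can compare with the pathname the client reports.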

Troubleshooting NFS

Check syslog on the NFS server. Rerun sh -x /etc/dfs/dfstab to re-share the filesystems and see each share command as it runs. If you have a sniffer, look for error messages there: sometimes it'll show a useful error message along with the path the client is erroneously trying to get, or identify what the client needs exported but isn't. You don't want to forward NFS through a cisco. :)
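
Once you know the path the client is asking for, you can check it against what dfstab shares. A rough Python sketch: it assumes each share line ends with the pathname (the usual form of the share command), and it treats a path underneath a shared directory as covered, which is a simplification. The function names are made up for this example.

```python
import posixpath
import shlex


def shared_paths(dfstab_text):
    """Return the pathnames shared by the share commands in dfstab-style text."""
    paths = []
    for line in dfstab_text.splitlines():
        words = shlex.split(line, comments=True)  # handles quotes and # comments
        if len(words) > 1 and posixpath.basename(words[0]) == "share":
            paths.append(words[-1])  # assumption: the pathname is the last argument
    return paths


def is_exported(path, exports):
    """True if `path` equals, or lies underneath, one of the exported paths."""
    path = path.rstrip("/") or "/"
    for export in exports:
        export = export.rstrip("/") or "/"
        if export == "/" or path == export or path.startswith(export + "/"):
            return True
    return False
```

For example, is_exported("/export/root/alpha", shared_paths(open("/etc/dfs/dfstab").read())) tells you whether that root directory is covered by any share line. The path is only an example.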

    Changelog

    Tue Mar 30 15:16:54 PST 2004 First draft, Dan Stromberg.



