Troubleshooting the NetBoot process

* Introduction

Network booting a computer looks straightforward, but it is a complex task involving many different pieces of technology. As such, troubleshooting it can be challenging. In this article I lay out the steps of the Netboot process on Mac OS X clients and indicate what technologies are involved at each step, how they could fail, and how to solve the issue.

1-19-06 Update: On January 10th, Apple announced new Intel-based Macs. Instead of Open Firmware, the Intel Macs use Intel's Extensible Firmware Interface (EFI). While most of the NetBoot process is exactly the same for EFI-based Macs, I will point out any differences between the two platforms throughout the article. These changes will be marked with "†(EFI)". In cases where EFI and Open Firmware behave the same, I have replaced platform-specific language with simply "machine firmware".

* Netboot, from the viewer's perspective

Here is a brief overview of what happens when you Netboot a client, and what you'll see on the screen as this occurs.

  1. The computer chimes when you turn it on.
    The computer runs a self test and loads the machine firmware.
  2. A blinking globe appears.
    The computer is requesting an IP address and Netboot information, and begins downloading a boot file.
  3. The gray Apple logo and a small spinning globe appear.
    The computer is loading the boot file, which downloads and loads the kernel and kernel extension cache.
  4. The spinning globe turns into a circular progress indicator.
    The computer has loaded the kernel and the boot process has begun. The kernel mounts the Netboot disk image via NFS and loads the kernel extension cache. The remainder of the boot process is mostly the same as a standard local-disk boot.

* 1) Machine chimes

This is the standard "POST", or power on self test, that occurs regardless of how you intend to boot the client. If you don't hear a chime and you're sure that the audio of the machine is working and not muted, you probably have a hardware problem.

* 2) Blinking globe

After the chime, the machine firmware loads, reads the boot settings, and in the case of Netboot, starts a DHCP and BSDP (Boot Service Discovery Protocol) discovery process. It's important to draw a distinction between the two. The two protocols are very similar in behavior and can both be administered by the bootpd process on Mac OS X Server. It is not necessary, however, for a client to get both DHCP and BSDP information from one server, nor is it necessary that they even come from a Mac OS X server (although configuring another OS to hand out Mac-specific BSDP information is not an easy task -- that is the value of Mac OS X Server).

†(EFI): EFI provides much richer graphics support than Open Firmware -- the blinking globe has more detail and is no longer on a square button background. Additionally, EFI loads much faster than OF, shaving 10 - 15 seconds off the boot process.

Requirements for this step to proceed:

  • A DHCP server must respond with an IP address within the subnet of the Netboot server
  • A Netboot server must respond with a "BSDP ACK[SELECT]" -- an acknowledgment that it will be the server for this client

What you'll see in the server log:

netboot_server:~ root# tail -f /var/log/system.log
bootpd[456]: BSDP DISCOVER [en0] 1,0:a:95:c4:21:9c arch=ppc sysid=PowerMac7,2
bootpd[456]: DHCP DISCOVER [en0]: 1,0:a:95:c4:21:9c
bootpd[456]: OFFER sent <no hostname> 10.0.1.7 pktsize 300
bootpd[456]: DHCP REQUEST [en0]: 1,0:a:95:c4:21:9c
bootpd[456]: ACK sent <no hostname> 10.0.1.7 pktsize 300

Above, the client simultaneously made separate DHCP and BSDP requests. The server (in this case running both Netboot and DHCP) responds first with a DHCP response. You see the typical DISCOVER-OFFER-REQUEST-ACK.

†(EFI): Now that there is more than one architecture (ppc and i386), it is important to point out that a NetBooting client includes its architecture in the BSDP DISCOVER.  For example, VC:"AAPLBSDPC/i386/iMac4,1". What the NetBoot server does with this information will be discussed in more detail in the "Architectures" section.

bootpd[456]: BSDP INFORM [en0] 1,0:a:95:c4:21:9c arch=ppc sysid=PowerMac7,2
bootpd[456]: NetBoot: [1,0:a:95:c4:21:9c] BSDP ACK[LIST] sent 10.0.1.7 pktsize 300
bootpd[456]: DHCP INFORM [en0]: 1,0:a:95:c4:21:9c
bootpd[456]: ACK sent <no hostname> 10.0.1.7 pktsize 300
bootpd[456]: BSDP INFORM [en0] 1,0:a:95:c4:21:9c arch=ppc sysid=PowerMac7,2
bootpd[456]: NetBoot: [1,0:a:95:c4:21:9c] BSDP ACK[SELECT] sent 10.0.1.7 pktsize 364
bootpd[456]: DHCP INFORM [en0]: 1,0:a:95:c4:21:9c
bootpd[456]: ACK sent <no hostname> 10.0.1.7 pktsize 300

And now the client has handled a BSDP response. The key parts here are BSDP INFORM-BSDP ACK[LIST]-BSDP INFORM-BSDP ACK[SELECT]. If you see only parts of this "conversation", check to see that there is not another Netboot server on the network responding to your client. A packet trace can help rule that out (described below).
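To see quickly which parts of this conversation reached the log, the log lines for one client can be filtered by stage. The following is a minimal sketch, not part of any Apple tool: the function name bsdp_stages is hypothetical, and it assumes the log lines look like the samples above, with the client identifier printed exactly as bootpd prints it (for example 1,0:a:95:c4:21:9c).

```shell
# Report which stages of the BSDP exchange appear in a bootpd log for one client.
# Usage: bsdp_stages <client-id> < logfile
# <client-id> is the hardware address as bootpd prints it, e.g. 1,0:a:95:c4:21:9c
bsdp_stages() {
    client=$1
    log=$(cat)    # read the whole log from standard input
    for stage in 'BSDP DISCOVER' 'BSDP INFORM' 'BSDP ACK[LIST]' 'BSDP ACK[SELECT]'; do
        # -F treats the pattern as a fixed string, so "[" and "]" are literal.
        if printf '%s\n' "$log" | grep -F -- "$client" | grep -qF -- "$stage"; then
            echo "seen: $stage"
        else
            echo "missing: $stage"
        fi
    done
}
```

For example, "bsdp_stages '1,0:a:95:c4:21:9c' < /var/log/system.log" prints one "seen:" or "missing:" line per stage. A "missing: BSDP ACK[SELECT]" line for a client that did send BSDP INFORMs is the partial conversation described above.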

The last thing that occurs while you still see the blinking globe icon is that the client downloads the "booter" file that you can see in the NetBoot image.nbi set (/Library/NetBoot/NetBootSP0/image_name.nbi). The booter file is simply a copy of the "BootX" file that you can find in /System/Library/CoreServices on any Mac OS X installation. This file is responsible for the very first stage of booting the machine: it loads the Mac OS X kernel file.

†(EFI): EFI uses a different booter file. The source is located at /usr/standalone/i386/boot.efi. On a blessed volume, you will find this file at /System/Library/CoreServices/boot.efi. Additionally, the "booter" file for EFI must be stored in an architecture-specific directory within the NetBoot set. This will be described in more detail in the "Architectures" section.

In the case of Netboot, the location of the file is advertised in the BSDP response. If you do a packet trace you will see a packet similar to this:

16:23:19.979291 IP (tos 0x0, ttl 255, id 58694, offset 0, flags [none], length: 382) 10.0.1.1.bootps > 0.0.0.0.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length: 354, xid:0x4149, flags: [none] (0x0000)
     Server IP: 10.0.1.1
     Client Ethernet Address: 00:0a:95:c4:21:9c
     sname "xserve.apple.edu"
     file "/private/tftpboot/NetBoot/NetBootSP0/Panther Server.nbi/booter"
     Vendor-rfc1048:
     DHCP:OFFER
     SID:10.0.1.1
     VC:"AAPLBSDPC"
     RP:"nfs:10.0.1.1:/Library/NetBoot/NetBootSP0:Panther Server.nbi/Install.dmg"
     VO:8.4.129.0.1.145.130.10.78.101.116.66.111.111.116.48.48.50

The firmware has a very lightweight tftp client (trivial FTP) that it uses to download this file. Once the file is downloaded, it is executed and the boot process is handed off from firmware to the boot file.

Potential problems
If your client does not get past the blinking globe icon, look for the following problems. As the Netboot process is fairly difficult to troubleshoot at this stage, examine the Netboot and DHCP server logs, and perform a packet trace to see what information is coming from and going to the client. These methods are described at the end of this article.

Problem: Client does not get an IP address

Characteristics: You may see DHCP DISCOVERs in your server's log, but not a DHCP OFFER or ACK. You may also see BSDP ACK[SELECT]s in your logs, but the client does not proceed. A packet trace will reveal that no OFFER broadcast is sent to the client.

Cause: A DHCP server is not available, or does not have any available IP addresses

Solution: Resolve the DHCP problem. Always verify that your client can get a DHCP address while booted from a typical system prior to Netbooting.

Other suggestions: Make sure that there are not startup network connectivity delays. "Initial Connectivity Delay" is the general term used to describe a short router-imposed delay to network connectivity. On a managed switch, there are several features that prevent things such as network looping, which can take down a network (for example, plug both ends of an ethernet cable into a switch -- what happens? Hint: Nothing good). These protocols probe the attached device when a connection is first detected on the port, and often take 15-30 seconds before allowing traffic across the port. Some of the terms that you may see in relation to Initial Connectivity Delay are "PortFast", "Spanning Tree Protocol", "Etherchanneling", and "Trunking". There are others, but these are the ones you'll see most frequently. These are not "bad" protocols, in fact they are quite important for a managed network environment. However, they are not typically necessary on ports with hosts (computers) attached.

Initial Connectivity Delay can kill Netboot functionality -- a Netbooting client really needs to have immediate network connectivity. If you notice that it takes a particularly long time for the blinking globe to disappear, or it never does and you're sure DHCP and Netboot are configured properly, try isolating your server and client to a private network on a dumb switch. If performance is fine on the dumb switch, have a discussion with your network administrator about "configuring the ports that computers are connected to for host configuration". Most routers today have macros for easily making this change. Finally, refer to this Cisco article for background on Initial Connectivity Delays and how to mitigate them (applicable to non-Cisco network gear as well).

 

Problem: Client DISCOVERs and DHCP server OFFERs, but client doesn't REQUEST the offered IP address.

Characteristics: The DHCP server log shows a DHCP DISCOVER and subsequent OFFER, but no DHCP REQUESTs. The ethernet switch is a fairly new Cisco device.

Cause: Back when they were classified by IANA as "site specific" options, Apple originally used DHCP options 220 and 221 for NetBoot purposes. Recently those options were reclassified for "general use", and Cisco applied for them. Now Cisco uses them in their DHCP server:

  • cisco-subnet-allocation 220 Cisco Subnet Allocation
  • cisco-vpn-id 221 Cisco VPN Identifier

Solution: As the use of these options is built into Open Firmware, it's not necessarily a trivial problem to fix from an Apple perspective. There are two simple workarounds to this problem, however:

    At the Cisco Network Registrar:

  1. Disable vpn-communication at the DHCP server level, or use the ignore-cisco-options DHCP server attribute to cause the CNR DHCP server to ignore "cisco-vpn-id" and/or "vpn-id".

    Or, at every single Mac client:

  2. Run the following command in the Terminal to disable the use of these options in Open Firmware:

    sudo nvram default-bootp-vexts="%00"

    Then reboot the client. This change will be effective until you zap the PRAM. Also, instead of running the command on each client, you could use Apple Remote Desktop to "Send UNIX command" to multiple machines simultaneously.

 

Problem: Client performs the DHCP handshake, but fails to get a BSDP ACK[SELECT]

Characteristics: The server log shows a BSDP DISCOVER, but no BSDP ACK[LIST]s. A packet trace will reveal that no BSDP ACK[SELECT] broadcast is sent to the client.

Cause: This could be a misconfigured Netboot server. Do you have a Netboot image enabled? This could also be an issue with not getting an IP address within the same subnet range as the server. DHCP and BSDP requests and initial responses occur via broadcast, thus require that either the server and client are in the same subnet or that your routers are configured to handle this traffic specially to facilitate DHCP and Netbooting. Finally, this could simply be a timing issue. Sometimes the bootpd process needs to be restarted before it recognizes configuration changes.

†(EFI): This could also occur if your NetBoot image does not support the architecture of the machine you are trying to boot. See the "Architectures" section for more details.

Solution: Verify that you have a Netboot image enabled at your server. Try restarting the Netboot service in Server Admin. Verify that you can see the Netboot image in the Startup Disk preference pane while booted from the client's typical OS (also verify the client is configured for DHCP while doing this!).

 

Problem: Client gets DHCP and BSDP information, but fails to download the booter file

Characteristics: You see in your server logs that your client is getting an IP address in the same subnet as the Netboot server, and it is negotiating a Netboot set with the Netboot server, but the client is failing to get to the gray Apple logo. You may also see a Mac OS 9-ish blinking question mark.

Cause: First, confirm that your DHCP server is providing your client with a pingable router address. Often, people will omit the router address for a single-subnet, isolated test network, but this will definitely cause the NetBoot process to fail at this point. Even if a router does not exist, you must specify an IP address that the client will be able to ARP. Specifying the IP address of the DHCP server in cases such as this is the best approach. You can determine if your client is getting a default router address by examining a packet trace (more info on packet traces below):

Your IP: 10.0.1.7
Server IP: 10.0.1.1
Client Ethernet Address: 00:0a:95:c4:21:9c
sname "roscoe.bombich.com"
Vendor-rfc1048:
DHCP:OFFER
SID:10.0.1.1
LT:1197504
SM:255.255.0.0
DG:10.0.1.1

If you have confirmed that your client is getting a pingable IP address for the default router, then this is probably a problem with tftp. After verifying that your Netboot set actually has a booter file, test that your tftp service is working. At another client, run this command in the Terminal, substituting your server's hostname or IP address and your Netboot set's name:

[admin:~/Desktop] tftp 10.0.1.21
tftp> get NetBoot/NetBootSP0/NetRestore.nbi/booter
Received 174997 bytes in 0.2 seconds
tftp>

Note: this test will fail if your Netboot set has spaces in its name. In general, however, it's OK to have spaces in your Netboot set's name.

If you get an error, you probably have a tftp configuration problem.

Other suggestions:

  • Check that your server's firewall settings allow traffic on port 69
  • Verify that tftp is enabled in /etc/xinetd.d/tftp (Panther) or /System/Library/LaunchDaemons/tftp.plist (Tiger)
  • Verify that the "booter" file exists in your NetBoot set and is readable (has read privileges for "everyone")
  • Verify that your client can at least ping the router address returned by your DHCP server
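The file checks in this list can be scripted on the server. The sketch below is one possible approach rather than an Apple-provided tool; the function name check_boot_files is hypothetical. It tests that the booter exists and carries the "read by others" permission bit, and applies the same test to the mach.macosx and mach.macosx.mkext files that the booter downloads in the next step.

```shell
# Check that the boot files a client fetches via tftp exist and are world-readable.
# Usage: check_boot_files /path/to/image_name.nbi
# Prints one line per file and returns non-zero if any file fails the check.
check_boot_files() {
    nbi=$1
    status=0
    for name in booter mach.macosx mach.macosx.mkext; do
        file="$nbi/$name"
        # find prints the path only when the "read by others" bit (004) is set.
        if [ ! -f "$file" ]; then
            echo "missing: $file"
            status=1
        elif [ -z "$(find "$file" -prune -perm -004 -print)" ]; then
            echo "not readable by everyone: $file"
            status=1
        else
            echo "ok: $file"
        fi
    done
    return "$status"
}
```

For example: check_boot_files "/Library/NetBoot/NetBootSP0/NetRestore.nbi". Quoting the path keeps the check working for set names that contain spaces.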

* 3) Gray Apple logo, spinning globe icon

When you see the gray Apple logo, it means that the booter file has been downloaded and executed. In the case of Netboot, the booter file then downloads two additional files via tftp: the mach.macosx and mach.macosx.mkext files. The mach.macosx file is simply a copy of the /mach_kernel file located at the root of any Mac OS X filesystem. The mach.macosx.mkext file is a kernel extensions cache -- a file containing all the important kernel extensions for basic network booting. While these files are downloaded, the small globe icon rotates. When the file downloads are complete, the booter file loads the kernel and the kernel carries forth with the boot process.

†(EFI): The kernel and kext cache files are very architecture-dependent. As of 10.4.4, these files are "fat-but-extracted" files. That is, they contain header information that describes the binaries available for each architecture within the file, but the architecture-specific binaries have been extracted to reduce the overall size of the files. This will be explained in more detail in the "Architectures" section.

It's fairly uncommon to run into problems in this stage of the Netboot process; however, there are a couple of specific issues that can cause kernel panics at this point. Possible problems would be:

  • Not having a mach.macosx and mach.macosx.mkext file in your Netboot set
  • Either of those files being corrupt or otherwise inaccessible
  • The mach.macosx (kernel) file does not contain the binary for the client architecture or is otherwise incompatible
  • The mach.macosx.mkext (kernel extension cache) file does not contain kernel extensions required for the machine

These files take up about 12-15MB of space, so it should take a few seconds (or several seconds for many machines) for this step to complete. If you experience problems at this stage of the process, fixing the problem is fairly trivial:

  1. Reboot the affected client machine from a local drive containing the most current OS available. The OS version should also match the version of OS on your NetBoot disk image. If the OS on the NetBoot disk image is older than that on your affected client machine, you should recreate your NetBoot disk image. It is most important that the OS on the NetBoot disk image be newer than (or the same as) the OS that the machine shipped with.
  2. Mount via AFP the NetBoot sharepoint of the NetBoot server that contains the affected NetBoot set.
  3. Recreate the mach.macosx and/or the mach.macosx.mkext files. See the "Architectures" section for more details.

If all else fails, simply recreate the entire NetBoot set on the affected hardware. Be sure to delete (or move out of the NetBoot sharepoint) any non-functional NetBoot sets.

* 4) Spinning globe turns into indeterminate progress indicator

Once the kernel loads, it changes the spinning globe icon into an indeterminate, circular progress indicator, and the boot process functions mostly the same as a standard boot process. If you were holding down Command+V during startup, you'd get the verbose boot at this point. Two interesting things happen here that are relevant to troubleshooting Netboot. First, the kernel loads the kernel extension cache to give the young OS the functionality it needs to perform advanced network communication, mount disks, etc., before the rest of the OS loads.

Second, the kernel executes the /etc/rc.netboot startup script. This script attempts to mount the disk image inside your Netboot set via NFS. The path to this disk image is obtained from the BSDP response and maintained in memory (much like your DHCP packet is maintained and accessible via the ipconfig command). If you do a packet trace you will see a packet similar to this:

Server IP: 10.0.1.1
Client Ethernet Address: 00:0a:95:c4:21:9c
sname "xserve.apple.edu"
file "/private/tftpboot/NetBoot/NetBootSP0/Panther Server.nbi/booter"
Vendor-rfc1048:
DHCP:OFFER
SID:10.0.1.1
VC:"AAPLBSDPC"
RP:"nfs:10.0.1.1:/Library/NetBoot/NetBootSP0:Panther Server.nbi/Install.dmg"
VO:8.4.129.0.1.145.130.10.78.101.116.66.111.111.116.48.48.50

After these occur, the kernel initiates the /etc/rc.boot and/or /etc/rc.cdrom scripts which complete the boot process. Eventually the screen turns blue as the WindowServer loads and you begin to see the more familiar parts of the boot process.
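The RP value in the packet above packs the protocol, server, sharepoint, and disk image path into one colon-separated string. When testing the NFS mount by hand, it helps to pull those fields apart. The sketch below does that with plain parameter expansion. The function name parse_root_path is hypothetical, and the code assumes the four-field shape seen in the trace, with no colons inside the first three fields.

```shell
# Split a NetBoot root path of the form
#   protocol:server:sharepoint:image-path
# into its four fields, printed one per line.
parse_root_path() {
    rest=$1
    protocol=${rest%%:*};   rest=${rest#*:}    # text before the first colon
    server=${rest%%:*};     rest=${rest#*:}    # text before the second colon
    sharepoint=${rest%%:*}; image=${rest#*:}   # remainder splits at the third
    printf 'protocol=%s\nserver=%s\nsharepoint=%s\nimage=%s\n' \
        "$protocol" "$server" "$sharepoint" "$image"
}
```

For the value in the trace, this prints protocol=nfs, server=10.0.1.1, sharepoint=/Library/NetBoot/NetBootSP0, and image=Panther Server.nbi/Install.dmg. The server and sharepoint fields together form the NFS export that the client has to be able to mount.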

Potential problems

Problem: Soon after the circular progress indicator appears under the gray Apple logo, white horizontal lines appear on the screen and the progress indicator stops spinning.

Cause: This is probably a kernel panic, and it is likely a result of the machine trying to mount the NFS-hosted disk image and failing.

Suggestions:

  • Verify that you have a kernel panic by holding down Command+V while you reboot the client. There should be some indication of a panic.
  • Verify that NFS is running on the server
  • Verify that the NetBootSPx sharepoint is valid and accessible. Remember that the NetBoot sharepoint should look like this:

cd /Library/NetBoot
ls -la

.sharepoint --> NetBootSP0
.clients --> NetBootClients0
NetBootSP0
NetBootClients0

If it doesn't, you can manually repair it, or run this command:

/System/Library/ServerSetup/NetBoot

Or you could reset the NetBoot sharepoints in Server Admin:

  1. Navigate to NetBoot > Settings > General in Server Admin
  2. Deselect all checkboxes in the bottom pane ("Select where to put images and client data")
  3. Save changes
  4. Reselect the desired volumes for storing images and client data
  5. Save changes
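Before and after a repair, the sharepoint layout shown above can be checked with a short script. This is a sketch that assumes the arrows in the listing denote symbolic links; the function name check_netboot_links is hypothetical.

```shell
# Verify that the .sharepoint and .clients entries under a NetBoot root
# are symbolic links that resolve to existing directories.
# Usage: check_netboot_links /Library/NetBoot
check_netboot_links() {
    root=$1
    status=0
    for link in .sharepoint .clients; do
        if [ ! -L "$root/$link" ]; then
            echo "not a symlink: $root/$link"
            status=1
        elif [ ! -d "$root/$link" ]; then
            # -d follows the link, so this branch means the target is missing.
            echo "dangling link: $root/$link"
            status=1
        else
            echo "ok: $root/$link"
        fi
    done
    return "$status"
}
```

A non-zero return value means at least one link is missing or points at a directory that no longer exists, which is the case the repair steps above address.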

 

Problem: The system reboots about ten seconds or so after the circular progress indicator appears

Cause: To really determine the cause, you should do a verbose boot and try to catch the error message indicated on the screen. More often than not, the problem is with an incompatible kernel extension cache. The machine tried to load the cache, but some important piece was missing and the computer could not continue booting.

Solution: Rebuild your Netboot image set on a machine that you would like to boot from that set. Typically this means that you want to use your latest and greatest machine for creating Netboot sets. Newly released Apple hardware *always* fails to boot from last year's Netboot set. Keep your Netboot images fresh and you shouldn't run into this.

 

Problem: The system never progresses beyond the circular progress indicator

Cause: Again, to really determine the cause, you should do a verbose boot to see specific error messages indicated on the screen. Often this is a misconfiguration of NFS at the server, characterized by messages like "RPC timeout for server <NetBoot server IP>". Occasionally it is due to bugs in the (third party) startup scripts.

Solution: Basic NFS troubleshooting -- start with resetting the NetBoot sharepoints in Server Admin as indicated above. Verify that your firewall is not blocking ports required by NFS: 111 (UDP), 989 (UDP), 2049 (UDP and TCP). Also, use the commands "showmount" and "mount_nfs" to verify that NFS is functioning. From a client booted from its own hard drive, run these commands:

showmount -e <NetBoot Server IP>

mkdir /tmp/mnt
mount_nfs <NetBoot Server IP>:/Library/NetBoot/NetBootSP0 /tmp/mnt

The "showmount" command will indicate what NFS sharepoints are available on your NetBoot server. If you do not see your NetBoot sharepoint, reset the NetBoot sharepoint in Server Admin. The mount_nfs command actually attempts to mount the NFS sharepoint.
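When checking several servers, the showmount output can be tested mechanically instead of read by eye. The sketch below assumes that showmount -e prints a header line followed by one export per line, with the exported path in the first column. The function name exports_include is hypothetical.

```shell
# Succeed if the given path appears in the first column of `showmount -e` output.
# Usage: showmount -e <NetBoot Server IP> | exports_include /Library/NetBoot/NetBootSP0
exports_include() {
    # awk compares the first whitespace-separated field of every line
    # against the wanted path and sets the exit status accordingly.
    awk -v want="$1" '$1 == want { found = 1 } END { exit found ? 0 : 1 }'
}
```

Because the function only reads standard input, it can also be run against saved output from a server that is not reachable from your current machine.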

* NetBoot Troubleshooting Topics

* General Troubleshooting suggestions

  • Start simple using Apple's System Image Utility
  • Isolate your server and client to a private network on a dumb switch
  • Recreate the Netboot set
  • Try booting verbosely to see if any error messages point you in the right direction
  • Verify that you're getting an IP address within the subnet range of your Netboot server

* Packet traces

This packet trace can be really useful (performed at the Netboot server):

sudo tcpdump -i en0 -s 0 -nvX port bootps or port bootpc or port tftp

or if you're planning to send the results to someone else:

sudo tcpdump -i en0 -s 0 -w ~/Desktop/packets.trace port bootps or port bootpc or port tftp

What the arguments mean:
-i en0: Listen to traffic on en0
-s 0: Do not truncate packets
-n: Do not convert IP addresses to names
-v: Verbose output (give me a pretty summary of what the packet means)
-X: Print the contents of the packet in ASCII and hex
-x: Print the contents of the packet in hex
-A: Print the contents of the packet in ASCII
-w: Write the packets to a file instead of displaying them

There is a lot of information in packet traces, and it can be daunting to figure out what it all means. You can also download my package of annotated packet traces for reference. The most important thing to know about packet traces is how to do them. Even if you don't know what to glean out of the trace, having it to hand to someone else can make troubleshooting much easier.

* Getting BSDP information at the command line

If you edit your Netboot set to provide a shell early in the boot process, you can see what BSDP information your client is getting from the server with the following commands:

ipconfig netbootoption shadow_mount_path
ipconfig netbootoption shadow_file_path
ipconfig netbootoption machine_name

* Diskless Netboot

A diskless NetBoot image is exactly the same as a non-diskless image (you don't make that choice during SIU image creation, right? Right.) When you choose to make an image set diskless in Server Admin, the only change that is made is to the "SupportsDiskless" key in the NBImageInfo.plist file in the .nbi directory.

The magic occurs when you boot the client. Part of the BSDP response to the client includes information about the location of any network mountpoints for shadow files. For example, using the previous tip, you can get the following data from the BSDP packet:

% ipconfig netbootoption shadow_mount_path
afp://netboot001:<password>@<NetBoot Server IP>/NetBootClients3

% ipconfig netbootoption shadow_file_path
NetBoot001/Shadow

% ipconfig netbootoption machine_name
NetBoot001

Examining the /etc/rc.netboot startup script you can see how diskless Netbooting works. By default, a Netboot client will try to mount a shadow file at the shadow_mount_path. If that fails though (for example, if shadow_mount_path is not defined by the Netboot server), it will use the local drive instead. Therefore, diskless Netboot depends entirely on the client's ability to mount a shadow file at the AFP mount path returned by the Netboot server in the BSDP response.

Note that while NetInstall does not require an internal drive, it is *not* "diskless netboot". NetInstall does not use a shadow file at all, therefore a network shadow file is not required or returned in the BSDP response. This is also why the "Diskless" checkbox is disabled in Server Admin for NetInstall image sets. NetInstall sets employ RAM disks as necessary for writable space.

* Resetting NetBoot server caches

When you hold down the "N" key during startup, your machine will boot from the image set that you have identified as the "default" set in Server Admin. When you choose a Network startup disk in the Startup Disk preference pane, the server keeps track of your selection, and you're bound to that server and Netboot set until you make another choice. This means that if you change the default set at the server and then hold down the N key at a client that had previously chosen another Netboot set, the client will not boot from your default set; it will boot from the set that it had previously chosen (even if you have since reset the startup disk to a local disk).

†(EFI): Hold down Option+N to boot from the actual default NetBoot image.

While this technically works as designed, it doesn't necessarily work as expected. The Netboot server keeps these choice settings in /var/db/bsdpd_clients. It's safe to delete that file to allow your clients to boot to the default image set again. Also, the following series of commands tends to resolve problems caused by setting a specific network startup disk choice on a client, then deleting that Netboot set.

sudo rm /var/db/bsdpd_clients
sudo killall bootpd
sudo killall -HUP xinetd
sudo lookupd -flushcache
sudo serveradmin stop netboot
sudo serveradmin start netboot

* Netbooting across subnets

Netboot requires that the client can get DHCP and BSDP information via broadcast. This typically requires that the Netboot server and clients reside on the same subnet, because routers typically do not pass broadcast information between subnets. DHCP information, however, is handled specially by routers so you don't need a DHCP server on every segment of your network. This is handled by what are typically called "DHCP Helper tables" (or more generally, DHCP Relay) in your router's configuration. Basically this is just a list of IP addresses that DHCP broadcast packets should be relayed to.

Because the BSDP protocol is so similar to DHCP, the router configuration for a BSDP server is the same as for DHCP. Therefore, if you want to Netboot across subnets or, more technically, if you want BSDP broadcast information relayed past your routers, you need to add the IP address of your Netboot server to your router's DHCP helper table.

A common fear among network administrators is that this will interfere with the handling of DHCP by other servers. However, although the bootpd process is running on your Netboot server, if the DHCP service is not turned on, it will not hand out IP addresses. In fact, it will completely ignore any DHCP requests altogether. Likewise, your other DHCP server will completely ignore BSDP broadcasts that are relayed to it by the router.

In summary, if you want to Netboot across subnets, work with your network administrator to configure your routers to send BSDP broadcasts to your Netboot server. This is not an unreasonable request or difficult task, and greatly reduces your infrastructure and management costs.

* NetBooting Multiple Architectures

When a Macintosh client begins the NetBoot process, it sends out a broadcast request for a NetBoot server.  Within this request are three very important pieces of information: Client identifier (MAC address), architecture, and System Identifier (machine model). When a (Tiger+) NetBoot server sees a broadcast BSDP request, launchd launches bootpd to handle the request. The NetBoot server checks its /var/db/bsdpd_clients file to determine if the client already has selected a NetBoot image on the server. If a record for the client exists on the server, the server will return the associated NetBoot image information and the NetBoot client will prefer this server over any other NetBoot servers on the network. If an association does not yet exist, the server returns a list of NetBoot images that are available to the particular client. When the client finally chooses an image, the server creates a client-association record in /var/db/bsdpd_clients.

The NetBoot server will filter a NetBoot image from the list returned to the client if:

  1. The client's MAC address is specifically forbidden from accessing images on the server (NetBoot filters)
  2. The NetBoot image does not support the architecture of the client machine
  3. The NetBoot image is not enabled for the machine model

Refer to the Mac OS X Server documentation for more details on NetBoot filtering.

Architecture support is defined in two ways. As of 10.4.4, there is an additional key in the NBImageInfo.plist file named "Architectures". This attribute contains an array of the architectures supported, for example {ppc} or {ppc, i386}. Additionally, the NetBoot set must contain a booter, mach.macosx, and mach.macosx.mkext file for each architecture supported. For backward compatibility, the ppc booter files may reside at the root level of the NetBoot set or within a folder named "ppc" at the root level of the NetBoot set. Intel-specific booter files must reside within a folder named "i386" at the root level of the NetBoot set. Therefore, you could have a Universal NetBoot set (capable of booting ppc or i386) with the following structure:

NetBoot.nbi/
	booter
	i386/
		booter
		mach.macosx
		mach.macosx.mkext
	mach.macosx
	mach.macosx.mkext
	NBImageInfo.plist
	System.dmg

When the NetBoot server receives a BSDP request from a particular architecture, it determines if ${arch}/booter exists. If it does, it returns the path to that file in the BSDP response. If it does not, and arch = ppc, it returns the path to booter (at the root level of the nbi) if it exists. If the booter does not exist for the architecture, not only will the client not boot from that NetBoot set, but the NetBoot image will not even appear as an available boot disk to the client.
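The lookup described above can be reproduced on the server to predict whether a given NetBoot set will be offered to a client of a given architecture. The sketch below only mirrors the rules stated in the preceding paragraph and is not taken from the bootpd source; the function name select_booter is hypothetical.

```shell
# Print the booter path (relative to the .nbi directory) that the lookup
# described above would choose for an architecture; return 1 if there is none.
# Usage: select_booter /path/to/image.nbi ppc|i386
select_booter() {
    nbi=$1
    arch=$2
    if [ -f "$nbi/$arch/booter" ]; then
        echo "$arch/booter"      # architecture-specific folder wins
    elif [ "$arch" = ppc ] && [ -f "$nbi/booter" ]; then
        echo "booter"            # backward-compatible ppc location at the root
    else
        return 1                 # no booter: the set is hidden from this client
    fi
}
```

With the Universal layout shown earlier, "select_booter NetBoot.nbi ppc" prints booter and "select_booter NetBoot.nbi i386" prints i386/booter. If the i386 folder is removed, the i386 call returns 1, which corresponds to the image not appearing as a boot disk on Intel clients.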

Generating platform-specific boot files:

  1. Create the mach.macosx file with a command similar to the following, replacing the path to the NetBoot set with your own information:

    ditto /mach_kernel /Volumes/NetBootSP0/NetRestore.nbi/mach.macosx

    or, if the kernel is fat, you can extract the architecture-specific binary directly to the nbi folder:

    lipo -extract ppc -output /Volumes/NetBootSP0/NetRestore.nbi/mach.macosx /mach_kernel

  2. Create the kernel extension cache with a command similar to the following, replacing the path to the NetBoot set with your own information:

    sudo kextcache -a ppc -s -l -n -z -m /tmp/mkext /System/Library/Extensions
    ditto /tmp/mkext /Volumes/NetBootSP0/NetRestore.nbi/mach.macosx.mkext

    or, for an Intel-based Mac:

    sudo kextcache -a i386 -s -l -n -z -m /tmp/mkext /System/Library/Extensions
    ditto /tmp/mkext /Volumes/NetBootSP0/NetRestore.nbi/i386/mach.macosx.mkext

  3. Add the booter files. PowerPC:

    ditto /usr/standalone/ppc/bootx.bootinfo /Volumes/NetBootSP0/NetRestore.nbi/booter

    Intel-based Mac:

    ditto /usr/standalone/i386/boot.efi /Volumes/NetBootSP0/NetRestore.nbi/i386/booter

 

References:
Cisco article on DHCP Relay configuration
Apple Kbase: Netbooting across subnets
Alternative method of Netbooting across subnets
Kernelthread.com: Booting Mac OS X
Apple Documentation of the Mac OS X boot process
How to enable NetBoot 1.0 for older NetBoot client computers

History:
7/8/2005: Initial publication
1/19/2006: Updated with information about EFI/Intel-based Macs
4/3/2006: Updated with additional NFS troubleshooting information
