LinuxQuestions.org - [SOLVED] Netatalk on Debian (linux 2.6.18) for Mac OS9 netboot

Hello all,

Before posting here I did realize full well that this might not be the right place to pose my question, but maybe somebody could redirect me to the right spot.

I'm having some trouble with the following setup: I have a Debian-based NFS/TFTP/DHCP server which I use for netbooting clients and embedded systems. Currently I have a number of DOS versions, Memtest, Windows PE and a Linux thin client system running (more or less like the LTSP system), and this is working fine. Recently I got myself a tray-loading iMac G3 with 96MB RAM and a PowerMac G4 with 256MB RAM. Both are running OpenFirmware 3.x (3.0 on the iMac and 3.4 on the PowerMac). According to the spec sheets and my own testing, both machines are netboot capable. I'm trying to get OS9.1 to boot over ethernet. The documentation available for this is a great help, but there is little to fall back on when something doesn't work the way the howto describes.

Anyway, back to the setup: as I understand it (please correct me if I'm wrong; I'm new to the Mac world), the OpenFirmware contains a DHCP (actually BOOTP) client, a TFTP client and an AppleTalk client with plain auth facilities.
When I, for instance, boot my iMac into the OpenFirmware prompt and enter

Code:

0 > boot enet:0

the client nicely contacts my DHCP/BOOTP server and gets an IP, together with the DHCP options 66 and 67 (next-server and filename). I placed the "Mac OS ROM" file from a working OS9.1 install in my tftproot, and this is being retrieved perfectly. The client then continues to boot, but never gets to read its system disk, application disk or 'private/scratch disk'.
I've patched the source of the DHCP server (ISC DHCPd 3.0.4) to include the Mac-specific directives, and these are being received by the client. I've also calculated the hexadecimal representations of the paths, files and IPs that one needs to supply the client with.
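
For anyone who wants to test the TFTP stage on its own, without a Mac attached, the request the client sends for the ROM file can be sketched as below. This is a minimal sketch of the RFC 1350 packet layout only. That OpenFirmware asks for the file in "octet" mode is an assumption, and the helper names are made up for illustration.

```python
import struct

def build_tftp_rrq(filename, mode="octet"):
    """Return the bytes of a TFTP read request (opcode 1, RFC 1350).

    The transfer mode is an assumption; the thread does not say which
    mode the OpenFirmware client actually requests.
    """
    return (struct.pack("!H", 1)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")

def parse_tftp_reply(packet):
    """Classify a reply as ('data', block, payload) or ('error', code, text)."""
    opcode, number = struct.unpack("!HH", packet[:4])
    if opcode == 3:   # DATA: block number followed by up to 512 bytes
        return ("data", number, packet[4:])
    if opcode == 5:   # ERROR: error code followed by a NUL-terminated message
        text = packet[4:].rstrip(b"\x00").decode("ascii", "replace")
        return ("error", number, text)
    raise ValueError(f"unexpected TFTP opcode {opcode}")

# The file name contains spaces, which is legal inside the packet itself.
print(build_tftp_rrq("Mac OS ROM"))
```

Sending the request over UDP port 69 and feeding the first reply to `parse_tftp_reply` is enough to tell whether the server serves the file or answers with an error.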

To be more specific, this is an excerpt from my dhcpd.conf:

Code:

[snip]
allow bootp;
authoritative;
[end snip]

host imac01 {
        hardware ethernet 00:05:02:8B:64:C3;
        fixed-address 192.168.3.110;
        filename "Mac OS ROM";
        next-server 192.168.3.13;
        server-name "192.168.3.13";

        # MAC SPECIFIC OPTIONS:
        option mac-version 0:0:0:0;
        # This specifies the protocol version?

        option mac-machine-name "imac01";
        # specifies the machine name

        option mac-user-name "macint";
        # this specifies the user name which clients will use to log on to the server

        option mac-password "macmac";
        # this specifies the password which clients will use to log on to the server

        option mac-nb-img c0:a8:3:d:2:24:7:6D:61:63:62:6F:6F:74:0:0:0:0:2:7:6F:73:39:0:68:64:31;
        #option mac-nb-img c0:a8:03:0d:02:24:07:d:61:63:62:6f:6f:74:0:0:0:0:2:07:6f:73:39:0:68:64:31;
        # this specifies the path to the shared HD image.

        option mac-apps-img c0:a8:3:d:2:24:7:6D:61:63:62:6F:6F:74:0:0:0:0:2:7:6F:73:39:0:68:64:32;
        # this specifies the path to the Applications HD image. I point this to a blank image.

        option mac-client-nb-img c0:a8:3:d:2:24:7:6D:61:63:62:6F:6F:74:0:0:0:0:2:7:6F:73:39:0:68:64:33;
        # this specifies the path to the client's private disk image. this must be unique.
}
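
A quick way to double-check one of these hex strings is to decode it back into readable fields. The following is a minimal sketch that assumes the layout implied by the values above (4-byte server IP, 2-byte port, length-prefixed volume name, a fixed 5-byte run, length-prefixed path with 0 as separator). That layout is inferred from this thread, not taken from an Apple specification, and the function name is made up for illustration.

```python
def decode_netboot_path(option_value):
    """Split a colon-separated hex option value into its apparent fields.

    The field layout is an assumption inferred from the values in this
    thread, not from an official specification.
    """
    # dhcpd accepts unpadded hex bytes such as "3" or "d", so parse per token.
    raw = bytes(int(token, 16) for token in option_value.split(":"))
    server = ".".join(str(b) for b in raw[0:4])   # 4-byte IPv4 address
    port = int.from_bytes(raw[4:6], "big")        # 2-byte TCP port
    vol_len = raw[6]                              # length-prefixed volume name
    volume = raw[7:7 + vol_len].decode("ascii")
    pos = 7 + vol_len
    fixed = list(raw[pos:pos + 5])                # the constant 0:0:0:0:2 run
    pos += 5
    path_len = raw[pos]                           # length-prefixed path
    path_raw = raw[pos + 1:pos + 1 + path_len]
    path = [part.decode("ascii") for part in path_raw.split(b"\x00")]
    return {"server": server, "port": port, "volume": volume,
            "fixed": fixed, "path": path}

value = "c0:a8:3:d:2:24:7:6D:61:63:62:6F:6F:74:0:0:0:0:2:7:6F:73:39:0:68:64:31"
print(decode_netboot_path(value))
```

If the decoded volume or path comes out with unexpected characters, a byte in the hex string is most likely wrong or missing.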

I have compared my configuration to a couple of other people's on the net, and I see no inherent differences that would prevent it from working. With a quick

Code:

tail -n50 /var/log/syslog

I can see that the booting client does "access the volume" (which I think is the Apple-speak way of saying 'mounting the share'); however, it doesn't download/open anything. See the following excerpt from syslog:

Code:

May 17 10:00:26 pxeboot dhcpd: Received BootP request from Macintosh netboot client
May 17 10:00:26 pxeboot dhcpd: BOOTREQUEST from 00:05:02:8b:64:c3 via eth0
May 17 10:00:26 pxeboot dhcpd: BOOTREPLY for 192.168.3.110 to imac01 (00:05:02:8b:64:c3) via eth0
May 17 10:00:43 pxeboot afpd[20250]: ASIP session:548(6) from 192.168.3.110:3151(8)
May 17 10:00:43 pxeboot afpd[20250]: cleartext login: macint
May 17 10:00:44 pxeboot afpd[20250]: login macint (uid 1000, gid 1000) AFP2.2
May 17 10:00:44 pxeboot afpd[20250]: Warning: No CNID scheme for volume /var/lib/macboot. Using default.
May 17 10:00:44 pxeboot afpd[20250]: Setting uid/gid to 1000/1000
May 17 10:00:44 pxeboot afpd[20250]: CNID DB initialized using Sleepycat Software: Berkeley DB 4.2.52: (December  3, 2003)
May 17 10:00:44 pxeboot afpd[20250]: logout macint
May 17 10:00:44 pxeboot afpd[20250]: 0.22KB read, 0.18KB written
May 17 10:00:44 pxeboot afpd[19590]: server_child[1] 20250 done
May 17 10:00:52 pxeboot afpd[20252]: ASIP session:548(6) from 192.168.3.110:3151(8)
May 17 10:00:52 pxeboot afpd[20252]: cleartext login: macint
May 17 10:00:52 pxeboot afpd[20252]: login macint (uid 1000, gid 1000) AFP2.2
May 17 10:00:52 pxeboot afpd[20252]: Warning: No CNID scheme for volume /var/lib/macboot. Using default.
May 17 10:00:52 pxeboot afpd[20252]: Setting uid/gid to 1000/1000
May 17 10:00:52 pxeboot afpd[20252]: CNID DB initialized using Sleepycat Software: Berkeley DB 4.2.52: (December  3, 2003)
May 17 10:00:52 pxeboot afpd[20252]: logout macint
May 17 10:00:53 pxeboot afpd[20252]: 0.22KB read, 0.18KB written
May 17 10:00:53 pxeboot afpd[19590]: server_child[1] 20252 done
May 17 10:01:01 pxeboot afpd[20263]: ASIP session:548(6) from 192.168.3.110:3151(8)
May 17 10:01:01 pxeboot afpd[20263]: cleartext login: macint
May 17 10:01:01 pxeboot afpd[20263]: login macint (uid 1000, gid 1000) AFP2.2
May 17 10:01:01 pxeboot afpd[20263]: Warning: No CNID scheme for volume /var/lib/macboot. Using default.
May 17 10:01:01 pxeboot afpd[20263]: Setting uid/gid to 1000/1000
May 17 10:01:01 pxeboot afpd[20263]: CNID DB initialized using Sleepycat Software: Berkeley DB 4.2.52: (December  3, 2003)
May 17 10:01:01 pxeboot afpd[20263]: logout macint
May 17 10:01:02 pxeboot afpd[20263]: 0.22KB read, 0.18KB written
May 17 10:01:02 pxeboot afpd[19590]: server_child[1] 20263 done
May 17 10:01:10 pxeboot afpd[20292]: ASIP session:548(6) from 192.168.3.110:3151(8)
May 17 10:01:10 pxeboot afpd[20292]: cleartext login: macint
May 17 10:01:10 pxeboot afpd[20292]: login macint (uid 1000, gid 1000) AFP2.2
May 17 10:01:10 pxeboot afpd[20292]: Warning: No CNID scheme for volume /var/lib/macboot. Using default.
May 17 10:01:10 pxeboot afpd[20292]: Setting uid/gid to 1000/1000
May 17 10:01:10 pxeboot afpd[20292]: CNID DB initialized using Sleepycat Software: Berkeley DB 4.2.52: (December  3, 2003)
May 17 10:01:10 pxeboot afpd[20292]: logout macint
May 17 10:01:10 pxeboot afpd[20292]: 0.22KB read, 0.18KB written
May 17 10:01:10 pxeboot afpd[19590]: server_child[1] 20292 done
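
The retry cadence in this excerpt can be read off mechanically rather than by eye. Below is a minimal sketch that measures the gap between successive "ASIP session" lines. It assumes the syslog line format shown above and does not handle a log that wraps past midnight; the function name is made up for illustration.

```python
import re

# Matches the time of day on lines where afpd reports a new AFP-over-TCP session.
SESSION_RE = re.compile(r"(\d\d):(\d\d):(\d\d) \S+ afpd\[\d+\]: ASIP session")

def session_gaps(lines):
    """Return the seconds elapsed between consecutive session starts."""
    starts = []
    for line in lines:
        match = SESSION_RE.search(line)
        if match:
            hours, minutes, seconds = (int(g) for g in match.groups())
            starts.append(hours * 3600 + minutes * 60 + seconds)
    return [later - earlier for earlier, later in zip(starts, starts[1:])]

sample = [
    "May 17 10:00:43 pxeboot afpd[20250]: ASIP session:548(6) from 192.168.3.110:3151(8)",
    "May 17 10:00:52 pxeboot afpd[20252]: ASIP session:548(6) from 192.168.3.110:3151(8)",
    "May 17 10:01:01 pxeboot afpd[20263]: ASIP session:548(6) from 192.168.3.110:3151(8)",
    "May 17 10:01:10 pxeboot afpd[20292]: ASIP session:548(6) from 192.168.3.110:3151(8)",
]
print(session_gaps(sample))  # -> [9, 9, 9]
```

A perfectly regular gap like this suggests a fixed retry timer on the client rather than anything random on the network side.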

If I don't pass the user, password and path options via DHCP/BOOTP, the client does not contact the AppleTalk (netatalk) share at all. The share itself works fine from an OS8.5 client (this iMac booted from HDD), an OS9.1 client (the G4) and Mac OSX 10.2 (again the G4; it's a dual boot). Just to make things clear(er): I've netbooted both boxes, and the behaviour is exactly the same, so this is not a quirky OpenFirmware version or a duff ROM somewhere.

I'm looking into the 'CNID scheme' thing, but apparently, as long as a 10.5 client has not 'tainted' the metadata with its indexing, it shouldn't matter. I have, however, removed the .AppleDB and .AppleDouble files which the clients automatically create when mounting the drive, and booted with the share in a clean state. The results were the same.

Now there is one very odd thing that I noticed on the G4, and will be testing on the iMac as well: the ethernet link comes up before getting the TFTP file when booting from network (obviously) and the file is retrieved perfectly, but then the ethernet goes down, comes back up, the client mounts the share, then ethernet goes down again, up again, the client mounts the share, etc. This can be seen in the logs above (at least, the fact that it re-mounts the share). Is this normal? Is it trying to determine something? It's not 'just trying a few line speed/duplex modes'; it correctly sees 100Mbit FD the first time round, and the following attempts are made at the same link speed/duplex anyway.

Irrelevant but funny information: I have MacOS 8.5 installed on the iMac's HDD, and when the client is done 'trying' to find its system disk on the network, it tries to boot from the internal disk and bombs out with an 'unimplemented trap'. I guess this is because of the OS9.1 ROM and the OS8.5 install, but I could be wrong.

I created the system disk image from a working 9.1 install on the G4 (installed from a 'universal' CD, not one 'only for that G4') under OS9 booted from CD (see http://frank.gwc.org.uk/~ali//nb/ <- he's been a lot of help). The client disk image and apps image are just blank files that I created with touch, locked (from Get Info in Finder) and then zeroed out with

Code:

cat /dev/null > $FILE

As I understand it, the apps and 'scratch' disks are required, but are not required to be functional or to have a filesystem. Again, please correct me if I'm wrong; this is one of the things I will be testing later on.

If I missed any vital info, please let me know. If you have any tips, ideas, input, examples, howtos, readmes or feedback at all, please let me know as well.

kind regards and thanks in advance

PelliX

PS: I'm using the packaged version of netatalk (2.0.3-4+etch2) under Debian which has the SSL libraries/compile flags/option disabled due to licensing, but as the OpenFirmware only understands plaintext I guess this doesn't make a difference.

PPS: This is my basic calculation of the paths (and according to the afpd/syslog they would appear to be correct):

Code:

192.168.3.13 --hex--> 00:c0:00:a8:00:03:00:0d
c0:a8:03:0d

548 (tcp port) --hex--> 2:24

7 char (macboot) --length(hex)--> 07

macboot --hex--> 6d:61:63:62:6f:6f:74

this is always the same --> 0:0:0:0:2

7 char (os9 + 0 separator + hd1) --length(hex)--> 07

os9 --hex--> 6F:73:39

hd1 --hex--> 68:64:31

result:

c0:a8:03:0d:02:24:07:6d:61:63:62:6f:6f:74:0:0:0:0:2:07:6f:73:39:0:68:64:31

hd2 --hex--> 68:64:32

hd3 --hex--> 68:64:33
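
The same calculation can be scripted so that a slipped digit is less likely. This is a minimal sketch that follows the breakdown above step by step. The field layout and the fixed 0:0:0:0:2 run are taken from this thread as-is (their meaning is not documented here), and the function name is made up for illustration.

```python
import ipaddress

def encode_netboot_path(server_ip, port, volume, path_parts):
    """Build the colon-separated hex value for one of the mac-*-img options.

    Layout (an assumption, inferred from the breakdown in this thread):
    IP (4 bytes), port (2 bytes), volume length + name, 0:0:0:0:2,
    path length + path components joined by a 0 byte.
    """
    ip_bytes = ipaddress.IPv4Address(server_ip).packed
    port_bytes = port.to_bytes(2, "big")
    volume_bytes = volume.encode("ascii")
    path_bytes = b"\x00".join(part.encode("ascii") for part in path_parts)
    raw = (ip_bytes + port_bytes
           + bytes([len(volume_bytes)]) + volume_bytes
           + bytes([0, 0, 0, 0, 2])
           + bytes([len(path_bytes)]) + path_bytes)
    # Zero-padded bytes; dhcpd also accepts the unpadded form used above.
    return ":".join(f"{byte:02x}" for byte in raw)

for image in ("hd1", "hd2", "hd3"):
    print(image, encode_netboot_path("192.168.3.13", 548, "macboot", ["os9", image]))
```

Only the last byte differs between the three images, which matches the hd1/hd2/hd3 lines in the calculation.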

PPPS: I have tried forcing the client to use DHCP instead of BOOTP (by disabling the allow bootp directive in the dhcpd.conf) but the effect was the same.

UPDATE: I have tested the iMac while monitoring the switch and it shows the ethernet link going up and down as well. This appears to be inherent, I guess.

Ok, I've narrowed down the cause of the failure (or excluded certain possibilities, rather). I downloaded the Netboot9.dmg file from the Apple website, mounted it under Mac OS X, opened the .pkg file and extracted the .pax.gz file to the desktop. I then ungzipped the .pax.gz file and extracted its contents with Pacifist. So far, so good. Now I've replaced the OS9.1 Mac OS ROM file in the tftpboot folder with the one supplied in the netboot package, and likewise the HD images on the AppleTalk share; the 'scratch disk', however, is still the same. I renamed the files to fit my current naming scheme (hd1 = OS, hd2 = Applications, hd3 = scratch). I tried to boot the G4 from these files (over ethernet), but encountered the exact same problem. I'm losing hair here. Anybody got any bright ideas, or even not so bright ones?

OK, considering that nobody seems to have any ideas regarding a resolution for the problem, does somebody have any ideas how I could 'debug' the boot process of the client, namely the AppleTalk activity when the client accesses (or tries to access) the shared volume?
Of course I have the afpd syslog, but I can't seem to get any really useful information from it (was a file entirely read or only partially, for instance).
Note: judging by the transfer sizes (0.22KB read, 0.18KB written) I would say it looks more like the client is retrieving the 'Apple metadata' in the folder(s) but not much more than that. These 'meta info' files point to the correct paths, and have been generated from scratch a number of times with different settings (AppleDouble support, no AppleDouble support, etc).
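
The transfer sizes can be pulled out of the afpd syslog per session to make this kind of judgement less of a guess. Below is a minimal sketch, assuming the "NKB read, NKB written" line format shown in the excerpt earlier in this thread. The 1 KB threshold and the function names are made up for illustration.

```python
import re

# Matches the per-session summary line that afpd writes at logout.
XFER_RE = re.compile(r"afpd\[(\d+)\]: ([\d.]+)KB read, ([\d.]+)KB written")

def transfer_per_session(lines):
    """Map each afpd child PID to its (KB read, KB written) pair."""
    totals = {}
    for line in lines:
        match = XFER_RE.search(line)
        if match:
            totals[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return totals

def looks_like_metadata_only(totals, threshold_kb=1.0):
    """True when every session moved less than threshold_kb each way.

    The threshold is an arbitrary illustration value, not a netatalk constant.
    """
    return bool(totals) and all(
        read < threshold_kb and written < threshold_kb
        for read, written in totals.values()
    )

sample = [
    "May 17 10:00:44 pxeboot afpd[20250]: 0.22KB read, 0.18KB written",
    "May 17 10:00:53 pxeboot afpd[20252]: 0.22KB read, 0.18KB written",
]
totals = transfer_per_session(sample)
print(totals, looks_like_metadata_only(totals))
```

A disk image being read for real would show up as sessions moving megabytes, so identical sub-kilobyte sessions point at the login and volume handshake only.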

UPDATE: On a more positive note, I've knocked up a quick ppc linux kernel and basic environment for netbooting, and it works like a charm on both machines. That was half an hour's work, including coffee...

EDIT: after reading the last bit I added to this reply, I would just like to clarify that I'm still trying to get the OS9 image to boot on the Macs, the fling with ppc-linux was just a proof of concept invoked by frustration. :-)

I guess this would be the wrong forum to ask in (so why does he ask the question, you're wondering...) but does anybody know whether you can configure the "Mac OS ROM" file, extracted from the Netboot package or the System Directory? To me the error lies there somewhere, because:
- the clients get the said directives (wireshark/tshark confirms this; read on)
- the clients download the Mac OS ROM by TFTP (this can be even verified on the client)
- the clients mount the right volume on the right server (whether they access the right folder is hard to tell, but I've tried sticking stuff in the root, and re-calculating the hex paths in the dhcpd.conf)
- the clients do NOT download/'properly access' the images on the share (there is Apple Filing Protocol over TCP traffic, but with the aforementioned quirks). My guess would be that the Mac OS ROM is actually still in control of the machine, and considering that the box repeatedly keeps accessing the shares, I assume it hasn't "handed over" to whatever it is in Mac OS9 that runs things. Note that the mouse (and presumably keyboard) remain active, there is no kernel panic in that sense, merely a failure to access the system to boot into the 'real OS'.

I have now acquired Mac OSX 10.4 server, and have installed the NetBoot services. Contrary to popular belief, this does not work out of the box, and requires some configuration too. Anyway, I will be setting it up the official way so that I can slowly but surely narrow the problem down, and try to duplicate the correct behavior on the Linux box. If anybody would like to know any specifics about the 10.4 server, or would like to see the debug logs of a working setup (once I've got this running), feel free to send me an email.
Due to the lack of support (no offense intended, and some individuals on Apple forums were quite helpful) I've noticed that this is a path less traveled, thus I feel somewhat obliged to provide as much information as I can, given the amount of time I have to spare for this endeavor.

Although I can't offer any assistance it's an interesting read and I would like to see how it all plays out.

Well, right now I'm playing with Mac OS X 10.4 server. I thought I'd do it the official way, and then bit by bit (I guess literally) move back to the Linux based server. However, I can't get the OS X server to provide the client with the correct info in the DHCP lease right now. I've looked at the official documentation, caused many hours of CPU time searching for non-existent files on the server, etc, but I cannot seem to figure out how to configure the lease without a 'dhcpd.conf'. The problem is that the lease provided only contains the TCP/IP parameters (IP, netmask, gateway, DNS, etc), and has no paths or image names in it.
Also, TFTPd seems to be playing up, and deciding not to offer the ROM file to the client, but first the lease needs to be corrected (it doesn't provide a 'boot-file' value either, according to wireshark/Cocoa Packet Analyzer).
I'm gonna get dirty with this very soon, and will definitely post my progress, or get carried away in a straitjacket, kicking and screaming...

Well, I've got one step further; I've dismissed the idea of trying OS X server for now, as this would appear to be more effort than it's worth. Now the booting Mac client finds the system image, accepts this, then proceeds to load the 'scratch disk image' and currently gets an access denied error from the netatalk daemon, even though permissions are correct and the image is not 'locked' (Finder). I had to make a second AppleTalk share for this to take place though; it would appear that the client refuses to even attempt to load a scratch image from the same volume. This contradicts the documentation at http://frank.gwc.org.uk/~ali//nb/ , but appears to work. This evening I will check all the related permissions and ACLs once again, to see whether I can get the scratch disk working. I guess that this would be the last required step, as the application image needs to exist, but does not need to contain anything, and cannot be written to. Maybe that straitjacket won't be necessary after all...

Right, mission accomplished. I have the setup working with the Linux server and the two Mac clients I previously mentioned. I've also taken the time to make a decent writeup and posted it on Google Docs. I hope (but doubt) that someone will find it useful.

http://docs.google.com/Doc?id=dgtdprg4_3h7cbjtgt

Kind regards

PelliX