README for Princeton mods version 1
October 14, 1996
irwin@princeton.edu


This file contains information about version 1 of the Princeton mods to
CMU-dhcpd 3.3.7.  This patch is also available at:

http://www.princeton.edu/~irwin/src/CMU-dhcpd-3.3.7-PU-patch1.Z

Most of the changes are intended to make the server adhere more closely
to the most recent BootP and DHCP specs available at this time.

---------------------------------------------------------------------

APPLYING THE PATCH

The patch is against dhcpd version 3.3.7 as distributed by CMU.  You
may have some local patches you'll need to reconcile against this.

I've included some of the changes others have suggested on the CMU 
bootpd mailing list, since I was able to test them here.

(I've *not* dealt with the gcc -I issue in the Makefile.  If you've
dealt with that already, you know what to do.  If that issue wasn't
a problem for you, then never mind.)

---------------------------------------------------------------------

PLATFORMS

This code is currently in production at Princeton, but has NOT been
tested extensively in all environments.  In particular, I've only
compiled and tested on SunOS 4.1.4 and Solaris 2.5.1.

---------------------------------------------------------------------

SPECS

The specs I've worked from are RFC 951, as updated by RFC 1542, RFC 1534,
draft-ietf-dhc-dhcp-07.txt, and draft-ietf-dhc-options-1533update-04.txt.

---------------------------------------------------------------------

WHAT'S CHANGED


Here's a summary of what I've changed; it is probably not a complete
list, since I'm summarizing things after the fact.

You can redefine what syslog facility messages are logged to.  To do
this, uncomment and edit the LOG_FACILITY definition in the main
Makefile.  (Do this in Makefile.in before running 'configure' if you
want this change preserved every time you run 'configure'.)

If we're running standalone, we'll write out the process ID to /etc/dhcpd.pid.
You can override the filename by uncommenting and editing the PID_FILE line
in the main Makefile.  (Do this in Makefile.in before running 'configure' if
you want this change preserved every time you run 'configure'.)
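
For illustration only (this is not the code in the patch), here is a
minimal sketch of the PID-file step.  The helper name write_pid_file()
is hypothetical; the default path is the one given above, and the
-DPID_FILE override shown in the comment is an assumption about how the
Makefile setting reaches the compiler.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Default taken from the text above.  Assumed to be overridable at
 * build time, e.g. with -DPID_FILE='"/var/run/dhcpd.pid"'. */
#ifndef PID_FILE
#define PID_FILE "/etc/dhcpd.pid"
#endif

/* Hypothetical helper: write our process ID, followed by a newline,
 * to `path`.  Returns 0 on success and -1 on failure. */
int write_pid_file(const char *path)
{
    FILE *fp;
    int ok;

    assert(path != NULL);
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    ok = fprintf(fp, "%ld\n", (long)getpid()) > 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A standalone server would call write_pid_file(PID_FILE) once, after it
has gone into the background, so that the file holds the PID of the
process that keeps running.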

I uncommented the call to enable SO_BROADCAST.  (Why was it commented out
for all platforms; on what platforms should it be commented out?)
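
For reference, enabling SO_BROADCAST on a datagram socket generally
looks like the sketch below.  The helper name enable_broadcast() is
hypothetical and this is not copied from the server source; it does
not answer the question of which platforms (if any) need the call
left out.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: allow broadcast datagrams to be sent on the
 * already-created UDP socket `fd`.  Returns 0 on success, -1 on
 * failure (errno is set by setsockopt). */
int enable_broadcast(int fd)
{
    int on = 1;

    assert(fd >= 0);
    return setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));
}
```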

We used to stay in the foreground (not fork) if you specified a debug
level greater than or equal to 3.  Now we always go into the
background, unless you specify the new "-f" (Don't Fork) option.
This lets you control your debug level and forking behavior
independently.

Many changes to debugging and log messages; see section below.

Many changes to the way we decide whether to assign/offer the client an
address, or be silent, or NAK the client.  See the section below.

Added new -u (Renew Unbound Statics) option, to allow multiple servers
handling statically-assigned addresses to provide better redundancy.
See the section below.

Fixed a memory leak, I hope.  

Restored the code that supports the CMU cookie-style bootp
requests/responses.

When sending DHCPACK, the old code could leave out the lease
time.  Ditto for DHCPOFFER.  This is fixed.

When sending DHCPNAK, the old code left out the server identifier
option.  This is fixed.

When sending DHCPNAK, the old code always inserted the clientid
option.  The specs do not seem to require this if the client did not
specify this option.  Including the clientid in this case could
conceivably confuse clients, since the way we infer a clientid for
clients who don't specify it might differ from their idea of their
clientid.  So if the client leaves out the clientid option, we leave it
out of our NAK.
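
The two NAK rules above (always send the server identifier; echo the
clientid only if the client sent one) can be sketched as follows.  This
is an illustration, not the patch's code: build_nak_options() and its
parameters are hypothetical names.  The option codes (53 message type,
54 server identifier, 61 client identifier, 255 end) and the DHCPNAK
message type value 6 come from the DHCP options spec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { OPT_MSG_TYPE = 53, OPT_SERVER_ID = 54, OPT_CLIENT_ID = 61, OPT_END = 255 };
enum { DHCPNAK = 6 };

/* Hypothetical helper: write the options of a DHCPNAK into `buf`.
 * `server_ip` is the 4-byte server address in network byte order.
 * `client_id`/`client_id_len` describe the client identifier exactly
 * as the client sent it; pass NULL/0 if the request carried none.
 * Returns the number of bytes written, or 0 if `buf` is too small. */
size_t build_nak_options(unsigned char *buf, size_t buflen,
                         const unsigned char server_ip[4],
                         const unsigned char *client_id, size_t client_id_len)
{
    size_t need = 3 + 6 + 1;   /* message type + server identifier + end */
    size_t n = 0;

    assert(buf != NULL && server_ip != NULL);
    if (client_id != NULL && client_id_len > 0) {
        if (client_id_len > 255)
            return 0;
        need += 2 + client_id_len;
    }
    if (buflen < need)
        return 0;

    buf[n++] = OPT_MSG_TYPE;  buf[n++] = 1;  buf[n++] = DHCPNAK;
    buf[n++] = OPT_SERVER_ID; buf[n++] = 4;
    memcpy(buf + n, server_ip, 4);
    n += 4;
    if (client_id != NULL && client_id_len > 0) {
        /* Only echoed when the client supplied it; never inferred. */
        buf[n++] = OPT_CLIENT_ID;
        buf[n++] = (unsigned char)client_id_len;
        memcpy(buf + n, client_id, client_id_len);
        n += client_id_len;
    }
    buf[n++] = OPT_END;
    return n;
}
```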

The dump_bootptab and dump_bindings functions (requested with USR1 and
USR2 signals) are now available even if you compile without DEBUG
defined.  (They're too useful to be treated as just code debugging
features.)

The code used to select an IP destination for an outgoing packet has
changed, to more closely follow the specs.  We should put less garbage
in our arp cache, and be able to get NAKs to clients correctly.
Multihomed hosts still will not be able to send a response packet to a
client, if the packet must be broadcast, and the client is attached to
one of the server's non-primary interfaces (see below).

Previously we assumed the server had just one IP address; the first one
returned by 'gethostbyname'.  While we still treat that as "my ip
address" (e.g.  when identifying ourself to clients), at startup we now
walk the OS's interface structures, to determine the IP address and
netmask for each interface.  (Displayed on startup at debug level 4 and
higher.) Although this server *still* is really not designed to work on
a multihomed host (see below), this at least lets us *correctly*
determine whether we need to prime our arp cache before sending a reply
packet.  And we can do a somewhat better job in determining if a
proposed IP address is appropriate for the network to which the client is
attached, when giaddr==0, though there are still serious problems with
this (see below).

Added more comments to information produced by dump_bindings.  Also
print information about cable and expire values.

Top-level Makefile was missing some header file dependencies.

dhcp_opts() treated the max msg size option as 4 bytes long;
spec says it is 2 bytes long.  Problem was pointed out by
bob@pimc.be (Bob Deblier) on CMU mailing list.
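
To show what the 2-byte reading looks like, here is a small sketch
(not the actual dhcp_opts() code).  The helper name max_msg_size() and
its calling convention are hypothetical; option code 57 and the 2-byte
network-order value are from the DHCP options spec.

```c
#include <assert.h>
#include <stddef.h>

#define OPT_MAX_MSG_SIZE 57   /* Maximum DHCP Message Size option code */

/* Hypothetical helper: decode a Maximum DHCP Message Size option.
 * `opt` points at the option's code byte, so opt[1] is the length
 * byte and opt[2..3] hold the value in network byte order.  `avail`
 * is the number of bytes readable starting at `opt`.
 * Returns the size, or 0 if the option is malformed. */
unsigned int max_msg_size(const unsigned char *opt, size_t avail)
{
    assert(opt != NULL);
    if (avail < 4 || opt[0] != OPT_MAX_MSG_SIZE || opt[1] != 2)
        return 0;
    return ((unsigned int)opt[2] << 8) | (unsigned int)opt[3];
}
```

Reading four bytes here instead of two would pull in the first two
bytes of whatever option follows, which is the kind of error the fix
addresses.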

Included the following changes to the Makefiles, published by
Mohamed Ellozy <ellozy@netman-mel.dfci.harvard.edu> on CMU mailing list:
add dependencies upon ../config.h to the snmpd, snmplib and dhcpcmd
makefiles;  make clean cleans out subdirectories; changing CC
in top-level Makefile is propagated to lower-level makefiles;
dhcpdcmd added to 'all' target in top-level Makefile.

Included fix to increment of tmpaddr in kw_network(), published
by Petr Lampa <lampa@fee.vutbr.cz> and Jeff Licquia <jal@cssroute.lexdoc.com>
on the CMU mailing list.

Included fix to uninitialized fromlen variable in ping(), published by
Petr Lampa <lampa@fee.vutbr.cz>, and David M.  Meyer
<meyer@network-services.uoregon.edu> on the CMU mailing list.

Included fix to include various header files, published by
Petr Lampa <lampa@fee.vutbr.cz> on the CMU mailing list.

Included fix to processing of END tag (resetting overload flag),
published by Robert Lee <robert@spin.net.uky.edu> and
Mel Lew <melvin@columbia.edu> on the CMU mailing list.

Included fix to dhcpdecline() that prevented it from calling dhcp_opts(), 
published by Nancy L Wong <nancy@watsun.cc.columbia.edu>
on the CMU mailing list.

Included printing of 'dy' tag when dumping a host entry,
as pointed out by Petr Lampa <lampa@fee.vutbr.cz> on the
CMU mailing list.

Numerous small bugs fixed.  Numerous bugs probably introduced.

---------------------------------------------------------------------

HOW WE SELECT AN ADDRESS TO OFFER/ASSIGN, OR NAK

I had a lot of problems with the way the old code selected
among sending an address to the client, being silent, or
sending a NAK to the client.   Too often I'd find that
it was responding affirmatively to a client that it should
have been ignoring or NAKing, because it didn't perform
all the appropriate tests.  (In a few cases, I found it
NAKed when it should have been silent, too.)

I ended up revising many of the routines that make these
decisions (bootp_request(), dhcp_discover(), and dhcp_request()),
so that they behave much more as the relevant specs say they
should.  Ditto for the support routine check_net(), and a
new routine is_local().

We're also much more particular about the client sending us request
packets that meet the current specs.  That's because we are looking
more closely at some of the fields to determine what state the
client is in.  We need to do this because the spec often prescribes
different server behavior based upon the client's state.  (This is
especially true for the much-overloaded DHCPREQUEST packet.)  So if you
have clients that produce request packets that don't meet the current
spec, you could have some problems with this new code.

The upshot is that you are likely to see different behavior from the
server here.  For example, we're smarter about dealing with clients who
move among networks, or who ask to use someone else's IP address, etc.
If you don't understand why the server is assigning (or not assigning)
something here, the best thing to do is to push your debug level up to
7.  Or read the comments in the source for those functions.  The code
isn't as pithy and fast as it was before, but it should be clearer what
we're doing.

I could not always get the server to behave perfectly in accordance
with the specs, because sometimes that would require information not
always available to me via the standard sockets interface on all
platforms.  (E.g. the interface on which a packet arrived, the IP
destination address of a packet.) In those cases, I chose to make the
server behave conservatively (e.g. be silent instead of NAKing).
Comments in the code describe what's going on.

---------------------------------------------------------------------

RENEW UNBOUND STATICS OPTION

I added a -u option that will be of interest if all of the following are
true for you: you are doing DHCP, you are assigning (some or all) addresses
statically, and you have multiple servers.

The spec says that when a client goes to renew, it unicasts the
DHCPREQUEST to the server to which it was bound.  If it gets no answer
to its retries after some time, it goes into REBINDING state, and
broadcasts the renewal request, retrying until the lease expires.

When we receive a DHCPREQUEST from a client in RENEWING (or REBINDING)
state, we normally only grant the renewal if the client has an
unexpired lease with us.  Otherwise we ignore the request (unless we're
SURE the address the client is asking for is inappropriate), since the
server the client is bound to should answer.

Now, if we had a server-to-server protocol in place today, our multiple
servers could exchange lease information.  Then when a client went into
REBINDING, any server that has received the lease info could respond to
the client.  But we don't have a standard server-to-server protocol
today.

As a temporary measure, the -u option will let your multiple servers
provide redundant service in this case, *if* the IP address in question
is one that is statically-assigned to the client that's asking.  So if
you have multiple servers all with the same set of static assignments
in their bootptabs, specifying "-u" on all of them will let them
provide redundant service when clients go into REBINDING because their
original server is unavailable.

If the requested IP address is one that is dynamically-assigned, the
"-u" option has no effect.  The server will not provide redundant
service if it sees the requested IP address is listed in the bootptab
as dynamically-assigned.  (That's because in the absence of a
server-to-server protocol, I assume that you must give each of your
servers independent ranges of dynamic IP addresses.)

By the way, this "-u" option has another use, even if you don't have
multiple servers.  Say you have clients who have unexpired leases, but
your server doesn't know about their unexpired leases.  (That would
happen after you convert from another dhcp server product to this one,
or you erase the bindings directory contents because it is corrupted.)
Starting your server with "-u" will help deal with the problem, since
when the clients with leases try to renew, you'll be able to answer
them, even though you didn't have them in your list of unexpired
leases.  Of course, this still only works if we're dealing with
statically-assigned IP addresses, not dynamically-assigned ones.

---------------------------------------------------------------------

LOG MESSAGES AND DEBUGGING

More debugging messages are available, and most existing ones have
changed.

The syslog levels at which we log messages have also changed.  Our rule
of thumb is that LOG_ERR indicates serious problems with the server
itself, or its databases.  LOG_WARNING and LOG_NOTICE indicate
conditions that are quite unusual and you may want to look into.  Most
messages are LOG_INFO level, and show the server in action.  Messages
at LOG_DEBUG are really intended to debug any problems in how the
server maintains its data structures.

We also support a wider range of debug levels.  We produce increasingly
detailed messages as the debug level increases to 16.  Here's a rough
idea of what messages are added at each level:

0: server errors, operational problems, database corruption

1: major events affecting server's operation (e.g. rereading
   databases)

2: arrival of request packet, IPsrc, length, the packet type,
   the clientid or hardware address, and what IP address we
   assigned/offered (only if we decided to do so)

3: adds just a few messages indicating problems with the 
   request packet (short packets, can't decode)

4: adds most of the info about what state the client must
   be in (except a few repetitive ones), and most of the
   steps/decisions the server makes to decide what IP address
   to assign/offer, or why it ignores or NAKs the client.
   also reports interface info at startup.

5: adds just a few messages about the state the client must
   be in that tend to be wordy and repetitive

6: no additional messages

7: adds just a few messages indicating when we are searching
   for an address to assign/offer, and appropriate-network
   checks.  Also adds messages showing how we determine the
   IPdst of response packets

8: no additional messages

9: no additional messages

10: shows the contents of some less-interesting fields in the packet
    (e.g. bootfile, vendor magic cookie).  For DHCP, prints the
    lease expire, rebind, renew times.  Tells you about bootp requests
    with vend fields shorter than legal minimum.

11: show start and completion of dhcp garbage collection, and
    flush lastbindings to disk

12: when reading bindings from disk at startup, details each
    record read and whether it was added to current bindings.
    Also during DHCP garbage collection, details what's
    done with each binding.  When re-reading conf file or bootptab,
    details which bindings are being removed.

13: Shows bootptab lastmod time every time packet is received.
    When sending packets, shows max length of packet and number
    of bytes left.

14: no additional messages

15: no additional messages

16: all remaining low-level debug messages, typically involving
    maintenance of the data structures.  To get these messages,
    you will also need to uncomment the DEBUG in the top-level
    Makefile.  


Note that the DEBUG definition in the Makefile now only controls the
availability of the level 16 messages.  (The choice between compiling
with the -g or -O option has been moved to a new OPT definition in the
top-level Makefile.)

These changes will provide you with greater ability to decide what
should get logged, and where, and to understand what the server is
doing and why.

If you're not having any problems, but just want to understand the
decisions the server is making, try running at debug level 7.  Unless
you have lots of free space or few clients, you normally won't want to
run with debug higher than 2.
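
The idea of cumulative debug levels can be sketched as below.  This is
an illustration only: the global debug_level and the helper debug_log()
are hypothetical names, and the sketch writes to a stdio stream where
the real server hands its messages to syslog.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

int debug_level = 2;   /* hypothetical global: current debug level */

/* Hypothetical helper: emit a message only if the current debug level
 * is at least `min_level`.  Returns 1 if the message was written,
 * else 0. */
int debug_log(FILE *out, int min_level, const char *fmt, ...)
{
    va_list ap;

    assert(out != NULL && fmt != NULL);
    if (debug_level < min_level)
        return 0;
    va_start(ap, fmt);
    vfprintf(out, fmt, ap);
    va_end(ap);
    fputc('\n', out);
    return 1;
}
```

With debug_level set to 2, a call tagged with level 2 is written and a
call tagged with level 7 is dropped; raising debug_level to 7 lets both
through.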

---------------------------------------------------------------------

MULTIHOMED HOSTS

I recommend against running this server on a multihomed host.  There
are two major issues here:

a) Often the server needs to send a reply to the broadcast IP address.
On a multihomed host, which interface this packet goes out is not
well-defined.  This is (deliberately) left unspecified in the current
Host Requirements RFC.  Perhaps the OS copies it to all interfaces; on
many hosts, it goes out the first interface that was configured.  In
the latter case, if the client is attached to a different interface,
then the broadcast response won't reach the client.

I considered the following workaround: for a multihomed host, replace
each outgoing broadcast packet with a series of subnet-directed
broadcasts, to force one out of each interface.  While that would get
the packets out all the interfaces, I didn't pursue that because then
the reply packets would contain destination IP addresses other than
255.255.255.255 -- which means that according to the BootP and DHCP
specs, they would not be valid BootP and DHCP replies.  Presumably many
clients would not recognize them as broadcasts (remember that these
clients may not know the subnet mask yet).
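
To make the objection concrete, here is how a subnet-directed
broadcast address is formed (a sketch of the arithmetic only; the
helper name directed_broadcast() is hypothetical and nothing like it
is in the patch, since I did not pursue this workaround).

```c
#include <assert.h>
#include <stdint.h>

/* Subnet-directed broadcast address for an interface: the interface
 * address with every host bit set.  Values are in host byte order. */
uint32_t directed_broadcast(uint32_t addr, uint32_t netmask)
{
    assert(netmask != 0);   /* an empty mask describes no subnet */
    return (addr & netmask) | ~netmask;
}
```

For an interface at 192.168.1.10 with netmask 255.255.255.0 this gives
192.168.1.255, not 255.255.255.255, which is exactly why such replies
would not look like valid broadcasts to a client that does not know
its subnet mask yet.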

b) The spec says the server should, in most circumstances, check that
the IP address it is about to hand to a client is appropriate for the
network to which the client is attached.  If the request packet was
broadcast from a non-local network, we determine to what network the
client is presently attached by inspecting giaddr.  

But giaddr will *not* be set if the client is attached to a local
network, or if the request packet was unicast by the client.  When
giaddr is not set, the server *used* to assume the request packet
arrived on the server interface corresponding to the first IP address
returned by 'gethostbyname' on the server.  If the client was actually
on a different local network, this meant the server was making
decisions based on an incorrect notion of the network the client was
presently attached to.  This led to errors in dynamic IP assignments.
(It didn't lead to errors in static IP assignments, since the server
wasn't checking those before handing them out.  But my mods check
before handing out static assignments as well, as per spec, so these
would be affected as well.)

If we could determine on what interface a request packet arrives, we
could deal with this.  I don't see a portable way to do that while
still using datagram sockets.  (I think one can partially solve the
problem by opening up a datagram socket bound to each interface's IP
address, instead of one bound to INADDR_ANY.  But you also have to
listen for broadcasts, so I don't see how to discover on what interface
those arrived.)

So since we don't know on what interface the packet arrived, the *new*
code checks the proposed IP address against *all* of the networks to
which the server is attached; if the IP address is appropriate for
*any* of those networks, we say that the address is OK.  (We determine
the list of server-attached networks by walking the OS's interface list
and grabbing IP addresses and netmasks.)
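
The any-attached-network test can be sketched as below.  This is a
simplified illustration, not the patch's check_net() or is_local():
struct iface_net and addr_on_any_local_net() are hypothetical names,
and addresses are assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One server interface, as gathered at startup (host byte order). */
struct iface_net {
    uint32_t addr;
    uint32_t netmask;
};

/* Hypothetical helper: return 1 if `candidate` falls on the network
 * of any interface in `ifs`, else 0. */
int addr_on_any_local_net(uint32_t candidate,
                          const struct iface_net *ifs, size_t n)
{
    size_t i;

    assert(ifs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if ((candidate & ifs[i].netmask) == (ifs[i].addr & ifs[i].netmask))
            return 1;
    }
    return 0;
}
```

Note what the sketch cannot do: it says an address is OK if it fits
any attached network, not the one the client is actually on.  That is
the weakness described above.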

That's why I recommend not running the server on a multihomed host.  If
you must, then you can still avoid any problems by making sure there
are NO bootp or dhcp clients on any of the networks directly attached
to any of the server's interfaces.  (If there are no such clients, it
won't be forced to make the bad decisions.) If you must have clients on
those networks, then at least be sure that you're not assigning any
dynamic IP addresses on these networks.


---------------------------------------------------------------------

KNOWN BUGS

If you are building on a machine that needs 'signal()' called
after handling a signal, add "-DSYSV" to the CFLAGS in the Makefile.
(E.g. Solaris 2.5.1)  (Do this in Makefile.in before running 'configure'
if you want this change preserved every time you run 'configure'.)
As pointed out by Petr Lampa <lampa@fee.vutbr.cz> on the CMU mailing
list, the configure program fails to do this.

When a client successfully renews a lease, much of the information we
provide in the ACK is copied from the current lease.  This is a problem
if any of those values have changed.  (E.g. you fiddle with lease
values in the bootptab, or router list, subnet mask, etc.)  To get the
new values to the client, the existing lease must go away -- e.g. by
expiring, the client releasing it, or the server deciding to remove it
with extreme prejudice.   Note that major changes that invalidate the
client's lease (e.g. removal of client from bootptab, if limiting
service to "registered" clients) will cause the server to remove the
unexpired lease, so that's one way around it.  Sort of.

If you change any of the DHCP lease time values (lease time, rebinding
time, or renewal time in bootptab; or DEFAULT_LEASE definition), these
new values will NOT be provided to clients if they renew unexpired
leases.  When an unexpired lease is renewed, the expire/renew/rebind
time values for the extended lease are just copied from the old lease.
If the client's lease expires (or is otherwise discarded), the next
lease the client is granted will reflect the new values.

Should we really be adding a route to 255.255.255.255 on all OS's?

I wouldn't be surprised to hear that there are still more memory leaks,
or (worse) that we're discarding some information we still need.
(Watch out for bad pointer derefs.)

Don't specify lease times in the optional $DHCP or $Vxxxx bootptab
entries.  We're not looking in those templates correctly for lease time
information.

You may find the new code runs slower than the old code.  I found that
when the old code received a request packet, it often assigned an IP
address without performing many of the tests that were in the spec.
Adding those tests almost certainly slowed things down.  My top
priority has been to get the code to answer correctly, so I've not
spent time optimizing.  There's also some similar code scattered
through bootp_request(), dhcp_request(), and dhcp_discover() that I
have not factored out, because there are enough differences that
parameterizing them would have made it less readable while I was
debugging.

There are a variety of platform-specific include file fixes reported
on the CMU mailing list, but I didn't include them since I wasn't able
to test them.  Ditto for some linux-specific fixes.

As mentioned earlier, I haven't dealt with the -I issue in the top-level
Makefile.

---------------------------------------------------------------------

LEGAL STUFF

You may copy, modify, and redistribute these modifications and
documentation (the "product"), as long as doing so will not violate any
copyrights, patents, or other restrictions that may be imposed
on the underlying product by Carnegie Mellon University.

The "product" is provided "as is" without warranty of any kind, either
express or implied, including without limitation any warranty with
respect to its merchantability or its fitness for any particular
purpose.  The entire risk as to the quality and performance of the
product is with you.  The author and Princeton University do not
warrant that the functions contained in the product will meet your
requirements or that the operation of the product will be uninterrupted
or error-free, or that defects in the product will be corrected.