Friday, July 27, 2007

Disk space shortage wreaks havoc on a domain controller

One day our Windows 2003 domain controller ran out of disk space, and a slew of problems ensued. Apparently, Active Directory synchronization failed due to insufficient space, and the event log filled up with messages like these:
Event Type: Warning
Event Source: NETLOGON
Event Category: None
Event ID: 5705
Description:
The change log cache maintained by the Netlogon service for  database changes is inconsistent. The Netlogon service is resetting the change log.

Event Type: Warning
Event Source: W32Time
Event ID: 26
Description:
Time Provider NtpClient: The response received from domain controller  has a bad signature. The response may have been tampered with and will be ignored.

Event Type: Error
Event Source: NETLOGON
Event ID: 5805
Description:
The session setup from the computer  failed to authenticate. The following error occurred: 
Access is denied. 

Event Type: Warning
Event Source: LSASRV
Event Category: SPNEGO (Negotiator) 
Event ID: 40960
Description:
The Security System detected an authentication error for the server cifs/.  The failure code from authentication protocol Kerberos was "The attempted logon is invalid. This is either due to a bad username or authentication information. (0xc000006d)".

Event Type: Error
Event Source: Kerberos
Event ID: 4
Description:
The kerberos client received a KRB_AP_ERR_MODIFIED error from the server host/.  The target name used was ldap//@. This indicates that the password used to encrypt the kerberos service ticket is different than that on the target server. Commonly, this is due to identically named  machine accounts in the target realm (), and the client realm.   Please contact your system administrator.

Event Type: Error
Event Source: NETLOGON
Event ID: 3210
Description:
This computer could not authenticate with \\, a Windows domain controller for domain , and therefore this computer might deny logon requests. This inability to authenticate might be caused by another computer on the same network using the same name or the password for this computer account is not recognized. If this message appears again, contact your system administrator.
Even after we freed some space, the problems continued, because the Active Directory database was damaged. The solution was to reset the computer's domain account password using netdom, as documented in Microsoft KB article 260575.

What does "localhost" mean in named.conf?

While configuring BIND 9, I ran across an interesting issue. What does the following excerpt from named.conf mean?
options {
        listen-on port 53 { localhost; };
        ...
};
I thought this told BIND to listen only on the loopback address (127.0.0.1). After all, this is what "localhost" usually resolves to. To my great surprise, I found that BIND was listening on all network interfaces. As it turns out, in the context of BIND configuration, localhost "Matches the IPv4 and IPv6 addresses of all network interfaces on the system." Go figure! The correct configuration is as follows:
options {
        listen-on port 53 { 127.0.0.1; };
        ...
};
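
One way to confirm which addresses BIND is actually bound to is to filter the system's socket listing for port 53. The following is only a sketch: the helper name non_loopback_dns_listeners is hypothetical, and it assumes the column layout printed by `ss -H -tuln` from a reasonably recent iproute2 (local address and port in the fifth column). On older systems the netstat output has a different layout and the column number would need adjusting.

```shell
#!/bin/sh
# Reads socket listings in the layout printed by `ss -H -tuln`
# (assumed columns: Netid State Recv-Q Send-Q Local-Address:Port Peer-Address:Port)
# and prints every port-53 listener that is NOT bound to a loopback address.
non_loopback_dns_listeners() {
    awk '$5 ~ /:53$/ && $5 !~ /^127\./ && $5 !~ /^\[::1\]/ { print $5 }'
}

# Typical use on a live system (empty output means loopback only):
#   ss -H -tuln | non_loopback_dns_listeners
```

If the function prints nothing after named has been restarted with the 127.0.0.1 setting, the server is no longer listening on the other interfaces.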

Sunday, July 15, 2007

Masquerading multiple IPsec connections on a Linux router

Our company network is set up in a rather standard way. We have a Linux router connected to an ISP with one network card, and to the local network with another one. The local network uses the reserved IP range 192.168.1.0-192.168.1.255. The router does Network Address Translation (NAT), also known as IP Masquerade, to give the internal hosts transparent access to the Internet. This type of configuration is widely used and well documented, for example in the Linux 2.4 NAT HOWTO and the Linux IP Masquerade HOWTO.

IP masquerade works well for TCP and UDP, but what about more exotic IP protocols? It is not that easy when it comes to IPsec VPNs. We need to connect to a customer's site that uses the Contivity VPN from Nortel, an IPsec-based product. Unfortunately, masquerading several IPsec connections through one router is a non-trivial task that Linux is currently not capable of. An IPsec connection involves a handshake over UDP, after which the data is transmitted over IP protocol 50 (Encapsulating Security Payload, ESP). Since there is no connection tracking support for IP protocol 50 in the Linux kernel, only one internal client can connect to a given remote IPsec VPN server at any time, because the kernel cannot tell one connection from another. This was not acceptable to us, since we need several people to work with the customer's VPN simultaneously.

The NAT Traversal feature of the IPsec protocol is supposed to resolve this problem. For some reason, however, NAT Traversal didn't work with the customer's VPN server. Attempts to persuade the customer's IT personnel to look into the issue and enable NAT Traversal proved unsuccessful (they either didn't understand what the problem was, or didn't care, or both). We had to resolve the issue ourselves.

After some thought, I found a solution. First, it involved getting several external IP addresses from our ISP, one for each person who needed to work with the customer's VPN. Luckily, it wasn't many, just five. The idea then was to route the IPsec traffic from the five internal clients through separate external IP addresses. This allows the kernel to keep track of each connection separately and not mix them up. Here's how I achieved this using iptables.

For the sake of example, I'll assume that our external IP numbers were 1.1.1.1, 1.1.1.2, 1.1.1.3, 1.1.1.4 and 1.1.1.5. All these numbers were bound to our external network interface as aliases. Assuming our external interface is called eth1, these new addresses were assigned to alias interfaces eth1:1, eth1:2, etc. up to eth1:5. Also for the sake of example, assume that the client computers that need to talk to the customer's VPN have internal IP addresses 192.168.1.1 - 192.168.1.5 and that the customer VPN server's address is 2.2.2.2.
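
The alias setup described above could be scripted roughly as follows. This is a dry-run sketch that only prints the commands rather than running them. The function name print_alias_commands is hypothetical, and the /24 prefix length is an assumption, since the netmask is not given in this example; use whatever your ISP assigned.

```shell
#!/bin/sh
# Dry run: print the commands that would bind 1.1.1.1 - 1.1.1.5
# to eth1 as the aliases eth1:1 ... eth1:5.
EXT_IF=eth1
PREFIX_LEN=24   # assumption: the real prefix length comes from the ISP

print_alias_commands() {
    i=1
    while [ "$i" -le 5 ]; do
        echo "ip addr add 1.1.1.$i/$PREFIX_LEN dev ${EXT_IF} label ${EXT_IF}:$i"
        i=$((i + 1))
    done
}

print_alias_commands
```

Once the printed commands look right, they can be run as root, or added to the distribution's network configuration so that they survive a reboot.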

With this setup, I added the following iptables rules to the NAT table:

iptables -t nat -A POSTROUTING -s 192.168.1.1 -d 2.2.2.2 -j SNAT --to-source 1.1.1.1
iptables -t nat -A POSTROUTING -s 192.168.1.2 -d 2.2.2.2 -j SNAT --to-source 1.1.1.2
...
iptables -t nat -A POSTROUTING -s 192.168.1.5 -d 2.2.2.2 -j SNAT --to-source 1.1.1.5

This tells the kernel the following: right after routing (POSTROUTING chain), if the packet is coming from 192.168.1.x (a VPN client) to 2.2.2.2 (the VPN server), masquerade it using source address 1.1.1.x.
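
Since the five rules differ only in the last octet, they can be generated in a loop. The sketch below prints the rules instead of applying them; the function name print_snat_rules is hypothetical, and it assumes, as in the description above, that every client talks to the same VPN server at 2.2.2.2.

```shell
#!/bin/sh
# Dry run: print one SNAT rule per VPN client, mapping
# 192.168.1.N to the external address 1.1.1.N for N = 1..5.
VPN_SERVER=2.2.2.2

print_snat_rules() {
    i=1
    while [ "$i" -le 5 ]; do
        echo "iptables -t nat -A POSTROUTING -s 192.168.1.$i -d $VPN_SERVER -j SNAT --to-source 1.1.1.$i"
        i=$((i + 1))
    done
}

print_snat_rules
```

To apply the rules for real, the output can be piped into a root shell, for example with `print_snat_rules | sh`.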

An important note is that these rules should precede the rules that masquerade the entire network. I have a rule like this:

iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth1 -j MASQUERADE
and it is added to the POSTROUTING chain after the special rules above.
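
Because the first matching NAT rule wins, it may be worth checking the order mechanically. The sketch below reads a rule listing in the format printed by `iptables -t nat -S POSTROUTING` and succeeds only if no SNAT rule appears after a MASQUERADE rule. The function name snat_precedes_masquerade is hypothetical.

```shell
#!/bin/sh
# Exit status 0 if every SNAT rule on stdin comes before the first
# MASQUERADE rule, 1 otherwise. Input is expected in the format of
# `iptables -t nat -S POSTROUTING`, one rule per line.
snat_precedes_masquerade() {
    awk '
        /-j MASQUERADE/        { masq_seen = 1 }
        /-j SNAT/ && masq_seen { bad = 1 }
        END                    { exit bad + 0 }
    '
}

# Typical use on the router (as root):
#   iptables -t nat -S POSTROUTING | snat_precedes_masquerade && echo "order OK"
```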

The above technique makes it possible to masquerade multiple IPsec (or indeed any IP protocol) connections when NAT Traversal is not available. That said, I would much prefer it if NAT Traversal worked and saved me the headache.