MaraDNS troubleshooting guide

This troubleshooting guide is an example of how problems with MaraDNS may be resolved without needing to wait for support on the MaraDNS mailing list. This guide is a troubleshooting example that uses MaraDNS on a CentOS 3.8 system. People using MaraDNS on other systems will have to adapt this guide to suit their needs.

The problem we will troubleshoot in this example is MaraDNS not responding to DNS queries. As we will see in this guide, a number of different issues can cause this problem, and resolving the problem depends on what issue is causing the problem.

As just some of the possible issues, it is possible that the MaraDNS process is not running at all. It's possible that MaraDNS is running, but can't bind to the assigned IP (because of a Linux bug, MaraDNS can not accurately report this problem when run in Linux).

Here are some hints:

In the above mararc file, MaraDNS has the IP 127.0.0.1, would look for zone files in the directory /etc/maradns, and allows recursive DNS queries on the loopback interface.

OK, so let's look at some problems, as they appear on a CentOS 3.8 box with the above mararc file.

This is how things look when we don't have a loopback interface to bind to. Like in all examples in this guide, the '$' character indicates a line that we type data on; all other lines, including lines that start with '#', are lines created by the programs we are running in these examples.

$ askmara Awww.google.com.
# Querying the server with the IP 127.0.0.1
# Hard Error: Unable to send UDP packet!
Basically, the askmara client is unable to send a query because there is no way for it to contact a server on 127.0.0.1. Probably because there is no 127.0.0.1 to send the packet on. So, let's start troubleshooting.
$ export PATH=$PATH:/sbin:/usr/sbin:/usr/local/sbin
This gives us access to commands like ifconfig and what not.

$ su
Password:
type in your root password here

$ ifconfig lo 127.0.0.1
$ askmara Awww.google.com.
# Querying the server with the IP 127.0.0.1
# Hard Error: Timeout
OK, so let's restart MaraDNS:
$ /etc/rc.d/init.d/maradns restart
Sending all MaraDNS processes the TERM signal
waiting 1 second
Sending all MaraDNS processes the KILL signal
MaraDNS should have been stopped
Starting all maradns processes
Starting maradns process which uses Mararc file /etc/mararc
If /etc/rc.d/init.d/maradns restart doesn't generate the above output, this indicates that either MaraDNS was not correctly installed, or that you are using MaraDNS on another Linux/*NIX distribution. If you're not using CentOS or Red Hat Enterprise Linux, replace this command with the appropriate command for restarting a daemon/service for your operating system.

Now, lets look at some possible replies.

Server failure

$ askmara Awww.google.com.
# Querying the server with the IP 127.0.0.1
# Remote server said: SERVER FAILURE
# Question: Awww.google.com.
# NS replies:
# AR replies:
This is the askmara output when MaraDNS is running correctly but is unable to connect to DNS servers on the internet. This can be caused when the machine running MaraDNS does not have an internet connection, or when MaraDNS is being firewalled.

So, we get the internet connection up and going. If you have a working ethernet card and are on a network with internet access, this is as simple as making a DHCP request for an IP:

$ dhclient
Internet Systems Consortium DHCP Client V3.0.1
Copyright 2004 Internet Systems Consortium.
All rights reserved.
For info, please visit http://www.isc.org/products/DHCP

/sbin/dhclient-script: configuration for eth0 not found. Continuing
with defaults.
/sbin/dhclient-script: line 52: eth0: No existe el fichero o el directorio
Listening on LPF/eth0/00:40:f4:17:ac:e9
Sending on   LPF/eth0/00:40:f4:17:ac:e9
Listening on LPF/lo/
Sending on   LPF/lo/
Sending on   Socket/fallback
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6
DHCPOFFER from 10.1.2.1
DHCPREQUEST on eth0 to 255.255.255.255 port 67
DHCPACK from 10.1.2.1
/sbin/dhclient-script: configuration for eth0 not found. Continuing
with defaults.
/sbin/dhclient-script: line 52: eth0: No existe el fichero o el directorio
bound to 10.1.2.3 -- renewal in 255 seconds.
Note that if you are using something besides CentOS or Red Hat Enterprise Linux, the command for getting a DHCP lease may not be dhclient.

Now, the dhclient that CentOS 3.8 comes with is buggy, and breaks lo (the loopback interface which gives CentOS the 127.0.0.1 IP address). So, we have to fix lo again:

$ ifconfig lo 127.0.0.1
In addition, losing 127.0.0.1 breaks any service bound to 127.0.0.1, such as MaraDNS, so we have to rebind MaraDNS to 127.0.0.1:
$ /etc/rc.d/init.d/maradns restart
Sending all MaraDNS processes the TERM signal
waiting 1 second
Sending all MaraDNS processes the KILL signal
MaraDNS should have been stopped
Starting all maradns processes
Starting maradns process which uses Mararc file /etc/mararc
Keep in mind that MaraDNS binds to high-numbered ports when sending outgoing DNS requests. The "Firewall Configuration" section of the MaraDNS man page gives details.

The problem with UNIX firewalls is that there is no standard interface for configuring them, so I can't help you as well as I would like here. CentOS 3.8, by default, has a firewall that allows MaraDNS to act as a recursive nameserver on the loopback (127.0.0.1) interface, but the firewall needs to be changed to work on other interfaces:

$ redhat-config-securitylevel-tui
And select "personalize", and add "53:udp" as a hole in the firewall. Yes, the interface for this program is somewhat primitive; hopefully CentOS 4 has a more complete interface.

You will have to do a similiar configuration change to any firewalls between your server and the internet.

Timeout

It is also possible to get a timeout after sending an askmara query. A timeout looks like this:
$ askmara Awww.google.com.
# Querying the server with the IP 127.0.0.1
At this point, there is a 30 second delay. After the delay, askmara outputs this message:
# Hard Error: Timeout
This is usually caused by one of two problems: To see if MaraDNS is running, run ps like this:
$ ps auxw | grep maradns
If MaraDNS is running, the output will look like this:
root      2023  0.0  0.0  1516  304 pts/1    S    11:46   0:00 /usr/bin/duende 
/usr/sbin/maradns -f /etc/mararc
nobody    2024  0.3  0.1  1748  596 pts/1    S    11:46   0:00 /usr/sbin/maradns
 -f /etc/mararc
#66       2025  0.0  0.0  1520  440 pts/1    S    11:46   0:00 /usr/bin/duende 
/usr/sbin/maradns -f /etc/mararc
user      2027  0.0  0.1  3720  700 pts/1    S    11:46   0:00 grep maradns
If MaraDNS is not running, the output will look like this:
user      1983  0.0  0.1  3728  696 pts/1    S    11:45   0:00 grep maradns

If MaraDNS is not running, there may be a message in the log files indicating why MaraDNS failed when you tried to start MaraDNS. Look at the log:

$ su
Password:
$ grep maradns /var/log/messages | more
The messages will give you a hint as to what is preventing MaraDNS from starting up. If there are no MaraDNS messages in your log, there is something wrong with your MaraDNS installation.

Conclusion

Basically, the best strategy for troubleshooting problems with MarNS is to have the mararc file be a simple three line mararc file. If things still don't work, the problem is probably outside of MaraDNS.