The Hitchhogs Guide to IP-Tables
Building a firewall with Linux is actually a very easy business. This article describes some of the basics and gives a very simple set of instructions to build your own firewall. It is not meant as an exhaustive documentation of all features of IP-Tables.
This guide applies to all IP-Tables (Netfilter) based systems, that is Linux 2.4.*, and 2.6.*.
If you are already familiar with IP, TCP, UDP, and routing: skip to the next chapter.
Usually one just wants TCP/IP because that is what makes the Internet go and me surf. What we need to understand for building firewalls is that this consists of several network protocols.
IP is a packet based transport protocol. It knows the relationship between the local address (eg. 10.1.2.3) and a network interface (eg. eth0). It also knows to what interface and next computer to send a data packet by its destination address. The next computer (if it is not the final recipient) will also know where to send it next, and so on, until the packet arrives.
UDP is a packet transmission protocol on top of IP. Basically it enhances IP (which knows only addresses) by the concept of ports (in the range of 1-65535, 0 is reserved) - so that up to 65535 different communication channels can be used between hosts.
TCP is a stream protocol. Like UDP it enhances IP by ports, but it also gives the program using the protocol the impression that it is transmitting a contiguous stream of data - TCP internally takes care of dividing the data into packets, re-assembling it into a stream and making sure that all bytes arrive exactly once and in the right order.
A TCP connection goes through several stages: connection establishment, communication, connection tear-down. Packets can be marked with different flags that tell the other side what action is expected. Connection establishment uses the so-called three-way handshake: the initiating host (client) sends a packet marked SYN to the receiving side (server), telling it that it wants to communicate. The server may deny (RST packet) or acknowledge (SYN, ACK), then the client acknowledges the server's SYN (only ACK). Normal communication uses unmarked packets or any combination of the flags ACK, URG, PSH for various reasons. Connection tear-down uses the FIN and ACK flags to finish the connection.
There are two more protocols of interest. ICMP is used for error messages and, in IPv6, for local link control. Some people filter certain ICMP messages in the hope that this makes specific attacks more difficult; usually those benefits are highly theoretical and the effect is nil with modern attack kits, hence the following sections accept ICMP unconditionally. ARP is used by IPv4 to resolve the link between IP addresses and local network addresses (eg. ethernet MAC). It is only locally relevant and cannot be avoided; it is not even configurable in the IPv4 filter chains.
Other protocols (GRE, IP-in-IP tunnels, IP/Sec, SCTP) will be largely dropped in the rule sets of this guide, if you need them you can easily add rules to allow them once the firewall is running.
Other higher level protocols (like HTTP, or FTP) usually work on top of either TCP or UDP.
Because we will pretty soon run out of IPv4 addresses, a switch to the new protocol version IPv6 will have to be made soon. There will be a while during which both protocols can and will run in parallel. Hosts that support both are called dual-stack hosts.
IPv6 addresses are 128 instead of 32 bits wide. They are usually in colon-separated-hex format, eg. 2001:db8:dead:beef::2. Please check Wikipedia on IPv6.
TCP and UDP still work the same way (with only very minor differences).
IP-Tables provides sets of tables for each protocol. We will be concerned with the tables for IPv4 and IPv6 here. Since IPX and other low-level protocols are very rarely used these days, I'll ignore them here.
For IPv4 the utilities important to us are iptables, iptables-save, and iptables-restore.
The filter table is the most important here: it is used to filter out unwanted traffic and let through traffic that is wanted.
The mangle table will be ignored here since it is used to change the content of packets, which is somewhat outside the scope of the standard firewall.
The nat table is used for hosts that either need to hide a whole network behind a gateway (source NAT, Masquerading) or that forward specific ports to different hosts (destination NAT). I'll give a few examples below.
For IPv6 the utilities are named accordingly: ip6tables, ip6tables-save, ip6tables-restore.
The filter and mangle tables serve the same purpose as above.
The nat table does not exist for IPv6 on the kernels covered by this guide. There is no specification for NAT for IPv6 yet. Most experts agree that there is no need for NAT for IPv6, since it does not serve any real security purpose and there are enough addresses to give every square millimeter of the planet, including the oceans, about 667126144781400397 addresses - this is a lot of ethernet devices on a single square millimeter.
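The order of magnitude of that number can be checked with a quick calculation. The sketch below assumes a total surface area of the Earth of roughly 510,072,000 square kilometers (a commonly quoted figure, not taken from this guide) and uses awk's floating point arithmetic, so it reproduces only the leading digits:

```shell
#!/bin/sh
# Rough check: IPv6 addresses per square millimeter of the Earth's surface.
# Assumption: surface area of about 510,072,000 km^2 = 5.10072e20 mm^2
# (1 km^2 = 1e12 mm^2).
addresses_per_mm2() {
    awk 'BEGIN { printf "%.3e\n", 2^128 / 5.10072e20 }'
}

addresses_per_mm2    # prints 6.671e+17
```

The result agrees with the figure quoted above to the precision shown.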
In both cases the filter table has three pre-defined chains: INPUT, FORWARD, and OUTPUT. More chains can be created on the fly if needed.
If a packet is coming from the network and its target is the receiving system, then it is applied to INPUT.
If it is sent from the local system it is applied to OUTPUT before being sent on the network.
If it is received from the network and will be sent on to another destination on the network it is applied to FORWARD.
Each chain contains rules. A rule consists of a pattern to match against and an action; the action is executed if the packet matches the rule's pattern. The rules of a chain are checked from top to bottom, and the first rule that matches ends the check by executing its action. An action can be a verdict on whether to let the packet through or discard it, or another chain that is checked next. Allowed actions are: ACCEPT (let the packet through), DROP (discard it silently), REJECT (discard it and send an error message back to the sender), or the name of a user defined chain (continue checking there).
If no rule in any chain that was traversed matches, the pre-defined chains have a policy (ACCEPT or DROP; REJECT is not available as a policy) that defines what to do with those remaining packets.
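The "first match wins, otherwise the policy decides" logic can be illustrated with a toy model in plain shell. This is only a sketch of the idea, not how the kernel works: the rule format ("protocol port action", with "any" as a wildcard) and the function name evaluate are made up for this example, and jumps to other chains are left out.

```shell
#!/bin/sh
# Toy model of chain traversal (NOT the kernel's implementation).
# Rules arrive on stdin, one per line: "<proto> <port> <action>".
# "any" is a wildcard. The format is invented for this sketch.
evaluate() {
    policy=$1 proto=$2 port=$3
    while read -r r_proto r_port r_action; do
        [ -n "$r_proto" ] || continue    # skip empty lines
        [ "$r_proto" = any ] || [ "$r_proto" = "$proto" ] || continue
        [ "$r_port" = any ] || [ "$r_port" = "$port" ] || continue
        echo "$r_action"                 # first matching rule wins
        return 0
    done
    echo "$policy"                       # no rule matched: use the policy
}

rules='icmp any ACCEPT
tcp 22 ACCEPT
tcp any REJECT'

echo "$rules" | evaluate DROP tcp 22    # prints ACCEPT
echo "$rules" | evaluate DROP tcp 80    # prints REJECT
echo "$rules" | evaluate DROP udp 53    # prints DROP (the policy)
```

Note how the order matters: if the "tcp any REJECT" line came first, the rule for port 22 would never be reached.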
The nat table has three predefined chains: PREROUTING, POSTROUTING, OUTPUT.
PREROUTING is applied to packets coming from outside before any routing decision is made.
POSTROUTING is applied to packets that have already been checked against the filter table.
There are additional actions, some of which will be explained below.
The main components of the firewall will always be the same. Below I will give examples on which rules apply to which kind of host and explain how they were constructed. Each real host will probably be a combination of several examples listed below.
Please read all examples, even if you think you don't need them - each example contains some explanations that may be useful even for other configurations.
Our main component is the start-up script - it initializes the firewall during boot or after a change to the configuration. It needs to reside in the /etc/init.d directory for SysV-Init to find it and could be called "firewall" or something similar:
#!/bin/sh
# (re)load the firewall rules for IPv4 and IPv6
case "$1" in
  start|restart|reload)
    printf 'Starting Firewall...'
    iptables-restore </etc/iptables
    ip6tables-restore </etc/ip6tables
    echo done.
    ;;
esac
We use the ip*tables-restore utility to load the firewall and store the configuration in /etc/iptables for IPv4 and /etc/ip6tables for IPv6. The initial configuration can be generated with ip*tables-save if you like.
The script above must be executable in order to work. As is visible in the case statement it completely re-starts the firewall on start, restart, and reload; but it does not have a stop rule - a firewall should not be stopped.
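Because a typo in the rules file can leave the host with a half-configured or unchanged firewall, it may be worth checking the file before loading it. The following is a minimal sketch, assuming an iptables-restore version that supports the --test option (parse the rule set without committing it). The function name load_rules and the RESTORE variable are made up for this example; RESTORE only exists so that the utility can be swapped (eg. for ip6tables-restore).

```shell
#!/bin/sh
# Sketch of a cautious loader. RESTORE is a hypothetical override hook for
# this example; it defaults to the real iptables-restore utility.
RESTORE=${RESTORE:-iptables-restore}

load_rules() {
    file=$1
    # --test parses the rule set but does not commit it to the kernel
    if "$RESTORE" --test <"$file"; then
        "$RESTORE" <"$file"
    else
        echo "refusing to load $file: it does not parse" >&2
        return 1
    fi
}

# Usage (as root): load_rules /etc/iptables
```

The start-up script could call such a function instead of invoking iptables-restore directly; if the check fails, the rules that are already loaded stay in place.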
Now you need to enter the script into the run-levels that need to start it. With Debian this is done with update-rc.d, on openSUSE with YaST; other distributions will have their own tools. You should insert it immediately after the network setup.
This is the starting point for all firewalls. If you don't want to use a protocol or a chain you leave it in the state described here.
*filter
:INPUT DROP
:FORWARD DROP
:OUTPUT DROP
COMMIT
The *filter keyword tells iptables that we are configuring the filter table, the lines below configure the policy for each pre-defined chain. The COMMIT keyword tells iptables that the configuration is complete and can be loaded into the kernel.
In this example all chains are without rules and have a policy to drop packets. In other words communication is completely cut off.
Use a configuration like this if you want to turn a protocol completely off (eg. for IPv6 on an IPv4-only host). Use the DROP policy on FORWARD for hosts that are not routers.
A little warning is in order: hosts that do this for both IPv4 and IPv6 will not be able to do much: a lot of services (eg. MySQL or X11) usually need the loopback interface which is also denied here.
The usual philosophy of firewall building is to first deny everything and then define what is allowed. Hence we start with the above fully closed system and then define what exceptions we want to make.
Many services on a modern Linux require the loopback interface (lo) to work. When two programs communicate on loopback, the packet sent by one program enters the kernel, traverses OUTPUT, is sent to the loopback interface, returns to the firewall, traverses INPUT, and is received by the other program. The loopback interface exists only locally, so it never forwards. This means we have to allow loopback traffic in INPUT and OUTPUT:
*filter
:INPUT DROP
:FORWARD DROP
:OUTPUT DROP
-A INPUT -i lo -j ACCEPT
-A OUTPUT -o lo -j ACCEPT
COMMIT
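The added lines are identical to the parameters that would be given to the iptables command if the chains were built manually, so you can use the iptables man page to check for parameters. I'll explain the ones used here: -A appends a rule to the named chain, -i matches the interface a packet came in on, -o matches the interface a packet will leave through, and -j names the action (or chain) to jump to if the rule matches.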
A very simple computer on a local LAN will usually allow the user to browse the web and do a few other network protocols, but not the network to access any servers (eg. a local database) that might be installed on the computer.
We assume here that the users are trusted (not always true) and allow anything out. Most protocols use TCP, so we need to allow TCP if it originates inside and disallow it if it originates outside. Furthermore we need UDP port 53 to enable DNS name resolution.
I'll also assume that the computer is connected via an ethernet interface called eth0. If it uses PPP or the interface may change names, you need to use wild-cards: ppp+ (any PPP device) or eth+ (any ethernet device) - the "+" is used in iptables like the "*" in file names.
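To make the meaning of the "+" wild-card concrete, here is a small sketch in plain shell that mimics how such a pattern is compared to an interface name: a trailing "+" matches any name that begins with the part before it, anything else must match exactly. The function iface_matches is made up for this illustration; the real matching is done inside the kernel.

```shell
#!/bin/sh
# Mimic the interface wild-card of iptables (illustration only).
# Usage: iface_matches <pattern> <interface name>
iface_matches() {
    pattern=$1 name=$2
    case $pattern in
        *+) prefix=${pattern%+}          # strip the trailing "+"
            case $name in
                "$prefix"*) return 0 ;;  # name begins with the prefix
            esac
            return 1 ;;
        *)  [ "$pattern" = "$name" ] ;;  # no wild-card: exact match
    esac
}

iface_matches ppp+ ppp0 && echo "ppp+ matches ppp0"
iface_matches eth0 eth1 || echo "eth0 does not match eth1"
```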
We'll introduce our first user defined chain: ineth. Since user defined chains have no policy, we write "-" instead of DROP.
*filter
:INPUT DROP
:FORWARD DROP
:OUTPUT DROP
:ineth -
#filter input
-A INPUT -i lo -j ACCEPT
-A INPUT -i eth0 -j ineth
#allow output to known devices
-A OUTPUT -o lo -j ACCEPT
-A OUTPUT -o eth0 -j ACCEPT
#ethernet input
-A ineth -p icmp -j ACCEPT
-A ineth -p udp -m udp --sport 53 -m state --state ESTABLISHED,RELATED -j ACCEPT
-A ineth -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A ineth -j DROP
COMMIT
The "-j ineth" action in INPUT tells iptables to check the ineth chain for packets coming from eth0.
The -p switch matches a protocol directly above IP, it can be tcp, udp, or icmp (icmpv6 for IPv6). In the example above the protocol ICMP is accepted always, while UDP and TCP are filtered further.
The -m switch tells iptables that the switches right of it match a specific protocol header (here udp or tcp), or a class of features (eg. state or owner). This switch must be used if specific features of ICMP, TCP, UDP, or the connection need to be filtered.
The --state switch filters by the connection state. This means that connection tracking needs to be active (it is automatically loaded if the kernel supports it). The ESTABLISHED flag tells it to match any packet of a connection that has already been established (here: the local host sent the first packet). The RELATED flag matches any packet that is related to an open connection - this can be ICMP errors or, if eg. FTP connection tracking is active, the establishment of a data transfer connection that has been requested in a running FTP session.
The --sport switch tells iptables to filter the source port where the packet is coming from, in this case it checks whether it comes from port 53, which is the port that DNS responses come from.
The final -j DROP is not really necessary in this particular setup, but it makes debugging easier and ensures that no packet returns from the chain to meander through any other chain that we might add later.
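While the kernel executes the rules from top to bottom, it is actually easier to read them from bottom to top. In our example it would read thus: drop everything coming in on eth0, except TCP packets that belong to a connection we established, except UDP packets from port 53 that answer one of our requests, and except ICMP.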
The last two lines of the rule set could also be written in a state-less way that works without connection tracking:
-A ineth -p tcp -m tcp --syn -j DROP
-A ineth -p tcp -m tcp --dport 1024: -j ACCEPT
-A ineth -j DROP
The --syn switch tells iptables to drop packets that contain the SYN, but not the ACK flag - ie. the first handshake bit of a TCP connection, since we do not want outsiders to establish a connection to us.
The --dport switch tells iptables to accept traffic on a specific port. In the case above it is a range of ports in the format first:last (if first is omitted it is assumed to be 0, if last is omitted it is assumed to be 65535).
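The first:last notation with its defaults can be spelled out in a few lines of shell. This is just an illustration of the rule described above; the function port_in_range is made up for this sketch and is not part of iptables.

```shell
#!/bin/sh
# Check whether a port lies in a range written like a --dport argument.
# Usage: port_in_range <port|first:last|first:|:last> <port to test>
port_in_range() {
    range=$1 port=$2
    case $range in
        *:*) first=${range%%:*}; last=${range#*:} ;;
        *)   first=$range; last=$range ;;    # single port
    esac
    [ -n "$first" ] || first=0               # omitted first means 0
    [ -n "$last" ] || last=65535             # omitted last means 65535
    [ "$port" -ge "$first" ] && [ "$port" -le "$last" ]
}

port_in_range 1024: 8080 && echo "8080 is in 1024:"
port_in_range :1023 8080 || echo "8080 is not in :1023"
```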
The human readable translation would now be: drop everything, except TCP packets addressed to ports 1024 and above, unless they try to open a new connection (SYN).
A little warning: in theory this could allow so-called SYN-cookie-attacks if the kernel does use SYN-cookies, however modern kernels don't use them by default and the attack is still quite unlikely to succeed.
Sometimes we want to allow others in the network to access a few server applications on our host. The configuration is very simple: open the server's port as an exception to the draconian rules we introduced above.
Let's show this with a DNS server as example. DNS uses port 53 on the server side, both with UDP and TCP:
#ethernet input
#same as simple host:
-A ineth -p icmp -j ACCEPT
#new:
-A ineth -p udp -m udp --dport 53 -j ACCEPT
-A ineth -p tcp -m tcp --dport 53 -j ACCEPT
#same as simple host:
-A ineth -p udp -m udp --sport 53 -m state --state ESTABLISHED,RELATED -j ACCEPT
-A ineth -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A ineth -j DROP
As visible above we just allow port 53 on TCP and UDP before any of the other TCP and UDP rules. The method is the same for any other server application, like Apache (80/tcp, 443/tcp), or SSH (22/tcp).
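If several ports need to be opened, the lines can be generated instead of typed. A minimal sketch, assuming the chain is called ineth as in the examples above; the helper open_port is made up for this illustration and simply prints rule lines that you would paste into /etc/iptables above the ESTABLISHED,RELATED rules.

```shell
#!/bin/sh
# Hypothetical helper: print the ineth rule that opens one server port.
# Usage: open_port <tcp|udp> <port>
open_port() {
    proto=$1 port=$2
    echo "-A ineth -p $proto -m $proto --dport $port -j ACCEPT"
}

# A web server that is also reachable via SSH:
for port in 80 443 22; do
    open_port tcp "$port"
done
```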
If the host is only a server there is (almost) no reason for any user to use the network, which means we can switch from a users-are-trusted perspective to a users-may-be-hacked perspective in the OUTPUT chain. Let's construct a paranoid firewall for a paranoid Debian based DNS server:
*filter
:INPUT DROP
:FORWARD DROP
:OUTPUT DROP
:ineth -
:outeth -
:outroot -
#filter input
-A INPUT -i lo -j ACCEPT
-A INPUT -i eth0 -j ineth
#allow output to known devices
-A OUTPUT -o lo -j ACCEPT
-A OUTPUT -o eth0 -j outeth
#ethernet input
-A ineth -p icmp -j ACCEPT
-A ineth -p udp -m udp --dport 53 -j ACCEPT
-A ineth -p tcp -m tcp --dport 53 -j ACCEPT
-A ineth -p udp -m udp --sport 53 -m state --state ESTABLISHED,RELATED -j ACCEPT
-A ineth -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A ineth -j DROP
#ethernet output
-A outeth -p icmp -j ACCEPT
-A outeth -p udp -m udp --dport 53 -j ACCEPT
-A outeth -p udp -m udp --sport 53 -j ACCEPT
-A outeth -p tcp -m tcp --sport 53 -j ACCEPT
-A outeth -p tcp -m owner --uid-owner 0 -j outroot
-A outeth -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A outeth -p tcp -j REJECT
-A outeth -p udp -m udp -j REJECT
-A outeth -j DROP
#special rules for root
-A outroot -p tcp -m tcp -d 188.8.131.52 --dport 80 -j ACCEPT
COMMIT
The input rules are the same as above, however we now filter the output and added two chains: outeth to generally filter output on eth0 and outroot to add some special rules for the root user.
As in input we allow ICMP on output.
The rules for the DNS server are an exact mirror of the input rules (--dport and --sport swapped).
Root needs to be able to update installed packages and hence gets its own chain. The --uid-owner parameter matches against a user ID - here root (0).
Then we accept TCP packets that belong to a connection already established through one of the exceptions and reject all other TCP, which keeps internal users from opening any further connections.
We deny UDP in particular and anything else generally - pretty much as with input.
There is a difference in rejecting packets here: we are using REJECT instead of DROP, which is simply being nice to the programs running inside so that they get an error message instead of having to wait for a timeout.
The rule for root here allows it to establish communication with ftp.de.debian.org (the address given with -d in the outroot chain) - one such rule needs to be added for each package server that is accessed by the local package management system (usually they use HTTP as transport, hence port 80).
Of course it is possible to define exceptions for other users as well. You should try to be as specific about destination hosts and ports as possible.
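With more than one package server the outroot rules become repetitive, so they can be generated from a list. The sketch below is only an illustration: the helper mirror_rules is made up, and the addresses are placeholders from the address ranges reserved for documentation - replace them with the real addresses of your mirrors.

```shell
#!/bin/sh
# Hypothetical list of package mirror addresses. These are placeholders
# from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24.
MIRRORS="192.0.2.10 198.51.100.20"

# Print one outroot rule per mirror (HTTP only, as in the example above).
mirror_rules() {
    for addr in $MIRRORS; do
        echo "-A outroot -p tcp -m tcp -d $addr --dport 80 -j ACCEPT"
    done
}

mirror_rules
```

The printed lines belong at the end of the filter table, where the single outroot rule sits in the example.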
The whole concept of denying users access to the network might seem a bit radical at first, but as mentioned above: on a server system users have normally no business using the network and it might just save your server from becoming a zombie - the first component installed after a break in is usually only a loader package that then establishes a connection to its server to retrieve more malicious code and to its command-and-control server to wait for commands. This will be impossible with a very tight setup.
The average IPv4 dial-in connection only gets one IP-Address from the provider, that usually changes with each dial-in. In order to get other hosts in the same network to access the Internet, they need to be hidden behind this one IP-Address.
This is called NAT - Network Address Translation. Or more specifically source NAT or Masquerading.
When a host from the internal network tries to establish a connection to the outside network, the packet is received by the router, which replaces the sender IP with its own IP address and the port with a free local port. It then remembers the relationship between the real sender/port and the assigned port. When the answer comes in, it looks into its relationship table and replaces the destination IP address and port with those of the real sender. The entry is kept until the connection is terminated or times out.
Sometimes we also want to make a service from inside the network available on the outside network, meaning a port on the router is really connected to a port on a different host inside the network. This is called port forwarding or destination NAT (DNAT).
Sometimes a special protocol is not understood by the router, but by another host in the network. For example if the VPN server or IPv6-in-IPv4 tunnel endpoint is on a different host. This is called protocol forwarding.
These are accomplished via the nat table:
############
# NAT:
*nat
:PREROUTING ACCEPT
:POSTROUTING ACCEPT
:OUTPUT ACCEPT
-A POSTROUTING -o ppp+ -j MASQUERADE
-A PREROUTING -p tcp -m tcp -i ppp+ --dport 2000 -j DNAT --to 192.168.11.51:22
-A PREROUTING -p 41 -i ppp+ -j DNAT --to 192.168.11.55
COMMIT
############
# Filtering:
*filter
:INPUT DROP
:FORWARD ACCEPT
:OUTPUT DROP
:ineth -
:inppp -
#filter input
-A INPUT -i lo -j ACCEPT
-A INPUT -i eth0 -j ineth
-A INPUT -i ppp+ -j inppp
#allow output to known devices
-A OUTPUT -o lo -j ACCEPT
-A OUTPUT -o eth0 -j ACCEPT
-A OUTPUT -o ppp+ -j ACCEPT
#allow forwarding
#ethernet input
-A ineth -p icmp -j ACCEPT
-A ineth -p tcp -m tcp --dport 22 -j ACCEPT
-A ineth -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A ineth -j DROP
#PPP input
-A inppp -p icmp -j ACCEPT
-A inppp -p udp -m udp --sport 53 -j ACCEPT
-A inppp -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A inppp -j DROP
COMMIT
The above example assumes that the connection to the outside world happens via PPP (modem, ISDN, DSL, etc.) and matches all PPP interfaces (ppp+).
The MASQUERADE rule is consulted for the first packet of a connection, after the system has already decided where to route the packet. The return route does not need to be configured - the kernel automatically remembers it.
The two DNAT rules need to be executed before a routing decision is made by the kernel, because the packets are re-routed to the network instead of the local box.
The first of the two DNAT rules represents port forwarding: it forwards port 2000 from the PPP-Interface to port 22 on an internal machine - meaning the internal machine can be accessed via SSH on port 2000 of the public IP address.
The second DNAT rule represents protocol forwarding - in this case protocol 41 (IPv6 in IPv4 tunnel), which means that the tunnel protocol is handled by another machine on the internal network.
If the FORWARD chain is not opened by default as it is here, each DNAT rule needs a corresponding FORWARD rule to allow the protocol or port that is forwarded.
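As a sketch of how such a pair of rules belongs together, the following shell function prints both lines for one forwarded TCP port. The helper forward_tcp_port is made up for this illustration; it assumes the external interface is ppp+ as in the example. Note that the FORWARD rule has to match the internal host and port, because the destination has already been rewritten by the time the packet reaches the filter table.

```shell
#!/bin/sh
# Hypothetical helper: print the two rules needed to forward one TCP port
# when the FORWARD policy is DROP.
# Usage: forward_tcp_port <external port> <internal host> <internal port>
forward_tcp_port() {
    ext_port=$1 host=$2 int_port=$3
    echo "# goes into *nat:"
    echo "-A PREROUTING -p tcp -m tcp -i ppp+ --dport $ext_port -j DNAT --to $host:$int_port"
    echo "# goes into *filter (FORWARD sees the rewritten destination):"
    echo "-A FORWARD -i ppp+ -p tcp -m tcp -d $host --dport $int_port -j ACCEPT"
}

forward_tcp_port 2000 192.168.11.51 22
```

The answers of the internal host travel through FORWARD in the opposite direction, so with a closed FORWARD chain they need to be allowed as well, eg. by an ESTABLISHED,RELATED rule.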
Each table that is configured needs a closing COMMIT statement to write it to the kernel.
The assumption for the filter table is that eth0 represents the internal network and ppp+ the external network. Each device has got its own user defined chain to make it easier to distinguish the rules for the devices.
In the example above the internal network can access the SSH-port of the router (eg. to administer it), but otherwise only already established connections are allowed. As an exercise it is left to the reader to convert the appropriate DROPs to REJECTs to be a bit nicer to the hosts on the local network if they try something that is forbidden.
On the external side only DNS responses and already established TCP connections are allowed in. This could be combined with the paranoid server setup above to make the router even more secure.
The FORWARD chain is opened up by default in this example. The reason behind this is that we assume that the internal network is trusted and external entities cannot access the internal network unless the connection was established from the inside. In this case SNAT takes away much of the need for a firewall on the FORWARD chain, but this changes radically if the internal network is untrusted or the network does not use NAT.
Eg. a FORWARD setup that allows only web could look like this (only the changed parts):
:FORWARD DROP
#inside to outside packets
-A FORWARD -p icmp -j ACCEPT
-A FORWARD -i eth0 -o ppp+ -p tcp -m tcp --dport 80 -j ACCEPT
-A FORWARD -i eth0 -o ppp+ -p tcp -m tcp --dport 443 -j ACCEPT
-A FORWARD -i eth0 -o ppp+ -p tcp -m tcp --dport 53 -j ACCEPT
-A FORWARD -i eth0 -o ppp+ -p udp -m udp --dport 53 -j ACCEPT
-A FORWARD -i eth0 -o ppp+ -p udp -j REJECT
-A FORWARD -i eth0 -o ppp+ -p tcp -j REJECT
-A FORWARD -i eth0 -o ppp+ -j DROP
#outside to inside (assuming NAT)
-A FORWARD -i ppp+ -o eth0 -j ACCEPT
The above example allows HTTP (80), HTTPS (443), and DNS (53). ICMP is allowed in any direction. It rejects all other UDP/TCP based protocols and DROPs all unknown protocols. It is not concerned with the return packets because of NAT.
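One prerequisite lies outside of IP-Tables: the FORWARD rules only ever see packets if the kernel is allowed to forward IPv4 packets at all, which on Linux is controlled by the file /proc/sys/net/ipv4/ip_forward. A minimal sketch of switching it on, eg. from the start-up script, follows. The function name enable_forwarding and its optional argument are made up for this example; the argument only exists so that the function can be tried against an ordinary file.

```shell
#!/bin/sh
# Switch on IPv4 forwarding. The optional argument exists only so the
# function can be tried against an ordinary file; it is not a kernel feature.
enable_forwarding() {
    ctl=${1:-/proc/sys/net/ipv4/ip_forward}
    echo 1 >"$ctl"
}

# Usage (as root): enable_forwarding
```

Many distributions offer their own way to make this setting permanent, so check there first before adding it to the script.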
Without NAT the scenario for FORWARD changes: it is now necessary to protect the internal network against attacks. I will assume that we still use PPP, with a more forthcoming provider, but ppp+ can easily be replaced by any other network interface name.
We now also have to differentiate between access to the router (INPUT/OUTPUT) and access to the network (FORWARD).
*filter
:INPUT DROP
:FORWARD DROP
:OUTPUT DROP
:ineth -
:inppp -
:fwppp -
:fwweb -
#filter input
-A INPUT -i lo -j ACCEPT
-A INPUT -i eth0 -j ineth
-A INPUT -i ppp+ -j inppp
#allow output to known devices
-A OUTPUT -o lo -j ACCEPT
-A OUTPUT -o eth0 -j ACCEPT
-A OUTPUT -o ppp+ -j ACCEPT
#allow forwarding
-A FORWARD -i eth0 -o ppp+ -j ACCEPT
-A FORWARD -i ppp+ -o eth0 -j fwppp
#PPP to local Forwarding
-A fwppp -p icmp -j ACCEPT
-A fwppp -p udp -m udp --sport 53 -m state --state ESTABLISHED,RELATED -j ACCEPT
-A fwppp -d 10.2.1.0/24 -j DROP
-A fwppp -d 10.2.3.4 -j fwweb
-A fwppp -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A fwppp -j DROP
#web server (fwweb must be declared above before it can be used)
-A fwweb -p tcp -m tcp --dport 80 -j ACCEPT
-A fwweb -p tcp -m tcp --dport 443 -j ACCEPT
COMMIT
Above I omitted the rules of the ineth and inppp chains; they are identical to those of the NAT router.
The policy on FORWARD is back to DROP - be paranoid. We assume that the internal users are allowed to do anything, hence we allow anything from eth0 to ppp+ - if you don't trust your users you can modify the FORWARD chain the same way as shown above at the end of the NAT example.
For traffic from outside to the inside we create a new chain: fwppp. As usual we accept ICMP and DNS. The next rule protects the 10.2.1.* subnet by making it invisible to the outside (apart from the ICMP and DNS replies accepted by the two rules before it) - this might be to protect the backup servers from outside access or for any other reason that a host or network needs to be completely inaccessible. The host 10.2.3.4 is our web server; we jump to the new fwweb chain to check whether this is web traffic (ports 80 and 443). The remainder is the same set of rules we already used several times: allow TCP if it came from inside, drop everything else.