Setting up a subnet of Linux Containers on a Xen DomU
This short tutorial covers the setup of a bunch of Linux Containers on a "cloudy" virtual server using the relatively new macvlan bridge.
Disclaimer: I am not the kind of networking expert you should trust. Take the following code examples with a grain of salt (i.e. think about what they are doing to your computer). Don't copy and paste unless you know what you are doing!
So you've got that cool cloudy Xen box, which can be upgraded almost instantly to more power if the need arises. And you want it to be even more flexible to meet your changing needs.
Here is what I did on a freshly installed Debian Squeeze.
Get the right tools:
aptitude install lxc
That one was easy. Now check if your kernel supports all the features we need:
lxc-checkconfig
If the output of that program shows you something like this:
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled
Multiple /dev/pts instances: enabled
--- Control groups ---
Cgroup: enabled
Cgroup namespace: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled
--- Misc ---
Veth pair device: enabled
Macvlan: enabled
Vlan: enabled
File capabilities: missing
you're a lucky man/woman.
Well, lucky in the sense that you can probably reuse your existing kernel config. You will still have to grab the kernel sources from your favourite location and roll your own.
man lxc
gives you some hints about the specific configuration options. Ah, and be sure your hosting provider lets you use your own kernels and gives you practical configuration advice. Otherwise... change!
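Coming back to the lxc-checkconfig output for a moment: if you don't want to read the whole list by eye, a small filter can show only the entries that are not "enabled". This is just a sketch. The function name not_enabled is my own invention, and it assumes the output keeps the "Feature: status" shape shown above.

```shell
# Print every "Feature: status" line whose status is not "enabled".
# Section headers such as "--- Misc ---" contain no colon and are skipped.
not_enabled() {
    esc=$(printf '\033')
    # Strip ANSI color codes (the tool may colorize its output) and trailing blanks,
    # then split each line at the colon and compare the status field.
    sed -e "s/${esc}\[[0-9;]*m//g" -e 's/[[:space:]]*$//' \
        | awk -F': *' 'NF == 2 && $2 != "enabled" { print $1 ": " $2 }'
}
```

Use it as "lxc-checkconfig | not_enabled". For the listing above it would print only the "File capabilities: missing" line.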
The reason you will have to build your own kernel (unless you read this article at a time when at least 2.6.39 has been released) is that the MTU check for TSO-enabled devices in the networking core has to be disabled for a macvlan bridge that works on top of another virtual device. So you have to apply the patch from http://article.gmane.org/gmane.linux.network/190843/match= to your kernel sources. Otherwise the outgoing traffic from your macvlan devices will be totally scrambled, resulting in a big bunch of RX errors.
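To find out quickly on which side of 2.6.39 a kernel sits, something like the following may help. It is a rough sketch: version_at_least is a hypothetical helper name, it relies on "sort -V" from GNU coreutils, and the version number is only a hint, since a distribution kernel could carry the patch under an older number (or not).

```shell
# Succeed if version $1 is at least version $2.
# Both may look like "2.6.39" or "2.6.32-5-xen-amd64".
version_at_least() {
    have=${1%%-*}      # drop distribution suffixes such as "-5-xen-amd64"
    want=${2%%-*}
    # sort -V orders version strings; if the wanted version sorts first
    # (or both are equal), the version we have is new enough.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}
```

For example: version_at_least "$(uname -r)" 2.6.39 || echo "older than 2.6.39, think about the patch"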
My specific Xen Box has one (of course virtual) ethernet interface "eth0" with a static IP assigned to it.
ifconfig
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:xxx errors:0 dropped:0 overruns:0 frame:0
TX packets:xxx errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:xxx (some KiB) TX bytes:xxx (some KiB)
eth0 Link encap:Ethernet HWaddr 00:16:00:00:00:42
inet addr:my.first.ip.0 Bcast:my.first.ip.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:xxx errors:0 dropped:0 overruns:0 frame:0
TX packets:xxx errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:xxx (some MiB) TX bytes:xxx (some MiB)
Interrupt:241
I ordered a second IP which, according to my hosting provider, can only be set up as an alias to eth0.
eth0:1 Link encap:Ethernet HWaddr 00:16:00:00:00:42
inet addr:my.second.ip.112 Bcast:my.second.ip.255 Mask:255.255.255.252
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:241
(In fact it is also possible to bind the second outgoing IP to a macvlan interface, which can be set up in "private" mode since there is no need for it to join the subnet(s) itself.)
I am going to assign two virtual LAN devices with dedicated hardware addresses (macvlan), one to each of the two outgoing devices. I want the whole subnet that is built on top of these devices to be able to communicate with each other, so I will use the bridge mode of macvlan. The tools you need for that purpose come with the iproute2 package. In Debian it is called "iproute"; the "2" is missing, maybe because there is only one package of that name. Make sure you grab the most recent version of that package you can get, because we are using really fresh technology. I, for one, use the version that comes with Squeeze.
Let's exercise:
ip link add link eth0 name macv0 address 00:13:00:e6:00:90 type macvlan mode bridge
That creates the virtual device "macv0" (you can name it differently if that suits your taste) with a purely fictional hardware address. You can choose any address you like as long as it meets the standard and doesn't clash with the other hardware addresses in your network.
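If you would rather not make up hardware addresses by hand, here is one way to generate them. Again only a sketch under my own assumptions: random_mac is a hypothetical name, and I chose to set the "locally administered" bit and clear the "multicast" bit in the first octet, which keeps the result out of the range of vendor-assigned addresses. It does not guarantee that two generated addresses differ, so still check for clashes.

```shell
# Print a random unicast, locally administered MAC address.
random_mac() {
    # Read six random bytes as unsigned decimal numbers.
    set -- $(od -An -N6 -tu1 /dev/urandom)
    # First octet: clear bit 0 (multicast), set bit 1 (locally administered).
    first=$(( ($1 & 254) | 2 ))
    printf '%02x:%02x:%02x:%02x:%02x:%02x\n' "$first" "$2" "$3" "$4" "$5" "$6"
}
```

The output can be pasted into the "address" argument of "ip link add".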
ip link set macv0 up
Activates our newly created device.
ip address add 192.168.111.10/24 broadcast 192.168.111.255 dev macv0
This command assigns the private class C address "192.168.111.10", together with the matching broadcast address, to our device.
Let's have a look:
ip -s -s link show macv0
should give us something like this:
27: macv0@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 00:13:00:e6:00:90 brd ff:ff:ff:ff:ff:ff
macvlan mode bridge
RX: bytes packets errors dropped overrun mcast
0 0 0 0 0 0
RX errors: length crc frame fifo missed
0 0 0 0 0
TX: bytes packets errors dropped carrier collsns
0 0 0 0 0 0
TX errors: aborted fifo window heartbeat
0 0 0 0
What else?
ip address show macv0
27: macv0@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 00:13:00:e6:00:90 brd ff:ff:ff:ff:ff:ff
inet 192.168.111.10/24 brd 192.168.111.255 scope global macv0
inet6 fe80::213:some:thing:d590/64 scope link
valid_lft forever preferred_lft forever
Looks good so far. Now we create the second device "macv1" and assign it to eth0:1. (It will show up as attached to eth0, though.)
ip link add link eth0:1 name macv1 address 00:13:00:e6:00:91 type macvlan mode bridge
ip link set macv1 up
ip address add 192.168.111.20/24 broadcast 192.168.111.255 dev macv1
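Since the same three commands come back for every macvlan device, they can be wrapped up. The following sketch only prints the commands instead of running them, so you can read them before anything touches your network. The function name macvlan_commands and the order of its arguments are my own choice.

```shell
# Print (do not run) the commands that create one bridged macvlan device.
# Arguments: parent link, new device name, hardware address,
#            IPv4 address with prefix length, broadcast address.
macvlan_commands() {
    printf 'ip link add link %s name %s address %s type macvlan mode bridge\n' "$1" "$2" "$3"
    printf 'ip link set %s up\n' "$2"
    printf 'ip address add %s broadcast %s dev %s\n' "$4" "$5" "$2"
}
```

Once you are happy with what "macvlan_commands eth0 macv0 00:13:00:e6:00:90 192.168.111.10/24 192.168.111.255" prints, you can pipe the output to sh as root.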
Now that your network devices are set up, there are only a few configuration steps left to connect your network to the world.
You've got to enable forwarding in your running kernel.
echo 1 > /proc/sys/net/ipv4/ip_forward
should work for now. You will also need some network address translation for the guests to connect to the internet.
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
works for testing purposes.
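Before hunting for bugs inside a container, it is worth checking that forwarding really is switched on. A tiny sketch (forwarding_enabled is a hypothetical name; the optional file argument exists only so the function can be tried out on a test file):

```shell
# Succeed if IPv4 forwarding is enabled, i.e. the kernel switch reads "1".
# An alternative file can be passed as $1 for testing.
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}
```

For example: forwarding_enabled || echo "forwarding is off". The NAT rule itself can be inspected with "iptables -t nat -L POSTROUTING -n -v".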
For lxc to work you have to do some more steps.
Add the following line to your /etc/fstab file:
cgroup /sys/fs/cgroup cgroup defaults 0 0
and mount the cgroup:
mount cgroup
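To see whether the mount worked, you can look for cgroup entries in /proc/mounts. A small sketch (cgroup_mount_points is my own name for it; the file argument is optional and only there for testing):

```shell
# Print the mount point of every mounted cgroup filesystem.
# Fields in /proc/mounts are: device, mount point, type, options, ...
cgroup_mount_points() {
    awk '$3 == "cgroup" { print $2 }' "${1:-/proc/mounts}"
}
```

With the fstab line above it should print /sys/fs/cgroup. No output means nothing of type cgroup is mounted.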
Now let's create a first container. Create and edit the file /etc/lxc/mactest1.conf:
lxc.utsname = mactest1
lxc.network.type = macvlan
lxc.network.macvlan.mode = bridge
lxc.network.flags = up
lxc.network.link = macv0
lxc.network.name = eth1
lxc.network.hwaddr = 4a:00:43:00:79:0e
lxc.network.ipv4 = 192.168.111.11/24
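For more than one container, only the name, the IP and the hardware address change, so the file can be generated. This is a sketch with a made-up function name (lxc_macvlan_conf); it hard-codes macv0 and eth1 exactly as in the file above and prints to standard output, so nothing gets overwritten by accident.

```shell
# Print an LXC configuration like the one above.
# Arguments: container name, IPv4 address with prefix length, hardware address.
lxc_macvlan_conf() {
    cat <<EOF
lxc.utsname = $1
lxc.network.type = macvlan
lxc.network.macvlan.mode = bridge
lxc.network.flags = up
lxc.network.link = macv0
lxc.network.name = eth1
lxc.network.hwaddr = $3
lxc.network.ipv4 = $2
EOF
}
```

For a hypothetical second container: lxc_macvlan_conf mactest2 192.168.111.12/24 4a:00:43:00:79:0f > /etc/lxc/mactest2.conf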
Create a Debian (lenny) container:
lxc-create -f /etc/lxc/mactest1.conf -n mactest1 -t debian
The automation in the scripts involved is good, but not good enough. The template for the Debian container creates a network device "eth0" configured for DHCP requests. Since that is not what we want, let's change the configuration:
editor /var/lib/lxc/mactest1/rootfs/etc/network/interfaces
Delete or comment out the eth0 stuff and add
auto eth1
iface eth1 inet static
address 192.168.111.11
netmask 255.255.255.0
network 192.168.111.0
broadcast 192.168.111.255
gateway 192.168.111.10
To start and play with your newly created toy, it is a good idea to use a screen session. So if you don't already have screen, get it.
aptitude install screen
Start a session
screen
and in that session type
lxc-start -n mactest1
If all went well, you should be greeted by a login screen. Go and enter the system as root with the password "root", which is cool for typing but not overly secure, I guess.
Since we installed our system with the unmodified debian template, there is no ping command available. Our first test of network connectivity is therefore:
apt-get update && apt-get install iputils-ping
If all went well, we are ready to set up a whole network of containers and play around with some configuration options. Please make sure that all virtual devices you create have a unique hardware address.
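Checking the hardware addresses by eye gets boring once there are a handful of containers. The following sketch lists every address that shows up more than once in a set of container configuration files. duplicate_hwaddrs is a hypothetical name, and the check only looks at "lxc.network.hwaddr" lines, so the addresses of host devices like macv0 and macv1 still have to be compared by hand.

```shell
# Print every hardware address that appears in more than one
# "lxc.network.hwaddr" line of the given files.
# No output means all addresses are unique.
duplicate_hwaddrs() {
    # -h suppresses file names, so only the matching lines are compared.
    grep -h '^[[:space:]]*lxc\.network\.hwaddr' "$@" \
        | sed -e 's/^[^=]*=[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | tr 'A-F' 'a-f' \
        | sort | uniq -d
}
```

For example: duplicate_hwaddrs /etc/lxc/*.conf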
Further Reading:
http://rhonda.deb.at/blog/2011/03/23#lxc-and-nat-on-notebook
http://blog.ikibiki.org/2011/04/05/Running_X_from_LXC/
http://patchwork.ozlabs.org/patch/86815/ (contains reasoning about the pros and cons of macvlan)
http://www.mail-archive.com/lxc-users@lists.sourceforge.net/
** Update: Fixed Typo