I recently learned about systemd-nspawn, which some pages claim is similar in functionality to LXC but simpler to set up, since most of the pieces are already present in modern Linux distributions. Since using LXD without snap has become cumbersome, I decided to give systemd-nspawn a try.
Setup
- Start with a clean Ubuntu 20.04 server install. This seems to be running systemd-networkd by default.
- apt-get install systemd-container
- Do the key setup for nspawn.org, so machinectl can verify downloaded images.
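If you skip the key setup, the pull below will fail signature verification. machinectl can be told to relax that check; a hedged workaround that trades away signature checking (only do this if you accept the risk):

```shell
# --verify accepts no|checksum|signature (signature is the default for pull-tar).
# checksum still validates the download against its published hash, but skips
# the GPG signature check.
sudo machinectl pull-tar --verify=checksum \
    https://hub.nspawn.org/storage/ubuntu/focal/tar/image.tar.xz ubuntu-20.04
```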
Get an image and start a container
- Pull an image:
sudo machinectl pull-tar https://hub.nspawn.org/storage/ubuntu/focal/tar/image.tar.xz ubuntu-20.04
- Start the container:
machinectl start ubuntu-20.04
- check the status:
$ machinectl list
MACHINE      CLASS     SERVICE        OS     VERSION ADDRESSES
ubuntu-20.04 container systemd-nspawn ubuntu 20.04   192.168.8.165…
# I hate how systemd commands eat output
$ machinectl list -l
MACHINE      CLASS     SERVICE        OS     VERSION ADDRESSES
ubuntu-20.04 container systemd-nspawn ubuntu 20.04   192.168.1.11
                                                     169.254.70.231
                                                     fe80::d891:ecff:fe00:6958

1 machines listed.
Doing things in the container
The container can be accessed via its console, like a “regular” VM or machine.
# Start an interactive (root) session
machinectl shell ubuntu-20.04
Connected to machine ubuntu-20.04. Press ^] three times within 1s to exit session.
root@focal:~# exit
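A few other machinectl verbs cover most day-to-day needs; these examples assume the ubuntu-20.04 machine from above:

```shell
machinectl enable ubuntu-20.04                              # autostart at boot
machinectl copy-to ubuntu-20.04 ./notes.txt /root/notes.txt # host -> container
machinectl copy-from ubuntu-20.04 /var/log/syslog ./guest-syslog
machinectl status ubuntu-20.04                              # addresses, cgroup tree, recent logs
machinectl poweroff ubuntu-20.04                            # clean shutdown
```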
Configuring the container
Running a container via machinectl as above uses a set of default options that work well in complete isolation: you can shell into it, copy files, and use the network from inside the container. A container becomes more interesting, though, when it can interact with host resources, like sharing files or acting as a network server. Most of this can be configured on the command line by invoking systemd-nspawn directly instead of going through machinectl. But it’s also possible to configure most options via .nspawn unit files, which feel somewhat similar to LXD configuration profiles. That’s the technique I’m going to use below to customize most of the settings for my containers.
Networking
If systemd-networkd is installed and running on both the host and the guest, virtualized networking gets configured automatically. This is an interesting advantage of using Ubuntu Server 20.04, where systemd-networkd is there by default.
The machine is visible from the host:
$ ping 192.168.8.165
PING 192.168.8.165 (192.168.8.165) 56(84) bytes of data.
64 bytes from 192.168.8.165: icmp_seq=1 ttl=64 time=0.130 ms
64 bytes from 192.168.8.165: icmp_seq=2 ttl=64 time=0.065 ms
64 bytes from 192.168.8.165: icmp_seq=3 ttl=64 time=0.095 ms
Make the machine visible to other hosts
I’m sure some port-mapping trickery would work, but since this is a container that should stand on its own, I wanted an equivalent of LXD’s bridged networking, where the container appears on the same network as the host, as a kind of “sibling”.
The thing to do is to create a bridge on the host system. This page has a good amount of detail on how to do that; in our ubuntu-server example, it means disabling cloud-init’s network configuration and creating a manual netplan config file:
# Do this as root, or use | sudo tee as appropriate
echo "network: {config: disabled}" > /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
rm /etc/network/interfaces.d/50-cloud-init.yaml
# ens3 is the name of the network interface in the host, change accordingly.
cat << EOF > /etc/netplan/01-bridge.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: no
  bridges:
    br0:
      dhcp4: yes
      interfaces:
        - ens3
EOF
# Apply the configuration, or reboot
netplan apply
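After applying, it’s worth checking that the bridge actually came up and took over the DHCP lease that previously went to ens3:

```shell
# br0 should show as routable/configured with an address; ens3 should have none.
networkctl status br0
ip addr show br0
```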
Once that’s done, configure the container to use the bridge (br0
as just configured):
sudo mkdir -p /etc/systemd/nspawn
cat << EOF | sudo tee /etc/systemd/nspawn/ubuntu-20.04.nspawn
[Network]
Bridge=br0
EOF
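The .nspawn file is only read when the machine starts, so restart the container; with the bridge in place it should pick up an address from the LAN’s DHCP server:

```shell
sudo machinectl poweroff ubuntu-20.04
sudo machinectl start ubuntu-20.04
machinectl list -l   # the container's address should now be on your LAN
```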
No virtual networking
In this mode the container shares the host’s network (application container mode); it looks like a Docker container where all ports are “forwarded” to the host by default.
sudo mkdir -p /etc/systemd/nspawn
cat << EOF | sudo tee /etc/systemd/nspawn/ubuntu-20.04.nspawn
[Network]
VirtualEthernet=no
EOF
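As an aside, if you keep the default virtual ethernet but only need a couple of ports reachable from outside, systemd-nspawn also supports explicit port forwarding via Port= in the [Network] section (it is only honored when private networking is in use). A minimal sketch, with hypothetical port numbers:

```ini
[Network]
# Forward host TCP port 8080 to port 80 in the container.
# Only effective with virtual ethernet, i.e. not together with VirtualEthernet=no.
Port=tcp:8080:80
```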
Disk/directory sharing
In the LXC world we had a very nice profile which enabled things such as:
- mapping your host user into the container with the same name
- adding that user to the container with sudo permissions and no password requirement
- installing your preferred shell
- mapping your home directory into the container at the same location
- (optionally mapping other host directories into the container)
This can be achieved with systemd-nspawn by using the same user IDs in the container and the host system. For this, ensure the image is pristine and has never been started with systemd-nspawn or machinectl: without the proper setup, these commands will change file ownership in strange ways (see the --private-users parameter to systemd-nspawn). With the image ready, first set PrivateUsers to false in the .nspawn file and configure the directories you want mounted:
cat << EOF | sudo tee /etc/systemd/nspawn/ubuntu-20.04.nspawn
[Exec]
PrivateUsers=false
[Files]
# A single parameter binds this host directory to the same path in the container
Bind=/src
# Two colon-separated parameters are source-in-host:destination-in-container
Bind=/home/my/sources-dir:/src
EOF
Next, as the user you typically use (or the one you want to own the bound directories), create a matching user in the container with the same name, UID, and group:
# Setup group and user in the container matching your current user's info.
C_UID=$(id -u)
C_GID=$(id -g)
C_GROUP=$(id -gn)
C_USER=$(id -un)
# --console=pipe keeps the command from opening a tty into the container;
# without it, the tty would eat the second command below.
# Note that $SHELL comes from the host; make sure the same shell exists in the
# container, or pass something like -s /bin/bash instead.
sudo systemd-nspawn --console=pipe -D /var/lib/machines/ubuntu-20.04 groupadd -g $C_GID $C_GROUP
sudo systemd-nspawn --console=pipe -D /var/lib/machines/ubuntu-20.04 useradd -g $C_GROUP -G sudo -u $C_UID -s $SHELL -m $C_USER
# TODO: Set up sudo (and install some key packages, e.g. sudo itself?), with a
# sudoers entry like:
#   %sudo ALL=(ALL) NOPASSWD:ALL
# TODO: Replicate SSH key setup
# TODO: Replicate bind mounting home?
# TODO: Replicate adding specific mounts?
# TODO: Make the config script idempotent?
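The last TODO can be sketched now: a hedged, idempotent version of the user setup above. The MACHINE and NSPAWN variables are my own parameterization (not part of the earlier steps); keeping the nspawn invocation in a variable also lets the function be exercised with a stub instead of a real container.

```shell
# Idempotent sketch of the group/user creation above.
: "${MACHINE:=ubuntu-20.04}"
: "${NSPAWN:=sudo systemd-nspawn --console=pipe -D /var/lib/machines/$MACHINE}"

# setup_container_user USER UID GROUP GID SHELL
# Creates the group and user inside the container only if they don't exist
# yet (checked with getent), so running it twice is harmless.
setup_container_user() {
    if ! $NSPAWN getent group "$3" > /dev/null 2>&1; then
        $NSPAWN groupadd -g "$4" "$3"
    fi
    if ! $NSPAWN getent passwd "$1" > /dev/null 2>&1; then
        $NSPAWN useradd -g "$3" -G sudo -u "$2" -s "$5" -m "$1"
    fi
}

# Example, matching the manual steps above:
# setup_container_user "$(id -un)" "$(id -u)" "$(id -gn)" "$(id -g)" "$SHELL"
```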
Once the user is created and binds are configured, the container can be started with machinectl start ubuntu-20.04, and the Bind-provided directories should be in place and accessible.
If things are owned by nobody/nogroup in the container, the PrivateUsers option is probably set to something other than false; that enables user ID mapping, which gets tricky if you want read-write access to files.
Keep in mind that using the same UID/GID namespace is somewhat insecure, so it’s best reserved for workloads you mostly trust, or for local development. systemd-nspawn has safer options using --private-users, but those are mostly incompatible with writable Bind directories, so I won’t discuss them here. Also keep in mind that systemd-nspawn is pretty spartan in its user mapping and volume management/permission capabilities; if you need something more elaborate, you will probably be better off using LXC instead.
Configuration
Most of the interesting config options supported by systemd-nspawn can be invoked more directly by using that command instead of machinectl. That said, most functionality is also available via machinectl; it just requires creating .nspawn unit files as seen above.
Other references
I pieced together this tutorial from resources found in the following sites:
I also had to read the man pages for systemd-nspawn and systemd.nspawn extensively. The first contains more detailed documentation on how each option works when given as a command-line parameter to systemd-nspawn itself, but I found it more comfortable to configure things in an .nspawn unit file as shown above. These files are documented in systemd.nspawn, and their options always correspond to systemd-nspawn command-line options. Doing it in the unit file also allows using machinectl for most day-to-day operations (starting, stopping, creating, removing, and shelling into the container).