chris     9286  9213  0 15:51 pts/0    00:00:00 rootlesskit --net=vpnkit --mtu=1500 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run bin/dockerd-rootless.sh --experimental --storage-driver overlay2
chris     9295  9286  0 15:51 pts/0    00:00:00 /proc/self/exe --net=vpnkit --mtu=1500 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run bin/dockerd-rootless.sh --experimental --storage-driver overlay2
chris     9325  9295  0 15:51 pts/0    00:00:04 dockerd --experimental --storage-driver overlay2
chris     9343  9325  0 15:51 ?        00:00:03 containerd --config /home/chris/rootless/docker/containerd/containerd.toml --log-level info
To start a container in rootless mode, we need to point the Docker client precisely at where the rootless Docker Engine's socket file is located. Within a second terminal, we will run the following commands to spawn a rootless Apache container:
$ systemctl --user start docker
$ export XDG_RUNTIME_DIR=/home/chris/rootless; \
  export DOCKER_HOST=unix:///home/chris/rootless/docker.sock; \
  export PATH=/home/chris/bin:$PATH
$ docker run -d -p 8000:80 httpd
Unable to find image 'httpd:latest' locally
latest: Pulling from library/httpd
bf5952930446: Already exists
3d3fecf6569b: Pull complete
b5fc3125d912: Pull complete
679d69c01e90: Pull complete
76291586768e: Pull complete
Digest: sha256:3cbdff4bc16681541885ccf1524a532afa28d2a6578ab7c2d5154a7abc182379
Status: Downloaded newer image for httpd:latest
a8a031f6a3a3827eb255e1d92619519828f0b1cecfadde25f802a064c6258138
Excellent. That is what success looks like when the Docker runtime downloads an image and spawns a container in rootless mode. Note that if you had not chosen TCP port 8000 but instead a port lower than 1024 (normally TCP port 80 for web servers), you would have received an error because, as a non-root user, we can't open a privileged port.
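If you are curious why those lower ports are off limits, you can inspect the kernel's unprivileged port boundary with sysctl; the following is a quick sketch (1024 is the usual default), and a root user could lower the boundary system-wide, although that is a security trade-off worth weighing first:
$ sysctl net.ipv4.ip_unprivileged_port_start
net.ipv4.ip_unprivileged_port_start = 1024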
Also, take note that this feature is very new, and the process of getting rootless Docker to work may vary between builds. You have been warned!
If you run into trouble and need to start again, you can carefully run the following command as the root user (after first trying to execute it as your less privileged user) to kill off the related processes:
$ pkill rootlesskit; pkill dockerd; pkill experimental; pkill containerd
This should stop all the processes so you can start fresh.
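To double-check that nothing survived the cull, a process listing along the following lines (adjust the pattern to suit your host) should come back empty:
$ ps -ef | grep -E "rootlesskit|dockerd|containerd" | grep -v grep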
Let's do one final test to show that we have a container running in rootless mode and that we can reach its web server. A reminder that, unfortunately, network namespaces work differently when using rootless mode; you can, however, still try a few other familiar commands, such as the following:
$ docker ps
CONTAINER ID   IMAGE   COMMAND              STATUS          PORTS
a8a031f6a3a3   httpd   "httpd-foreground"   Up 15 minutes   0.0.0.0:8000->80/tcp
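Other familiar commands work just as well; for example, these two (illustrative only, reusing the container ID from the output above) will show the container's logs and the processes running inside it:
$ docker logs a8a031f6a3a3
$ docker top a8a031f6a3a3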
In the slightly abbreviated output, you can see that Apache's httpd container is running as hoped. To prove that the networking is different with this implementation, we can use this command to check our container's IP address (replacing a8a031f6a3a3 with the name or hash ID of your container):
$ docker inspect a8a031f6a3a3 | grep IPAddress
            "IPAddress": "172.17.0.2",
                    "MacAddress": "02:42:ac:11:00:02",
                    "IPAddress": "172.17.0.2",
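If you would rather not grep the raw JSON, docker inspect also accepts a Go-template --format option; this sketch assumes the container sits on the default bridge network:
$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' a8a031f6a3a3
172.17.0.2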
We can see that the container is using 172.17.0.2 as its IP address, but now try to connect to the exposed port, TCP port 8000:
$ nc -v 172.17.0.2 8000
Nothing happens. The connection does not work using the netcat tool, so we can see that there's definitely a change in the way standard networking behaves. According to the documentation cited earlier, this is expected behavior and occurs because “the daemon is namespaced inside RootlessKit's network namespace.” Because we are not using privileged ports (lower than port 1024), it is still possible to access the container's exposed port; but, as you might have guessed, we must do so via the host's network stack. For some potentially helpful context, if you're familiar with Kubernetes, this functionality resembles NodePort, where a container directly uses a host port on the host's IP address so that the container is accessible from outside of the cluster (more can be found at kubernetes.io/docs/concepts/services-networking/service). The following netcat command will work and will not just quietly fail this time:
$ nc -v localhost 8000
Connection to localhost 8000 port [tcp/*] succeeded!
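To see where that port actually lives, you can ask the host which process is listening on TCP port 8000; this is a sketch using the ss tool (run it as the same user, or as root, to see process names), and on a rootless setup you should typically find RootlessKit rather than dockerd holding the port:
$ ss -tlnp | grep 8000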
And, to prove that it is the correct container that responded to our netcat request, we can use the curl command to check that port on our localhost too:
$ curl localhost:8000
<html><body><h1>It works!</h1></body></html>
We have completed the installation of rootless mode using Docker and, additionally, have successfully proven the networking of a container running as the user chris. The next step in exploring this improvement to container security would be to run a number of containers of varying types to check the limitations that this mode introduces in a more complex environment.
Running Rootless Podman
For a more mature approach to running a container engine without the root user, let's look at another container runtime that can achieve that, too.
Some industry commentators were surprised when Red Hat applied extra development effort to a runtime called Podman, as it appeared to come out of the blue (developers.redhat.com/blog/2018/08/29/intro-to-podman). Red Hat has now gone a step further and reportedly removed official support for the Docker package from Red Hat Enterprise Linux v8.0. It has been said that keeping up with feature and security updates for the Community Edition of Docker was a driver in the decision to go it alone. Indeed, Red Hat has even created a podman-docker RPM package that links Docker to Podman for people who are comfortable using Docker commands (access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/building_running_and_managing_containers/index). Maintaining autonomy and avoiding reliance on a third party were also apparently factors; Red Hat customers made it clear that their preference was for the container runtime to be either integral to the operating system or integral to OpenShift.
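If you would like to try that compatibility shim yourself, on a Red Hat Enterprise Linux v8.0 or Fedora host the installation is a single package (the package names below are assumed from a typical setup; check the documentation linked above for your release), after which the docker command is simply a thin wrapper that calls Podman under the hood:
$ sudo dnf install -y podman podman-docker
$ docker --version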
Another valuable Podman feature is its ability to run daemonless. Consider that for a moment. Rather than running an application all year round on critical systems (which in itself represents a superuser-run attack vector), it is possible to use the container runtime only when it is needed. It is a clever and welcome addition to Podman's offering. For backward compatibility, the venerable Podman happily runs container images compliant with the Open Container Initiative (OCI; see www.opencontainers.org), and it is compatible with Docker images, too.
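As a quick illustration of that daemonless, rootless workflow (a sketch only, mirroring the image and port choices from the Docker example earlier), the same Apache image can be run as an ordinary user with no long-running daemon in place beforehand:
$ podman run -d -p 8000:80 docker.io/library/httpd
$ podman ps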
And, with Red Hat Enterprise Linux v8.0, there has been a clearer focus on helping users move away from Docker in Kubernetes and toward CRI-O (cri-o.io), which is now one of the preferred container runtimes in Kubernetes thanks to its lightweight and more secure nature. An interesting Red Hat blog entry can be found at developers.redhat.com/blog/2019/01/29/podman-kubernetes-yaml.
It is safe to say that Podman handles the running of containers differently than Docker does. Instead of using containerd (the popular runtime) and containerd-shim (the component that acts as a type of parent, shepherding a container's child processes to allow daemonless containers), it uses a conmon process for every running container. According to Red Hat (as described at developers.redhat.com/blog/2019/01/15/podman-managing-containers-pods/), conmon is a small program written in the C language that monitors the parent process of each container. If the container stops or fails, then it dutifully passes on the exit code. Additionally, conmon is responsible for allowing a tty to be attached to a container, and conmon also provides the daemonless functionality that Podman achieves. It manages this by continuing to run even when Podman has stopped, which cleverly keeps a detached container alive in the background. There is more information on how that works at developers.redhat.com/blog/2019/01/15/podman-managing-containers-pods.
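You can observe this model for yourself; the following sketch assumes the httpd container from the earlier Podman example is still running, and it should reveal one conmon process for each running container:
$ podman ps --format "{{.ID}} {{.Names}}"
$ ps -ef | grep conmon | grep -v grep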