The documentation encourages you to delve into the /usr/share/containers directory. As a low-privileged user, you should be able to read the files there but not necessarily edit them, because they are intended for the sysadmin to change. The files are as follows:
containers.conf seccomp.json
If you look inside the /etc/containers directory, you will find files that can override the settings in the previous directory. The file listing looks like this:
containers.conf policy.json registries.conf registries.d/ storage.conf
Note that Podman reads these configuration files in the following order, with settings in each later file potentially overriding those in the earlier ones:
/usr/share/containers/containers.conf
/etc/containers/containers.conf
$HOME/.config/containers/containers.conf
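The lookup order above can be sketched as a small shell function that prints whichever of the three files exist, in the order Podman is described as reading them. The function name list_containers_conf and its optional root-prefix argument are illustrative assumptions, added only so that the sketch can be tried against a scratch directory; they are not part of Podman.

```shell
#!/bin/sh
# Sketch: print the containers.conf files that exist, in the order listed above.
# list_containers_conf and its optional "prefix" argument are hypothetical helpers.
list_containers_conf() {
    prefix="${1:-}"
    for f in \
        "$prefix/usr/share/containers/containers.conf" \
        "$prefix/etc/containers/containers.conf" \
        "$prefix$HOME/.config/containers/containers.conf"
    do
        # Files printed later take precedence over files printed earlier.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# With no argument, the real locations on this host are checked.
list_containers_conf
```

The last path printed is the one whose settings win, so an empty result for the $HOME entry simply means that no per-user overrides are in place.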
The containers.conf file contains a number of user-tunable settings. You can configure cgroups (control groups) and resource quotas such as RAM and CPU, and you can also define which kernel capabilities are included. In Listing 2.5, we can see that the list of default capabilities has been commented out, which means that the entries in the file are not in use and Podman's corresponding built-in defaults apply instead.
Listing 2.5: Some Additional Kernel Capabilities That Can Be Uncommented for Containers to Use
# List of default capabilities for containers.
# If it is empty or commented out,
# the default capabilities defined in the container engine will
# be added.
#
# default_capabilities = [
#   "AUDIT_WRITE",
#   "CHOWN",
#   "DAC_OVERRIDE",
#   "FOWNER",
#   "FSETID",
#   "KILL",
#   "MKNOD",
#   "NET_BIND_SERVICE",
#   "NET_RAW",
#   "SETGID",
#   "SETPCAP",
#   "SETUID",
#   "SYS_CHROOT",
# ]
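As a rough illustration of narrowing that list, the sketch below writes a containers.conf override that enables only a handful of the capabilities from Listing 2.5. It assumes that default_capabilities belongs under the [containers] table, as it does in the stock file; the helper name write_caps_override, the chosen subset, and the scratch output path are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal containers.conf override to the given path.
write_caps_override() {
    target="$1"
    mkdir -p "$(dirname "$target")"
    cat > "$target" <<'EOF'
[containers]
# Illustrative subset of Listing 2.5; adjust to what your workloads need.
default_capabilities = [
  "CHOWN",
  "NET_BIND_SERVICE",
  "SETGID",
  "SETUID",
]
EOF
}

# Write to a scratch location first; copy the result to
# $HOME/.config/containers/containers.conf once you are happy with it.
write_caps_override "${TMPDIR:-/tmp}/containers.conf.example"
```

Keeping the override in the per-user file rather than editing the system-wide copy means the sysadmin's defaults stay untouched.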
The storage.conf file offers a comprehensive set of options for tuning your rootless container storage. You can remap UIDs and GIDs if required so that they appear differently inside and outside your containers to suit your volume mounting needs.
There are also settings for the devicemapper logging levels, which can help debug storage driver issues if required.
Inside the registries.conf file, it is also possible to set up your image registry settings. In that file you can see the following:
[registries.search]
registries = ['docker.io', 'quay.io']
And, in the registries.d/ directory, you can configure the settings required to access those container image registries with authentication, for example.
Summary
In this chapter, we have shown that running containers without relying on the root user is, thankfully, no longer a distant prospect.
Our first container runtime, Docker Engine, needs some more fine-tuning to get rootless mode working but did successfully launch a fully functional container without needing the root user. The second runtime, Podman, does not need to run around the clock as a daemon and also took little effort to install on Ubuntu 20.04. Its configuration is logically laid out too. Remember that not only is Podman capable of running with fewer privileges, but it is also a highly versatile, lightweight, and daemonless container runtime that can be used in a number of scenarios as the root user too.
Watch this space carefully. Although the nascent rootless innovations still need a little more work, rootless Podman is growing increasingly mature. Thanks to Red Hat's reach within enterprise environments, it is used extensively in OpenShift v4.0 platforms and is indeed battle-hardened as a production container runtime.
CHAPTER 3 Container Runtime Protection
In previous chapters, we looked at the need to get the permissions correctly configured to protect other containers running on a host and indeed the host itself. In Chapter 6, “Container Image CVEs,” we will also look at protecting against Common Vulnerabilities and Exposures (CVEs) to plug security holes in container images. The third major aspect of container security is at least as important from an operational perspective: the need to capture, and potentially automatically remediate, any issues when anomalous behavior is discovered in your running containers.
Only a handful of trustworthy and battle-worn container runtime security applications exist. Of those, there is one Open Source tool that stands out from the crowd. Created by a company called Sysdig (sysdig.com) in 2016 and now hosted by the Cloud Native Computing Foundation (CNCF), Falco (falco.org) excels at both container and host security rules enforcement and alerting. Among the more popular commercial tools are Prisma Cloud Compute Edition (formerly Twistlock prior to acquisition) and Aqua from AquaSec.
Falco (sysdig.com/opensource/falco) offers exceptional Open Source functionality that can be used to create rulesets to force containers to behave in precisely the way you want. It also integrates with Kubernetes API Audit Events, which means that all sorts of orchestrator actions can be secured as well. You can find more information here:
falco.org/docs/event-sources/kubernetes-audit.
In this chapter, we will look at installing Falco and then explore its features and how it can help secure our container runtime and underlying hosts, in the same way that some commercial products do, but without any associated fees. We will also explore using some of its rulesets and how to make changes to them yourself.
Running Falco
Following true Cloud Native methodology, we will use a container image to spawn Falco. That said, there are Linux rpm, deb, and binary files that you can install or execute directly, too, which appears to be the project's preferred installation route.
You can run Falco either directly on a host or in a userland container that additionally needs access to a pre-installed driver on the underlying host. Falco works by tapping into the kernel with elevated permissions to pick up the kernel's system calls (syscalls), and the driver is needed to offer that required functionality. We also need to provide Falco with the requisite permissions to enable such functionality. As described in Chapter 1, “What Is A Container?,” for a container runtime we define these permissions using kernel capabilities. To get an idea of what is available, you could do worse than looking over some of the names of the kernel capabilities in the manual (using the command man capabilities). Various versions of the manual are online too, such as this:
man7.org/linux/man-pages/man7/capabilities.7.html
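To see which capabilities a process actually holds, rather than just their names, you can inspect the bitmasks that the kernel exposes in /proc/<pid>/status. The sketch below is one way to test a single bit in such a mask. The helper name has_cap_bit is hypothetical, and the bit number used in the example (21 for CAP_SYS_ADMIN) is taken from the kernel's capability definitions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if bit number $2 is set in the hex capability mask $1.
has_cap_bit() {
    mask="$1"
    bit="$2"
    [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]
}

# Read the effective capability mask (CapEff) that this shell passes on to its
# child processes. /proc/self/status is Linux-specific, so guard its use.
if [ -r /proc/self/status ]; then
    cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
    if [ -n "$cap_eff" ] && has_cap_bit "$cap_eff" 21; then
        echo "CAP_SYS_ADMIN is in the effective set"
    else
        echo "CAP_SYS_ADMIN is not in the effective set"
    fi
fi
```

Running the same check inside a container is a quick way to confirm that the capabilities you added or dropped at startup are the ones the process ended up with.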
To protect the underlying host, we will run Falco with as few privileges as possible. Be warned, however, that you will need a kernel version of v5.8 or higher to make use of the extended Berkeley Packet Filter (eBPF) driver without running a one-off --privileged container to install that driver to the underlying host(s) that Falco will run on. eBPF extends the original Berkeley Packet Filter so that applications can gain increased access, via the kernel, to the networking stack and other kernel events.
If you are lucky enough to have a kernel of v5.8 or later, the way around the one-off driver installation is to add the CAP_BPF capability to your container at startup time, which the more modern kernels support. Add it using this command-line switch:
--cap-add BPF
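Before choosing between the two routes, it helps to confirm which side of the v5.8 threshold your kernel falls on. The following is a minimal sketch that compares the output of uname -r against a required major and minor version; the helper name kernel_at_least and its optional third argument (a release string to test instead of the running kernel) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel release $3 (default: output of uname -r)
# is at least version $1.$2.
kernel_at_least() {
    want_major="$1"
    want_minor="$2"
    release="${3:-$(uname -r)}"
    major=${release%%.*}          # text before the first dot, e.g. "5"
    rest=${release#*.}            # text after the first dot, e.g. "8.0-63-generic"
    minor=${rest%%[!0-9]*}        # leading digits of that remainder, e.g. "8"
    if [ "$major" -gt "$want_major" ]; then
        return 0
    fi
    [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]
}

if kernel_at_least 5 8; then
    echo "Kernel $(uname -r): the capability-based route may be an option"
else
    echo "Kernel $(uname -r): use the one-off privileged driver installation"
fi
```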
For this demonstration, we will not assume that you have that kernel version, so we will install the driver on the host using the one-off container method. The commands are as follows:
$ docker pull falcosecurity/falco-driver-loader:latest
$ docker run --rm -it --privileged -v /root/.falco:/root/.falco \
    -v /proc:/host/proc:ro -v /boot:/host/boot:ro \
    -v /lib/modules:/host/lib/modules:ro \
    -v /usr:/host/usr:ro