Customizing containerd
Running sandboxed containers using gVisor
For additional security in a Kubernetes cluster, it can be useful to run certain containers in a restricted runtime environment known as a sandbox. One option is gVisor, which provides a layer of separation between a running container and the host kernel.
To use gVisor, the necessary executables and containerd configuration can be added
to the image generated with image-builder by setting the containerd_gvisor_runtime
flag to true. For example, in a packer configuration file:
{
  "containerd_gvisor_runtime": "true",
  "containerd_gvisor_version": "yyyymmdd"
}
This tells image-builder to install runsc, the executable for gVisor, along with
the necessary configuration for containerd. Setting containerd_gvisor_version to a
value of the form yyyymmdd installs that specific point release. The version defaults to latest.
Once you have built your cluster using the new image, you can then create a RuntimeClass object
as follows:
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  # The name the RuntimeClass will be referenced by.
  # RuntimeClass is a non-namespaced resource.
  name: gvisor
handler: gvisor
Now, to run a pod in the sandboxed environment you just need to specify the name of the RuntimeClass
using runtimeClassName in the Pod spec:
apiVersion: v1
kind: Pod
metadata:
  name: test-sandboxed-pod
spec:
  runtimeClassName: gvisor
  containers:
  - name: sandboxed-container
    image: nginx
Once the pod is up and running, you can verify the sandbox by using kubectl exec to start a shell in the
pod and running dmesg. If the container sandbox is running correctly, you should see output similar
to the following:
root@sandboxed-container:/# dmesg
[ 0.000000] Starting gVisor...
[ 0.511752] Digging up root...
[ 0.910192] Recruiting cron-ies...
[ 1.075793] Rewriting operating system in Javascript...
[ 1.351495] Mounting deweydecimalfs...
[ 1.648946] Searching for socket adapter...
[ 2.115789] Checking naughty and nice process list...
[ 2.351749] Granting licence to kill(2)...
[ 2.627640] Creating bureaucratic processes...
[ 2.954404] Constructing home...
[ 3.396065] Segmenting fault lines...
[ 3.812981] Setting up VFS...
[ 4.164302] Setting up FUSE...
[ 4.224418] Ready!
You are running a sandboxed container.
Additional Customizations
Containerd can be further customized in two ways. To override the image pull progress timeout, set
containerd_image_pull_progress_timeout; the value is inserted directly into the containerd config.toml.
To add further configuration, set containerd_additional_settings; its contents are rendered at the end
of the default config.toml template.
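As a rough sketch, both variables could be set in the same kind of packer configuration file used for the gVisor flags. The timeout value "10m" is only an assumed example of a duration string, and the value shown for containerd_additional_settings is a placeholder rather than real content. Check the image-builder variable reference for the exact format and encoding each variable expects.

```json
{
  "containerd_image_pull_progress_timeout": "10m",
  "containerd_additional_settings": "<additional config.toml content>"
}
```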
Overriding LimitNOFILE
By default, a LimitNOFILE systemd drop-in (capping the value at 1048576) is deployed only on
Common Base Linux Mariner, Flatcar, and Microsoft Azure Linux, where the upstream infinity value
has been known to cause issues with some containerized software. To opt in to deploying the same
drop-in on other operating systems, set containerd_enable_limit_no_file to true. It defaults to
false.
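For example, opting in on another operating system might look like the following packer configuration fragment. Writing the boolean as the string "true" mirrors the containerd_gvisor_runtime example earlier on this page and is an assumption about how the value is parsed.

```json
{
  "containerd_enable_limit_no_file": "true"
}
```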