Kubernetes: Seccomp
Context
I was attempting to run the Astronomer Software application locally using Kind. Part of that application includes Elasticsearch, which runs some bootstrap checks at startup. One of the checks was failing, preventing the Pod from reaching a Running state. The error included the message:
system call filters failed to install
So I found this brief entry in the Elasticsearch docs,
Elasticsearch installs system call filters of various flavors depending on the operating system (e.g., seccomp on Linux). These system call filters are installed to prevent the ability to execute system calls related to forking as a defense mechanism against arbitrary code execution attacks on Elasticsearch. [...]
Since I'm running a cluster via Kind, I wanted to check out any configurations related to system call filters. Claude told me,
The core issue is that kind doesn't enable seccomp by default (source)
Findings
Seccomp stands for secure computing mode and has been a feature of the Linux kernel since 2.6.12.
The primary purpose of seccomp in the context of Kubernetes is to restrict the system calls a container can make to the kernel.
There are different seccomp profiles:
- Unconfined: removes all seccomp restrictions, allowing the container to make any system call
- RuntimeDefault: the default seccomp profile that the container runtime uses
- Localhost: a profile that is defined on the host machine at /var/lib/kubelet/seccomp (a JSON file)
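For a sense of what such a JSON file can contain, here is a minimal sketch that prints a small profile. The function name and the choice of ptrace as the blocked syscall are illustrative assumptions, not something taken from the Astronomer or Elasticsearch setup.

```shell
# Print a tiny example seccomp profile (hypothetical content):
# log every syscall by default, but make ptrace fail with an error.
example_seccomp_profile() {
  cat <<'EOF'
{
  "defaultAction": "SCMP_ACT_LOG",
  "syscalls": [
    {
      "names": ["ptrace"],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
EOF
}

example_seccomp_profile
```

Saved on the node as, say, /var/lib/kubelet/seccomp/profiles/demo.json (a made-up file name), it would be referenced from a Pod with type: Localhost and localhostProfile: profiles/demo.json.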
You can set the seccomp profile for a whole Pod with spec.securityContext.seccompProfile, or for a single container with spec.containers[].securityContext.seccompProfile.
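As a rough sketch of where that field sits, the function below prints a Pod manifest with the profile type filled in. The Pod name, container name, image, and command are placeholders I made up. Only the seccompProfile field comes from the notes above.

```shell
# seccomp_pod_manifest TYPE: print a Pod manifest that uses the given
# seccomp profile type (Unconfined or RuntimeDefault).
# Pod name, container name, and image are hypothetical placeholders.
seccomp_pod_manifest() {
  profile_type="$1"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-demo
spec:
  securityContext:
    seccompProfile:
      type: ${profile_type}
  containers:
    - name: main
      image: busybox
      command: ["sleep", "3600"]
EOF
}

seccomp_pod_manifest RuntimeDefault
```

The output can be piped into kubectl apply -f - to try it on a cluster. A Localhost profile would also need a localhostProfile entry next to type, which this sketch leaves out.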
Unconfined seccomp profile means:
- Kubernetes tells containerd: "Don't apply any seccomp restrictions to this container"
- The container process can make any syscalls to the kernel
- BUT this doesn't guarantee the container has the capabilities needed to install seccomp filters
What Elasticsearch needs:
- The ability to call the seccomp() or prctl() syscalls to install its own filters
- Permission from the kernel to do so: a process may only install a filter if it holds CAP_SYS_ADMIN or has first set the no_new_privs bit (prctl(PR_SET_NO_NEW_PRIVS)), and the container's own seccomp setup must not block those calls
- Even with an Unconfined seccomp profile, containers typically run without CAP_SYS_ADMIN by default, so the capability route to installing a seccomp filter is usually closed.
You can see some information about capabilities by inspecting /proc/<pid>/status:
❯ k exec -it astronomer-elasticsearch-data-0 -- cat /proc/1/status | grep Cap
Defaulted container "es-data" out of: es-data, sysctl (init)
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000000000000000
CapAmb: 0000000000000000
❯ kubectl run privileged-test --image=busybox --restart=Never --rm -it --privileged -- cat /proc/1/status | grep Cap
CapInh: 0000000000000000
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
The Elasticsearch container has zero capabilities (0000000000000000), while the privileged container has full capabilities (000001ffffffffff).
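Those hex strings are bitmasks with one bit per capability, and CAP_SYS_ADMIN is capability number 21 in linux/capability.h. A small sketch for testing a single bit in the two masks above (the has_cap helper name is my own):

```shell
# CAP_SYS_ADMIN is capability number 21 in linux/capability.h.
CAP_SYS_ADMIN=21

# has_cap MASK BIT: succeeds when bit BIT is set in the hex string MASK.
has_cap() {
  [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# The two masks are the values observed in the output above.
for mask in 0000000000000000 000001ffffffffff; do
  if has_cap "$mask" "$CAP_SYS_ADMIN"; then
    echo "$mask: CAP_SYS_ADMIN present"
  else
    echo "$mask: CAP_SYS_ADMIN absent"
  fi
done
```

If libcap's capsh is installed, capsh --decode=000001ffffffffff prints the full list of capability names instead of checking one bit at a time.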
This is why Elasticsearch's seccomp bootstrap check is failing: Elasticsearch tries to call the seccomp() or prctl() syscalls to install its own filters, but the kernel only allows that for a process that holds CAP_SYS_ADMIN or has set the no_new_privs bit. The container has no capabilities at all, so the CAP_SYS_ADMIN path is unavailable and the kernel rejects these calls.
Unconfined seccomp profile ≠ having capabilities to modify seccomp. It just means "don't apply external seccomp restrictions," but the process still needs proper capabilities to manage seccomp itself.
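The same /proc/<pid>/status file also has a Seccomp field (0 = disabled, 1 = strict, 2 = filter), which is one way to check which seccomp mode a process is actually running under. Below is a minimal sketch of a helper that reads status text on stdin. The seccomp_mode name and the sample input are made up for illustration.

```shell
# seccomp_mode: read /proc/<pid>/status text on stdin and print the
# seccomp mode named by its "Seccomp:" line.
seccomp_mode() {
  awk '$1 == "Seccomp:" {
         if ($2 == 0) print "disabled";
         else if ($2 == 1) print "strict";
         else if ($2 == 2) print "filter";
         else print "unknown";
       }'
}

# Sample input so the sketch runs anywhere; these values are invented.
printf 'Name:\tjava\nNoNewPrivs:\t0\nSeccomp:\t2\n' | seccomp_mode
```

Against a real Pod it could be used as kubectl exec <pod> -- cat /proc/1/status | seccomp_mode. With an Unconfined profile, and before the process installs any filter of its own, I would expect it to print disabled.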