Containers are eating the world, and more than ever, cloud computing touches crucial aspects of our lives: our personal data, our privacy, and even government secrets. Every day, huge data leaks and attacks on our critical infrastructure are in the news. We're a decade into Kubernetes, yet the list of critical container escape vulnerabilities continues to grow. Secure containers haven't gained widespread adoption because the solutions are too complex and require dedicated bare metal nodes. Bare metal is at odds with scalability, isn't generally available from major cloud providers, and is very expensive to run. If security is at odds with economics, security loses out.
Containers weren’t designed to be secure, and the industry’s answer to security hasn't given us much hope. It seems like every new moon, a new startup is born or a new enterprise product line is released, selling the same container detection and alerting product. The industry is hyper-focused on detecting when a container is exploited, because the lack of container security gives an attacker unfettered access to a cloud environment. Don’t get me wrong: detection is a crucial part of security, but detection should complement secure defaults, not be a substitute for them.
If you haven’t read Palo Alto Networks’ excellent Unit 42 report on container escape techniques, I highly recommend checking it out; it is likely the most comprehensive compilation of container escape techniques I’ve seen. And I spend a lot of time thinking about this! The article quickly calls out the fundamental vulnerability that standard containers suffer from: a shared kernel.
“Sharing the same kernel and often lacking complete isolation from the host's user-mode, containers are susceptible to various techniques employed by attackers seeking to escape the confines of a container environment.”
Container isolation in Kubernetes is complex: it involves multiple layers, each requiring fine-tuned configuration that deviates from the Kubernetes defaults.
Containers are just standard Linux processes that depend on, among other layers, namespaces and capabilities for isolation while still sharing the host kernel. Namespaces are the default isolation primitive of containers, emulating isolation with the mount, PID, network, cgroup, IPC, UTS, and user namespaces. Unless you configure additional layers of isolation, namespaces are the only isolation in Kubernetes, and they’re woefully inadequate.
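To make that concrete, here's a minimal sketch (in Go, and not any particular runtime's code) of how a runtime drops a process into new namespaces. It's Linux-only and typically needs root; the shell it launches gets its own hostname, PID space, and mounts, yet uname -r inside still reports the host kernel, because namespaces change what a process can see, not what it runs on.

```go
// Sketch: launch a shell in new UTS, PID, and mount namespaces.
// Linux-only; typically requires root. Inside the shell, hostname and
// PIDs look "isolated", yet uname -r still reports the host kernel.
package main

import (
	"os"
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("/bin/sh")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		// Namespaces partition kernel-managed resources per process;
		// they do not give the process its own kernel.
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
	}
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```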
The other isolation technique that’s widely deployed is Linux capabilities. Capabilities add or remove individual privileges from each workload, which means that if you run 1,000 workloads, you need 1,000 capability security policies. Just knowing which capabilities a workload needs to run requires you to observe every application ahead of time and then translate each capability into security policy. That’s an awful lot of time and manual effort. Even if you could afford to comprehensively observe and remove every dangerous capability, a single container escape negates all the protections of your capability policies.
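To give a flavor of that toil, here's a hypothetical per-workload policy sketched with the Kubernetes Go API types (the container name, image, and capability list are made up; in practice you only learn the real list by profiling the application first):

```go
// Sketch of one workload's capability policy using k8s.io/api/core/v1.
// Multiply this by every workload in your fleet, and keep it current as
// the applications change.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	web := corev1.Container{
		Name:  "web",                  // hypothetical workload
		Image: "registry.example/web", // hypothetical image
		SecurityContext: &corev1.SecurityContext{
			Capabilities: &corev1.Capabilities{
				Drop: []corev1.Capability{"ALL"},
				// Add back only what this specific app was observed to need.
				Add: []corev1.Capability{"NET_BIND_SERVICE"},
			},
		},
	}
	fmt.Printf("%+v\n", web.SecurityContext.Capabilities)
}
```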
We don’t have room in this post to get into all the additional isolation techniques for containers, like cgroups, Linux Security Modules, and seccomp. Suffice it to say, they don’t holistically mitigate the vulnerability a shared kernel imposes, and configuring them only adds complexity and toil. This limitation is well known and is called out in the official Kubernetes documentation.
“In a shared environment, unpatched vulnerabilities in the application and system layers can be exploited by attackers for container breakouts and remote code execution that allow access to host resources.” - Kubernetes documentation on Multi-tenancy
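Each of those layers brings its own configuration, too. Continuing the hypothetical container from above, here's roughly what layering on a seccomp profile and a few other common hardening settings looks like with the same Kubernetes Go types (the specific values are illustrative, not a recommendation):

```go
// Sketch: layering more isolation knobs onto the same container's
// securityContext. More configuration, same shared kernel underneath.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func boolPtr(b bool) *bool { return &b }

func main() {
	sc := &corev1.SecurityContext{
		Capabilities: &corev1.Capabilities{
			Drop: []corev1.Capability{"ALL"},
		},
		SeccompProfile: &corev1.SeccompProfile{
			// Filter syscalls with the runtime's default profile.
			Type: corev1.SeccompProfileTypeRuntimeDefault,
		},
		AllowPrivilegeEscalation: boolPtr(false),
		ReadOnlyRootFilesystem:   boolPtr(true),
		RunAsNonRoot:             boolPtr(true),
	}
	fmt.Printf("%+v\n", sc)
}
```

All of that configuration, and the kernel underneath is still shared by every container on the node.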
At Edera, we’re working to make container security something you don’t have to think about. We remove the burden on DevOps and security teams of spending countless cycles building their own bespoke isolation configurations. With Edera, deploying a single Kubernetes manifest isolates every container on your cluster, so you can run untrusted, multi-tenant, or end-of-life applications securely.
We’re building containers the way they should have been built. We don’t rely on namespace- and capability-based isolation. Edera eliminates the root of the problem: the shared kernel. All workloads on Edera run with their own dedicated Linux kernel that is isolated from the host, eliminating shared kernel state of any kind between containers. Our guest kernels are configurable and deployed as OCI images with signed provenance for supply chain security, and they leverage state-of-the-art hardening with OpenPaX.
Edera securely isolates every container from the others and from the host, whether you’re running containers that hold the keys to the kingdom or workloads that are end-of-life, untrusted, or multi-tenant. We leverage our custom memory-safe type-1 hypervisor, enabling secure isolation with no impact on workload performance. Edera runs in any Kubernetes environment, on any hardware platform, with SLAs and support from industry experts.
Our core philosophy is to make containers secure by design, changing how they’re run at the lowest levels. All of this is transparent to your applications; they don’t even know they’re being protected.
If you’re tired of complex isolation solutions that still don’t work and endless monitoring of every move your containers make, reach out. We’re changing the paradigm of container security, and we’re just getting started.