Edera Native Workload Intelligence for Kubernetes

If you run multi-tenant platforms, untrusted code, or AI agent workloads on Kubernetes, your platform team has already had this argument: the isolation models strong enough to contain a hostile workload also sever the signals that give you visibility into the production stack. Deploy true isolation and `kubectl top` returns wrong numbers, autoscaling loses the metrics it needs, and Prometheus dashboards go dark. Skip it and you are running untrusted code next to production with nothing in between. You get security or operability, but not both.

Today we're ending that tradeoff. Edera Native Workload Intelligence delivers hardware-enforced isolation without the operational compromises: visibility, autoscaling, and scheduling keep working, now behind a stronger security boundary.

Why Kubernetes Isolation Breaks Observability and Autoscaling

Isolation technologies have to enforce hard boundaries to be effective. But those boundaries often come with undesirable side effects. Kata Containers, one of the most common solutions, relies on the legacy Pod-request mechanism, which misreports workload memory and shows VM-aggregate allocation instead of workload-specific metrics. The downstream cost lands on platform teams: resource attribution breaks, so you over-provision by two to three times what you actually need. You can't see real memory usage, so OOM kills arrive unexpectedly. When something goes wrong, debugging means filing tickets and waiting days for SRE help.

With Native Workload Intelligence, you keep the operational surface platform teams actually depend on.

What We Shipped: Full Kubernetes Parity, No Custom Tooling

Native Workload Intelligence makes Edera-isolated pods first-class citizens in Kubernetes, from how they're scheduled and observed to how they're placed on hardware. This means we've added:

  • Dynamic Resource Allocation
  • Horizontal Pod Autoscaling
  • Stable, Production-Grade Observability
  • NUMA-Aware Placement

Figure: Edera zone dashboard for the api-gateway-7f4c workload showing 184ms cold start, zero memory pressure escapes, and in-zone kernel metrics including slab_reclaimable, dirty pages, TCP retransmit rate, page faults, context switches, and established connections over a 20-minute window.

First Non-GPU Runtime to Ship a Production DRA Driver

Pods backed by Edera zones are now modeled as Kubernetes devices using the Dynamic Resource Allocation (DRA) system, Kubernetes' newest scheduling primitive. The scheduler understands zone resources natively, allocating CPU and memory the same way it allocates GPUs. Edera is the first non-GPU isolation runtime to ship a production DRA driver, which puts isolation on the same scheduling path Kubernetes is building every new resource type around.

In practice: `kubectl top` works. Pod autoscaling works. VPA recommendations are accurate. Your dashboards, whether Prometheus, Datadog, or Grafana, see real per-pod data once you point them at the new endpoints. One `helm install` deploys the Edera agent, which includes the DRA driver, across your cluster. No sidecars, no per-pod agents, and no re-platforming.

Pod-level metrics flow into the Kubernetes Metrics API with namespace and pod labels intact, so your existing dashboards keep working without re-instrumentation.

Horizontal Pod Autoscaling on Real Zone Metrics

Pod autoscalers work end-to-end with Edera zones. Edera exposes a new metrics endpoint that publishes zone-level CPU and memory usage along with the Pod and Namespace correlation needed to tie that consumption back to Kubernetes resources. Any Kubernetes-native metrics aggregator can consume it. We document the Prometheus path: Prometheus Operator scrapes the endpoint and Prometheus Adapter transforms the metrics for the autoscaler. 

The same pattern works for Datadog, Grafana Alloy, or whatever your team already runs. You define your target utilization and the autoscaler acts on real-time data from Edera-isolated pods. The result: scaling tracks real workload pressure inside the zone, not a static VM allocation. No more scaling decisions based on numbers that drift from reality.
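
To make the dependence on accurate input concrete, the Kubernetes Horizontal Pod Autoscaler documentation describes the core scaling rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with no change when the ratio falls within a tolerance (0.1 by default). The Python sketch below reproduces only that arithmetic. The function name and the sample numbers are illustrative assumptions, not part of Edera or of the autoscaler's real implementation, which also handles readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """Core HPA scaling rule: ceil(current_replicas * current / target)."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    ratio = current_utilization / target_utilization
    # The autoscaler leaves the replica count alone when the ratio is close to 1.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical numbers: 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90.0, 60.0))
# A reading that never moves off the target never triggers a scaling action.
print(desired_replicas(4, 60.0, 60.0))
```

If the reported utilization reflects a fixed VM allocation rather than what the workload is doing, the ratio in this rule stops tracking load, which is why the metric source matters more than the autoscaler settings.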

Connect Your Existing Monitoring Stack — No Re-Instrumentation Required

Edera adds isolation without forcing your team to change or give up the monitoring stack you already run. Any APM tool that reads from the Kubernetes Metrics API can ingest Edera pod metrics directly. No new agents, no per-pod instrumentation, no parallel monitoring pipeline to maintain. Point your existing scrape config at the Edera endpoint and your existing monitoring investment carries forward.

The metrics server is built for production load, with clean Prometheus label schemas, accurate zone-to-pod attribution, and a pipeline organized for the tools your team already uses.
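
As a rough illustration of what namespace and pod labels make possible, the Python sketch below parses a few lines of Prometheus text exposition format and sums a gauge per namespace. The metric name `zone_memory_usage_bytes`, the label names, and the sample values are assumptions made up for this example; they are not Edera's published schema, and a real deployment would let Prometheus or another aggregator do the scraping and aggregation.

```python
import re
from collections import defaultdict

# Hypothetical sample of Prometheus text exposition output. The metric name
# and label names are assumptions for illustration, not Edera's real schema.
SAMPLE = '''\
# HELP zone_memory_usage_bytes Memory used inside the zone.
# TYPE zone_memory_usage_bytes gauge
zone_memory_usage_bytes{namespace="shop",pod="api-gateway-7f4c"} 536870912
zone_memory_usage_bytes{namespace="shop",pod="cart-5d9b"} 268435456
zone_memory_usage_bytes{namespace="batch",pod="etl-1a2b"} 1073741824
'''

# Matches lines of the form: metric_name{label="value",...} number
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def usage_by_namespace(text: str, metric: str) -> dict:
    """Sum one labeled gauge per namespace, skipping comments and other metrics."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        totals[labels.get("namespace", "")] += float(match.group("value"))
    return dict(totals)


print(usage_by_namespace(SAMPLE, "zone_memory_usage_bytes"))
```

The same grouping is what a dashboard query performs when it aggregates by namespace, which only works if each sample is attributed to the right pod in the first place.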

Automatic NUMA-Aware Placement for GPU and HPC Workloads

In multi-socket servers, non-uniform memory access (NUMA) means that where a workload's memory sits relative to its CPUs can carry dramatic latency and bandwidth penalties. What should be a local DRAM access becomes a cross-socket transaction over the CPU interconnect, adding hundreds of processor cycles of latency and reducing throughput at scale. Teams either accept the performance loss or resort to fragile manual CPU pinning that breaks as hardware changes.

Edera detects host NUMA topology automatically at boot, including CPU sockets, cores, memory channels, and PCIe device locations, and places every workload on the optimal node. GPU and NIC passthrough workloads are placed on the NUMA node local to their target device, minimizing cross-socket traffic and avoiding unnecessary interconnect latency. There is nothing to configure. No pinning scripts, no affinity rules, no topology configs to maintain.
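
For readers who want a feel for the decision involved, here is a minimal Python sketch of device-local placement. It assumes a Linux host, where the NUMA node of a PCI device is exposed in sysfs under `/sys/bus/pci/devices/<address>/numa_node` (with -1 meaning unknown). The helper names, the fallback policy, and the sample host are hypothetical and are not a description of Edera's placement logic.

```python
from pathlib import Path
from typing import Dict, Optional


def device_numa_node(pci_address: str, sysfs_root: str = "/sys") -> Optional[int]:
    """Read the NUMA node Linux reports for a PCI device, or None if unknown."""
    path = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_address / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    # The kernel reports -1 when it has no locality information for the device.
    return node if node >= 0 else None


def pick_node(free_bytes: Dict[int, int], needed_bytes: int,
              device_node: Optional[int]) -> Optional[int]:
    """Prefer the device-local node; otherwise take the roomiest node that fits."""
    if device_node is not None and free_bytes.get(device_node, 0) >= needed_bytes:
        return device_node
    candidates = [n for n, free in free_bytes.items() if free >= needed_bytes]
    return max(candidates, key=lambda n: free_bytes[n]) if candidates else None


GIB = 1024 ** 3
free = {0: 48 * GIB, 1: 96 * GIB}  # hypothetical two-socket host
print(pick_node(free, 32 * GIB, device_node=0))  # fits on the device-local node
print(pick_node(free, 64 * GIB, device_node=0))  # must fall back to the other node
```

The second call shows the tradeoff manual pinning scripts tend to miss: when the local node cannot hold the workload, something has to decide between waiting and accepting cross-socket traffic.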

For teams running AI training, inference pipelines, HPC, or high-throughput networking, this is the difference between bare-metal performance and leaving capacity on the table.

What This Enables for Platform Teams

Native Workload Intelligence is the same set of capabilities everywhere: hardware-enforced isolation plus full Kubernetes-native observability and scheduling. The three scenarios below are examples of where that combination changes what platform teams can actually ship, not a list of supported configurations.

Multi-tenant platforms

Run tenant workloads with strong isolation while keeping standard resource management, monitoring, and autoscaling consistent across all tenants. No custom tooling per tenant, no blind spots in resource attribution.

Hardened workload isolation

Hardware-enforced boundaries around every workload with full visibility into what's happening inside them. Useful whether the workload is untrusted code, a regulated tenant, or a sensitive production service. Scheduling, metrics, and autoscaling work the same way in every case.

AI agent sandboxing

GPU workloads placed on optimal NUMA nodes automatically, fully isolated from each other, and fully visible to your existing monitoring stack. Run untrusted agent code at scale without giving up the operational consistency your platform team needs. You end up with bare-metal-class GPU performance through NUMA-aware placement and hardware isolation between agents. Not one or the other.

Coming Next: Per-Workload Kernel Visibility

Native Workload Intelligence lets you get full value from the tools you already run. What we are building next goes further: visibility you have never had on Kubernetes. Because each Edera workload runs in its own kernel, we can surface per-workload kernel metrics that are architecturally impossible in shared-kernel environments. For example:

  • True mem_available that accounts for reclaimable caches and buffers. 
  • Memory pressure signals that predict OOMKills before they happen. 
  • Per-workload page fault rates, CPU context switches, TCP retransmits, and disk I/O. 
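
To illustrate the first bullet: on Linux, `/proc/meminfo` reports both `MemFree` and `MemAvailable`, where `MemAvailable` is the kernel's estimate of memory usable without swapping, including reclaimable caches. The Python sketch below parses a meminfo-style snapshot and compares the two. The snapshot values are hypothetical, and this is not how Edera collects or exports the metric; it only shows why the "available" figure is the more useful signal.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a dict of byte counts."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue
        amount = int(fields[0])
        # Most entries carry a "kB" unit; a few (e.g. HugePages_Total) are bare counts.
        if len(fields) > 1 and fields[1] == "kB":
            amount *= 1024
        values[key.strip()] = amount
    return values


def available_fraction(meminfo: dict) -> float:
    """Share of total memory the kernel estimates is usable without swapping."""
    return meminfo["MemAvailable"] / meminfo["MemTotal"]


# Hypothetical snapshot; on a Linux guest this would come from /proc/meminfo.
SAMPLE = """\
MemTotal:        8388608 kB
MemFree:          524288 kB
MemAvailable:    2097152 kB
Cached:          1572864 kB
HugePages_Total:       0
"""

info = parse_meminfo(SAMPLE)
print(f"free only:  {info['MemFree'] / info['MemTotal']:.1%}")
print(f"available:  {available_fraction(info):.1%}")
```

In this made-up snapshot, looking at free memory alone would suggest the workload is nearly out of headroom, while the available figure shows a quarter of memory is still reclaimable. In a shared-kernel setup these numbers describe the whole host, not one pod.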

When kernel visibility lands, developers will be able to self-service debug memory issues in minutes using isolated kernel metrics. Platform teams will be able to right-size resource requests based on actual data instead of fear. Because this data feeds your existing monitoring stack (Prometheus, Datadog, Grafana), there's nothing new to learn or adopt.

Figure: In-zone kernel metrics panel showing telemetry from inside the guest kernel over a 20-minute window: slab_reclaimable at 1.24 GiB (ballooning target), dirty_pages at 82 MiB, tcp_retrans_rate at 0.02%, page_faults/sec at 1.1k, context_switches/s at 14.2k/s, and tcp_estab at 1,978, all stable and scoped exclusively to this workload.

This isn’t a configuration setting, a plug-in, or an additional layer; it’s inherent in Edera's architecture. Vanilla Kubernetes shares the host kernel and cannot isolate per-pod kernel metrics. Kata Containers gives each workload a dedicated kernel but lacks built-in Kubernetes integration. Edera is the only platform where per-workload kernel visibility is both architecturally possible and operationally accessible through standard Kubernetes tooling.

Get Started and Shape the Roadmap

Native Workload Intelligence is available now. A single helm install brings the Kubernetes integration components online. NUMA placement is enabled by default with no configuration. 

You can help shape our roadmap! We’d love to hear feedback on these new features, as well as what platform teams would like us to build next. Reach out to our product team here.

Start running your infrastructure with confidence. Request a demo or explore the documentation to get started.

FAQ

What is Edera Native Workload Intelligence?

Native Workload Intelligence is a set of Kubernetes-native capabilities delivered via a single Helm install that gives Edera-isolated pods full parity with standard pods: working HPA, accurate per-pod metrics in Prometheus and Datadog, accurate VPA recommendations, and automatic NUMA-aware workload placement, all without sidecars or custom tooling.

Why does Kata Containers break horizontal pod autoscaling?

Kata Containers uses the legacy Pod-request mechanism, which reports VM-aggregate memory allocation rather than workload-specific usage. This breaks resource attribution, forcing teams to over-provision by 2–3x and leaving autoscalers without accurate per-pod data to act on.

What is a DRA driver and why is Edera's significant?

Dynamic Resource Allocation (DRA) is Kubernetes' newest scheduling primitive, previously used only for GPUs. Edera is the first non-GPU container isolation runtime to ship a production DRA driver, putting isolated workloads on the same scheduling path Kubernetes uses for all new resource types.

How does NUMA-aware placement affect GPU and AI workload performance?

Edera detects host NUMA topology at boot and places GPU and NIC passthrough workloads on the NUMA node local to their target device, minimizing cross-socket memory traffic. For AI inference and training, this is the difference between bare-metal throughput and significant latency overhead from the CPU interconnect.
