| 4:00 PM | Opening Remarks | |
| 4:05 PM | A decade of Envoy: From Lyft's Debugging Nightmare to the AI Infrastructure Era | In 2015, Lyft had over 30 microservices and a major problem: when something failed, developers couldn't identify if it was the app, AWS, or the network. This uncertainty made them hesitant to make service calls.
So Matt Klein built Envoy. A proxy designed around one idea: the network should be transparent to applications.
Envoy graduated from the CNCF in just two years. Today, it has hundreds of active contributors per quarter, powering production workloads from startups to the Fortune 500 as edge proxies, service mesh sidecars, and load balancers.
But Envoy's story isn't just about where it's been. Envoy Gateway brought Kubernetes-native simplicity to Envoy's power. Now Envoy AI Gateway extends that foundation to handle AI traffic: intelligent inference routing, multi-provider failover, token-based rate limiting, and MCP support.
This talk traces Envoy through three eras. The origin story. The ecosystem. And the AI-native future the community is building together. |
| 4:20 PM | MCP support in Envoy | This session will delve into the technical impacts of MCP on network proxies, exploring why MCP's structure (e.g., body-based attributes) and potential statefulness (e.g., Streamable HTTP sessions) require new thinking in traffic management, load balancing, and policy enforcement. It will also briefly cover how to leverage Envoy’s native support and extensibility to build MCP-aware networking features such as request inspection, dynamic routing, transcoding, session management, observability, and security. Finally, the talk will offer insights into how MCP and network proxies can evolve together to better support AI agent ecosystems. |
| 4:50 PM | Seamless AI-as-a-Service across Kubernetes Clusters With Envoy AI Gateway and kube-bind | As organizations scale their AI initiatives, they face a recurring nightmare: fragmented API keys, inconsistent model access, and "shadow AI" across multiple Kubernetes clusters. How do you provide a centralized, high-performance AI platform to dozens of remote teams without the overhead of complex Service Meshes or manual credential rotation?
In this session, we explore two powerful open source ecosystem tools: kube-bind and Envoy AI Gateway. We will demonstrate a "Provider-Consumer" architecture where a central GPU-enabled cluster acts as an AI Hub. |
| 5:05 PM | Integrating Stateful Load Balancing At Scale In Envoy | Traditional stateless load balancing strategies lead to poor cache hit rates, higher latency, and wasted compute for services. Stateful routing addresses the core issue but generally introduces availability and operational issues.
At Databricks, we integrated Dicer, an open source autosharder for large scale applications, with Envoy to provide highly available stateful load balancing with automatic fallback to Envoy's standard stateless load balancing.
In this talk we will walk through the evolution of this system:
- Why stateful routing matters
- Our first architecture using ExtProc + sidecar
- The current architecture using Envoy dynamic modules
- The limits of the system
This architecture achieves 90-95% cache hit rates, supports 99.95% availability for critical services, and preserves horizontal pod autoscaling.
We will also discuss our upstream contributions to the Envoy ExtProc filter and the Rust SDK for Envoy dynamic modules. |
| 5:35 PM | Bringing the power of Envoy extensibility to Go and Rust developers | Envoy is a highly extensible proxy that provides many mechanisms to extend its core functionality to specific use cases: Lua, ext-authz, ext-proc, WASM, dynamic modules... Leveraging those powerful features, however, is not easy. Some knowledge of Envoy internals is usually needed; the maturity of SDKs for languages other than C++ is an issue, and the overall developer UX is simply not there.
In this session, I'll introduce Built On Envoy, a project that fills the gap in the Envoy extensibility UX, providing developers with a zero-friction framework to create extensions using the official Dynamic Modules SDK, and very easily run and test extensions on their laptops using Rust and Go.
Built On Envoy also provides a home for extensions that don't belong to the Envoy tree, allowing users to discover, build, and share extensions, and lowering the barrier of Envoy adoption to make it the go-to proxy not only for knowledgeable platform teams, but for the wider appdev community. |
| 5:50 PM | Break | |
| 6:00 PM | GitOps Your Gateway: Automating Traffic Management at Scale | In this talk, we’ll explore how to scale your traffic management by managing HTTPRoutes and SecurityPolicies with GitOps principles using Argo CD. We’ll also demonstrate how dynamic values produced in Argo CD sync waves can be used to generate secure HTTPRoutes automatically. Using Tekton webhook integrations as a real-world example, you’ll see how GitOps workflows can orchestrate complex gateway configuration safely and consistently, making your traffic management predictable, automated, and developer-friendly. |
| 6:30 PM | Beyond OS Sockets: Custom Socket Interfaces in Envoy | Every Envoy connection begins the same way: a call to the OS socket API. But not every transport plays by those rules!
Some need to bypass the kernel entirely for performance; others need to invert who initiates the connection to reach services behind NAT. This talk introduces Envoy's custom socket interface, an extension point that lets users replace Envoy's socket creation layer entirely, as a bootstrap extension, without touching a single line of Envoy's L7 code. We walk through the SocketInterface API, how Envoy knows which backend to use for each connection, and two real implementations: a reverse tunnel socket interface that inverts the connection direction to reach NAT-hidden services, and the VPP/VCL socket interface that moves packet processing into user space, all without giving up a single Envoy feature. |
| 6:45 PM | SeaWall: Building a Planetary-Scale S3 Gateway with Envoy | Petabytes of data, billions of daily requests, and millions of tenants flow through a single gateway. At this scale, a small authentication delay or a failed rate limiter can trigger outages across regions. We built SeaWall, an Envoy-based gateway that inspects every request before it reaches origin. Even a 1 ms slowdown here translates into massive throughput loss.
This talk shows how we reduced authorization latency from 20 ms to sub-ms using gRPC ext_authz filters that parse S3 operations in-band while preserving strict multi-tenant isolation. We explain how we stopped cascading rate-limit failures through cross-region coordination that blocks quota abuse at the edge.
We also share how we eliminated deployment windows with SIGHUP-based Envoy reloads and accelerated incident debugging with enriched access logs. Built on lessons from operating planet-scale infrastructure, this talk equips you to move critical business logic to the edge where it belongs. |
| 7:00 PM | AI Traffic Is Different: Rethinking Proxy Architecture for Inference with Envoy | AI inference traffic introduces patterns that differ significantly from traditional microservice calls - long-lived gRPC streams, large payloads, bursty workloads, model version rollouts, and GPU-backed services. These shifts require us to rethink how proxy layers are designed and operated.
In this session, we examine how Envoy’s architecture - listeners, filter chains, routing, load balancing, circuit breaking, and xDS-driven dynamic configuration - can be applied to modern inference workloads. We explore architectural considerations such as streaming behavior, backpressure, version-based traffic splitting for models, and resilience strategies.
This talk provides a forward-looking, design-focused perspective on how Envoy can support evolving AI traffic patterns in cloud-native environments. |
| 7:30 PM | Closing Remarks | |