Evolving KServe: The Unified Model Inference Platform for Both Predictive and... F. Spolti & J. Lee
Apr 30, 2026
Video Overview
Video Details
Published: 1 month ago
Duration: 32:40
Video ID: zcdrXu1Fpy0
Language: en
Category: Science & Technology
Privacy: Public
Made for Kids: No
Video Type: Regular Video
Performance Metrics
Views: 688
Likes: 8
Comments: 0
Engagement Rate: 1.16%
Likes per 100 views: 1.16
Comments per 1K views: 0.00
Description
Evolving KServe: The Unified Model Inference Platform for Both Predictive and Generative AI - Filippe Spolti & Jooho Lee, Red Hat
As generative AI transforms how organizations build and deploy intelligent applications, the need for scalable, flexible, and interoperable model serving infrastructure is becoming critical. This session explores the evolution of model serving — from early, custom-built deployments to today’s cloud-native, Kubernetes-based platforms. We’ll discuss emerging challenges in productionizing large language models (LLMs), including inference efficiency, distributed execution, KV-cache management, and cost optimization.
We are excited to introduce the latest addition to the CNCF family, the KServe project, and its latest release, a major leap forward that extends KServe from predictive AI to serving generative AI workloads. This release features a new CRD purpose-built for LLM serving via llm-d, support for disaggregated inference architectures, enhanced model and KV caching, and seamless integration with the open source Envoy AI Gateway.