Sponsored Keynote: Scaling Platform Ops with AI Agents: Troubleshootin... Jorge Palma & Natan Yellin

Mar 27, 2026
Data from YouTube Data API v3

Video Overview

Video Details

Published: 2 months ago
Duration: 6:12
Video ID: Apha61UYfLY
Language: en
Category: Science & Technology
Privacy: Public
Made for Kids: No
Video Type: Regular Video

Performance Metrics

Views: 371
Likes: 9
Comments: 0
Engagement Rate: 2.43%
Likes per 100 views: 2.43
Comments per 1K views: 0.00
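
The ratio metrics above appear to follow directly from the raw counts. The page does not state its formulas, so the sketch below assumes that engagement rate is (likes + comments) / views, expressed as a percentage. The function name `engagement_metrics` and the dictionary keys are hypothetical and chosen only for illustration.

```python
def engagement_metrics(views: int, likes: int, comments: int) -> dict:
    """Derive ratio metrics from raw counts.

    The formulas are assumptions inferred from the listed values,
    not a documented definition from the source page.
    """
    if views <= 0:
        raise ValueError("views must be positive")
    return {
        # Assumed: all interactions (likes + comments) as a share of views.
        "engagement_rate_pct": round((likes + comments) / views * 100, 2),
        "likes_per_100_views": round(likes / views * 100, 2),
        "comments_per_1k_views": round(comments / views * 1000, 2),
    }


# Raw counts as listed for this video.
metrics = engagement_metrics(views=371, likes=9, comments=0)
print(metrics)
```

With zero comments, the engagement rate and the likes-per-100-views figure coincide under this assumption, which matches the two identical 2.43 values listed.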

Description

Don't miss out! Join us at our next KubeCon + CloudNativeCon events in Mumbai, India (18-19 June, 2026), Yokohama, Japan (29-30 July, 2026), and Shanghai, China (8-9 September, 2026). Connect with our current graduated, incubating, and sandbox projects as the community gathers to further the education and advancement of cloud native computing. Learn more at https://kubecon.io

Sponsored Keynote: Scaling Platform Ops with AI Agents: Troubleshooting to Remediation - Jorge Palma, Principal PDM Manager, Microsoft & Natan Yellin, CEO, Robusta

This keynote explores the practical reality of deploying AI agents to maintain Kubernetes clusters at scale. We'll demonstrate HolmesGPT, an open-source CNCF sandbox project that connects LLMs to operational and observability data to diagnose production issues. You'll see how agents reduce MTTR by correlating logs, metrics, and cluster state far faster than manual investigation.

Then we'll tackle the harder problem: moving from diagnosis to remediation. We'll show how agents with remediation policies can detect and fix issues autonomously, within strict RBAC boundaries, approval workflows, and audit trails. We'll be honest about challenges: LLM non-determinism, building trust, and why guardrails are non-negotiable. This isn't about replacing SREs; it's about multiplying their effectiveness so they can focus on creative problem-solving and system design.
