Sponsored Keynote: Scaling Platform Ops with AI Agents: Troubleshootin... Jorge Palma & Natan Yellin
Mar 27, 2026
Video Overview
Video Details
Published: 2 months ago
Duration: 6:12
Video ID: Apha61UYfLY
Language: en
Category: Science & Technology
Privacy: Public
Made for Kids: No
Video Type: Regular Video
Performance Metrics
Views: 371
Likes: 9
Comments: 0
Engagement Rate: 2.43%
Likes per 100 views: 2.43
Comments per 1K views: 0.00
Description
Don't miss out! Join us at our next KubeCon + CloudNativeCon events in Mumbai, India (18-19 June, 2026), Yokohama, Japan (29-30 July, 2026), and Shanghai, China (8-9 September, 2026). Connect with our current graduated, incubating, and sandbox projects as the community gathers to further the education and advancement of cloud native computing. Learn more at https://kubecon.io
Sponsored Keynote: Scaling Platform Ops with AI Agents: Troubleshooting to Remediation - Jorge Palma, Principal PDM Manager, Microsoft & Natan Yellin, CEO, Robusta
This keynote explores the practical reality of deploying AI agents to maintain Kubernetes clusters at scale. We'll demonstrate HolmesGPT, an open-source CNCF sandbox project that connects LLMs to operational and observability data to diagnose production issues. You'll see how agents reduce MTTR by correlating logs, metrics, and cluster state far faster than manual investigation.
Then we'll tackle the harder problem: moving from diagnosis to remediation. We'll show how agents with remediation policies can detect and fix issues autonomously, within strict RBAC boundaries, approval workflows, and audit trails. We'll be honest about challenges: LLM non-determinism, building trust, and why guardrails are non-negotiable.
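As a rough illustration of the guardrails described above, the following is a minimal sketch of a policy gate that an agent could consult before acting. It is not HolmesGPT's API or the speakers' implementation; the names `Action`, `RemediationPolicy`, and `decide` are hypothetical. It assumes an allowlist standing in for RBAC boundaries, a callback standing in for an approval workflow, and an in-memory list standing in for an audit trail.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    """A remediation the agent proposes (hypothetical shape)."""
    verb: str        # e.g. "restart"
    resource: str    # e.g. "deployment"
    namespace: str


@dataclass
class RemediationPolicy:
    """Hypothetical guardrail: allowlist, approval step, audit trail."""
    # (verb, resource) pairs the agent may perform at all (RBAC stand-in)
    allowed: set[tuple[str, str]]
    # Subset of allowed pairs that still require human sign-off
    needs_approval: set[tuple[str, str]] = field(default_factory=set)
    # Every decision is recorded here, including denials
    audit_log: list[dict] = field(default_factory=list)

    def decide(
        self,
        action: Action,
        approver: Optional[Callable[[Action], bool]] = None,
    ) -> str:
        key = (action.verb, action.resource)
        if key not in self.allowed:
            outcome = "denied"
        elif key in self.needs_approval:
            # Without an explicit approval, the action stays pending
            if approver is not None and approver(action):
                outcome = "approved"
            else:
                outcome = "pending_approval"
        else:
            outcome = "auto_approved"
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "outcome": outcome,
        })
        return outcome


policy = RemediationPolicy(
    allowed={("restart", "deployment"), ("delete", "pod")},
    needs_approval={("restart", "deployment")},
)
print(policy.decide(Action("delete", "pod", "prod")))
print(policy.decide(Action("restart", "deployment", "prod")))
```

In this sketch anything outside the allowlist is denied by default, and denials are logged alongside approvals, so the audit trail records what the agent attempted as well as what it did.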
This isn't about replacing SREs; it's about multiplying their effectiveness so they can focus on creative problem-solving and system design.