Fast and flexible inference on open-source AI models at scale | BRK117
Nov 25, 2025•Channel
Data from YouTube Data API v3
Video Overview
Video Details
Published: 6 months ago
Duration: 42:53
Video ID: mFG2hiMsR34
Language: en
Category: Science & Technology
Privacy: Public
Made for Kids: No
Video Type: Regular Video
Performance Metrics
Views: 67
Likes: 2
Comments: 0
Engagement Rate: 2.99%
Likes per 100 views: 2.99
Comments per 1K views: 0.00
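The page does not state how the derived metrics are calculated. The sketch below assumes the common definitions (engagement rate as likes plus comments divided by views, and the two ratios scaled per 100 and per 1,000 views), which reproduce the figures listed above. The function name `engagement_metrics` and the dictionary keys are hypothetical, chosen only for illustration.

```python
def engagement_metrics(views: int, likes: int, comments: int) -> dict:
    """Derive ratio metrics from raw counts.

    Assumed formulas (not documented by the source page):
      engagement rate (%)   = (likes + comments) / views * 100
      likes per 100 views   = likes / views * 100
      comments per 1K views = comments / views * 1000
    """
    if views <= 0:
        raise ValueError("views must be positive")
    return {
        "engagement_rate_pct": round((likes + comments) / views * 100, 2),
        "likes_per_100_views": round(likes / views * 100, 2),
        "comments_per_1k_views": round(comments / views * 1000, 2),
    }


# Raw counts taken from the listing above.
metrics = engagement_metrics(views=67, likes=2, comments=0)
print(metrics)
```

With zero comments, the engagement rate and the likes-per-100-views figure coincide, which is why both read 2.99 here.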
Video Tags
brk117, cary chai, english (us), fast and flexible inference on open-source ai models at scale | brk117, innovate with azure ai apps and agents, innovate with azure ai apps and agents:azure container apps, innovate with azure ai apps and agents:azure kubernetes service (aks), mehrdad abdolghafari, sachi desai, technical, f5n1, ignite, ignite 2025, microsoft, microsoft ignite, microsoft ignite 2025, ms ignite, ms ignite 2025, msft ignite, msft ignite 2025
Description
Run open-source AI models of your choice with flexibility—from local environments to cloud deployments using Azure Container Apps and serverless GPUs for fast, cost-efficient inferencing. You will also learn how AKS powers scalable, high-performance LLM operations with fine-tuned control, giving you confidence to deploy your models your way. You’ll leave with a clear path to run custom and OSS models with agility and cost clarity.
To learn more, please check out these resources:
* https://aka.ms/ignite25-plans-AIAppsCosmosDB
Speakers:
* Mehrdad Abdolghafari
* Cary Chai
* Sachi Desai
Session Information:
This is one of many sessions from the Microsoft Ignite 2025 event. View even more sessions on-demand and learn about Microsoft Ignite at https://ignite.microsoft.com
BRK117 | English (US) | Innovate with Azure AI apps and agents, Azure Container Apps, Azure Kubernetes Service (AKS)
Breakout | Intermediate (200)
#MSIgnite, #InnovatewithAzureAIappsandagents
Chapters:
00:00:00 - Use cases: hybrid model architecture, LLM agents, data boundary control
00:09:09 - Introduction to GPU-intensive workloads like physics and video processing
00:11:47 - Docker Compose for AI agents and simplified cloud deployment
00:16:00 - Live testing of the dashboard generator and log streaming visualization
00:20:31 - AKS investment areas: scale, security, cost optimization and AI support
00:25:04 - Enhanced workload scheduling and configuration for AI workloads
00:30:25 - Inference traffic management using Gateway API and Ignite demo preview
00:35:11 - RBC’s CI/CD pipeline accelerating secure GPU resource provisioning
00:38:01 - RBC strategy: building Canada’s largest AI farm within compliance boundaries