Build Hour: Prompt Caching

Feb 18, 2026
OpenAI

1.9M subscribers

Video Overview

Video Details

Published: 3 months ago
Duration: 56:04
Video ID: tECAkJAI_Vk
Language: en
Category: Science & Technology
Privacy: Public
Made for Kids: No
Video Type: Regular Video

Performance Metrics

Views: 4.5K
Likes: 218
Comments: 21
Engagement Rate: 5.28%
Likes per 100 views: 4.82
Comments per 1K views: 4.64

Description

Build faster, cheaper, and with lower latency using prompt caching. This Build Hour breaks down how prompt caching works and how to design your prompts to maximize cache hits. Learn what’s actually being cached, when caching applies, and how small changes in your prompts can have a big impact on cost and performance.

Erika Kettleson (Solutions Engineer) covers:
• What prompt caching is and why it matters for real-world apps
• How cache hits work (prefixes, token thresholds, and continuity)
• Best practices like using the Responses API and prompt_cache_key
• How to measure cache hit rate, latency, and token savings
• Customer Spotlight: Warp (https://www.warp.dev/), where Suraj Gupta (Team Lead) explains the impact of prompt caching

👉 Prompt Caching Docs: https://platform.openai.com/docs/guides/prompt-caching
👉 Prompt Caching 101 Cookbook: https://developers.openai.com/cookbook/examples/prompt_caching101
👉 Prompt Caching 201 Cookbook: https://developers.openai.com/cookbook/examples/prompt_caching_201
👉 Follow along with the code repo: http://github.com/openai/build-hours
👉 Sign up for upcoming live Build Hours: https://webinar.openai.com/buildhours

00:00 Introduction
02:37 Foundations, Mechanics, API Walkthrough
12:11 Demo: Batch Image Processing
16:55 Demo: Branching Chat
26:02 Demo: Long Running Compaction
32:39 Cache Discount Pricing Overview
36:03 Customer Spotlight: Warp
49:37 Q&A
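
The description mentions measuring cache hit rate. A minimal sketch of that arithmetic, not taken from the video: it assumes each request's usage has been copied into a plain dict with hypothetical keys `input_tokens` and `cached_tokens` (in practice these numbers would come from the usage object on each API response), and the token counts below are made up for illustration.

```python
def cache_hit_rate(usages):
    """Fraction of input tokens that were served from the cache.

    `usages` is a list of dicts with hypothetical keys
    "input_tokens" and "cached_tokens" (assumed shape, see lead-in).
    """
    total_input = sum(u["input_tokens"] for u in usages)
    total_cached = sum(u["cached_tokens"] for u in usages)
    # Avoid dividing by zero when no requests have been recorded.
    return total_cached / total_input if total_input else 0.0


# Illustrative numbers only: the first request misses the cache,
# the second reuses most of the same prefix.
usages = [
    {"input_tokens": 2000, "cached_tokens": 0},
    {"input_tokens": 2100, "cached_tokens": 1920},
]

rate = cache_hit_rate(usages)
print(f"cache hit rate: {rate:.1%}")
```

Aggregating over tokens rather than averaging per-request ratios keeps long prompts from being underweighted, which is usually what matters for cost.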
