Anthropic Head of Pretraining on Scaling Laws, Compute, and the Future of AI
Sep 30, 2025
Data from YouTube Data API v3
Video Overview
Video Details
Published: 8 months ago
Duration: 1:04:05
Video ID: YFeb3yAxtjE
Language: en-US
Category: Science & Technology
Privacy: Public
Made for Kids: No
Video Type: Regular Video
Performance Metrics
Views: 23.1K
Likes: 453
Comments: 18
Engagement Rate: 2.04%
Likes per 100 views: 1.96
Comments per 1K views: 0.78
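The derived figures in the performance metrics above can be reproduced from the raw counts. The page does not state its formulas, so the sketch below assumes that engagement rate means (likes + comments) / views. That definition happens to match the listed 2.04%, but it remains an assumption. The view count is taken as 23,100 from the rounded "23.1K", so results are approximate. The function name `engagement_metrics` is hypothetical.

```python
# Raw counts read off the metrics listed above. Views are shown rounded
# ("23.1K"), so 23_100 is an approximation of the true count.
views = 23_100
likes = 453
comments = 18


def engagement_metrics(views, likes, comments):
    """Return (engagement rate %, likes per 100 views, comments per 1K views)."""
    if views <= 0:
        raise ValueError("views must be positive")
    # Assumed definition: interactions (likes + comments) as a share of views.
    engagement_rate = (likes + comments) / views * 100
    likes_per_100 = likes / views * 100
    comments_per_1k = comments / views * 1000
    return (
        round(engagement_rate, 2),
        round(likes_per_100, 2),
        round(comments_per_1k, 2),
    )


print(engagement_metrics(views, likes, comments))  # (2.04, 1.96, 0.78)
```

Under these assumptions the three values agree with the figures shown on the page.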
Description
Ever wonder what it actually takes to train a frontier AI model?
Ankit Gupta, YC General Partner, sits down with Nick Joseph, Anthropic's Head of Pre-training, to explore the engineering challenges behind training Claude—from managing thousands of GPUs and debugging cursed bugs to balancing compute between pre-training and RL. We cover scaling laws, data strategies, team composition, and why the hardest problems in AI are often infrastructure problems, not ML problems.
Apply to Y Combinator: https://www.ycombinator.com/apply
Work at a startup: https://www.ycombinator.com/jobs
Chapters:
00:00 – Introduction
01:05 – From Vicarious to OpenAI to Anthropic
06:40 – What pretraining is
11:20 – Why next-word prediction won out
16:05 – Scaling laws and the feedback loop of compute → models → revenue
21:50 – Building Anthropic’s early infrastructure
27:35 – Efficiency hacks and debugging at scale
33:10 – Generalists vs. specialists on the pretraining team
38:45 – Challenges of training across thousands of GPUs
44:15 – Working with new chips: GPUs vs. TPUs
49:00 – Pretraining vs. post-training (RLHF and reasoning models)
54:25 – The future of data quality and availability
59:10 – Where pretraining goes next
1:03:00 – Closing reflections