Łukasz Kaiser (OpenAI lead researcher, Transformer co-author) tells us what's really happening with pretraining at OpenAI:
> scaling pretraining still improves models but we're at the top of the S-curve
> due to economic constraints, OpenAI shifted focus to more efficient models that could deliver the same capability
> around the time of GPT-4, all compute was dedicated to training, but when ChatGPT hit it big, GPUs were dedicated to inference
> a lot more GPUs coming online in 2026
> big chungus model training with distillation coming back in style next year for OpenAI
tl;dr: there is no wall.
Quoted post from Matt Turck (@mattturck):
Thanksgiving-week treat: an epic conversation on Frontier AI with @lukaszkaiser, co-author of “Attention Is All You Need” (Transformers) and leading research scientist at @OpenAI working on GPT-5.1-era reasoning models.
00:00 – Cold open and intro
01:29 – “AI slowdown” vs a…