Chinchilla Projects


Chinchilla

DeepMind's 70B-parameter LLM: a compute-optimal model that outperforms larger models such as GPT-3 and Gopher by training on 4x as much data.

Chinchilla is DeepMind's 70-billion-parameter large language model (LLM), introduced in 2022 in the paper *Training Compute-Optimal Large Language Models* to revise the prevailing scaling laws. It challenges the "bigger is better" trend: using the same compute budget as the 280B-parameter Gopher, Chinchilla achieves superior performance by training on 1.4 trillion tokens, four times more data. This compute-optimal approach yields an average accuracy of 67.5% on the MMLU benchmark and sharply cuts inference and fine-tuning costs: a clear win for efficiency.
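The compute-optimal trade-off above can be sketched with two rules of thumb commonly attributed to the paper's fits: training compute scales roughly as C ≈ 6·N·D (N parameters, D tokens), and the optimal token count is about 20 tokens per parameter. Both constants are approximations from the paper's reported fits, assumed here for illustration, not exact values.

```python
def compute_optimal(params: float, tokens_per_param: float = 20.0):
    """Estimate the compute-optimal token budget and training FLOPs
    for a given model size.

    Assumes the ~20 tokens-per-parameter heuristic and the
    C ~= 6 * N * D approximation for training compute; both are
    rough fits, not exact constants.
    """
    tokens = params * tokens_per_param      # D_opt ~= 20 * N
    flops = 6.0 * params * tokens           # C ~= 6 * N * D
    return tokens, flops

# Chinchilla itself: 70B parameters -> ~1.4T tokens, matching the
# figure quoted in the description above.
tokens, flops = compute_optimal(70e9)
```

Under these assumptions, a 70B-parameter model lands at roughly 1.4 trillion tokens, which is exactly the data budget Chinchilla was trained with.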

https://deepmind.google/discover/blog/an-empirical-analysis-of-compute-optimal-large-language-model-training/