Language Models are Few-Shot Learners

Authors: Tom et al. (47 in total)

arXiv: 001

Categories: cs.CL, cs.LG

📝 Abstract

We show that scaling up language models greatly improves task-agnostic, few-shot performance. GPT-3, with 175B parameters, achieves strong performance on many NLP datasets without task-specific fine-tuning.