
Training Language Models to Follow Instructions with Human Feedback


Authors: Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, et al.
Year: 2022
Journal: NeurIPS
DOI: 10.48550/arXiv.2203.02155
Publisher: https://arxiv.org/abs/2203.02155

Keywords: rlhf, instruction following, alignment

Abstract

We show that, in human evaluations, outputs from a 1.3B-parameter InstructGPT model are preferred to outputs from a 175B-parameter GPT-3 model.

Cite this paper

bibtex
@article{rlhfinstructgpt2022,
  title  = {Training Language Models to Follow Instructions with Human Feedback},
  author = {Long Ouyang and Jeff Wu and Xu Jiang and Diogo Almeida and Carroll L. Wainwright and Pamela Mishkin and Chong Zhang and others},
  year   = {2022},
  journal = {NeurIPS},
  doi    = {10.48550/arXiv.2203.02155},
  url    = {https://doi.org/10.48550/arXiv.2203.02155},
}

