RoBERTa: A Robustly Optimized BERT Pretraining Approach
Authors: Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Ves Stoyanov
Year: 2019
Journal: arXiv
DOI: 10.48550/arXiv.1907.11692
URL: https://arxiv.org/abs/1907.11692
Keywords: roberta, pretraining
Abstract
We present a replication study of BERT pretraining that carefully measures the impact of many key hyperparameters and training data size.
Cite this paper
@misc{roberta2019,
title = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
author = {Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and Luke Zettlemoyer and Ves Stoyanov},
year = {2019},
eprint = {1907.11692},
archivePrefix = {arXiv},
doi = {10.48550/arXiv.1907.11692},
url = {https://doi.org/10.48550/arXiv.1907.11692},
}