Attention Is All You Need

Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
Year: 2017
Journal: NeurIPS
DOI: 10.48550/arXiv.1706.03762
Publisher: https://arxiv.org/abs/1706.03762

Keywords: transformer, attention, neural machine translation

Abstract

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms.

Cite this paper

bibtex
@misc{transformer2017,
  title  = {Attention Is All You Need},
  author = {Ashish Vaswani and Noam Shazeer and Niki Parmar and Jakob Uszkoreit and Llion Jones and Aidan N. Gomez and Lukasz Kaiser and Illia Polosukhin},
  year   = {2017},
  journal = {NeurIPS},
  doi    = {10.48550/arXiv.1706.03762},
  url    = {https://doi.org/10.48550/arXiv.1706.03762},
}
