PDF4PRO ⚡AMP

Modern search engine that looking for books and documents around the web

Example: quiz answers

Attention Is All You Need - arXiv

Attention Is All You NeedAshish Vaswani Google Shazeer Google Parmar Google Uszkoreit Google Jones Google N. Gomez University of ukasz Kaiser Google Polosukhin dominant sequence transduction models are based on complex recurrent orconvolutional neural networks that include an encoder and a decoder. The bestperforming models also connect the encoder and decoder through an attentionmechanism. We propose a new simple network architecture, the Transformer,based solely on Attention mechanisms, dispensing with recurrence and convolutionsentirely. Experiments on two machine translation tasks show these models tobe superior in quality while being more parallelizable and requiring significantlyless time to train. Our model achieves BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, includingensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task,our model establishes a new single-model state-of-the-art BLEU score of aftertraining for days on eight GPUs, a small fraction of the training costs of thebest models from the literature.

Attention Is All You Need Ashish Vaswani Google Brain avaswani@google.com Noam Shazeer Google Brain noam@google.com Niki Parmar Google Research nikip@google.com

Tags:

  Needs, Attention, Attention is all you need

Information

Domain:

Source:

Link to this page:

Please notify us if you found a problem with this document:

Spam in document Broken preview Other abuse

Transcription of Attention Is All You Need - arXiv

Related search queries