PDF4PRO


XLNet: Generalized Autoregressive Pretraining for Language ...

XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang¹, Zihang Dai¹,², Yiming Yang¹, Jaime Carbonell¹, Ruslan Salakhutdinov¹, Quoc V. Le²
¹ Carnegie Mellon University, ² Google AI Brain

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation.
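The permutation-based objective the abstract describes can be sketched as follows (notation paraphrased from the paper; here Z_T denotes the set of all permutations of the index sequence [1, ..., T] and z_t the t-th element of a sampled order z):

\[
\max_{\theta}\; \mathbb{E}_{\mathbf{z}\sim\mathcal{Z}_T}\left[\sum_{t=1}^{T}\log p_{\theta}\!\left(x_{z_t}\,\middle|\,\mathbf{x}_{\mathbf{z}_{<t}}\right)\right]
\]

In expectation over factorization orders, each token is conditioned on every other position in the sequence, which is how bidirectional context is captured without corrupting the input with masks.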

conditional distribution. Since an AR language model is only trained to encode a uni-directional context (either forward or backward), it is not effective at modeling deep bidirectional contexts. On the ... the autoregressive objective also provides a natural way to …
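For comparison, the conventional forward AR factorization the excerpt refers to conditions each token only on its left context, which is the uni-directional limitation being described:

\[
\max_{\theta}\;\log p_{\theta}(\mathbf{x}) \;=\; \sum_{t=1}^{T}\log p_{\theta}\!\left(x_t\,\middle|\,\mathbf{x}_{<t}\right)
\]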

Tags:

  Conditional, Autoregressive, XLNet


Transcription of XLNet: Generalized Autoregressive Pretraining for Language ...
