Self-supervised Learning - National Taiwan University
Hung-yi Lee

ELMo (Embeddings from Language Models)
BERT (Bidirectional Encoder Representations from Transformers)
ERNIE (Enhanced Representation through Knowledge Integration)
Big Bird: Transformers for Longer Sequences

The models become larger and larger ...
• ELMo (94M)
• BERT (340M)
• GPT-2 (1542M)
• Megatron (8B)
• T5 (11B)
• Turing NLG (17B)
• GPT-3 (175B)
GPT-3 is 10 times larger than Turing NLG.

Outline
• BERT series
• GPT series

Self-supervised Learning
Supervised (model + label) vs. Self-supervised (model)

Masking Input (BERT)
• Randomly masking some tokens: each selected token is replaced by MASK (special token) or by a Random token.
• Transformer Encoder
• Linear
• softmax (over all characters)
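The masking step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure from the slides: the function name `mask_tokens`, the `[MASK]` spelling, the 15% masking rate, and the 50/50 split between the special token and a random token are all hypothetical choices made for the example. The returned `targets` are the original tokens that the Linear + softmax output would be trained to predict.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed spelling of the special token


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Randomly corrupt some tokens, BERT-style.

    Returns (corrupted, targets), where targets maps each corrupted
    position to its original token. mask_prob and the 50/50 split
    below are illustrative assumptions, not values from the slides.
    """
    rng = rng or random.Random()
    corrupted = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok  # what the model should recover
            if rng.random() < 0.5:
                corrupted[i] = MASK_TOKEN  # option 1: special token
            else:
                corrupted[i] = rng.choice(vocab)  # option 2: random token
    return corrupted, targets


if __name__ == "__main__":
    sentence = ["the", "models", "become", "larger", "and", "larger"]
    vocab = sorted(set(sentence))
    corrupted, targets = mask_tokens(sentence, vocab, mask_prob=0.3)
    print(corrupted)
    print(targets)
```

Because the targets come from the input itself, no human-provided label is needed, which is what makes the setup self-supervised.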
• Applying BERT to protein, DNA, music classification ...

Zero-shot Reading Comprehension
Multi-BERT: Training on the sentences of 104 languages
[Figure: documents paired with queries and answers (Doc1, Query1, Ans1; Doc2, Query2, Ans2; ...)]

Image - BYOL
Bootstrap your own latent: A new approach to self-supervised Learning