#llm #ai #cheatsheet

Created at 240523

# [Anonymous feedback](https://www.admonymous.co/louis030195)

# [[Epistemic status]]

#shower-thought #human-in-the-loop

Last modified date: 240523

Commit: 0

# Related

- [[Computing/Intelligence/Machine Learning/Energy Models]]
- [[Computing/Intelligence/Machine Learning/Language models direction]]
- [[Computing/Prediction is compression]]
- [[Computing/Intelligence/Machine Learning/Transformer]]
- [[Mental models for sailing the current ocean of massive short-term AI noise]]

# Foundation models cheatsheet

| Model       | Architecture                      | Training                                                                  | Release Date | Training Data                              |
| ----------- | --------------------------------- | ------------------------------------------------------------------------- | ------------ | ------------------------------------------ |
| Transformer | Encoder-Decoder                   | Supervised (machine translation)                                           | 2017         | N/A                                        |
| BERT        | Encoder                           | Self-supervised (masked language modeling)                                 | 2018         | 3.3 billion words (BooksCorpus + Wikipedia)|
| GPT-2       | Decoder                           | Self-supervised (next-word prediction)                                     | 2019         | ~40 GB of text (WebText)                   |
| T5          | Encoder-Decoder                   | Denoising autoencoder (text-to-text, span corruption)                      | 2019         | ~750 GB of text (C4)                       |
| GPT-3       | Decoder                           | Self-supervised (next-word prediction)                                     | 2020         | ~300 billion tokens                        |
| BART        | Encoder-Decoder                   | Denoising autoencoder (masking, shuffling)                                 | 2020         | N/A                                        |
| ELECTRA     | Encoder (generator + discriminator) | Replaced-token detection                                                 | 2020         | Same corpora as BERT (Base) / XLNet (Large)|
| ChatGPT     | Decoder                           | Self-supervised pre-training + supervised fine-tuning + RLHF               | late 2022    | N/A                                        |
| GPT-4       | Decoder                           | Self-supervised pre-training + RLHF                                        | 2023         | N/A (undisclosed)                          |
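The training objectives in the table differ mainly in how the input is corrupted and what the model must predict. A minimal, stdlib-only sketch (toy token lists, no real model; the sentinel and mask token names mirror T5 and BERT conventions but the helper functions themselves are illustrative, not from any library):

```python
import random

def causal_lm_pairs(tokens):
    """GPT-style next-word prediction: each prefix predicts the next token."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def masked_lm(tokens, mask_prob=0.15, seed=0):
    """BERT-style masking: hide ~15% of tokens; targets are the hidden originals."""
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append("[MASK]")
            targets[i] = tok
        else:
            corrupted.append(tok)
    return corrupted, targets

def span_corruption(tokens, start=1, length=3):
    """T5-style denoising: drop a contiguous span, mark it with a sentinel,
    and train the model to emit the dropped span."""
    corrupted = tokens[:start] + ["<extra_id_0>"] + tokens[start + length:]
    target = ["<extra_id_0>"] + tokens[start:start + length]
    return corrupted, target

toks = ["the", "cat", "sat", "on", "the", "mat"]
print(causal_lm_pairs(toks)[0])      # (['the'], 'cat')
print(span_corruption(toks))
```

BERT's masking and T5's span corruption both see the full (corrupted) input at once, which is why they suit encoders; next-word prediction only ever conditions on the left context, which is the decoder-only setup GPT-2/3/4 share.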