#ai #llm #ai-alignment

Created at 010423

# [Anonymous feedback](https://www.admonymous.co/louis030195)

# [[Epistemic status]]

#shower-thought

Last modified date: 010423
Commit: 0

# Related

- [[Computing/RLHF is an alignment algorithm]]
- [[Computing/AI thoughts 100323]]
- [[Computing/Intelligence/LLMs are a lever rather than an engine]]
- [[RLHF is an alignment algorithm]]
- [[Computing/Prediction is compression]]
- [[Computing/Intelligence/Alignment/Reinforcement learning from human feedback is an analogy of Asimov laws]]

# TODO

> [!TODO] TODO

# Making aligned LLMs end-to-end

By "aligned" [[Large language model|LLM]] I mean that [[ChatGPT]] differs from [[GPT3]] in that it better fulfils human desires, thanks to [[Reinforcement learning from human feedback|RLHF]].

But [[Reinforcement learning from human feedback|RLHF]] is a hack, a very expensive hack that requires a lot of human work. Typically, [[Artificial intelligence|AI]] research tries to rely less and less on humans to achieve higher scale, because humans are fucking slow.

So we need to get rid of [[Reinforcement learning from human feedback|RLHF]] by building a [[Self supervised learning]] algorithm that can gather this human feedback by itself.

How do babies get human feedback and learn?
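To make concrete what the expensive human work in RLHF is: annotators rank pairs of model outputs, and a reward model is fit to those rankings, commonly with a Bradley-Terry pairwise loss. A minimal sketch with toy reward scores (the numbers and the `human_comparisons` list are made up for illustration, not from any real model):

```python
import math

def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry preference loss: -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the reward model scores the human-preferred
    answer higher than the rejected one.
    """
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Each tuple is ONE human comparison: (score of preferred answer, score of the other).
# Collecting many of these is the slow, costly human step the note complains about.
human_comparisons = [(1.2, -0.3), (0.5, 0.9), (2.0, 0.1)]
mean_loss = sum(pairwise_loss(c, r) for c, r in human_comparisons) / len(human_comparisons)
```

Every gradient step on this loss is bottlenecked by the supply of labelled comparisons, which is why a self-supervised source of the same signal would scale better.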