Active learning and Reinforcement Learning via Human Feedback in Knowledge Management software

#idea #computing #ai #knowledge #education #personalised-education

# [[Epistemic status]]
#shower-thought #to-digest

# Related
> [!TODO] Related
> [[AI-Personalised education space]]

# TODO
> [!TODO] TODO

# Active learning and Reinforcement Learning via Human Feedback in Knowledge Management software
Training language models with Reinforcement Learning via Human Feedback (RLHF) is a common theme nowadays. For example, OpenAI uses it[^1] in its InstructGPT models to make them generate outputs that are more [[Alignment|aligned]] with the human's [[Personal growth/Goal|goal]], which led to their [[GPT3]] model "text-davinci-003".

![[Pasted image 20221203101152.png]]

There are open-source implementations, such as trlX[^2], that could be used to build a personalized [[Artificial intelligence|AI]] assistant that actively learns your [[The Map is not the Territory|map of the territory]] with a single purpose: optimizing your [[Education|education]].

# External links
[^1]: https://arxiv.org/pdf/2203.02155.pdf
[^2]: https://github.com/CarperAI/trlx