#ai #llm

# [[Epistemic status]]

#shower-thought #to-digest

# Related

> [!TODO] Related
> [[Active learning and Reinforcement Learning via Human Feedback in Knowledge Management software]]

# TODO

> [!TODO] TODO

# Active learning in Obsidian Ava

1. provide an easy means to give feedback on [[Artificial intelligence|AI]] outputs
2. collect that feedback (locally, to stay in the Obsidian mindset?) and slowly build a dataset
3. regularly fine-tune the [[Artificial intelligence|AI]] to be more [[Alignment|aligned]] with human intent

## Example

Let's say I generate [[Semantic links]] using [[Computing/Obsidian ava]] now:

---
Similar topic links:
[[Obsidian ava]]
[[AI-Personalised education space]]
[[Active learning and Reinforcement Learning via Human Feedback in Knowledge Management software]]
[[en.wikipedia.org - Neuro-Symbolic AI - Wikipedia]]
---

Let's say it failed and added "[[Evolution]]". We let the user tell the software that it was wrong, and collect the fact that Evolution is unrelated to "Active learning in Obsidian Ava". At the very least, that feedback can be used for evaluation when [[Fine-tuning sentence embeddings models using denoising autoencoder]] (a minimal sketch is at the end of this note).

I'm more and more inclined towards [[Reinforcement Learning]] when it comes to aligning human intent in [[AI assistant]]s, but there does not yet seem to be a clean implementation for fine-tuning sentence embeddings with reinforcement learning. Indeed, [[Reinforcement Learning]] is a framework that maps neatly onto human behaviour, better than traditional [[Supervised Learning]].

# External links
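
# Feedback collection sketch

A rough illustration of the idea above (collect link feedback locally, then use the rejections to evaluate the embedding model). This is not [[Computing/Obsidian ava]]'s actual implementation: the JSONL file path, the record schema, the similarity threshold, and the `all-MiniLM-L6-v2` sentence-transformers model are assumptions for illustration only.

```python
import json
from pathlib import Path

from sentence_transformers import SentenceTransformer, util

# Hypothetical local store inside the vault; Ava's real storage format may differ.
FEEDBACK_PATH = Path(".obsidian/ava-feedback.jsonl")


def record_feedback(source_note: str, suggested_link: str, accepted: bool) -> None:
    """Append one piece of user feedback on a suggested semantic link.

    Kept as a local JSONL file so the dataset stays inside the vault
    (the "locally, to stay in the Obsidian mindset" idea).
    """
    record = {"source": source_note, "suggestion": suggested_link, "accepted": accepted}
    FEEDBACK_PATH.parent.mkdir(parents=True, exist_ok=True)
    with FEEDBACK_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def rejected_still_scoring_high(model_name: str = "all-MiniLM-L6-v2",
                                threshold: float = 0.5) -> float:
    """Fraction of user-rejected links the current model still rates as similar.

    A crude evaluation signal: after fine-tuning the embeddings, this fraction
    should go down. The 0.5 cosine-similarity threshold is arbitrary.
    """
    records = [
        json.loads(line)
        for line in FEEDBACK_PATH.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    rejected = [r for r in records if not r["accepted"]]
    if not rejected:
        return 0.0
    model = SentenceTransformer(model_name)
    sources = model.encode([r["source"] for r in rejected], convert_to_tensor=True)
    suggestions = model.encode([r["suggestion"] for r in rejected], convert_to_tensor=True)
    # Cosine similarity between each note and the link the user rejected for it.
    scores = util.cos_sim(sources, suggestions).diagonal()
    return float((scores > threshold).float().mean())


# Example from the note: the user flags "Evolution" as unrelated.
record_feedback("Active learning in Obsidian Ava", "Evolution", accepted=False)
print(f"{rejected_still_scoring_high():.0%} of rejected links still look similar to the model")
```

The same rejected pairs could later serve as negatives for fine-tuning, not just for evaluation.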