#ai #computing
# [[Epistemic status]]
#shower-thought
# Related
- [[Computing/Intelligence/Machine Learning/Constrastive Learning]]
- [[Computing/Intelligence/Self supervised learning is the dark matter of intelligence]]
- [[Computing/Intelligence/Machine Learning/Reinforcement Learning/Ideas]]
- [[Computing/Intelligence/Machine Learning/Self supervised learning]]
# Self supervised reinforcement learning
#to-digest
[[Constrastive Learning]] is used in [[Self supervised learning]] to acquire supervision from the data itself.
In [[Reinforcement Learning]] a reward is usually given in some condition (win the game, whatever), how can we let the agent get his own reward?
In a more metaphysical words, shouldn’t a conscious agent theoretically have a goal by definition fundamentally imprinted in it: survival and conservation of its consciousness?
From this it can derive sub rewards, like we humans do, eating broccoli becomes a reward when Bob learn that broccoli is good for his health.
I don’t see how an agent could discover a form of supervision beside that?
# External links
https://arxiv.org/abs/2106.13970