#solomonoff-induction #induction #turing-machine #mathematics #epistemology
Created at 2023-01-19
# [Anonymous feedback](https://www.admonymous.co/louis030195)
# [[Epistemic status]]
#shower-thought
Last modified date: 2023-01-19 21:29
Commit: 5
# Related
- [[Philosophy/Epistemology/Kolmogorov complexity]]
- [[Computing/Complexity]]
- [[Computing/Software Engineering/Kolmogorov maintainability]]
- [[Readwise/Articles/en.wikipedia.org - Kolmogorov Complexity - Wikipedia]]
- [[Computing/Turing completeness]]
- [[Maximum entropy]]
# TODO
> [!TODO] TODO
# Solomonoff induction
Solomonoff induction is a formal theory of inference proposed by Ray Solomonoff in the 1960s which provides a principled way of inferring the most likely explanations for a given observation. It works by assigning a probability to each possible explanation (each program that could have generated the data), given the data seen so far. This probability comes from a universal prior, a probability distribution over all possible explanations that assigns more weight to explanations with shorter descriptions, a formalization of Occam's razor. By combining the universal prior with the observed data via Bayes' rule, Solomonoff induction identifies the most likely explanation and quantifies how probable it is. This has implications for many areas of artificial intelligence, including natural language processing, computer vision, and robotics.
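Concretely (this is the standard definition, not spelled out in the note itself): the universal prior $M$ weights every program $p$ whose output begins with the observed string $x$ by $2^{-|p|}$, where $|p|$ is the program's length in bits on a universal prefix Turing machine $U$:
$$M(x) = \sum_{p \,:\, U(p)\ \text{starts with}\ x} 2^{-|p|}$$
Shorter programs contribute exponentially more weight, so simpler explanations dominate the sum.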
Solomonoff induction is like a detective trying to figure out what happened. The detective starts by listing all the possible explanations and assigns each one a prior probability based on how simple it is. After weighing the evidence against each explanation, the detective can tell which one is most likely and how confident to be about it.
>The formalism of Solomonoff induction measures the "complexity of a description" by the length of the shortest computer program which produces that description as an output. To talk about the "shortest computer program" that does something, you need to specify a space of computer programs, which requires a language and interpreter. Solomonoff induction uses Turing machines, or rather, bitstrings that specify Turing machines.
>~ [[Eliezer Yudkowsky]]
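As a toy illustration of "the shortest computer program which produces that description" (a brute-force sketch over a made-up three-instruction language, not Solomonoff's actual Turing-machine construction):
```python
from itertools import product

# Toy stand-in for a universal machine: a program is a string over the
# instruction set {a, b, d}, interpreted left to right. This mini-language
# is a hypothetical example, not part of Solomonoff's formalism.
def run(program: str) -> str:
    out: list[str] = []
    for op in program:
        if op == "a":
            out.append("0")   # emit a 0
        elif op == "b":
            out.append("1")   # emit a 1
        elif op == "d":
            out.extend(out)   # double the output so far
    return "".join(out)

def shortest_program(target: str, max_len: int = 8) -> str | None:
    """Brute-force the shortest program whose output equals `target`."""
    for length in range(1, max_len + 1):
        for prog in product("abd", repeat=length):
            if run("".join(prog)) == target:
                return "".join(prog)
    return None

# A regular string has a short description: "abddd" emits "01" and then
# doubles it three times, so 16 bits compress to a 5-symbol program.
print(shortest_program("0101010101010101"))  # -> abddd
```
A random 16-bit string would typically have no program much shorter than itself, which is exactly why the universal prior treats it as a worse explanation.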
It seems to be a "perfect" [[Induction]], in the sense that it needs minimal data to converge on correct predictions about anything computable.
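This intuition can be made precise via Solomonoff's completeness theorem (sketched here from the standard statement): if the observations come from any computable distribution $\mu$, the universal predictor $M$ accumulates only a finite total expected prediction error, bounded by the complexity of the environment:
$$\sum_{t=1}^{\infty} \mathbb{E}_{\mu}\Big[\big(M(x_t \mid x_{<t}) - \mu(x_t \mid x_{<t})\big)^2\Big] \le \frac{\ln 2}{2}\, K(\mu)$$
So the number of badly mispredicted bits is on the order of the [[Philosophy/Epistemology/Kolmogorov complexity]] of the true environment, which is the precise sense in which "minimal data" suffices.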
## Example
In the seventeenth century, the Chinese court was very concerned with predicting solar eclipses, because the emperor was the symbol of the sun.
At some point, Jesuit missionaries came to China and brought with them [[Euclid]]'s [[Book|book]]s, whose [[Mathematic]]al tools allowed the court to improve its predictions.
A **Solomonoff [[Induction]]** approach to predicting solar eclipses would use minimal data, say, only the Sun's movement data ([[Lorentzian space-time]]?).
As a more concrete example, in https://www.kaggle.com/c/house-prices-advanced-regression-techniques, the task is to predict house prices given a dataset of feature ↔ price pairs.
A system achieving **Solomonoff [[Induction]]** might only need, say, 10% of the dataset to predict as well as a model trained on all of it (see the sketch just below).
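A computable stand-in for this sample-efficiency idea is to weight candidate models by $2^{-\text{description length}}$ and let the data reweight them, a minimal sketch assuming polynomial degree as a crude proxy for program length (the dataset here is synthetic, not the Kaggle one):
```python
import numpy as np

# Occam-weighted model averaging: the prior 2^(-degree) stands in for the
# universal prior over programs. A sketch, not real Solomonoff induction,
# which would enumerate all programs on a universal machine.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 0.3 + rng.normal(0.0, 0.05, size=x.size)  # synthetic "prices"

def posterior_weight(degree: int) -> float:
    """Prior 2^-degree times a Gaussian likelihood of the fit residuals."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    log_likelihood = -0.5 * np.sum((residuals / 0.05) ** 2)
    return 2.0 ** (-degree) * np.exp(log_likelihood)

degrees = range(1, 6)
weights = np.array([posterior_weight(d) for d in degrees])
weights /= weights.sum()
for d, w in zip(degrees, weights):
    print(f"degree {d}: posterior weight {w:.3f}")
# The simplest adequate model (degree 1) dominates even with few points:
# Occam's razor doing the work that would otherwise need more data.
```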
## Solomonoff [[Induction]], paradoxical?
I believe the true meaning of Solomonoff induction is more like the "optimal" use of data than perfection, because [[Godel Theorem]]-style limits (and the fact that the universal prior itself is uncomputable) teach us that you can never reach perfection. Therefore, Solomonoff [[Induction]] can be seen as a way of finding the best possible prediction with the least amount of data.
However, it can be seen as paradoxical in the sense that computing it exactly requires enumerating an infinite space of programs, and perfect convergence requires an unbounded stream of data. This means it is not practicable in full, even though it is theoretically well-defined. This paradox might be resolved by approximating Solomonoff [[Induction]] with computable methods such as [[Bayesian Inference]] over a restricted hypothesis class, finding the best possible prediction with the available data.
[[Maximum entropy]] is another concept that can be used in combination with Solomonoff [[Induction]] when data is limited. Among all distributions consistent with the observed constraints, it picks the one that maximizes the [[Entropy|entropy]], i.e., the one that encodes the most remaining uncertainty. This means it makes the weakest assumptions compatible with the evidence, rather than committing to details the data cannot support.
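As a sketch of that principle on a classic toy problem (Jaynes' die, an illustrative assumption rather than anything from the note above): among all distributions over the faces 1..6 whose mean matches an observed average of 4.5, pick the one maximizing $H(p) = -\sum_i p_i \log p_i$.
```python
import numpy as np
from scipy.optimize import minimize

# Maximum entropy: the least-committal distribution consistent with what
# we actually know (here, that the average die roll is 4.5).
faces = np.arange(1, 7)

def neg_entropy(p: np.ndarray) -> float:
    return float(np.sum(p * np.log(p + 1e-12)))  # minimize -H(p)

constraints = (
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},            # normalization
    {"type": "eq", "fun": lambda p: (p * faces).sum() - 4.5},  # observed mean
)
result = minimize(neg_entropy, x0=np.ones(6) / 6,
                  bounds=[(0.0, 1.0)] * 6, constraints=constraints)
print(result.x.round(3))  # tilted toward high faces, otherwise as flat as possible
```
The solution is an exponential tilt of the uniform distribution: it honors the constraint but refuses to encode any structure the data did not force.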