Deep Learning With Python

https://github.com/fchollet/deep-learning-with-python-notebooks

Concisely, AI can be described as the effort to automate intellectual tasks normally performed by humans.

Could a general-purpose computer “originate” anything, or would it always be bound to dully execute processes we humans fully understand? Could it ever be capable of any original thought? Could it learn from experience? Could it show creativity?

Machine learning turns classical programming around: instead of being handed rules written by a programmer, the machine looks at the input data and the corresponding answers, and figures out what the rules should be.

Therefore, the central problem in machine learning and deep learning is to meaningfully transform data: in other words, to learn useful representations of the input data at hand—representations that get us closer to the expected output.

At its core, a representation is a different way to look at data: a way to represent or encode it. For instance, a color image can be encoded in the RGB format (red-green-blue) or in the HSV format (hue-saturation-value): these are two different representations of the same data.
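A small sketch of the RGB/HSV point, using Python's standard `colorsys` module. The pixel value is an arbitrary example chosen for illustration, not something taken from the book.

```python
import colorsys

# An orange pixel with channels scaled to [0, 1] (arbitrary example value).
rgb = (1.0, 0.5, 0.0)

# The same pixel re-encoded as (hue, saturation, value).
hsv = colorsys.rgb_to_hsv(*rgb)

# Converting back recovers the original color: the data is unchanged,
# only its representation differs.
rgb_again = colorsys.hsv_to_rgb(*hsv)

print("RGB:", rgb)
print("HSV:", hsv)
print("RGB (round trip):", rgb_again)
```

Which encoding is more useful depends on the task: a question like "how saturated is this pixel?" is a single lookup in HSV but takes arithmetic in RGB.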

Decision trees learned from data began to receive significant research interest in the 2000s, and by 2010 they were often preferred to kernel methods.

Tags: #ai

1. Draw a batch of training samples, x, and corresponding targets, y_true.
2. Run the model on x to obtain predictions, y_pred (this is called the forward pass).
3. Compute the loss of the model on the batch, a measure of the mismatch between y_pred and y_true.
4. Compute the gradient of the loss with regard to the model’s parameters (this is called the backward pass).
5. Move the parameters a little in the opposite direction from the gradient (for example, W -= learning_rate * gradient), thus reducing the loss on the batch a bit. The learning rate (learning_rate here) would be a scalar factor modulating the “speed” of the gradient descent process.
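The five steps above can be sketched in plain Python for a one-variable linear model. Everything specific here is an assumption made for illustration: the toy dataset (y = 3x + 1, no noise), the mean-squared-error loss, the hand-derived gradients, and the hyperparameter values. The book itself works with tensors and a deep learning framework rather than scalar loops.

```python
import random

random.seed(0)

# Toy dataset (assumed for illustration): targets follow y = 3x + 1 exactly.
data = [(i / 10, 3.0 * (i / 10) + 1.0) for i in range(-20, 21)]

w, b = 0.0, 0.0        # model parameters, playing the role of W in the notes
learning_rate = 0.1
batch_size = 8


def mean_squared_error(samples, w, b):
    """Average squared mismatch between predictions and targets."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


initial_loss = mean_squared_error(data, w, b)

for step in range(500):
    # 1. Draw a batch of training samples x and targets y_true.
    batch = random.sample(data, batch_size)

    # 2. Forward pass: run the model on x to get y_pred.
    y_pred = [w * x + b for x, _ in batch]

    # 3. Loss on the batch (mean squared error).
    loss = sum((p - y) ** 2 for p, (_, y) in zip(y_pred, batch)) / batch_size

    # 4. Backward pass: gradient of the loss with regard to w and b,
    #    derived by hand for this particular model and loss.
    grad_w = sum(2 * (p - y) * x for p, (x, y) in zip(y_pred, batch)) / batch_size
    grad_b = sum(2 * (p - y) for p, (_, y) in zip(y_pred, batch)) / batch_size

    # 5. Move the parameters a little against the gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mean_squared_error(data, w, b)
print(f"w={w:.4f}, b={b:.4f}, loss {initial_loss:.4f} -> {final_loss:.6f}")
```

In a real network step 4 is not derived by hand; it is handled by automatic differentiation, which is where the chain rule comes in.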

Consider a composed function fg, where x1 = g(x) and y = f(x1). Then the chain rule states that grad(y, x) == grad(y, x1) * grad(x1, x). This enables you to compute the derivative of fg as long as you know the derivatives of f and g. The chain rule is named as it is because when you add more intermediate functions, it starts looking like a chain.
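A quick numerical check of the chain rule, as a sketch. The concrete functions (squaring, sine, exponential) and the helper names `compose`, `chain_grad`, and `numerical_grad` are hypothetical choices for illustration. Each stage multiplies in one more local derivative, which is the "chain" the name refers to.

```python
import math

# Each stage pairs a function with its hand-written derivative.
stages = [
    (lambda x: x ** 2, lambda x: 2 * x),  # g:  x  -> x1
    (math.sin, math.cos),                 # f:  x1 -> y
    (math.exp, math.exp),                 # h:  one more link in the chain
]


def compose(stages, x):
    """Apply every stage in order, feeding each output to the next stage."""
    for fn, _ in stages:
        x = fn(x)
    return x


def chain_grad(stages, x):
    """Derivative of the composition: the product of the local derivatives,
    each evaluated at the input that its stage actually receives."""
    grad = 1.0
    for fn, dfn in stages:
        grad *= dfn(x)
        x = fn(x)
    return grad


def numerical_grad(fn, x, eps=1e-6):
    """Central finite difference, used only as an independent check."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)


x = 0.7

# Two functions: grad(y, x) == grad(y, x1) * grad(x1, x)
two_stage = chain_grad(stages[:2], x)
two_stage_check = numerical_grad(lambda v: compose(stages[:2], v), x)

# Three functions: the product simply gains one more factor.
three_stage = chain_grad(stages, x)
three_stage_check = numerical_grad(lambda v: compose(stages, v), x)

print(two_stage, two_stage_check)
print(three_stage, three_stage_check)
```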

Computation graphs have been an extremely successful abstraction in computer science because they enable us to treat computation as data: a computable expression is encoded as a machine-readable data structure that can be used as the input or output of another program.
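To make "computation as data" concrete, here is a minimal sketch in which an expression is stored as nested tuples, and one program (`differentiate`) takes such a graph as input and returns another graph as output. The tuple encoding and both function names are assumptions made for this illustration. Real frameworks use much richer graph structures.

```python
# A node is one of:
#   ("const", value) | ("var", name) | ("add", left, right) | ("mul", left, right)


def evaluate(node, env):
    """Compute the value of an expression graph given variable bindings."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return env[node[1]]
    if kind == "add":
        return evaluate(node[1], env) + evaluate(node[2], env)
    if kind == "mul":
        return evaluate(node[1], env) * evaluate(node[2], env)
    raise ValueError(f"unknown node kind: {kind!r}")


def differentiate(node, wrt):
    """Return a NEW expression graph for the derivative with respect to `wrt`."""
    kind = node[0]
    if kind == "const":
        return ("const", 0.0)
    if kind == "var":
        return ("const", 1.0 if node[1] == wrt else 0.0)
    if kind == "add":  # sum rule
        return ("add", differentiate(node[1], wrt), differentiate(node[2], wrt))
    if kind == "mul":  # product rule
        return (
            "add",
            ("mul", differentiate(node[1], wrt), node[2]),
            ("mul", node[1], differentiate(node[2], wrt)),
        )
    raise ValueError(f"unknown node kind: {kind!r}")


# The expression w * x + b, encoded as data rather than as Python code.
expr = ("add", ("mul", ("var", "w"), ("var", "x")), ("var", "b"))
env = {"w": 2.0, "x": 3.0, "b": 1.0}

value = evaluate(expr, env)            # 2.0 * 3.0 + 1.0
grad_expr = differentiate(expr, "w")   # a graph, not a number
grad_value = evaluate(grad_expr, env)  # d(w*x + b)/dw evaluated at env

print(value, grad_value)
```

Because the derivative is itself a graph, it could be simplified, differentiated again, or handed to yet another program. That is the property the passage describes.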

Tags: #computer-science #ai