Through algorithms, machines can learn rules from large amounts of data and use them to make decisions on new samples.

The four elements of machine learning:

- Data
- Model
- Learning Rule
- Optimization Algorithm

## Learning Rules

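A good model $f(x; \theta^*)$ should agree with the true mapping function for every input-output pair $(x, y)$, up to a small tolerance $\epsilon$:

$$
|f(x; \theta^*) - y| < \epsilon, \quad \forall (x, y) \in \mathcal{X} \times \mathcal{Y}
$$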

## Loss Function

A loss function is a non-negative real-valued function that quantifies the difference between the model's prediction and the true label.

For example, the quadratic loss function:

$$
\mathcal{L}(y, f(x; \theta)) = \frac{1}{2} \left( y - f(x; \theta) \right)^2
$$
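As a minimal sketch, the quadratic loss can be written as a small Python function. The name `quadratic_loss` and the factor of 1/2 are assumptions made for illustration; some texts omit the factor.

```python
def quadratic_loss(y_true, y_pred):
    """Quadratic loss with the conventional 1/2 factor (an assumption here)."""
    return 0.5 * (y_true - y_pred) ** 2


# The loss is zero for a perfect prediction and grows with the squared error.
perfect = quadratic_loss(1.0, 1.0)
off_by_two = quadratic_loss(3.0, 1.0)
print(perfect, off_by_two)
```

The function is never negative, which matches the requirement that a loss be a non-negative real-valued function.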

## Empirical risk minimization

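After selecting an appropriate risk function, we look for the parameter $\theta^*$ that minimizes the empirical risk on the training set $\mathcal{D}$:

$$
\theta^* = \arg\min_{\theta} \mathcal{R}^{emp}_{\mathcal{D}}(\theta)
$$

The machine learning problem is thereby transformed into an optimization problem.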
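To make the "learning as optimization" idea concrete, here is a small sketch that assumes a linear model and the quadratic loss, for which the minimizer of the empirical risk is the least-squares solution. The synthetic data, `theta_true`, and the helper `empirical_risk` are all illustrative assumptions, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set: y = X @ theta_true + small Gaussian noise (assumed setup).
X = rng.normal(size=(100, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + rng.normal(0.0, 0.1, size=100)


def empirical_risk(theta):
    """Mean quadratic loss of the linear model f(x; theta) = x @ theta."""
    return np.mean(0.5 * (y - X @ theta) ** 2)


# For this model and loss, minimizing the empirical risk is a least-squares problem.
theta_star, *_ = np.linalg.lstsq(X, y, rcond=None)

print("risk at theta_star:", empirical_risk(theta_star))
print("risk at theta_true:", empirical_risk(theta_true))
```

Because `theta_star` minimizes the training risk, its empirical risk cannot exceed that of any other parameter vector, including the one that generated the data.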

### Expected Risk

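Expected risk (true risk):

$$
\mathcal{R}(\theta) = \mathbb{E}_{(x, y) \sim p_r(x, y)} \left[ \mathcal{L}(y, f(x; \theta)) \right]
$$

$p_r(x, y)$: the real data distribution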

### Empirical Risk

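The expected risk is unknown because the real data distribution is unknown, so it is approximated by the empirical risk, the average loss over the $N$ samples of the training set $\mathcal{D}$:

$$
\mathcal{R}^{emp}_{\mathcal{D}}(\theta) = \frac{1}{N} \sum_{n=1}^{N} \mathcal{L}\left( y^{(n)}, f(x^{(n)}; \theta) \right)
$$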
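The following sketch illustrates the approximation on a toy problem where the expected risk happens to be known. It assumes data generated as y = 2x + Gaussian noise with standard deviation `SIGMA`, a fixed model f(x) = 2x, and the quadratic loss with a 1/2 factor, so the expected risk is `0.5 * SIGMA**2`. All names and the data-generating process are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 0.5  # noise standard deviation of the assumed data distribution


def sample(n):
    """Draw n pairs (x, y) from the assumed distribution y = 2x + noise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 2.0 * x + rng.normal(0.0, SIGMA, size=n)
    return x, y


def f(x):
    """Fixed model; it matches the noise-free part of the data."""
    return 2.0 * x


def empirical_risk(x, y):
    """Average quadratic loss over the given samples."""
    return np.mean(0.5 * (y - f(x)) ** 2)


# Only the noise contributes to the loss, so E[0.5 * noise^2] = 0.5 * SIGMA^2.
expected_risk = 0.5 * SIGMA ** 2

for n in (10, 1000, 100000):
    print(n, empirical_risk(*sample(n)))

risk_large = empirical_risk(*sample(100000))
```

With few samples the empirical risk fluctuates noticeably; with many samples it settles close to the expected risk.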

## Stochastic Gradient Descent

SGD: samples a single training example in each iteration and updates the parameters using the gradient of the loss on that example alone.
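A minimal sketch of this update rule, assuming a linear model and the quadratic loss. The function name `sgd_linear`, the learning rate, the step count, and the noise-free synthetic data are illustrative choices rather than anything prescribed by the notes.

```python
import numpy as np


def sgd_linear(X, y, lr=0.05, n_steps=5000, seed=0):
    """SGD for f(x; w) = x @ w with quadratic loss 0.5 * (f(x) - y)^2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_steps):
        i = rng.integers(n)       # sample one training example
        err = X[i] @ w - y[i]     # prediction error on that example
        w -= lr * err * X[i]      # gradient of the loss w.r.t. w is err * x_i
    return w


# Noise-free synthetic data (assumed), so the true weights are recoverable.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
w_true = np.array([2.0, -1.0])
y = X @ w_true

w_hat = sgd_linear(X, y)
print(w_hat)
```

Each step costs the same regardless of the dataset size, which is the main appeal of SGD over computing the gradient on the full training set.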

## Generalization Error

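Generalization error is the difference between the expected risk and the empirical risk on the training set $\mathcal{D}$:

$$
\mathcal{G}_{\mathcal{D}}(\theta) = \mathcal{R}(\theta) - \mathcal{R}^{emp}_{\mathcal{D}}(\theta)
$$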

## Regularization

The principle of empirical risk minimization can easily lead to a low error rate on the training set but a high error rate on unseen data, that is, overfitting. Regularization counteracts this by constraining the model's complexity.
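The sketch below illustrates the effect on a toy problem: a degree-9 polynomial fitted to 10 noisy points drives the training risk to nearly zero while the risk on fresh data stays much higher. It then adds an L2 (ridge) penalty as one possible form of regularization. The sine target, noise level, polynomial degree, and penalty strength `lam` are all assumptions chosen for illustration, and the notes do not specify which regularizer is meant.

```python
import numpy as np

rng = np.random.default_rng(0)
DEGREE = 9


def target(x):
    """Assumed ground-truth function for the toy data."""
    return np.sin(np.pi * x)


def features(x):
    """Polynomial features 1, x, ..., x^DEGREE."""
    return np.vander(x, DEGREE + 1, increasing=True)


def risk(w, x, y):
    """Average quadratic loss of the polynomial model with weights w."""
    return np.mean(0.5 * (y - features(x) @ w) ** 2)


# Small training set and a large held-out set standing in for unseen data.
x_train = np.linspace(-1.0, 1.0, 10)
y_train = target(x_train) + rng.normal(0.0, 0.2, size=x_train.size)
x_test = rng.uniform(-1.0, 1.0, size=1000)
y_test = target(x_test) + rng.normal(0.0, 0.2, size=x_test.size)

Phi = features(x_train)

# Plain empirical risk minimization: the polynomial interpolates the noisy points.
w_erm, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)

# Ridge: minimize 0.5 * ||Phi w - y||^2 + 0.5 * lam * ||w||^2.
lam = 1e-3
w_reg = np.linalg.solve(Phi.T @ Phi + lam * np.eye(DEGREE + 1), Phi.T @ y_train)

train_erm, test_erm = risk(w_erm, x_train, y_train), risk(w_erm, x_test, y_test)
train_reg, test_reg = risk(w_reg, x_train, y_train), risk(w_reg, x_test, y_test)
print("ERM   train/test:", train_erm, test_erm)
print("ridge train/test:", train_reg, test_reg)
```

The penalty trades a slightly higher training risk for smaller weights. Whether the held-out risk actually improves depends on `lam`, which in practice is usually tuned on a validation set.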