1. Model
Linear regression
In supervised learning, we have a data set (called a training set) and use it to learn how to make predictions, e.g. to predict the price of a house.
Notation for supervised learning:
- m -> number of training examples
- x -> input variable (feature)
- y -> output (target) variable
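For a single feature, the hypothesis in the course's notation is:

```latex
h_\theta(x) = \theta_0 + \theta_1 x
```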
2. Cost Function
What we want to do is come up with values for the parameters θ0 and θ1 so that h(x) is close to y on the training examples.
This is a minimization problem: minimize the square difference between the prediction and the target,
(h(x) - y)^2, summed over the training set from i = 1 to m.
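Written out, this is the squared-error cost function from the course (the extra 1/2 just simplifies the derivative):

```latex
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
```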
In practice, the exact shape of the cost function depends on your training set; plotting J(θ0, θ1) gives a bowl-shaped surface like this:
Using contour plots to visualize J(θ0, θ1):
3. Gradient Descent
- Start with some initial parameters (e.g. θ0 = 0, θ1 = 0).
- Keep changing them to reduce the cost function until we end up at a minimum.
Different starting points may lead to different final results (different local minima):
Learning rate α: how big a step we take downhill with gradient descent (the α in the update rule shown after this list).
- If α is too small, gradient descent can be slow.
- If α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.
- As we approach a local minimum, gradient descent will automatically take smaller steps (the derivative shrinks), so there is no need to decrease α over time.
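Gradient descent repeats the following update until convergence (for linear regression, j = 0, 1):

```latex
\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)
```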
Both temp values must be computed first and only then assigned (simultaneous update); otherwise temp1 would be computed from the already-updated θ0.
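A minimal Python sketch of one simultaneous-update step (the function and variable names are illustrative):

```python
def step(theta0, theta1, xs, ys, alpha):
    """One gradient descent step for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    # Partial derivatives of J, both computed from the OLD parameters.
    grad0 = sum((theta0 + theta1 * x) - y for x, y in zip(xs, ys)) / m
    grad1 = sum(((theta0 + theta1 * x) - y) * x for x, y in zip(xs, ys)) / m
    temp0 = theta0 - alpha * grad0
    temp1 = theta1 - alpha * grad1
    return temp0, temp1  # assign both at once: simultaneous update
```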
Simplified case:
"Batch" gradient descent: each step of gradient descent uses all the training examples.
There are other versions of gradient descent that are not batch methods; they look at small subsets of the training set at a time (e.g. stochastic or mini-batch gradient descent).
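Using the step function above, a full batch run is just a loop; the data and hyperparameters below are illustrative:

```python
xs = [1.0, 2.0, 3.0, 4.0]   # feature, e.g. house size
ys = [2.1, 2.9, 4.2, 4.8]   # target, e.g. price
theta0, theta1 = 0.0, 0.0
for _ in range(1000):
    theta0, theta1 = step(theta0, theta1, xs, ys, alpha=0.1)
print(theta0, theta1)       # approaches the least-squares fit
```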
4. Multiple Features
With multiple features, the hypothesis can be written as:
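In the course's vector notation, with the convention x_0 = 1:

```latex
h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n = \theta^{T} x
```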
5. Gradient Descent for Multiple Variables (Feature Scaling)
Idea: make sure features are on a similar scale.
Otherwise, it turns out gradient descent will have a much harder time: the contours become elongated and the iterates oscillate back and forth before reaching the minimum.
One option: divide each feature value by its maximum value.
This gets every feature into approximately the range [-1, 1].
Mean normalization:
i.e. (x - average value) / (range), where the range is max - min (the standard deviation also works).
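A minimal Python sketch of mean normalization for one feature column (the house-size values are illustrative):

```python
def mean_normalize(values):
    mu = sum(values) / len(values)        # average value
    scale = max(values) - min(values)     # range; std. deviation also works
    return [(v - mu) / scale for v in values]

sizes = [2104.0, 1416.0, 1534.0, 852.0]
print(mean_normalize(sizes))  # roughly within [-0.5, 0.5]
```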
6. Gradient Descent for Multiple Variables (Learning Rate)
- Debugging: how to make sure gradient descent is working correctly. Plot the cost function against the number of iterations; J should decrease after every iteration.
- How to choose the learning rate: for a sufficiently small α, the cost function decreases on every iteration, but if α is too small convergence is slow; see the sketch after this list.
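A small convergence check, reusing the step function and the xs, ys data from the batch gradient descent sketch above:

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1)."""
    m = len(xs)
    return sum(((theta0 + theta1 * x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

history = []
theta0, theta1 = 0.0, 0.0
for _ in range(100):
    theta0, theta1 = step(theta0, theta1, xs, ys, alpha=0.1)
    history.append(cost(theta0, theta1, xs, ys))

# J should never increase; if it does, the learning rate is too large.
assert all(b <= a + 1e-12 for a, b in zip(history, history[1:]))
```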
Besides linear regression, other fitting methods such as polynomial regression can also be used; a sketch follows.
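Polynomial regression can be treated as linear regression over engineered features; a minimal sketch (the cubic degree is illustrative):

```python
def polynomial_features(x, degree=3):
    """Map a single feature x to [x, x^2, ..., x^degree]."""
    return [x ** d for d in range(1, degree + 1)]

print(polynomial_features(2.0))  # [2.0, 4.0, 8.0]
# Note: feature scaling matters even more here, since x, x^2, x^3
# have very different ranges.
```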