## Introduction
There is a very good set of notes for this course on the 52nlp site; what follows is partly excerpted from there, with my own additions.
### Definition of machine learning
Arthur Samuel (1959): Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.
Tom Mitchell (1998): Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. For example, for a spam filter, T is classifying emails as spam or not spam, P is the fraction of emails classified correctly, and E is watching the user label emails.
### Types of machine learning
- Supervised learning: the training set is labeled. The course covers regression (predicting continuous values, e.g. linear regression) and classification (predicting discrete values, e.g. logistic regression), as well as neural networks and support vector machines (SVM).
- Unsupervised learning: the training set is unlabeled. The course covers K-means, PCA (principal component analysis), and anomaly detection; a classic example is the "cocktail party problem".
The algorithm for the cocktail party problem needs only one line of code:

```matlab
% One-line blind source separation demo from the lecture
[W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');
```
## Linear regression with one variable
### Model representation
- m = number of training examples
- x = input variable (feature)
- y = output variable (label)
The hypothesis for univariate linear regression is $h_\theta(x) = \theta_0 + \theta_1 x$, and the parameters are chosen to minimize the cost function

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
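The `computeCost` function called by the gradient-descent code below is not shown in these notes; a minimal sketch that implements the cost function above could look like this:

```matlab
function J = computeCost(X, y, theta)
% Cost J(theta) for design matrix X (m x 2, first column all ones),
% labels y (m x 1), and parameters theta (2 x 1).
m = length(y);                   % number of training examples
errors = X*theta - y;            % h_theta(x^(i)) - y^(i) for every i
J = (errors' * errors) / (2*m);  % (1/2m) * sum of squared errors
end
```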
### Gradient descent

1. Start with some initial values for $\theta_0$ and $\theta_1$ (e.g. both zero).
2. Repeatedly change $\theta_0$ and $\theta_1$ to reduce $J(\theta_0, \theta_1)$, stepping in the direction of steepest descent: $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ for $j = 0, 1$, until convergence.

Note that the two parameters must be updated simultaneously: compute both new values from the original $\theta_0$ and $\theta_1$ first, then assign them, which is why the code below uses temporary variables.
The corresponding code is:
```matlab
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);                   % number of training examples
J_history = zeros(num_iters, 1); % cost after each iteration
for iter = 1:num_iters
    % Compute both updates from the current theta before assigning,
    % so theta(1) and theta(2) are updated simultaneously.
    temp1 = theta(1) - (alpha/m)*(X*theta-y)'*X(:,1);
    temp2 = theta(2) - (alpha/m)*(X*theta-y)'*X(:,2);
    theta(1) = temp1;
    theta(2) = temp2;
    J_history(iter) = computeCost(X, y, theta);
end
end
```
Here `alpha` is the learning rate, which controls how large a step each update takes.
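As a usage sketch, the setup below shows how `gradientDescent` might be called; the data file name, learning rate of 0.01, and 1500 iterations are illustrative assumptions, not values from these notes:

```matlab
data = load('ex1data1.txt');   % hypothetical file: column 1 = x, column 2 = y
m = size(data, 1);
X = [ones(m, 1), data(:, 1)];  % prepend a column of ones for theta_0
y = data(:, 2);
theta = zeros(2, 1);           % start from theta_0 = theta_1 = 0
[theta, J_history] = gradientDescent(X, y, theta, 0.01, 1500);
```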
Note: gradient descent may converge to a local minimum rather than the global optimum. (For linear regression the cost function is convex, so every local minimum is the global one.)
Plugging the partial derivatives of $J$ into the update rule gives the gradient descent algorithm for univariate linear regression: repeat until convergence, updating simultaneously,

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$

$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$$
The difference shows up in the code: computing the sums requires writing a loop, while the vectorized form is much more concise:
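As a sketch, here is the gradient term for $\theta_1$ written both ways, using the same X, y, theta conventions as above:

```matlab
% Loop version: accumulate (1/m) * sum((h_theta(x^(i)) - y^(i)) * x^(i))
grad1 = 0;
for i = 1:m
    h = theta(1) + theta(2)*X(i, 2);   % hypothesis for example i
    grad1 = grad1 + (h - y(i))*X(i, 2);
end
grad1 = grad1 / m;

% Vectorized version: the same quantity in one line
grad1 = (X*theta - y)'*X(:, 2) / m;
```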