Code: https://github.com/LiuZhe6/AndrewNGMachineLearning
Contents
- Quiz 1: Advice for Applying Machine Learning
- Programming Exercise: Regularized Linear Regression and Bias/Variance
- Exercise 1: Regularized Linear Regression Cost Function
- Exercise 2: Regularized Linear Regression Gradient
- Exercise 3: Learning Curve
- Exercise 4: Polynomial Feature Mapping
- Exercise 5: Validation Curve
- Optional 1: Computing test set error
- Optional 2: Plotting learning curves with randomly selected examples
- Quiz 2: Machine Learning System Design
Quiz 1: Advice for Applying Machine Learning
Question 1
Answer
C
Analysis: the test error levels off close to the training error, which indicates high bias.
Question 2
Answer
CD
Analysis: the hypothesis fits the training set very well but generalizes poorly, which suggests high variance (overfitting). Possible remedies: 1. reduce the number of features; 2. increase the regularization parameter lambda.
Question 3
Answer
AB
Analysis: the hypothesis performs poorly on both the training set and the cross-validation set, which indicates high bias (underfitting). Possible remedies: 1. add polynomial features; 2. decrease the regularization parameter lambda.
Question 4
Answer
AC
Question 5
Answer
ACD
Analysis:
A: Plotting a learning curve helps us judge how well the chosen learning algorithm is working; correct.
B: High bias and high variance are both bad; this option is incorrect.
C: Under high bias, adding more training examples will not necessarily help; correct.
D: Under high variance, adding more training examples is likely to help; correct.
Programming Exercise: Regularized Linear Regression and Bias/Variance
Exercise 1: Regularized Linear Regression Cost Function
linearRegCostFunction.m
Remember: when regularizing, do not penalize theta(1) (the bias term theta_0).
J = 1 / (2 * m) * (X * theta - y)' * (X * theta - y) + lambda / (2 * m) * (theta' * theta - theta(1)^2);
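Written out as sums, this vectorized one-liner computes the course's standard regularized cost; the (theta' * theta - theta(1)^2) term is what excludes the bias from the penalty, since Octave indexes from 1:

```latex
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
          + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2
```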
Exercise 2: Regularized Linear Regression Gradient
linearRegCostFunction.m
grad = 1 / m * X' * (X * theta - y) + lambda / m * theta;
grad(1) = grad(1) - lambda / m * theta(1);  % undo the penalty on the bias term (the -= shorthand is Octave-only)
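These two lines are the vectorized regularized gradient; subtracting the penalty from the first entry makes the result match the course's piecewise definition:

```latex
\frac{\partial J}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)},
\qquad
\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j
\quad (j \ge 1)
```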
Exercise 3: Learning Curve
learningCurve.m
% We need both errors as a function of the training set size, so loop over i
for i = 1 : m
    theta = trainLinearReg(X(1:i,:), y(1:i), lambda);
    error_train(i) = linearRegCostFunction(X(1:i,:), y(1:i), theta, 0);  % lambda set to 0 when measuring error
    error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);          % always the full validation set
end
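In other words, for each training set size i the model is trained with the given lambda, but both errors are then measured without regularization: the training error on the first i examples, and the cross-validation error on the whole validation set:

```latex
J_{\mathrm{train}}(i) = \frac{1}{2i}\sum_{t=1}^{i}\left(h_{\theta^{(i)}}(x^{(t)}) - y^{(t)}\right)^2,
\qquad
J_{\mathrm{cv}}(i) = \frac{1}{2m_{\mathrm{cv}}}\sum_{t=1}^{m_{\mathrm{cv}}}\left(h_{\theta^{(i)}}(x_{\mathrm{cv}}^{(t)}) - y_{\mathrm{cv}}^{(t)}\right)^2
```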
Exercise 4: Polynomial Feature Mapping
polyFeatures.m
for i = 1 : p
X_poly(:,i) = X.^i;
end;
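Column i of X_poly holds X.^i, so each example's single feature x is mapped to a row of its powers:

```latex
x \;\mapsto\; \left[\,x,\; x^2,\; \dots,\; x^p\,\right]
```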
Exercise 5: Validation Curve
validationCurve.m
for i = 1 : length(lambda_vec)
    lambda = lambda_vec(i);
    theta = trainLinearReg(X, y, lambda);
    error_train(i) = linearRegCostFunction(X, y, theta, 0);
    error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);
end
When I first wrote the error_train and error_val lines, I passed lambda as the last argument, and the submission was marked wrong. Thinking it over: when computing these errors, only theta varies; the lambda argument should always be 0, because lambda was already accounted for when training theta above.
Optional 1: Computing test set error
ex5.m
%% Optional part 1: compute the test error using the best lambda
lambda = 3;
theta = trainLinearReg(X_poly, y, lambda);
test_error = linearRegCostFunction(X_poly_test, ytest, theta, 0);
Printing test_error gives test_error = 3.8599.
Optional 2: Plotting learning curves with randomly selected examples
Reference: a solution found on GitHub.
Create a new file learningCurveWithRandomSel.m:
function [error_train, error_val] = ...
learningCurveWithRandomSel(X, y, Xval, yval, lambda)
%LEARNINGCURVE Generates the train and cross validation set errors needed
%to plot a learning curve
% [error_train, error_val] = ...
% LEARNINGCURVE(X, y, Xval, yval, lambda) returns the train and
% cross validation set errors for a learning curve. In particular,
% it returns two vectors of the same length - error_train and
% error_val. Then, error_train(i) contains the training error for
% i examples (and similarly for error_val(i)).
%
% In this function, you will compute the train and test errors for
% dataset sizes from 1 up to m. In practice, when working with larger
% datasets, you might want to do this in larger intervals.
%
% Number of training examples
m = size(X, 1);
% You need to return these values correctly
error_train = zeros(m, 1);
error_val = zeros(m, 1);
% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return training errors in
% error_train and the cross validation errors in error_val.
% i.e., error_train(i) and
% error_val(i) should give you the errors
% obtained after training on i examples.
%
% Note: You should evaluate the training error on the first i training
% examples (i.e., X(1:i, :) and y(1:i)).
%
% For the cross-validation error, you should instead evaluate on
% the _entire_ cross validation set (Xval and yval).
%
% Note: If you are using your cost function (linearRegCostFunction)
% to compute the training and cross validation error, you should
% call the function with the lambda argument set to 0.
% Do note that you will still need to use lambda when running
% the training to obtain the theta parameters.
%
% Hint: You can loop over the examples with the following:
%
% for i = 1:m
% % Compute train/cross validation errors using training examples
% % X(1:i, :) and y(1:i), storing the result in
% % error_train(i) and error_val(i)
% ....
%
% end
%
% ---------------------- Sample Solution ----------------------
% X: m*(n+1)
% y: m*1
% Xval: k*(n+1)
% yval: k*1
k = size(Xval, 1);
for i = 1:m
    ki = min(i, k);
    sum_train_J = 0;
    sum_val_J = 0;
    for j = 1:50
        % Randomly select i training examples
        rand_indices = randperm(m);
        X_sel = X(rand_indices(1:i), :);
        y_sel = y(rand_indices(1:i), :);
        theta = trainLinearReg(X_sel, y_sel, lambda);        % train with the lambda argument
        J = linearRegCostFunction(X_sel, y_sel, theta, 0);   % lambda = 0 when measuring error
        sum_train_J = sum_train_J + J;
        % Randomly select ki cross-validation examples
        rand_indices = randperm(k);
        Xval_sel = Xval(rand_indices(1:ki), :);
        yval_sel = yval(rand_indices(1:ki), :);
        J = linearRegCostFunction(Xval_sel, yval_sel, theta, 0);  % lambda = 0 here too
        sum_val_J = sum_val_J + J;
    end
    % Average the errors over the 50 random draws
    error_train(i) = sum_train_J / 50;
    error_val(i) = sum_val_J / 50;
end
% -------------------------------------------------------------
% =========================================================================
end
ex5.m
%% Optional part 2: randomly select examples, compute the errors, and average over the runs
lambda = 0.01;
[theta] = trainLinearReg(X_poly, y, lambda);
% Plot training data and fit
figure(1);
plot(X, y, 'rx', 'MarkerSize', 10, 'LineWidth', 1.5);
plotFit(min(X), max(X), mu, sigma, theta, p);
xlabel('Change in water level (x)');
ylabel('Water flowing out of the dam (y)');
title (sprintf('Polynomial Regression Fit (lambda = %f)', lambda));
figure(2);
[error_train, error_val] = ...
learningCurveWithRandomSel(X_poly, y, X_poly_val, yval, lambda);
plot(1:m, error_train, 1:m, error_val);
title(sprintf('Polynomial Regression Learning Curve (lambda = %f)', lambda));
xlabel('Number of training examples')
ylabel('Error')
axis([0 13 0 100])
legend('Train', 'Cross Validation')
fprintf('Polynomial Regression (lambda = %f)\n\n', lambda);
fprintf('# Training Examples\tTrain Error\tCross Validation Error\n');
for i = 1:m
fprintf(' \t%d\t\t%f\t%f\n', i, error_train(i), error_val(i));
end
fprintf('Program paused. Press enter to continue.\n');
pause;
Quiz 2: Machine Learning System Design
Question 1
Answer
(85 + 10)/1000 = 0.095
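This is the usual test-set misclassification error: the misclassified examples (the two off-diagonal entries of the confusion matrix, here 85 and 10; which one is the false-positive count and which the false-negative count follows the quiz's table) divided by the total number of test examples:

```latex
\mathrm{err} = \frac{FP + FN}{N_{\mathrm{test}}} = \frac{85 + 10}{1000} = 0.095
```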
Question 2
Answer
AC
Question 3
Answer
A
Analysis:
A: Raising the threshold to 0.9 increases precision; correct.
B: At a threshold of 0.5 the F1 score works out to 0.5; at 0.9 it drops to 0.18, so F1 goes down rather than up; this option is incorrect.
C: Raising the threshold to 0.9 actually increases precision, so this option is incorrect.
D: Recall does not increase; incorrect.
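The quantities used in this analysis (and in Question 4) are the standard definitions over the confusion matrix counts TP, FP, FN:

```latex
\mathrm{precision} = \frac{TP}{TP + FP},\qquad
\mathrm{recall} = \frac{TP}{TP + FN},\qquad
F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}
```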
Question 4
Answer
ACD
Analysis:
A: If we always predict y = 0, then by the definition of recall the numerator TP is 0, so recall = 0; correct.
B: If we always predict y = 1, then in recall = TP/(TP + FN) we have FN = 0, so recall = 1; this option is incorrect.
C: If we always predict y = 1, recall is 1 (100%) as in B; by precision = TP/(TP + FP), every example is predicted positive and only 1% are actually spam, so precision = 1%; correct.
D: By the problem statement, 99% of the examples are non-spam, so this is correct.
Question 5
Answer
BC