重學 Statistics, Cha15 Multiple Regression

怎麼算兩列數之間的 correlatoin coefficient?

15.1 Multiple Regression Model

這裏寫圖片描述

這裏寫圖片描述

這裏寫圖片描述

這裏寫圖片描述

15.3 Coefficient of Determination

這裏寫圖片描述

這裏寫圖片描述

Why Adjusted?
Avoid overestimating the impact of adding an independent variable on the amount of variability explained by the estimated regression equation.

15.4 Model Assumptions

這裏寫圖片描述

這裏寫圖片描述

15.5 Testing for Significance

F test for overall significance; T test for individual significance

F Test

H0: β1 = β2 = … = βp = 0
Ha: One or more of the parameters is not equal to zero
這裏寫圖片描述
n= 觀測數目
p =自變量數目

這裏寫圖片描述

t Test

這裏寫圖片描述

Multicollinearity 多重共線性

當多元迴歸方程總體顯著性的 F 檢驗表明有一個顯著關係時,我們可能得出單個參數沒有一個是顯著地不同於0的結論。只有當自變量之間的相關性非常小,纔有可能迴避這個問題。

F test is significant. but two t test is not significant. With x2 already in the model, x1 does not make a significant contribution to determining the value of y. 怎麼發現?當兩個自變量的 correlation coefficient >0.7

15.7 Qualitative Independent Variables

如果 qualitative variables 是兩個的話,那麼可以變成 0 1

The important point to remember is that when a qualitative variable has k levels, k-1 dummy variables are required in the multiple regression analysis.

15.8 Residual Analysis

Detecting Outliers

Minitab classifies an observation as an outlier if the value of its standardized residual is less than -2 or greater than +2.

Influential Observations

Minitab computes the leverage values and uses the rule of thumb hi > 3( p + 1)/n to identify influential observations.

Cook’s Distance

這裏寫圖片描述

如果 Di >1,那麼表明第 i 次觀測值是一個有影響力的觀測值,並對這個觀測值做進一步的考察。

15.9 Logistic Regression

這裏寫圖片描述

這裏寫圖片描述
The Probability of y=1 given x1,x2,…,xp

這裏寫圖片描述

Testing for Significance

H0: β1 = β2 = 0
Ha: One or both of the parameters is not equal to zero

G follows a chi-square distribution with degrees of freedom equal to the number of independent variables in the model
如果是一個個的 Variable,就用 z test

Managerial Use

問題是:發 coupon,想預測一下哪些消費者在收到 coupon 會用?
通過 logistics regression,得到下面的這張表:
這裏寫圖片描述
結果:
Customers who have a Simmons credit card: Send the catalog to every customer who spent 2000 or more last year
Customers who do not have a Simmons credit card: Send the catalog to every customer who spent 6000 or more last year

Interpreting Logistic Regression Equation

The odds in favor of an event occurring is defined as the probability the event will occur divided by the probability the event will not occur.

Odds ratio: odds of a one-unit increase in only one of the independent variables.
Odds Ratio = odds1 / odds0

這裏寫圖片描述

Logit Transformation

這裏寫圖片描述

這裏寫圖片描述

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章