目錄
權重分配:Exponential Almon polynomial 約束一致係數
示例:
根據上述函數及數據要求,利用R代碼實現混頻迴歸示例。具體代碼如下:
R代碼實現
加載包
library(midasr)
生成符合條件的隨機數
set.seed(1001) # 設置隨機數種子
n <- 250#低頻數據樣本容量
trend <- c(1:n)#線性趨勢
x <- rnorm(4 * n)#高頻解釋變量,eg:季度數據
z <- rnorm(12 * n)#高頻解釋變量,eg:月度數據
權重分配:Exponential Almon polynomial 約束一致係數
常見加權函數形式:
#權重分配:指數阿爾蒙滯後多項式nealmon
fn_x <- nealmon(p = c(1, -0.5), d = 8)
fn_z <- nealmon(p = c(2, 0.5, -0.1), d = 17)
低頻序列模擬 (e.g. 年度)
y <- 2 + 0.1 * trend + mls(x, 0:7, 4) %*% fn_x + mls(z, 0:16, 12) %*% fn_z + rnorm(n)
繪製係數散點圖
plot(fn_z, col = "red")
points(fn_x,col = "blue")
MIDAS 迴歸示例 月度、季度數據轉化爲同頻
基於最小二乘的線性模型
eq_u1 <- lm(y ~ trend + mls(x, k = 0:7, m = 4) + mls(z, k = 0:16, m = 12))
summary(eq_u1)
eq_u1結果:
Call:
lm(formula = y ~ trend + mls(x, k = 0:7, m = 4) + mls(z, k = 0:16,
m = 12))
Residuals:
Min 1Q Median 3Q Max
-2.2651 -0.6489 0.1073 0.6780 2.7707
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.9694327 0.1261210 15.615 < 2e-16 ***
trend 0.1000072 0.0008769 114.047 < 2e-16 ***
mls(x, k = 0:7, m = 4)X.0/m 0.5268124 0.0643322 8.189 2.07e-14 ***
mls(x, k = 0:7, m = 4)X.1/m 0.3782006 0.0641497 5.896 1.38e-08 ***
mls(x, k = 0:7, m = 4)X.2/m 0.1879689 0.0680465 2.762 0.006219 **
mls(x, k = 0:7, m = 4)X.3/m -0.0052409 0.0658730 -0.080 0.936658
mls(x, k = 0:7, m = 4)X.4/m 0.1504419 0.0627623 2.397 0.017358 *
mls(x, k = 0:7, m = 4)X.5/m 0.0104345 0.0655386 0.159 0.873647
mls(x, k = 0:7, m = 4)X.6/m 0.0698753 0.0692803 1.009 0.314270
mls(x, k = 0:7, m = 4)X.7/m 0.1463317 0.0650285 2.250 0.025412 *
mls(z, k = 0:16, m = 12)X.0/m 0.3671055 0.0618960 5.931 1.14e-08 ***
mls(z, k = 0:16, m = 12)X.1/m 0.3502401 0.0599615 5.841 1.83e-08 ***
mls(z, k = 0:16, m = 12)X.2/m 0.4514656 0.0659546 6.845 7.37e-11 ***
mls(z, k = 0:16, m = 12)X.3/m 0.3733747 0.0577702 6.463 6.42e-10 ***
mls(z, k = 0:16, m = 12)X.4/m 0.3609667 0.0700954 5.150 5.75e-07 ***
mls(z, k = 0:16, m = 12)X.5/m 0.2155748 0.0632036 3.411 0.000769 ***
mls(z, k = 0:16, m = 12)X.6/m 0.0648163 0.0630194 1.029 0.304828
mls(z, k = 0:16, m = 12)X.7/m 0.0665581 0.0673284 0.989 0.323955
mls(z, k = 0:16, m = 12)X.8/m -0.0014853 0.0624569 -0.024 0.981048
mls(z, k = 0:16, m = 12)X.9/m 0.0466486 0.0675116 0.691 0.490305
mls(z, k = 0:16, m = 12)X.10/m 0.0384882 0.0664136 0.580 0.562824
mls(z, k = 0:16, m = 12)X.11/m -0.0077722 0.0591409 -0.131 0.895564
mls(z, k = 0:16, m = 12)X.12/m -0.0283221 0.0620632 -0.456 0.648589
mls(z, k = 0:16, m = 12)X.13/m -0.0375062 0.0608348 -0.617 0.538179
mls(z, k = 0:16, m = 12)X.14/m 0.0297271 0.0652273 0.456 0.649018
mls(z, k = 0:16, m = 12)X.15/m 0.0184075 0.0588059 0.313 0.754558
mls(z, k = 0:16, m = 12)X.16/m -0.0546460 0.0677214 -0.807 0.420574
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9383 on 222 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.9855, Adjusted R-squared: 0.9838
F-statistic: 579.3 on 26 and 222 DF, p-value: < 2.2e-16
基於無約束的混頻迴歸
eq_u2 <- midas_u(y ~ trend + mls(x, 0:7, 4) + mls(z, 0:16, 12))
summary(eq_u2)
eq_u2結果:
Call:
lm(formula = y ~ trend + mls(x, 0:7, 4) + mls(z, 0:16, 12), data = ee)
Residuals:
Min 1Q Median 3Q Max
-2.2651 -0.6489 0.1073 0.6780 2.7707
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.9694327 0.1261210 15.615 < 2e-16 ***
trend 0.1000072 0.0008769 114.047 < 2e-16 ***
mls(x, 0:7, 4)X.0/m 0.5268124 0.0643322 8.189 2.07e-14 ***
mls(x, 0:7, 4)X.1/m 0.3782006 0.0641497 5.896 1.38e-08 ***
mls(x, 0:7, 4)X.2/m 0.1879689 0.0680465 2.762 0.006219 **
mls(x, 0:7, 4)X.3/m -0.0052409 0.0658730 -0.080 0.936658
mls(x, 0:7, 4)X.4/m 0.1504419 0.0627623 2.397 0.017358 *
mls(x, 0:7, 4)X.5/m 0.0104345 0.0655386 0.159 0.873647
mls(x, 0:7, 4)X.6/m 0.0698753 0.0692803 1.009 0.314270
mls(x, 0:7, 4)X.7/m 0.1463317 0.0650285 2.250 0.025412 *
mls(z, 0:16, 12)X.0/m 0.3671055 0.0618960 5.931 1.14e-08 ***
mls(z, 0:16, 12)X.1/m 0.3502401 0.0599615 5.841 1.83e-08 ***
mls(z, 0:16, 12)X.2/m 0.4514656 0.0659546 6.845 7.37e-11 ***
mls(z, 0:16, 12)X.3/m 0.3733747 0.0577702 6.463 6.42e-10 ***
mls(z, 0:16, 12)X.4/m 0.3609667 0.0700954 5.150 5.75e-07 ***
mls(z, 0:16, 12)X.5/m 0.2155748 0.0632036 3.411 0.000769 ***
mls(z, 0:16, 12)X.6/m 0.0648163 0.0630194 1.029 0.304828
mls(z, 0:16, 12)X.7/m 0.0665581 0.0673284 0.989 0.323955
mls(z, 0:16, 12)X.8/m -0.0014853 0.0624569 -0.024 0.981048
mls(z, 0:16, 12)X.9/m 0.0466486 0.0675116 0.691 0.490305
mls(z, 0:16, 12)X.10/m 0.0384882 0.0664136 0.580 0.562824
mls(z, 0:16, 12)X.11/m -0.0077722 0.0591409 -0.131 0.895564
mls(z, 0:16, 12)X.12/m -0.0283221 0.0620632 -0.456 0.648589
mls(z, 0:16, 12)X.13/m -0.0375062 0.0608348 -0.617 0.538179
mls(z, 0:16, 12)X.14/m 0.0297271 0.0652273 0.456 0.649018
mls(z, 0:16, 12)X.15/m 0.0184075 0.0588059 0.313 0.754558
mls(z, 0:16, 12)X.16/m -0.0546460 0.0677214 -0.807 0.420574
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9383 on 222 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.9855, Adjusted R-squared: 0.9838
F-statistic: 579.3 on 26 and 222 DF, p-value: < 2.2e-16
同頻時基於最小二乘和無約束混頻迴歸的eq_u1與eq_u2結果一致!
基於midas_r的非線性估計
eq_r <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:16, 12, nealmon), start = list(x = c(1,-0.5), z = c(2, 0.5, -0.1)))
summary(eq_r)
收斂性檢驗
通過計算梯度和黑森矩陣檢驗是否滿足收斂的充分必要條件,計算梯度的歐氏範數和hessian的特徵值。然後判斷梯度範數是否接近零,特徵值是否爲正
deriv_tests(eq_r, tol = 1e-06)
結果:
$`first`
[1] FALSE
$second
[1] TRUE
$gradient
[1] 0.005542527 0.173490055 -0.005754945 0.028655981 -0.030150807 0.019224017 0.053829528
$eigenval
[1] 1.047988e+07 5.891393e+04 3.664513e+02 1.221231e+02 8.116275e+01 5.148844e+01 4.598507e+01
係數輸出
coef(eq_r)#返回t檢驗顯著的係數
結果:
(Intercept) trend x1 x2 z1 z2 z3
1.98820762 0.09988275 1.35336157 -0.50747744 2.26312231 0.40927986 -0.07293300
coef(eq_r, midas = TRUE)#返回所有係數
結果:
(Intercept) trend x1 x2 x3 x4 x5 x6 x7
1.988208e+00 9.988275e-02 5.480768e-01 3.299490e-01 1.986333e-01 1.195797e-01 7.198845e-02 4.333793e-02 2.608997e-02
x8 z1 z2 z3 z4 z5 z6 z7 z8
1.570648e-02 3.346806e-01 4.049071e-01 4.233810e-01 3.826120e-01 2.988388e-01 2.017282e-01 1.176921e-01 5.934435e-02
z9 z10 z11 z12 z13 z14 z15 z16 z17
2.586203e-02 9.740853e-03 3.170901e-03 8.921121e-04 2.169239e-04 4.558760e-05 8.280130e-06 1.299807e-06 1.763484e-07
eq_r的基礎上使用函數amweights來形成幾個標準的週期函數約束
amweights(p = c(1, -0.5), d = 8, m = 4, weight = nealmon, type = "C")
nealmon(p = c(1, -0.5), d = 4)
eq_r1 <- midas_r(y ~ trend + mls(x, 0:7, 4, amweights, nealmon, "C") + mls(z, 0:16, 12, nealmon), start = list(x = c(1, -0.5), z = c(2, 0.5, -0.1)))
summary(eq_r1)
其它加權形式
fn <- gompertzp#概率密度函數的權重設定
eq_r2 <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:16, 12, fn), start = list(x = c(1,-0.5), z = c(1, 0.5, 0.1)))
summary(eq_r2)
eq_r3 <- midas_r(y ~ trend + mls(x, 0:7, 4) + mls(z, 0:16, 12, nealmon), start = list(z = c(1,-0.5)))
summary(eq_r3)
eq_r4 <- midas_r(y ~ trend + mls(y, 1:2, 1) + mls(x, 0:7, 4, nealmon), start = list(x = c(1,-0.5)))
summary(eq_r4)
eq_r5 <- midas_r(y ~ trend + mls(y, 1:2, 1, "*") + mls(x, 0:7, 4, nealmon), start = list(x = c(1,-0.5)))
summary(eq_r5)
eq_r6 <- midas_r(y ~ trend + mls(y, 1:4, 1, nealmon) + mls(x, 0:7, 4, nealmon), start = list(y = c(1, -0.5), x = c(1, -0.5)))
summary(eq_r6)
eq_r7 <- midas_r(y ~ trend + mls(x, 0:7, 4, amweights, nealmon, "B"), start = list(x = c(1, 1, -0.5))) summary(eq_r7)
eq_r8 <- midas_r(y ~ trend + mls(x, 0:7, 4, amweights, nealmon, "C"), start = list(x = c(1, -0.5))) summary(eq_r8)
eq_r9 <- midas_r(y ~ trend + mls(x, 0:7, 4, amweights, nealmon, "A"), start = list(x = c(1, 1, 1, -0.5)))
summary(eq_r9)
fn <- function(p, d) {
p[1] * c(1:d)^p[2]
}
eq_r10 <- midas_r(y ~ trend + mls(x, 0:101, 4, fn), start = list(x = rep(0, 2)))
summary(eq_r10)
約束的充分性檢驗
只要誤差是獨立同分布的,就可以使用hAh_test檢驗,而hAhr_test是HAC-Roust穩健版本檢驗。只要在殘差中沒有觀察到顯著的hac,在小樣本中使用hAh_test檢驗更精確。
eq_r12 <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:16, 12, nealmon), start = list(x = c(1, -0.5), z = c(2, 0.5, -0.1)))
summary(eq_r12)
#約束的充分性檢驗
hAh_test(eq_r12)
hAhr_test(eq_r12)
結果:
hAh restriction test
data:
hAh = 16.552, df = 20, p-value = 0.6818
hAh restriction test (robust version)
data:
hAhr = 14.854, df = 20, p-value = 0.7847
對比:將變量z的函數約束的參數數量由17減少爲2個
eq_r13 <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:12, 12, nealmon), start = list(x = c(1, -0.5), z = c(2, -0.1)))
hAh_test(eq_r13)
hAhr_test(eq_r13)
summary(eq_r13)
結果:
hAh restriction test
data:
hAh = 36.892, df = 17, p-value = 0.00348
hAh restriction test (robust version)
data:
hAhr = 32.879, df = 17, p-value = 0.01168
拒絕原假設,即約束不充分
最優模型選取
step1:函數expand_weights_lags定義每個模型的潛在模型集
set_x <- expand_weights_lags(weights = c("nealmon", "almonp"), from = 0, to = c(5, 10), m = 1, start = list(nealmon = rep(1, 2), almonp = rep(1, 3)))
set_z <- expand_weights_lags(weights = c("nealmon", "nbeta"), from = 1, to = c(2, 3), m = 1, start = list(nealmon = rep(0,2), nbeta = rep(0.5, 3)))
step2:所有可能模型的估計
函數midas_r_ic_table返回所有模型的彙總表,以及常用信息準則的相應值和參數約束充分性檢驗(默認是hAh_test檢驗)的經驗大小。給出了導數檢驗的結果和優化函數的收斂狀態。
eqs.ic <- midas_r_ic_table(y ~ trend + mls(x, 0, m = 4) + fmls(z, 0, m = 12), table = list(z = set_z,x = set_x), start = c(`(Intercept)` = 0, trend = 0))
mod <- modsel(eqs.ic, IC = "AIC", type = "restricted")
step3:查看某個候選模型可微調其收斂性,並更新結果:
eqs_ic$candlist[[5]] <- update(eqs_ic$candlist[[5]], Ofunction = "nls")
eqs_ic <- update(eqs_ic)
手動選取模型
step1:訓練集(樣本內預測)、測試集(樣本外預測)分割
datasplit <- split_data(list(y = y, x = x, z = z, trend = trend), insample = 1:200, outsample = 201:250)
step2:分別擬合待選模型
mod1 <- midas_r(y ~ trend + mls(x, 4:14, 4, nealmon) + mls(z, 12:22, 12, nealmon), data = datasplit$indata, start = list(x = c(10, 1, -0.1), z = c(2, -0.1)))
mod2 <- midas_r(y ~ trend + mls(x, 4:20, 4, nealmon) + mls(z, 12:25, 12, nealmon), data = datasplit$indata, start = list(x = c(10, 1, -0.1), z = c(2, -0.1)))
step3:計算
avgf <- average_forecast(list(mod1, mod2), data = list(y = y, x = x, z = z, trend = trend), insample = 1:200, outsample = 201:250, type = "fixed", measures = c("MSE", "MAPE", "MASE"), fweights = c("EW", "BICW", "MSFE", "DMSFE"))
樣本內、樣本外預測精度輸出:
avgf$accuracy
結果對比:
$`individual`
Model MSE.out.of.sample MAPE.out.of.sample
1 y ~ trend + mls(x, 4:14, 4, nealmon) + mls(z, 12:22, 12, nealmon) 1.848474 4.474242
2 y ~ trend + mls(x, 4:20, 4, nealmon) + mls(z, 12:25, 12, nealmon) 1.763432 4.391594
MASE.out.of.sample MSE.in.sample MAPE.in.sample MASE.in.sample
1 0.8155477 1.768720 14.05880 0.6992288
2 0.7976764 1.726365 13.56287 0.6897242
$average
Scheme MSE MAPE MASE
1 EW 1.803640 4.432918 0.8066121
2 BICW 1.763433 4.391595 0.7976766
3 MSFE 1.802640 4.431945 0.8064017
4 DMSFE 1.801787 4.431112 0.8062216
mod1,mod2預測結果輸出:
avgf$forecast
結果對比:
[,1] [,2]
[1,] 22.24057 22.38300
[2,] 22.12774 22.13778
[3,] 22.22678 22.21942
[4,] 22.44535 22.61872
[5,] 22.36448 22.46157
[6,] 22.49847 22.66663
[7,] 22.70192 22.70818
[8,] 22.49411 22.53326
[9,] 22.47978 22.54235
[10,] 22.43439 22.46522
[11,] 22.61563 22.57788
[12,] 22.87686 22.88271
[13,] 22.59329 22.64370
[14,] 23.39675 23.47606
[15,] 23.25864 23.27932
[16,] 22.97588 23.09488
[17,] 23.13196 23.21956
[18,] 24.05489 24.04334
[19,] 24.11690 24.16742
[20,] 23.79328 23.91429
[21,] 24.58047 24.58925
[22,] 24.33794 24.45825
[23,] 23.96658 24.03107
[24,] 24.00510 24.12939
[25,] 23.96473 23.99645
[26,] 24.02198 24.04416
[27,] 25.01521 24.91966
[28,] 24.47063 24.59918
[29,] 24.58748 24.51964
[30,] 24.69662 24.72768
[31,] 25.19092 25.22659
[32,] 25.57452 25.72345
[33,] 25.06312 25.17678
[34,] 24.99302 25.09359
[35,] 25.13005 25.16293
[36,] 25.85870 25.85633
[37,] 25.43593 25.46734
[38,] 25.80690 25.88224
[39,] 26.80900 26.82394
[40,] 26.44056 26.63095
[41,] 26.04786 26.17996
[42,] 26.09545 26.29647
[43,] 25.99309 25.98112
[44,] 25.89755 26.03643
[45,] 25.99814 26.18442
[46,] 26.25727 26.38797
[47,] 26.79563 26.92593
[48,] 26.57034 26.68217
[49,] 26.53808 26.51001
[50,] 26.72772 26.76183
組合預測效果輸出:
avgf$avgforecast
結果對比:
其中EW:基於特徵值加權組合;BICW:BIC信息準則加權組合;MSFE: 預測均方誤差加權組合;DMSFE:discounted MSFE
EW BICW MSFE DMSFE
[1,] 22.31178 22.38299 22.31346 22.31489
[2,] 22.13276 22.13778 22.13288 22.13298
[3,] 22.22310 22.21942 22.22301 22.22294
[4,] 22.53204 22.61872 22.53408 22.53582
[5,] 22.41303 22.46157 22.41417 22.41515
[6,] 22.58255 22.66663 22.58453 22.58622
[7,] 22.70505 22.70818 22.70512 22.70519
[8,] 22.51369 22.53326 22.51415 22.51454
[9,] 22.51107 22.54235 22.51180 22.51244
[10,] 22.44980 22.46522 22.45017 22.45048
[11,] 22.59675 22.57788 22.59631 22.59593
[12,] 22.87979 22.88271 22.87986 22.87991
[13,] 22.61849 22.64370 22.61909 22.61960
[14,] 23.43641 23.47606 23.43734 23.43814
[15,] 23.26898 23.27932 23.26922 23.26943
[16,] 23.03538 23.09488 23.03678 23.03798
[17,] 23.17576 23.21956 23.17679 23.17767
[18,] 24.04912 24.04334 24.04898 24.04887
[19,] 24.14216 24.16742 24.14276 24.14326
[20,] 23.85379 23.91429 23.85521 23.85643
[21,] 24.58486 24.58925 24.58497 24.58505
[22,] 24.39809 24.45825 24.39951 24.40072
[23,] 23.99882 24.03107 23.99958 24.00023
[24,] 24.06724 24.12938 24.06870 24.06996
[25,] 23.98059 23.99645 23.98096 23.98128
[26,] 24.03307 24.04416 24.03334 24.03356
[27,] 24.96744 24.91967 24.96631 24.96535
[28,] 24.53491 24.59918 24.53642 24.53771
[29,] 24.55356 24.51964 24.55277 24.55208
[30,] 24.71215 24.72768 24.71252 24.71283
[31,] 25.20876 25.22659 25.20918 25.20954
[32,] 25.64898 25.72345 25.65074 25.65224
[33,] 25.11995 25.17678 25.12129 25.12244
[34,] 25.04331 25.09359 25.04449 25.04550
[35,] 25.14649 25.16293 25.14688 25.14721
[36,] 25.85751 25.85633 25.85749 25.85746
[37,] 25.45163 25.46734 25.45200 25.45232
[38,] 25.84457 25.88224 25.84546 25.84622
[39,] 26.81647 26.82394 26.81665 26.81680
[40,] 26.53575 26.63095 26.53799 26.53991
[41,] 26.11391 26.17996 26.11546 26.11679
[42,] 26.19596 26.29647 26.19833 26.20035
[43,] 25.98711 25.98112 25.98696 25.98684
[44,] 25.96699 26.03642 25.96862 25.97002
[45,] 26.09128 26.18442 26.09348 26.09535
[46,] 26.32262 26.38797 26.32416 26.32548
[47,] 26.86078 26.92593 26.86232 26.86363
[48,] 26.62626 26.68217 26.62757 26.62870
[49,] 26.52404 26.51001 26.52371 26.52343
[50,] 26.74478 26.76183 26.74518 26.74552
參考文獻:《Mixed Frequency Data Sampling Regression Models: The R Package midasr》