R語言實現混頻數據分析(二)----示例解析

目錄

示例: 

R代碼實現

 加載包

生成符合條件的隨機數

權重分配:Exponential Almon polynomial 約束一致係數

低頻序列模擬 (e.g. 年度)

MIDAS 迴歸示例 月度、季度數據轉化爲同頻

基於最小二乘的線性模型

基於無約束的混頻迴歸

基於midas_r的非線性估計

收斂性檢驗

其它加權形式

約束的充分性檢驗

最優模型選取

手動選取模型


示例: 

根據上述函數及數據要求,利用R代碼實現混頻迴歸示例。具體代碼如下:

R代碼實現

 加載包

library(midasr)

生成符合條件的隨機數

set.seed(1001)  # 設置隨機數種子
n <- 250#低頻數據樣本容量
trend <- c(1:n)#線性趨勢
x <- rnorm(4 * n)#高頻解釋變量,eg:季度數據
z <- rnorm(12 * n)#高頻解釋變量,eg:月度數據

權重分配:Exponential Almon polynomial 約束一致係數

常見加權函數形式:

#權重分配:指數阿爾蒙滯後多項式nealmon
fn_x <- nealmon(p = c(1, -0.5), d = 8)
fn_z <- nealmon(p = c(2, 0.5, -0.1), d = 17)

低頻序列模擬 (e.g. 年度)

y <- 2 + 0.1 * trend + mls(x, 0:7, 4) %*% fn_x + mls(z, 0:16, 12) %*% fn_z + rnorm(n)

繪製係數散點圖

plot(fn_z, col = "red")
points(fn_x,col = "blue")

MIDAS 迴歸示例 月度、季度數據轉化爲同頻

基於最小二乘的線性模型

eq_u1 <- lm(y ~ trend + mls(x, k = 0:7, m = 4) + mls(z, k = 0:16, m = 12))
summary(eq_u1)

eq_u1結果:

Call:
lm(formula = y ~ trend + mls(x, k = 0:7, m = 4) + mls(z, k = 0:16, 
    m = 12))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2651 -0.6489  0.1073  0.6780  2.7707 

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                     1.9694327  0.1261210  15.615  < 2e-16 ***
trend                           0.1000072  0.0008769 114.047  < 2e-16 ***
mls(x, k = 0:7, m = 4)X.0/m     0.5268124  0.0643322   8.189 2.07e-14 ***
mls(x, k = 0:7, m = 4)X.1/m     0.3782006  0.0641497   5.896 1.38e-08 ***
mls(x, k = 0:7, m = 4)X.2/m     0.1879689  0.0680465   2.762 0.006219 ** 
mls(x, k = 0:7, m = 4)X.3/m    -0.0052409  0.0658730  -0.080 0.936658    
mls(x, k = 0:7, m = 4)X.4/m     0.1504419  0.0627623   2.397 0.017358 *  
mls(x, k = 0:7, m = 4)X.5/m     0.0104345  0.0655386   0.159 0.873647    
mls(x, k = 0:7, m = 4)X.6/m     0.0698753  0.0692803   1.009 0.314270    
mls(x, k = 0:7, m = 4)X.7/m     0.1463317  0.0650285   2.250 0.025412 *  
mls(z, k = 0:16, m = 12)X.0/m   0.3671055  0.0618960   5.931 1.14e-08 ***
mls(z, k = 0:16, m = 12)X.1/m   0.3502401  0.0599615   5.841 1.83e-08 ***
mls(z, k = 0:16, m = 12)X.2/m   0.4514656  0.0659546   6.845 7.37e-11 ***
mls(z, k = 0:16, m = 12)X.3/m   0.3733747  0.0577702   6.463 6.42e-10 ***
mls(z, k = 0:16, m = 12)X.4/m   0.3609667  0.0700954   5.150 5.75e-07 ***
mls(z, k = 0:16, m = 12)X.5/m   0.2155748  0.0632036   3.411 0.000769 ***
mls(z, k = 0:16, m = 12)X.6/m   0.0648163  0.0630194   1.029 0.304828    
mls(z, k = 0:16, m = 12)X.7/m   0.0665581  0.0673284   0.989 0.323955    
mls(z, k = 0:16, m = 12)X.8/m  -0.0014853  0.0624569  -0.024 0.981048    
mls(z, k = 0:16, m = 12)X.9/m   0.0466486  0.0675116   0.691 0.490305    
mls(z, k = 0:16, m = 12)X.10/m  0.0384882  0.0664136   0.580 0.562824    
mls(z, k = 0:16, m = 12)X.11/m -0.0077722  0.0591409  -0.131 0.895564    
mls(z, k = 0:16, m = 12)X.12/m -0.0283221  0.0620632  -0.456 0.648589    
mls(z, k = 0:16, m = 12)X.13/m -0.0375062  0.0608348  -0.617 0.538179    
mls(z, k = 0:16, m = 12)X.14/m  0.0297271  0.0652273   0.456 0.649018    
mls(z, k = 0:16, m = 12)X.15/m  0.0184075  0.0588059   0.313 0.754558    
mls(z, k = 0:16, m = 12)X.16/m -0.0546460  0.0677214  -0.807 0.420574    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9383 on 222 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.9855,	Adjusted R-squared:  0.9838 
F-statistic: 579.3 on 26 and 222 DF,  p-value: < 2.2e-16

基於無約束的混頻迴歸

eq_u2 <- midas_u(y ~ trend + mls(x, 0:7, 4) + mls(z, 0:16, 12))
summary(eq_u2)

eq_u2結果:

Call:
lm(formula = y ~ trend + mls(x, 0:7, 4) + mls(z, 0:16, 12), data = ee)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2651 -0.6489  0.1073  0.6780  2.7707 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)             1.9694327  0.1261210  15.615  < 2e-16 ***
trend                   0.1000072  0.0008769 114.047  < 2e-16 ***
mls(x, 0:7, 4)X.0/m     0.5268124  0.0643322   8.189 2.07e-14 ***
mls(x, 0:7, 4)X.1/m     0.3782006  0.0641497   5.896 1.38e-08 ***
mls(x, 0:7, 4)X.2/m     0.1879689  0.0680465   2.762 0.006219 ** 
mls(x, 0:7, 4)X.3/m    -0.0052409  0.0658730  -0.080 0.936658    
mls(x, 0:7, 4)X.4/m     0.1504419  0.0627623   2.397 0.017358 *  
mls(x, 0:7, 4)X.5/m     0.0104345  0.0655386   0.159 0.873647    
mls(x, 0:7, 4)X.6/m     0.0698753  0.0692803   1.009 0.314270    
mls(x, 0:7, 4)X.7/m     0.1463317  0.0650285   2.250 0.025412 *  
mls(z, 0:16, 12)X.0/m   0.3671055  0.0618960   5.931 1.14e-08 ***
mls(z, 0:16, 12)X.1/m   0.3502401  0.0599615   5.841 1.83e-08 ***
mls(z, 0:16, 12)X.2/m   0.4514656  0.0659546   6.845 7.37e-11 ***
mls(z, 0:16, 12)X.3/m   0.3733747  0.0577702   6.463 6.42e-10 ***
mls(z, 0:16, 12)X.4/m   0.3609667  0.0700954   5.150 5.75e-07 ***
mls(z, 0:16, 12)X.5/m   0.2155748  0.0632036   3.411 0.000769 ***
mls(z, 0:16, 12)X.6/m   0.0648163  0.0630194   1.029 0.304828    
mls(z, 0:16, 12)X.7/m   0.0665581  0.0673284   0.989 0.323955    
mls(z, 0:16, 12)X.8/m  -0.0014853  0.0624569  -0.024 0.981048    
mls(z, 0:16, 12)X.9/m   0.0466486  0.0675116   0.691 0.490305    
mls(z, 0:16, 12)X.10/m  0.0384882  0.0664136   0.580 0.562824    
mls(z, 0:16, 12)X.11/m -0.0077722  0.0591409  -0.131 0.895564    
mls(z, 0:16, 12)X.12/m -0.0283221  0.0620632  -0.456 0.648589    
mls(z, 0:16, 12)X.13/m -0.0375062  0.0608348  -0.617 0.538179    
mls(z, 0:16, 12)X.14/m  0.0297271  0.0652273   0.456 0.649018    
mls(z, 0:16, 12)X.15/m  0.0184075  0.0588059   0.313 0.754558    
mls(z, 0:16, 12)X.16/m -0.0546460  0.0677214  -0.807 0.420574    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9383 on 222 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.9855,	Adjusted R-squared:  0.9838 
F-statistic: 579.3 on 26 and 222 DF,  p-value: < 2.2e-16

同頻時基於最小二乘和無約束混頻迴歸的eq_u1與eq_u2結果一致!

基於midas_r的非線性估計

eq_r <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:16, 12, nealmon), start = list(x = c(1,-0.5), z = c(2, 0.5, -0.1)))
summary(eq_r)

收斂性檢驗

通過計算梯度和黑森矩陣檢驗是否滿足收斂的充分必要條件,計算梯度的歐氏範數和hessian的特徵值。然後判斷梯度範數是否接近零,特徵值是否爲正

deriv_tests(eq_r, tol = 1e-06)

結果:

$`first`
[1] FALSE

$second
[1] TRUE

$gradient
[1]  0.005542527  0.173490055 -0.005754945  0.028655981 -0.030150807  0.019224017  0.053829528

$eigenval
[1] 1.047988e+07 5.891393e+04 3.664513e+02 1.221231e+02 8.116275e+01 5.148844e+01 4.598507e+01

係數輸出

coef(eq_r)#返回t檢驗顯著的係數

結果:

(Intercept)       trend          x1          x2          z1          z2          z3 
 1.98820762  0.09988275  1.35336157 -0.50747744  2.26312231  0.40927986 -0.07293300 

 

coef(eq_r, midas = TRUE)#返回所有係數

結果:

 (Intercept)        trend           x1           x2           x3           x4           x5           x6           x7 
1.988208e+00 9.988275e-02 5.480768e-01 3.299490e-01 1.986333e-01 1.195797e-01 7.198845e-02 4.333793e-02 2.608997e-02 
          x8           z1           z2           z3           z4           z5           z6           z7           z8 
1.570648e-02 3.346806e-01 4.049071e-01 4.233810e-01 3.826120e-01 2.988388e-01 2.017282e-01 1.176921e-01 5.934435e-02 
          z9          z10          z11          z12          z13          z14          z15          z16          z17 
2.586203e-02 9.740853e-03 3.170901e-03 8.921121e-04 2.169239e-04 4.558760e-05 8.280130e-06 1.299807e-06 1.763484e-07

eq_r的基礎上使用函數amweights來形成幾個標準的週期函數約束

amweights(p = c(1, -0.5), d = 8, m = 4, weight = nealmon, type = "C")
nealmon(p = c(1, -0.5), d = 4)

eq_r1 <- midas_r(y ~ trend + mls(x, 0:7, 4, amweights, nealmon, "C") + mls(z, 0:16, 12, nealmon), start = list(x = c(1, -0.5), z = c(2, 0.5, -0.1)))
summary(eq_r1)

其它加權形式

fn <- gompertzp#概率密度函數的權重設定
eq_r2 <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:16, 12, fn), start = list(x = c(1,-0.5), z = c(1, 0.5, 0.1)))
summary(eq_r2)
eq_r3 <- midas_r(y ~ trend + mls(x, 0:7, 4) + mls(z, 0:16, 12, nealmon), start = list(z = c(1,-0.5)))
summary(eq_r3)

eq_r4 <- midas_r(y ~ trend + mls(y, 1:2, 1) + mls(x, 0:7, 4, nealmon), start = list(x = c(1,-0.5)))                                                                                         
summary(eq_r4)

eq_r5 <- midas_r(y ~ trend + mls(y, 1:2, 1, "*") + mls(x, 0:7, 4, nealmon), start = list(x = c(1,-0.5)))                                                                                               
summary(eq_r5)

eq_r6 <- midas_r(y ~ trend + mls(y, 1:4, 1, nealmon) + mls(x, 0:7, 4, nealmon), start = list(y = c(1, -0.5), x = c(1, -0.5)))                                                                                           
summary(eq_r6)

eq_r7 <- midas_r(y ~ trend + mls(x, 0:7, 4, amweights, nealmon, "B"), start = list(x = c(1, 1, -0.5)))                                                                                         summary(eq_r7)

eq_r8 <- midas_r(y ~ trend + mls(x, 0:7, 4, amweights, nealmon, "C"), start = list(x = c(1, -0.5)))                                                                                         summary(eq_r8)

eq_r9 <- midas_r(y ~ trend + mls(x, 0:7, 4, amweights, nealmon, "A"), start = list(x = c(1, 1, 1, -0.5)))                                                                                      
summary(eq_r9)
fn <- function(p, d) {
  p[1] * c(1:d)^p[2]
}
eq_r10 <- midas_r(y ~ trend + mls(x, 0:101, 4, fn), start = list(x = rep(0, 2)))
summary(eq_r10)

約束的充分性檢驗

只要誤差是獨立同分布的,就可以使用hAh_test檢驗,而hAhr_test是HAC-Roust穩健版本檢驗。只要在殘差中沒有觀察到顯著的hac,在小樣本中使用hAh_test檢驗更精確。

eq_r12 <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:16, 12, nealmon), start = list(x = c(1, -0.5), z = c(2, 0.5, -0.1)))
summary(eq_r12)
#約束的充分性檢驗
hAh_test(eq_r12)
hAhr_test(eq_r12)

結果:

hAh restriction test

data:  
hAh = 16.552, df = 20, p-value = 0.6818

hAh restriction test (robust version)

data:  
hAhr = 14.854, df = 20, p-value = 0.7847

對比:將變量z的函數約束的參數數量由17減少爲2個

eq_r13 <- midas_r(y ~ trend + mls(x, 0:7, 4, nealmon) + mls(z, 0:12, 12, nealmon), start = list(x = c(1, -0.5), z = c(2, -0.1)))                                                                                                
hAh_test(eq_r13)
hAhr_test(eq_r13)
summary(eq_r13)

結果:

	hAh restriction test

data:  
hAh = 36.892, df = 17, p-value = 0.00348


	hAh restriction test (robust version)

data:  
hAhr = 32.879, df = 17, p-value = 0.01168

拒絕原假設,即約束不充分

最優模型選取

step1:函數expand_weights_lags定義每個模型的潛在模型集

set_x <- expand_weights_lags(weights = c("nealmon", "almonp"), from = 0, to = c(5, 10), m = 1, start = list(nealmon = rep(1, 2), almonp = rep(1, 3)))


set_z <- expand_weights_lags(weights = c("nealmon", "nbeta"), from = 1, to = c(2, 3), m = 1, start = list(nealmon = rep(0,2), nbeta = rep(0.5, 3)))

step2:所有可能模型的估計

函數midas_r_ic_table返回所有模型的彙總表,以及常用信息準則的相應值和參數約束充分性檢驗(默認是hAh_test檢驗)的經驗大小。給出了導數檢驗的結果和優化函數的收斂狀態。

eqs.ic <- midas_r_ic_table(y ~ trend + mls(x, 0, m = 4) + fmls(z, 0, m = 12), table = list(z = set_z,x = set_x), start = c(`(Intercept)` = 0, trend = 0))

mod <- modsel(eqs.ic, IC = "AIC", type = "restricted")

step3:查看某個候選模型可微調其收斂性,並更新結果:

eqs_ic$candlist[[5]] <- update(eqs_ic$candlist[[5]], Ofunction = "nls")

eqs_ic <- update(eqs_ic)

手動選取模型

step1:訓練集(樣本內預測)、測試集(樣本外預測)分割

datasplit <- split_data(list(y = y, x = x, z = z, trend = trend), insample = 1:200, outsample = 201:250)

step2:分別擬合待選模型

mod1 <- midas_r(y ~ trend + mls(x, 4:14, 4, nealmon) + mls(z, 12:22, 12, nealmon), data = datasplit$indata, start = list(x = c(10, 1, -0.1), z = c(2, -0.1)))

mod2 <- midas_r(y ~ trend + mls(x, 4:20, 4, nealmon) + mls(z, 12:25, 12, nealmon), data = datasplit$indata, start = list(x = c(10, 1, -0.1), z = c(2, -0.1)))

step3:計算

avgf <- average_forecast(list(mod1, mod2), data = list(y = y, x = x, z = z, trend = trend), insample = 1:200, outsample = 201:250, type = "fixed", measures = c("MSE", "MAPE", "MASE"), fweights = c("EW", "BICW", "MSFE", "DMSFE"))

樣本內、樣本外預測精度輸出:

avgf$accuracy

結果對比:

$`individual`
                                                              Model MSE.out.of.sample MAPE.out.of.sample
1 y ~ trend + mls(x, 4:14, 4, nealmon) + mls(z, 12:22, 12, nealmon)          1.848474           4.474242
2 y ~ trend + mls(x, 4:20, 4, nealmon) + mls(z, 12:25, 12, nealmon)          1.763432           4.391594
  MASE.out.of.sample MSE.in.sample MAPE.in.sample MASE.in.sample
1          0.8155477      1.768720       14.05880      0.6992288
2          0.7976764      1.726365       13.56287      0.6897242

$average
  Scheme      MSE     MAPE      MASE
1     EW 1.803640 4.432918 0.8066121
2   BICW 1.763433 4.391595 0.7976766
3   MSFE 1.802640 4.431945 0.8064017
4  DMSFE 1.801787 4.431112 0.8062216

mod1,mod2預測結果輸出:

avgf$forecast

結果對比:

          [,1]     [,2]
 [1,] 22.24057 22.38300
 [2,] 22.12774 22.13778
 [3,] 22.22678 22.21942
 [4,] 22.44535 22.61872
 [5,] 22.36448 22.46157
 [6,] 22.49847 22.66663
 [7,] 22.70192 22.70818
 [8,] 22.49411 22.53326
 [9,] 22.47978 22.54235
[10,] 22.43439 22.46522
[11,] 22.61563 22.57788
[12,] 22.87686 22.88271
[13,] 22.59329 22.64370
[14,] 23.39675 23.47606
[15,] 23.25864 23.27932
[16,] 22.97588 23.09488
[17,] 23.13196 23.21956
[18,] 24.05489 24.04334
[19,] 24.11690 24.16742
[20,] 23.79328 23.91429
[21,] 24.58047 24.58925
[22,] 24.33794 24.45825
[23,] 23.96658 24.03107
[24,] 24.00510 24.12939
[25,] 23.96473 23.99645
[26,] 24.02198 24.04416
[27,] 25.01521 24.91966
[28,] 24.47063 24.59918
[29,] 24.58748 24.51964
[30,] 24.69662 24.72768
[31,] 25.19092 25.22659
[32,] 25.57452 25.72345
[33,] 25.06312 25.17678
[34,] 24.99302 25.09359
[35,] 25.13005 25.16293
[36,] 25.85870 25.85633
[37,] 25.43593 25.46734
[38,] 25.80690 25.88224
[39,] 26.80900 26.82394
[40,] 26.44056 26.63095
[41,] 26.04786 26.17996
[42,] 26.09545 26.29647
[43,] 25.99309 25.98112
[44,] 25.89755 26.03643
[45,] 25.99814 26.18442
[46,] 26.25727 26.38797
[47,] 26.79563 26.92593
[48,] 26.57034 26.68217
[49,] 26.53808 26.51001
[50,] 26.72772 26.76183

組合預測效果輸出:

avgf$avgforecast

結果對比:

其中EW:基於特徵值加權組合;BICW:BIC信息準則加權組合;MSFE: 預測均方誤差加權組合;DMSFE:discounted MSFE

         EW     BICW     MSFE    DMSFE
 [1,] 22.31178 22.38299 22.31346 22.31489
 [2,] 22.13276 22.13778 22.13288 22.13298
 [3,] 22.22310 22.21942 22.22301 22.22294
 [4,] 22.53204 22.61872 22.53408 22.53582
 [5,] 22.41303 22.46157 22.41417 22.41515
 [6,] 22.58255 22.66663 22.58453 22.58622
 [7,] 22.70505 22.70818 22.70512 22.70519
 [8,] 22.51369 22.53326 22.51415 22.51454
 [9,] 22.51107 22.54235 22.51180 22.51244
[10,] 22.44980 22.46522 22.45017 22.45048
[11,] 22.59675 22.57788 22.59631 22.59593
[12,] 22.87979 22.88271 22.87986 22.87991
[13,] 22.61849 22.64370 22.61909 22.61960
[14,] 23.43641 23.47606 23.43734 23.43814
[15,] 23.26898 23.27932 23.26922 23.26943
[16,] 23.03538 23.09488 23.03678 23.03798
[17,] 23.17576 23.21956 23.17679 23.17767
[18,] 24.04912 24.04334 24.04898 24.04887
[19,] 24.14216 24.16742 24.14276 24.14326
[20,] 23.85379 23.91429 23.85521 23.85643
[21,] 24.58486 24.58925 24.58497 24.58505
[22,] 24.39809 24.45825 24.39951 24.40072
[23,] 23.99882 24.03107 23.99958 24.00023
[24,] 24.06724 24.12938 24.06870 24.06996
[25,] 23.98059 23.99645 23.98096 23.98128
[26,] 24.03307 24.04416 24.03334 24.03356
[27,] 24.96744 24.91967 24.96631 24.96535
[28,] 24.53491 24.59918 24.53642 24.53771
[29,] 24.55356 24.51964 24.55277 24.55208
[30,] 24.71215 24.72768 24.71252 24.71283
[31,] 25.20876 25.22659 25.20918 25.20954
[32,] 25.64898 25.72345 25.65074 25.65224
[33,] 25.11995 25.17678 25.12129 25.12244
[34,] 25.04331 25.09359 25.04449 25.04550
[35,] 25.14649 25.16293 25.14688 25.14721
[36,] 25.85751 25.85633 25.85749 25.85746
[37,] 25.45163 25.46734 25.45200 25.45232
[38,] 25.84457 25.88224 25.84546 25.84622
[39,] 26.81647 26.82394 26.81665 26.81680
[40,] 26.53575 26.63095 26.53799 26.53991
[41,] 26.11391 26.17996 26.11546 26.11679
[42,] 26.19596 26.29647 26.19833 26.20035
[43,] 25.98711 25.98112 25.98696 25.98684
[44,] 25.96699 26.03642 25.96862 25.97002
[45,] 26.09128 26.18442 26.09348 26.09535
[46,] 26.32262 26.38797 26.32416 26.32548
[47,] 26.86078 26.92593 26.86232 26.86363
[48,] 26.62626 26.68217 26.62757 26.62870
[49,] 26.52404 26.51001 26.52371 26.52343
[50,] 26.74478 26.76183 26.74518 26.74552

參考文獻:《Mixed Frequency Data Sampling Regression Models: The R Package midasr》

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章