2. Forecasting Examples
2.1 Data
2.2 Basic Forecasting
2.3 Alternative Forecast Generation Schemes
2.4 TARCH
××××××××××××××××××××××
2.1 Data
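These examples were written for S&P 500 index data from Yahoo, with the pandas-datareader package managing the download. The code below uses pandas-datareader to fetch the Fama-French daily research factors instead, and rebuilds the market return as Mkt-RF + RF.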
import datetime as dt
import sys
import numpy as np
import pandas as pd
import pandas_datareader.data as web
from arch import arch_model
start = dt.datetime(2000,1,1)
end = dt.datetime(2017,1,1)
data = web.get_data_famafrench('F-F_Research_Data_Factors_daily', start=start, end=end)
mkt_returns = data[0]['Mkt-RF'] + data[0]['RF']
returns = mkt_returns
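2.2 Basic Forecasting
Forecasts from a standard GARCH(p,q) model can be generated with any of the following three methods:
- Analytical
- Simulation-based
- Bootstrap-based
By default, forecasts are produced only for the final observation in the sample, so they are out-of-sample.
Forecasting starts by specifying the model and estimating its parameters.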
am = arch_model(returns, vol='Garch', p=1, o=0, q=1, dist='Normal')
res = am.fit(update_freq=5)
Iteration: 5, Func. Count: 39, Neg. LLF: 6130.463290920333
Iteration: 10, Func. Count: 71, Neg. LLF: 6128.4731771407005
Optimization terminated successfully. (Exit mode 0)
Current function value: 6128.4731681952535
Iterations: 11
Function evaluations: 77
Gradient evaluations: 11
forecasts = res.forecast()
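Forecasts are contained in an ARCHModelForecast object, which has four attributes:
mean
- The forecast mean.
residual_variance
- The forecast residual variance, i.e. $E_t[\epsilon_{t+h}^2]$.
variance
- The forecast variance of the process, i.e. $\mathrm{Var}_t[r_{t+h}]$. This differs from the residual variance whenever the model has mean dynamics, for example an AR process.
simulations
- An object containing detailed information about the simulated paths. It is available only when the forecast method is simulation or bootstrap, and is not available when method='analytical'.
The mean, residual_variance, and variance attributes are each returned as a DataFrame with columns named h.#, where # is the number of steps ahead. That is, h.1 corresponds to the one-step-ahead forecast and h.10 to the 10-step-ahead forecast.
The default forecast produces only the 1-step forecast.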
print(forecasts.mean.iloc[-3:])
print(forecasts.residual_variance.iloc[-3:])
print(forecasts.variance.iloc[-3:])
h.1
Date
2016-12-28 NaN
2016-12-29 NaN
2016-12-30 0.061286
h.1
Date
2016-12-28 NaN
2016-12-29 NaN
2016-12-30 0.400956
h.1
Date
2016-12-28 NaN
2016-12-29 NaN
2016-12-30 0.400956
Longer-horizon forecasts can be computed by passing the horizon parameter.
forecasts = res.forecast(horizon=5)
print(forecasts.residual_variance.iloc[-3:])
h.1 h.2 h.3 h.4 h.5
Date
2016-12-28 NaN NaN NaN NaN NaN
2016-12-29 NaN NaN NaN NaN NaN
2016-12-30 0.400956 0.416563 0.431896 0.446961 0.461762
Values that were not computed are filled with nan.
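As a rough check on what the analytical method does, the sketch below rebuilds the 2016-12-30 row of the table above. It assumes the standard GARCH(1,1) forecast recursion $\sigma^2_{t+h+1|t} = \omega + (\alpha + \beta)\,\sigma^2_{t+h|t}$, and backs out $\omega$ and the persistence $\alpha + \beta$ from the first three printed values. These are values implied by rounded output, not the fitted parameters, and the function names are illustrative only.

```python
# Values copied from the 2016-12-30 row of the horizon=5 table above.
printed = [0.400956, 0.416563, 0.431896, 0.446961, 0.461762]


def implied_garch11(h1, h2, h3):
    """Back out (omega, persistence) from three consecutive forecasts,
    assuming sigma2[h+1] = omega + persistence * sigma2[h]."""
    persistence = (h3 - h2) / (h2 - h1)
    omega = h2 - persistence * h1
    return omega, persistence


def analytic_forecast(h1, omega, persistence, horizon):
    """Iterate the GARCH(1,1) variance recursion starting from the 1-step value."""
    path = [h1]
    for _ in range(horizon - 1):
        path.append(omega + persistence * path[-1])
    return path


omega, persistence = implied_garch11(*printed[:3])
rebuilt = analytic_forecast(printed[0], omega, persistence, horizon=5)

for step, (ours, theirs) in enumerate(zip(rebuilt, printed), start=1):
    print(f"h.{step}: rebuilt={ours:.6f}  printed={theirs:.6f}")
```

The h.4 and h.5 values are not used to recover the two quantities, so their agreement with the printed row (up to rounding) is a small consistency check on the recursion.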
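2.3 Alternative Forecast Generation Schemes
2.3.1 Fixed Window Forecasting
Fixed-window forecasting uses data up to a specified date to generate all forecasts after that date. To do this, pass the entire data set when initializing the model and then supply last_obs when calling fit. forecast() will then, by default, produce forecasts for every date after that final date.
Note: last_obs follows Python sequence rules, so the actual date given in last_obs is not in the sample.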
res = am.fit(last_obs = '2011-1-1', update_freq=5)
forecasts = res.forecast(horizon=5)
print(forecasts.variance.dropna().head())
Iteration: 5, Func. Count: 38, Neg. LLF: 4204.91956121224
Iteration: 10, Func. Count: 72, Neg. LLF: 4202.815024845146
Optimization terminated successfully. (Exit mode 0)
Current function value: 4202.812110685669
Iterations: 12
Function evaluations: 84
Gradient evaluations: 12
h.1 h.2 h.3 h.4 h.5
Date
2010-12-31 0.365727 0.376462 0.387106 0.397660 0.408124
2011-01-03 0.451526 0.461532 0.471453 0.481290 0.491043
2011-01-04 0.432131 0.442302 0.452387 0.462386 0.472300
2011-01-05 0.430051 0.440239 0.450341 0.460358 0.470289
2011-01-06 0.407841 0.418219 0.428508 0.438710 0.448825
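The half-open convention for last_obs can be illustrated without the library. The sketch below is not arch's internal code: it uses a short, hypothetical trading calendar and Python's bisect module to show that the cut-off date itself falls outside the estimation sample, which is why the first forecast row above is dated 2010-12-31.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical trading calendar around the cut-off (weekend and holiday skipped).
index = [
    date(2010, 12, 29),
    date(2010, 12, 30),
    date(2010, 12, 31),
    date(2011, 1, 3),
    date(2011, 1, 4),
]


def split_at(index, last_obs):
    """Split a sorted date list into (sample, holdout), excluding last_obs
    from the sample, like the half-open Python slice index[:stop]."""
    stop = bisect_left(index, last_obs)
    return index[:stop], index[stop:]


sample, holdout = split_at(index, date(2011, 1, 1))
print("last in-sample date:", sample[-1])   # 2010-12-31
print("first held-out date:", holdout[0])   # 2011-01-03

# A cut-off that is itself a trading day is still excluded from the sample.
sample2, holdout2 = split_at(index, date(2011, 1, 3))
print("last in-sample date:", sample2[-1])  # 2010-12-31
```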
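2.3.2 Rolling Window Forecasting
Rolling-window forecasting uses a fixed sample length and then produces forecasts from the final observation of each window. This can be implemented using first_obs and last_obs.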
index = returns.index
start_loc = 0
end_loc = np.where(index >= '2010-1-1')[0].min()
forecasts = {}
for i in range(20):
    sys.stdout.write('.')
    sys.stdout.flush()
    res = am.fit(first_obs=i, last_obs=i+end_loc, disp='off')
    temp = res.forecast(horizon=3).variance
    fcast = temp.iloc[i+end_loc-1]
    forecasts[fcast.name] = fcast
print()
print(pd.DataFrame(forecasts).T)
h.1 h.2 h.3
2009-12-31 0.598199 0.605960 0.613661
2010-01-04 0.771974 0.778431 0.784837
2010-01-05 0.724185 0.731008 0.737781
2010-01-06 0.674237 0.681423 0.688555
2010-01-07 0.637534 0.644995 0.652399
2010-01-08 0.601684 0.609451 0.617161
2010-01-11 0.562393 0.570450 0.578447
2010-01-12 0.613401 0.621098 0.628738
2010-01-13 0.623059 0.630676 0.638236
2010-01-14 0.584403 0.592291 0.600119
2010-01-15 0.654097 0.661483 0.668813
2010-01-19 0.725471 0.732355 0.739187
2010-01-20 0.758532 0.765176 0.771770
2010-01-21 0.958742 0.964005 0.969229
2010-01-22 1.272999 1.276121 1.279220
2010-01-25 1.182257 1.186084 1.189883
2010-01-26 1.110357 1.114637 1.118885
2010-01-27 1.044077 1.048777 1.053442
2010-01-28 1.085489 1.089873 1.094223
2010-01-29 1.088349 1.092875 1.097367
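2.3.3 Recursive Forecast Generation
The recursive method is similar to the rolling method, except that the initial observation stays fixed. This is easily implemented by dropping the first_obs option.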
import pandas as pd
import numpy as np
index = returns.index
start_loc = 0
end_loc = np.where(index >= '2010-1-1')[0].min()
forecasts = {}
for i in range(20):
    sys.stdout.write('.')
    sys.stdout.flush()
    res = am.fit(last_obs=i+end_loc, disp='off')
    temp = res.forecast(horizon=3).variance
    fcast = temp.iloc[i+end_loc-1]
    forecasts[fcast.name] = fcast
print()
print(pd.DataFrame(forecasts).T)
h.1 h.2 h.3
2009-12-31 0.598199 0.605960 0.613661
2010-01-04 0.772200 0.778629 0.785009
2010-01-05 0.723347 0.730126 0.736853
2010-01-06 0.673796 0.680934 0.688017
2010-01-07 0.637555 0.644959 0.652306
2010-01-08 0.600834 0.608511 0.616129
2010-01-11 0.561436 0.569411 0.577324
2010-01-12 0.612214 0.619798 0.627322
2010-01-13 0.622095 0.629604 0.637055
2010-01-14 0.583425 0.591215 0.598945
2010-01-15 0.652960 0.660231 0.667447
2010-01-19 0.724212 0.730968 0.737673
2010-01-20 0.757280 0.763797 0.770264
2010-01-21 0.956394 0.961508 0.966583
2010-01-22 1.268445 1.271402 1.274337
2010-01-25 1.177405 1.180991 1.184549
2010-01-26 1.106326 1.110404 1.114450
2010-01-27 1.040930 1.045462 1.049959
2010-01-28 1.082130 1.086370 1.090577
2010-01-29 1.082251 1.086487 1.090690
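The only difference between the rolling and recursive loops is which (first_obs, last_obs) pair is passed to fit. The sketch below lists those pairs side by side. The helper names and the value of end_loc are hypothetical, chosen only to show the pattern.

```python
def rolling_windows(end_loc, n_steps):
    """Yield (first_obs, last_obs) pairs whose length stays fixed at end_loc."""
    for i in range(n_steps):
        yield i, i + end_loc


def recursive_windows(end_loc, n_steps):
    """Yield (first_obs, last_obs) pairs anchored at observation 0,
    so each window is one observation longer than the previous one."""
    for i in range(n_steps):
        yield 0, i + end_loc


end_loc = 2515  # hypothetical position of the first 2010 observation

pairs = zip(rolling_windows(end_loc, 3), recursive_windows(end_loc, 3))
for (r0, r1), (e0, e1) in pairs:
    print(f"rolling [{r0}, {r1}) len={r1 - r0}   "
          f"recursive [{e0}, {e1}) len={e1 - e0}")
```

At i = 0 the two schemes use the same window, which is consistent with the identical 2009-12-31 rows in the two tables above. From then on the rolling sample drops its oldest observation while the recursive sample keeps growing.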
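2.4 TARCH Model
2.4.1 Analytical Forecasts
All ARCH-type models have one-step analytical forecasts. Longer-horizon analytical forecasts exist only for specific models under particular specifications. For TARCH models there is no analytical (closed-form) forecast when the horizon is greater than 1, so longer horizons require simulation or bootstrap methods. Attempting to produce forecasts for horizons greater than 1 with method='analytical' raises a ValueError.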
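# TARCH specification as given in the source: o=1 adds the asymmetric (threshold) term.
# Note: with power=2.0 this is the GJR-GARCH form; arch's TARCH/ZARCH form uses power=1.0.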
am = arch_model(returns, vol='GARCH', power=2.0, p=1, o=1, q=1)
res = am.fit(update_freq=5)
forecasts = res.forecast()
print(forecasts.variance.iloc[-1])
Iteration: 5, Func. Count: 44, Neg. LLF: 6037.930348422024
Iteration: 10, Func. Count: 82, Neg. LLF: 6034.462051044527
Optimization terminated successfully. (Exit mode 0)
Current function value: 6034.461795464493
Iterations: 12
Function evaluations: 96
Gradient evaluations: 12
h.1 0.449483
Name: 2016-12-30 00:00:00, dtype: float64
2.4.2 Simulation Forecasts
When forecasting with the simulation or bootstrap method, one additional attribute of the ARCHModelForecast object becomes useful: simulations.
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,1)
subplot = (res.conditional_volatility['2016'] ** 2.0).plot(ax=ax, title='Conditional Variance')
forecasts = res.forecast(horizon=5, method='simulation')
sims = forecasts.simulations
lines = plt.plot(sims.residual_variances[-1,::10].T, color='#9cb2d6')
lines[0].set_label('Simulated path')
line = plt.plot(forecasts.variance.iloc[-1].values, color='#002868')
line[0].set_label('Expected variance')
legend = plt.legend()
import seaborn as sns
sns.boxplot(data=sims.variances[-1])
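To make the idea behind simulation-based forecasting concrete, here is a minimal standard-library sketch. It is not arch's implementation: it assumes a GJR-GARCH(1,1,1) variance recursion with normal shocks, and all parameter values are hypothetical. Each path starts from the known 1-step variance and draws fresh shocks for later steps. The forecast at each horizon is the average across paths.

```python
import random


def simulate_variance_paths(sigma2_next, omega, alpha, gamma, beta,
                            horizon, n_paths, draw_shock):
    """Simulate conditional-variance paths from a GJR-GARCH(1,1,1) recursion.

    sigma2_next is the known 1-step variance; draw_shock() returns one
    standardized shock z. Returns n_paths lists, each of length horizon.
    """
    paths = []
    for _ in range(n_paths):
        sigma2 = sigma2_next
        path = [sigma2]
        for _ in range(horizon - 1):
            z = draw_shock()
            eps2 = sigma2 * z * z                  # squared simulated residual
            asym = gamma if z < 0 else 0.0         # extra weight on negative shocks
            sigma2 = omega + (alpha + asym) * eps2 + beta * sigma2
            path.append(sigma2)
        paths.append(path)
    return paths


def mean_by_horizon(paths):
    """Average the simulated variances across paths, one value per horizon."""
    n = len(paths)
    return [sum(p[h] for p in paths) / n for h in range(len(paths[0]))]


# Hypothetical parameters and starting variance, for illustration only.
omega, alpha, gamma, beta = 0.02, 0.02, 0.12, 0.90
rng = random.Random(0)
paths = simulate_variance_paths(0.45, omega, alpha, gamma, beta,
                                horizon=5, n_paths=4000,
                                draw_shock=lambda: rng.gauss(0.0, 1.0))
expected = mean_by_horizon(paths)
print([round(v, 3) for v in expected])
```

The individual paths play the role of the light lines in the plot above, and the per-horizon average plays the role of the expected-variance line.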
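2.4.3 Bootstrap Forecasts
Bootstrap-based forecasts are nearly identical to simulation-based forecasts, except that the shocks are drawn from historical data rather than from an assumed distribution. Forecasts produced with this method also return an ARCHModelForecastSimulation object containing information about the simulated paths.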
forecasts = res.forecast(horizon=5, method='bootstrap')
sims = forecasts.simulations
lines = plt.plot(sims.residual_variances[-1,::10].T, color='#9cb2d6')
lines[0].set_label('Simulated path')
line = plt.plot(forecasts.variance.iloc[-1].values, color='#002868')
line[0].set_label('Expected variance')
legend = plt.legend()
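The difference from simulation can be sketched in a few lines. The example below is an assumption-laden illustration, not arch's code: it reuses a GJR-GARCH(1,1,1) recursion with hypothetical parameters, and resamples shocks with replacement from a short, made-up list standing in for the model's historical standardized residuals.

```python
import random


def bootstrap_variance_paths(sigma2_next, omega, alpha, gamma, beta,
                             horizon, n_paths, std_resids, seed=0):
    """Simulate variance paths whose shocks are resampled from std_resids."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        sigma2 = sigma2_next
        path = [sigma2]
        for _ in range(horizon - 1):
            z = rng.choice(std_resids)             # past shock, not an N(0, 1) draw
            asym = gamma if z < 0 else 0.0
            sigma2 = omega + (alpha + asym) * sigma2 * z * z + beta * sigma2
            path.append(sigma2)
        paths.append(path)
    return paths


# Hypothetical standardized residuals standing in for the fitted model's history.
history = [-2.1, -0.7, -0.3, 0.1, 0.4, 0.6, 0.9, 1.1]

paths = bootstrap_variance_paths(0.45, 0.02, 0.02, 0.12, 0.90,
                                 horizon=5, n_paths=500, std_resids=history)
distinct_h2 = {round(p[1], 12) for p in paths}
print(len(distinct_h2), "distinct 2-step values from",
      len(history), "historical shocks")
```

Because every shock comes from the observed history, the 2-step variance can take at most as many distinct values as there are historical residuals. With a real sample of several thousand observations the paths look much like the simulated ones, but no distributional assumption is needed.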