Recently, there has been much work on modeling time series data, especially in financial markets.
A time series is a series of data points indexed (or listed or graphed) in time order. - From Wikipedia
The purposes of time series analysis are to understand or model the stochastic mechanism that gives rise to an observed series, and to predict or forecast future values of the series based on its history. Generally, there are two primary approaches to time series analysis: frequency-domain and time-domain. The former is based mostly on the Fourier transform, while the latter closely investigates the autocorrelation of the time series. Time-domain analysis relies heavily on Box-Jenkins and ARCH/GARCH methods to forecast the series.
This project investigates financial time series, specifically the price and return of the NASDAQ Composite stock index, using ARIMA and GARCH methods.
The most basic information we can capture from a financial market is price. A basic feature of financial time series, such as the price of the NASDAQ Composite in Picture 1, is relatively high volatility that usually changes over time. Picture 1 shows the daily price of the NASDAQ Composite from 01-Jan-2001 to 01-Jan-2015, in which the volatility of the index is clearly visible.
Picture 1 NASDAQ Composite Price and Log-Return
Because the stock price behaves like a geometric random process, with changes proportional to the price level, we can study the log-price instead of the original price of the stock index. The log-return is the difference of consecutive log-prices, given by the formula below:
$$r_t = \log P_t - \log P_{t-1} = \log\frac{P_t}{P_{t-1}}$$
where $P_t$ is the price on day $t$.
The lower part of Picture 1 shows the log-return of the NASDAQ Composite, in which we can observe volatility clustering. This hints that the log-returns may not be independently and identically distributed; otherwise the variance would be constant over time.
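The log-return of a price series is the difference of consecutive log-prices, $r_t = \log P_t - \log P_{t-1}$. As a minimal sketch in plain Python (the fits in this article were produced in R; the function name `log_returns` and the prices below are hypothetical), the transformation can be written as:

```python
import math

def log_returns(prices):
    """Return r_t = log(P_t) - log(P_{t-1}) for consecutive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")
    return [math.log(curr) - math.log(prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical closing prices, for illustration only (not NASDAQ data).
prices = [100.0, 102.0, 99.0, 101.0]
returns = log_returns(prices)

# Log-returns add up: their sum equals log(last price / first price).
total = sum(returns)
print(returns, total)
```

One reason to prefer log-returns is visible in the last lines: they are additive over time, so a multi-day return is just the sum of the daily ones.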
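Time-domain methods are built on the autocorrelation of the time series. For a stationary time series, we can use an autoregressive (AR) process. Assume $Y_t$ (taken to have zero mean for simplicity) is modeled as a weighted average of its past observations plus a white noise "error". The AR(p) formula is as follows:
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \epsilon_t$$
where $\epsilon_t$ is white noise with zero mean and variance $\sigma_\epsilon^2$.
Alternatively, assume the time series $Y_t$ is modeled as a weighted average (moving average) of past values of the white noise process rather than of past values of $Y_t$ itself. The MA(q) formula is as follows:
$$Y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}$$
where $\epsilon_t$ is again white noise with zero mean and variance $\sigma_\epsilon^2$. If we combine the AR process and the MA process, we get the ARMA(p,q) process:
$$Y_t = \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}$$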
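To make the difference between the AR and MA parts concrete, the sketch below builds a series from a given sequence of shocks. It is an illustration under stated assumptions: the function name `arma_filter` and the coefficients are hypothetical, the series has zero mean, and the MA terms enter with a plus sign (the sign convention that R's `arima` uses when reporting coefficients).

```python
def arma_filter(shocks, phi=(), theta=()):
    """Build Y_t = sum_i phi[i]*Y_{t-1-i} + e_t + sum_j theta[j]*e_{t-1-j}.

    Values before the start of the series are taken to be zero.
    """
    y = []
    for t, e in enumerate(shocks):
        ar_part = sum(p * y[t - 1 - i] for i, p in enumerate(phi) if t - 1 - i >= 0)
        ma_part = sum(q * shocks[t - 1 - j] for j, q in enumerate(theta) if t - 1 - j >= 0)
        y.append(ar_part + e + ma_part)
    return y

# Feed in a unit impulse to read off how a single shock propagates.
impulse = [1.0, 0.0, 0.0, 0.0]
ar1 = arma_filter(impulse, phi=(0.5,))                    # shock decays geometrically
ma1 = arma_filter(impulse, theta=(0.4,))                  # shock vanishes after one lag
arma11 = arma_filter(impulse, phi=(0.5,), theta=(0.4,))   # both effects combined
print(ar1, ma1, arma11)
```

In the AR(1) case the shock never fully disappears, while in the MA(1) case it is gone after one step. This is why the ACF of an MA process cuts off and the ACF of an AR process tails off.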
If the assumption of stationarity does not hold, we can difference the series to convert the non-stationary process into a stationary one. A time series is said to follow an integrated autoregressive moving average (ARIMA) model if its d-th difference is a stationary ARMA process.
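Differencing is simple to express in code. The following sketch (the helper name `difference` is hypothetical) applies the first-difference operator d times and shows that a deterministic quadratic trend becomes constant after two differences:

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# A quadratic trend is non-stationary, but its second difference is constant.
trend = [t * t for t in range(6)]      # [0, 1, 4, 9, 16, 25]
print(difference(trend, d=1))          # [1, 3, 5, 7, 9]
print(difference(trend, d=2))          # [2, 2, 2, 2]
```

Each round of differencing shortens the series by one observation, which is one reason not to difference more often than the data require.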
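ARMA time series models are unsatisfactory for modeling changes in volatility, because they assume a constant conditional variance. If the conditional variance of the time series is not constant, we need an ARCH model. The formula of the ARCH(q) model is
$$a_t = \sigma_t \epsilon_t, \qquad \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i a_{t-i}^2$$
where $\epsilon_t$ is white noise with zero mean and unit variance, $\omega > 0$, and $\alpha_i \ge 0$. The $a_t$ process is uncorrelated and has a constant mean and a constant unconditional variance.
In the GARCH(p,q) model, assume again that $a_t = \sigma_t \epsilon_t$ and that $\epsilon_t$ is white noise with unit variance; we get
$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i a_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$$
in which past values of the $\sigma_t$ process are fed back. $a_t$ is uncorrelated with a stationary mean and variance.
The condition for this is that $\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j$ is less than 1. Generally, we can combine the ARIMA model and the GARCH model, for example AR(1)/GARCH(1,1), as follows:
$$Y_t = \phi Y_{t-1} + a_t; \qquad a_t = \sigma_t \epsilon_t; \qquad \sigma_t^2 = \omega + \alpha a_{t-1}^2 + \beta \sigma_{t-1}^2$$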
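The feedback in the GARCH variance equation is what produces volatility clustering. The sketch below computes the conditional variance for a given sequence of shocks. All names and parameter values are hypothetical, and replacing pre-sample values by a fixed starting variance is just one common start-up choice, not necessarily what any particular R package does.

```python
def garch_variance(a, omega, alphas, betas, initial_variance):
    """Conditional variances sigma_t^2 of a GARCH model for given shocks a_t.

    sigma_t^2 = omega + sum_i alphas[i]*a_{t-1-i}^2 + sum_j betas[j]*sigma_{t-1-j}^2.
    Squared shocks and variances before the sample start are replaced by
    initial_variance (an assumption made for this sketch).
    """
    sigma2 = []
    for t in range(len(a)):
        arch = sum(
            alpha * (a[t - 1 - i] ** 2 if t - 1 - i >= 0 else initial_variance)
            for i, alpha in enumerate(alphas)
        )
        garch = sum(
            beta * (sigma2[t - 1 - j] if t - 1 - j >= 0 else initial_variance)
            for j, beta in enumerate(betas)
        )
        sigma2.append(omega + arch + garch)
    return sigma2

# Hypothetical GARCH(1,1) parameters, chosen so that alpha + beta < 1.
omega, alpha, beta = 0.1, 0.2, 0.7
long_run = omega / (1 - alpha - beta)      # unconditional variance, about 1.0
shocks = [0.0, 3.0, 0.0, 0.0]              # one large shock at t = 1
sigma2 = garch_variance(shocks, omega, [alpha], [beta], long_run)
print(sigma2)
```

The variance jumps in the period after the large shock and then decays only gradually toward its long-run level, so a turbulent day tends to be followed by further turbulent days.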
We again use the daily price of the NASDAQ Composite from 01-Jan-2001 to 01-Jan-2015 as the time series data, and we mainly study the log-return.
The autocorrelation of a time series is essential to study. Picture 2 shows the autocorrelation function (ACF) and partial ACF (PACF) of the log-return, of its absolute value, and of its squared value.
The dashed horizontal lines give the critical values for testing whether the autocorrelation coefficients are significantly different from zero. The plots display some significant autocorrelations and hence provide some evidence that the daily NASDAQ returns are not independent over time, so an ARIMA model needs to be considered.
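The quantities behind such a plot are easy to compute by hand. The following sketch (helper names are hypothetical, and the input is a deterministic toy series rather than market data) computes sample autocorrelations and the approximate 95% bound $\pm 1.96/\sqrt{n}$ that applies under the white-noise null hypothesis:

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag (sums scaled by the lag-0 sum)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]

def white_noise_bound(n):
    """Approximate 95% critical value under the white-noise null."""
    return 1.96 / math.sqrt(n)

# Deterministic toy series with alternating signs, not market data.
x = [1.0, -1.0] * 5
acf = sample_acf(x, max_lag=2)
bound = white_noise_bound(len(x))
significant = [abs(r) > bound for r in acf]
print(acf, bound, significant)
```

Applying the same function to absolute or squared returns is the usual way to look for dependence in volatility, even when the returns themselves are nearly uncorrelated.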
Using the extended autocorrelation function (EACF) method, we get Table 1.
Table 1 The Sample EACF of Log-Return of NASDAQ Composite
A sample EACF is rarely clear-cut, but the table suggests that ARMA(2,1) is worth considering.
We can also use the Dickey-Fuller unit-root test to test the stationarity of the time series.
Augmented Dickey-Fuller Test
Dickey-Fuller = -14.141, Lag order = 15, p-value = 0.01
alternative hypothesis: stationary
The p-value is smaller than 0.05, so we reject the null hypothesis of a unit root at the 5% level and treat the log-return series as stationary.
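For reference, one standard form of the augmented Dickey-Fuller regression, with a constant and a linear trend, is given below. As far as we know this is the form fitted by `adf.test` in the R package tseries, but treat that as an assumption; the symbols $c$, $\delta$, $\gamma$, $\psi_i$ and $u_t$ are introduced here only for illustration.

```latex
\Delta Y_t = c + \delta t + \gamma Y_{t-1} + \sum_{i=1}^{k} \psi_i \, \Delta Y_{t-i} + u_t,
\qquad H_0 : \gamma = 0 \quad \text{vs.} \quad H_1 : \gamma < 0
```

Here $k$ is the lag order (15 in the output above) and the test statistic is the t-ratio of the estimate of $\gamma$. A strongly negative value such as $-14.141$ is evidence against a unit root. The printed p-value of 0.01 is probably the lower end of the tabulated range rather than an exact value.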
To select the orders of the ARIMA and GARCH models, we use the Akaike Information Criterion (AIC). In fitting an ARIMA/GARCH model, the idea of parsimony is important: the model should have as few parameters as possible yet still be capable of explaining the series.
Table 2 shows that the ARIMA(2,1,1) and GARCH(1,2) models best fit the NASDAQ Composite log-return time series.
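The AIC makes the parsimony trade-off explicit: it is $-2 \log L + 2k$, where $L$ is the maximized likelihood and $k$ the number of estimated parameters, and the model with the smallest value is preferred. A minimal sketch follows. The log-likelihoods and parameter counts in it are made up for illustration and are not the values from this study, and software packages differ in exactly which parameters they count in $k$.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2*logL + 2*k (smaller is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical fits: (label, log-likelihood, number of estimated parameters).
# These numbers are invented to show the trade-off; they are not from Table 2.
candidates = [
    ("small model", 1000.0, 2),
    ("medium model", 1004.0, 3),
    ("large model", 1004.5, 6),
]
scores = {name: aic(ll, k) for name, ll, k in candidates}
best = min(scores, key=scores.get)
print(scores, best)
```

The large model has the highest likelihood, but its extra parameters cost more than they gain, so the medium model is selected.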
Table 2 Model Selection
Picture 2 ACF/PACF of log-return
Here is the summary output of the ARIMA(2,1,1) and GARCH(1,2) models.
Table 3 Summary of model ARIMA(2,1,1)
arima(x = spreturn, order = c(2, 1, 1))
          ar1      ar2      ma1
      -0.0405  -0.0668  -0.9995
s.e.   0.0170   0.0170   0.0019
sigma^2 estimated as 0.0002409: log likelihood = 9653.35,
AIC = -19300.71, BIC = -19274.05
Table 4 Summary of model GARCH(1,2)
garch(x = res.arima211, order = c(1, 2))
    Min      1Q  Median      3Q     Max
-5.6575 -0.5666  0.1015  0.6534  4.4947
    Estimate  Std. Error  t value  Pr(>|t|)
a0 2.400e-06   4.548e-07    5.278  1.31e-07 ***
a1 2.911e-02   5.940e-03    4.901  9.55e-07 ***
a2 6.618e-02   1.042e-02    6.353  2.11e-10 ***
b1 8.921e-01   1.020e-02   87.450   < 2e-16 ***
All four coefficients of the GARCH(1,2) model are statistically significant, with $Pr(>|t|)$ values all close to zero.
The diagnostic procedure includes inspecting the residual plot, checking the Ljung-Box test result, and examining the ACF and PACF of the residuals. Because the normality assumption on the innovations can be explored with a QQ plot, we first look at the residuals and their QQ plots for the ARIMA(2,1,1)/GARCH(1,2) models.
Picture 3 QQ-Plot of Residuals
The plots in Picture 3 show that the residuals are roughly normally distributed, although some points remain off the line. The residuals of the ARIMA/GARCH model are also closer to normal than the residuals of the ARIMA model alone. Picture 4 shows the residuals of the ARIMA(2,1,1) and GARCH(1,2) models. The residual plots show no obvious remaining pattern, indicating that both the ARIMA model and the mixed model represent the series well. However, the residuals of the mixed model are more evenly spread.
Picture 4 Residuals Graph
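A QQ plot pairs the sorted residuals with the quantiles a normal sample of the same size would be expected to show. The sketch below computes those pairs using the plotting positions $(i - 0.5)/n$, which is one common choice and not necessarily the one used for Picture 3. The helper name and the residual values are hypothetical.

```python
from statistics import NormalDist

def normal_qq_points(residuals):
    """Pair sorted residuals with standard-normal quantiles at (i - 0.5) / n."""
    n = len(residuals)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(residuals)))

# Toy residuals with one outlier (hypothetical values, not model output).
residuals = [-0.3, 0.1, -1.2, 0.4, 0.0, 0.8, 5.0]
points = normal_qq_points(residuals)
for q, r in points:
    print(f"{q:6.3f} {r:6.3f}")
```

If the residuals were normal, the pairs would lie close to a straight line. The outlier shows up as a last point far above the line, which is the kind of heavy-tail departure often seen in financial returns.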
In the Ljung-Box test at 6, 12 and 18 degrees of freedom, the p-values are all larger than 0.05, which means that we cannot reject the hypothesis that the residuals are independent. So the selected model seems to be appropriate.
Table 5 Box-Ljung test of ARIMA(2,1,1)/GARCH(1,2)
X-squared = 7.5039, df = 6, p-value = 0.2767
X-squared = 14.08, df = 12, p-value = 0.2956
X-squared = 18.902, df = 18, p-value = 0.3979
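The Ljung-Box statistic is $Q = n(n+2)\sum_{k=1}^{m} r_k^2/(n-k)$, where $r_k$ is the lag-$k$ residual autocorrelation, and it is compared with a chi-square distribution. As a rough cross-check of Table 5, the sketch below recomputes the p-values from the reported statistics. The function names are hypothetical, and the chi-square tail formula used is a closed form that holds only for even degrees of freedom, which happens to cover 6, 12 and 18.

```python
import math

def ljung_box_q(autocorrelations, n):
    """Ljung-Box statistic Q = n(n+2) * sum_k r_k^2 / (n - k), k = 1..m."""
    return n * (n + 2) * sum(
        r * r / (n - k) for k, r in enumerate(autocorrelations, start=1)
    )

def chi2_sf_even_df(x, df):
    """P(chi-square with df degrees of freedom > x); closed form, even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this shortcut needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Recompute the p-values from the statistics reported in Table 5.
for q, df in [(7.5039, 6), (14.08, 12), (18.902, 18)]:
    print(df, round(chi2_sf_even_df(q, df), 4))
```

The recomputed values should agree with the table to about three decimal places. For odd degrees of freedom a general chi-square routine is needed instead of this shortcut.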
Picture 5 ACF/PACF and ARMA-subset plot of ARIMA/GARCH model
The ACF and PACF plots show that the residuals have no trend or pattern. Except at lags 1, 15 and 24, the residual autocorrelations stay within the critical values, showing that the autocorrelation coefficients are not significantly different from zero.
However, the ARMA-subsets plot shows that the mixed model is still not perfect, because the subset that includes lag 14 of the residuals, rather than lag 1 or 2, has the lowest BIC value.
We can use the ARIMA(2,1,1)/GARCH(1,2) model to predict the return and price of the NASDAQ Composite index, and such predictions can inform trading in the financial market. Forecasting with time series methods is already widely used and accepted. In the right part of Picture 6, the grey points are real data that were not used to fit the mixed model in this research; they closely match the prediction of the ARIMA/GARCH model.
Picture 6 Prediction of Return and Price
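Because the model is fitted to log-returns, a price forecast has to be rebuilt from the return forecasts: $P_{t+h} = P_t \exp(r_{t+1} + \dots + r_{t+h})$. A minimal sketch of that step follows; the function name, the starting price and the forecast returns are all hypothetical and are not output of the fitted model.

```python
import math

def price_path(last_price, forecast_log_returns):
    """Turn forecast log-returns into prices: P_{t+h} = P_t * exp(r_{t+1} + ... + r_{t+h})."""
    path, cumulative = [], 0.0
    for r in forecast_log_returns:
        cumulative += r
        path.append(last_price * math.exp(cumulative))
    return path

# Hypothetical last observed price and forecast returns (not model output).
path = price_path(4000.0, [0.01, -0.02, 0.01])
print(path)
```

The three returns sum to zero, so the path ends back at the starting price. Uncertainty bands for the price would need the forecast variance as well, which this sketch leaves out.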
On the one hand, the ARMA model focuses on the level of the series and does not reflect recent changes in information. It is best suited to stationary series; to study non-stationary series, differencing and transformations (such as the log transformation) are needed. On the other hand, ARIMA is often used together with an ARCH/GARCH model, which better measures the volatility of the series. With the mixed model, we can forecast future values using up-to-date information.
In this research, we study the NASDAQ Composite time series using an ARIMA/GARCH model, with several techniques to select parameters and tests to diagnose the model. The results indicate that this mixed model fits the series to some degree. We also use the model to predict the return and price, and the predictions closely match the real data.