Markov Behavior of Volatility Index (VIX)

Pranav Ahluwalia | Math 7241

Introduction

The volatility index (VIX) measures the market's expectation of volatility based on S&P 500 options. This paper tests the hypothesis that the VIX satisfies the Markov property: the distribution of the time series at step $T(n + 1)$ depends only on its value at $T(n)$, not on any earlier values.

Data and Observation

VIX time series data from January 1990 to November 2021 were obtained from the Chicago Board Options Exchange (Cboe) website: https://www.cboe.com/tradable_products/vix/vix_historical_data/.

Normalization and Time Series Mapping

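To map the time series to a discrete set of states, the data were centered on the mean $\mu$, scaled by the standard deviation $\sigma$, and rounded to the nearest integer, so that each state counts the number of standard deviations from the mean.

$$z_{i} = \operatorname{round}\left(\frac{y_{i}-\mu}{\sigma}\right)$$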

After normalization and rounding, the time series takes values in 10 states. Let $S$ represent our set of states.

$$S = \{-1, 0, 1, 2, 3, 4, 5, 6, 7, 8\}$$

Observe that the time series data is far more readable and resembles the behavior of a state transition system more closely.
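The mapping described above can be sketched as follows. This is a minimal illustration, assuming the population standard deviation and NumPy's round-half-to-even rule; the function name `discretize` and the toy values are hypothetical and are not actual VIX data.

```python
import numpy as np

def discretize(values):
    """Map a series to integer states: z-score, then round to the nearest integer."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()  # population standard deviation
    return np.rint(z).astype(int)

# Toy levels for illustration only (not actual VIX data).
toy_levels = [12.0, 13.5, 15.0, 19.0, 30.0, 45.0, 22.0, 16.0]
toy_states = discretize(toy_levels)
print(toy_states)
```

Because the VIX is right-skewed, a mapping of this kind produces few states below the mean and a long tail of states above it, which is consistent with the state set $S$.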

Occupation Frequencies

Below is a frequency histogram representing the occupation frequencies for the time series data in each state. The time series spends the majority of its time in state -1 and state 0.
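Occupation frequencies of this kind can be computed by counting visits to each state and dividing by the total. The sketch below assumes the state history is a sequence of integers; `occupation_frequencies` and the toy sequence are hypothetical names and values.

```python
import numpy as np

def occupation_frequencies(history, state_space):
    """Fraction of time steps spent in each state of state_space."""
    history = np.asarray(history)
    counts = np.array([(history == s).sum() for s in state_space], dtype=float)
    return counts / counts.sum()

# Toy state history for illustration only.
toy_history = [-1, 0, 0, 1, 0, -1, -1, 0]
toy_freqs = occupation_frequencies(toy_history, state_space=[-1, 0, 1])
print(toy_freqs)
```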

Markov Chain Analysis

The Python module used here takes a set of states $S$ and an array of state history, e.g. $\begin{bmatrix}-1 & 0 & 1 & 1 & 2 & \dots \end{bmatrix}$, and outputs a Markov chain built from the state transition probabilities observed in the sequence. The module can also simulate the derived Markov chain for an arbitrary sequence length.
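One common way to estimate such a chain is to count one-step transitions and normalize each row, which gives the maximum-likelihood estimate of the transition matrix. The sketch below assumes that approach; `estimate_transition_matrix` is a hypothetical name and is not taken from the original module.

```python
import numpy as np

def estimate_transition_matrix(history, state_space):
    """Row-normalized counts of observed one-step transitions."""
    index = {s: k for k, s in enumerate(state_space)}
    n = len(state_space)
    counts = np.zeros((n, n))
    for a, b in zip(history[:-1], history[1:]):
        counts[index[a], index[b]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that are never left stay all-zero instead of dividing by 0.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Toy state history for illustration only.
toy_history = [-1, 0, 1, 1, 2, 1, 0, -1, -1, 0]
P_toy = estimate_transition_matrix(toy_history, state_space=[-1, 0, 1, 2])
print(P_toy)
```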

Now, we will instantiate the Markov chain object and use it to produce a transition matrix.

Below is a record of the transition matrix that was extracted from the formatted time series data.

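The function $getStationary(P)$ approximates the stationary distribution of the chain by raising the transition matrix $P$ to a large power. For an irreducible, aperiodic chain, every row of $P^{n}$ converges to the stationary distribution as $n$ grows.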
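The matrix-power approach can be sketched as below. This is an illustration rather than the original function: the name `stationary_by_power`, the default power, and the two-state example matrix are assumptions, and the method is only reliable when the chain is irreducible and aperiodic.

```python
import numpy as np

def stationary_by_power(P, n_steps=1000):
    """Approximate the stationary distribution as a row of P**n_steps."""
    P = np.asarray(P, dtype=float)
    Pn = np.linalg.matrix_power(P, n_steps)
    return Pn[0]  # all rows agree once the power has converged

# Illustrative two-state chain (not the VIX transition matrix).
P_example = np.array([[0.9, 0.1],
                      [0.5, 0.5]])
w_example = stationary_by_power(P_example)
print(w_example)
```

A quick sanity check on any result is that it is left unchanged by one more step of the chain, i.e. that `w_example @ P_example` equals `w_example` up to floating-point error.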

Comparing Stationary Distribution to Occupation Frequencies

We can compare the stationary distribution calculated from the transition matrix to the occupation frequencies observed in the time series. Let $W$ denote the empirical distribution. Let $W'$ denote the stationary distribution of the chain. The values are rounded because their full decimal expansions are very long, so the displayed distributions might not add up to exactly 1.

$$W \approx \begin{bmatrix}0.36 & 0.41 & 0.15 & 0.04 & 0.016 & 0.003 & 0.0029 & 0.0024 & 8.708 \times 10^{-4} & 3.732 \times 10^{-4}\end{bmatrix}$$

$$W' \approx \begin{bmatrix}0.3621 & 0.4102 & 0.1598 & 0.04085 & 0.016366 & 0.003748 & 0.002998 & 0.002498 & 8.708 \times 10^{-4} & 3.732 \times 10^{-4}\end{bmatrix}$$

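The empirical occupation frequencies and the derived stationary distribution agree to within rounding. Some agreement is expected by construction: because the transition matrix is estimated from the same sample path, the empirical occupation frequencies are already nearly stationary for it.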

Simulating The Markov Chain

We will now simulate the Markov chain to produce a synthetic time series.
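Simulating a chain amounts to repeatedly sampling the next state from the row of the transition matrix that belongs to the current state. The sketch below assumes this standard procedure; `simulate_chain`, the fixed seed, and the two-state example matrix are hypothetical and stand in for the chain estimated from the VIX data.

```python
import numpy as np

def simulate_chain(P, state_space, start, n_steps, seed=0):
    """Sample a path of length n_steps from transition matrix P."""
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    index = {s: k for k, s in enumerate(state_space)}
    current = index[start]
    path = [start]
    for _ in range(n_steps - 1):
        # Draw the next state index using the current state's row as weights.
        current = rng.choice(len(state_space), p=P[current])
        path.append(state_space[current])
    return path

# Illustrative two-state chain (not the VIX transition matrix).
P_example = [[0.9, 0.1],
             [0.5, 0.5]]
sim_path = simulate_chain(P_example, state_space=[-1, 0], start=-1, n_steps=5000)
print(sim_path[:20])
```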

Comparing the time series of the original VIX to the simulation, we can see that the behavior of both data sets appears similar.

The histogram below displays the occupation frequencies for our simulation. The simulated occupation frequencies are similar to our derived stationary distribution which is in turn similar to the empirical distribution originally obtained from our data.

Autocorrelation Analysis

We will examine a plot of the autocorrelation functions of our simulated time series and our original data. The blue line displays the autocorrelation of the simulation, whereas the red line displays the empirical autocorrelation. The two functions behave similarly.
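A sample autocorrelation function of the kind compared here can be computed as below. The sketch assumes the usual biased estimator, normalized by the lag-0 sum of squares; the function name `autocorrelation` and the alternating toy series are hypothetical.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag, normalized by the lag-0 term."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Toy alternating series: strongly negative correlation at lag 1.
toy_series = [1, -1] * 50
acf = autocorrelation(toy_series, max_lag=5)
print(acf)
```

Applying the same function to the original state sequence and to a simulated path gives the two curves to overlay.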

Goodness of Fit Test

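Let $\hat{p}$ denote the computed transition matrix. Let $N_{ij}$ denote the observed number of two-step transitions from $i$ to $j$. The two-step transition probabilities $q_{ij}$ implied by the chain, the row totals $N_{i}$, and the expected two-step frequencies $M_{ij}$ are then:

$$q_{ij} = \sum_{k}\hat{p}_{ik}\hat{p}_{kj}$$

$$N_{i} = \sum_{j}N_{ij}$$

$$M_{ij} = N_{i}q_{ij}$$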

We will employ a goodness of fit test to compare the observed frequencies $N_{ij}$ with the expected frequencies $M_{ij}$. We will declare a function to compute each quantity stated above.
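The quantities $N_{ij}$, $q_{ij}$, and $M_{ij}$ defined above can be computed as in the following sketch. The function names and the toy inputs are hypothetical; the toy matrix `P_hat_toy` is simply the count-based estimate for the toy history.

```python
import numpy as np

def two_step_counts(history, state_space):
    """N[i, j]: number of times state j is observed two steps after state i."""
    index = {s: k for k, s in enumerate(state_space)}
    n = len(state_space)
    N = np.zeros((n, n))
    for a, c in zip(history[:-2], history[2:]):
        N[index[a], index[c]] += 1
    return N

def expected_two_step_counts(P_hat, N):
    """M[i, j] = N_i * q[i, j], with q = P_hat @ P_hat."""
    P_hat = np.asarray(P_hat, dtype=float)
    q = P_hat @ P_hat                      # two-step transition probabilities
    N_i = N.sum(axis=1, keepdims=True)     # two-step departures from each state
    return N_i * q

# Toy inputs for illustration only.
toy_history = [0, 1, 0, 1, 0, 1]
P_hat_toy = np.array([[0.0, 1.0],
                      [1.0, 0.0]])
N_toy = two_step_counts(toy_history, state_space=[0, 1])
M_toy = expected_two_step_counts(P_hat_toy, N_toy)
print(N_toy)
print(M_toy)
```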

Preliminary Functions

$\chi^{2}$ Goodness of Fit Test

Expected Frequencies

Observed Frequencies

Cleaning Matrices and Test Statistic

In order to perform a $\chi^{2}$ test, we will have to remove entries with an expected frequency of 0. This does not harm the validity of the test, since there are no entries where the expected frequency is 0 and the empirical frequency is non-zero.
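The cleaning step and the Pearson statistic $\sum (N_{ij} - M_{ij})^{2} / M_{ij}$ can be sketched as below. The function name `chi_square_statistic` and the toy matrices are hypothetical, and the check that raises an error encodes the condition stated above as an assumption.

```python
import numpy as np

def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic over cells with non-zero expected frequency.

    Returns the statistic and the number of cells that were kept.
    """
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    keep = expected > 0
    # Dropping a cell is only harmless if nothing was observed in it.
    if np.any(observed[~keep] != 0):
        raise ValueError("non-zero observation in a cell with zero expectation")
    diff = observed[keep] - expected[keep]
    return float(np.sum(diff ** 2 / expected[keep])), int(keep.sum())

# Toy frequencies for illustration only.
toy_observed = [[8, 2], [0, 0]]
toy_expected = [[5, 5], [0, 0]]
stat, n_cells = chi_square_statistic(toy_observed, toy_expected)
print(stat, n_cells)
```

A p-value would then come from the upper tail of a $\chi^{2}$ distribution, for example via `scipy.stats.chi2.sf`. The appropriate number of degrees of freedom depends on how many cells are kept and how many parameters were estimated, and is not fixed by this sketch.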

Results

Next, we will run a $\chi^{2}$ test.

The null hypothesis of the $\chi^{2}$ test is that the observed data come from the expected distribution. The results of the $\chi^{2}$ test above show a failure to reject this hypothesis at the 0.05 significance level. The two-step transition frequencies in the observed time series are therefore consistent with those implied by the estimated one-step chain, although a failure to reject does not by itself prove that the series is Markov.