Correlations and Copulas

LOS 1. Define correlation and covariance and differentiate between correlation and dependence.

Covariance (and correlation) measure the co-movement of two random variables, i.e. the strength of the linear relationship between them. For variables $X$ and $Y$, covariance is: $$\mathrm{cov}(X,Y) = \sigma_{XY} = E(XY)-E(X) E(Y)$$ Covariance can be scaled into a unit-less and bounded measure, the correlation: $$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$ The correlation between two variables always lies between $-1$ and $+1$. A correlation of zero (i.e. uncorrelatedness) implies that: $$\rho_{XY} = 0 \Leftrightarrow \sigma_{XY} = 0 \Leftrightarrow E(XY) = E(X)E(Y)$$

Variables $X$ and $Y$ are independent if knowledge of one variable does not affect the probability distribution of the other. The conditional distribution then satisfies: $$f(Y|X = x) = f(Y)$$ While independence of two variables does imply that they are uncorrelated, a correlation of zero does not imply independence. A zero correlation only implies that there is no linear dependence.

Linear dependence between two variables can be checked visually by plotting the conditional expectation of one against the other, i.e. plotting $E(Y \mid X = x)$ against $x$ and checking whether the relationship is a straight line.
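
As a quick illustration (a minimal numpy sketch, not from the original text), the variable $Y = X^2$ is fully determined by $X$ yet has essentially zero correlation with it:

```python
import numpy as np

rng = np.random.default_rng(42)

# X is standard normal; Y = X^2 is completely determined by X,
# yet cov(X, Y) = E[X^3] - E[X]E[X^2] = 0 because X is symmetric.
x = rng.standard_normal(100_000)
y = x**2

print(np.corrcoef(x, y)[0, 1])  # approximately 0: no linear dependence
# A plot of E(Y | X = x) against x would trace the parabola x^2,
# immediately revealing the (nonlinear) dependence.
```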

LOS 2. Calculate covariance using the EWMA and GARCH (1,1) models.

Variables $X$ and $Y$, whose values at the end of day $i$ are given by $X_i$ and $Y_i$, are first converted into daily returns: $$x_i = \frac{X_i -X_{i-1}}{X_{i-1}}\\ y_i = \frac{Y_i - Y_{i-1}}{Y_{i-1}}$$ We make the assumption that $E(x_n) =0$ and $E(y_n) =0$. Using data from the past $m$ observations, the covariance can be estimated as: $$\widehat{\mbox{cov}}_n = \frac{1}{m} \sum^m_{i=1}x_{n-i}y_{n-i}$$ and, using the variances from the same data set, i.e. $$\widehat{\mbox{var}}_{x,n} = \frac{1}{m} \sum^m_{i=1} x^2_{n-i}\mbox{ and } \widehat{\mbox{var}}_{y, n} = \frac{1}{m} \sum^m_{i=1} y^2_{n-i},$$ one can define the sample correlation: $$\hat{\rho}_n = \frac{\widehat{\mbox{cov}}_n}{(\widehat{\mbox{var}}_{x,n})^{1/2}(\widehat{\mbox{var}}_{y,n})^{1/2}}$$
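
A minimal sketch of these equal-weight estimators in Python (the function name and inputs are illustrative, not from the source):

```python
import numpy as np

def sample_corr(prices_x, prices_y, m):
    """Equal-weight correlation over the last m daily returns, assuming zero means."""
    x = np.diff(prices_x) / prices_x[:-1]   # daily returns x_i
    y = np.diff(prices_y) / prices_y[:-1]   # daily returns y_i
    x, y = x[-m:], y[-m:]                   # most recent m observations
    cov = np.mean(x * y)                    # (1/m) * sum of x_{n-i} * y_{n-i}
    var_x, var_y = np.mean(x**2), np.mean(y**2)
    return cov / np.sqrt(var_x * var_y)
```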

Using the EWMA approach, the covariance estimate for day $n$ is: $$\widehat{\mbox{cov}}_n = \lambda \widehat{\mbox{cov}}_{n-1} + (1-\lambda)x_{n-1} y_{n-1}$$ Using GARCH(1,1), it is: $$\widehat{\mbox{cov}}_n = \omega + \alpha x_{n-1}y_{n-1}+ \beta \widehat{\mbox{cov}}_{n-1}$$ where $\omega = \gamma \cdot \mbox{cov}_L$ and $\mbox{cov}_L$ is the long-run average covariance. Since $\alpha + \beta + \gamma =1$, $$\mbox{cov}_L = \frac{\omega}{1-\alpha -\beta}$$

The covariance from EWMA or GARCH can be converted to a correlation by dividing it by the product of the standard deviations of $X$ and $Y$ (with the variances estimated using the same approach as the covariance).
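
A sketch of one EWMA updating step for the covariance, both variances, and the implied correlation ($\lambda = 0.94$ is the RiskMetrics convention; the function itself is illustrative):

```python
import numpy as np

def ewma_step(cov, var_x, var_y, x, y, lam=0.94):
    """Update covariance and variances with the latest returns x, y; return new correlation."""
    cov   = lam * cov   + (1 - lam) * x * y
    var_x = lam * var_x + (1 - lam) * x**2
    var_y = lam * var_y + (1 - lam) * y**2
    rho = cov / np.sqrt(var_x * var_y)   # divide by product of standard deviations
    return cov, var_x, var_y, rho
```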

LOS 3. Apply the consistency condition to covariance.

Consider a set of market variables $X_1, X_2,\ldots,X_N$. One can define a variance-covariance matrix $\Sigma$ whose diagonal entries $\Sigma_{ii}$ carry the variances and whose off-diagonal entries $\Sigma_{ij}$ carry the covariances. The variance-covariance matrix can be obtained from the correlation matrix $\rho$ and the diagonal matrix of standard deviations $\sigma$ as:

$$\Sigma = \sigma^T\rho \sigma$$

The variance-covariance matrix should satisfy a consistency condition called positive semi-definiteness:

$$w^T \Sigma w \geq 0$$ for all $N \times 1$ vectors $w$.

To ensure that a positive semi-definite matrix is produced, covariances should be calculated using exactly the same data used for calculating variances, i.e. the same set of data points underlies every variance and covariance estimate. A numerical check is sketched below.
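
The condition is easy to verify numerically: a symmetric matrix is positive semi-definite exactly when all of its eigenvalues are non-negative. A sketch (the example correlation values are made up to show a violation):

```python
import numpy as np

def is_psd(matrix, tol=1e-10):
    """True if the symmetric matrix has no eigenvalue below -tol."""
    return bool(np.all(np.linalg.eigvalsh(matrix) >= -tol))

# Sigma = sigma * rho * sigma, with sigma the diagonal matrix of standard deviations.
sd  = np.diag([0.01, 0.02, 0.015])
rho = np.array([[1.0, 0.0, 0.9],
                [0.0, 1.0, 0.9],
                [0.9, 0.9, 1.0]])  # X1 and X2 each highly correlated with X3,
                                   # yet uncorrelated with each other: inconsistent
print(is_psd(sd @ rho @ sd))       # False: the consistency condition fails
```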

LOS 4. Describe the procedure of generating samples from a bivariate normal distribution.

A bivariate normal distribution, as the name suggests, involves two normally distributed variables, say $V_1$ and $V_2$, distributed as: $$V_1 \sim N(\mu_1, \sigma_1)\\ V_2 \sim N(\mu_2, \sigma_2)$$ Suppose we know the value of $V_1$. Conditional on this, the value of $V_2$ is normally distributed with mean: $$\mu_2 + \rho_{12}\sigma_2 \left(\frac{V_1-\mu_1}{\sigma_1}\right)$$ and standard deviation: $$\sigma_2 \sqrt{1-\rho^2_{12}}$$ We now look at how to generate pairs of random variables $\varepsilon_1$ and $\varepsilon_2$, both distributed as $N(0,1)$ and with correlation $\rho_{12}=\rho$. We follow a three-step procedure:

  1. obtain independent samples $z_1$ and $z_2$ from univariate normal distributions,
  2. the first required variable can be set directly i.e. $\varepsilon_1=z_1$,
  3. the second variable is set as:
$$\varepsilon_2 = \rho z_1+\sqrt{1-\rho^2}z_2$$ In matrix form, the above can be expressed as: $$ \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1-\rho^2} \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} $$
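
The $2 \times 2$ matrix above is the Cholesky factor of the correlation matrix. A minimal numpy sketch of the three steps:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.6, 100_000

z1 = rng.standard_normal(n)                 # step 1: independent N(0,1) samples
z2 = rng.standard_normal(n)
eps1 = z1                                   # step 2: eps_1 = z_1
eps2 = rho * z1 + np.sqrt(1 - rho**2) * z2  # step 3

print(np.corrcoef(eps1, eps2)[0, 1])        # approximately 0.6
```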

LOS 5. Describe properties of correlations between normally distributed variables when using a one-factor model.

Consider $N$ variables $X_1, X_2,\ldots,X_N$, each of which follows the standard normal distribution. In a one-factor model, each $X_i$ has a component dependent on a common factor $F$ and a component that is uncorrelated with the other variables: $$X_i = \alpha_i F + \sqrt{1-\alpha^2_i} Z_i$$ where $F$ and the $Z_i$ have standard normal distributions and $\alpha_i$ is a constant between $-1$ and $+1$. The $Z_i$ are uncorrelated with each other and with $F$. Each $X_i$ has a mean of zero and a variance of one.

All the correlation between $X_i$ and $X_j$ arises from their dependence on the common factor $F$. The coefficient of correlation between $X_i$ and $X_j$ is $\alpha_i \alpha_j$.

An advantage of such a one-factor model is that the resulting covariance matrix is always positive semi-definite. Without a factor model, the number of correlations that must be estimated for $N$ variables is $\frac{N(N-1)}{2}$; with the one-factor model, we need only estimate the $N$ parameters $\alpha_1, \alpha_2,\ldots,\alpha_N$. These are sufficient to recover all correlations: $$\rho_{ij} = \alpha_i \alpha_j$$ An example of a one-factor model is the Capital Asset Pricing Model (CAPM), where the factor is the excess market return.
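
A sketch showing how $N$ loadings generate the full correlation matrix, which is positive semi-definite by construction (the loading values are illustrative):

```python
import numpy as np

alpha = np.array([0.8, 0.5, -0.3, 0.6])   # N loadings instead of N(N-1)/2 correlations

rho = np.outer(alpha, alpha)              # rho_ij = alpha_i * alpha_j for i != j
np.fill_diagonal(rho, 1.0)                # each X_i has unit variance

print(np.all(np.linalg.eigvalsh(rho) >= -1e-10))  # True: PSD by construction
```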

LOS 6. Define copula and describe the key properties of copulas and copula correlation. Describe the Gaussian copula, Student’s t-copula, multivariate copula, and one-factor copula.

The marginal distribution (or density) of a variable describes that variable in isolation (without knowledge of the other variables) and is hence easier to model. It can be expressed as a probability density function $f(x)$, which describes the probability of observing a value in the vicinity of $x$, or as a cumulative distribution function $F(x) = \Pr(X \leq x)$.

If the variables are independent, the joint density is simply the product of the marginal densities of the individual variables. If the variables are not independent, their dependence structure is difficult to model unless the two variables are individually normally distributed; for such variables, dependence is modeled via correlation.

The dependence structure of multiple variables is modelled via a function called the copula, whose job is to attach or link marginal distributions into a joint distribution. The copula is a function (denoted by $c$) that takes as inputs the marginal CDFs of the variables (say, $F_X(x)$ and $F_Y(y)$), along with other parameters (collectively denoted as $\theta$): $c[F_X(x),F_Y(y);\theta].$

The explicit joining of marginal distributions is done via Sklar's theorem: $$f_{XY} (x,y) = f_X(x)\cdot f_Y(y)\cdot c[F_X(x),F_Y(y); \theta]$$ If $X$ and $Y$ are independent, Sklar's theorem tells us that $$c[F_X(x), F_Y(y); \theta]=1.$$

The copula function isolates all the information required to model the dependence between two variables and contains no information about the marginal distributions. This feature helps in cross-applying copulas, i.e. a copula of one family can be attached to marginal distributions of another family to create a joint distribution.

To create the joint distribution of $X$ and $Y$ we proceed as follows. On a percentile-by-percentile basis, we map the variables $X$ and $Y$ onto two new variables $U$ and $V$ which have known marginal distributions and a known dependence structure or copula (and hence a known joint distribution). We map any value $X = x$ to $U = u$ and $Y = y$ to $V = v$ such that the cumulative probabilities match: $$ F_X(x)= F_U (u) \\ F_Y(y) = F_V(v)$$ Corresponding to each $(x, y)$, we can find a mapped $(u, v)$ at the same percentiles: $$u=F^{-1}_{U}(F_X(x))\\ v = F^{-1}_{V}(F_Y(y))$$ If we are using the Gaussian copula, then $U$ and $V$ are bivariate normally distributed, with dependence structure given by the correlation $\rho$ (called the copula correlation). The cumulative joint probability $\Pr (X \leq x, Y \leq y)$ can then be determined from the known cumulative joint probability of $U$ and $V$, i.e. $\Pr(U \leq u, V \leq v; \rho)$.
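
A sketch of this mapping for the Gaussian copula, assuming scipy and (purely for illustration) exponential marginals for $X$ and $Y$:

```python
import numpy as np
from scipy import stats

rho = 0.5
F_X = stats.expon(scale=2.0).cdf    # assumed marginal CDF of X (illustrative)
F_Y = stats.expon(scale=1.0).cdf    # assumed marginal CDF of Y (illustrative)

x, y = 3.0, 1.5
u = stats.norm.ppf(F_X(x))          # percentile-to-percentile map onto U ~ N(0,1)
v = stats.norm.ppf(F_Y(y))          # percentile-to-percentile map onto V ~ N(0,1)

# Pr(X <= x, Y <= y) = Pr(U <= u, V <= v; rho) under the Gaussian copula.
joint = stats.multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]])
print(joint.cdf([u, v]))
```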

If the true relationship between variables is not linear, it can be accommodated by other copulas:

Student’s t-copula

Variables $U$ and $V$ are assumed to have a bivariate Student's t-distribution. To sample $t_1, t_2$ from a bivariate Student's t-distribution with $f$ degrees of freedom (dof) and correlation $\rho$:

  1. sample a chi-squared distributed variable $U_{\chi}$ with $f$ dof, using the inverse CDF applied to a uniform draw: $F^{-1}_{\chi^2}(\mathrm{rand}(), f)$.
  2. sample bivariate normal $z_1$ and $z_2$, with correlation $\rho$.
  3. finally, $t_1 = z_1 \cdot \sqrt{f/U_{\chi}}$ and $t_2 = z_2 \cdot \sqrt{f/U_{\chi}}$.
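
A numpy/scipy sketch of the three steps (parameter values are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
f, rho, n = 4, 0.5, 100_000

# Step 1: chi-squared variable with f dof via the inverse CDF of a uniform draw.
u_chi = stats.chi2.ppf(rng.uniform(size=n), df=f)

# Step 2: bivariate normal pairs with correlation rho.
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Step 3: scale both by the same chi-squared draw.
t1 = z1 * np.sqrt(f / u_chi)
t2 = z2 * np.sqrt(f / u_chi)
```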

Multivariate Copula

This copula is used to define a correlation structure between more than two variables $(X_1, X_2,\ldots,X_N)$. Each variable $X_i$ is transformed on a percentile-to-percentile basis into $U_i$, where $U_i$ has a standard normal distribution. Assuming the $U_i$ have a multivariate normal distribution with a given correlation matrix, we can compute the joint density of $X_1, X_2,\ldots,X_N$.
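
A sketch of sampling from a multivariate Gaussian copula with three variables (the correlation matrix and marginals are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
corr = np.array([[1.0, 0.4, 0.2],
                 [0.4, 1.0, 0.5],
                 [0.2, 0.5, 1.0]])

# Sample U_1..U_N from a multivariate normal with the given correlation matrix.
u = rng.multivariate_normal(mean=np.zeros(3), cov=corr, size=10_000)

# Map each U_i back to X_i percentile-by-percentile through assumed marginals.
marginals = [stats.expon(scale=1.0), stats.lognorm(s=0.5), stats.gamma(a=2.0)]
x = np.column_stack([m.ppf(stats.norm.cdf(u[:, i])) for i, m in enumerate(marginals)])
```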

One Factor Copula

A factor model can help simplify the correlation structure between $X_1, X_2,\ldots,X_N$. In LOS 5 we assumed $Z_i, F\sim N(0,1)$, but other choices are also possible. If the $Z_i$ are normal and $F$ has a Student's t-distribution, we obtain a multivariate Student's t-distribution for $X_1, X_2,\ldots,X_N$. So the distributional assumptions on $Z_i$ and $F$ affect the dependence between the $X_i$, and therefore between any other set of variables $U_i$ that are mapped to these $X_i$.
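
A sketch of generating the $X_i$ under a one-factor structure where $F$ is Student's t and the $Z_i$ are normal (the loading and dof values are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, N = 100_000, 5
alpha = np.full(N, 0.6)                          # common loading (illustrative)

F = stats.t.rvs(df=4, size=n, random_state=rng)  # heavy-tailed common factor
Z = rng.standard_normal((n, N))                  # normal idiosyncratic terms
X = alpha * F[:, None] + np.sqrt(1 - alpha**2) * Z

# Extreme draws of F drag all X_i into the tails together more often
# than under a purely Gaussian factor.
```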

LOS 7. Explain tail dependence.

Tail dependence refers to the conditional probability of an extreme move in one variable given an extreme move in another variable. There is greater tail dependence in a bivariate Student's t-distribution than in a bivariate normal distribution. This means that it is more common for two variables sampled from a bivariate Student's t-distribution to be in the tails of the distribution simultaneously.

Since correlations tend to increase during times of market stress, the Student's t-copula is a better choice than the Gaussian copula for modeling the joint behavior of two market variables.
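
A simulation sketch that makes the comparison concrete: count how often both variables land in their lower 1% tails under each distribution (same $\rho$; parameter values are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, rho, f = 1_000_000, 0.5, 4

# Bivariate normal pairs with correlation rho.
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Bivariate t pairs with the same rho: scale by a common chi-squared draw.
w = np.sqrt(f / stats.chi2.rvs(df=f, size=n, random_state=rng))
t1, t2 = z1 * w, z2 * w

q_n, q_t = stats.norm.ppf(0.01), stats.t.ppf(0.01, df=f)
print(np.mean((z1 < q_n) & (z2 < q_n)))   # joint 1% tail, normal: smaller
print(np.mean((t1 < q_t) & (t2 < q_t)))   # joint 1% tail, t: noticeably larger
```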
