## Rarely Thankful to Lady Luck

A bias refers to the tendency to hold a partial perspective while refusing alternative points of view. There are many forms of biases that can be attributed to various personal traits or explained within different contexts.

It is very common for people to show bias in everyday life. One of the most interesting forms is accrediting pure luck to skill based on benefiting outcomes. This bias mainly refers to the mistaken belief that if something appears more frequently than it normally does, credit should be given to a person’s skill as opposed to accepting it as a series of random events.

A few years ago, one of my professors at UCLA asked my class to perform an interesting experiment. All students would toss a coin and only the ones who would get heads would proceed to the next round. The winner is simply the person who manages to get only heads until everyone else loses.

As a reminder, the probability of getting heads using a fair coin is 1/2 per toss. That means that the probability of getting heads n times in a row is simply 1/2 ^ n. Of course, as n increases the total probability decreases.

Even though everyone would agree in the beginning of the experiment that getting heads when flipping a coin is a random event, as the rounds would go by and my classmates would go further their perspective was gradually changing and round by round less credit was given to luck and more credit was given to intense coin flipping skills!

This funny experiment clearly demonstrates how easy it is to confuse skill with luck when only getting exposure to a limited number of random trials or competing against a limited number of contestants. The same concept applies in many other activities in life including trading and investing that our blog is mainly focused on.

A lot of times, it is very hard to have a tracking record that is long enough and hence statistically significant in order to conclude if an investor’s generated returns are based on skill or are just the product of pure luck. As with most things in life, the truth could be somewhere in between and even though not generally much appreciated some merit should be given to luck. The question will always be though what portion was skill and what portion was luck. Or does it even matter at the end?

The truth is that we have a hard time to accept that many events in our lives are just random – especially the ones that benefit us – and are putting a lot of effort to explain them somehow and believe that there is something special about the circumstances or our case. After all, people are rarely thankful to lady luck.

## Poker Simulation – Probability of Hands

In this script, we are simulating poker hands and estimating the probability of each hand to come up. By increasing the number of simulations the execution gets slower but the estimates get more reliable. Finally, by removing or adding players the probabilities shift correspondingly.

```## Poker simulator
## 10,000 hands with 2 players, neither ever folds

# clear environment and console
rm(list = ls())
cat("\014")

# install and load required packages
# install.packages('holdem')
library(holdem)

# inputs
n = 10000

# initialize variables
no_pair = rep(0, n)
one_pair = rep(0, n)
two_pairs = rep(0, n)
three_of_a_kind = rep(0, n)
straight = rep(0,n)
flush = rep(0,n)
full_house = rep(0, n)
four_of_a_kind = rep(0, n)
straight_flush = rep(0, n)

# simulate hands and track hands
for(i in 1 : n)
{
x1 = deal1(2)
b1 = handeval(c(x1\$plnum1[1,], x1\$brdnum1), c(x1\$plsuit1[1,], x1\$brdsuit1))
b2 = handeval(c(x1\$plnum1[2,], x1\$brdnum1), c(x1\$plsuit1[2,], x1\$brdsuit1))
if(min(b1,b2) <= 999999) {no_pair[i] = 1}
if(min(b1,b2) >= 1000000 && min(b1,b2) < 2000000) {one_pair[i] = 1}
if(min(b1,b2) >= 2000000 && min(b1,b2) < 3000000) {two_pairs[i] = 1}
if(min(b1,b2) >= 3000000 && min(b1,b2) < 4000000) {three_of_a_kind[i] = 1}
if(min(b1,b2) >= 4000000 && min(b1,b2) < 5000000) {straight[i] = 1}
if(min(b1,b2) >= 5000000 && min(b1,b2) < 6000000) {flush[i] = 1}
if(min(b1,b2) >= 6000000 && min(b1,b2) < 7000000) {full_house[i] = 1}
if(min(b1,b2) >= 7000000 && min(b1,b2) < 8000000) {four_of_a_kind[i] = 1}
if(min(b1,b2) >= 8000000) {straight_flush[i] = 1}
}

# calculate probabilities per hand
pr1 = (sum(no_pair > .5) / n) * 100
pr2 = (sum(one_pair > .5) / n) * 100
pr3 = (sum(two_pairs > .5) / n) * 100
pr4 = (sum(three_of_a_kind > .5) / n) * 100
pr5 = (sum(straight > .5) / n) * 100
pr6 = (sum(flush > .5) / n) * 100
pr7 = (sum(full_house > .5) / n) * 100
pr8 = (sum(four_of_a_kind > .5) / n) * 100
pr9 = (sum(straight_flush > .5) / n) * 100

# print probabilities
paste0(pr1, '% probability of no pairs')
paste0(pr2, '% probability of one pair')
paste0(pr3, '% probability of two pairs')
paste0(pr4, '% probability of 3 of a kind')
paste0(pr5, '% probability of a straight')
paste0(pr6, '% probability of a flush')
paste0(pr7, '% probability of a full house')
paste0(pr8, '% probability of a 4 of a kind')
paste0(pr9, '% probability of a straight flush')
```

## Mean Reversion using Linear Regression Curves

We presented the concept of mean reversion in an earlier post, in this article we will present a tweaked version of mean reversion that utilizes Linear Regression Curves instead of moving averages.

Moving average is a lagging indicator so essentially the Linear Regression Curves can potentially provide a better fit of the available data. It is helpful to compare the results of both methods nonetheless – same applies to any tweaked versions tested, it would make sense to first compare to the base version.

Linear Regression Curve is essentially a bundle of numerous lines, but the extreme ends of the lines are hidden (upper and lower bounds), while the mid-point is only shown and is connected respectively to other mid-points across the time series.

Back-test implementation of mean reversion on S&P500 stocks using Linear Regression Curves:

• Estimate the Linear Regression Curve per look-back window
• In each day compare the closing price with the average value estimated;
• If the closing price is greater than the average + n * deviation go short  and close when you cross the mean or at a pre-determined time
• If the closing price is less than the average – n * deviation go long and  close when you cross the mean or at a pre-determined time

This is a slightly tweaked momentum strategy that involves the normilization of the stock returns using a risk metric. In this example, we will refer to the use of VIX levels for the risk adjustment of returns. Other risk metrics can be used as well such as Value at Risk (VaR). The concept is similar, we adjust the returns using a risk indicator before we sort the universe of stocks based on the returns.

VIX is the ticker of the CBOE Volatility Index, which is a popular measure of the implied volatility of S&P 500 stock options.

• Calculate daily returns of S&P500 stocks
• Calculate average returns per look-back window (e.g. 3-12 months)
• Roll the average returns calculations one day at a time
• Divide average returns by the VIX levels of each trading day
• Sort stocks by adjusted returns – select top n and bottom n stocks
• Track the returns of the n top and bottom performers for your holding period (e.g. 1-mo)
• Estimate your historical returns and volatility and compare them to a benchmark – in this case S&P500 index.

Dividing returns by the VIX levels might not produce meaningful values on an absolute scale but on a relative scale and in order to order the stocks, it provides useful information. More specifically, it downgrades the impact of high returns on very volatile days and upgrades high returns on trading days with less volatility. The concept is similar if using any other risk indicators as mentioned earlier.

## Multivariable Optimization – R code

This is simple R code on optimizing a function with two variables and a lower and upper bound for each one of them.

```## Optimization - Minimum of function

# clear environment and console
rm(list = ls())
cat("\014")

# inputs
x = rnorm(1:100)
y = rnorm(1:100) * 100

fun = function(z, x, y)
{
sum(x * z[1] - (y ^ 2) * z[2])
}

# estimate optimal values for z[1] and z[2]
opt = optim(par = c(1, 1), fun, x = x, y = y, lower = c(-1, -1), upper = c(10, 10), method="L-BFGS-B")
opt\$par
opt\$value
```

## Bayesian Regression Analysis – R Code

This is an R code on regression analysis: a Bayesian regression model is being implemented. The time series are randomly created so no meaningful results can be expected. Moreover, non-informative priors are utilized. Using proper priors the estimates would differ (the estimated slope is the mean of the OLS estimate and the information derived by the priors).

```## Bayesian linear regression

# clear environment and console
rm(list = ls())
cat("\014")

install.packages("BAS")
library("BAS")

# create random series for the independent variable (var1) and dependent
# variables (var2 var3)
v1 = data.frame(var1 = runif(100, 20, 120))
v2 = data.frame(var2 = runif(100, 20, 120))
v3 = data.frame(var3 = runif(100, 20, 120))

# concatenate the time series
vv = c(v1, v2, v3)

# run a Bayesian regression
reg1 = bas.lm(var1 ~ ., data = vv,
prior = 'BIC', modelprior = uniform())
reg1

# run a linear regression
reg2 = lm(var1 ~ ., data = vv)
summary(reg2)
```

## Momentum Strategy using PCA

Create a matrix that includes an index and the constituents: for example S&P500 and its constituents daily returns for the past 5 years.
A typical momentum strategy can be implemented using the following tweaks:
• Estimate the principle components (in R use library stats and function princomp).
• Regress each stock against the n most significant principal components.
• Use the residuals to get a z-score or any other normalize metric for each stock.
• Set a threshold for the z-score to select stocks to trade (trading trigger).
• Select the weights based on z-score.
This is a relatively complex strategy since it involves the implementation of principal components analysis, so we are going to present a few frequent questions below:
1. Do you include SPX and 500 constituents and run a PCA or is SPX supposed to be excluded?
2. Is the PCA applied on returns or prices?
3. Then we regress the returns of each stock for our lookback period againsts the n principal components. How do we specify what is the cutpoint and how many principal components to include?
1) SPX should be included.
2) Returns are more meaningful since they’re closer to be stationary than levels.
3) Plotting the cumulative sum of eigenvalues can actually be very useful. The plot will provide a good indication of when most variation is explained.

## Principal Component Analysis – R Code

This is an implementation of the principal component analysis (“PCA”) method on stock prices log returns in R.

```# Principal Component Analysis

# clear environment and console
rm(list = ls())
cat("\014")

# install and load required packages
# install.packages("stats")
library("stats")

data(EuStockMarkets)

# calculate log returns
log_prices <- log(EuStockMarkets[, 1:4])
ret = data.frame(diff(as.matrix(log_prices,lag=1,difference=1)))

# implement PCA on log returns of index levels
pca <- princomp(ret)

# plot method & print summary
plot(prcomp(ret))
biplot(prcomp(ret, scale = TRUE))
summary(prcomp(ret, scale = TRUE))

# view the scores for each date and the loadings for the components
pca\$scores
```

## Pricing Structured Products – Matlab Code

In this Matlab code we implement a prepayment modeling method and eventually price Asset Backed Securities (ABS).

```
function price = MBS(k, theta)

%1a
P = 100000;
WACC = 0.08;
r = WACC / 12;
n = 360;
SY = [0.94 0.76 0.74 0.95 0.98 0.92 0.98 1.10 1.18 1.22 1.23 0.98];
SY = repmat(SY, 1, n);
WACC = repmat(WACC, n, 1)';
r10 = cir_rates(360, 120, 0.078, k, theta, 0.09);

RI = zeros(n + 1, 1); BU = zeros(n + 1, 1); SG = zeros(n + 1, 1);
CPR = zeros(n + 1, 1); w = zeros(n, 1); PV = zeros(n, 1);
c = zeros(n, 1); TPP = zeros(n + 1, 1);
PV(1) = P;

for i = 1 : n
w(i) = - 8.57 + 430 * (WACC(i) - r10(i));
RI(i + 1) = 0.28 + 0.14 * atan(w(i));
BU(i + 1) = 0.3 + 0.7 * PV(i) / PV(1);
SG(i + 1) = min(1, (i + 1) / 30);
CPR(i + 1)= RI(i + 1) * BU(i + 1) * SG(i + 1) * SY(i + 1);
TPP(i + 1) = PV(i) * r * (1 / (1 - (1 + r) ^ (- n + (i - 1))) - 1) + ...
(PV(i) - PV(i) * r * (1 / (1 - (1 + r) ^ (- n + (i - 1))) - 1)) * ...
(1 - (1 - CPR(i + 1)) ^ (1 / 12));
PV(i + 1) = PV(i) - TPP(i + 1);
c(i + 1) = PV(i) * r / (1 - (1 + r) ^ (- n + (i - 1))) + (PV(i) - ...
PV(i) * r * (1 / (1 - (1 + r) ^ (- n + (i - 1))) - 1)) * ...
(1 - (1 - CPR(i + 1)) ^ (1 / 12));
end

l = c(2 : 361).* cir_exp(360, 12, 0.078, k, theta, 0.09)';
price = mean(sum(l));

end

function r2 = cir_rates(N, S, r0, k, theta, sigma)

dt = 1 / S;
rep = r0 * ones(N, S);
z = sqrt(dt) * randn(N, S);

for i = 1 : S
for j = 1 : N
rep(j, i + 1) = rep(j, i) + dt * k * (theta - rep(j, i)) + sigma * sqrt(rep(j, i))
* z(j, i) ;
end
end

w = zeros(N, 1);
for j = 1 : N
w(j) = mean(rep(j, :));
end

r2 = w;
end

function r2 = cir_exp(N, S, r0, k, theta, sigma)

dt = 1 / S;
rep = r0 * ones(N, S);
z = sqrt(dt) * randn(N, S);

for i = 1 : S
for j = 1 : N
rep(j, i + 1) = rep(j, i) + dt * k * (theta - rep(j, i)) + sigma * ...
sqrt(rep(j, i)) * z(j, i) ;
end
end

Rf = zeros(1, N);
for i = 1 : N
Rf(i) = mean(rep(i, :)) * i * dt;
end

r2 = exp(- Rf);

end

```

## CIR Model – Matlab

This is the implementation of the CIR model in Matlab (interest rate modeling).

```function r2 = cir3(M, S)
% dr_t = k (rt - r_t) dt + sigma dW_t

M = 10000; S = 180;
N = 365;
t = N / 365;
FV = 1000;
K = 950;
k = 0.92;
dt = 1 / S;
theta = 0.055;
sigma = 0.12 / 365;
r0 = 0.05;
T = S / 365 ;

rep = r0 * ones(M + 1, S+1);
z = sqrt(dt) * randn(M + 1, S+1);

for i = 1 : S - N
for j = 1 : M + 1
rep(j, i + 1) = rep(j, i) + dt * k * (theta - rep(j, i)) + sigma * sqrt(rep(j, i)) * z(j, i) ;
end
end

Rf = zeros(1, M + 1);
for i = 1 : M + 1
Rf(i) = dt * sum(rep(i, :));
Rf(i) = exp(-Rf(i) * T);
end

r = mean(rep);
V = FV * mean(Rf);
r2 = mean(max(0, exp(r * t) * (V - K)));
end

```