Exchange-Traded Funds List

An exchange-traded fund (“ETF”) is a marketable security that usually tracks an index or a basket of assets. Because it trades like a regular stock and typically has higher liquidity and lower fees than mutual funds, it is a very popular investment option. has a great list of the available ETFs in the market, grouped by 1) asset class (e.g. commodities, fixed-income, etc.), 2) investment style (e.g. long-short, small cap, etc.), 3) sector (REIT, utilities, etc.), 4) region, and many other categories. It also provides a list of the ETFs that track main indices (e.g. S&P 500, Russell 3000, etc.).

This is definitely a list worth checking out:


Geometric vs. Arithmetic Mean

The geometric mean is a type of mean that indicates the central tendency of a set of numbers by using their product, whereas the arithmetic mean uses their sum.

For n numbers, the geometric mean is equal to (x1 * x2 * … * xn) ^ (1/n), whereas the arithmetic mean is equal to (x1 + x2 + … + xn) / n.

Investment managers often use arithmetic means when presenting performance metrics, a practice that has been debated over time. The reason is that the arithmetic mean downplays the impact of a bad year on the total investment and does not accurately reflect the growth of the investment from inception through the measurement period.

For example, suppose someone invested $1 in a fund at its inception (let’s assume 5 years ago) and the fund had the following annual returns:


The arithmetic mean implies a positive return whereas the initial investment has actually been wiped out.

Of course, other measures such as the Sharpe ratio would have indicated the bad performance, and the volatility of the returns would have been very high. Still, this extreme example highlights the fact that the arithmetic mean can sometimes be fairly deceiving.
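The effect can be reproduced with a short Python sketch. The returns below are hypothetical numbers chosen for illustration (four strong years followed by a total loss), not the figures from the original example. For returns, the geometric mean is applied to the growth factors 1 + r rather than to the returns themselves.

```python
import numpy as np

# Hypothetical annual returns (assumed for illustration only):
# four years of +50% followed by a total loss in year five.
returns = np.array([0.50, 0.50, 0.50, 0.50, -1.00])

# Arithmetic mean of the annual returns.
arithmetic_mean = returns.mean()

# Geometric mean return: compound the growth factors (1 + r),
# take the n-th root and subtract 1.
growth_factors = 1.0 + returns
final_value = 1.0 * growth_factors.prod()  # value of $1 invested at inception
geometric_mean = growth_factors.prod() ** (1.0 / len(returns)) - 1.0

print(f"arithmetic mean return: {arithmetic_mean:.2%}")
print(f"geometric mean return:  {geometric_mean:.2%}")
print(f"final value of $1:      ${final_value:.2f}")
```

With these assumed inputs the arithmetic mean is positive while the compounded value of the investment is zero, which is the point the example is making.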


Transaction Cost Model

In order to calculate the realized returns, the total cost needs to be applied first. It consists of the broker’s fees and the execution cost, which depends on the liquidity of the stock.

The broker functions as a transaction facilitator between the two parties that are willing to buy and sell a number of shares. The broker practically routes your order into an electronic network and charges a fee for this. In fact, a commission fee is charged both when buying and selling for each party.

Often the broker charges a fixed fee for each stock traded regardless of the number of shares, but in other cases the commission is a percentage of the trade value, subject to a dollar cap.

Assuming fixed fees and a trading strategy that requires trading 10 stocks, our cost will be the following:

  • Buy order: 10 x fixed fee
  • Sell order: 10 x fixed fee

As a result, the total cost is 20 x fixed fee and the hurdle rate is 20 x fixed fee / invested capital. For a fixed fee of $5 and invested capital of $10,000, the total cost is $100 and the hurdle rate is 1%.
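The arithmetic above can be written as a small helper. This is a minimal sketch; the function name and parameters are ours, and it assumes one buy order and one sell order per stock, each charged the same fixed fee.

```python
def fixed_fee_hurdle_rate(n_stocks, fixed_fee, invested_capital):
    """Round-trip cost and hurdle rate under a fixed fee per stock per order."""
    # One buy order and one sell order for each stock.
    total_cost = 2 * n_stocks * fixed_fee
    # The hurdle rate is the return needed just to cover the fees.
    hurdle_rate = total_cost / invested_capital
    return total_cost, hurdle_rate


total_cost, hurdle_rate = fixed_fee_hurdle_rate(
    n_stocks=10, fixed_fee=5.0, invested_capital=10_000.0
)
print(f"total cost: ${total_cost:.2f}, hurdle rate: {hurdle_rate:.2%}")
```

The strategy has to earn more than the hurdle rate before it produces any net profit.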

Fortunately, individual investors usually do not have to worry about the execution cost based on the stock’s liquidity. S&P 500 stocks are relatively liquid and the orders are too small to have any impact on the price. Brokerage firms also provide limit order capabilities that can ensure a specific price.

For large trades, you essentially get a volume-weighted average price (“VWAP”) since multiple blocks are used to fill the order. Also, the order itself applies pressure on the price, and the resulting slippage creates a spiral effect on the effective price.

In conclusion, individual investors can focus on the broker’s fees to estimate their total cost. In general, though, and for the statistical arbitrage strategies presented in this blog, a transaction cost model of 1 basis point + 1/2 bid-ask spread will typically be used for back-testing purposes.
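A minimal sketch of this cost model in Python follows. It assumes the cost is expressed as a fraction of the trade value and that the spread is measured relative to the mid price; both are our assumptions, as are the function name and the quotes used in the example.

```python
def one_way_cost_fraction(bid, ask, fee_bp=1.0):
    """One-way transaction cost as a fraction of trade value.

    Assumed model: fee in basis points plus half the bid-ask spread,
    with the spread expressed relative to the mid price.
    """
    mid = 0.5 * (bid + ask)
    half_spread = 0.5 * (ask - bid) / mid
    return fee_bp * 1e-4 + half_spread


# Hypothetical quotes: bid 99.98, ask 100.02 (a 4 bp spread around a mid of 100).
cost = one_way_cost_fraction(bid=99.98, ask=100.02)
round_trip_cost = 2 * cost  # the cost is paid on entry and again on exit
print(f"one-way cost: {cost * 1e4:.2f} bp, round trip: {round_trip_cost * 1e4:.2f} bp")
```

In a back-test, the round-trip figure would be subtracted from the gross return of each completed trade.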

What is alpha?

The first letter of the Greek alphabet is used in finance as a measure of the return on an investment after comparing it to an appropriate market index that functions as a benchmark.

Alpha is a function of time and is measured over a specific period. For example if a fund has an alpha of x% over last year, it means that it exceeded the benchmark during that year by x%.

A typical benchmark for equities investments in the U.S. is the S&P 500 whereas equity investments in international markets can be mapped onto an appropriate index – typically a prevailing ETF.

The realized returns can also be compared to the theoretical expected returns by utilizing an asset pricing model (e.g. Fama-French model or CAPM) – this alpha is called Jensen’s alpha. Moreover, a regression analysis of the actual returns against the Fama-French returns for instance can provide useful information on the persistence of alpha and the fraction that is not explained by any of the factors.

The realized returns, which are calculated after applying trading and execution fees, are compared to the benchmark in order to conclude whether the strategy over- or underperformed and to put the results on a relative scale. An extreme example: in a year when the S&P 500 had a negative return of y%, by not investing at all you essentially beat the market by |y|%.
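Both notions of alpha can be sketched in a few lines of Python. The function names and all input values are hypothetical. The second function uses CAPM as the asset pricing model, which is one of the options mentioned above.

```python
def excess_return(strategy_return, benchmark_return):
    """Alpha measured as the return in excess of a benchmark."""
    return strategy_return - benchmark_return


def jensens_alpha(portfolio_return, risk_free_rate, beta, market_return):
    """Jensen's alpha under CAPM: realized return minus the CAPM expected return."""
    expected_return = risk_free_rate + beta * (market_return - risk_free_rate)
    return portfolio_return - expected_return


# Staying in cash (0% return) in a year when the benchmark lost 10%.
cash_alpha = excess_return(strategy_return=0.0, benchmark_return=-0.10)

# Hypothetical fund: 12% realized return, beta of 1.2, market at 8%, risk-free at 2%.
fund_alpha = jensens_alpha(
    portfolio_return=0.12, risk_free_rate=0.02, beta=1.2, market_return=0.08
)
print(f"alpha from staying in cash: {cash_alpha:.2%}")
print(f"Jensen's alpha of the fund: {fund_alpha:.2%}")
```

The returns passed in should already be net of trading and execution fees.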

MC Simulation – Python Code

This is simple Python code for pricing European call and put options using Monte Carlo simulation. According to the law of large numbers, as the number of simulations increases, the actual ratio will converge to the expected ratio of outcomes. In this case, for a large number of simulations the resulting prices will converge to the values generated by the Black-Scholes closed-form solution.

## European call / put option

import numpy as np
#import pandas as pd

# inputs
S0 = 10 # stock price
K = 10 # strike price
sigma = 0.2 # volatility
t = 0.5 # number of years
r = 0.02 # risk-free rate
n = 1000 # number of simulations
Dt = 1/252 # frequency; 1/252 is daily
h = int(round(t / Dt)) # number of simulated periods (rounded to avoid floating-point truncation)

# simulate stock prices using MC (geometric Brownian motion)
rnd = np.random.normal(0, 1, (n, h)) # one standard normal shock per path and period

S = np.zeros((n, h + 1)) # row = simulated path, column = time step
S[:, 0] = S0
for j in range(1, h + 1):
    # advance all paths by one period
    S[:, j] = S[:, j - 1] * np.exp((r - 0.5 * sigma ** 2) * Dt + sigma * rnd[:, j - 1] * np.sqrt(Dt))

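# calculate call price (discounted average payoff at maturity, column h)
c = np.exp(-r * t) * np.mean(np.maximum(0, S[:, h] - K))

# calculate put price (discounted average payoff at maturity, column h)
p = np.exp(-r * t) * np.mean(np.maximum(0, K - S[:, h]))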
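The convergence claim can be checked with a compact sketch. It is not the script above: it simulates only the terminal price in a single step (which is exact for geometric Brownian motion), discounts the average payoff at the risk-free rate, and compares it with the Black-Scholes call price for a non-dividend-paying stock. The function names and the seed are our own choices.

```python
import math

import numpy as np


def mc_call_price(S0, K, sigma, t, r, n, seed=0):
    """Monte Carlo price of a European call, simulating the terminal price only."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    ST = S0 * np.exp((r - 0.5 * sigma ** 2) * t + sigma * np.sqrt(t) * z)
    # Discount the average payoff back to today.
    return float(np.exp(-r * t) * np.mean(np.maximum(ST - K, 0.0)))


def norm_cdf(x):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(S0, K, sigma, t, r):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return S0 * norm_cdf(d1) - K * math.exp(-r * t) * norm_cdf(d2)


bs = bs_call_price(10, 10, 0.2, 0.5, 0.02)
for n_paths in (1_000, 10_000, 200_000):
    mc = mc_call_price(10, 10, 0.2, 0.5, 0.02, n_paths)
    print(f"n = {n_paths:>7}: MC = {mc:.4f}, Black-Scholes = {bs:.4f}")
```

The gap between the two prices shrinks roughly with the square root of the number of paths, so a hundredfold increase in simulations buys about one extra digit of accuracy.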

Black-Scholes – Python Code

This is simple Python code for calculating the price of a European call or put option using the Black-Scholes-Merton closed-form solution. The inputs can be easily modified in the corresponding block (initial stock price, strike price, annual volatility, time to maturity in years, risk-free rate and dividend yield).

## European call / put option

import numpy as np
from scipy.stats import norm

# inputs
S = 10 # stock price
K = 10 # strike price
sigma = 0.2 # volatility
t = 1.5 # number of years
r = 0.02 # risk-free rate
d = 0 # expected dividend yield

# calculate d1 & d2
d1 = (np.log(S/K) + (r - d + 0.5*sigma ** 2)*t) / (sigma * np.sqrt(t))
d2 = d1 - sigma * np.sqrt(t)

# calculate call price
c = S * norm.cdf(d1) * np.exp(-d * t) - K * norm.cdf(d2) * np.exp(-r * t)

# calculate put price
p = K * norm.cdf(-d2) * np.exp(-r * t) - S * norm.cdf(-d1) * np.exp(-d * t)
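A quick sanity check for the formulas above is put-call parity, c - p = S * exp(-d * t) - K * exp(-r * t). The sketch below wraps the same formulas in a function (the function names are ours) and uses the error function from the standard library in place of scipy, so it runs without extra packages.

```python
import math


def norm_cdf(x):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_prices(S, K, sigma, t, r, d=0.0):
    """Black-Scholes-Merton call and put prices with a continuous dividend yield d."""
    d1 = (math.log(S / K) + (r - d + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    call = S * math.exp(-d * t) * norm_cdf(d1) - K * math.exp(-r * t) * norm_cdf(d2)
    put = K * math.exp(-r * t) * norm_cdf(-d2) - S * math.exp(-d * t) * norm_cdf(-d1)
    return call, put


# Same inputs as the script above.
S, K, sigma, t, r, d = 10, 10, 0.2, 1.5, 0.02, 0.0
call, put = bsm_prices(S, K, sigma, t, r, d)

# Put-call parity: the gap should be zero up to floating-point error.
parity_gap = (call - put) - (S * math.exp(-d * t) - K * math.exp(-r * t))
print(f"call = {call:.4f}, put = {put:.4f}, parity gap = {parity_gap:.2e}")
```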

Trading Strategies 101

Generating alpha is only one out of the three legs required for a complete trading strategy. The alphas generated need to be blended with the risk model and the transaction cost estimates in the optimizer in order to come up with the corresponding orders that need to be placed. An illustration of this process is presented below:


  • For transaction costs, typically investors perform the following calculation:

Transaction costs = Trade Fees + Bid-Ask Spread Cost = x bp + Bid-Ask Spread / 2

where x is the trade fees imposed by the broker

  • The risk model refers to the variance-covariance matrix of the stock returns
  • Alphas refer to the blend of a number of standardized alphas, e.g. momentum variations, mean reversion variations, pairs trading, etc.

The optimizer then performs the following operations (multi-objective optimization):

  • Maximize alpha exposure
  • Minimize transaction costs
  • Minimize risk

Other objectives and constraints are typically included: neutralizing exposure to beta (for market neutral strategies) and setting caps for trade size, position size, industry exposure, country exposure, etc. We are planning to dig further into the optimization process in future posts.
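One common way to combine the three objectives is to collapse them into a single score with penalty weights. The sketch below only evaluates such a score for candidate portfolios; it is not an optimizer, and the functional form, the penalty parameters and all numbers are our assumptions rather than the setup used on this blog.

```python
import numpy as np


def strategy_objective(w, w_prev, alpha, cov, risk_aversion, cost_rate):
    """Alpha exposure minus a risk penalty minus a transaction cost penalty.

    w, w_prev : target and current portfolio weights
    alpha     : blended alpha score per stock
    cov       : variance-covariance matrix of stock returns (the risk model)
    cost_rate : assumed cost per unit of turnover
    """
    w = np.asarray(w, dtype=float)
    w_prev = np.asarray(w_prev, dtype=float)
    alpha_exposure = np.asarray(alpha, dtype=float) @ w
    risk = w @ np.asarray(cov, dtype=float) @ w
    turnover = np.abs(w - w_prev).sum()
    return float(alpha_exposure - risk_aversion * risk - cost_rate * turnover)


# Hypothetical two-stock universe, starting from an empty portfolio.
alpha = [0.02, 0.01]
cov = [[0.04, 0.00], [0.00, 0.01]]
w_prev = [0.0, 0.0]

balanced = strategy_objective([0.5, 0.5], w_prev, alpha, cov, risk_aversion=1.0, cost_rate=0.001)
concentrated = strategy_objective([1.0, 0.0], w_prev, alpha, cov, risk_aversion=1.0, cost_rate=0.001)
print(f"balanced: {balanced:.4f}, concentrated: {concentrated:.4f}")
```

With these inputs the balanced portfolio scores higher, because the extra alpha of the concentrated one does not compensate for its additional variance.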

Asset Allocation – Weighting Schemes

This is an overview of some of the most common weighting schemes used in asset allocation. The same concepts apply in the implementation of both statistical arbitrage and portfolio management strategies.

  • Equal Weights: This is a very straightforward method. All equities (or assets in general) selected through the corresponding strategy are weighted equally. The weight for each one of the equities is equal to 1 / n where n is the number of equities. For example, if the strategy involves trading 100 stocks then each weight is 1 / 100.
  • Volatility Weights: This technique aims to reduce the overall risk of the portfolio by assigning larger weights to less volatile stocks and smaller weights to the more volatile ones. A time series of historical daily volatility is required for each constituent, whereas the covariance is ignored. For example, for n stocks with volatilities sigma(i) with i from 1 to n, the weights are calculated as follows:

m = 1 / Σ(1/sigma(i))

and w(i) = m / sigma(i) where w(i) is the weight assigned to stock i (i from 1 to n)

  • Value Weights: This is again a relatively straightforward method. The weights are equal to the ratio of the market cap of each constituent over the total cap of the portfolio.
  • Alternative methods: We are going to discuss more complex ways of assigning weights, like using momentum scores, first principal components, etc., in a future post. All methods refer to the estimation of a score for each constituent that is then used to calculate the respective weight.
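The first three schemes can be sketched in Python as follows. The function names and the sample volatilities and market caps are hypothetical. The volatility scheme follows the formulas above, with m = 1 / Σ(1/sigma(i)) and w(i) = m / sigma(i).

```python
import numpy as np


def equal_weights(n):
    """Each of the n constituents gets a weight of 1 / n."""
    return np.full(n, 1.0 / n)


def volatility_weights(sigma):
    """Inverse-volatility weights: w(i) = m / sigma(i), m = 1 / sum(1 / sigma(i))."""
    inverse_vol = 1.0 / np.asarray(sigma, dtype=float)
    return inverse_vol / inverse_vol.sum()


def value_weights(market_caps):
    """Market cap of each constituent over the total cap of the portfolio."""
    caps = np.asarray(market_caps, dtype=float)
    return caps / caps.sum()


# Hypothetical inputs for three stocks.
w_equal = equal_weights(3)
w_vol = volatility_weights([0.10, 0.20, 0.40])
w_value = value_weights([300.0, 100.0, 100.0])

print("equal:     ", np.round(w_equal, 4))
print("volatility:", np.round(w_vol, 4))
print("value:     ", np.round(w_value, 4))
```

Each scheme returns weights that sum to 1, so they can be swapped in and out without changing the rest of the allocation logic.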

Capital constraints as well as other constraints (industry exposure, country exposure, etc.) are also taken into account in the optimizer when incorporating each one of the above-mentioned methods.

Regarding the capital that is planned to be invested, the following constraint captures the mechanics:

  • Σ(round(w(i) * P / Ask Price(i)) * Ask Price(i) + trading fee(i)) <= P

with i from 1 to n, P the capital to be invested, Ask Price(i) the ask price for equity i as provided by the broker and trading fee(i) the trading fee for equity i.

Finally, the round function is used because the number of shares purchased must be an integer (you cannot purchase fractions of a share).
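The constraint can be checked with a short Python sketch. The ask prices, the fee and the function name are hypothetical, and the example uses equal weights across 10 stocks with P = $10,000.

```python
import numpy as np


def allocate_shares(weights, capital, ask_prices, trading_fees):
    """Round each target position to whole shares and check the capital constraint."""
    weights = np.asarray(weights, dtype=float)
    ask_prices = np.asarray(ask_prices, dtype=float)
    trading_fees = np.asarray(trading_fees, dtype=float)

    # Number of shares per stock: round(w(i) * P / Ask Price(i)).
    shares = np.round(weights * capital / ask_prices)
    # Total outlay: cost of the shares plus the trading fee for each stock.
    total_outlay = float(np.sum(shares * ask_prices + trading_fees))
    return shares.astype(int), total_outlay, total_outlay <= capital


# Hypothetical ask prices for 10 stocks and a $5 fee per stock.
ask_prices = [23.0, 41.0, 57.0, 78.0, 104.0, 131.0, 187.0, 263.0, 19.0, 12.0]
weights = np.full(10, 1.0 / 10)
fees = np.full(10, 5.0)

shares, total_outlay, feasible = allocate_shares(weights, 10_000.0, ask_prices, fees)
print("shares:", shares.tolist())
print(f"total outlay: ${total_outlay:,.2f}, within capital: {feasible}")
```

With these assumed prices, rounding up several positions and adding the fees pushes the total outlay above P, so the constraint is violated. That is why the check has to run on the rounded quantities and not on the raw weights.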

A simple example is illustrated below for 10 stocks with equal weighting:


3d Plots – R Code

In this script, we create a 3d plot in R using the plot3D package. In this example, a sphere is generated but multiple 3d shapes can be created by modifying the formulas for x,y,z.

# 3d plots

# clear environment and console
rm(list = ls())

# install and load required packages
# install.packages("plot3D") # run once if the package is not installed
library(plot3D)

# initiate Variables
theta = seq(from = 0, to = 2 * pi, length.out = 500)
phi = seq(from = 0, to = pi, length.out = 500)
rho = 3 # radius of the sphere (arbitrary value)

M = mesh(theta, phi)
alpha = M$x
beta = M$y

# plot sphere
surf3D(x = rho * cos(alpha) * sin(beta), y = rho * sin(alpha) * sin(beta), z = rho * cos(beta), colkey=FALSE, bty="b2", main="Sphere")

This is the output for 500 points generated for each variable:


Correlated Random Numbers – R Code

This is simple R code for generating correlated random numbers using the MASS package and its built-in function mvrnorm. In the example below, 3 correlated vectors of values are generated with pairwise correlations of 0.5, 0.6 and 0.7.

## Generate Correlated Random Numbers

# clear environment and console
rm(list = ls())

# install and load required packages
# install.packages("MASS") # run once if the package is not installed
library(MASS)

# inputs
n = 1000 # number of random numbers
correl = matrix(c(1, 0.5, 0.6, 0.5, 1, 0.7, 0.6, 0.7, 1), ncol = 3)
mu = c(0, 0, 0) # mean of each Gaussian vector

num = mvrnorm(n, mu, correl, empirical = TRUE)
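For readers working in Python, the same idea can be sketched with NumPy using a Cholesky factorization of the correlation matrix. This is an analogue and not a port of mvrnorm: there is no counterpart of empirical = TRUE here, so the sample correlations only approximate the targets. The seed and the sample size are arbitrary choices.

```python
import numpy as np

# Target correlation matrix (same values as in the R example).
correl = np.array([
    [1.0, 0.5, 0.6],
    [0.5, 1.0, 0.7],
    [0.6, 0.7, 1.0],
])

n = 100_000  # number of random draws per vector
rng = np.random.default_rng(42)

# Cholesky factor L satisfies L @ L.T == correl.
L = np.linalg.cholesky(correl)

# Independent standard normals, one column per vector.
z = rng.standard_normal((n, 3))

# Mixing the columns with L induces the target correlations.
num = z @ L.T

sample_correl = np.corrcoef(num, rowvar=False)
print(np.round(sample_correl, 3))
```

The printed matrix should be close to the target, with the gap shrinking as n grows.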