Tail options pricing under power law (with code)

9 min read · Aug 15, 2024

The Black-Scholes model assumes normally distributed returns and a stable variance (that is, the 2nd and higher moments exist). However, asset returns are fat-tailed rather than normal. What is even more important for options: squared returns (the input to any variance calculation) are even more fat-tailed, and so are their convexities.

Mandelbrot proposed that asset returns follow Lévy alpha-stable (stable Paretian) distributions. Their tail exponent is below 2, which means that the variance is infinite and so ceases to be informative. One can argue whether this is true, but very often assets do not have a finite kurtosis (alpha < 4 means infinite kurtosis). Kurtosis can be interpreted as a stability metric for the variance, so its absence supports the notion of stochastic volatility. Infinite kurtosis means that you just can’t assume a Gaussian, and convergence under the central limit theorem (CLT) is extremely slow. It means we can’t say our data “does something asymptotically,” and statistics relying on the normal distribution become increasingly erroneous.

The presented method is interesting because it works even with infinite moments and can be used to assess tail options, which are very often considered to be mispriced. We will investigate whether that is the case.

So how can one price and extrapolate options, especially deep OTM ones, without creating arbitrage opportunities? In this post, I will show how you can do this and provide the code. I will replicate the paper by Nassim Nicholas Taleb and his co-authors from Universa. The paper is called “Tail Option Pricing Under Power Laws,” which you can find here. The implementation is in Python, with options data from 31 December 2018 (as in the paper).

About the model

What can you do with the presented model? You can test how volatility surfaces extrapolated with a thin-tailed distribution calibrated to the center of the distribution compare to the power law one. You often hear that far OTM options are overvalued. That could be true if we lived in a Gaussian world, but asset prices move much more violently. The interesting part happens in the tails, as these rare events dominate the statistical properties of the whole distribution. What happens in the body of the distribution does not matter much, i.e., you don’t care about +/- 1% moves; you care about +/- 10% (or more) ones. Options are insurance instruments. You don’t insure something that doesn’t have extreme consequences for you. In the same manner, you don’t buy 5% OTM puts, as there is no ruin problem. You can be harmed if the assets fall 50%, and this is what options are primarily for.

Stylized facts about Power Law

Power law distributions are very common in the world around us. Earthquake intensity and other natural phenomena follow power laws. In this class of distributions, the statistical properties of your data are dominated by single events. It implies frequent jumps or outliers (which in fact are extremes).

The problem with the normal distribution is that its tails decay very quickly (faster than exponentially). It can be assumed that the tails start at about +/- 2.13 sigma, but in reality, due to underestimation in the sample, we can say that it is about +/- 3 sigma (see Statistical Consequences of Fat Tails). This means that the normal distribution grossly underestimates rare events.
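
To put rough numbers on that underestimation, here is a minimal sketch comparing the tail probability of a standard normal with that of a Student's t with 3 degrees of freedom, measured at the same number of standard deviations. The function names are illustrative, and the choice of t(3) is an assumption made only because its CDF has a simple closed form.

```python
import math

def normal_survival(k):
    """P(Z > k) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def student_t3_survival(x):
    """P(T > x) for Student's t with 3 degrees of freedom (closed-form CDF)."""
    u = x / math.sqrt(3.0)
    cdf = 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi
    return 1.0 - cdf

T3_STD = math.sqrt(3.0)  # standard deviation of Student's t with df = 3

for k in (2, 3, 5, 10):
    p_norm = normal_survival(k)
    p_t3 = student_t3_survival(k * T3_STD)  # same number of standard deviations
    print(f"{k:>2} sigma: normal {p_norm:.3e}, t(3) {p_t3:.3e}, ratio {p_t3 / p_norm:.1f}")
```

The further out you go, the larger the ratio becomes, which is the sense in which a Gaussian fit understates rare events.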

Zipf plot of different distributions. The normal decays the fastest, Power law the slowest. Student’s t with df = 3 is a symmetric power law.

Zipf plot and example of ad-hoc estimation of alpha for the tail for S&P 500 (y axis = 1 - CDF)

Power law distributions help with this problem and with tail modeling. The approach concentrates just on the tails, as these events are overwhelmingly important. What happens in the center of the distribution does not matter.

The alpha, which is the tail exponent of the distribution, tells you which moments exist and how stable they are. Some may have heard about the Pareto distribution and the 80–20 principle, which is a special case with alpha of about 1.16. The distribution can be described with the following formula:
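
For reference, a power-law tail is usually written through its survival function. The notation below is the standard textbook form and is an assumption about what was shown here: L(x) denotes a slowly varying function, which behaves like a constant far enough in the tail, and x_min is the lower bound of the pure Pareto case.

```latex
P(X > x) = L(x)\, x^{-\alpha}, \qquad \alpha > 0
\qquad\text{(pure Pareto: } P(X > x) = \left(\tfrac{x_{\min}}{x}\right)^{\alpha},\ x \ge x_{\min}\text{)}
```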

The ranges of alpha and the moments that exist in each are listed below:

3 < alpha < 4: no kurtosis (1st, 2nd and 3rd moments exist)
2 < alpha < 3: no kurtosis and skewness (1st and 2nd moments exist)
1 < alpha < 2: no kurtosis, skewness and variance (only the 1st moment exists)
alpha < 1: no moments
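
The list above follows one rule: for a power-law tail, the p-th moment is finite only when p < alpha. A minimal sketch of that rule (the helper name `finite_moments` is made up for illustration):

```python
def finite_moments(alpha, max_order=4):
    """Orders p = 1..max_order whose moment E[|X|^p] is finite for tail exponent alpha.

    For a power-law tail, the p-th moment is finite only when p < alpha.
    """
    return [p for p in range(1, max_order + 1) if p < alpha]

for alpha in (0.8, 1.5, 2.5, 3.5, 4.5):
    print(alpha, finite_moments(alpha))
```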

What does it mean that there is no n-th moment? It means that the sample estimate is very unstable, and by measuring it in a sample, you’ll just be fooled by it. When you model asset prices, it is very likely that you won’t have the 4th moment. And the fourth moment tells you about the stability of the second one: it drives its sampling error. If you don’t have stable moments, you’re in trouble if you use statistical techniques that rely heavily on them. The plot below shows, for the S&P 500 (red), how much of the fourth-moment statistic is attributable to the single largest observation. Gaussian (blue) is shown for comparison.

Maximum to Sum plot for S&P 500 and normally distributed random variables (made by author)

As you can see, the kurtosis for normal random variables stabilizes quickly, while for the S&P 500 it does not.
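
A Maximum to Sum plot of this kind can be reproduced in a few lines. The sketch below is not the code behind the figure: it uses simulated data (a Gaussian sample and a symmetric Pareto-tailed sample with alpha = 3, both illustrative choices) instead of S&P 500 returns, and tracks the ratio max(|x|^4) / sum(|x|^4) as the sample grows.

```python
import random

def max_to_sum_ratio(samples, p=4):
    """Running ratio max(|x_i|^p) / sum(|x_i|^p) for n = 1..len(samples)."""
    running_max, running_sum, ratios = 0.0, 0.0, []
    for x in samples:
        term = abs(x) ** p
        running_max = max(running_max, term)
        running_sum += term
        ratios.append(running_max / running_sum if running_sum > 0 else 0.0)
    return ratios

rng = random.Random(42)
n = 20_000
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Symmetric Pareto-tailed draws with alpha = 3 (an illustrative choice, not S&P data)
fat_tailed = [rng.choice((-1, 1)) * rng.paretovariate(3.0) for _ in range(n)]

print("Gaussian  :", max_to_sum_ratio(gaussian)[-1])
print("Fat-tailed:", max_to_sum_ratio(fat_tailed)[-1])
```

If the 4th moment exists, the ratio drifts towards zero as n grows. If it does not, a single observation keeps accounting for a visible share of the sum.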

Another important thing about power laws is that they are scale invariant. This means that P(X > ax | X > x) is the same for every x. For example, the ratio between the number of people with 100 mln in the bank and those with 50 mln is the same as the ratio between 200 mln and 100 mln. The level x is not important, as the ratio is driven only by the multiple a.
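
Scale invariance is easy to verify numerically for a pure Pareto tail. In this small sketch the function names and the exponent 1.16 are illustrative assumptions:

```python
def pareto_survival(x, alpha, x_min=1.0):
    """P(X > x) for a Pareto distribution with tail exponent alpha and minimum x_min."""
    return (x_min / x) ** alpha if x >= x_min else 1.0

def conditional_exceedance(x, a, alpha):
    """P(X > a*x | X > x): the share of those above x who are also above a*x."""
    return pareto_survival(a * x, alpha) / pareto_survival(x, alpha)

alpha = 1.16  # illustrative exponent
for x in (50, 100, 1_000):
    # The printed value does not depend on x; it equals 2 ** -alpha
    print(x, conditional_exceedance(x, a=2, alpha=alpha))
```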

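What is also interesting for options: thinner tails give higher ATM option prices, while fatter tails give lower ATM and higher OTM prices. This stems from the fact that, for the same variance, a fat-tailed distribution moves probability mass away from the shoulders and into the tails and the center, which lowers the typical move that ATM options depend on.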

Implementation

I don’t want to rewrite every single sentence from the paper, so I will only briefly describe the steps. The paper is much more precise and very concise, so if you haven’t read it yet, do so for maximum clarity.

Let’s start with the calibration of l and the call price formula. Formula:

Calibrated l

Call price formula
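
Read off the two functions that follow, the calibration and the call price can be written as below. This is a transcription of the code rather than a quotation from the paper, so treat the notation as an assumption: S_0 is the spot, K > S_0 the strike, C(K) the call price, alpha > 1 the tail exponent and l the calibrated constant.

```latex
l = \frac{\left[(\alpha - 1)\, C(K)\right]^{1/\alpha} (K - S_0)^{1 - 1/\alpha}}{S_0},
\qquad
C(K) = \frac{(l\, S_0)^{\alpha}\, (K - S_0)^{1-\alpha}}{\alpha - 1}
```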

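def Paretian_L(alpha, call_price, strike, spot):
    # Calibrate the constant l from one observed call price (strike must be above spot)
    inv_alpha = 1 / alpha
    L = (alpha - 1) ** inv_alpha * call_price ** inv_alpha * (strike - spot) ** (1 - inv_alpha)
    return L / spot

def call_price(K, L, alpha, spot):
    # Power-law call price at strike K > spot, given the calibrated l and alpha > 1
    numerator = (L * spot) ** alpha * (K - spot) ** (1 - alpha)
    return numerator / (alpha - 1)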

What’s interesting is that the two parameters we relied on can be reduced even further. By pricing with ratios, we no longer rely on l, just on alpha. Given an anchor call with a known price, the price of a subsequent call can be written as follows:

def call_price_ratio(K, K_anchor, C_anchor, alpha, spot):
    return ((K-spot) / (K_anchor-spot)) ** (1 - alpha) * C_anchor
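
As a quick sanity check that l really drops out, the sketch below prices the same call twice: once by calibrating l at the anchor and using the two-parameter formula, and once with the ratio formula. The functions are local copies of the ones above under different names, and all numeric inputs are hypothetical.

```python
def paretian_l(alpha, call, strike, spot):
    """Calibrate l from one call price (local copy of the formula above)."""
    inv = 1.0 / alpha
    return ((alpha - 1) * call) ** inv * (strike - spot) ** (1 - inv) / spot

def call_from_l(strike, l, alpha, spot):
    """Two-parameter power-law call price."""
    return (l * spot) ** alpha * (strike - spot) ** (1 - alpha) / (alpha - 1)

def call_from_ratio(strike, anchor_strike, anchor_call, alpha, spot):
    """Ratio formula: only alpha and the anchor option are needed."""
    return ((strike - spot) / (anchor_strike - spot)) ** (1 - alpha) * anchor_call

# Hypothetical inputs, chosen only for illustration
spot, alpha = 2500.0, 3.0
anchor_strike, anchor_call = 2700.0, 1.25
target_strike = 2800.0

l = paretian_l(alpha, anchor_call, anchor_strike, spot)
via_l = call_from_l(target_strike, l, alpha, spot)
via_ratio = call_from_ratio(target_strike, anchor_strike, anchor_call, alpha, spot)
print(via_l, via_ratio)
```

Both routes give the same price (about 0.556 for these inputs), which is why the ratio form can be used on its own.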

Let’s see how the market prices compare to the power law prices computed with the first formula for calls.

And now let’s see the ratio

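# C is assumed to be the market price of the anchor call at strikes[0];
# strikes, alpha and S_0 (spot) are defined earlier in the analysis
prices = [C]
for i in range(1, len(strikes)):
    K_prev = strikes[i - 1]
    K_curr = strikes[i]
    C_prev = prices[-1]  # chain the ratio formula from one strike to the next
    C_curr = call_price_ratio(K=K_curr, K_anchor=K_prev, C_anchor=C_prev, alpha=alpha, spot=S_0)
    prices.append(C_curr)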

One of the necessary conditions to check is the absence of butterfly arbitrage.

# Second differences of call prices over equally spaced strikes; each value should be >= 0
butter_fly_arbitrage = [prices[i-1] - 2*prices[i] + prices[i+1] for i in range(1, len(prices) - 1)]

… and there is no butterfly arbitrage (believe me).
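
If you would rather check than believe, here is a minimal sketch of the test. It assumes equally spaced strikes and uses a hypothetical chain generated with the ratio formula. The helper names are made up for illustration.

```python
def butterfly_values(prices):
    """Second differences C(K-h) - 2*C(K) + C(K+h) for equally spaced strikes."""
    return [prices[i - 1] - 2 * prices[i] + prices[i + 1]
            for i in range(1, len(prices) - 1)]

def has_butterfly_arbitrage(prices, tol=1e-12):
    """True if any butterfly would have a negative cost."""
    return any(b < -tol for b in butterfly_values(prices))

# Hypothetical chain: spot 2500, strikes 2700..3000 in steps of 25, alpha = 3
spot, alpha, anchor_price = 2500.0, 3.0, 1.25
strikes = [2700 + 25 * i for i in range(13)]
prices = [anchor_price * ((k - spot) / (strikes[0] - spot)) ** (1 - alpha) for k in strikes]

print(has_butterfly_arbitrage(prices))
```

Because (K - S_0)^(1 - alpha) is convex in K for K above spot and alpha > 1, every second difference is positive and the check passes.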

Let’s now see how the prices translate into a volatility surface.

prices = [1.25]  # anchor call price; alpha is fixed at 4 in the loop below
for i in range(1, len(strikes)):
    K_prev = strikes[i - 1]
    K_curr = strikes[i]
    C_prev = prices[-1]
    C_curr = call_price_ratio(K=K_curr, K_anchor=K_prev, C_anchor=C_prev, alpha=4, spot=S_0)
    prices.append(C_curr)

The code for IV:

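import numpy as np
# Assumption: implied_volatility is py_vollib's Black-Scholes-Merton solver,
# whose argument order (price, S, K, t, r, q, flag) matches the call below.
from py_vollib.black_scholes_merton.implied_volatility import implied_volatility

def implied_volatilityDF(row, option_type):
    S_0 = row["close"]
    K = row['strike']
    r = row["rate"]
    q = 0  # no dividend yield
    tau = row['tau']  # time to expiry (assumed to be in years)
    mid = row['mid']  # mid price of the option
    option_type = option_type.lower()

    if option_type in ['p', 'put']:
        option_type = 'p'
    elif option_type in ['c', 'call']:
        option_type = 'c'
    try:
        return implied_volatility(mid, S_0, K, tau, r, q, option_type)
    except Exception:
        # Any solver failure (e.g. a price outside no-arbitrage bounds) is marked as missing
        return np.nan

df_filtered['IV'] = df_filtered.apply(implied_volatilityDF, option_type='c', axis=1)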
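
If you don't have an implied volatility library at hand, the inversion can be sketched with the standard library alone. The following is a plain Black-Scholes call price (no dividends) inverted by bisection. It is a swapped-in illustrative technique, not the solver used above, and the function names and bounds are assumptions.

```python
import math

def bs_call(spot, strike, tau, rate, sigma):
    """Black-Scholes price of a European call with no dividends."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    cdf = lambda x: 0.5 * math.erfc(-x / math.sqrt(2.0))
    return spot * cdf(d1) - strike * math.exp(-rate * tau) * cdf(d2)

def implied_vol_bisect(price, spot, strike, tau, rate, lo=1e-4, hi=5.0, tol=1e-8):
    """Solve bs_call(sigma) = price by bisection; returns nan if no root lies in [lo, hi]."""
    if not (bs_call(spot, strike, tau, rate, lo) <= price <= bs_call(spot, strike, tau, rate, hi)):
        return float("nan")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, tau, rate, mid) < price:
            lo = mid  # model price too low, so volatility must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip on hypothetical inputs: price with sigma = 0.2, then recover it
price = bs_call(2500.0, 2700.0, 0.25, 0.02, 0.2)
print(implied_vol_bisect(price, 2500.0, 2700.0, 0.25, 0.02))
```

Bisection is slow but hard to break, which is convenient for deep OTM options where the price is almost flat in volatility.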

As we can see, the implied volatility on the call side looks smooth and quite sensible.

Now, let’s have a look at puts. The formula for puts is below. This is the ratio formula, without l. The issue with the formula is that for non-integer alphas we get complex numbers: for a put strike below spot, K - spot is negative, and a negative base raised to a non-integer power is complex. I did not expect that while reading the article, and I don’t exclude that my interpretation or implementation might be wrong. So, if you successfully implemented the formula, let me know.

def put_price_ratio(K_2, K_1, K_1_price, alpha, spot):
    # K_1 is the anchor strike with known price K_1_price; K_2 is the strike to price
    exponent = 1 - alpha
    numerator = (K_2 - spot)**exponent - spot**exponent * ((-exponent)*K_2 + spot)
    denominator = (K_1 - spot)**exponent - spot**exponent * ((-exponent)*K_1 + spot)
    return numerator / denominator * K_1_price
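
The complex numbers come straight from Python's power operator, as this short sketch with hypothetical strikes shows. It also shows why a ratio of two same-sign distances stays real.

```python
alpha = 3.2
spot, strike = 2500.0, 2125.0  # hypothetical put strike 15% below spot

base = strike - spot          # negative for an OTM put
value = base ** (1 - alpha)   # negative float to a non-integer power
print(type(value).__name__)   # complex

# With an integer exponent the result stays real
print(type(base ** (1 - 3)).__name__)  # float

# In a ratio of two same-sign distances the sign cancels before the power is taken
ratio = ((2000.0 - spot) / (strike - spot)) ** (1 - alpha)
print(type(ratio).__name__, ratio)
```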

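The good news is that the previous ratio formula also works for puts, because the two negative distances to spot divide into a positive ratio before the power is taken. Let’s see what the price of a tail option anchored at 15% OTM looks like for different alphas.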

strikes = np.arange(min(filteredPuts["strike"]), anchorStrike+1,1)
strikes = sorted(strikes, reverse=True)  # start at the anchor and move further OTM

def put_price_ratio(K_2, K_1, K_1_price, alpha):
    # Same ratio formula as for calls; `spot` is read from the enclosing scope
    return K_1_price * ((K_2 - spot) / (K_1 - spot))**(1-alpha)

alpha_values = np.arange(3, 3.8, 0.2)  # 3.0, 3.2, 3.4, 3.6
price_lists = {}

for alpha in alpha_values:
    prices = [anchorPrice]  # restart each chain from the anchor put price
    for i in range(1, len(strikes)):
        K_prev = strikes[i - 1]
        K_curr = strikes[i]
        P_prev = prices[-1]
        P_curr = put_price_ratio(K_2=K_curr, K_1=K_prev, K_1_price=P_prev, alpha=alpha)
        prices.append(P_curr)
    price_lists[alpha] = prices

For the first plot, the best fit seems to be the blue line, which is for alpha = 3.2. The fit should be assessed closer to our anchor, as it is more likely that these options are priced “better.” The whole idea is to price the further OTM options from the fit to the nearer ones, since market participants (like anyone) cannot assess small probabilities well, and that is where an edge can be found.
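
Instead of judging the fit by eye, alpha can be picked by a simple grid search. The sketch below is one possible approach, not the method from the paper: it minimises squared log-price errors, and the "market" prices are synthetic, generated with alpha = 3.2 purely to exercise the function.

```python
import math

def ratio_price(strike, anchor_strike, anchor_price, alpha, spot):
    """Power-law price at `strike` implied by the anchor option."""
    return anchor_price * ((strike - spot) / (anchor_strike - spot)) ** (1 - alpha)

def fit_alpha(strikes, market_prices, anchor_strike, anchor_price, spot, grid):
    """Pick the alpha on `grid` minimising the sum of squared log-price errors."""
    def loss(alpha):
        return sum(
            (math.log(ratio_price(k, anchor_strike, anchor_price, alpha, spot)) - math.log(p)) ** 2
            for k, p in zip(strikes, market_prices)
        )
    return min(grid, key=loss)

# Synthetic "market" generated with alpha = 3.2 (not real quotes)
spot, anchor_strike, anchor_price = 2500.0, 2125.0, 10.0
strikes = [2100.0, 2050.0, 2000.0, 1950.0]
market = [ratio_price(k, anchor_strike, anchor_price, 3.2, spot) for k in strikes]

grid = [round(2.0 + 0.1 * i, 1) for i in range(21)]  # 2.0, 2.1, ..., 4.0
print(fit_alpha(strikes, market, anchor_strike, anchor_price, spot, grid))
```

Log errors are used so that the cheap, far OTM options are not drowned out by the more expensive ones near the anchor. To follow the advice above and favour the strikes close to the anchor, restrict `strikes` to that range.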

The fit also suggests that OTM options can indeed be priced with power laws. The market seems to price them consistently with an alpha of about 3 for this particular underlying and expiration. Unfortunately, as I don’t have a lot of data, I am not able to fit and calibrate the model to different maturities.

It is informative to compare the tail exponent of the asset with that of the option chain you analyze. It can be a useful tool for analyzing OTM options and for comparing model and market prices. It can help you get an idea of how the market prices the tail and make asymmetric bets.

Conclusion

To wrap things up, by exploring power law distributions and their application to tail option pricing, I’ve introduced an alternative approach to pricing deep out-of-the-money options that is consistent with the distribution of the underlying. Power laws tackle the limitations of assuming normality and stable variance, which can lead to significant mispricing in extreme market scenarios.

The Python code can help you set up your own analysis. I have only scratched the surface of the analysis that can be run, and I encourage you to delve deeper into the topic on your own.

I hope you enjoyed this article. If you have any questions, you can reach out to me on LinkedIn. The link is here.

References:
1) “Tail Option Pricing Under Power Laws”, Nassim Nicholas Taleb, Brandon Yarckin, Chitpuneet Mann, Damir Delic, and Mark Spitznagel
Also check out these books for more details:
2) “Statistical Consequences of Fat Tails”, Nassim Nicholas Taleb (a free PDF is available)
3) “Unperturbed by Volatility”, Adel Osseiran and Florent Segonne

Written by Antoni Smolski
