25  Scenario Generation

Authors

Alec Loudenback

Yun-Tien Lee

“Scenarios are the rehearsal of possible futures. They are not predictions, but pathways that help us prepare for the unknown.” — Peter Schwartz

25.1 Chapter Overview

How to generate synthetic data for your model using sub-models, with applications to economic scenario generation and portfolio composition.

25.2 Common Use Cases of Scenario Generators

Scenario generators are widely used in risk management, investment analysis, and regulatory compliance to model potential future outcomes. For forecasting under the real-world (physical) measure P, use real-world (RW) scenarios. For valuation and hedging, use risk‑neutral (RN) scenarios under the martingale measure Q (e.g., with drift adjusted to the short rate minus dividends for equities and to fit the initial curve for interest rates).

25.2.1 Risk management, especially Value at Risk (VaR) & Expected Shortfall (ES).

RW scenario generators are used to simulate market movements to estimate potential portfolio losses. Basel III regulatory capital requirements have adopted these approaches.

25.2.2 Stress Testing & Regulatory Compliance

RW scenario generators can also be used to generate extreme but plausible market conditions to assess resilience, which is required by central banks and financial regulators (e.g., Federal Reserve and ECB).

25.2.3 Portfolio Optimization & Asset Allocation

RW scenario generators are used to simulate thousands of market conditions to determine optimal portfolio allocations which is commonly used in modern portfolio theory (MPT) and Black-Litterman models.

25.2.4 Pension & Insurance Risk Modeling

RW scenario generators can be used to simulate longevity risk, policyholder behavior, and interest rate movements. They are also used for economic capital estimation under uncertain economic scenarios.

25.2.5 Economic & Macro-Financial Forecasting

Central banks and institutions (e.g., IMF, World Bank) use RW scenario generators to predict macroeconomic trends.

25.2.6 Asset Pricing & Hedging

RN scenario generators help value options using stochastic models (e.g., Black-Scholes, Heston model). They can help simulate future stock price movements under different volatility conditions. They can also be used for hedging purposes to test how a portfolio performs under different inflation, interest rate, or commodity price scenarios.

25.2.7 Fixed Income & Interest Rate Modeling

Yield curve modeling uses RN scenarios to value bonds and interest rate derivatives. Swaps, swaptions, and credit default swaps (CDS) also rely on RN pricing. RN scenario generators can also simulate yield curves for bond and fixed-income pricing. Models like Cox-Ingersoll-Ross (CIR) or Hull-White generate future interest rate paths.

25.2.8 Regulatory & Accounting Valuations

IFRS 13 & fair value accounting uses RN models determine the market-consistent value of liabilities. Solvency II for insurers requires valuation of policyholder guarantees using RN scenarios.

25.3 Common Economic Scenario Generation Approaches

Economic scenario generation involves the development of plausible future economic scenarios to assess the potential impact on financial portfolios, investments, or decision-making processes. Various approaches are used to generate economic scenarios, such as adapting underlying stochastic differential equations (SDEs) for Monte Carlo scenario generation techniques.

25.3.1 Interest Rate Models

25.3.1.1 Vasicek and Cox Ingersoll Ross (CIR)

The Vasicek model is a one-factor model commonly used for simulating interest rate scenarios. It describes the dynamics of short-term interest rates using a stochastic differential equation (SDE). In a Monte Carlo simulation, we can use the Vasicek model to generate multiple interest rate paths. The CIR model is an extension of the Vasicek model with non-constant volatility. It is a square‑root diffusion with state‑dependent volatility that helps keep rates positive. It addresses the issue of negative interest rates by ensuring that interest rates remain positive. Vasicek is defined as

\[ dr(t) = \kappa (\theta - r(t)) \, dt + \sigma \, dW(t) \]

where

  • \(r(t)\) is the short-term interest rate at time \(t\).
  • \(κ\) is the speed of mean reversion, representing how quickly the interest rate reverts to its long-term mean.
  • \(θ\) is the long-term mean or equilibrium level of the interest rate.
  • \(σ\) is the volatility of the interest rate.
  • \(dW(t)\) is a Wiener process or Brownian motion, representing a random shock.

And CIR is defined as

\[ dr(t) = \kappa (\theta - r(t)) \, dt + \sigma \sqrt{r(t)} \, dW(t) \]

where

  • \(r(t)\) is the short-term interest rate at time \(t\).
  • \(κ\) is the speed of mean reversion, representing how quickly the interest rate reverts to its long-term mean.
  • \(θ\) is the long-term mean or equilibrium level of the interest rate.
  • \(σ\) is the volatility of the interest rate.
  • \(dW(t)\) is a Wiener process or Brownian motion, representing a random shock.

The following code shows a simplified implementation of a CIR model. The specification of \(dr\) can be changed to make it a Vasicek model.

using Random, CairoMakie

# Set seed for reproducibility
Random.seed!(1234)

# Function to simulate CIR process
function cir_simulation(κ, θ, σ, r₀, Δt, num_steps, num_simulations)
    interest_rate_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        interest_rate_paths[1, j] = r₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            # dr = κ * (θ - interest_rate_paths[i-1, j]) * Δt + σ * dW # for Vasicek
            dr = κ *- interest_rate_paths[i-1, j]) * Δt + σ * sqrt(interest_rate_paths[i-1, j]) * dW
            interest_rate_paths[i, j] = max(interest_rate_paths[i-1, j] + dr, 0)  # Ensure non-negativity
        end
    end
    return interest_rate_paths
end

let

    # CIR model parameters
    κ = 0.2 # Speed of mean reversion
    θ = 0.05 # Long-term mean
    σ = 0.1 # Volatility

    # Initial short-term interest rate
    r₀ = 0.03

    # Number of time steps and simulations
    num_steps = 252
    num_simulations = 1_000

    # Time increment
    Δt = 1 / 252
    # Run CIR simulation
    cir_paths = cir_simulation(κ, θ, σ, r₀, Δt, num_steps, num_simulations)

    # Plot the simulated interest rate paths
    f = Figure()
    Axis(f[1, 1])
    for i in 1:num_simulations
        lines!(1:num_steps, cir_paths[:, i])
    end
    f
end

The exact CIR process has a non-central chi-squared distribution for \(r(t)\) in closed form.

\[ \begin{aligned} d &= \frac{4 \kappa \theta}{\sigma^2} \\ c &= \frac{\sigma^2 \left( 1 - e^{-\kappa \Delta t} \right)}{4 \kappa} \\ \lambda &= \frac{4 \kappa e^{-\kappa \Delta t} r_t}{\sigma^2 \left( 1 - e^{-\kappa \Delta t} \right)} \\ r_{t+\Delta t} &\sim c \cdot \chi'^2_{d}(\lambda) \\ \end{aligned} \]

where \(\chi'^2_{d}(\lambda)\) means the non-central chi-square distribution with \(d\) degrees of freedom and a non-centrality parameter \(λ\). The following code shows an exact implementation of a CIR model.

using Distributions

let

    # CIR parameters
    κ = 0.5 # mean reversion speed
    θ = 0.04 # long-term mean
    σ = 0.1 # volatility coefficient
    r0 = 0.03 # initial rate
    T = 5.0 # years
    Nsteps = 500 # time steps
    dt = T / Nsteps
    nsim = 1000 # number of simulated paths

    # Precompute constants
    d = 4 * κ * θ /* σ)

    # Time grid
    tgrid = collect(0:dt:T)
    rpaths = zeros(length(tgrid), nsim)

    # Seed reproducibility
    Random.seed!(1234)

    for path in 1:nsim
        r = r0
        rpaths[1, path] = r
        for i in 2:length(tgrid)
            # Transition distribution parameters
            c =* σ * (1 - exp(-κ * dt))) / (4 * κ)
            λ = (4 * κ * exp(-κ * dt) * r) /* σ * (1 - exp(-κ * dt)))
            nc_chisq = NoncentralChisq(d, λ)
            r = c * rand(nc_chisq)
            rpaths[i, path] = r
        end
    end

    # Plot
    fig = Figure(size=(800, 500))
    ax = Axis(fig[1, 1], xlabel="Time (years)", ylabel="Short rate", title="Exact CIR Process Simulation (Non-central Chi-square)")
    for path in 1:nsim
        lines!(ax, tgrid, rpaths[:, path])
    end
    fig
end

25.3.1.2 Hull-White

The Hull-White model is a one-factor model that extends the Vasicek model by allowing the mean reversion and volatility parameters to be time-dependent. It is commonly used for pricing interest rate derivatives. Brace-Gatarek-Musiela (BGM) Model extends the Hull-White model to incorporate more factors. It is one of the Libor Market Model (LMM) that describes the evolution of forward rates. It allows for the modeling of both the short-rate and the entire yield curve. It is defined as

\[ dr(t) = (\theta(t) - a r(t)) \, dt + \sigma(t) \, dW(t) \]

where

  • \(r(t)\) is the short-term interest rate at time \(t\).
  • \(θ(t)\) is the long-term mean or equilibrium level of the interest rate.
  • \(a\) is the speed of mean reversion.
  • \(σ(t)\) is the time-dependent volatility of the interest rate.
  • \(dW(t)\) is a Wiener process or Brownian motion, representing a random shock.
using Interpolations, Dierckx, Random, CairoMakie

function hull_white_theta_timevar_sigma(years, prices, σt; a::Float64)
    @assert length(years) == length(prices) == length(σt) "times and prices and sigmas must have same length"

    # Interpolate logP for smooth derivatives (cubic spline)
    logP = log.(prices)
    interp_logP = Spline1D(years, logP, k=3)  # cubic spline

    # instantaneous forward f(0,t) = -∂_t log P(0,t)
    f0(t) = -derivative(interp_logP, t)
    df0(t) = -derivative(interp_logP, t, 2)

    # Precompute Δs_j (left Riemann widths). assumes years[1] == 0 or handles first width as years[1].
    Δs = Vector{Float64}(undef, length(years))
    Δs[1] = years[1] # if years[1]==0 this is zero; else left interval from 0
    for j in 2:length(years)
        Δs[j] = years[j] - years[j-1]
    end

    # Compute integral term I(t_i) = 1/2 * ∫_0^{t_i} e^{-2a(t_i - s)} σ(s)^2 ds
    I = zeros(length(years))
    for i in 1:length(years)
        ti = years[i]
        s = 0.0
        acc = 0.0
        # left Riemann: use sigma at s = years[j] for interval [years[j], years[j+1])
        for j in 1:i
            sj = years[j]
            # width Δs[j] approximates length of [sj, sj+1) (or [0, years[1]] for j=1)
            w = Δs[j]
            if w <= 0
                # if grid is pure points (years[1]==0 and Δs[1]==0), skip zero-width
                continue
            end
            # integrand evaluated at left point sj
            integrand = exp(-2a * (ti - sj)) * (σt[j]^2)
            acc += integrand * w
        end
        I[i] = 0.5 * acc
    end

    # theta vector: df0 + a*f0 + I
    θt = [df0(years[i]) + a * f0(years[i]) + I[i] for i in 1:length(years)]

    return θt, f0, df0
end

let
    # Market curve setup (synthetic)
    years = collect(0.0:0.1:10.0)
    yields = 0.02 .+ 0.002 .* years
    prices = exp.(-yields .* years)

    # Parameters
    a = 0.1 # mean reversion speed
    σt = 0.008 .+ 0.004 .* sin.(0.5 .* years) # sinusoidal time-varying sigma
    θt, f0, df0 = hull_white_theta_timevar_sigma(years, prices, σt; a=a)
    r0 = 0.03 # initial short rate
    nsim = 1000 # number of simulated paths

    # Preallocate
    rpaths = zeros(length(years), nsim)

    # Seed for reproducibility
    Random.seed!(1234)

    # Euler–Maruyama simulation
    for path in 1:nsim
        r = r0
        rpaths[1, path] = r
        for i in 2:length(years)
            dt = years[i] - years[i-1]
            dW = sqrt(dt) * randn()
            drift = (θt[i-1] - a * r) * dt
            diffusion = σt[i-1] * dW
            r += drift + diffusion
            rpaths[i, path] = r
        end
    end

    # Plot
    fig = Figure(size=(800, 500))
    ax = Axis(fig[1, 1], xlabel="Time (years)", ylabel="Short rate", title="Hull–White Simulation with time-dependent θ(t) and σ(t)")
    for path in 1:nsim
        lines!(ax, years, rpaths[:, path])
    end
    fig
end

25.3.2 Equity Models

25.3.2.1 Geometric Brownian Motion (GBM)

GBM is a stochastic process commonly used to model the price movement of financial instruments, including stocks. It assumes constant volatility and is characterized by a log-normal distribution. It is defined as

\[ dS(t) = \mu S(t) \, dt + \sigma S(t) \, dW(t) \]

where

  • \(S(t)\) is the stock price at time \(t\).
  • \(μ\) is the drift coefficient (expected return).
  • \(σ\) is the volatility coefficient.
  • \(dW(t)\) is a Wiener process or Brownian motion, representing a random shock.
let
    # Set seed for reproducibility
    Random.seed!(1234)

    # GBM parameters
    μ = 0.05 # Drift (expected return)
    σ = 0.2 # Volatility

    # Initial stock price
    S₀ = 100.0

    # Number of time steps and simulations
    num_steps = 252
    num_simulations = 1_000

    # Time increment
    Δt = 1 / 252

    # Function to simulate GBM
    function gbm_simulation(μ, σ, S₀, Δt, num_steps, num_simulations)
        stock_price_paths = zeros(num_steps, num_simulations)
        drift =- 0.5 * σ * σ) * Δt
        vol = σ * sqrt(Δt)
        for j in 1:num_simulations
            stock_price_paths[1, j] = S₀
            for i in 2:num_steps
                stock_price_paths[i, j] = stock_price_paths[i-1, j] * exp(drift + vol * randn())
            end
        end
        return stock_price_paths
    end

    # Run GBM simulation
    gbm_paths = gbm_simulation(μ, σ, S₀, Δt, num_steps, num_simulations)

    # Plot the simulated stock price paths
    f = Figure()
    Axis(f[1, 1])
    for i in 1:num_simulations
        lines!(1:num_steps, gbm_paths[:, i])
    end
    f
end

25.3.2.2 Generalized Autoregressive Conditional Heteroskedasticity (GARCH)

GARCH models capture time-varying volatility. They are often used in conjunction with other models to forecast volatility. It is defined as

\[ \sigma^2_t = \omega + \alpha_1 r^2_{t-1} + \beta_1 \sigma^2_{t-1} \]

\[ r_t = \varepsilon_t \sqrt{\sigma^2_t} \]

  • \(σ^2_t\) is the conditional variance at time \(t\)
  • \(r_t\) is the return at time \(t\)
  • \(\varepsilon_t\) is a white noise or innovation process
  • \(\omega\), \(\alpha_1\), \(\beta_1\) are model parameters
let
    # Set seed for reproducibility
    Random.seed!(1234)

    # GARCH(1,1) parameters
    α₀ = 0.01 # Constant term
    α₁ = 0.1 # Coefficient for lagged squared returns
    β₁ = 0.8 # Coefficient for lagged conditional volatility
    @assert α₁ + β₁ < 1 "Stationarity requires α₁ + β₁ < 1"

    # Number of time steps and simulations
    num_steps = 252
    num_simulations = 1_000

    # Time increment
    Δt = 1 / 252

    # Function to simulate GARCH(1,1) volatility
    function garch_simulation(α₀, α₁, β₁, num_steps, num_simulations)
        volatility_paths = zeros(num_steps, num_simulations)
        for j in 1:num_simulations
            ε = randn(num_steps)
            squared_returns = zeros(num_steps)
            for i in 2:num_steps
                squared_returns[i] = α₀ + α₁ * ε[i-1] * ε[i-1] + β₁ * squared_returns[i-1]
                volatility_paths[i, j] = sqrt(squared_returns[i])
            end
        end
        return volatility_paths
    end

    # Run GARCH simulation
    garch_paths = garch_simulation(α₀, α₁, β₁, num_steps, num_simulations)

    # Plot the simulated volatility paths
    f = Figure()
    Axis(f[1, 1])
    for i in 1:num_simulations
        lines!(1:num_steps, garch_paths[:, i])
    end
    f
end

25.3.3 Copulas

Copulas give us a flexible way to model dependence structures between random variables, independently from their marginal distributions. With copulas, we can model each variable’s distribution (marginal) separately, and then “glue” them together with a copula to describe the dependence structure. This is useful when variables have very different distributions (e.g. stock returns vs. interest rates). Besides, copulas can capture nonlinear dependence and tail dependence (how extreme events co-occur), v.s. traditional correlation (e.g. Pearson) only captures linear dependence.

Simulating data using copulas involves generating multivariate samples with specified marginal distributions and a copula structure.

using Copulas

let
    Random.seed!(1234)

    # 2D Gaussian copula with correlation ρ = 0.8
    ρ = 0.8
    Σ = [1.0 ρ;
        ρ 1.0]
    Cg = GaussianCopula(Σ)
    U = rand(Cg, 10_000)  # n × d matrix with U(0,1) margins
    f = Figure()
    ax = Axis(f[1, 1], xlabel="U₁", ylabel="U₂")
    scatter!(ax, U[1, :], U[2, :], markersize=2, color=:steelblue, alpha=0.4)
    f
end

Copulas can also be used to infer combined distributions from data samples.

let
    X₁ = Gamma(2.0, 3.0)
    X₂ = Pareto(3.0, 1.0)
    X₃ = LogNormal(0.0, 1.0)
    C = ClaytonCopula(3, 0.7) # A 3-variate Clayton Copula with θ = 0.7
    D = SklarDist(C, (X₁, X₂, X₃)) # The final distribution

    # Generate a dataset
    simu = rand(D, 1000)
    # Estimate marginals by MLE
1 = fit(Gamma, simu[:, 1])
2 = fit(Pareto, simu[:, 2])
3 = fit(LogNormal, simu[:, 3])
    # Probability integral transform to uniforms
= hcat(cdf.(d̂1, simu[:, 1]), cdf.(d̂2, simu[:, 2]), cdf.(d̂3, simu[:, 3]))
    # Fit copula on uniforms
= fit(ClaytonCopula, Û)
    # Compose the estimated Sklar distribution
= SklarDist(Ĉ, (d̂1, d̂2, d̂3))
end
Copulas.SklarDist{Copulas.ClaytonCopula{3, Float64}, Tuple{Distributions.Gamma{Float64}, Distributions.Pareto{Float64}, Distributions.LogNormal{Float64}}}(
C: Copulas.ClaytonCopula{3, Float64}(
G: Copulas.ClaytonGenerator{Float64}(-0.5)
)

m: (Distributions.Gamma{Float64}(α=1.2297936405142902, θ=0.9789028809154936), Distributions.Pareto{Float64}(α=0.9847322709380394, θ=1.502490671065896), Distributions.LogNormal{Float64}(μ=-0.2623276117827195, σ=1.1226430321293628))
)