26 Scenario Generation

Yun-Tien Lee and Alec Loudenback

“Scenarios are the rehearsal of possible futures. They are not predictions, but pathways that help us prepare for the unknown.” — Peter Schwartz

26.1 Chapter Overview

How to generate synthetic data for your model using sub-models, with applications to economic scenario generation and portfolio composition.

26.2 Pseudo-Random Number Generators

Modern computers utilize pseudo-random number generators (PRNGs) to generate random-like numbers. PRNGs are algorithms used to generate sequences of numbers that appear to be random but are actually determined by an initial value, known as the seed. These generators are called “pseudo-random” because the sequences they produce are deterministic; if you provide the same seed, you’ll get the same sequence of numbers. In addition, they have a finite period, which means that after a certain number of generated values, the sequence will repeat. It’s important to choose or design PRNGs with a long enough period for practical applications.

Financial modelers should understand how PRNGs work because many financial models rely on Monte Carlo simulations, risk analysis, and other stochastic modeling techniques that require random sampling. A good PRNG is essential for robust financial modeling. Choosing the right PRNG ensures results with high statistical quality and efficiency, which is critical when making financial decisions. The ability to specify and control the seed of a random number generator is a fundamental requirement in computational modeling, as it enables exact reproducibility of results. By using a fixed seed, simulations can be rerun with identical random sequences, allowing researchers and stakeholders to verify outcomes, conduct consistent sensitivity analyses, and perform rigorous debugging. This reproducibility is critical for transparency, auditability, and compliance with regulatory standards in fields such as quantitative finance, risk management, and scientific computing. Without a seedable PRNG, the inherent randomness of simulations would preclude the possibility of replicating results precisely, undermining confidence in the validity and reliability of the modeling process.

Different choices of random number generators typically have a low impact on the mean outputs of stochastic models, since most generators are designed to approximate uniform distributions adequately. However, they can have a substantial effect on risk measures such as standard deviation, Value-at-Risk (VaR), and tail quantiles, particularly when the PRNG has a short period or exhibits hidden correlations. In such cases, the variability and uncertainty estimates around risk measures can be severely biased or understated, leading to misleading conclusions about the model’s risk profile. For this reason, it is essential to rerun the stochastic model multiple times with different seeds—or even different PRNGs—to test the stability and convergence of the estimated risk measures and ensure that the results are robust to the choice of random number source.

26.2.1 Common PRNGs

26.2.1.1 Mersenne Twister

For many years, the Mersenne Twister was a standard and highly recommended PRNG for financial modeling purposes. One of its greatest strengths is its exceptionally long period of (2^{19937} - 1), which is crucial for applications requiring a large number of independent random numbers. It is also known for its good statistical properties, passing many standard tests for randomness. Moreover, it was designed with features for creating multiple streams, though ensuring their statistical independence in parallel applications requires careful management.

26.2.1.2 Xorshift

Xorshift is a family of PRNGs known for their simplicity and extremely fast operation. The name “xorshift” comes from the bitwise XOR (exclusive or) and bit-shifting operations that are the core of the algorithm. Xorshift generators are often used in applications where speed is a priority and cryptographic-strength randomness is not a strict requirement. One of the main advantages of xorshift is that its core operations can be efficiently implemented in hardware. However, a typical xorshift generator has a relatively short period compared to the Mersenne Twister.

26.2.1.3 Xoshiro

Xoshiro is a family of high-performance PRNGs with excellent statistical properties. The name “Xoshiro” is a portmanteau of the core operations it uses: XOR, SHIFT, and ROTATE. Xoshiro algorithms, including the highly-regarded Xoshiro256++ variant, use a combination of bitwise XOR, bit-shifting, and addition/rotation operations. They generally have more complex update rules and longer periods than basic Xorshift algorithms.

When selecting seeds at random to initialize multiple instances of the Mersenne Twister generator, there is a significant likelihood of producing streams that are statistically correlated in parallel computations. This arises because the default seeding mechanism may not sufficiently de-correlate the internal states across different instances. In contrast, generators such as Xoshiro256++ offer markedly improved suitability for parallel environments, with carefully managed methodologies for ensuring stream independence.

26.2.1.4 Comparing PRNGs

Generator	Typical Period	Relative Speed	Parallel Safety
Mersenne Twister	Very Long ((2^{19937}-1))	Good	Poor (risk of correlation)
Xorshift	Shorter	Very Fast	Poor (not designed for it)
Xoshiro256++	Very Long ((2^{256}-1))	Excellent	Excellent (designed for it)

Note

Since Julia 1.7, the default random number generator for the language has been Xoshiro256++, which replaced an implementation of Mersenne Twister.

Importantly, Julia’s implementation of Xoshiro256++ is thread safe (meaning you can use the RNG in multiple threads without losing RNG quality).

26.2.2 Consistent Interface

Julia offers a consistent interface for random numbers due to its design and multiple dispatch principles. Consider the following random numbers in different data types.

using Random

rng = MersenneTwister(1234)
rand(Int, (2, 3))

2×3 Matrix{Int64}:
 -4658824367460585506  1903627521512283790  -4349813002638093878
 -4253171797820727306  6036826337004799192  -3019094688802823057

using Random

rng = MersenneTwister(1234)
rand(Float64, (2, 3))

2×3 Matrix{Float64}:
 0.617837  0.363454  0.928998
 0.618813  0.310095  0.576096

using Random

rng = Xoshiro(1234)
rand(Bool, (2, 3))

2×3 Matrix{Bool}:
 0  1  1
 1  0  1

26.3 Common Use Cases of Scenario Generators

Scenario generators are widely used in risk management, investment analysis, and regulatory compliance to model potential future outcomes. If the goal is forecasting actual market behavior, real world scenarios (RW) are commonly used. If, on the other hand, pricing financial instruments is needed, risk neutral (RN) scenarios are often used.

26.3.1 Risk management, especially Value at Risk (VaR) & Expected Shortfall (ES).

RW scenario generators are used to simulate market movements to estimate potential portfolio losses. Basel III regulatory capital requirements have adopted these approaches.

26.3.2 Stress Testing & Regulatory Compliance

RW scenario generators can also be used to generate extreme but plausible market conditions to assess resilience, which is required by central banks and financial regulators (e.g., Federal Reserve and ECB).

26.3.3 Portfolio Optimization & Asset Allocation

RW scenario generators are used to simulate thousands of market conditions to determine optimal portfolio allocations which is commonly used in modern portfolio theory (MPT) and Black-Litterman models.

26.3.4 Pension & Insurance Risk Modeling

RW scenario generators can be used to simulate longevity risk, policyholder behavior, and interest rate movements. They are also used for economic capital estimation under uncertain economic scenarios.

26.3.5 Economic & Macro-Financial Forecasting

Central banks and institutions (e.g., IMF, World Bank) use RW scenario generators to predict macroeconomic trends.

26.3.6 Asset Pricing & Hedging

RN scenario generators help value options using stochastic models (e.g., Black-Scholes, Heston model). They can help simulate future stock price movements under different volatility conditions. They can also be used for hedging purposes to test how a portfolio performs under different inflation, interest rate, or commodity price scenarios.

26.3.7 Fixed Income & Interest Rate Modeling

Yield curve modeling uses RN scenarios to value bonds and interest rate derivatives. Swaps, swaptions, and credit default swaps (CDS) also rely on RN pricing. RN scenario generators can also simulate yield curves for bond and fixed-income pricing. Models like Cox-Ingersoll-Ross (CIR) or Hull-White generate future interest rate paths.

26.3.8 Regulatory & Accounting Valuations

IFRS 13 & fair value accounting uses RN models determine the market-consistent value of liabilities. Solvency II for insurers asks valuation of policyholder guarantees using RN scenarios.

26.4 Common Economic Scenario Generation Approaches

Economic scenario generation involves the development of plausible future economic scenarios to assess the potential impact on financial portfolios, investments, or decision-making processes. Various approaches are used to generate economic scenarios,such as adapting underlying stochastic differential equations (SDEs) for Monte Carlo scenario generation techniques.

26.4.1 Interest Rate Models

26.4.1.1 Vasicek and Cox Ingersoll Ross (CIR)

The Vasicek model is a one-factor model commonly used for simulating interest rate scenarios. It describes the dynamics of short-term interest rates using a stochastic differential equation (SDE). In a Monte Carlo simulation, we can use the Vasicek model to generate multiple interest rate paths. The CIR model is an extension of the Vasicek model with non-constant volatility. It addresses the issue of negative interest rates by ensuring that interest rates remain positive. Vasicek is defined as

\[ dr(t) = \kappa (\theta - r(t)) \, dt + \sigma \, dW(t) \]

where

\(r(t)\) is the short-term interest rate at time \(t\).
\(κ\) is the speed of mean reversion, representing how quickly the interest rate reverts to its long-term mean.
\(θ\) is the long-term mean or equilibrium level of the interest rate.
\(σ\) is the volatility of the interest rate.
\(dW(t)\) is a Wiener process or Brownian motion, representing a random shock.

And CIR is defined as

\[ dr(t) = \kappa (\theta - r(t)) \, dt + \sigma \sqrt{r(t)} \, dW(t) \]

where

\(r(t)\) is the short-term interest rate at time \(t\).
\(κ\) is the speed of mean reversion, representing how quickly the interest rate reverts to its long-term mean.
\(θ\) is the long-term mean or equilibrium level of the interest rate.
\(σ\) is the volatility of the interest rate.
\(dW(t)\) is a Wiener process or Brownian motion, representing a random shock.

The following code shows a simplified implementation of a CIR model. The specification of \(dr\) can be changed to make it a Vasicek model.

using Random, CairoMakie

# Set seed for reproducibility
Random.seed!(1234)

# CIR model parameters
κ = 0.2       # Speed of mean reversion
θ = 0.05      # Long-term mean
σ = 0.1       # Volatility

# Initial short-term interest rate
r₀ = 0.03

# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000

# Time increment
Δt = 1 / 252

# Function to simulate CIR process
function cir_simulation(κ, θ, σ, r₀, Δt, num_steps, num_simulations)
    interest_rate_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        interest_rate_paths[1, j] = r₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            # for Vasicek
            # dr = κ * (θ - interest_rate_paths[i-1, j]) * Δt + σ * dW
            dr = κ * (θ - interest_rate_paths[i-1, j]) * Δt + σ * sqrt(interest_rate_paths[i-1, j]) * dW
            interest_rate_paths[i, j] = max(interest_rate_paths[i-1, j] + dr, 0)  # Ensure non-negativity
        end
    end
    return interest_rate_paths
end

# Run CIR simulation
cir_paths = cir_simulation(κ, θ, σ, r₀, Δt, num_steps, num_simulations)

# Plot the simulated interest rate paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, cir_paths[:, i])
end
f

26.4.1.2 Hull White

The Hull-White model is a one-factor model that extends the Vasicek model by allowing the mean reversion and volatility parameters to be time-dependent. It is commonly used for pricing interest rate derivatives. Brace-Gatarek-Musiela (BGM) Model extends the Hull-White model to incorporate more factors. It is one of the Libor Market Model (LMM) that describes the evolution of forward rates. It allows for the modeling of both the short-rate and the entire yield curve. It is defined as

\[ dr(t) = (\theta(t) - a r(t)) \, dt + \sigma(t) \, dW(t) \]

where

\(r(t)\) is the short-term interest rate at time \(t\).
\(θ\) is the long-term mean or equilibrium level of the interest rate.
\(a\) is the speed of mean reversion.
\(σ(t)\) is the time-dependent volatility of the interest rate.
\(dW(t)\) is a Wiener process or Brownian motion, representing a random shock.

using Random, CairoMakie

# Set seed for reproducibility
Random.seed!(1234)

# Hull-White model parameters
α = 0.1       # Mean reversion speed
σ = 0.02      # Volatility
r₀ = 0.03     # Initial short-term interest rate

# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000

# Time increment
Δt = 1 / 252

# Function to simulate Hull-White process
function hull_white_simulation(α, σ, r₀, Δt, num_steps, num_simulations)
    interest_rate_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        interest_rate_paths[1, j] = r₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            dr = α * (σ - interest_rate_paths[i-1, j]) * Δt + σ * dW
            interest_rate_paths[i, j] = interest_rate_paths[i-1, j] + dr
        end
    end
    return interest_rate_paths
end

# Run Hull-White simulation
hull_white_paths = hull_white_simulation(α, σ, r₀, Δt, num_steps, num_simulations)

# Plot the simulated interest rate paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, hull_white_paths[:, i])
end
f

26.4.2 Equity Models

26.4.2.1 Geometric Brownian Motion (GBM)

GBM is a stochastic process commonly used to model the price movement of financial instruments, including stocks. It assumes constant volatility and is characterized by a log-normal distribution. It is defined as

\[ dS(t) = \mu S(t) \, dt + \sigma S(t) \, dW(t) \]

where

\(S(t)\) is the stock price at time \(t\).
\(μ\) is the drift coefficient (expected return).
\(σ\) is the volatility coefficient.
\(dW(t)\) is a Wiener process or Brownian motion, representing a random shock.

using Random, CairoMakie

# Set seed for reproducibility
Random.seed!(1234)

# GBM parameters
μ = 0.05       # Drift (expected return)
σ = 0.2        # Volatility

# Initial stock price
S₀ = 100

# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000

# Time increment
Δt = 1 / 252

# Function to simulate GBM
function gbm_simulation(μ, σ, S₀, Δt, num_steps, num_simulations)
    stock_price_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        stock_price_paths[1, j] = S₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            S = stock_price_paths[i-1, j]
            dS = μ * S * Δt + σ * S * dW
            stock_price_paths[i, j] = S + dS
        end
    end
    return stock_price_paths
end

# Run GBM simulation
gbm_paths = gbm_simulation(μ, σ, S₀, Δt, num_steps, num_simulations)

# Plot the simulated stock price paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, gbm_paths[:, i])
end
f

26.4.2.2 Generalized Autoregressive Conditional Heteroskedasticity (GARCH)

GARCH models capture time-varying volatility. They are often used in conjunction with other models to forecast volatility. It is defined as

\[ \sigma^2_t = \omega + \alpha_1 r^2_{t-1} + \beta_1 \sigma^2_{t-1} \]

\[ r_t = \varepsilon_t \sqrt{\sigma^2_t} \]

\(σ^2_t\) is the conditional variance at time \(t\)
\(r_t\) is the return at time \(t\)
\(\varepsilon_t\) is a white noise or innovation process
\(\omega\), \(\alpha_1\), \(\beta_1\) are model parameters

using Random, CairoMakie

# Set seed for reproducibility
Random.seed!(1234)

# GARCH(1,1) parameters
α₀ = 0.01      # Constant term
α₁ = 0.1       # Coefficient for lagged squared returns
β₁ = 0.8       # Coefficient for lagged conditional volatility

# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000

# Time increment
Δt = 1 / 252

# Function to simulate GARCH(1,1) volatility
function garch_simulation(α₀, α₁, β₁, num_steps, num_simulations)
    volatility_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        ε = randn(num_steps)
        squared_returns = zeros(num_steps)
        for i in 2:num_steps
            squared_returns[i] = α₀ + α₁ * ε[i-1]^2 + β₁ * squared_returns[i-1]
            volatility_paths[i, j] = sqrt(squared_returns[i])
        end
    end
    return volatility_paths
end

# Run GARCH simulation
garch_paths = garch_simulation(α₀, α₁, β₁, num_steps, num_simulations)

# Plot the simulated volatility paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, garch_paths[:, i])
end
f

26.4.3 Copulas

Simulating data using copulas involves generating multivariate samples with specified marginal distributions and a copula structure.

using Random, CairoMakie, BivariateCopulas

# Set seed for reproducibility
Random.seed!(1234)

# Generate a Gaussian copula
gaussian_copula = Gaussian(0.8)

# Show simulated copula
f = scatter(rand(gaussian_copula, 10^4))
f

Copulas can also be used to infer combined distributions from data samples.

using Copulas, Distributions, Random

X₁ = Gamma(2, 3)
X₂ = Pareto()
X₃ = LogNormal(0, 1)
C = ClaytonCopula(3, 0.7) # A 3-variate Clayton Copula with θ = 0.7
D = SklarDist(C, (X₁, X₂, X₃)) # The final distribution

# Generate a dataset
simu = rand(D, 1000)
# We may estimate a copula, or get parameters of underlying distributions, using the `fit` function:
D̂ = fit(SklarDist{ClaytonCopula,Tuple{Gamma,Normal,LogNormal}}, simu)

Copulas.SklarDist{Copulas.ClaytonCopula{3, Float64}, Tuple{Distributions.Gamma{Float64}, Distributions.Normal{Float64}, Distributions.LogNormal{Float64}}}(
C: Copulas.ClaytonCopula{3, Float64}(
G: Copulas.ClaytonGenerator{Float64}(0.7255762179151387)
)

m: (Distributions.Gamma{Float64}(α=1.9509359315325794, θ=3.0668504198367565), Distributions.Normal{Float64}(μ=6.958764796293847, σ=27.415016590130424), Distributions.LogNormal{Float64}(μ=0.01132053842187167, σ=1.0263584835287456))
)