Yun-Tien Lee and Alec Loudenback
“Scenarios are the rehearsal of possible futures. They are not predictions, but pathways that help us prepare for the unknown.” — Peter Schwartz
How to generate synthetic data for your model using sub-models, with applications to economic scenario generation and portfolio composition.
Modern computers utilize pseudo-random number generators (PRNGs) to generate random-like numbers. PRNGs are algorithms used to generate sequences of numbers that appear to be random but are actually determined by an initial value, known as the seed. These generators are called “pseudo-random” because the sequences they produce are deterministic; if you provide the same seed, you’ll get the same sequence of numbers. In addition, they have a finite period, which means that after a certain number of generated values, the sequence will repeat. It’s important to choose or design PRNGs with a long enough period for practical applications.
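To make the role of the seed concrete, here is a minimal sketch using Julia's standard library. The seed values 2024 and 2025 are arbitrary choices for illustration.

```julia
using Random

# Two generators created from the same seed (the value 2024 is arbitrary)
rng_a = Xoshiro(2024)
rng_b = Xoshiro(2024)

draws_a = rand(rng_a, 5)
draws_b = rand(rng_b, 5)

# Identical seeds reproduce the sequence exactly
same_seed_matches = draws_a == draws_b

# A different seed gives a different sequence
draws_c = rand(Xoshiro(2025), 5)
different_seed_differs = draws_a != draws_c
```

Both flags should come out `true`: the sequence is a deterministic function of the seed, which is what makes a stochastic model run repeatable.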
Financial modelers should understand how PRNGs work because many financial models rely on Monte Carlo simulations, risk analysis, and other stochastic modeling techniques that require random sampling. A good PRNG is essential for robust financial modeling. Choosing the right PRNG ensures results with high statistical quality and efficiency, which is critical when making financial decisions. The ability to specify and control the seed of a random number generator is a fundamental requirement in computational modeling, as it enables exact reproducibility of results. By using a fixed seed, simulations can be rerun with identical random sequences, allowing researchers and stakeholders to verify outcomes, conduct consistent sensitivity analyses, and perform rigorous debugging. This reproducibility is critical for transparency, auditability, and compliance with regulatory standards in fields such as quantitative finance, risk management, and scientific computing. Without a seedable PRNG, the inherent randomness of simulations would preclude the possibility of replicating results precisely, undermining confidence in the validity and reliability of the modeling process.
Different choices of random number generators typically have a low impact on the mean outputs of stochastic models, since most generators are designed to approximate uniform distributions adequately. However, they can have a substantial effect on risk measures such as standard deviation, Value-at-Risk (VaR), and tail quantiles, particularly when the PRNG has a short period or exhibits hidden correlations. In such cases, the variability and uncertainty estimates around risk measures can be severely biased or understated, leading to misleading conclusions about the model’s risk profile. For this reason, it is essential to rerun the stochastic model multiple times with different seeds—or even different PRNGs—to test the stability and convergence of the estimated risk measures and ensure that the results are robust to the choice of random number source.
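The seed-stability check described above can be sketched as follows. The loss model here is a hypothetical stand-in (standard normal one-period losses), and the function name `simulated_var`, the scenario count, and the number of seeds are assumptions made for illustration only.

```julia
using Random, Statistics

# Hypothetical toy model: one-period portfolio losses are standard normal.
# Returns the empirical 99% VaR (the 0.99 quantile of the simulated losses).
function simulated_var(seed; n_scenarios = 10_000, level = 0.99)
    rng = Xoshiro(seed)
    losses = randn(rng, n_scenarios)
    return quantile(losses, level)
end

seeds = 1:20
var_estimates = [simulated_var(s) for s in seeds]
mean_losses = [mean(randn(Xoshiro(s), 10_000)) for s in seeds]

# Spread of each statistic across seeds
var_spread = std(var_estimates)
mean_spread = std(mean_losses)
```

In a run like this, the spread of the tail quantile across seeds is typically several times larger than the spread of the mean. This is why a single seed gives a less reliable picture of a risk measure than of an average outcome.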
For many years, the Mersenne Twister was a standard and highly recommended PRNG for financial modeling purposes. One of its greatest strengths is its exceptionally long period of \(2^{19937} - 1\), which is crucial for applications requiring a large number of independent random numbers. It is also known for its good statistical properties, passing many standard tests for randomness. Moreover, it was designed with features for creating multiple streams, though ensuring their statistical independence in parallel applications requires careful management.
Xorshift is a family of PRNGs known for their simplicity and extremely fast operation. The name “xorshift” comes from the bitwise XOR (exclusive or) and bit-shifting operations that are the core of the algorithm. Xorshift generators are often used in applications where speed is a priority and cryptographic-strength randomness is not a strict requirement. One of the main advantages of xorshift is that its core operations can be efficiently implemented in hardware. However, a typical xorshift generator has a relatively short period compared to the Mersenne Twister.
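As an illustration of how little machinery the XOR and shift operations need, the following is a sketch of one well-known member of the family: a 64-bit xorshift with the shift triple (13, 7, 17). The function names are chosen here for illustration. It is meant to show the mechanics, not to serve as a generator for production models.

```julia
# One step of a 64-bit xorshift generator with the shift triple (13, 7, 17).
# The state must be nonzero; a zero state would stay zero forever.
function xorshift64(x::UInt64)
    x ⊻= x << 13
    x ⊻= x >> 7
    x ⊻= x << 17
    return x
end

# Generate n successive states starting from a nonzero seed
function xorshift64_sequence(seed::UInt64, n::Integer)
    seed != 0 || throw(ArgumentError("seed must be nonzero"))
    out = Vector{UInt64}(undef, n)
    state = seed
    for i in 1:n
        state = xorshift64(state)
        out[i] = state
    end
    return out
end

seq = xorshift64_sequence(UInt64(1), 5)

# Map each state to a Float64 in [0, 1) using its top 53 bits
uniforms = (seq .>> 11) .* 2.0^-53
```

The whole state is a single 64-bit word, which accounts for both the speed and the comparatively short period.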
Xoshiro is a family of high-performance PRNGs with excellent statistical properties. The name “Xoshiro” is a portmanteau of the core operations it uses: XOR, SHIFT, and ROTATE. Xoshiro algorithms, including the highly-regarded Xoshiro256++ variant, use a combination of bitwise XOR, bit-shifting, and addition/rotation operations. They generally have more complex update rules and longer periods than basic Xorshift algorithms.
When selecting seeds at random to initialize multiple instances of the Mersenne Twister generator, there is a significant likelihood of producing streams that are statistically correlated in parallel computations. This arises because the default seeding mechanism may not sufficiently de-correlate the internal states across different instances. In contrast, generators such as Xoshiro256++ offer markedly improved suitability for parallel environments, with carefully managed methodologies for ensuring stream independence.
| Generator | Typical Period | Relative Speed | Parallel Safety |
|---|---|---|---|
| Mersenne Twister | Very Long (\(2^{19937}-1\)) | Good | Poor (risk of correlation) |
| Xorshift | Shorter | Very Fast | Poor (not designed for it) |
| Xoshiro256++ | Very Long (\(2^{256}-1\)) | Excellent | Excellent (designed for it) |
Since Julia 1.7, the default random number generator for the language has been Xoshiro256++, which replaced an implementation of Mersenne Twister.
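Importantly, Julia’s default RNG is safe to use from multiple threads: its Xoshiro256++ state is task-local, so each task draws from its own stream without losing RNG quality. A single explicitly constructed generator, such as Xoshiro(1234), should not be shared across threads without synchronization.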
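For reproducible parallel runs, one common pattern is to give each task its own explicitly seeded generator. The sketch below assumes that per-task seeds drawn from a master generator are adequate for the application. The function name `task_means` and the master seed are illustrative, and this is one possible approach rather than a prescribed one.

```julia
using Random

# Run n_tasks independent simulations, each with its own Xoshiro generator.
# Per-task seeds are drawn from a master generator so the whole run is
# reproducible from a single master seed.
function task_means(master_seed, n_tasks, n_draws)
    seeds = rand(Xoshiro(master_seed), UInt64, n_tasks)
    tasks = map(seeds) do s
        Threads.@spawn begin
            rng = Xoshiro(s)  # generator owned by this task only
            sum(randn(rng) for _ in 1:n_draws) / n_draws
        end
    end
    return fetch.(tasks)
end

run_1 = task_means(2024, 4, 10_000)
run_2 = task_means(2024, 4, 10_000)
reproducible = run_1 == run_2
```

Because no generator is shared between tasks, the results do not depend on how the scheduler interleaves the work.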
Julia offers a consistent interface for random numbers due to its design and multiple dispatch principles. Consider the following random numbers in different data types.
using Random
rng = MersenneTwister(1234)
# Draw from the seeded generator (printed values depend on the seed and Julia version)
rand(rng, Int, (2, 3))
2×3 Matrix{Int64}:
-4658824367460585506 1903627521512283790 -4349813002638093878
-4253171797820727306 6036826337004799192 -3019094688802823057
using Random
rng = MersenneTwister(1234)
# Draw from the seeded generator (printed values depend on the seed and Julia version)
rand(rng, Float64, (2, 3))
2×3 Matrix{Float64}:
0.617837 0.363454 0.928998
0.618813 0.310095 0.576096
using Random
rng = Xoshiro(1234)
# Draw from the seeded generator (printed values depend on the seed and Julia version)
rand(rng, Bool, (2, 3))
2×3 Matrix{Bool}:
0 1 1
1 0 1
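The same interface extends beyond element types. As a short sketch using only the Random standard library (the variable names and sample values are illustrative):

```julia
using Random

rng = Xoshiro(1234)

die_rolls = rand(rng, 1:6, 5)                 # integers drawn from a range
label = rand(rng, ["bull", "bear", "flat"])   # one element of a collection
shocks = randn(rng, Float32, 3)               # standard normal draws as Float32
order = shuffle(rng, 1:5)                     # a random permutation of 1:5
```

In each call the generator comes first and the thing to sample from follows, so swapping the generator or the sampled type does not change the shape of the code.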
Scenario generators are widely used in risk management, investment analysis, and regulatory compliance to model potential future outcomes. If the goal is to forecast actual market behavior, real-world (RW) scenarios are commonly used. If, on the other hand, the goal is to price financial instruments, risk-neutral (RN) scenarios are typically used.
RW scenario generators are used to simulate market movements to estimate potential portfolio losses. Basel III regulatory capital requirements have adopted these approaches.
RW scenario generators can also be used to generate extreme but plausible market conditions to assess resilience, which is required by central banks and financial regulators (e.g., Federal Reserve and ECB).
RW scenario generators are used to simulate thousands of market conditions to determine optimal portfolio allocations, an approach commonly used with modern portfolio theory (MPT) and Black-Litterman models.
RW scenario generators can be used to simulate longevity risk, policyholder behavior, and interest rate movements. They are also used for economic capital estimation under uncertain economic scenarios.
Central banks and institutions (e.g., IMF, World Bank) use RW scenario generators to predict macroeconomic trends.
RN scenario generators help value options using stochastic models (e.g., Black-Scholes, Heston model). They can help simulate future stock price movements under different volatility conditions. They can also be used for hedging purposes to test how a portfolio performs under different inflation, interest rate, or commodity price scenarios.
Yield curve modeling uses RN scenarios to value bonds and interest rate derivatives. Swaps, swaptions, and credit default swaps (CDS) also rely on RN pricing. RN scenario generators can also simulate yield curves for bond and fixed-income pricing. Models like Cox-Ingersoll-Ross (CIR) or Hull-White generate future interest rate paths.
IFRS 13 and fair value accounting use RN models to determine the market-consistent value of liabilities. Solvency II requires insurers to value policyholder guarantees using RN scenarios.
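As a sketch of risk-neutral valuation by simulation, the example below prices a European call under Black-Scholes dynamics by averaging discounted payoffs. All parameter values and the function name `mc_call_price` are assumptions chosen for illustration.

```julia
using Random, Statistics

# Monte Carlo price of a European call under risk-neutral GBM dynamics.
# Under the risk-neutral measure the drift of the asset is the risk-free rate r.
function mc_call_price(rng, S₀, K, r, σ, T, n)
    Z = randn(rng, n)
    S_T = S₀ .* exp.((r - σ^2 / 2) * T .+ σ * sqrt(T) .* Z)  # terminal prices
    payoff = max.(S_T .- K, 0.0)
    return exp(-r * T) * mean(payoff)                         # discounted average payoff
end

# Illustrative inputs: at-the-money call, one year to expiry
price = mc_call_price(Xoshiro(1234), 100.0, 100.0, 0.05, 0.2, 1.0, 200_000)
```

For these inputs the Black-Scholes formula gives a price of roughly 10.45, so the simulated value should land close to that. A real-world version of the same simulation would replace the risk-free rate with an expected return and would be used for forecasting, not for pricing.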
Economic scenario generation involves the development of plausible future economic scenarios to assess the potential impact on financial portfolios, investments, or decision-making processes. Various approaches are used to generate economic scenarios, such as discretizing the underlying stochastic differential equations (SDEs) for use in Monte Carlo scenario generation.
The Vasicek model is a one-factor model commonly used for simulating interest rate scenarios. It describes the dynamics of the short-term interest rate using a stochastic differential equation (SDE), and in a Monte Carlo simulation it can be used to generate multiple interest rate paths. The Cox-Ingersoll-Ross (CIR) model extends the Vasicek model by making the volatility proportional to the square root of the rate, which keeps rates from becoming negative. The Vasicek model is defined as
\[ dr(t) = \kappa (\theta - r(t)) \, dt + \sigma \, dW(t) \]
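where \(r(t)\) is the short rate, \(\kappa\) is the speed of mean reversion, \(\theta\) is the long-term mean, \(\sigma\) is the volatility, and \(W(t)\) is a Wiener process.
The CIR model is defined as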
\[ dr(t) = \kappa (\theta - r(t)) \, dt + \sigma \sqrt{r(t)} \, dW(t) \]
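where \(\kappa\), \(\theta\), and \(\sigma\) again denote the speed of mean reversion, the long-term mean, and the volatility, and the \(\sqrt{r(t)}\) factor makes the diffusion term shrink as the rate approaches zero.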
The following code shows a simplified implementation of a CIR model. The specification of \(dr\) can be changed to make it a Vasicek model.
using Random, CairoMakie
# Set seed for reproducibility
Random.seed!(1234)
# CIR model parameters
κ = 0.2 # Speed of mean reversion
θ = 0.05 # Long-term mean
σ = 0.1 # Volatility
# Initial short-term interest rate
r₀ = 0.03
# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000
# Time increment
Δt = 1 / 252
# Function to simulate CIR process
function cir_simulation(κ, θ, σ, r₀, Δt, num_steps, num_simulations)
    interest_rate_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        interest_rate_paths[1, j] = r₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            # for Vasicek
            # dr = κ * (θ - interest_rate_paths[i-1, j]) * Δt + σ * dW
            dr = κ * (θ - interest_rate_paths[i-1, j]) * Δt + σ * sqrt(interest_rate_paths[i-1, j]) * dW
            interest_rate_paths[i, j] = max(interest_rate_paths[i-1, j] + dr, 0) # Ensure non-negativity
        end
    end
    return interest_rate_paths
end
# Run CIR simulation
cir_paths = cir_simulation(κ, θ, σ, r₀, Δt, num_steps, num_simulations)
# Plot the simulated interest rate paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, cir_paths[:, i])
end
f
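A useful sanity check on a simulation like this is to compare the simulated mean with the analytic expectation of the CIR process, \(E[r(T)] = \theta + (r_0 - \theta) e^{-\kappa T}\). The sketch below re-implements only the terminal rates with the same Euler scheme and the same parameter values. The function name `cir_terminal_rates` and the path count are illustrative choices.

```julia
using Random, Statistics

# Terminal short rates from an Euler scheme for the CIR model,
# truncated at zero in the same way as the simulation above.
function cir_terminal_rates(rng, κ, θ, σ, r₀, Δt, num_steps, num_paths)
    rates = fill(float(r₀), num_paths)
    for _ in 1:num_steps, j in 1:num_paths
        dW = randn(rng) * sqrt(Δt)
        dr = κ * (θ - rates[j]) * Δt + σ * sqrt(rates[j]) * dW
        rates[j] = max(rates[j] + dr, 0.0)
    end
    return rates
end

κ, θ, σ, r₀ = 0.2, 0.05, 0.1, 0.03
Δt, num_steps = 1 / 252, 252
T = Δt * num_steps

rates = cir_terminal_rates(Xoshiro(1234), κ, θ, σ, r₀, Δt, num_steps, 10_000)
simulated_mean = mean(rates)
analytic_mean = θ + (r₀ - θ) * exp(-κ * T)
```

The two means should agree to within Monte Carlo error. A persistent gap would point to a coding error or to a time step that is too coarse.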
The Hull-White model is a one-factor model that extends the Vasicek model by allowing the mean reversion and volatility parameters to be time-dependent. It is commonly used for pricing interest rate derivatives. The Brace-Gatarek-Musiela (BGM) model, one of the LIBOR Market Models (LMM), goes further by describing the evolution of forward rates with multiple factors, which allows the entire yield curve to be modeled instead of only the short rate. The Hull-White model is defined as
\[ dr(t) = (\theta(t) - a r(t)) \, dt + \sigma(t) \, dW(t) \]
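where \(\theta(t)\) is a time-dependent drift term (typically calibrated to the initial yield curve), \(a\) is the speed of mean reversion, \(\sigma(t)\) is the volatility, and \(W(t)\) is a Wiener process. The code below uses constant parameters and writes the mean reversion speed as α.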
using Random, CairoMakie
# Set seed for reproducibility
Random.seed!(1234)
# Hull-White model parameters
α = 0.1 # Mean reversion speed
σ = 0.02 # Volatility
r₀ = 0.03 # Initial short-term interest rate
# Drift term θ(t), held constant here. The value is an assumption for
# illustration, chosen so that rates revert toward r₀ (long-run level θ / α).
θ = α * r₀
# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000
# Time increment
Δt = 1 / 252
# Function to simulate Hull-White process
function hull_white_simulation(θ, α, σ, r₀, Δt, num_steps, num_simulations)
    interest_rate_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        interest_rate_paths[1, j] = r₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            # Drift (θ - α r) matches the SDE; volatility enters only through the σ * dW term
            dr = (θ - α * interest_rate_paths[i-1, j]) * Δt + σ * dW
            interest_rate_paths[i, j] = interest_rate_paths[i-1, j] + dr
        end
    end
    return interest_rate_paths
end
# Run Hull-White simulation
hull_white_paths = hull_white_simulation(θ, α, σ, r₀, Δt, num_steps, num_simulations)
# Plot the simulated interest rate paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, hull_white_paths[:, i])
end
f
Geometric Brownian motion (GBM) is a stochastic process commonly used to model the price movement of financial instruments, including stocks. It assumes constant drift and volatility, and the price at any future date follows a log-normal distribution. It is defined as
\[ dS(t) = \mu S(t) \, dt + \sigma S(t) \, dW(t) \]
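where \(S(t)\) is the asset price, \(\mu\) is the drift (expected return), \(\sigma\) is the volatility, and \(W(t)\) is a Wiener process.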
using Random, CairoMakie
# Set seed for reproducibility
Random.seed!(1234)
# GBM parameters
μ = 0.05 # Drift (expected return)
σ = 0.2 # Volatility
# Initial stock price
S₀ = 100
# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000
# Time increment
Δt = 1 / 252
# Function to simulate GBM
function gbm_simulation(μ, σ, S₀, Δt, num_steps, num_simulations)
    stock_price_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        stock_price_paths[1, j] = S₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            S = stock_price_paths[i-1, j]
            dS = μ * S * Δt + σ * S * dW
            stock_price_paths[i, j] = S + dS
        end
    end
    return stock_price_paths
end
# Run GBM simulation
gbm_paths = gbm_simulation(μ, σ, S₀, Δt, num_steps, num_simulations)
# Plot the simulated stock price paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, gbm_paths[:, i])
end
f
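Because GBM has a closed-form solution, \(S(T) = S_0 \exp\left((\mu - \sigma^2/2) T + \sigma W(T)\right)\), terminal prices can also be sampled in a single step without discretization error. The sketch below compares that exact sampling with an Euler scheme like the one above. The function names and sample size are illustrative.

```julia
using Random, Statistics

# Exact one-step sampling of GBM at horizon T using the log-normal solution
function gbm_terminal_exact(rng, μ, σ, S₀, T, n)
    Z = randn(rng, n)
    return S₀ .* exp.((μ - σ^2 / 2) * T .+ σ * sqrt(T) .* Z)
end

# Euler scheme with num_steps steps, updating all n paths at once
function gbm_terminal_euler(rng, μ, σ, S₀, T, num_steps, n)
    Δt = T / num_steps
    S = fill(float(S₀), n)
    for _ in 1:num_steps
        S .+= μ .* S .* Δt .+ σ .* S .* sqrt(Δt) .* randn(rng, n)
    end
    return S
end

μ, σ, S₀, T = 0.05, 0.2, 100.0, 1.0
exact_mean = mean(gbm_terminal_exact(Xoshiro(1), μ, σ, S₀, T, 100_000))
euler_mean = mean(gbm_terminal_euler(Xoshiro(2), μ, σ, S₀, T, 252, 100_000))
theoretical_mean = S₀ * exp(μ * T)
```

Both sample means should sit close to the theoretical value \(S_0 e^{\mu T}\). When only terminal values are needed, the exact sampler is the cheaper choice. A stepwise scheme is needed when the whole path matters.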
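Generalized autoregressive conditional heteroskedasticity (GARCH) models capture time-varying volatility. They are often used in conjunction with other models to forecast volatility. A GARCH(1,1) model, with conditional variance \(\sigma^2_t\), return \(r_t\), and standardized shock \(\varepsilon_t\), is defined as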
\[ \sigma^2_t = \omega + \alpha_1 r^2_{t-1} + \beta_1 \sigma^2_{t-1} \]
\[ r_t = \varepsilon_t \sqrt{\sigma^2_t} \]
using Random, CairoMakie
# Set seed for reproducibility
Random.seed!(1234)
# GARCH(1,1) parameters
α₀ = 0.01 # Constant term (ω in the equation)
α₁ = 0.1 # Coefficient for lagged squared returns
β₁ = 0.8 # Coefficient for lagged conditional variance
# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000
# Function to simulate GARCH(1,1) volatility
function garch_simulation(α₀, α₁, β₁, num_steps, num_simulations)
    volatility_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        ε = randn(num_steps) # Standardized shocks
        variance = zeros(num_steps) # Conditional variance σ²ₜ
        returns = zeros(num_steps) # Returns rₜ = εₜ * sqrt(σ²ₜ)
        # Start each path at the unconditional variance
        variance[1] = α₀ / (1 - α₁ - β₁)
        returns[1] = ε[1] * sqrt(variance[1])
        volatility_paths[1, j] = sqrt(variance[1])
        for i in 2:num_steps
            variance[i] = α₀ + α₁ * returns[i-1]^2 + β₁ * variance[i-1]
            returns[i] = ε[i] * sqrt(variance[i])
            volatility_paths[i, j] = sqrt(variance[i])
        end
    end
    return volatility_paths
end
# Run GARCH simulation
garch_paths = garch_simulation(α₀, α₁, β₁, num_steps, num_simulations)
# Plot the simulated volatility paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, garch_paths[:, i])
end
f
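When \(\alpha_1 + \beta_1 < 1\), a GARCH(1,1) process has a long-run (unconditional) variance of \(\omega / (1 - \alpha_1 - \beta_1)\). The sketch below simulates one long return series from the two defining equations and compares its sample variance with that value. The function name `garch_returns` and the series length are assumptions for illustration.

```julia
using Random, Statistics

# Simulate one GARCH(1,1) return series from the two defining equations
function garch_returns(rng, ω, α₁, β₁, n)
    variance = ω / (1 - α₁ - β₁)   # start at the unconditional variance
    r = zeros(n)
    for t in 1:n
        r[t] = randn(rng) * sqrt(variance)              # rₜ = εₜ * sqrt(σ²ₜ)
        variance = ω + α₁ * r[t]^2 + β₁ * variance      # next period's variance
    end
    return r
end

ω, α₁, β₁ = 0.01, 0.1, 0.8
r = garch_returns(Xoshiro(1234), ω, α₁, β₁, 200_000)

sample_variance = var(r)
unconditional_variance = ω / (1 - α₁ - β₁)   # 0.1 for these parameters
```

With these parameters the persistence \(\alpha_1 + \beta_1 = 0.9\) is high, so volatility shocks decay slowly and a long series is needed before the sample variance settles near the long-run level.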
Simulating data using copulas involves generating multivariate samples with specified marginal distributions and a copula structure.
using Random, CairoMakie, BivariateCopulas
# Set seed for reproducibility
Random.seed!(1234)
# Generate a Gaussian copula
gaussian_copula = Gaussian(0.8)
# Show simulated copula
f = scatter(rand(gaussian_copula, 10^4))
f
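To show what sits underneath such a sampler, here is a hand-rolled sketch of a bivariate Gaussian copula: correlate two normal draws, push them through the normal CDF to obtain dependent uniforms, then apply inverse CDFs to impose the marginals. It relies on Distributions.jl for `cdf` and `quantile`. The function name, the correlation of 0.8, and the choice of marginals are illustrative.

```julia
using Random, Distributions, Statistics

# Draw n pairs of uniforms whose dependence follows a Gaussian copula with correlation ρ
function gaussian_copula_sample(rng, ρ, n)
    z₁ = randn(rng, n)
    z₂ = ρ .* z₁ .+ sqrt(1 - ρ^2) .* randn(rng, n)   # correlated standard normals
    u₁ = cdf.(Normal(), z₁)                          # map each margin to Uniform(0, 1)
    u₂ = cdf.(Normal(), z₂)
    return u₁, u₂
end

rng = Xoshiro(1234)
u₁, u₂ = gaussian_copula_sample(rng, 0.8, 10_000)

# Impose illustrative marginals through their inverse CDFs
x₁ = quantile.(Gamma(2, 3), u₁)
x₂ = quantile.(LogNormal(0, 1), u₂)

uniform_correlation = cor(u₁, u₂)
```

The dependence structure lives entirely in the uniforms. The marginals can be swapped freely without changing how strongly the two variables move together in rank terms.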
Copulas can also be used to infer combined distributions from data samples.
using Copulas, Distributions, Random
X₁ = Gamma(2, 3)
X₂ = Pareto()
X₃ = LogNormal(0, 1)
C = ClaytonCopula(3, 0.7) # A 3-variate Clayton Copula with θ = 0.7
D = SklarDist(C, (X₁, X₂, X₃)) # The final distribution
# Generate a dataset
simu = rand(D, 1000)
# We may estimate a copula, or get parameters of underlying distributions, using the `fit` function:
D̂ = fit(SklarDist{ClaytonCopula,Tuple{Gamma,Normal,LogNormal}}, simu)
Copulas.SklarDist{Copulas.ClaytonCopula{3, Float64}, Tuple{Distributions.Gamma{Float64}, Distributions.Normal{Float64}, Distributions.LogNormal{Float64}}}(
C: Copulas.ClaytonCopula{3, Float64}(
G: Copulas.ClaytonGenerator{Float64}(0.7255762179151387)
)
m: (Distributions.Gamma{Float64}(α=1.9509359315325794, θ=3.0668504198367565), Distributions.Normal{Float64}(μ=6.958764796293847, σ=27.415016590130424), Distributions.LogNormal{Float64}(μ=0.01132053842187167, σ=1.0263584835287456))
)