32  More Useful Techniques

Author

Alec Loudenback

“All models are wrong, but some are useful.” - George Box (1976)

32.1 Chapter Overview

A grab-bag of practical techniques for keeping models honest: sanity checks, serialization patterns for reproducibility, validation workflows, and ways to think about whether a model is doing what you think it’s doing.

32.2 General Modeling Techniques

32.2.1 Taking Things to the Extreme

Before trusting any model, ask: what happens at the edges? Set interest rates to zero, or negative. Assume 100% lapse. Perfectly correlated defaults. An illiquid market with zero trades. These extreme scenarios often reveal assumptions you didn’t know you had made.

Consider a simple loan loss model. It might work perfectly well under normal conditions, but what happens when recovery rates hit zero? Does the code handle that gracefully, or does it divide by something that’s now zero? Extreme thought experiments surface these hidden assumptions before production does.
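As a sketch, consider a toy expected-loss model (EL = PD × LGD × EAD; all names and numbers here are hypothetical). Pushing the inputs to the edges exposes two hidden assumptions:

```julia
# Toy expected-loss model: EL = PD × LGD × EAD, where LGD = 1 - recovery_rate
expected_loss(pd, recovery, ead) = pd * (1 - recovery) * ead

# A naive "coverage multiple" that divides by expected loss
coverage(reserves, pd, recovery, ead) = reserves / expected_loss(pd, recovery, ead)

# Normal conditions look fine:
coverage(50.0, 0.02, 0.4, 1_000.0)   # ≈ 4.17

# Extreme inputs surface the hidden assumptions:
coverage(50.0, 0.0, 0.4, 1_000.0)    # PD = 0 → division by zero → Inf
expected_loss(0.02, 1.2, 1_000.0)    # recovery above 100% → negative "loss"
```

Neither edge case throws an error; both silently produce values that would poison any downstream aggregation, which is exactly why you want to probe them deliberately.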

32.2.2 Range Bounding

Sometimes you don’t need the answer—you just need to know that the answer is good enough. If both a pessimistic and an optimistic estimate clear your hurdle, you’re done.

Here’s a classic example from interview lore: you need to determine whether a mortgaged property’s value exceeds the $100,000 loan balance. No appraisal available. But you know that a comparable house in worse condition sold for $100 per square foot, and from the floor plan this house must be at least 1,000 square feet. So:

\[ \frac{\$100}{\text{sq. ft.}} \times 1{,}000 \text{ sq. ft.} = \$100{,}000 \]

The property almost certainly exceeds the loan balance. No complex modeling required.

This technique is particularly useful in early scoping meetings or ad-hoc regulatory requests where a directional answer is all you need.
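The decision logic behind range bounding can be captured in a small helper (a sketch; the numbers mirror the property example above):

```julia
# A decision is settled when both bounds land on the same side of the hurdle.
function bounded_decision(pessimistic, optimistic, hurdle)
    pessimistic >= hurdle && return :accept        # even the worst case clears
    optimistic < hurdle && return :reject          # even the best case fails
    return :needs_more_analysis                    # bounds straddle the hurdle
end

bounded_decision(100_000.0, 150_000.0, 100_000.0)  # :accept — no appraisal needed
bounded_decision(80_000.0, 150_000.0, 100_000.0)   # :needs_more_analysis
```

Only the third outcome justifies building a more detailed model; the first two let you stop early.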

32.2.3 Pseudo-Monte Carlo Sanity Checks

Before committing to a massive simulation run, do a miniature version first. Fix the random seed, use a handful of scenarios, and verify that everything works end to end. This catches problems like:

  • Configuration files that aren’t being read correctly
  • Aggregation logic that breaks on edge cases
  • Performance bottlenecks that will be painful at scale

A ten-scenario dry run that takes five seconds can save you from discovering bugs halfway through an overnight batch job.
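A minimal sketch of the pattern, where run_scenario stands in for a real projection model:

```julia
using Random, Statistics

# Stand-in for an expensive model run: e.g. one year of monthly shocks
run_scenario(rng) = sum(randn(rng, 12))

function run_batch(n_scenarios; seed=42)
    rng = MersenneTwister(seed)                     # fixed seed → reproducible
    results = [run_scenario(rng) for _ in 1:n_scenarios]
    (n=n_scenarios, mean=mean(results), extrema=extrema(results))
end

# Ten-scenario dry run: exercises config, aggregation, and runtime end to end
dry = run_batch(10)
# ...only once the dry run looks sane, launch the full job:
# full = run_batch(100_000)
```

Because the seed is fixed, the dry run is also a cheap regression test: rerun it after any code change and compare against the previous summary.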

32.2.4 Model Validation

32.2.4.1 Static vs. Dynamic

Model validation is essential. The most common validation approach is static: split your data chronologically, fit on the earlier period, and test on the later period. This tells you how well the model generalizes to unseen data.

using Random, Statistics, LinearAlgebra

T = 200
x = rand(T)
y = 1.0 .+ 2.0 .* x .+ 0.1 .* randn(T)

# Chronological holdout
cut = 150
Xtrain = hcat(ones(cut), x[1:cut])
ytrain = y[1:cut]
Xtest = hcat(ones(T - cut), x[(cut+1):end])
ytest = y[(cut+1):end]

θ = Xtrain \ ytrain
ŷ = Xtest * θ

println("MSE: ", mean((ŷ .- ytest) .^ 2))
println("MAE: ", mean(abs.(ŷ .- ytest)))
MSE: 0.0091421019651262
MAE: 0.07588207314174233

Dynamic validation (sometimes called walk-forward validation) is more demanding: at each time step, you only use data available up to that point. This mimics how the model would actually be used in production.

Random.seed!(42)

T = 200
x = rand(T)
y = 1.0 .+ 2.0 .* x .+ 0.1 .* randn(T)

initial_window = 60
sqerrs = Float64[]

for t in (initial_window+1):T
    Xtr = hcat(ones(t - 1), x[1:(t-1)])
    ytr = y[1:(t-1)]
    θ = Xtr \ ytr
    ŷt = [1.0, x[t]]' * θ
    push!(sqerrs, (ŷt - y[t])^2)
end

println("Walk-forward MSE: ", mean(sqerrs))
Walk-forward MSE: 0.012102241884186694
Note

In some contexts, “static” and “dynamic” validation mean something different: static validation checks whether the model reproduces time-zero prices or balances, while dynamic validation checks whether projected cashflows match historical trends.

32.2.4.2 Implied Rates

Implied rates are a form of model inversion: given an observed price, what rate would produce that price? If your pricing function and your implied-rate function don’t round-trip consistently, something is wrong.

using Zygote

function present_value(rate, cash_flows)
    sum(cf / (1 + rate)^i for (i, cf) in enumerate(cash_flows))
end

function implied_rate(cash_flows, price)
    f(r) = present_value(r, cash_flows) - price
    # Newton's method using autodiff for the derivative
    x = 0.05
    for _ in 1:100
        fx = f(x)
        abs(fx) < 1e-6 && return x
        x -= fx / gradient(f, x)[1]
    end
    return NaN  # didn't converge
end

cash_flows = [100, 100, 100, 100, 1100]
prices = [950, 1000, 1050]

for price in prices
    r = implied_rate(cash_flows, price)
    println("Price $price → rate $(round(r*100, digits=2))%")
end
Price 950 → rate 11.37%
Price 1000 → rate 10.0%
Price 1050 → rate 8.72%
Tip

JuliaActuary’s FinanceCore.jl provides a robust irr function that handles edge cases better than a hand-rolled Newton’s method.

32.2.5 Predictive vs. Explanatory Models

Models serve different masters. A predictive model needs to forecast accurately; an explanatory model needs to tell a coherent story about why things happen. The validation approach should match the purpose.

For prediction, pick a loss function that matches how the forecast will be used. If you’re forecasting claims payments, RMSE or MAE make sense. If you’re estimating Value-at-Risk, use a quantile loss that rewards accurate tail placement. If you’re producing full distributions, consider the Brier score or CRPS.
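For instance, the pinball (quantile) loss makes the tail-accuracy incentive explicit; τ and the numbers below are illustrative:

```julia
# Pinball (quantile) loss: asymmetric penalty that rewards accurate quantile
# placement. τ = 0.99 targets the 99th percentile.
pinball(y, ŷ, τ) = y >= ŷ ? τ * (y - ŷ) : (1 - τ) * (ŷ - y)

# Under-predicting the tail is penalized 99× more than over-predicting:
pinball(110.0, 100.0, 0.99)  # 0.99 × 10 = 9.9
pinball(100.0, 110.0, 0.99)  # 0.01 × 10 = 0.1
```

Minimizing the average pinball loss over a dataset yields an estimate of the τ-th conditional quantile, which is exactly what a VaR-style metric asks for.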

For explanation, the bar is different. Coefficients should have sensible signs and magnitudes—a lapse elasticity of –0.3 per 100 bps rate change is something you can discuss with product actuaries. The model should be stable across different time periods, and it should remain plausible under counterfactual scenarios (“what if we changed surrender charges?”).

Financial Modeling Pro Tip

Align the loss you optimize with the metric you report. If the risk committee cares about 99th percentile losses, train and evaluate on quantile losses—not just RMSE.

32.2.6 Causal Modeling

Causal modeling addresses an important distinction: most financial models capture correlation, not causation. That’s often fine for prediction, but dangerous for “what-if” analysis. If you want to know what happens when you change something, you need causal reasoning.

Judea Pearl’s work on directed acyclic graphs (DAGs) provides a framework for this. The basic idea: draw arrows between variables to represent direct causal influence, then use the graph to determine what you need to control for (and what you shouldn’t).

A few patterns come up repeatedly:

Confounders drive both the treatment and the outcome. Macro growth affects both lending standards and default rates. If you don’t account for it, you’ll see a spurious relationship between standards and defaults.

Mediators sit on the causal pathway. A capital rule affects lending supply, which affects loan growth. If you control for lending supply, you block part of the effect you’re trying to measure.

Colliders are caused by two other variables. Regulation intensity and market stress both affect media coverage. If you control for media coverage, you create a spurious correlation between regulation and stress.
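To see the confounder pattern above concretely, here is a small simulation (all coefficients are illustrative): macro growth drives both lending standards and defaults, with no direct link between the two, yet the raw correlation is strongly negative. Residualizing both series on the confounder removes the spurious relationship.

```julia
using Random, Statistics

Random.seed!(1)
n = 10_000

# Confounder: macro growth drives both variables
growth = randn(n)
standards = 0.8 .* growth .+ 0.6 .* randn(n)   # looser standards in booms
defaults = -0.8 .* growth .+ 0.6 .* randn(n)   # fewer defaults in booms
# Note: standards have NO direct causal effect on defaults here.

cor(standards, defaults)                        # strongly negative — spurious

# Control for the confounder by residualizing both series on growth
# (no intercept needed: all series are mean-zero by construction)
beta(v, z) = (z' * v) / (z' * z)
resid(v, z) = v .- beta(v, z) .* z

cor(resid(standards, growth), resid(defaults, growth))  # ≈ 0
```

The raw correlation would tempt you to conclude that tighter standards cause defaults; the DAG tells you to condition on growth first.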

This matters because financial regulators and boards increasingly ask “what happens if we do X?” Answering that question requires thinking carefully about causal structure, not just fitting the best predictive model. See Pearl (2009) for more on this topic.

32.2.7 Other Techniques Worth Knowing

A few topics we won’t cover in depth but are worth exploring:

Quasi-Monte Carlo uses low-discrepancy sequences (Sobol, Halton) instead of pseudo-random numbers. For high-dimensional integrals like exotic option pricing or nested ALM, this can dramatically reduce variance.

Variance reduction techniques—control variates, antithetic paths, stratification—shrink simulation error without adding more scenarios. Useful when estimating Greeks or tail percentiles.

Scenario reduction algorithms compress thousands of economic scenarios into a representative subset while preserving risk metrics. Kantorovich distance pruning is one approach.

Reverse stress testing inverts the usual question: instead of “what’s the loss under scenario X?”, ask “what scenario produces loss Y?” This can surface vulnerabilities that standard stress grids miss.
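As a taste of variance reduction, here is a minimal antithetic-variates sketch; the payoff function f is illustrative:

```julia
using Random, Statistics

# Estimate E[f(Z)] for Z ~ N(0,1), pairing each draw z with its mirror -z
Random.seed!(7)
f(z) = exp(0.2 * z)          # a simple lognormal-style payoff

n = 50_000
z = randn(n)
plain = mean(f.(z))                      # plain Monte Carlo estimate
anti = mean((f.(z) .+ f.(-z)) ./ 2)      # antithetic estimate

# Both target E[f(Z)] = exp(0.02) ≈ 1.0202; for monotone payoffs like this
# one, f(z) and f(-z) are negatively correlated, so the paired average has
# much lower variance than independent draws.
```

The same number of model evaluations buys a tighter estimate, which is why these techniques matter when each scenario is expensive.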

32.3 Programming Techniques

32.3.1 Serialization

Serialization is important because in most finance workflows, the slow part isn’t the regression—it’s the data prep, calibration, and scenario generation that come before. If you’re running the same expensive calibration every time you tweak something downstream, you’re wasting compute and making audits harder.

Serialization lets you checkpoint expensive intermediate results. The question is which format to use:

| Format               | Good for                              | Watch out for                |
|----------------------|---------------------------------------|------------------------------|
| Serialization stdlib | Quick caches, memoization             | Breaks across Julia versions |
| JLD2                 | Persisting results across sessions    | Still Julia-specific         |
| Arrow/Parquet        | Large tables, cross-language sharing  | Not for arbitrary Julia types|
| CSV/JSON/TOML        | Configs, small tables, human-readable | Slow, lossy for binary data  |

Here’s a pattern for saving model state with atomic writes (so you don’t end up with half-written files if something crashes):

using Dates, Serialization

struct ModelState
    θ::Vector{Float64}        # fitted parameters (example)
    seed::Int64               # RNG seed used for the run
    timestamp::DateTime       # when the snapshot was created
    note::String              # short description
end

# Atomic write to avoid half-written files
function atomic_serialize(path::AbstractString, obj)
    dir = dirname(path)
    mkpath(dir)
    tmp = tempname(dir)
    serialize(tmp, obj)
    mv(tmp, path; force=true)
    return path
end

# Example: save/load a state
θ = [1.0, 2.0]                      # pretend these were estimated
state = ModelState(θ, 42, now(), "OLS on 2025-08-11")

path = joinpath("artifacts", "model_state.jls")
atomic_serialize(path, state)
restored = deserialize(path)
ModelState([1.0, 2.0], 42, DateTime("2026-02-09T18:57:26.662"), "OLS on 2025-08-11")

For cross-session persistence where you might share artifacts with colleagues, JLD2 is more robust.

using JLD2, Random, LinearAlgebra, Dates

X = hcat(ones(100), rand(100))
y = X * [1.0, 2.0] .+ 0.1 .* randn(100)
θ = X \ y

meta = (
    julia_version=string(VERSION),
    created_at=string(now()),
    description="OLS fit example",
)

mkpath("artifacts")
jldsave("artifacts/example.jld2"; θ, meta)

θ_loaded, meta_loaded = JLD2.load("artifacts/example.jld2", "θ", "meta")
([0.9770282860377026, 2.059377931238816], (julia_version = "1.12.4", created_at = "2026-02-09T18:57:27.176", description = "OLS fit example"))

The key insight: serialize fitted parameters, calibrated curves, and expensive intermediate results. Don’t serialize raw data—keep that in efficient columnar formats and reference it by path (and ideally by content hash) in your artifact metadata. And remember that modern CPUs are fast: in many cases it’s quicker to recompute an answer than to retrieve a cached copy from disk, so profile before adding a caching layer.

32.3.2 Memoization

Memoization is caching function results keyed by their inputs. For expensive computations that get called repeatedly with the same arguments, this can be a huge win.

using SHA, Serialization

function cachekey(label, args...; kwargs...)
    io = IOBuffer()
    print(io, label, '|', args, '|', kwargs)
    bytes2hex(sha1(take!(io)))
end

function memoize_to_disk(f; label="f", cache_dir="cache")
    mkpath(cache_dir)
    function (args...; kwargs...)
        key = cachekey(label, args...; kwargs...)
        path = joinpath(cache_dir, "$key.jls")
        if isfile(path)
            return deserialize(path)
        end
        result = f(args...; kwargs...)
        tmp = tempname(cache_dir)
        serialize(tmp, result)
        mv(tmp, path; force=true)
        return result
    end
end

# Wrap an expensive function
ols = (X, y) -> X \ y
ols_cached = memoize_to_disk(ols; label="ols_v1")

# First call computes and caches; second call loads from disk
θa = ols_cached([ones(3) [1.0, 2.0, 3.0]], [1.0, 3.0, 5.0])
θb = ols_cached([ones(3) [1.0, 2.0, 3.0]], [1.0, 3.0, 5.0])
@assert θa == θb
Financial Modeling Pro Tip

For recurring production runs, use a directory convention like artifacts/YYYY-MM-DD/ and clean old caches on a schedule. Otherwise disk usage creeps up over time.

32.3.3 Automated Benchmarks

If you have a pricing engine or cash-flow projection that runs nightly, maintain a small set of benchmark portfolios with known expected outputs. Run them automatically and alert if results drift. This catches numerical regressions before they reach production—and gives you confidence when refactoring.
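A minimal sketch of such a harness, with hypothetical benchmark entries and the bond-pricing helper from earlier in the chapter:

```julia
# Discounted cash flow pricer (same form as the implied-rate example above)
present_value(rate, cfs) = sum(cf / (1 + rate)^i for (i, cf) in enumerate(cfs))

# Benchmark portfolios with known expected outputs and tolerances
benchmarks = [
    (name="par bond",    rate=0.10, cfs=[100, 100, 100, 100, 1100], expected=1000.0,        tol=1e-6),
    (name="zero coupon", rate=0.05, cfs=[0, 0, 0, 0, 1000],         expected=1000 / 1.05^5, tol=1e-6),
]

for b in benchmarks
    got = present_value(b.rate, b.cfs)
    drift = abs(got - b.expected)
    # In production, replace @warn with an alert to your monitoring system
    drift <= b.tol || @warn "Benchmark drifted" b.name got b.expected drift
end
```

Keep the expected values and tolerances under version control alongside the engine itself, so any intentional methodology change updates them in the same review.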