4  The Practice of Financial Modeling

Authors

Alec Loudenback

Yun-Tien Lee

“In theory there is no difference between theory and practice. In practice there is.” – Yogi Berra (often attributed)

4.1 Chapter Overview

Having covered what models are and what they accomplish, we turn to the craft of modeling: what distinguishes a good model from a bad one, and what makes an astute practitioner. We’ll also cover some “nuts and bolts” topics like data handling and governance practices.

4.2 What makes a good model?

The answer is: it depends.

4.2.1 Achieving original purpose

A model is built for a specific set of reasons, and we must therefore evaluate it in terms of achieving those goals. We should not critique a model for falling short in uses it was never intended to serve.

Consider a model created for scenario analysis to value all assets in a portfolio to within half a percent of a more accurate, but much more computationally expensive model. That’s a perfectly good model—for its intended use. But if someone later tries to add a never-before-seen asset class or repurpose it to order trades (a use case requiring much higher accuracy), they’re extending beyond the original design scope and shouldn’t be surprised when predictive accuracy suffers.

We’ve seen this pattern play out many times: a quick-and-dirty prototype built for an analyst’s personal use gradually gets promoted to “the production model” without anyone rethinking the original design decisions. The model doesn’t get worse—it just gets asked to do things it was never built for.

4.2.2 Usability

How easy is it for someone to use? Does it require pages of documentation, weeks of specialized training, and an on-call help desk? All else equal, the heavier the support and training required, the lower the model’s usability. That said, you may sometimes want to create a highly capable, complex model that requires significant experience and expertise. Think of the cockpit of a small Cessna versus a fighter jet: the former is simpler and takes less training to master, but is also more limited.

Figure 4.1 illustrates this tradeoff: if your goal is very high capability, expect to develop training materials and to support the more complex model. On this view, a better model is one that requires less time and experience to achieve the same level of capability.

Figure 4.1: Tradeoff between complexity and capability

4.2.3 Performance

Financial models are generally not used for their awe-inspiring beauty—users are results-oriented. All else equal, faster models are also better models. Beyond direct computational costs like server runtime, shorter model runtime means you can iterate faster, test new ideas on the fly, and stay focused on the problem at hand.

Many readers may be familiar with the cadence of (1) start the model running overnight, (2) discover in the morning that the run failed, (3) spend the day developing a fix, (4) repeat step 1. It is far better when this cycle can be measured in minutes instead of hours or days.

Of course, requirements must be considered here too: needs for high frequency trading, daily portfolio rebalancing, and quarterly financial reporting models have different requirements when it comes to performance.

4.2.4 Separation of Model Logic and Data

When data is intertwined with business logic, a model can be difficult to understand, maintain, or adapt. Spreadsheets are a common example of data commingled with business logic. An alternative that separates data sources from the computations makes the model far easier to service in the future.
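As a minimal sketch of this separation (shown in Python with made-up rates; the function and file layout are illustrative, not from the book), the computation is a pure function that knows nothing about where its inputs live, while the data is loaded from an external source:

```python
import csv
import io

# Logic: a pure function with no embedded data
def present_value(cashflows, rates):
    """Discount each cashflow at the matching per-period rate."""
    return sum(cf / (1 + r) ** t
               for t, (cf, r) in enumerate(zip(cashflows, rates), start=1))

# Data: loaded from an external source (an in-memory CSV stands in for a file)
rates_csv = io.StringIO("year,rate\n1,0.03\n2,0.035\n3,0.04\n")
rates = [float(row["rate"]) for row in csv.DictReader(rates_csv)]

pv = present_value([100.0, 100.0, 1100.0], rates)
```

Swapping in next quarter's rates now means changing a data file, not editing the model.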

4.2.5 Organization of Model Components and Architecture

If model components or data inputs are spread out in a disorganized way, usability and maintenance suffer. For example, it is often incredibly difficult to ascertain a model’s operation if inputs are scattered across many spreadsheet tabs, if related calculations are performed in multiple locations, or if it’s not clear where the line is drawn between calculations performed in the worksheets and those performed in macros.

If logical components or related data are broken out into discrete parts of a model, it becomes easier to reason about model behavior and to make modifications. Compartmentalization is an important principle that allows a larger model to remain a composition of simpler components, with the whole greater than the sum of its parts.

4.2.6 Abstraction of Modeled Systems

At different times we are interested in different levels on the ladder of abstraction: sometimes we are interested in the small details, but other times we are interested in understanding the behavior of systems at a higher level.

Say we are an insurance company with a portfolio of fixed income assets supporting long term insurance liabilities. We might delineate different levels of abstraction like so:

Table 4.1: An example of the different levels of abstraction when thinking about modeling an insurance company’s assets and liabilities.

  More abstract:  Sensitivity of an entire company’s solvency position
                  Sensitivity of a portfolio of assets
                  Behavior over time of an individual contract
  More granular:  Mechanics of an individual bond or insurance policy

At different times, we are interested in different aspects of a problem. In general, you gain more insight and a greater understanding of the system as you move up the ladder of abstraction. Sometimes a problem can only be unwound by moving down the ladder and focusing on the details in an analytical, take-apart-the-pieces way of thinking. But often, the more abstracted view is more useful for understanding the system and making decisions about it.

In fact, a lot of designing a model is essentially trying to figure out where to put the right abstractions. What is the right level of detail to model this in and what is the right level of detail to expose to other systems?

Let us also distinguish between vertical abstraction, as described above, and horizontal abstraction, which refers to encapsulating different properties or mechanics of model components at the same level of vertical abstraction. For example, both asset and liability mechanics sit at the most granular level in Table 4.1, but it may make sense in our model to separate their data and behavior from each other. Doing so would be an example of creating horizontal abstraction in service of our overall modeling goals.
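To make this concrete, here is a small sketch (in Python; the types and fields are hypothetical, not the book's implementation) in which asset and liability mechanics sit at the same granular level but each encapsulates its own data and behavior:

```python
from dataclasses import dataclass

# Two components at the same (granular) rung of vertical abstraction,
# with data and behavior encapsulated separately: horizontal abstraction.
@dataclass
class Bond:
    face: float
    coupon: float   # annual coupon rate
    maturity: int   # years

    def cashflows(self):
        cfs = [self.face * self.coupon] * self.maturity
        cfs[-1] += self.face  # return of principal at maturity
        return cfs

@dataclass
class Policy:
    annual_claim: float
    term: int

    def cashflows(self):
        # Liability outflows are negative from the company's perspective
        return [-self.annual_claim] * self.term
```

Because both types expose the same cashflows() notion, a more abstract layer (say, a portfolio projection) can aggregate them without caring which is which.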

This book will introduce powerful, programmatic ways to handle this through things like packages, modules, namespaces, and functions.

4.3 What makes a good modeler?

A model is nothing without its operator, and a skilled practitioner is worth their weight in gold. What elements separate a good modeler from a mediocre modeler?

4.3.1 Domain Expertise

An expert knowledgeable in relevant domains is crucial. Imagine if someone said, “Let’s emulate an architect by having a construction worker and an artist work together.” It’s all too common for businesses to pair a business expert with an IT person and hope for the finished product to be as good as one crafted by someone skilled in all the right domains.

Unfortunately, this means that there’s generally no easy way out of learning enough about finance, actuarial science, computers, and/or programming in order to be an effective modeler.

Also, a word of warning for the financial analysts out there: computer scientists may find it easier to learn applied financial modeling than the other way around, since their tools, techniques, and language of problem solving already form a more general and flexible skill-set. There are more technologists starting banks than there are financiers starting technology companies.

4.3.2 Model Theory

An essential part of financial modeling involves building up the modeler’s expertise, so we should characterize that knowledge more explicitly.

The modeler’s knowledge should be regarded as a theory, in the same sense as Ryle’s “Concept of Mind” (Ryle 1949).

A person with Model Theory in this sense could be described as one who:

  • knows how to operate the model in a wide variety of use cases
  • is able to envision the implementation of novel features and how those features relate to existing architecture of the model
  • explains, justifies, and answers queries about the model and its results
  • is keenly aware of the small world versus large world distinction (Section 3.2.1)

A financial model is rarely left in a final state. Regulatory changes, additional mechanics, sensitivity testing, market dynamics, new products, and new systems to interact with force a model to undergo change and development through its entire life. Like a living thing, it must have nurturing caregivers. The metaphor may sound strained, but Naur’s point is that unless the model also lives in the heads of its developers, it cannot successfully be maintained through time:

“The conclusion seems inescapable that at least with certain kinds of large programs, the continued adaptation, modification, and correction of errors in them, is essentially dependent on a certain kind of knowledge possessed by a group of programmers who are closely and continuously connected with them.” - Peter Naur, Programming as Theory Building, page 395.

Assume that we need to adapt the model to fit a new product. One possessing a high degree of model theory is able to:

  • describe the trade-offs between alternative approaches that would accomplish the desired change

  • relate the proposed change to the design of the current system, and identify any challenges that will arise as a result of prior design decisions

  • provide a quantitative estimate of the impact the change will have: runtime, risk metrics, valuation changes, etc.

  • analogize system behavior for themselves and for others

  • describe the key limitations the model has and where it is most divorced from the reality it seeks to represent

Abstractions and analogies of the system are a critical aspect of model theory, as the human mind cannot retain perfectly precise detail about how each sub-component of the system works. The ability to collapse and compartmentalize parts of the model at some times, limiting mental overload, while recalling important implementation details at others, requires training - and is enhanced by learning concepts like those covered in this book.

An example of how the right abstractions (and language describing those abstractions) can be helpful in simplifying the mental load:

Instead of:

The valuation process starts by reading an extract into three tabs of the spreadsheet. A macro loops through the list of policies on the first tab and in column C it gives the name of the applicable statutory valuation ruleset. The ruleset is defined as the combination of (1) the logic in the macro in the “Valuation” VBA module with, (2) the underlying rate tables from the tabs named XXX to ZZZ, along with (3) the additional policy level detail on the second tab. The valuation projection is then run with the current policy values taken from the third tab of the spreadsheet and the resulting reserve (equal to the actuarial present value of claims) is saved and recorded in column J of the first tab. Finally, a pivot table is used to sum up the reserves by different groups.

We could instead design the process so that the following could be said instead:

Policy extracts are parsed into a Policy datatype which contains a subtype ValuationKind indicating the applicable statutory ruleset to apply. From there, we map the valuation function over the set of Policys and perform an additive reduce to determine the total reserve.

There are terminologies and concepts in the second example which we will develop over the course of this section of the book - we don’t want to dwell on the details right now. However, we do want to emphasize that being able to condense the process down to descriptions that are much more meaningful to an understanding of the model is a key differentiator for a code-based model over spreadsheets. It is no exaggeration that we could develop a handful of compartmentalized components such that our primary valuation process described above could look like this in real code:

using CSV

# Parse each row of the policy extract into a Policy
policies = parse(Policy, CSV.File("extract.csv"))

# Value each policy and sum the results into the total reserve
reserve = mapreduce(value, +, policies)

We’ve abstracted the mechanistic workings of the model into concise and meaningful symbols that not only perform the desired calculations but also make it obvious to an informed but unfamiliar reader what it’s doing.

parse, mapreduce, +, value, and Policy are all imbued with meaning - the first three would be understood by any computer scientist by the nature of their training (training that this book covers). The last two are unique to our model and have “real world” meaning that our domain-expert modeler would understand, mapping very directly onto the way we would suggest implementing the details of value or Policy. The benefit of this, again, is to provide tools and concepts which let us more easily develop model theory.

4.3.3 Curiosity

No model, no matter how sophisticated, ever delivers a “final” answer. If anything, a good financial model sparks as many new questions as it answers.

Take the gnawing feeling you get when a model’s output seems “off” but you can’t quite put your finger on why. The untrained eye might chalk it up to randomness or let it slide, but genuine curiosity won’t settle for a hand-wavy excuse. That itch to resolve every weird edge case or apparent contradiction—to ask “what if?” and “why not?”—is what propels a practitioner beyond rote calculation into actual discovery.

Here’s an example from practice: we once inherited a model that produced a small negative value in a particular reserve calculation every 12 months, right around year-end. Everyone had been ignoring it for years because it was tiny and “probably just a rounding thing.” But someone finally got annoyed enough to dig in. Turned out there was an off-by-one error in how the model handled December versus January transitions—and fixing it revealed a larger issue with how certain liabilities were being recognized across calendar years. The “trivial” anomaly pointed to a real problem.

Curiosity in practice looks like:

  • If two approaches give wildly different answers for the same scenario, don’t sweep that under the rug. Dig until you’ve either found the bug or learned a new subtlety about your assumptions.
  • If changing an input slightly produces an outsized effect somewhere else, figure out the feedback mechanism causing it. That nonlinearity is telling you something.
  • Challenge what “everybody knows.” You’d be surprised how many standard assumptions in financial modeling are half-remembered lore rather than principled choices. Ask: where did this formula actually come from? Does it still apply?
  • When the model spits out something bizarre, treat it like an opportunity rather than a headache. Sometimes the oddball result teaches you more about the system than any routine validation could.

The best modelers we’ve worked with aren’t necessarily the flashiest coders or the most fluent in finance. They’re simply relentless about not leaving loose ends.

4.3.4 Rigor

If curiosity is the fuel, rigor is the steering wheel. All of that wandering through the thickets of “why?” needs a reliable process to keep from becoming noise or hand-waving. Rigor is what separates “I think it works” from “Here’s why it works, and here are its limits.”

When developing a model, assumptions and parameters need to be explicit, the methodology should align with established theory (or consciously depart from it for good reason), and you should think carefully about how the model will actually be used. Professional actuarial societies, for example, maintain a long list of Actuarial Standards of Practice (“ASOPs”), some of which apply directly to modeling and data handling. But regardless of whether formal standards apply to your situation, the underlying principles are worth internalizing.

Document your thinking as you go. Write it out, whether in a code comment, a README, or your own notebook. If you can’t explain your logic and your parameters, you probably don’t understand them as well as you think. We’ve all had the experience of returning to a model six months later and staring at a formula wondering “why did I do it this way?” Documentation is a gift to your future self.

Demand evidence for your choices. Don’t just trust your gut or yesterday’s industry standard. Check your results against reality, not just an assumed “right answer.” This means test cases, sensitivity checks, and “could we break this?” scenarios. If you’re using a Black-Scholes model, can you reproduce known option prices? If you’re projecting cash flows, do the first few periods match what actually happened?
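For instance, a validation test can pin a pricing routine to a well-known reference value. Here is a sketch in Python (the tolerance and parameters are illustrative choices, not prescriptions):

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, sigma, t):
    """Black-Scholes price of a European call on a non-dividend-paying asset."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

# A widely tabulated reference case: S=100, K=100, r=5%, sigma=20%, T=1 year
price = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
assert abs(price - 10.4506) < 1e-3, "model should reproduce the known option price"
```

If a later refactor breaks the pricing logic, this check fails immediately rather than letting the error propagate into downstream results.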

Don’t hide the warts. Make uncertainty visible, not hypothetical. Annotate what’s based on thin data versus what’s on solid ground. Rigor means being honest about what you don’t know—or what the model simply can’t say. A model that clearly states “this parameter is estimated from only 18 months of data” is more trustworthy than one that presents everything with false precision.

Lean on first principles when it matters. Oftentimes there’s a ‘simpler’ way to model something—a heuristic that says this complex transaction “works like exotic option ABC.” But making explicit all components of an interaction can be illuminating. Model out each leg of a transaction for clarity and confirmation of your understanding, even if you later simplify for production use.
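As a toy illustration of modeling each leg explicitly (Python, with stylized numbers): a forward purchase can be decomposed into borrowing cash and buying the asset, and we can confirm the package reproduces the forward payoff rather than taking the heuristic on faith:

```python
import math

spot, r, t = 100.0, 0.05, 1.0

# No-arbitrage forward price implied by the legs
forward_price = spot * math.exp(r * t)

# Leg 1: borrow the spot amount today (repay spot * e^(rt) at time t)
# Leg 2: buy the asset today (worth S_T at time t)
def package_payoff(s_t):
    return s_t - spot * math.exp(r * t)

# The single-instrument view: a forward contract pays S_T - F
def forward_payoff(s_t):
    return s_t - forward_price
```

Checking that the two agree across terminal prices confirms the understanding behind the shortcut, which can then be used safely in production.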

A bad model can be worse than no model at all. It gives false confidence, and people make decisions based on that false confidence. Through rigorous effort, you establish a minimum standard of quality that protects against the worst outcomes.

4.3.5 Clarity

A model is only as useful as it is understandable. This applies both to the model itself and to how you communicate about it.

Consider the term “duration.” To a fixed income analyst, it means interest rate sensitivity. To a project manager, it means how long something takes. To an actuary, it might mean policy duration—how long a contract has been in force. If you’re writing documentation or presenting results and you use “duration” without defining it, you’re inviting confusion. Either define your terms up front or pick less ambiguous words.

The same principle applies to assumptions. It’s not enough to report that a portfolio is worth $X million. What discount rate did you use? What prepayment assumptions? What credit spreads? Spell out what you left in, what you left out, and why. The philosophy and scaffolding matter as much as the final number.

Visual communication helps enormously. A simple diagram showing how data flows through a model, or a chart comparing scenarios, often communicates more than paragraphs of text. When explaining a model to stakeholders, we’ve found that a single well-designed visual can short-circuit twenty minutes of confusion.

Adjust your explanations for your audience. Developers want to know about data structures and edge cases. Business stakeholders want to know what the number means and what decisions it supports. End users want to know which buttons to push. The same model requires different explanations for each group.

Finally, review your documentation periodically. Models evolve, and documentation gets stale. Ask yourself: if I woke up with amnesia, would the next steps seem obvious? Clarity is about making your future self—and your colleagues—thankful, not furious, that you were ever given keyboard access.

4.3.6 Humility

The world is complicated in ways we can sometimes describe but never fully anticipate. A humble modeler tries to understand what the model can and cannot claim—and communicates those limitations in good faith. “We have a lot of data for low-rate environments, but rapidly rising rate environments haven’t been observed in this dataset” is the kind of caveat that should accompany results, not get buried in footnotes.

There’s a useful distinction between two kinds of uncertainty:

Irreducible uncertainty (also called aleatoric uncertainty) refers to the inherent randomness in a system. No amount of additional data or better models will eliminate it. Future market fluctuations, individual policyholder behavior, natural disasters—these are genuinely unpredictable. The best we can do is characterize the range of possibilities.

Reducible uncertainty (epistemic uncertainty) stems from a lack of knowledge. In principle, it can be reduced through more data, better measurement, or improved models. Parameter estimation errors fall into this category: if you had more historical observations, you could pin down that mortality rate more precisely. Model specification errors are similar—maybe you’ve left out an important variable, and including it would improve predictions.
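A quick simulation makes the distinction tangible (Python; the mortality rate and sample sizes are invented for illustration). Repeated estimates of a fixed rate scatter far less as the sample grows, showing parameter uncertainty shrinking, while any individual life's outcome remains random:

```python
import random
import statistics

random.seed(0)
true_rate = 0.02  # the (unknown, in practice) mortality rate

def estimate(n):
    """Estimate the rate from n simulated lives."""
    deaths = sum(1 for _ in range(n) if random.random() < true_rate)
    return deaths / n

# 200 independent estimates at each sample size
small_sample = [estimate(100) for _ in range(200)]
large_sample = [estimate(10_000) for _ in range(200)]

# The spread of the estimator shrinks roughly like 1/sqrt(n):
sd_small = statistics.stdev(small_sample)
sd_large = statistics.stdev(large_sample)
```

More data narrows sd_large well below sd_small, but no amount of data tells you whether a particular policyholder dies this year: that residual randomness is the aleatoric part.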

Table 4.2 describes these distinctions in more detail. You don’t need to enumerate every type of uncertainty for every model, but knowing your enemy is the first step in fighting it.

The practical implication is that a humble modeler distinguishes between “we’re uncertain because the world is random” and “we’re uncertain because we don’t have enough data.” The first can’t be fixed; the second might be. Communicating this distinction to stakeholders avoids overconfidence in model predictions and keeps everyone open to new information. It also builds trust: people respect a modeler who says “here’s what we know, here’s what we don’t, and here’s why” far more than one who presents every output as gospel.

Table 4.2: In attempting to model an uncertain world, we can be even more granular and specific in discussing sources of that uncertainty. This table summarizes commonly noted kinds of uncertainty that arise, and whether we can reduce the uncertainty by doing better (more data, better data, better models, etc.) or not.
Aleatory (Process) Uncertainty
  Characteristics: inherent randomness (aka “irreducible uncertainty”); cannot be eliminated, even with perfect knowledge
  Reducibility: irreducible
  Example: rolling dice or coin flips; the outcome is inherently uncertain despite full knowledge of the initial state

Epistemic (Parameter) Uncertainty
  Characteristics: due to limited data/knowledge (aka “reducible uncertainty”); imperfect information or model parameters
  Reducibility: reducible (more data / improved modeling)
  Example: uncertainty in a model’s parameters (e.g., climate sensitivity) that can be refined with more research

Model Structure Uncertainty
  Characteristics: uncertainty about the correct model or framework; often considered a special subset of epistemic uncertainty
  Reducibility: partially reducible (better theory / model selection)
  Example: linear vs. nonlinear models in complex systems; risk of omitting key variables or mis-specified dynamics

Deep (Knightian) Uncertainty
  Characteristics: “unknown unknowns”; probability distributions themselves are not well-defined or are fundamentally unquantifiable
  Reducibility: not quantifiable (cannot assign probabilities)
  Example: impact of radically new technology on society

Measurement Uncertainty
  Characteristics: errors in data collection or instrument readings; systematic biases or random errors in measurement
  Reducibility: partially reducible (improved measurement methods)
  Example: instrument precision limits in experiments; calibration errors in sensor data

Operational Uncertainty
  Characteristics: uncertainty in implementation/execution; human error, mechanical failure, or miscommunication in processes
  Reducibility: partially reducible (better training/processes)
  Example: surgical errors, system failures, or incorrect handling of a financial trade order

4.3.7 Architecture

Any sufficiently complex project benefits from architectural thinking. Think of your model like a house: if you don’t plan the plumbing, you’ll have a mess down the line.

One of the most important architectural principles is separating data from logic. The model itself should not embed substantial data—no hard-coded rate tables, no discount curves pasted into cells, no assumption sets buried in VBA. Instead, dynamically load data from appropriate data stores and leave the “model” as the implementation of datatypes and algorithms (the “business logic”). This makes it possible to run the same model against different datasets, to audit what inputs produced what outputs, and to update assumptions without touching the model code.

Modular design means breaking complex models into reusable, independent components. A valuation routine shouldn’t also handle data parsing and report generation. Each piece should do one thing well. When something breaks (and something always breaks), modular design lets you isolate the problem.

Stable interfaces matter when components need to communicate. If your asset model passes data to your liability model, define clearly what that data looks like. When you change the internals of one component, the other shouldn’t need to know—as long as the interface contract is preserved.
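A sketch of such a contract (Python; the field names and coverage rule are hypothetical stand-ins for whatever your models actually exchange):

```python
from dataclasses import dataclass

# The interface: the liability side depends only on this shape,
# never on the asset model's internals.
@dataclass(frozen=True)
class AssetProjection:
    period: int
    market_value: float
    income: float

def liability_coverage(projections, required_income):
    """Consumes only the interface; the asset model can be rewritten freely
    as long as it still emits AssetProjection records."""
    return all(p.income >= required_income for p in projections)

projs = [AssetProjection(1, 1000.0, 42.0), AssetProjection(2, 990.0, 41.0)]
ok = liability_coverage(projs, 40.0)
```

Whether the asset model is a full stochastic projection or a quick deterministic stub, the liability code above never changes: that is the contract being preserved.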

Version control is non-negotiable for any serious model. We’ll cover this later in the book, but the ability to track changes, revert mistakes, and understand the history of a model pays dividends every time something goes wrong (and something always goes wrong).

Performance becomes important as models scale. The difference between a model that runs in 30 seconds and one that runs in 3 hours is the difference between interactive exploration and overnight batch jobs. We’ll discuss data structures and algorithms that help, but the architectural decisions you make early—what to precompute, what to cache, what to parallelize—often determine whether performance optimization is even possible later.

Don’t underestimate the value of a well-organized model: it’s how you scale from small prototypes to systems you can trust in production.

4.3.8 Planning

When tackling a large modeling problem, it helps to think through the project before writing code. The temptation to “just start building” is strong, especially when you’re excited about the technical challenge. But time invested at the planning stage almost always pays dividends.

Start with objectives. What questions does this model need to answer? Who will use the outputs, and for what decisions? A model built for quarterly regulatory reporting has different requirements than one built for real-time trading decisions. If you don’t know the purpose clearly, you’ll make architectural choices you’ll regret later.

Define the scope explicitly. What’s in and what’s out? It’s easy for a project to expand indefinitely as stakeholders think of additional features. Write down the boundaries. “This model covers fixed income assets only. Equities are out of scope for version 1.” That sentence, agreed to early, saves arguments later.

Assess your data situation early. What data do you need? Where does it come from? How clean is it? We’ve seen projects derailed because someone assumed a data feed existed that didn’t, or assumed a field contained one thing when it contained another. Get eyes on actual data samples before committing to a design.

Choose your methodology consciously. Is this a Monte Carlo simulation? A closed-form approximation? A machine learning model? The choice should follow from the problem requirements and the available data, not from what’s fashionable or what the team knows best. Different methodologies have different data requirements, runtime characteristics, and interpretability properties.

Identify your stakeholders and get them involved early. Nothing derails a project faster than finishing it and discovering the business wanted something different. Regular check-ins prevent this. They also surface changing requirements before you’ve built too much to pivot.

Plan for validation and testing. How will you know if the model is working correctly? What test cases will you use? What benchmarks will you compare against? If you can’t answer these questions at the planning stage, you probably don’t understand the problem well enough yet.

Think about maintenance from the start. Who owns this model after it’s built? How will assumptions be updated? What happens when the regulatory environment changes? Models are rarely “done”—they require ongoing care. If you don’t plan for maintenance, you’ll end up with an orphan that nobody understands and everybody’s afraid to touch.

It’s easier to make changes to a well-planned project halfway through, because the necessary accommodations are more clearly defined. Without a plan, changes cascade unpredictably, and you end up rewriting more than you expected.

4.3.9 Essential Tools and Skills

An experienced modeler has a mental toolbox with many different approaches to draw on. Some problems call for a back-of-the-envelope calculation; others require a full Monte Carlo simulation. Some questions are best answered with a statistical model; others with a simple lookup table. The ability to recognize which tool fits which problem—and to switch approaches when the first one isn’t working—is what separates a craftsman from someone who only knows how to use a hammer.

This book will introduce many of these approaches. Table 4.3 lists some of the categories, covering both technical skills and the softer skills that matter in practice.

Table 4.3: A variety of skills have their place in the proficient financial modeler’s toolbelt.
Diverse Modeling Techniques
  • Statistical methods (e.g. regression, time series analysis, machine learning)
  • Optimization techniques (e.g. linear, non-linear, black-box)
  • Simulation methods (e.g. Monte-Carlo, agent-based, seriatim)
Software Proficiency
  • Programming languages
  • Database and data handling
  • Proprietary tools (e.g. Bloomberg)
Financial Theory
  • Asset pricing
  • Portfolio theory
  • Risk Management frameworks
Quantitative techniques
  • Numerical methods and algorithms
  • Bayesian inference
  • Stochastic calculus
Soft Skills
  • Verbal and written communication
  • Stakeholder engagement
  • Project Management

4.4 Feeding The Model

The lifeblood of the model is its data. In practice, a model’s fate is often sealed not by the sophistication of its algorithms, but by the quality of the data it consumes. We’ve seen beautifully architected models produce nonsense because someone fed them stale prices, or because a data field meant one thing in one system and something slightly different in another. Even the most elegant model is helpless in the face of bad inputs.

4.4.1 “Garbage In, Garbage Out”

Every experienced modeler has a story where a subtle data quirk led to a dramatic miscalculation—a column header shifted by one, a stale price feed, or a single outlier that quietly cascaded into a million-dollar mistake. The lesson: treat the data with every bit as much skepticism (and care) as you give the model itself.

Example: The JPMorgan ‘London Whale’

In 2012, JPMorgan Chase suffered over $6 billion in losses, partly due to errors in a Value-at-Risk (VaR) model. The model relied on data being manually copied and pasted into spreadsheets, a process that introduced errors. Furthermore, a key metric was calculated by taking the sum of two numbers instead of their average. This seemingly small data handling error magnified the model’s inaccuracy, demonstrating that even the most sophisticated institutions are vulnerable to the ‘Garbage In, Garbage Out’ principle.

4.4.2 A Modeler’s Data Instincts

Rather than thinking of data handling as a rigid checklist, approach it as a series of habits and questions.

Know your sources. Where did this data come from? Who collected it, and how? Is it raw, or has someone already “cleaned” it in ways you need to understand (or undo)? We once inherited a dataset where someone had helpfully replaced all the missing values with zeros—which made it impossible to distinguish “no data” from “actually zero.” Data provenance is not a formality; it’s the first step in understanding what can go wrong.

Trust, but verify. Never take a dataset at face value, even if it comes from a trusted system. Run summary statistics. Plot the distributions. Check for the bizarre and the mundane: are dates reasonable, units consistent, and identifiers unique? A quick histogram can show you if someone entered dollars when they should have entered millions of dollars.
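As a sketch of this habit, the dependency-free check below flags duplicate identifiers, implausible dates, and suspiciously small magnitudes. The field names, date window, and thresholds are illustrative assumptions, not a standard:

```python
from collections import Counter
from datetime import date

# Invented records: one duplicated id, one notional that looks like it
# was entered in millions when dollars were expected (or vice versa).
rows = [
    {"id": "A1", "trade_date": date(2024, 3, 1), "notional": 1_000_000.0},
    {"id": "A2", "trade_date": date(2024, 3, 1), "notional": 2.5},
    {"id": "A1", "trade_date": date(2024, 3, 2), "notional": 750_000.0},
]

def sanity_report(rows):
    """Return a list of human-readable data-quality flags."""
    issues = []
    ids = Counter(r["id"] for r in rows)
    issues += [f"duplicate id: {k}" for k, n in ids.items() if n > 1]
    for r in rows:
        # Plausible-date window is an illustrative choice:
        if not date(2000, 1, 1) <= r["trade_date"] <= date(2030, 12, 31):
            issues.append(f"implausible date on {r['id']}")
        if r["notional"] < 1_000:  # suspiciously small: wrong units?
            issues.append(f"check units on {r['id']}: notional={r['notional']}")
    return issues

for issue in sanity_report(rows):
    print(issue)
```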

Expect messiness. Real-world data is rarely pristine. Missing values, odd encodings, duplicated rows, and outliers are the norm, not the exception. The best modelers are part detective, part janitor: they track down wonky values, document their triage decisions, and know when to escalate a data quality concern upstream rather than quietly “fixing” it themselves.

Feature engineering is judgment, not magic. Choosing which fields to keep, combine, or discard is where domain expertise shines. Sometimes a new ratio or flag column, born from your understanding of the business, makes all the difference. But beware of “kitchen sink” modeling—too many features can obscure, rather than reveal, the truth. If you can’t explain why a feature should matter, think twice before including it.

Be wary of temporal traps. Mixing data from different time periods, or accidentally leaking future information into a model (a classic error), can invalidate results without any warning sign. If you’re building a model to predict defaults, and you accidentally include a field that was populated after the default occurred, your model will look great in backtesting and fail completely in production. When in doubt, plot your data against time and look for jumps, gaps, or trends that defy explanation.
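The default-prediction trap above can be sketched as follows; the records and field names (e.g., `days_past_due_at_chargeoff`) are hypothetical:

```python
# Hypothetical loan records; `days_past_due_at_chargeoff` is only known
# *after* a default occurs, so using it as a predictor leaks the outcome.
loans = [
    {"origination_year": 2021, "fico": 702, "days_past_due_at_chargeoff": 94,   "defaulted": 1},
    {"origination_year": 2021, "fico": 748, "days_past_due_at_chargeoff": None, "defaulted": 0},
    {"origination_year": 2022, "fico": 661, "days_past_due_at_chargeoff": 120,  "defaulted": 1},
]

LEAKY_FIELDS = {"days_past_due_at_chargeoff"}  # populated after the outcome

def features(loan: dict, as_of_origination: bool = True) -> dict:
    """Strip the label, and (by default) any field not knowable at origination."""
    keep = {k: v for k, v in loan.items() if k != "defaulted"}
    if as_of_origination:
        keep = {k: v for k, v in keep.items() if k not in LEAKY_FIELDS}
    return keep

print(features(loans[0]))  # {'origination_year': 2021, 'fico': 702}
```

Maintaining an explicit list of fields that are only populated after the predicted event is a simple, auditable defense against look-ahead bias.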

Keep data and logic separate. As mentioned earlier: don’t hard-code data into the model. Keep sources external, interfaces clean, and ingest paths well documented. If someone wants to rerun last year’s scenario, they shouldn’t have to guess which tab or variable held the original rates.
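One minimal way to keep the two separate, assuming a hypothetical JSON inputs file of our own invention:

```python
import json
import os
import tempfile

# Hypothetical inputs file: the rates live outside the model code.
inputs = {"as_of": "2024-12-31", "discount_rates": {"1y": 0.045, "5y": 0.041}}
path = os.path.join(tempfile.mkdtemp(), "rates_2024q4.json")
with open(path, "w") as f:
    json.dump(inputs, f)

# The model logic only knows about a clean ingest path, not the numbers:
def present_value(cashflow: float, tenor: str, inputs_path: str) -> float:
    with open(inputs_path) as f:
        rates = json.load(f)["discount_rates"]
    years = int(tenor.rstrip("y"))
    return cashflow / (1 + rates[tenor]) ** years

print(round(present_value(100.0, "5y", path), 2))
```

Rerunning last year's scenario then means pointing the same logic at last year's file, not guessing which tab or variable once held the original rates.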

4.4.3 Data Is Never “Done”

Data handling is not a one-time hurdle to clear and move on from. Markets move, data feeds change, formats drift, and the vendor you relied on for reference data gets acquired and changes their API. A model that worked perfectly last quarter can silently break this quarter because an upstream system started encoding dates differently.

Build routines to check for “data drift”—changes in the statistical properties of your inputs that might indicate a problem. Have a plan for periodic validations and refreshes. Some practical tips:

  • Maintain a simple data log or data dictionary—even if informal—so others can trace what each field means and where it came from.
  • Automate the boring parts: validation scripts, input checks, and sanity tests pay off a hundredfold.
  • Version your datasets, just as you do your code. Nothing is more frustrating than trying to reproduce a result only to discover “the input file changed.” See Section 12.5.4.
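A drift check need not be elaborate. The sketch below compares this period's input statistics against a stored baseline; the 25% tolerance is chosen purely for illustration:

```python
import statistics

# Baseline statistics recorded when the model was last validated (invented):
baseline = {"mean": 100.0, "stdev": 15.0}

def drift_flags(values, baseline, tol=0.25):
    """Flag inputs whose mean or stdev moved more than `tol` from baseline."""
    flags = []
    m, s = statistics.mean(values), statistics.stdev(values)
    if abs(m - baseline["mean"]) / baseline["mean"] > tol:
        flags.append(f"mean shifted: {m:.1f} vs baseline {baseline['mean']:.1f}")
    if abs(s - baseline["stdev"]) / baseline["stdev"] > tol:
        flags.append(f"stdev shifted: {s:.1f} vs baseline {baseline['stdev']:.1f}")
    return flags

# The mean looks fine, but the dispersion has collapsed -- perhaps an
# upstream feed started repeating values:
print(drift_flags([99.0, 101.0, 100.5, 98.5], baseline))
```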

Data is unruly, idiosyncratic, and absolutely central to every model’s fate. Treat it as a first-class concern, not an afterthought, and your models will be far sturdier for it. As a methodical guide, Table 4.4 lists key steps to follow when bringing data into the model.

Table 4.4: Typical Steps in the Data-to-Model Process.

  • Data Collection: identify sources; acquire data (e.g., APIs, databases, scraping). Purpose: ensures data is relevant, reliable, and timely.
  • Data Exploration & Understanding: summary statistics; visualization; data profiling. Purpose: uncovers initial insights, errors, distributions, and relationships.
  • Data Cleaning: handle missing values; detect/treat outliers; transform and format data. Purpose: improves data quality, reduces noise and bias.
  • Data Preprocessing: scale/normalize features; encode categorical variables; augment (if needed) with other datasets. Purpose: prepares data to fit the format and requirements of the model.
  • Feature Engineering: select important features; create new features (e.g., ratios, aggregates). Purpose: enhances or creates variables that improve model performance.
  • Data Splitting: divide into training, testing, and (optionally) validation sets; apply cross-validation or static/dynamic validations. Purpose: prevents overfitting and enables robust performance assessment.
  • Data Storage & Management: store in databases/data lakes; maintain version control. Purpose: supports reproducibility, scalability, and reliable access.
  • Ethical Considerations: evaluate bias and fairness; ensure privacy and regulatory compliance. Purpose: avoids perpetuating bias and protects sensitive information.
  • Continuous Monitoring & Updating: monitor model/data performance; detect data drift; retrain/update as needed. Purpose: maintains accuracy and relevance as data and conditions change.

4.5 Model Management

4.5.1 Risk Governance

Risk governance is about preventing costly mistakes before they happen. The 2008 financial crisis, the London Whale incident, and countless smaller disasters have demonstrated what happens when models aren’t properly overseen. Organizations that take this seriously typically have written policies delineating responsibilities: management or board-level committees set high-level objectives, while operational teams handle day-to-day processes.

A model inventory—a catalog of all models in use—sits at the heart of any governance framework. Each entry should detail the model’s purpose, its key assumptions, and its current status (prototype, production, deprecated). Without this inventory, organizations don’t actually know what models they’re running or what their cumulative exposure to model error might be. “We have a model for that somewhere” is not a governance strategy.

Many firms adopt tiered risk classifications to decide how much scrutiny a model warrants. A simple calculator that helps an analyst estimate bond duration doesn’t need the same validation rigor as an enterprise valuation engine that determines capital requirements. Classification schemes might range from “low impact” to “mission-critical,” with validation and testing requirements scaling accordingly.
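One way to sketch such a tiered inventory entry in code (the tiers and validation rules here are invented for illustration, not a standard):

```python
from dataclasses import dataclass

@dataclass
class ModelRecord:
    """One entry in a hypothetical model inventory."""
    name: str
    purpose: str
    status: str      # "prototype" | "production" | "deprecated"
    risk_tier: int   # 1 = mission-critical ... 3 = low impact

# Validation requirements scale with the assigned tier:
VALIDATION_BY_TIER = {
    1: ["independent validation", "annual backtest", "benchmarking",
        "sensitivity analysis"],
    2: ["peer review", "annual backtest"],
    3: ["peer review"],
}

m = ModelRecord("CapitalVal", "enterprise valuation for capital requirements",
                "production", risk_tier=1)
print(VALIDATION_BY_TIER[m.risk_tier])
```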

For highly critical models, this means extensive backtesting, benchmarking against alternative approaches, and sensitivity analyses that get escalated to senior management. It also means ongoing monitoring—not just initial validation—with scheduled reports about model health. The goal is to create a culture where potential failures get surfaced early and openly, rather than hidden away until a crisis forces them into the light.

4.5.2 Change Management

No model remains static for long. Assumptions evolve, new asset classes appear, regulations change, and software libraries update (sometimes breaking things in the process). A firm’s change management process should standardize how modifications are proposed, evaluated, and documented.

A central repository or version control system is essential. Whenever the model or its associated data structures shift, the changes and their justifications should be recorded. This makes it possible to track lineage (“when did we start using this assumption?”) and revert to a prior version if an update proves problematic. Later in this book, we’ll introduce version control systems and workflows that make this practical for code-based models.

Equally important is assessing ripple effects. Simplifying a routine or adjusting a discount rate assumption may seem minor in isolation, but can have broad implications when that routine or assumption is used across multiple components. “I just tweaked this one function” can cascade into unexpected changes in reports that stakeholders rely on. Up-front impact assessments help determine which historical results need recalculating and whether stakeholder communication or training is needed before deployment.

We’ll describe package and model version numbering schemes in Chapter 23. The basic idea is that version numbers communicate what kind of change occurred: a patch fix that shouldn’t affect results, a minor enhancement that adds functionality, or a major change that might break compatibility with prior versions. This convention, combined with good release notes, lets users understand what they’re getting when they upgrade.
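As a preview of that convention, a semantic-version-style comparison might look like the sketch below (the classification messages are illustrative):

```python
# Illustrative MAJOR.MINOR.PATCH convention: only a MAJOR bump signals
# possible breaking changes to results or compatibility.
def classify_upgrade(old: str, new: str) -> str:
    o, n = (tuple(map(int, v.split("."))) for v in (old, new))
    if n[0] > o[0]:
        return "major: compatibility with prior results may break; revalidate"
    if n[1] > o[1]:
        return "minor: added functionality; prior results unaffected"
    return "patch: fix only; results should match"

print(classify_upgrade("2.3.1", "2.3.2"))  # classified as a patch release
print(classify_upgrade("2.3.2", "3.0.0"))  # classified as a major release
```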

Communication around changes should be systematic. Concise notes on new features, potential risks, and recommended practices should reach both internal users and (where relevant) regulators. Well-handled change management enables innovation without sacrificing reliability.

4.5.3 Data Controls

Sound data controls matter because flawed inputs will undermine even the sturdiest model architecture. “The numbers looked fine” is not a defense when an audit reveals that a key data feed was stale for three months.

Most organizations define data quality standards addressing accuracy, completeness, and timeliness. These standards help detect common pitfalls: inconsistent formatting (did the vendor switch from MM/DD/YYYY to DD/MM/YYYY?), delayed updates (are we using yesterday’s prices or last week’s?), and incorrect mappings (does this security identifier actually correspond to the security we think it does?). Automated checks at ingestion points catch many issues before they propagate—out-of-range values that might indicate corruption, suspicious spikes that suggest input error, or missing records that break downstream calculations.
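A timeliness gate at an ingestion point can be as small as the sketch below; the one-day threshold is an assumption that would depend on the feed:

```python
from datetime import date

def is_stale(feed_as_of: date, run_date: date, max_age_days: int = 1) -> bool:
    """True if the feed is older than the allowed age (threshold illustrative)."""
    return (run_date - feed_as_of).days > max_age_days

print(is_stale(date(2024, 3, 1), date(2024, 3, 4)))  # True: three-day-old feed
print(is_stale(date(2024, 3, 4), date(2024, 3, 4)))  # False: same-day feed
```

Refusing stale inputs loudly at the door beats discovering months later that "the numbers looked fine" because nobody checked their date.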

Security and access protocols add another layer of protection. Role-based permissions minimize the risk of data tampering, accidental deletions, or unauthorized access to confidential information. Not everyone who needs to view model outputs needs write access to the underlying data.

Data versioning applies to financial datasets just as much as it applies to code. Keeping a record of each dataset’s evolution allows managers and auditors to pinpoint when and how anomalies first appeared. If someone asks “what inputs produced last quarter’s results?” you should be able to answer definitively, not approximately.
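One lightweight way to make that answer definitive is to fingerprint each input dataset and log the hash alongside every model run. This sketch hashes a canonical JSON serialization with SHA-256; the record contents are invented:

```python
import hashlib
import json

def dataset_fingerprint(records) -> str:
    """Deterministic short fingerprint of a dataset's exact contents."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

q3_prices = [{"cusip": "037833100", "price": 171.25}]
print(dataset_fingerprint(q3_prices))  # log this next to the model run id
```

Any change to the data, however small, produces a different fingerprint, so "which inputs produced last quarter's results?" reduces to comparing two logged strings.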

Where regulations like GDPR come into play, data controls must also reflect requirements about personal information, consent, and retention periods. Coordinating these efforts under a unified data governance approach ensures that model outputs stand on a solid factual foundation—and that you can demonstrate this to auditors and regulators when asked.

4.5.4 Peer and Technical Review

Even the most experienced modelers benefit from additional eyes on their work. Peer review and technical review are essential for quality assurance. We all have blind spots. A peer reviewer who wasn’t involved in building the model will ask questions that never occurred to the original developer—not because the developer was careless, but because familiarity breeds assumptions.

Peer review can be informal (“hey, can you look at this before I send it?”) or systematically mandated (formal sign-off required before production deployment). Some organizations require independent reviewers who have not contributed to the original model; smaller teams may rely on a rotating schedule of internal experts. The key is cultivating a culture where questioning assumptions is welcomed rather than resented. “Why did you use this discount rate?” should be a normal question, not an accusation.

Technical review goes deeper, focusing on verification of the computations themselves. For complex spreadsheets, this might mean walking through formulas cell by cell. For code, it might mean reviewing logic, running test scenarios, and confirming that edge cases are handled. The goal is to verify that the model actually does what it’s supposed to do—not just that the output looks reasonable.

This process should generate documentation: who performed the review, what methods they used, which issues surfaced, and how they were resolved. If challenges are identified, revisions loop back into the change management system. The documentation then serves as an audit trail, demonstrating that due diligence was performed.

Conceptual soundness also merits review. Does the model align with economic theory? Are the assumptions consistent with domain-specific knowledge? A model can be technically correct but conceptually flawed—using the wrong framework for the problem at hand. Catching this requires reviewers who understand the business context, not just the code.

Peer and technical review, conducted seriously, reinforce consistent quality and catch errors before they reach production. Conducted perfunctorily, they’re just bureaucratic overhead. The difference lies in organizational culture.

4.6 Conclusion

We’ve covered a lot of ground in this chapter: what makes models good (or bad), what distinguishes skilled practitioners, how data flows into models, and how organizations govern and maintain their modeling efforts. These aren’t separate concerns—they’re all interrelated. A well-architected model is easier to validate. Clear documentation makes governance feasible. Curiosity leads to better data practices.

If there’s one theme running through all of this, it’s that financial modeling is fundamentally a human activity. The code and the math matter, but so do the judgment calls, the communication, and the institutional practices that surround them. A model that’s technically correct but incomprehensible is not much better than one with bugs. A model that’s well-documented but built on bad data is still dangerous.

The rest of this book will introduce the technical foundations—the programming concepts, the numerical methods, the domain-specific libraries—that make sophisticated financial modeling possible. But those tools only become useful in the hands of someone who thinks carefully about what they’re modeling and why. That’s the craft we’re trying to develop.


  1. The idea of “model theory” is adapted from Peter Naur’s 1985 essay, “Programming as Theory Building”.