Accuracy over Precision in Forecasting

Here’s a targeted interview question to test a job candidate’s financial modeling skills:

“Can you build a model even when you don’t have all the relevant data?”

The answer should be Yes. You don’t need all the relevant data, as long as you have identified what type of information you need, and the minimum information that you need to complete your analysis. So, it depends whether the information that’s available is sufficient to achieve your objective, and whether you’re comfortable with the level of accuracy of your model. In fact, you never want to have all the relevant data in your model. A model is just that – a simplified representation of a specific part of reality. It should be as simple as possible, and have the minimum level of information, so that calculations are fast and you can quickly analyse different scenarios and make changes to your model. You should avoid making it more precise than necessary, as precision rarely correlates with accuracy because a forecast is based on assumptions layered upon assumptions.

To understand how you can build a model on incomplete data, I have used 2 examples below:

Example 1 – Loan Analysis

You may have five similar loans as below, and you want to model the loan operation such as the interest expense you pay each month. However, you do not have information about each individual loan balance, interest rate and term. So, you cannot complete your analysis on the individual loan level.

Individual Loans

Loan Principal Balance Rate Term
1 1,000,000 4.0% 360
2 35,000 14.0% 240
3 500,000 5.5% 360
4 400,000 7.0% 280
5 750,000 5.0% 180

However, if information on the aggregate pool of the five loans was available, such as the sum of all the balances, the weighted average rate, and the weighted average terms, then you might still have enough information to complete your analysis. If you just want the total financial impact of all five loans, and the loans are similar, then you can simply create a fictitious loan that is representative of the five actual loans in the pool.

Representative Loan

Sum Rate Avg Term
2,685,000 5.1% 296

If later on, the missing detailed information becomes available, you can still keep your existing model without any structural changes. You can RETROFIT the detailed loan-level model into your representative loan model by calculating the aggregate values from the individual loans, instead of using estimated values for the fictitious loan.

Example 2 – Billing & Collections

Let’s say you need to forecast your trade debtor balance. You don’t have any information besides historical balances, so you can simply use a moving average and assume the balance always reverts to that long-term average. In any given month, your model is expected to be accurate within $12m (based on Feb-15 which had the largest variance). Are you comfortable with this level of accuracy? If you are, then you’re done with building your model. There’s no need to add complexity that has no impact to your actions or end-objective.

If, however, a moving average assumption above is inadequate, then you’ll need to further breakdown your debtor balance into its different drivers. In the model below, we’ve chosen revenue as the driver, and the debtor balance is a percentage of your revenue. So, instead of averaging the balance, you’re now averaging the revenue and the percentage.

We can further breakdown revenue into the average number of users in the month, multiplied by the average revenue per user.

What if later on, you learn about drivers in addition to revenue? For example, you learn that the debtor balance is actually driven by cash receipts and billing adjustments during the month, which proves to be a more accurate driver than the percentage of revenue? Do you now need to maintain two models, one based on percentage of revenue, and the other based on the monthly movements? Do you have to choose one model over the other?

No. You can still maintain one model without losing any information. If you’re on a tight deadline and can’t decide whether to keep your original model, then create a separate model which outputs the Debtor % of Revenue and use that as an input to your original model. So, if you update your new model, your original model will be automatically updated as well. Later on, if you’re comfortable with your new model, you can delete your original model, and just use the Debtor % of Revenue output as a reasonableness check.

Leave a Comment