Imagine the most delicious meal you could ever eat. For starters, it’s Kalamata olive tapenade on garlic ciabatta or mussels roasted in lemon juice and coriander. Then it’s rack of lamb on a bed of puréed sweet potatoes or pan-fried Aberdeen Angus steak or saffron rice with marinated lobster and a fizzy cola-bottle garnish. Yes, sweets are delicious but they’re totally out of place on a dinner plate. This is not an ad for a grocery shop. This is cutting-edge statistical modelling.
The easy availability of data makes it tempting to add just one more variable to your model. You never know, it might tick your r-squared up another percent or so. When people had to toil in the fields to grow some potatoes, carrots, and onions, they made a perfectly adequate stew. With the befuddling scale of supermarketing, people were no longer satisfied with simplicity and had to add one more sprig or a pinch more gummy-bear essence. It’s part of the reason why statistical models like economic forecasts leave such a bad taste in your mouth.
Too many data spoil the broth
Statistical modelling is about using historical data to predict a future outcome. When you have a small but robust dataset, you can predict the future with a similarly small degree of certainty. Adding more data can make a difference to the explanatory power of your model but each additional variable is subject to another economic principle: diminishing marginal utility. When you buy a new games console, you get a high return of gaming pleasure. A second console has some new games so you get a bit more fun, but there are still only so many gaming hours in a day so you get a good deal less than twice the benefit. Were you to buy a third, it would end up providing an even smaller gain in real-time multi-player kicks. The same thing happens when you throw more and more variables into a model, until it’s saturated and nothing can make it better.
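For anyone who wants to taste this for themselves, here is a rough sketch in Python with NumPy. Everything in it is made up for illustration: the simulated data, the sample size of 60, the three "real" ingredients with shrinking weights, the 20 junk variables, and the helper names r_squared and adjusted_r_squared. It fits an ordinary least-squares model again and again, adding one variable each time, and prints the plain R-squared next to the adjusted version that charges you for every extra ingredient.

```python
import numpy as np


def r_squared(X, y):
    """In-sample R-squared of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def adjusted_r_squared(r2, n, k):
    """R-squared penalised for using k predictors on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical, simulated data: none of these numbers come from a real dataset.
rng = np.random.default_rng(0)
n, n_real, n_junk = 60, 3, 20
X = rng.normal(size=(n, n_real + n_junk))

# Only the first three columns actually influence the outcome;
# the remaining twenty are pure noise.
y = X[:, :n_real] @ np.array([2.0, 1.0, 0.5]) + rng.normal(size=n)

# Refit the model with the first k columns, for k = 1, 2, ..., 23.
results = []
for k in range(1, n_real + n_junk + 1):
    r2 = r_squared(X[:, :k], y)
    results.append((k, r2, adjusted_r_squared(r2, n, k)))

for k, r2, adj in results:
    print(f"{k:2d} variables: R2 = {r2:.3f}, adjusted R2 = {adj:.3f}")
```

In-sample R-squared can never go down when you add a variable to a least-squares model like this one, which is exactly why it is so tempting to keep seasoning. The adjusted figure is one common (and far from perfect) way of checking whether the latest pinch was worth it.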
Economic forecasts have an unfortunate history of over-estimating growth and under-estimating recession to the point that most forecasters completely failed to predict the 2008 crash. In the event that a forecast is shown to be wrong, the forecaster can just blame extraneous variables or unreliable data. They’re basically saying that if everything had happened exactly as they assumed it would then the model would have been perfectly fine. What they’ll also do is try to think of some other variables they could have added to the model, variables with increasingly tenuous links to the outcome that become increasingly difficult to explain.
That’s the thing about models. You kind of have to explain them, figuring out how every variable contributes to the outcome. In the case of economics, the variables in question are as whimsical as the kind of celebrity chef who prides himself on being unpredictable. People just don’t do what you expect them to do and they’re subject to all kinds of irrationality and changeability. Even when you think you’ve managed to give a plausible explanation of your model, somehow justifying how both anxiety and confidence are significant positive predictors of the outcome, people will just go and react before your prophecy has had a chance to come true.
Economic forecasting also risks creating the future it’s trying to predict, a bit like the opinion polls post. For example, when headlines blare “Economy set to grow by 5.1% next year” people might be more inclined to start looking forward to some benefit and to start spending some money now. All that extra economic activity might just make the economy grow. Contrariwise, the words “triple-dip” and “recession” are enough to make anyone fear for the future of their payslips and think twice about splashing out on that car/postcard/bottle of washing-up liquid/new book on the philosophy of aesthetics they’d really like to buy. This reluctance to spend is itself a recipe for recession.
We’re left with a couple of problems. While modelling is a valuable tool, models saturated to the point of uselessness are an increasing problem as more data become available and are haphazardly thrown into the pot. The underlying attitude is a perfect combination of desperation to avoid failure and what appear to be simple solutions. On top of all that, people won’t just stand still while you’re measuring them and then proceed in a straight line: they’re annoyingly unpredictable, like finding a humbug in your soup. If there’s a solution it probably involves being a bit less experimental when cooking up a model in the first place.