How Marketing Mix Modelling Works
(and when it doesn't)
The name for our analysis work depends on who you ask. In advertising, 'marketing mix modelling', 'econometrics', 'media mix modelling' and 'MMM' all mean the same thing: a statistical model that measures how many sales you get from your ads. Whatever you choose to call them, these models are a key tool for advertising effectiveness measurement.
If you're going to apply marketing mix models then you need to understand how they work. You don't need to understand the maths because most of the time even marketing mix modelling practitioners aren't actually doing the maths. We did it once to prove to a university that we could, but day-to-day that's what computers are for. You don't need to understand the maths but you do need to have a feel for how the models work, how they generate their measures and what they can and can't do.
Especially what they can't do.
Keep reading and we'll explore just that. We're going to create a small econometric model to understand how the technique works, and then we're going to use it to see how and why marketing mix modelling can struggle to answer certain questions.
We're not going to use a made-up brand and some invented sales data for our model though, we're going to use something else...
Welcome to the Lake District
At the north end of Ullswater in the Lake District lies Pooley Bridge.
There is a water level measurement gauge located on the steam boat jetty. Just about here...
Part 1:
How marketing mix modelling works
The depth of Ullswater at Pooley Bridge varies between around 1m and 3m, and if you squint at this chart of its levels since 2021, doesn't it look a bit like the sales line for a brand that's running lots of promotions?
We're going to create a model to explain what's changing the depth of the water, in exactly the same way as an econometrician would build a model to explain what's driving the sales of a brand.
A brand's sales are driven by the four P's - product, price, place and promotion. Water in a lake is driven by just one P - precipitation. For this model, rain is our advertising. More rain is more water in the lake is more sales.
You'll see in a moment that even though there's only one driver in our model, getting it to work properly is actually surprisingly tricky.
When our model works properly, it's able to explain why the water level has changed in the past. The blue line is the actual depth of the water and the red line is a sketch of roughly what we hope our model's shape might look like. If movements in the red model line mostly fit the shapes in the blue actual line (and you didn't cheat) then you've got a decent model that explains what's happened in the past. Some statisticians reading this are recoiling in horror but let's just say that 'and you didn't cheat' can get quite complicated and move on.
Before we get to measuring rainfall, what if we tried something simpler?
Water levels in the lake are generally higher in winter and lower in summer. What happens if we make a model and don't tell it anything about rainfall at all? We only give the model some basic seasonality data that says sometimes it's winter (October to March) and sometimes it's not winter and then we'll see how well it can explain the water's depth.
In technical language this is called a dummy variable. Our seasonality dummy variable has a value of 1 when it's winter and 0 all of the rest of the time.
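If it helps to see it in code, here's a minimal sketch of that winter-only model. Everything in it is illustrative - the variable names are made up and the depth series is invented purely so the example runs end to end; a real model would use the actual gauge readings.

```python
# A minimal sketch of the winter-only dummy variable model (invented data).
import numpy as np
import pandas as pd
import statsmodels.api as sm

dates = pd.date_range("2021-01-01", "2023-12-31", freq="D")

# Dummy variable: 1 from October to March, 0 the rest of the time
winter = dates.month.isin([10, 11, 12, 1, 2, 3]).astype(int)

# Invented water depth: a summer base level, a winter lift and some noise
rng = np.random.default_rng(0)
depth_m = 1.6 + 0.6 * winter + rng.normal(0, 0.2, len(dates))

# Fit: depth = base level + (winter lift) * winter dummy
dummy_model = sm.OLS(depth_m, sm.add_constant(winter)).fit()
print(dummy_model.params)  # roughly [1.6, 0.6]: the summer level and the winter lift
```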
It works! Sort of. The red line is the model's understanding of the water level and we can see that it's higher during winter. Ideally you want the red model line to track all of the changes in the blue actual line but this very limited model has only captured some basic shapes.
The model hasn't explained any of those big spikes, only that the water level is generally higher in winter and lower in summer.
It's important to think about what we just did because we made a choice. We modelled that the water level is higher in winter 'just because'. Not because it's caused by anything that we've explained, it's higher in winter just because it is.
This fairly silly model is even potentially useful in a limited way. If you were an alien who knew nothing about lakes, now you'd know - on average - how much more water they have in them during winter.
Analysts make choices like this all the time and that's fine. It's part of the job, but it also illustrates that an econometric model isn't at all a standardised thing.
Who you commission to build your models is extremely important. One analyst might measure that winter seasonality is causing the water depth to rise (and, strictly speaking, they wouldn't be wrong to say so), but another analyst - a better, more diligent analyst - would take the next step and use rainfall data instead of seasonality to make a much more useful model, like we will now.
Of course we already know that rainfall will end up in the lake and if we grab the data and chart it then that's exactly what we see.
The depth of the water spikes upwards during periods of rain.
We can use this rainfall data to build a better model. We're still trying to understand changes in the daily water depth, but now instead of a seasonality variable to explain why it's changing, we give the model daily rainfall data.
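As a sketch, that step might look something like this in code. Both series are invented, and the made-up depth deliberately responds to the last fortnight of rain, which is exactly the sort of behaviour that trips up a same-day-only model.

```python
# A sketch of the first rainfall model: regress depth on same-day rain only.
# Both series are invented; the depth responds to the last fortnight of rain,
# which is exactly why a same-day-only model struggles.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
rain_mm = pd.Series(rng.gamma(shape=0.4, scale=10.0, size=1096))
depth_m = (1.2 + 0.004 * rain_mm.rolling(14, min_periods=1).sum()
           + rng.normal(0, 0.05, 1096))

raw_model = sm.OLS(depth_m, sm.add_constant(rain_mm)).fit()
print(round(raw_model.rsquared, 2))  # a weak fit: the spikes and slow declines are missed
```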
Is our new model any good?
Well... It's different. You can see it's trying to track the actual water depth but it's not working properly yet.
Marketing mix models compare the changes in one variable with the changes in another. If they're 'positively correlated', meaning they rise and fall together, or they consistently oppose each other - one always goes up when the other goes down - then the model will use them.
So what happened to our model? It looks a bit weird.
The model has tried to fit the spikes in water depth but hasn't done a very good job of it and its baseline is much too high. When the water level is lower in summer, the model doesn't understand that at all.
It's obvious to us that the depth of water rises when it rains so why does our new model not really work?
It's tough to see the problem when you're looking at three whole years of daily data. We need to focus on one of the problem periods.
The issue is with how the water depth reacts to rainfall. When rain starts, the lake takes some time to reach peak depth but even more importantly, the depth doesn't immediately go back to normal when the rain stops. There's a delay.
Measuring advertising has exactly this problem. You can't simply put impressions or spend into a model of sales because the sales don't happen precisely when the advertising happens. Sales from advertising can take a bit of time to get going and then when you switch off the ads, sales might stay higher for a while before returning to normal.
In fact, if you're running a really successful brand campaign, sales might not return to their previous level at all. Thoughts like that aren't helpful at the moment though; we'll come back later to why long term effects like that can be a big problem for econometrics.
For now, we need to do something to those rainfall numbers to make them look like the shapes that they create in the water depth. I said earlier that building an econometric model doesn't use much maths day-to-day and it doesn't - the experience is much more like doing a jigsaw puzzle. We're always looking for the next shape that we need and at the moment our rain data is much more spiky than our lake depth data.
We want a shape that builds up gradually when it rains and then takes some time to die back down again.
We do that with a decay rate. We make a new variable (an input) for our model, where the value of 'rainfall' isn't just that day's rainfall, it's that day's rain plus some carried-over amount from the past. It might help to think about our new variable as a leaky bucket - you put water in and it slowly leaks out again, so it's always got today's rain in it but also some of yesterday's and the days before that...
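Here's a small sketch of that leaky bucket, using an invented fortnight of rain and an assumed decay rate of 0.8, just to show the shape the transform produces.

```python
# A sketch of the leaky bucket: each day keeps a fraction of yesterday's bucket
# (the decay rate) and adds today's rain. The rain and the 0.8 are invented.
import numpy as np

def leaky_bucket(rain, decay):
    bucket = 0.0
    filled = []
    for r in rain:
        bucket = decay * bucket + r   # keep a share of yesterday, add today's rain
        filled.append(bucket)
    return np.array(filled)

rain_mm = np.array([0, 0, 12, 30, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=float)
print(np.round(leaky_bucket(rain_mm, decay=0.8), 1))
# The transformed series jumps when it rains, then drains away gradually afterwards
```

A decay rate of 0.8 means the bucket keeps 80% of yesterday's water; a bigger hole in the bucket means a smaller decay rate.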
Your next question might be to ask how we know the amount of rain to retain from the past. What decay rate do we use?
Our analyst is making choices again. Guided by their experience and by the behaviour of the model they're creating, they look for a value that fits the data. Going back to our leaky bucket analogy, they try buckets with big holes and buckets with little holes and they see what fits.
When we're modelling water in a lake this is fairly easy because only rain is affecting the depth of the water and the impact of rain is big, so you can quickly see whether you've got it right or not.
When we're building a marketing mix model it's not easy at all. There's loads going on - competitor activity, price changes, promotions and multiple advertising channels, plus the uplifts from advertising are usually smaller than the spikes from promotional discount offers and so are harder to see. Analysts lean a great deal on their experience, on benchmarks and on continually iterating and improving their model to get to their best estimate of the truth. Again, who builds your model is extremely important because statistics alone cannot do these things for you.
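To give a flavour of that search, here's a sketch that tries buckets with different-sized holes and checks which one fits the depth line best. Everything is invented, with a 'true' decay rate baked in so there's something to find.

```python
# A sketch of trying buckets with different-sized holes. The series are invented,
# with a 'true' decay rate of 0.9 baked in so there is something to find.
import numpy as np
import statsmodels.api as sm

def leaky_bucket(rain, decay):
    bucket, filled = 0.0, []
    for r in rain:
        bucket = decay * bucket + r
        filled.append(bucket)
    return np.array(filled)

rng = np.random.default_rng(2)
rain_mm = rng.gamma(shape=0.4, scale=10.0, size=1096)
depth_m = 1.0 + 0.002 * leaky_bucket(rain_mm, 0.9) + rng.normal(0, 0.03, 1096)

fit_by_decay = {}
for decay in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
    transformed = leaky_bucket(rain_mm, decay)
    fit_by_decay[decay] = sm.OLS(depth_m, sm.add_constant(transformed)).fit().rsquared

print(max(fit_by_decay, key=fit_by_decay.get))  # should recover the 0.9 we baked in
```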
It turns out there isn't one decay rate that will make the model work. We need two of them.
The shape that rain creates is an immediate spike in the depth of the water and a slower rise and fall that endures over a longer period of time.
We often see a similar shape when we look at brand search data while TV adverts are running. On the day that you have a big TV spot, searches are much higher but there is also a more subtle, sustained rise in searches throughout the length of the campaign and extending on after that campaign has ended.
Once we give the model these two different shapes to work with, suddenly it can explain the depth of the water.
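A sketch of that two-bucket version might look like this: a fast leak for the immediate spike and a slow leak for the longer rise and fall, both handed to the model at once. As before, the rates and the data are invented.

```python
# A sketch of the two-decay-rate model: a fast bucket for the immediate spike and
# a slow bucket for the longer rise and fall. All rates and values are invented.
import numpy as np
import statsmodels.api as sm

def leaky_bucket(rain, decay):
    bucket, filled = 0.0, []
    for r in rain:
        bucket = decay * bucket + r
        filled.append(bucket)
    return np.array(filled)

rng = np.random.default_rng(3)
rain_mm = rng.gamma(shape=0.4, scale=10.0, size=1096)
depth_m = (1.0
           + 0.0030 * leaky_bucket(rain_mm, 0.50)   # quick spike that drains fast
           + 0.0015 * leaky_bucket(rain_mm, 0.97)   # slow build-up and slow decline
           + rng.normal(0, 0.01, 1096))

# Hand the model both shapes and let it decide how much weight each deserves
X = sm.add_constant(np.column_stack([leaky_bucket(rain_mm, 0.50),
                                     leaky_bucket(rain_mm, 0.97)]))
two_bucket_model = sm.OLS(depth_m, X).fit()
print(round(two_bucket_model.rsquared, 2))  # a very good fit on this invented data
```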
It's not perfect - if you look carefully then you'll be able to spot times when the red model and blue actual lines don't match - but it's pretty good. We could stop here and be happy that we'd built a reasonable picture of how rainfall affects the depth of Ullswater.
Why does it sometimes not fit? There are all sorts of effects we might not have captured. Temperature is probably important because rain and snow behave differently, and we're using local rainfall data but maybe sometimes it's only raining further upstream and that water still ends up in the lake. Models are never perfect; we work on them until they're capable of the job that we need them to do.
"Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful"
George Box
It's worth pausing here to think about what econometric models can do and how hard we had to work to make this one fit. We were only trying to explain how water gets into a lake but we still needed to capture two dynamic effects - a faster one and a slower one - and the model still doesn't fit all of the time. Water flowing into a lake is much simpler than acquiring customers in a competitive marketplace.
There are limits to what econometric models can do, to the levels of detail that you can coax out of them and to the insights that they are capable of providing. They're tremendously useful things but it pays to be sceptical; if it sounds too good to be true then it probably is.
Part 2:
When marketing mix modelling doesn't work
Now that we have our model, what do we do with it? We'll want to ask some 'what if?' questions and see what might have happened under different scenarios.
What would be the result of an extended period of rain? We can invent some new rainfall data, put it back into our model and ask for a prediction.
Let's pretend that July 2023 was a complete wash-out. We'll invent our own rainfall data that says it rained every day for the entire month and ruined everybody's summer holidays. What would have happened to Ullswater?
As you'd expect, the model predicts that the water level would have been much higher. Marketing mix modelling is good at questions like this, where we play with scenarios that don't extend too far outside the scope of the data that's been used to create the models in the first place.
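In code, a 'what if?' run is just the fitted model fed with scenario inputs instead of the real ones. Here's an illustrative sketch: fit a toy version of the model, overwrite July 2023's rainfall with a daily soaking, and ask for a new prediction. All names and values are invented.

```python
# A sketch of a 'what if?' run: fit a toy model, then feed it an invented
# wet-July scenario and ask for a new prediction. Everything here is illustrative.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def leaky_bucket(rain, decay):
    bucket, filled = 0.0, []
    for r in rain:
        bucket = decay * bucket + r
        filled.append(bucket)
    return np.array(filled)

dates = pd.date_range("2021-01-01", "2023-12-31", freq="D")
rng = np.random.default_rng(4)
rain_mm = rng.gamma(shape=0.4, scale=10.0, size=len(dates))
depth_m = 1.0 + 0.002 * leaky_bucket(rain_mm, 0.95) + rng.normal(0, 0.02, len(dates))

def design(rain):
    return sm.add_constant(leaky_bucket(rain, 0.95))

model = sm.OLS(depth_m, design(rain_mm)).fit()

# Scenario: a complete wash-out - 20mm of rain every day through July 2023
july_2023 = (dates.month == 7) & (dates.year == 2023)
scenario_rain = rain_mm.copy()
scenario_rain[july_2023] = 20.0
scenario_depth = model.predict(design(scenario_rain))

print(round(depth_m[july_2023].mean(), 2), round(scenario_depth[july_2023].mean(), 2))
# The scenario prediction for July sits well above what actually happened
```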
In advertising, we can play with campaign budgets and different combinations of media channels to get predictions for what would happen to sales. For example, we could produce a prediction for what might happen to July's sales if we run bigger promotions and more advertising than last year. As long as we don't go too far outside of the campaign budgets that the model has seen before, we should get sensible predictions.
People tend to try to use marketing mix models in two ways. We've just seen the first one - running 'what if?' scenarios that predict what would be likely to happen with a change in the model's inputs. The models are pretty good at doing this.
The second way that people try to use marketing mix models is to justify an advertising budget. Often we talk about return on investment (ROI) and hope that our models will measure a positive number because that would mean advertising is adding more to sales than it costs to run.
How do we measure ROI? We tell the model to turn off the ads and ask it to predict what would happen. Using our Ullswater model, it's the direct opposite of the exercise we just did. Instead of turning up the rain, we turn it off and ask for a new prediction.
What would have happened if July 2023 had no rain at all?
Our model can generate a new prediction that the water level would drop throughout July, hitting a level below 1 metre before it started to rain again in August. There would still be a lower water level for quite some time afterwards, due to the decay rates we saw earlier. A lack of rain now will have an impact into the future.
When we're modelling advertising rather than rain, this is great news. We can measure the benefit of advertising now and measure the benefits that it creates into the future.
If we were only trying to predict what happens for the next few months, we'd be in a great place. We've got a model that explains the past and we can use it to make predictions about what will happen if we change its inputs. That's very useful.
But we want to measure ROI. All of the benefits created by advertising. And that's a problem.
What would happen to Ullswater if it stopped raining today and never, ever rained again?
Think about a drought that lasts forever.
It would take a while but I think the lake would dry up completely.
Unfortunately, our model doesn't agree...
If you turn off the rain forever, starting from January 2022, then our model thinks this happens...
Our model says that if it never rains again, the water level will drop to 0.75m and stay there, never dropping any lower. It can't offer any clue about where that 0.75m of water is coming from, and it's not capable of forecasting that the level could ever drop below it.
Depending on the question we're asking, that isn't necessarily a problem. Want to know how much rain will cause a flood? It's good at that. Want to know how low the lake will drop in a (realistic) drought? It's good at that too.
Want to know the full ROI of rain? Now you're in trouble.
The model doesn't understand that eventually the lake would be empty because it was only measuring changes in the data that we gave to it. Ullswater hasn't come close to drying up completely in our data, so the model will never predict that it could dry up completely, even when it's obvious to us that we've turned off the only input.
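You can see why from the structure of the model itself. Roughly speaking it says the depth is an intercept plus a coefficient times the bucket, so once the bucket has drained there's nothing left for the rain term to contribute and the prediction settles at the intercept, never lower. A tiny sketch with illustrative numbers:

```python
# Why the forecast floors out: the model is (roughly) depth = intercept plus
# coefficient * bucket. Turn the rain off forever and the bucket drains to zero,
# so the prediction settles at the intercept. The numbers here are illustrative.
intercept, coefficient, decay = 0.75, 0.002, 0.95
bucket = 300.0                       # whatever happens to be in the bucket today

for day in range(365 * 2):           # two years with the rain switched off
    bucket = decay * bucket + 0.0
predicted_depth = intercept + coefficient * bucket

print(round(predicted_depth, 2))     # 0.75 - the model's floor, however long the drought
```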
Hold onto that thought and return to advertising. What happens if you turn off all of the ads and leave them turned off? If those ads were doing a long term support job for your sales base - if you're an established brand - then marketing mix modelling would underestimate, potentially dramatically underestimate, the drop in sales that this would cause. It would get the immediate answer right and tell you the sales level for the next few months, but these types of models can't predict a slow, long term decline.
In your marketing mix model, you have a very useful tool to understand what is likely to happen if you make modest changes but if you believe that advertising has very long term effects then you do not have a full measure of advertising's benefits. You have a clever, useful model but you have not measured advertising's true ROI.
Conclusion:
What do we do now?
If you've just found out that marketing mix modelling can't actually measure advertising's full ROI then don't panic. Econometricians know that already and most will freely admit it. It's not a secret.
Nothing else can measure full ROI either.
The most useful thing you can do with that knowledge is to avoid the ROI measurement trap.
Anyone who's watched marketing mix modelling projects being debriefed to finance directors will have seen the ROI measurement trap. You fall into it when you say that you've commissioned MMM to measure advertising's true ROI, present the number, and the FD calculates that it's not big enough to be profitable.
Now you have to explain that your project doesn't really measure true ROI; it only measures the short to medium term part. There might also be a long term part that we can't measure. Only we're not sure how big the long term part might be. Your budget for next year is looking shaky...
It's a trap. Don't fall into it.
Use econometrics for what it's good at. Play with different campaign scenarios and choose the best one, predict sales for the next quarter, simulate price changes. But don't expect it to capture the full value of advertising because it can't do that.
What can you do instead to understand the long term?
Let's answer that question with a question. How did you know that Ullswater would dry up if it never rained again? The model didn't understand that but you did.
You knew because you understand at least the basics of how lakes work and where the water comes from. You got that from intuition and you got it from experience - Ullswater hasn't dried up but other lakes have.
We can do the same for our sales scenarios. We need benchmarks and examples from brands that did stop advertising. We need large cross-industry studies because not all of the answers are in your own data. And we need a mental model of how we believe advertising actually works. One that we can't get from a marketing mix model.
If you'd like to read more (perhaps after a break and a cup of tea), I've collaborated with two colleagues to produce a guide to integrating marketing mix models with other techniques.
My company, Sequence Analytics, exists to weave together different advertising measurement tools and techniques because no one tool can do everything on its own. Not even marketing mix modelling.