RelayMag
Explainer/opinion No. 83

What Incrementality Testing Is, and Why It Beats Attribution

RelayMag · June 2026 · 5 min read
Key takeaways

Attribution tells a marketer which channels showed up before a sale. Incrementality testing tells them which channels caused the sale. The difference sounds academic until a team turns off a channel that every attribution model loved and watches revenue refuse to drop. That gap, between credit taken and demand actually created, is the entire reason incrementality testing exists, and it is the question attribution was never built to answer.

The question attribution cannot answer

Every attribution model, from last-touch to the fanciest data-driven version, works from the same raw material, which is a record of touches that happened along the path to a conversion. It then divides credit among them by some rule. What that record cannot contain is the counterfactual, meaning what would have happened if a given touch had never occurred. A buyer who was always going to purchase still clicks the retargeting ad on the way to checkout, and the log dutifully records the click as part of the winning path.

This is why a channel can look essential in the dashboard and be nearly worthless in reality. Branded search is the classic example. People searching a brand name are usually already convinced, so the ad mostly intercepts demand other channels built, yet it sits at the end of conversion paths collecting credit. Attribution sees the correlation and reports a star performer. Only an experiment can reveal that pausing it barely moves sales.

How incrementality testing works

Incrementality borrows the logic of a clinical trial. To learn whether a treatment works, you need a control group that does not receive it, and then you compare outcomes between the two groups. Marketing incrementality does the same thing by deliberately withholding marketing from part of the market and measuring the difference.

If the exposed group converts at 10% and the held-out group converts at 8% under otherwise similar conditions, the marketing produced two percentage points of incremental lift. The 8% would have converted regardless, so crediting the campaign for all 10 points would overstate its effect fivefold. Attribution would happily have credited the full 10.
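The arithmetic in that example can be written down in a few lines. The sketch below is illustrative only: the function and field names are hypothetical, and it assumes the two groups are otherwise comparable, as the example does.

```python
from dataclasses import dataclass


@dataclass
class LiftResult:
    absolute_lift: float      # exposed rate minus holdout rate, as a fraction
    relative_lift: float      # absolute lift divided by the holdout baseline
    incremental_share: float  # share of exposed-group conversions the marketing caused


def incremental_lift(exposed_rate: float, holdout_rate: float) -> LiftResult:
    """Compare conversion rates of an exposed group and a held-out control group."""
    if not (0 < holdout_rate <= 1 and 0 < exposed_rate <= 1):
        raise ValueError("conversion rates must be in (0, 1]")
    absolute = exposed_rate - holdout_rate
    return LiftResult(
        absolute_lift=absolute,
        relative_lift=absolute / holdout_rate,
        incremental_share=absolute / exposed_rate,
    )


# The figures from the example: 10% exposed, 8% held out.
result = incremental_lift(exposed_rate=0.10, holdout_rate=0.08)
print(f"absolute lift:     {result.absolute_lift:.1%}")
print(f"relative lift:     {result.relative_lift:.0%}")
print(f"incremental share: {result.incremental_share:.0%}")
```

With these inputs the absolute lift is two points, the relative lift over the baseline is 25%, and only a fifth of the exposed group's conversions are incremental. That last figure is the one a credit-splitting model has no way to see.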

The two main designs

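Two approaches dominate, and the right one depends on how the marketing can be controlled: user-level holdouts, which withhold ads from a randomly chosen set of individuals, and geo tests, which withhold them from entire regions.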

Geo testing has matured into the more trusted of the two for media that resists user-level control. The challenge is that no two regions are identical, so a naive before-and-after comparison can mislead. The modern answer is synthetic control. Meta's open-source GeoLift library, built on synthetic control methods, constructs a weighted blend of untreated regions that closely mirrors how the test regions behaved before the campaign started. Once the synthetic control tracks the test markets in the pre-period, any divergence after the campaign launches is a credible estimate of lift. The method is open and reproducible, which is more than most attribution models can claim.
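To make the synthetic control idea concrete, here is a deliberately simplified sketch in plain Python. It is not GeoLift's API or its estimator. The region data is invented for illustration, the function names are hypothetical, and the weights are fitted with a basic exponentiated-gradient loop chosen only because it keeps them non-negative and summing to one without extra dependencies. The test region is built as a known blend of the donors, with 10 units of lift per week added after launch, so the recovered numbers can be checked.

```python
import math


def fit_weights(test_pre, donors_pre, steps=20000, eta=0.4):
    """Fit non-negative donor weights that sum to 1 and track the test region pre-launch.

    Minimizes mean squared pre-period error with exponentiated gradient descent.
    The step size assumes series rescaled to roughly 1, which is done below.
    """
    n, k = len(test_pre), len(donors_pre)
    scale = sum(test_pre) / n  # a common rescaling leaves the best weights unchanged
    y = [v / scale for v in test_pre]
    x = [[v / scale for v in donor] for donor in donors_pre]
    w = [1.0 / k] * k  # start from an equal blend
    for _ in range(steps):
        resid = [sum(w[j] * x[j][t] for j in range(k)) - y[t] for t in range(n)]
        grad = [2.0 / n * sum(r * xv for r, xv in zip(resid, x[j])) for j in range(k)]
        w = [wj * math.exp(-eta * g) for wj, g in zip(w, grad)]
        total = sum(w)
        w = [wj / total for wj in w]  # renormalize so the weights sum to 1
    return w


def blend(weights, donors):
    """Weighted combination of donor series, one value per period."""
    periods = len(donors[0])
    return [sum(w * d[t] for w, d in zip(weights, donors)) for t in range(periods)]


# Hypothetical weekly sales for three untreated regions (8 pre-launch weeks, 4 post).
donors_pre = [
    [100, 104, 108, 112, 116, 120, 124, 128],
    [130, 126, 122, 118, 114, 110, 106, 102],
    [90, 110, 90, 110, 90, 110, 90, 110],
]
donors_post = [
    [132, 136, 140, 144],
    [98, 94, 90, 86],
    [90, 110, 90, 110],
]

# Invented test region: a 50/30/20 blend of the donors, plus 10 units per week after launch.
true_weights = [0.5, 0.3, 0.2]
test_pre = blend(true_weights, donors_pre)
test_post = [v + 10 for v in blend(true_weights, donors_post)]

weights = fit_weights(test_pre, donors_pre)
counterfactual = blend(weights, donors_post)  # what the test region would have done untreated
total_lift = sum(actual - synth for actual, synth in zip(test_post, counterfactual))

print("fitted weights:", [round(w, 3) for w in weights])
print("estimated lift over 4 weeks:", round(total_lift, 1))
```

Real regions are never an exact blend of their neighbours, so in practice the pre-period fit is imperfect, and that residual error sets the floor on how small a lift the test can detect.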

Why it beats attribution

The advantage is not that incrementality is more precise or more convenient. It is frequently neither. The advantage is that it measures the right thing. Attribution answers a descriptive question, which is how credit should be divided among the touches that were observed. Incrementality answers a causal one, which is how many conversions a channel produced that would not have existed otherwise. Only the second question maps onto the decision a marketer is actually making when they choose whether to keep funding a channel.

It also sidesteps the data problems that have been eroding attribution from underneath. Apple's App Tracking Transparency, live since iOS 14.5 in 2021, cut off a large share of user-level signal, and walled-garden platforms report their own conversions without sharing the underlying paths. Incrementality testing does not need to reconstruct an individual's journey across devices and platforms. It needs aggregate outcomes for a treated group and a control group, which survive the loss of granular tracking far better than path-stitching does. The fragmentation that quietly degrades attribution leaves a well-designed lift test largely intact.

The costs, honestly

This is not a free upgrade, and pretending otherwise is how teams get burned. A holdout means deliberately not marketing to part of the addressable market, which is real forgone revenue during the test. Experiments take time, often weeks, to gather enough signal to separate lift from noise, so they cannot answer questions at the speed a daily dashboard can. They require enough volume for the result to clear statistical significance, which rules out very small programs. And they measure the channel or campaign under test, not every line item at once, so a team cannot run one experiment and walk away with a clean number for everything.
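The volume requirement can be estimated before a test starts. The sketch below uses the standard normal-approximation formula for comparing two proportions, with a two-sided 5% significance level and 80% power as assumed defaults. The function name is hypothetical, and a real test plan would also account for how groups are assigned and how long conversions take to arrive.

```python
import math
from statistics import NormalDist


def sample_size_per_group(control_rate, exposed_rate, alpha=0.05, power=0.80):
    """Approximate users needed in EACH group to detect the gap between two conversion rates."""
    if control_rate == exposed_rate:
        raise ValueError("rates must differ; a zero lift cannot be detected")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired chance of detecting a real lift
    variance = control_rate * (1 - control_rate) + exposed_rate * (1 - exposed_rate)
    effect = exposed_rate - control_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# A two-point lift on an 8% baseline versus a half-point lift on the same baseline.
n_large_lift = sample_size_per_group(0.08, 0.10)
n_small_lift = sample_size_per_group(0.08, 0.085)
print("per-group size for 8% vs 10%: ", n_large_lift)
print("per-group size for 8% vs 8.5%:", n_small_lift)
```

Under these assumptions the two-point lift needs a little over 3,000 users per group, while the half-point lift needs tens of thousands. Required volume grows with the inverse square of the lift being chased, which is why small programs struggle to get a readable result.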

There is also the discipline problem. A lift test will sometimes report that a beloved channel produced almost no incremental sales, and acting on that result means cutting spend that an attribution report would have defended. The hard part is rarely the statistics. It is being willing to believe the experiment over the dashboard everyone is used to.

How the two fit together

The sharper way to run measurement is not to abandon attribution but to demote it. Attribution is fine as a fast, cheap, directional read on which channels appear in conversion paths, useful for day-to-day monitoring and spotting changes. Incrementality is the slower, more expensive instrument reserved for the decisions that actually move budget, where being wrong is costly. When a channel looks like a hero in attribution, the right next move is not to fund it harder. It is to hold it out and find out whether the heroics are real.

RelayMag is an independent publication on marketing, search, and how companies get found.