Measurement · Jun 3, 2026 · 3 min read

Measuring CRM uplift with a control group

If you cannot say what your CRM would have earned without the campaign, you cannot say the campaign worked. A control group is the only honest answer - here is how it works.

Every CRM team can show you a dashboard. Opens, clicks, revenue attributed to the last campaign, a chart that goes up. What almost none of them can show you is the one number that actually matters: how much of that revenue would have arrived anyway, with no campaign at all.

Without that number, "the campaign drove revenue" is a claim, not a measurement.

Attribution is not uplift

The trap is attribution. A player receives a win-back email, then deposits, so the deposit is credited to the email. But many of those players were going to deposit regardless. Attribution counts them all; uplift counts only the ones the campaign actually changed. The gap between the two is where marketing budgets quietly die.
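To make the gap concrete, here is a minimal sketch in Python. The function name and every figure in it are hypothetical, chosen only to illustrate the arithmetic. It assumes you already have a randomly held-out group to estimate the "would have deposited anyway" rate.

```python
def attributed_vs_incremental(treated_n, treated_depositors,
                              control_n, control_depositors):
    """Contrast attribution with control-based uplift for one campaign.

    All inputs are plain counts. The control group is assumed to be a
    random holdout from the same population as the treated group.
    """
    # Depositors we would expect among treated players with no campaign,
    # scaled from the deposit rate observed in the control group.
    expected_anyway = control_depositors * treated_n / control_n
    return {
        "attributed": treated_depositors,  # attribution credits every depositor
        "expected_anyway": expected_anyway,
        "incremental": treated_depositors - expected_anyway,
    }


# Hypothetical figures, for illustration only.
result = attributed_vs_incremental(
    treated_n=10_000, treated_depositors=1_200,
    control_n=2_000, control_depositors=200,
)
print(result)
```

With these made-up figures, attribution claims 1,200 depositors, while the control group suggests about 1,000 of them would have deposited anyway. That leaves roughly 200 the campaign plausibly changed.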

You cannot close that gap by looking harder at the same data. You have to withhold.

The control group is the whole trick

A control group is a randomly held-out set of players who receive no model-driven action. They are your true baseline - what happens with no intervention. Compare the treated players against them, and the difference is uplift. Real, causal, defensible.
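The two steps in that paragraph, random holdout and comparison, can be sketched as follows. This is an illustrative sketch, not Retivo's implementation: the function names, the 10% holdout share, and the fixed seed are all assumptions.

```python
import random


def assign_holdout(player_ids, holdout_share=0.1, seed=42):
    """Randomly split players into (control, treated) sets.

    holdout_share and seed are illustrative defaults, not recommendations.
    """
    rng = random.Random(seed)      # seeded so the split is reproducible
    shuffled = list(player_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * holdout_share)
    return set(shuffled[:cut]), set(shuffled[cut:])


def uplift(treated_outcomes, control_outcomes):
    """Difference in mean outcome between the treated and control groups."""
    treated_mean = sum(treated_outcomes) / len(treated_outcomes)
    control_mean = sum(control_outcomes) / len(control_outcomes)
    return treated_mean - control_mean


control, treated = assign_holdout(range(1000))
print(len(control), len(treated))
```

The split has to happen before the campaign goes out, and it has to be random. If someone hand-picks who is held out, the two groups differ for reasons other than the campaign, and the difference stops being causal. On real data the raw difference also needs a confidence interval before anyone acts on it.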

Retivo runs this as a three-arm test on every deployment:

Control. A held-out group that receives no CRM action at all, manual or model-driven. This is the honest baseline.
Manual. Your current CRM and team, running exactly as they do today.
Retivo. A randomly assigned group from the same population, driven by the model.

With all three running on the same population at the same time, you can read two gaps at once: how much the model beats doing nothing, and how much it beats your current manual effort. If it does not beat manual, that shows up plainly - and you should know that before you scale it.
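Reading those gaps is simple arithmetic once each arm has an outcome per player. The sketch below assumes a hypothetical dict of per-player revenue keyed by arm name. The keys, the function name, and the numbers are illustrative, not a Retivo API.

```python
from statistics import mean


def three_arm_readout(revenue_by_arm):
    """Return the gaps between arms from per-player revenue lists.

    revenue_by_arm is assumed to have the keys "control", "manual" and
    "model", each mapping to a list with one revenue figure per player.
    """
    control = mean(revenue_by_arm["control"])
    manual = mean(revenue_by_arm["manual"])
    model = mean(revenue_by_arm["model"])
    return {
        "model_vs_control": model - control,    # does the model beat doing nothing?
        "manual_vs_control": manual - control,  # what is today's CRM worth?
        "model_vs_manual": model - manual,      # is switching worth it?
    }


# Hypothetical per-player revenue, for illustration only.
readout = three_arm_readout({
    "control": [10, 0, 20, 10],
    "manual": [15, 5, 20, 20],
    "model": [20, 10, 30, 20],
})
print(readout)
```

A negative model_vs_manual is the "does not beat manual" case from the paragraph above. A real readout would use far larger groups and report uncertainty alongside each gap.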

Why three arms instead of two

A simple A/B against control tells you the model beat nothing. Useful, but not the decision you are making. The decision is whether to switch from what you do now to something new, so you need the manual arm in the same test. The three-arm design answers the real question - is this better than my status quo - not the easy one.

The discipline this forces

Running a control group is uncomfortable, because it can tell you your campaign did not work. That discomfort is the point. A team that measures with a control group cannot fool itself with attribution, and cannot sell you a lift figure it did not earn.

That is the standard we hold ourselves to. We publish no invented accuracy or uplift number on this site, because the only honest number is the one measured on your players, against a control group. When we say we measure our own uplift, this is what we mean.

If your current reporting cannot separate what the campaign caused from what would have happened anyway, that is the first thing worth fixing - before any new model, before any new tool.

Every Retivo deployment includes three-arm control-group measurement. The number you get is measured on your own players.
