How to Use Causal Targeting to Save Money on Promotions
Omri Perez, PhD - Head of Machine Learning / AI @ Haus
February 1, 2023
In the “Are promotions growing your business or losing you money?” blog post, we discussed why promotions, despite their ubiquity and potential power, are hard to execute well. We then described four kinds of customers in terms of how they respond to promotions, and found that only one of them is worth offering promotions to: the customers who will have a favorable outcome if and only if they are treated (i.e. Compliers).
In this post, we discuss the emerging paradigm of Causal Targeting and how we can use it to target specifically these customers, so that promotions are executed in an informed manner that benefits the business.
Before we describe what Causal Targeting is, I’d like to get a few acronyms out of the way.
When running a standard A/B test, we aim to estimate the Average Treatment Effect (ATE), which is essentially the difference between the average outcomes of the treatment and control groups. For example, we could estimate whether a specific global promotion (treatment) performs better than no promotion (control), and by how much. However, to do Causal Targeting we need to estimate the effect of the treatment on each individual, which is aptly called the Individualized Treatment Effect (ITE) [1].
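As a minimal sketch of the ATE described above, the snippet below takes the difference between the average outcomes of the two test groups. The outcome lists and the `estimate_ate` helper are hypothetical and made up for illustration; a real analysis would also report a confidence interval.

```python
from statistics import mean

# Hypothetical A/B test outcomes: 1 = customer repurchased, 0 = did not.
control_outcomes = [0, 1, 0, 0, 1, 0, 0, 1]    # no promotion
treatment_outcomes = [1, 1, 0, 1, 1, 0, 1, 1]  # received the promotion


def estimate_ate(treated, control):
    """Difference between the average outcomes of the two test groups."""
    return mean(treated) - mean(control)


ate = estimate_ate(treatment_outcomes, control_outcomes)
print(f"Estimated ATE: {ate:+.3f}")
```

A single number like this describes the whole population, which is why it cannot tell us which individual customers to target.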
Creating an ITE estimator takes two steps:
- Run an A/B test to compare the treatment options we are considering (e.g. control vs. a specific promotion).
- Train a customer-level ITE model [2] on the outcomes of that test, the attributes of each customer, and their test group assignment (control vs. treatment). The resulting model lets us estimate the treatment lift (or uplift) for each individual.
With an ITE estimator in hand, identifying the compliers only requires looking for customers whose ITE estimate is larger than zero, meaning that if we were to treat them, the outcome variable (such as repurchase or revenue) would likely increase. Pretty neat, right?
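The two steps and the "larger than zero" rule can be sketched as follows. This is a deliberately simplified stand-in for an uplift model, not a production approach: it assumes a single categorical attribute (a hypothetical `segment`) and estimates uplift per segment as the treated mean minus the control mean. The data, names, and helper function are all made up for illustration, and the sketch assumes every segment appears in both test groups.

```python
from collections import defaultdict

# Hypothetical A/B test log: (customer segment, was_treated, repurchased).
# A real uplift model would use many attributes and an ML learner;
# one categorical attribute keeps the sketch small.
test_log = [
    ("new", True, 1), ("new", True, 1), ("new", True, 0), ("new", True, 1),
    ("new", False, 0), ("new", False, 0), ("new", False, 1), ("new", False, 0),
    ("loyal", True, 1), ("loyal", True, 1), ("loyal", True, 1), ("loyal", True, 0),
    ("loyal", False, 1), ("loyal", False, 1), ("loyal", False, 1), ("loyal", False, 0),
    ("lapsed", True, 0), ("lapsed", True, 0), ("lapsed", True, 1), ("lapsed", True, 0),
    ("lapsed", False, 1), ("lapsed", False, 0), ("lapsed", False, 1), ("lapsed", False, 0),
]


def fit_segment_uplift(log):
    """Estimate uplift per segment: treated mean minus control mean."""
    # For each segment, keep [sum of outcomes, count] per test group.
    totals = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, outcome in log:
        totals[segment][treated][0] += outcome
        totals[segment][treated][1] += 1
    uplift = {}
    for segment, groups in totals.items():
        treated_sum, treated_n = groups[True]
        control_sum, control_n = groups[False]
        uplift[segment] = treated_sum / treated_n - control_sum / control_n
    return uplift


uplift_by_segment = fit_segment_uplift(test_log)

# Score customers who were not part of the test and keep those with positive uplift.
customers = {"c1": "new", "c2": "loyal", "c3": "lapsed"}
targeted = [cid for cid, seg in customers.items() if uplift_by_segment[seg] > 0]
print(uplift_by_segment, targeted)
```

In this toy data, only the "new" segment shows a positive estimate, so only customer `c1` would be targeted. The "loyal" segment repurchases at the same rate either way, and the "lapsed" segment responds negatively.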
Causal targeting setups
We can target customers by their ITE estimates with increasing degrees of sophistication:
1. Choosing which customers to target with a single promotion: This is identical to what we just discussed and lets us identify which customers are likely to respond positively to the treatment. Customers who are predicted not to respond, or to respond negatively, are not targeted.
2. Choosing which promo, out of multiple possible promos, to target which customer: Similar to #1, but here we compare multiple treatments at the same time and decide, for each individual customer, which treatment is best. As before, customers who are not predicted to respond positively to any of the treatments are not targeted. For example, assign treatment B to customer 1, treatment A to customer 2, and no treatment to customer 3.
3. Choosing which promo, out of multiple possible promos, to target which customer, while keeping known costs within budget: Similar to #2, but here we also consider the known costs of running the promotion and strive to target only a subset of customers that a) will respond positively, and b) will respect the allotted budget. Known costs can include an impression cost, i.e. the cost of getting an impression from the targeted customer (e.g. through direct mail or online advertising), and a triggered cost, which is the cost of the promotion itself and is only incurred when the customer uses the promo (e.g. a $10 discount). We can take these costs into account and assign treatments to customers in a manner that complies with a predefined budget.
4. Choosing which promo, out of multiple possible promos, to target which customer, while keeping variable costs within budget: This is the most sophisticated variant, and publications suggest that it is already in use in the industry. For example, Booking.com uses it for add-on promotions, and Uber uses it for customer attrition prevention. It is similar to #3, in that we find the best treatment (or no treatment) per customer while respecting a predetermined budget. The difference is that in this case the cost of the promotion is not fully known in advance. For example, if the promotion is a % discount, we don’t know the true $ cost until the customer uses the promotion at the time of transaction. To handle this we need two different ITE estimates per customer: the increased likelihood of taking the desired action after the treatment (e.g. repurchase), and the additional revenue (or cost). Because some customers will yield more additional revenue than the cost incurred, we can actually enforce a budget of zero and treat only a subset of customers that will increase the desired behavior at no additional aggregated cost. Of course, if we would like to treat a larger set of customers, we can assign a predetermined budget. This can be desirable, for example, in an attrition prevention setting.
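The second and third setups can be sketched as below. Everything here is an illustrative assumption rather than a reference implementation: the uplift numbers, the promo names, the costs, and the flat `redemption_rate` used to turn a triggered cost into an expected cost are all made up. The budget step is a simple greedy heuristic (most uplift per expected dollar first), which is not guaranteed to be optimal.

```python
# Hypothetical per-customer uplift estimates (incremental revenue in $) per promo.
uplift = {
    "c1": {"promo_a": 4.0, "promo_b": 9.0},
    "c2": {"promo_a": 6.0, "promo_b": 5.0},
    "c3": {"promo_a": -1.0, "promo_b": -2.0},
    "c4": {"promo_a": 2.0, "promo_b": 1.0},
}
# Hypothetical known costs: the impression cost is always paid,
# the triggered cost is paid only if the customer redeems the promo.
promo_costs = {
    "promo_a": {"impression": 0.5, "triggered": 5.0},
    "promo_b": {"impression": 0.5, "triggered": 10.0},
}
redemption_rate = 0.2  # assumed chance that a targeted customer redeems


def expected_cost(promo):
    """Impression cost plus the triggered cost weighted by the redemption rate."""
    cost = promo_costs[promo]
    return cost["impression"] + redemption_rate * cost["triggered"]


def best_promo_per_customer(uplift):
    """Setup 2: highest-uplift promo per customer, or nothing if none is positive."""
    choices = {}
    for customer, estimates in uplift.items():
        promo = max(estimates, key=estimates.get)
        if estimates[promo] > 0:
            choices[customer] = promo
    return choices


def assign_within_budget(uplift, budget):
    """Setup 3: greedily keep assignments with the most uplift per expected dollar."""
    choices = best_promo_per_customer(uplift)
    ranked = sorted(
        choices.items(),
        key=lambda item: uplift[item[0]][item[1]] / expected_cost(item[1]),
        reverse=True,
    )
    assignment, spent = {}, 0.0
    for customer, promo in ranked:
        cost = expected_cost(promo)
        if spent + cost <= budget:  # skip anything that would exceed the budget
            assignment[customer] = promo
            spent += cost
    return assignment, spent


assignment, spent = assign_within_budget(uplift, budget=4.0)
print(assignment, spent)
```

With a budget of $4, this toy example keeps `c2` on promo A and `c1` on promo B, drops `c4` because it no longer fits the budget, and never targets `c3`, whose estimates are negative for both promos.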
As discussed above, Causal Targeting is an emerging field that enables companies to target the right customers with the right treatments in a way that reduces negative reactions (e.g. from defiers), doesn’t waste money on customers who are likely to convert anyway (always-takers), and keeps spending within a controlled budget, which can even be set to zero in some cases.
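To make the zero-budget idea from the fourth setup concrete, here is a rough sketch. It assumes two hypothetical estimates per customer: `repurchase_uplift` (the change in repurchase probability) and `net_revenue` (additional revenue minus promotion cost, negative when the promotion costs more than it brings in). The greedy selection rule is an assumption made for illustration and is not the method published by Booking.com or Uber.

```python
# Hypothetical pair of ITE estimates per customer.
estimates = {
    "c1": {"repurchase_uplift": 0.10, "net_revenue": 3.0},
    "c2": {"repurchase_uplift": 0.20, "net_revenue": -2.0},
    "c3": {"repurchase_uplift": 0.05, "net_revenue": -4.0},
    "c4": {"repurchase_uplift": -0.10, "net_revenue": 1.0},
    "c5": {"repurchase_uplift": 0.15, "net_revenue": 0.5},
}


def select_customers(estimates, budget=0.0):
    """Treat customers with positive uplift while aggregated net cost stays within budget."""
    # Only customers whose desired behavior is expected to increase are candidates.
    candidates = {c: e for c, e in estimates.items() if e["repurchase_uplift"] > 0}

    # Customers who pay for themselves are always treated; their surplus funds others.
    treated = [c for c, e in candidates.items() if e["net_revenue"] >= 0]
    net_cost = -sum(candidates[c]["net_revenue"] for c in treated)

    # Remaining candidates cost money: rank by uplift gained per dollar of net cost.
    costly = sorted(
        (c for c in candidates if candidates[c]["net_revenue"] < 0),
        key=lambda c: candidates[c]["repurchase_uplift"] / -candidates[c]["net_revenue"],
        reverse=True,
    )
    for c in costly:
        cost = -candidates[c]["net_revenue"]
        if net_cost + cost <= budget:
            treated.append(c)
            net_cost += cost
    return treated, net_cost


treated, net_cost = select_customers(estimates, budget=0.0)
print(treated, net_cost)
```

In this toy data, the surplus from `c1` and `c5` pays for treating `c2`, so three customers are treated at a negative aggregated net cost. Raising the budget to $3 would also bring in `c3`.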
Promotions are a tool that brands need for continued growth, now more than ever, due to the dramatic impact recent privacy changes have had on advertising and attribution. However, promotions are not a cure-all. You can't just throw them at the wall and hope they'll stick. You'll usually end up with more questions than answers.
Just as Haus can bring clarity to advertising, we can also help brands deploy promotions with precision, measurability, and transparency.
[1] ITE is also known as the Conditional Average Treatment Effect (CATE).
[2] This is done with Uplift Modeling, a recent paradigm that trains standard machine learning models on experimental data to estimate ITE. This contrasts with the usual use of machine learning, which is simply to predict some outcome of interest (e.g. propensity modeling).