Pega's approach for true next best action outbound
|Description||This design pattern explains how clients can apply Pega's Customer Decision Hub for Outbound marketing in a practical and cost efficient way.|
|Version as of||8.4|
|Application||Pega Customer Decision Hub|
|Capability/Industry Area||Outbound Marketing|
Author: Shoel Perelman
Date: November, 2021
Pega Customer Decision Hub (CDH) is recognized in the industry as being #1 in the Forrester Wave for Real-Time Interaction Management (RTIM), Q4 2020. While some think of RTIM as primarily serving Inbound decisioning use cases (such as which action to present on a Mobile Banking Portal), several of Pega’s clients, such as Commonwealth Bank of Australia and Royal Bank of Scotland, have employed the same technology to achieve unified Inbound and Outbound with a single brain, using a single set of Actions. Rob Walker and Matt Nolan’s excellent whitepaper, Crossing the Chasm, explains the philosophy and business benefit that comes from embracing next best action in practice for all customer interactions. It should be thought of as prerequisite reading for this paper.
Within the CDH product, Pega’s approach to using next best action for Outbound is embodied by CDH’s out-of-the-box Strategy Framework and User Interface called Next-Best-Action Designer. NBA Designer provides an alternative to traditional Segment-driven Campaigns that accomplishes large-scale Outbound communications while staying true to the principles of next best action.
How does the next best action driven Outbound work?
Campaigns are a well-accepted concept for managing units of work in marketing organizations. They encapsulate something to say to customers (or prospects) and to whom it should be said. They also provide a vernacular.
It is common for clients to say things like: “I have 500 Campaigns that I run every month.” What this usually means is that each of these Campaigns consists of a Segment (a database query that finds the people to target) and one or more Offers, each with one or more Treatments attached (to send to those people). “I run them every month” usually means that each of these queries runs as a job that selects customers and their data using a WHERE clause and outputs the qualifying records, and that the job is scheduled to execute one or more times over the course of the month.
In the next best action world, we speak often of Actions, which can be Offers, Service Suggestions, Nudges, or any other word that means “something to reach out to a customer about” – with the intention of stimulating them to do something.
Each Action has an Engagement Policy which defines who the Actions may be sent to. The Boolean logic in Engagement Policies plays the role of the Segment in a Campaign. Instead of a marketing operations professional having to create a new Campaign consisting of an Offer and a Segment to fulfill a request from the marketer, the next-best-action specialist creates an Action with an Engagement Policy (Eligibility, Applicability, and Suitability rules), a Contact policy (so it is sent once and only once) and one or more channel-specific Outbound treatments (such as an email or SMS treatment). There’s no need to add this offer to an applicable campaign, or schedule a specific job to run because there’s already a master “always on” job scheduled and running which will automatically “pick up” and evaluate the new Action.
So what makes the Actions get “sent”? An Outbound schedule.
To configure Outbound communications using NBA Designer, an administrator clicks on the Channels tab and configures the master Outbound schedule. The primary (master) Outbound schedule is a perennially recurring job that loops through ALL customers and, for each one, executes an out-of-the-box strategy shipped with the product that evaluates what the next best action should be.
For many of the customers on a given day, there might not be any next best actions because there’s nothing worthwhile to communicate, or the Contact policy prevents saying anything (for example, a Contact policy that limits outbound communications to once per week per customer). For a subset of the customer base, however, an Action might make its way all the way through the Engagement Policy (the customer is Eligible for it, it's Applicable to the customer, it's Suitable for them, and it doesn’t violate the Contact policy) and there will be an email or SMS treatment associated with the Action.
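The filtering described above can be pictured with a minimal Python sketch. It is not the shipped strategy: the `Action` class, the rule callables, and the `outbound_sent_this_week` profile field are hypothetical names introduced only for illustration, and the once-per-week limit is the example Contact policy mentioned in the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Customer = Dict[str, Any]
Rule = Callable[[Customer], bool]


@dataclass
class Action:
    name: str
    eligibility: Rule      # may the customer receive this Action at all?
    applicability: Rule    # is it relevant to the customer right now?
    suitability: Rule      # is it appropriate for the customer?
    has_outbound_treatment: bool = True  # an email or SMS treatment exists


def passes_contact_policy(customer: Customer, max_per_week: int = 1) -> bool:
    # Hypothetical Contact policy: cap outbound messages per customer per week.
    return customer.get("outbound_sent_this_week", 0) < max_per_week


def candidate_actions(customer: Customer, actions: List[Action]) -> List[Action]:
    # If the Contact policy blocks the customer, nothing is sent today.
    if not passes_contact_policy(customer):
        return []
    return [
        a for a in actions
        if a.has_outbound_treatment
        and a.eligibility(customer)
        and a.applicability(customer)
        and a.suitability(customer)
    ]
```

For most customers on a given day this function would return an empty list, which is the "nothing worthwhile to communicate" case.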
That gets us to a final set of Actions that *could* be sent to the customer we are evaluating, but which one will actually win (that is, be ranked high enough in priority to be sent)? This is where we ask our AI to predict the likelihood of that customer to accept each of these Actions if offered (the customer’s Propensity for that Action), based upon everything we know about that customer at that moment (we will discuss this part more later). We then balance that Propensity against the Value to the enterprise of the Action (the Value) and apply any configured weightings to "tip the scale" (we call these Levers). The surviving Actions that come out on top win the chance to be presented to the customer.
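One way to picture this arbitration step is as a score per Action. The sketch below assumes the "balance" is a simple product of Propensity, Value, and a Lever weight; that formula, along with the `Candidate` class and function names, is an illustrative assumption rather than a description of the shipped strategy.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    action: str
    propensity: float   # predicted likelihood of acceptance, between 0 and 1
    value: float        # value of the Action to the enterprise
    lever: float = 1.0  # business weighting used to "tip the scale"


def priority(c: Candidate) -> float:
    # Assumed scoring rule: multiply the three factors together.
    return c.propensity * c.value * c.lever


def arbitrate(candidates: List[Candidate], top_n: int = 1) -> List[Candidate]:
    # Rank the surviving Actions and keep the top N winners.
    return sorted(candidates, key=priority, reverse=True)[:top_n]
```

Calling `arbitrate(candidates, top_n=3)` corresponds to the case where several winners are assembled into a single message.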
It's worth noting that there doesn’t have to be only a single "winner." An email could communicate the top 3 Actions by assembling them into a single email body.
These emails will be queued up for delivery by an ESP (Email Service Provider – we will talk about this later).
That sounds too expensive to be practical. Is it feasible?
Compared to simply running an SQL query against a database to filter for customers to send an email to, the next-best-action approach does entail more work per customer. Fortunately, the compounding effect of Moore’s law has made evaluating next best actions for every customer feasible at scale, especially when considering the benefit.
Any discussion about costs must start with: who is paying the bill? When CDH is run on Pega Cloud, clients pay by the number of customers, not by the number of CPUs, so Pega bears the operational costs. Therefore, the answer to the question of cost is baked into the price of using the service (which a Pega Account Executive would be happy to quote).
Pega is unique in the RTIM industry because we offer the same CDH product code delivered either by using Cloud Choice or by our Pega Cloud. Cloud Choice means clients pay for and operate the cloud infrastructure that CDH runs on. So, in this scenario, a discussion about costs is relevant. This document is not meant to be a formal sizing guide nor to produce accurate total costs for operating a CDH system (it doesn’t cover, for example, labor, storage, or multiple environments for testing). It is meant to illustrate conceptually how the costs, specifically for calculating next best actions for millions of customers every day, can be calculated – and that these costs are reasonable.
Every organization has different IT costs, depending on which public cloud provider they’re using, the savings plans they have negotiated, or even their internal cloud operational costs. As a reasonable proxy for a "fair" cost, we’ll use Amazon Web Services (AWS) EC2 pricing. The rationale is that if an organization’s IT costs were much higher than AWS, there would be long-term pressure to either use AWS or bring costs down to the same ballpark as AWS.
Performance tests of the NBA Designer Strategy Framework have benchmarked a typical next best action calculation to take 150 milliseconds, assuming all data is stored within the CDH infrastructure (Cassandra and a “local” RDBMS, such as Postgres or Oracle). This includes:
- Fetching a customer’s profile.
- Whittling down approximately 500 available actions to approximately 15 (by using engagement and contact policies).
- Asking our Adaptive Model to calculate the propensity for the customer using the latest customer profile data.
- Executing our arbitration strategy to pick the best actions remaining.
With a customer base of 10 million customers, it would take 150 ms * 10 million customers / 1000 ms per second / 60 seconds per minute / 60 minutes per hour = 416 hours = 17 days.
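The same arithmetic as a small Python sketch, using the 150 ms benchmark and the 10 million customer base from the text (the function name is illustrative):

```python
def single_thread_duration(customers: int, ms_per_eval: float = 150.0):
    """Return (hours, days) to evaluate every customer on one thread."""
    seconds = customers * ms_per_eval / 1000
    hours = seconds / 3600
    days = hours / 24
    return hours, days


hours, days = single_thread_duration(10_000_000)
print(f"{hours:.1f} hours, {days:.1f} days")
```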
Even with the most restrictive contact policy, it’s unlikely that any client would decide to only communicate with its customers once every 17 days. But wait – that is a naïve calculation before applying parallelism. Over the past few years, the CDH decisioning technology has been honed to be highly scalable across both threads within a Node (we call this Vertical scalability) and across Nodes in a cluster (we call this Horizontal scalability).
A performance rule of thumb for Vertical scalability (parallel threads within a Node) that we’ve found to hold up is one thread per virtual CPU. That means an Amazon Web Services c4.4xlarge, which cost $0.47/hour with reserved pricing as of March 2020, can handle 16 NBA evaluations in parallel at full load, as it has 16 vCPUs. To be conservative, let’s assume we’d only run 10 parallel threads, to leave some headroom and to account for less than linear scaling, although Pega’s practice of using partitioning on Customer ID has proven to scale close to linearly.
[Table: Amazon pricing as of March 2020]
Based on these costs, let’s calculate the cost to calculate an NBA for 10 million customers every day. If it would take about 17 days to run through all the NBAs with one thread, it would take about 1.7 days (roughly 40 hours) on one Node to process them with 10 threads running in parallel. But 1.7 days is still too long. Let’s assume we want to get all these calculations done in about 8 hours, to leave ourselves plenty of time in the day to actually send these messages out. To get from roughly 40 hours down to 8 hours, we’d need 5 Nodes. 5 Nodes at $0.47/hour, running around the clock, would cost us $56/day, or $20,586 per year in AWS compute for 10 million customers.
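A short Python sketch of this sizing arithmetic follows. It assumes the Nodes are billed for 24 hours a day, 365 days a year, which is what the $20,586 figure implies; the function names are illustrative. Note that without rounding the per-Node time to 40 hours, 5 Nodes finish in about 8.3 hours rather than exactly 8.

```python
def batch_hours(customers: int, nodes: int,
                threads_per_node: int = 10, ms_per_eval: float = 150.0) -> float:
    """Wall-clock hours to evaluate all customers, assuming linear scaling."""
    total_thread_hours = customers * ms_per_eval / 1000 / 3600
    return total_thread_hours / (nodes * threads_per_node)


def annual_compute_cost(nodes: int, hourly_rate: float = 0.47) -> float:
    """Yearly cost, assuming every Node is billed around the clock."""
    return nodes * hourly_rate * 24 * 365


print(f"{batch_hours(10_000_000, nodes=5):.2f} hours per daily run")
print(f"${annual_compute_cost(5):,.0f} per year")
```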
Is $20k/year too much? If we follow the findings of Forrester’s Total Economic Impact (TEI) study for CDH (February 2020), which documents that, in certain industries, CDH can generate in excess of $15/customer/year in lift, we stand to generate $150 million per year from 10 million customers. $20,000 for the Outbound compute portion of a holistic Next Best Action program seems reasonable in this case: it comes to about 0.013% of the return.
If we scale this up to 100 million customers, we would be facing an AWS compute cost of $205,860, for 50 nodes. Again, not unreasonable against a TEI extrapolated $1.5 billion lift. Of course, there are more costs than only compute (see the TEI analysis of total costs), especially storage costs for the knowledge our models need to learn from responses, but those storage costs would be there even if we used Segment-driven Campaigns to power our Outbound communications. From Pega’s experience operating Pega Cloud, we’ve seen that compute resources account for roughly 85% of the total infrastructure cost, so it is fair to focus on compute costs.
Can we reduce the $205,860 compute cost for 100 million customers further?
So far, our calculations have been based upon the assumption that every customer should be evaluated for their Next Best Action every day. In practice, we know that some clients will enforce Contact policies that only allow a customer to be contacted once or twice a week at most anyway. So, why not relax the calculation frequency?
Although it is practical (and perfectly good practice) to evaluate 10-20 million customers every day within the span of a few hours, by configuring multiple Outbound Schedules we can partition a much larger customer base (naively using ranges of Customer IDs) into 3 roughly equally sized groups and evaluate each group twice per week. Each daily run then covers only one third of the customer base, so we could handle 100 million customers using only 50/3 ≈ 17 nodes, costing us roughly $68,620 per year in compute (one third of $205,860).
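A minimal Python sketch of this partitioning idea is below. The weekday schedule and function names are assumptions made for illustration; only the three-group, twice-per-week split and the cost inputs come from the text. Because each daily run covers one third of the customers, it needs about one third of the 50 nodes (roughly 17), which is consistent with the $68,620 figure.

```python
import math
from typing import List

# Hypothetical weekly plan: each of the 3 groups is evaluated twice per week.
SCHEDULE = {0: ("Mon", "Thu"), 1: ("Tue", "Fri"), 2: ("Wed", "Sat")}


def group_for(customer_id: int, max_customer_id: int, groups: int = 3) -> int:
    # Naive range partitioning on Customer ID (IDs assumed to run 0..max).
    size = math.ceil((max_customer_id + 1) / groups)
    return customer_id // size


def groups_due(day: str) -> List[int]:
    # Which groups does the Outbound Schedule evaluate on this weekday?
    return [g for g, days in SCHEDULE.items() if day in days]


def annual_cost(nodes_for_full_daily_run: float, groups: int = 3,
                hourly_rate: float = 0.47) -> float:
    # Each run handles 1/groups of the base, so it needs 1/groups of the nodes.
    return nodes_for_full_daily_run / groups * hourly_rate * 24 * 365
```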
Can we reduce the need for periodic evaluation by listening to events?
So far, we’ve assumed that every customer needs to be evaluated at the same frequency. This assumption stems from the 25+ year old world of batch database marketing in which customer profile data is updated in an RDBMS from external feeds and database triggers were rarely, if ever, used to detect and act on changes. Modern marketing data management stacks incorporate event streams.
Since we can listen to event streams (typically via a REST API), we don’t have to wait for a customer to be evaluated by a periodic job when we know something interesting is happening – we can evaluate the Next Best Action immediately. With customers’ NBAs being re-evaluated in response to an event (such as the arrival of a Call Detail Record saying that a subscriber’s prepaid balance has dropped below a threshold, or the customer just visited the Terms & Conditions web page for cancelling a contract), the situations where we are left to rely upon looping through every customer on a periodic schedule are reduced to situations where we know there will be no event to trigger an evaluation.
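As a rough illustration of event-triggered evaluation, the Python sketch below re-evaluates a customer as soon as a relevant event arrives. The event type names, the `EventDrivenEvaluator` class, and the idea of skipping already-evaluated customers in the periodic run are all assumptions for illustration, not product behavior.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical event types that warrant an immediate re-evaluation.
TRIGGER_EVENTS = {"prepaid_balance_low", "viewed_cancellation_terms"}


class EventDrivenEvaluator:
    def __init__(self, evaluate_nba: Callable[[str], str]):
        self.evaluate_nba = evaluate_nba     # stand-in for the real NBA call
        self.evaluated_today: Set[str] = set()

    def on_event(self, event: Dict[str, str]) -> Optional[str]:
        # Ignore events that carry no signal worth acting on.
        if event["type"] not in TRIGGER_EVENTS:
            return None
        customer_id = event["customer_id"]
        self.evaluated_today.add(customer_id)
        return self.evaluate_nba(customer_id)

    def batch_candidates(self, all_customer_ids: List[str]) -> List[str]:
        # The periodic job only needs customers no event has covered yet.
        return [c for c in all_customer_ids if c not in self.evaluated_today]
```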
What role would a traditional Campaign Management tool and ESP play – and why?
Pega is not an Email Service Provider (ESP): we don’t actually deliver emails, so our clients need to contract with an ESP for that email delivery service. Full service ‘Marketing Clouds’ like Salesforce’s Email Studio (formerly ExactTarget) include Segmentation and targeting logic. Since CDH is determining who should get which email every day (or throughout the day), the only remaining reason to use the Segmentation features of a Marketing Cloud is to offer an extra level of protection to make doubly sure that people who have Opted out are never emailed. For this reason, Pega CDH clients often choose “pure play” email delivery ESPs such as SendGrid or Mailgun/Mailjet. To facilitate integration with these ESPs, Pega publishes Connectors on our Pega Marketplace, such as the connector for SendGrid.
In organizations where traditional Campaign management products such as Adobe Campaign are used for Segmentation already, the simplest integration pattern is for Pega to output Next Best Actions to a database (or file) and for the Campaign tool to handle the logistics of delivering them to the ESP. The Campaign tool can handle ‘last mile’ mechanics such as Auto-replies, or even higher value decisions like choosing the exact hour to send an email based upon when recipients are known to open email. To facilitate using Adobe Campaign in conjunction with CDH, Pega publishes a design pattern document on Pega Wiki.
Which decisions should be made by Pega and which by the ESP or Campaign Tool?
While Pega has robust support for email treatments, including the ability to personalize those Treatments using attributes from the customer profile, it is common for the ESP, or another Campaign management tool to handle the assembly of the final HTML treatment that is sent to customers.
In this integration pattern, Pega outputs flat files throughout the day (hourly is typical) where each row represents a message that Pega would like to be sent to a customer. That row contains attributes from the customer profile as well as decision details (such as WHY the decision was made). It is a perfectly reasonable partnership for a Campaign management system to consume those files and execute Campaigns that process the files, turning each row in the file into an email. In this scenario, it is the Campaign management system that chooses the final “copy”, consisting of the personalized HTML and creative elements. These same systems, either because they are ESPs, or in conjunction with ESPs they connect to, also handle response tracking, which they feed back to Pega. It is important for customer behaviors (especially click throughs) to be sent back to Pega, in order for Pega’s AI to learn from the decisions made.
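To make the file hand-off concrete, here is a minimal Python sketch that writes one row per requested message as CSV. The column names and the CSV layout are hypothetical; the actual file format is whatever the Campaign management system and the CDH output are configured to agree on.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

# Hypothetical columns: profile attributes plus decision details.
FIELDS = ["customer_id", "first_name", "action", "channel", "propensity", "reason"]


def write_decisions(decisions: Iterable[Dict[str, object]], out: TextIO) -> None:
    # Each row is one message Pega would like to be sent to a customer.
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(decisions)


buffer = io.StringIO()
write_decisions(
    [{"customer_id": "C1", "first_name": "Ana", "action": "CardUpgrade",
      "channel": "Email", "propensity": 0.12,
      "reason": "Top priority after arbitration"}],
    buffer,
)
print(buffer.getvalue())
```

The downstream system would read each row, pick the final copy, and report responses such as click-throughs back so the models can learn.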
While Pega’s AI feedback loop (Adaptive Models) is able to learn more accurately if Pega is choosing which email treatment should be sent to a customer, Pega can still learn even if the Campaign management system is taking Pega’s Action and picking one of several suitable email treatment templates using its own heuristics. Nevertheless, the Pega return on investment will be higher if the AI is able to learn directly from detailed decisions, including treatment level decisions. If another system is choosing the email template, the AI may learn some false truths (maybe the customer rejected the Action because they didn’t like the image, rather than because the offer wasn’t interesting to them). These two modes of learning -- at the Action level vs at the Treatment level -- are directly supported on the Arbitration tab of NBA Designer.
How would I send NBAs to an Outbound call center?
Sending a list of pre-calculated Next Best Actions to an Outbound call center is similar to email, except the Actions sit in a queue until they are “pulled” by the call center application.
For this scenario, we adopt the same approach as for email, where we loop through all customers every day. We would apply an additional level of filtering to the output NBAs to make sure the propensity is above a certain threshold. As of v8.7 (planned release in late 2021 / early 2022), this filtering is built into the out-of-the-box strategies, along with a spot in the NBA Designer UI for configuring what we call a “Propensity Threshold”.
The Actions that pass would be loaded into an Output table, which an Agent UI would query for the “next 50 highest-priority actions”. Each Action has a customer associated with it, which tells the agent who to call. At the moment before the Customer is actually called, the Agent UI should ideally get a fresh NBA for that Customer (by making an inbound real-time container call directly to CDH). Pega does not provide an out-of-the-box Agent UI for Outbound calling as of November 2021, so this would need to be a custom-built UI (as several clients have built).
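The queue behavior can be sketched in a few lines of Python. The in-memory list stands in for the Output table, and the field names, the default threshold value, and the function names are hypothetical; a custom Agent UI would run an equivalent query against the real table.

```python
import heapq
from typing import Any, Dict, List

Decision = Dict[str, Any]


def build_output_table(decisions: List[Decision],
                       propensity_threshold: float = 0.05) -> List[Decision]:
    # Keep only Actions whose propensity clears the Propensity Threshold.
    return [d for d in decisions if d["propensity"] >= propensity_threshold]


def next_calls(output_table: List[Decision], n: int = 50) -> List[Decision]:
    # What an Agent UI might pull: the N highest-priority Actions.
    return heapq.nlargest(n, output_table, key=lambda d: d["priority"])
```

Before dialing, the Agent UI would still ideally request a fresh NBA for the selected customer, since the queued decision may be hours old.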
The time to adopt true Next Best Action to drive outbound customer engagement is now – we are no longer waiting for enabling technology or simpler configuration interfaces. The benefits are already understood based upon pioneers in the industry who have used CDH’s underlying decisioning engine, and built the wrap-around outbound solution processing themselves. The capability is now available out-of-the-box in the latest version of CDH for others to benefit from and to achieve true “channelless” customer centricity.