Pinpointing Customer Behavior Predictors — A Step-By-Step Deconstruction Process
by Derick Jose
Originally published March 8, 2011
Increasingly, in the insurance, retail, telecom and banking industries, the ability to accurately pinpoint customers who have a greater chance of exhibiting certain behavior, and to proactively put interventions in place, is becoming a competitive differentiator. One of the most important tasks in modeling customer behavior is to zero in precisely on the predictors that influence a behavioral outcome, using advanced analytical models. For example, in the insurance industry, a company might want to identify the top five influence levers that determine whether or not a policyholder would cancel his or her policy. Some questions that arise are:
In all the business scenarios across industries, the search for behavioral predictors is almost a forensic science where the behavioral investigator needs to methodically sift through terabytes of raw electronic and digital data trails left behind by customers with every interaction. He or she needs to investigate the sequence of transactional patterns leading to the outcome (such as policy cancellations), and to investigate if there are statistically significant patterns or triggers in the electronic data trail, which could have served as an early warning sign. Once the most important predictors are determined, their influence on customer behavior should be ranked and sensitivity analysis should be performed to quantitatively measure expected impact on customer behavior for changes in predictors.
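The sensitivity analysis mentioned above can be illustrated with a minimal sketch. It assumes a logistic scoring model has already been fitted; the predictor names, the coefficient values and the baseline policy are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients of an already fitted logistic model.
# These are illustrative numbers, not estimates from real policy data.
INTERCEPT = -1.0
COEFFICIENTS = {
    "premium_share_of_salary": 4.0,   # premium as a fraction of salary
    "policy_tenure_years": -0.3,
    "complaints_last_90_days": 0.8,
}

def cancel_probability(policy):
    """Logistic function applied to the weighted sum of the predictors."""
    score = INTERCEPT + sum(COEFFICIENTS[k] * policy[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-score))

def sensitivity(policy, predictor, delta):
    """Change in cancellation probability when one predictor moves by delta."""
    shifted = dict(policy)
    shifted[predictor] += delta
    return cancel_probability(shifted) - cancel_probability(policy)

# Hypothetical baseline policyholder.
baseline = {
    "premium_share_of_salary": 0.10,
    "policy_tenure_years": 2,
    "complaints_last_90_days": 1,
}

# Expected impact of a small change in each predictor, others held fixed.
impact = {
    "premium_share_of_salary": sensitivity(baseline, "premium_share_of_salary", 0.05),
    "policy_tenure_years": sensitivity(baseline, "policy_tenure_years", 1),
    "complaints_last_90_days": sensitivity(baseline, "complaints_last_90_days", 1),
}
```

Each entry in `impact` is the shift in predicted cancellation probability for that single change, which gives one simple way to compare predictors on a common scale.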
While we have used the example of modelling cancellations in the insurance industry thus far, the same process can be extended to other industries, such as telecom, retail or health care, that are also trying to predict customer behavior.
The following schematic provides a three-step process which can be used to investigate behavioral drivers. The first step is to research the factors which influence a behavior. The second step consists of assembling quantitative data measuring the specific behavior, and the third step consists of administering statistical tests to confirm linkage between causal factors and the behavioral outcome.
Source 1: Field Immersion
In field immersion, the behavioral modelling/investigation team spends a day in the life of an agent or customer trying to get direct experience of the factors at play while an agent is trying to persuade an existing subscriber to keep his or her policy. The rationale for direct customer immersion is based on the fact that direct observations of these interactions can offer a window into agent-customer dynamics and yield new insights as to what causes a policyholder to renew a policy. This can then be statistically modeled into the policy scoring model, provided the data exists. Also, through field immersion, a behavioral investigator gets access to nuances and behavioral dynamics at play, which we may not be privy to with other methods.
For example, observing the agent-customer interaction during a field visit led us to hypothesize that agents had a greater influence over a policyholder if the agent-customer relationship was longer than three years and the customer’s age was greater than 47 years. This was statistically validated by a simple discriminant analysis and hypothesis testing and finally fed as an input to the scoring process. This was one practical example of a modeling variable received from field immersion where the statistical test of significance (chi square value) was better than those received from hypothesis workshops. Similarly, in another interaction, it was observed that in certain zip codes, there was a unique word-of-mouth referral campaign being executed that was causing policyholders to cancel. This was an instance in which data to model this phenomenon was not readily available and the modeling team brainstormed with the customer team to institutionalize a new data collection process to capture details about competitive campaigns that fuelled cancellations.
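As a rough illustration of the kind of significance test mentioned above, the sketch below computes a Pearson chi-square statistic for a 2x2 table of customer segment against renewal outcome. The counts are hypothetical, and the function is a plain textbook implementation without continuity correction, not the procedure actually used in the engagement described.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if segment and outcome were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts. Rows: customers over 47 with an agent relationship
# longer than three years, then all other customers.
# Columns: renewed, cancelled.
observed = [
    [90, 10],
    [60, 40],
]

statistic = chi_square_statistic(observed)

# 3.841 is the 5% critical value of the chi-square distribution
# with one degree of freedom (the case for a 2x2 table).
is_significant = statistic > 3.841
```

A statistic above the critical value suggests the renewal rate differs between the two segments by more than chance alone would explain, which is the evidence needed before feeding the variable into the scoring process.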
Source 2: Contact Center (Text Mining Conversations)
Frequently, mining customer conversations can yield clues as to the triggers for customer behavior. In this method, clues for policy cancellations are searched for among a subset of call center transcripts (a sub-sample of 90 days of customer conversations in which an outbound customer service representative records the key points of conversation regarding a policyholder's reasons for not renewing). Since the volume of conversations is in the thousands, it is difficult to filter these conversations manually. However, they can be fed through an unstructured text mining process, which can extract the top 10 themes. For example, we can track the frequency of occurrence of certain "Watch List" keywords, where a sudden increase in the frequency of keywords or themes such as "Poor Service" or "Premium Amount" can signal a subscriber's intent to not renew the policy.
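The watch-list idea can be sketched in a few lines. This is a deliberately simple phrase counter, not a full text mining process; the two phrases come from the example above, while the transcripts and the spike threshold are hypothetical.

```python
import re
from collections import Counter

# Watch-list phrases taken from the example themes in the text.
WATCH_LIST = ("poor service", "premium amount")

def watch_list_counts(transcripts):
    """Count case-insensitive occurrences of each watch-list phrase."""
    counts = Counter()
    for text in transcripts:
        lowered = text.lower()
        for phrase in WATCH_LIST:
            counts[phrase] += len(re.findall(re.escape(phrase), lowered))
    return counts

def spiking_phrases(previous, current, ratio=2.0):
    """Phrases whose count grew by at least `ratio` between two periods.

    The ratio is an assumed threshold; a previous count of zero is
    treated as one so that a single new mention is not flagged.
    """
    return [p for p in WATCH_LIST if current[p] >= ratio * max(previous[p], 1)]

# Hypothetical call notes for two consecutive periods.
last_period = ["Customer mentioned poor service once."]
this_period = [
    "Poor service again.",
    "Complained about poor service and the premium amount.",
    "Says poor service is the reason for leaving.",
]

flagged = spiking_phrases(watch_list_counts(last_period),
                          watch_list_counts(this_period))
```

In practice the threshold and the comparison window would need tuning against historical cancellation data.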
Source 3: Exit Survey
Another technique that can be utilized is to design a simple questionnaire that can probe for the causal factors that resulted in the behavior. For instance, in the insurance example, we could explore agent experience as a possible cause, competitive policy structures as another cause, etc.
Source 4: Advanced Analytics (Automatic Predictor Ranking)
In this method, a set of behavioral dimensions is identified, and a set of quantitative measures of behavior relevant to the outcome being predicted is assembled. In the case of the cancellation modelling problem in the insurance industry, this would consist of capturing payment frequency, premium share of salary, policy tenure, frequency of outbound reminder calls, and recency of the last inbound complaint call in a cancellation-specific analytical data mart.
The following list outlines a sample set of variables which are assumed to influence the probability of a policyholder cancelling. Each of these information components will have to be sourced from a myriad of IT applications to create a 360-degree view of the policyholder. Once a 360-degree view of the policyholder is built, it becomes easier to run the statistical process on the integrated dataset to rank the variables on the influence they exert independently on the cancellation rates.
Cancellation Trigger – Predictor Bank
Once a 360-degree view of the policyholder is assembled with regard to crucial behavioral elements, we can run two important analytical techniques that provide insight into the most important influencing elements. The first technique is attribute ranking, and discriminant analysis can be used to explore the data. For example, the objective of discriminant analysis is to investigate differences between cancellations and retainers by identifying the most important discriminating variables like policy tenure, timing of (during tax period or not), financial advisor channel/agent channel, premium amount/share of salary, etc.
The second technique is to run a logistic regression model, using tools like SAS, SPSS and Oracle Data Miner, to figure out the relative rank of each influence. The logic used to interpret logistic regression is beyond the scope of this article, but a brief overview is given below:
Is the R square value greater than the statistically significant threshold?
Identify additional internal predictors, external predictors, structured predictors and unstructured predictors (keyword frequency), which can be introduced to boost the model's predictive power.
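To make the ranking step concrete, here is a minimal sketch that fits a logistic regression by plain gradient descent and ranks predictors by the absolute size of their standardized coefficients. It uses only the Python standard library and stands in for what a statistical package would do; the two predictor flags and all the counts are hypothetical.

```python
import math

def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]

def fit_logistic(rows, labels, steps=2000, lr=0.5):
    """Batch gradient descent on the log-loss; returns (intercept, weights)."""
    n, k = len(rows), len(rows[0])
    intercept, weights = 0.0, [0.0] * k
    for _ in range(steps):
        grad_b, grad_w = 0.0, [0.0] * k
        for x, y in zip(rows, labels):
            z = intercept + sum(w * v for w, v in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y  # predicted minus actual
            grad_b += error
            for j in range(k):
                grad_w[j] += error * x[j]
        intercept -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return intercept, weights

# Hypothetical policy counts:
# (recent_complaint, long_tenure, number cancelled, number renewed)
groups = [(1, 0, 4, 1), (1, 1, 3, 2), (0, 0, 2, 3), (0, 1, 1, 4)]
records = []
for complaint, tenure, cancelled, renewed in groups:
    records += [(complaint, tenure, 1)] * cancelled
    records += [(complaint, tenure, 0)] * renewed

names = ["recent_complaint", "long_tenure"]
columns = [standardize([r[j] for r in records]) for j in range(len(names))]
rows = list(zip(*columns))
labels = [r[2] for r in records]

_, weights = fit_logistic(rows, labels)

# Standardized coefficients share one scale, so their absolute size gives
# a rough relative rank of each predictor's influence.
ranking = sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)
```

Standardizing before fitting is the design choice that makes the coefficients comparable; on raw units, a predictor measured in large numbers would look artificially weak.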
Source 5: Deep Dive Workshops
Here, some of the most knowledgeable field agents who understand the markets very well huddle together for a brainstorming workshop, in which a facilitator guides the conversations trying to explore the reasons for a policyholder not renewing his or her policy. The deliverable at the end of this workshop is a prioritized set of behavioral policyholder predictors which can then be used for statistical testing.
The rationale for employing this technique is that the ability to tease out subjective behavioral hunches is essential to behavioral modeling. In many respects, a skilled behavioral analyst, while grounded in analytics, also employs the powers of perception and intuition. This allows him or her to consider the significance of causal factors for cancellations that others would either fail to consider or quickly dismiss as insignificant.
Source 6: Field Hypothesis Harvester
The biggest source of intelligence in any industry is the sales "foot soldiers" on the ground (in the insurance industry, the financial advisors and agents), who interact with their customers every day. If there is a mechanism by which they can share what they are hearing from their customers, it can dramatically increase the number of causal factors one can model.
Behavioral "Triangulation" of Predictors
After the independent predictor harvesting exercise across different processes is completed, the behavioral investigation team collates the top 10 predictors. The top 10 behavioral predictors are broken up by the process used to collect them—Field Immersion, Contact Center Mining, Automated Detection of Patterns from Policy 360 Data Mart and Cancellation Exit Surveys. The objective is to see if certain policy cancellation predictors are being reinforced by multiple methods; if so, then it increases the confidence level in that predictor’s accuracy.
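The reinforcement check described above amounts to counting how many independent methods surfaced each predictor. A minimal sketch follows; the method names mirror the four processes listed above, while the predictor lists assigned to each method are hypothetical.

```python
from collections import defaultdict

# Hypothetical top predictors surfaced by each collection process.
surfaced_by = {
    "field_immersion": ["agent tenure", "policy age"],
    "contact_center_mining": ["premium amount band", "policy age"],
    "policy_360_data_mart": ["policy age", "premium amount band", "agent tenure"],
    "exit_survey": ["premium amount band"],
}

def triangulate(surfaced_by):
    """Return (predictor, methods) pairs, most widely reinforced first."""
    methods_for = defaultdict(set)
    for method, predictors in surfaced_by.items():
        for predictor in predictors:
            methods_for[predictor].add(method)
    # Sort by number of supporting methods (descending), then by name.
    return sorted(methods_for.items(), key=lambda item: (-len(item[1]), item[0]))

reinforced = triangulate(surfaced_by)
support = {predictor: len(methods) for predictor, methods in reinforced}
```

Predictors supported by several methods would be the first candidates for inclusion in the model, while those surfaced by only one method may warrant further investigation.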
For example, in the scenario given below, the X-axis outlines all the cancellation predictors and the Y-axis outlines the process used to surface each predictor. The cancellation predictors identified by the behavioral research team are policy age, premium amount band, agent tenure, etc. The methods used for surfacing the predictors are field visits with agents, mining call center conversations, etc.
Key observations from the above predictor map are as follows:
As can be seen from the examples above, increasing the breadth of sources you draw on when building the analytical model can yield rich dividends in the form of a more robust predictive model. The behavioral triangulation process reinforces the case for introducing a behavioral parameter into the analytical model to increase its predictive lift. As a saying often attributed to Abraham Lincoln goes, "If I had six hours to chop down a tree, I'd spend the first four sharpening the axe." Similarly, in trying to model customer behavior analytically, the bulk of the time must be spent not on the statistical modelling itself, but on the preparatory process of researching causal factors and stitching together a 360-degree view of the customer's behavior against which those factors can be quantitatively tested.