
Analytics: Techniques for Defining Customer Classes

Originally published October 31, 2013

Customer segmentation is a process that examines selected characteristics of an individual customer to provide insight into how customers with similar characteristics will behave in certain business scenarios. In the best situations, determining a customer’s segment informs the operational business processes that influence the customer to make choices leading to a result that is both desired by the company and beneficial to the customer. A simple example would involve segmenting customers by age groups, such as 18-34, 35-50, and 51 and older, with the intent of customizing the presentation of marketing material and offers based on presumptions about individual preferences influenced by age.

Supervised Approach

There are two basic approaches to employing analytic techniques for customer classification, categorization and segmentation: supervised and unsupervised. When using a supervised approach, the analysts make presumptions about the predispositions, preferences or desires of individuals within a group based on research. Our prior example of age segmentation might have been the result of a supervised approach based on analyzing purchase transactions for certain products, which suggested a distinction in buying patterns by customer age.

The approach is “supervised” because the analysts already have some hypotheses they are looking to test and validate, such as “younger people buy more boxes of diapers.” They will select the specific dependent variable (“number of boxes of diapers purchased”), and they may direct the selection of potential independent variables (age groupings).

The more precise the hypothesis, the simpler it is to test by making specific queries and comparing the answers. If the hypothesis were “Customers who are 25 purchase more diapers than customers who are 49,” then two queries could be executed: one tallying the count of diaper boxes sold when the customer age=25 and one tallying the count of diaper boxes sold when the customer age=49.

The challenge arises when the hypothesis is less precise and the question changes somewhat. Instead of a direct comparison of sales volume by specific age, the analyst might want to look at the corresponding sales volumes by customer ages from 18 to 65, examining the trends and identifying customer age bands based on levels of diaper sales. The resulting bands are the segments for diaper marketing based on the specific independent variable (in this case “age”).
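
The two kinds of question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the purchase records, the function names and the threshold of 3 boxes that separates “high” from “low” sales are all hypothetical and do not come from any real data set.

```python
from collections import Counter

# Hypothetical purchase records: (customer_age, boxes_of_diapers_purchased)
purchases = [
    (22, 3), (25, 4), (25, 2), (28, 5), (31, 4),
    (36, 2), (41, 1), (49, 1), (49, 0), (58, 1),
]

def boxes_sold_at_age(records, age):
    """Tally the boxes sold to customers of exactly the given age."""
    return sum(boxes for a, boxes in records if a == age)

# Precise hypothesis: two point queries, then a direct comparison.
sold_25 = boxes_sold_at_age(purchases, 25)
sold_49 = boxes_sold_at_age(purchases, 49)
hypothesis_holds = sold_25 > sold_49

# Less precise question: tally sales for every age from 18 to 65.
boxes_by_age = Counter()
for age, boxes in purchases:
    if 18 <= age <= 65:
        boxes_by_age[age] += boxes

def age_bands(totals, threshold):
    """Merge consecutive observed ages that share a sales level into bands.

    The threshold separating "high" from "low" is an assumed cutoff.
    """
    bands = []
    for age in sorted(totals):
        level = "high" if totals[age] >= threshold else "low"
        if bands and bands[-1][2] == level:
            # Same level as the previous age: extend the current band.
            bands[-1] = (bands[-1][0], age, level)
        else:
            bands.append((age, age, level))
    return bands

bands = age_bands(boxes_by_age, threshold=3)
```

Each entry in `bands` is a `(youngest_age, oldest_age, level)` tuple. With real transaction data the cutoff would more likely be chosen by inspecting the trend than fixed in advance.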

Additional characteristics can be layered into the segment as candidate independent variables to see if they impact the results, such as gender, geographic location and/or income level. If these variables do affect the purchasing trends, their relative impacts can be folded into the business processes and marketing campaigns with more precise targeting.
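
One rough way to check whether an added variable matters is to compare average purchases within each age band before and after splitting by that variable. The sketch below assumes hypothetical records and uses gender as the layered variable; the helper name `mean_boxes_by` is invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (age_band, gender, boxes_of_diapers_purchased)
records = [
    ("18-34", "F", 5), ("18-34", "F", 4), ("18-34", "M", 2), ("18-34", "M", 3),
    ("35-50", "F", 2), ("35-50", "M", 2), ("35-50", "F", 1), ("35-50", "M", 1),
]

def mean_boxes_by(rows, key):
    """Average boxes purchased, grouped by whatever key(row) returns."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[-1])
    return {k: mean(v) for k, v in groups.items()}

# Segment by age band alone, then layer gender on top of it.
by_band = mean_boxes_by(records, lambda r: r[0])
by_band_gender = mean_boxes_by(records, lambda r: (r[0], r[1]))

# Gap between genders inside each band: a large gap suggests that gender
# refines the segment, a gap near zero suggests it adds little.
gender_gap = {
    band: by_band_gender[(band, "F")] - by_band_gender[(band, "M")]
    for band in by_band
}
```

In this made-up data the gap is large in the younger band and zero in the older one, which is the kind of result that would justify more precise targeting for one band only. A real analysis would also test whether such a gap is statistically meaningful.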

The benefit of this supervised approach is that the analysts retain some level of control over the selection of what they believe to be the independent variables. That selection provides some guidance as to the steps to take after the analyses are done, specifically in adjusting the business process or campaigns in reaction to what was learned.

Unsupervised Approach

The second approach is unsupervised, in which the analysis is not predicated on the use of predetermined variables. While the dependent variable may still be asserted, the customers to be segmented are subjected (along with a full complement of attributes) to one or more clustering algorithms that attempt to create the groups. In some cases, the algorithm is told the number of groups to create. In many cases, the analysts must review the decisions made by the algorithms to assign a customer into a particular group in relation to the proposed independent variables and their specific values (or value ranges). The algorithms iteratively formulate groups based on a proposed set of attribute values; after each iteration, the groups are evaluated to assess their “stability,” which might be measured in terms of the sensitivity of the assignment of a customer to a particular group if one of the attribute variables is slightly changed.
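
As a concrete illustration, the sketch below uses k-means, one common clustering algorithm; the article does not name a specific algorithm, so this choice is an assumption. The customer data is hypothetical, the farthest-first seeding is used only to keep the example deterministic, and the stability measure (nudging age by one year and checking whether the assigned group changes) is one simple reading of the sensitivity idea described above.

```python
# Hypothetical customers: (age, boxes of diapers bought per month)
customers = [
    (22, 5), (25, 6), (27, 5), (30, 4),
    (41, 1), (44, 1), (47, 0), (49, 1),
    (58, 6), (61, 5),
]

def sq_dist(p, q):
    """Squared Euclidean distance between two attribute tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def initial_centroids(points, k):
    """Farthest-first seeding, chosen here only for reproducibility."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids

def k_means(points, k, max_iter=100):
    """Iteratively assign points to groups until the assignment stops changing."""
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

def stability(points, labels, centroids, delta=1):
    """Share of customers whose group is unchanged when age shifts by +/- delta."""
    stable = 0
    for (age, boxes), lab in zip(points, labels):
        nudged = [(age - delta, boxes), (age + delta, boxes)]
        if all(nearest(p, centroids) == lab for p in nudged):
            stable += 1
    return stable / len(points)

# The algorithm is told how many groups to create (k=3).
labels, centroids = k_means(customers, k=3)
score = stability(customers, labels, centroids)
```

The attributes are left unscaled here, so age dominates the distance calculation; a real analysis would typically normalize the attributes first and would try several values of `k` before reviewing the resulting groups.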

The benefit of the unsupervised approach is that it may uncover latent patterns and dependencies that are “interesting” in that they can motivate evaluation of opportunities for business advantage. An example might be discovering a small yet discretely discernible segment of individuals over the age of 50 that buy a significant number of boxes of diapers. The logical next step would be to evaluate the potential root causes or motivating factors associated with that cohort (such as “they are grandparents with custodial responsibility for their grandchildren”) and determine whether what you can learn about the cohort can be used to create specialized customer engagement touch points for members of that cohort.

Begin with Supervised Analysis

Both of these approaches are valid and can add value, yet still require engagement with the business process owners to ensure that actions can be taken in relation to what is learned. One good way to examine the potential benefits of a comprehensive customer segmentation program is to start simple: begin with supervised analyses and gradually expand the program in terms of analytics flexibility and capability. Ensure that your business partners are engaged with the program and that pilot business activities can be designed and executed to provide proper testing and validation of the results.

Recent articles by David Loshin


