Do Demographics Matter for Customer Segmentation?
by David Loshin
Originally published December 19, 2013
It has become conventional wisdom that a significant part of customer data analytics involves attribution based on demographic data elements as a prelude to any kind of clustering and segmentation analysis. We naturally think of a marketing strategy in terms of the groups to whom a product is peddled, and of those groups in terms of their similarities. Those similarities are stated in terms of well-defined demographic dimensions, typically sex, age, location and income.
Decision demographics provide a more granular level of insight. For example, grouping individuals by residential location implies that they have (presumably) decided to live in that particular place, and that characteristics of that location hold some appeal for everyone in that cohort.
Lastly, the causality demographic variables provide another layer of dimensionality since the causal relationships can be explicit (“the 33-year-old female executive has a salary of $130,000/year”) or implicit (“the 33-year-old female executive probably has a six-figure salary”), allowing some greater flexibility in extending the use of demographics.
It is worth comparing these different types of variables. Although the chance demographics are good for dissecting larger groups into smaller sub-groups, they are likely to exert less influence on the analytical determinations needed for precise segmentation. Decision demographics are somewhat more interesting in that they reflect conscious decisions that potentially influence clustering and segmentation.
On the other hand, the causality demographics may themselves be influenced by the other two types of demographics. You could infer that these types of demographics are less useful for segmentation because they are a byproduct of other conditions. Adding them into the variable set for a clustering process might not detract from the results, but may not necessarily lead to more precise segmentation.
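As a rough illustration of that last point, the sketch below runs a small hand-written k-means over six made-up customer records, once using only a chance variable (scaled age) and a decision variable (an urban-residence flag), and once with a causality variable (scaled income) added. The records, the variable names and the `kmeans` helper are all illustrative assumptions rather than anything taken from a real dataset, and the income values are deliberately constructed to track the other two variables.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points serve as initial centroids (an assumption
    made here only to keep the example deterministic)."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

# Hypothetical records: (age scaled to 0-1, urban flag, income scaled to 0-1).
records = [
    (0.20, 1.0, 0.30),
    (0.80, 0.0, 0.90),
    (0.25, 1.0, 0.35),
    (0.75, 0.0, 0.80),
    (0.30, 1.0, 0.40),
    (0.85, 0.0, 0.95),
]

# Cluster on the chance and decision variables only, then on all three.
segments_without_income = kmeans([r[:2] for r in records], k=2)
segments_with_income = kmeans(records, k=2)

print(segments_without_income)  # [0, 1, 0, 1, 0, 1]
print(segments_with_income)     # [0, 1, 0, 1, 0, 1]
```

In this contrived case both runs produce the same two segments, so the causality variable neither harms nor sharpens the result. Real data will be messier, which is why the effect is worth checking rather than assuming.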
Instead, the causality demographics may be more useful for two other machine learning and analytics techniques: description and prediction. Patterns of relationships that show correlation between a combination of independent chance and decision demographics and an “as yet undetermined” (yet presumably dependent) variable can be investigated to determine if the relationship between the independent variables and the dependent one is truly causal. If so, experimental models can be developed that infer the probable existence of the causal variable given the existence of the independent variables.
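A minimal sketch of the inference step might look like the following. It estimates how often a six-figure salary appears for each combination of age band (a chance variable) and job level (a decision variable). The records, field layout and the `conditional_rates` helper are hypothetical and exist only for illustration. The rates it reports are observed frequencies, so they show correlation only; whether the relationship is causal would still have to be investigated separately.

```python
from collections import defaultdict

# Made-up records: (age band, job level, has a six-figure salary).
RECORDS = [
    ("30-39", "executive", True),
    ("30-39", "executive", True),
    ("30-39", "executive", False),
    ("30-39", "executive", True),
    ("30-39", "analyst", False),
    ("30-39", "analyst", True),
    ("20-29", "analyst", False),
    ("20-29", "executive", True),
]

def conditional_rates(records):
    """Return the observed share of six-figure salaries per (age band, job level)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [six-figure count, total count]
    for age_band, job_level, six_figure in records:
        tally = counts[(age_band, job_level)]
        tally[0] += int(six_figure)
        tally[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}

rates = conditional_rates(RECORDS)
print(rates[("30-39", "executive")])  # 0.75
print(rates[("30-39", "analyst")])    # 0.5
```

Read this way, "the 33-year-old female executive probably has a six-figure salary" becomes a statement backed by an estimated rate for her group rather than a recorded value.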
In other words, not all demographic variables are created equal. Understanding these subtle differences can inform the analytics processes and potentially reveal interesting methods for inference and prediction.
Copyright 2004 — 2019. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC