Understanding segmentation, and making the most of your data


Customer segmentation is a tried and true part of the marketer’s arsenal. Widely used to drive differentiation in brand communications both above and below the line, segmentation is traditionally sample-based and insight-focused — designed to build marketers’ understanding of who their customers are and how they relate to brands and categories at an aggregate level.

As the volume of data available to marketers has increased over time, so too has the sophistication and granularity of segmentation methods available. When talking about segmentation today, the emphasis is less commonly on customer insight, and more on generating segments that act as an operational tool for one-to-one engagement and customer experience management.

A data-driven approach to segmentation — one that sources data organically from live systems — differs considerably from traditional research-based segmentation. Both have their strengths and weaknesses, and smart marketers would do well to leverage both in their customer engagement strategies. Here are a few thoughts to consider when looking at how your business might approach customer segmentation.

Segmentation is not ‘one size fits all’

A good segmentation provides insight beyond the directly observable, and the insight it provides should also be fit for purpose. This means the segments created are differentiated across a dimension you can use operationally. For example, if your aim is to group shoppers around their fashion purchasing patterns, you need behavioural data. Want to understand how to reposition a car brand? You need to segment based on attitudes and beliefs about vehicle brands. The above points — the strategy aspects if you will — are not impacted by the manner in which segmentation data is collected and analysed.

The two main dimensions along which segmentations can be categorised are type and methodology. Type refers to the dimensions along which individuals are segmented, the most common being demographic, geographic, behavioural and attitudinal. Methodology is the technical approach: the experimental design, statistical techniques and plain old programming that go into delivering a functional segmentation. Decision Analyst offers a useful primer on the methodologies available for those who want more detail.

Both segmentation type and methodology have profound implications for the usefulness and longevity of segmentation initiatives, and this holds true across both research-based and data-driven segmentations.

Choosing a data source: research or organic?

The most overt difference between traditional research-based segmentation and segmentation driven by operational data is that research is intentionally sample-based whilst a data-driven approach uses population-level data. By using solicited sample data, the researcher can design questions to capture exactly the information they want, in exactly the structure they want, allowing efficient analysis. Research is also the only manner in which to explicitly source information on attitudes, motivations and beliefs, and should always provide the foundation for a segmentation relying on such data.

By contrast, a data-driven approach must deal with the inherently messy and unpredictable nature of real population data. Expect a data-driven segmentation project to involve a large piece of exploratory analysis and data preparation — this is the price you pay for having access to a full population of data. Data-driven segmentation’s strengths are a direct counterpoint to research-based segmentation — specifically transactional data, unsullied by human filters, is a far richer and more reliable source of data for behavioural segmentation.

To create a really powerful segmentation, melding these two approaches together gives the best of both worlds: in-depth insight into consumer motivation and beliefs, paired with an accurate view of their behaviour. Of course, finding a linking mechanism brings with it another set of interesting challenges, but a good analytical team will easily overcome them.

The methodology divide: statistical or logical

The main division in methodology is between statistical segmentation using mathematically derived relationships, and a priori segmentation, which creates segments based on logic rules and domain knowledge.

For research-based segmentation, the choice between the two is simple: research is designed with balanced scales, forced responses and weighting matrices to ensure the resultant data set facilitates ‘proper’ statistical analysis. When working with organically sourced data, the answer is not quite so straightforward. How should you treat missing data, and does its absence have meaning? What time window should be used for analysis? How should data be scaled and aggregated?
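To make those three questions concrete, here is a minimal Python sketch. It aggregates a handful of invented transactions over a fixed window, keeps customers with no activity in the window as zeros rather than dropping them, and min-max scales the result. The field layout, the 90-day window and the decision to read absence as zero spend are all assumptions chosen for illustration, not recommendations.

```python
from datetime import date, timedelta

# Hypothetical transactions: (customer_id, purchase_date, amount).
TRANSACTIONS = [
    ("c1", date(2024, 5, 1), 40.0),
    ("c1", date(2024, 6, 10), 60.0),
    ("c2", date(2024, 1, 15), 500.0),  # falls outside the window used below
    ("c3", date(2024, 6, 20), 200.0),
]
CUSTOMERS = ["c1", "c2", "c3", "c4"]  # c4 has never purchased


def spend_in_window(transactions, customers, as_of, window_days):
    """Total spend per customer in the window ending at as_of."""
    start = as_of - timedelta(days=window_days)
    # Assumption: a customer with no transactions in the window spent zero.
    totals = {c: 0.0 for c in customers}
    for customer_id, when, amount in transactions:
        if start < when <= as_of and customer_id in totals:
            totals[customer_id] += amount
    return totals


def min_max_scale(values):
    """Rescale a dict of numbers to the 0-1 range."""
    low, high = min(values.values()), max(values.values())
    span = high - low
    if span == 0:
        return {k: 0.0 for k in values}
    return {k: (v - low) / span for k, v in values.items()}


spend = spend_in_window(TRANSACTIONS, CUSTOMERS, date(2024, 6, 30), 90)
scaled = min_max_scale(spend)
```

Note that c2 and c4 end up with the same value even though one is a lapsed big spender and the other has never bought. Whether that is acceptable depends on what the segmentation is for, which is exactly why the treatment of missing data deserves a deliberate decision.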

The statistical purist’s approach generates a statistically meaningful picture of the relationships between traits using factor analysis or similar techniques, then groups individual customers according to similarity across trait dimensions.

This isn’t the place to get into a technical debate about grouping techniques. Suffice it to say there is huge variety, ranging from simple cluster models through to highly sophisticated machine learning approaches.
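For readers who want a picture of the simplest end of that range, the Python sketch below groups customers by similarity across two trait dimensions using plain k-means. It covers the grouping step only and skips any factor analysis beforehand. The customers, the traits and the choice of two clusters are invented for the example, and k-means is just one of many techniques that could be swapped in.

```python
import math

# Hypothetical, already-scaled traits: (purchase frequency, average basket).
CUSTOMERS = {
    "c1": (0.10, 0.20),
    "c2": (0.15, 0.10),
    "c3": (0.20, 0.15),
    "c4": (0.90, 0.80),
    "c5": (0.85, 0.95),
    "c6": (0.80, 0.90),
}


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, centroids, max_iterations=20):
    """Plain k-means: assign points, recompute means, repeat until stable."""
    centroids = list(centroids)
    assignment = [nearest(p, centroids) for p in points]
    for _ in range(max_iterations):
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its old centroid
        new_assignment = [nearest(p, updated) for p in points]
        centroids = updated
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment, centroids


ids = list(CUSTOMERS)
points = [CUSTOMERS[i] for i in ids]
# Seeding with the first two customers keeps the example deterministic.
assignment, centroids = k_means(points, centroids=points[:2])
segments = dict(zip(ids, assignment))
```

The output is a cluster number per customer and a pair of centroids. Nothing in it says what the clusters mean in business terms, so that interpretation still has to be supplied afterwards.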

What all statistical approaches have in common, and what differentiates them from a priori approaches, is that segments are generated from an array of mathematical relationships and frequently have subtle multi-dimensional differences, which can be hard to articulate and even harder to operationalise in the brutally simplistic world of traditional marketing communications.

The polar opposite approach is a priori segmentation. A priori can be done with no statistical analysis at all (although that’s not a recommended route).

It takes business knowledge about patterns of customer behaviour and uses this to formulate hypotheses or rough segments, which are then mathematically refined and optimised. This approach is often taken by — and executed poorly by — technically unsophisticated analysts, and hence gets a bad rap. But when done well by a team with strong domain knowledge it can result in powerful, easily communicated and operationally useful segmentation models.

The practical benefit is obvious: rather than trying to torture algorithms into giving an operationally meaningful outcome, one can write clear logic to define segments that fit with business strategy and market realities.
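As a rough illustration of what such logic might look like, the Python sketch below assigns customers to segments with a few ordered rules. The segment names, the fields and every threshold are hypothetical. In practice they would come from domain knowledge and then be refined against the data, as described above.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    orders_last_year: int
    days_since_last_order: int
    total_spend: float


def assign_segment(c: Customer) -> str:
    """Rules are checked top to bottom; the first match wins."""
    if c.orders_last_year == 0:
        return "lapsed"
    if c.days_since_last_order > 180:
        return "at_risk"
    if c.orders_last_year >= 6 and c.total_spend >= 1000:
        return "loyal_high_value"
    return "occasional"


customers = [
    Customer("c1", orders_last_year=0, days_since_last_order=400, total_spend=0.0),
    Customer("c2", orders_last_year=2, days_since_last_order=200, total_spend=150.0),
    Customer("c3", orders_last_year=8, days_since_last_order=12, total_spend=1800.0),
    Customer("c4", orders_last_year=3, days_since_last_order=30, total_spend=240.0),
]
segments = {c.customer_id: assign_segment(c) for c in customers}
```

Because each rule is a readable condition, the definition of a segment can be reviewed by someone who does not write code, and the order of the rules makes the priority between overlapping segments explicit.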

What next — operationalising a segmentation project

The third area that should be considered when developing a segmentation project is how it is to be operationalised. When segments are to be regenerated repeatedly, it’s important that the approach lends itself to automation or, at least, as little manual process as possible. When talking data-driven segmentations, an a priori approach tends to be easier to build into marketing automation — segments are generated through logical rules, and these rules can be designed to align clearly to campaign treatments, CRM offers or engagement triggers.

Statistical segmentations can of course still be pushed into marketing automation, and frequently are with much success. However, to do this, each segment allocation needs to be transformed into a separate scored model, with prioritisation based on relative scores. This is an additional step, though certainly worth it if segments are being defined relative to market rather than based on fixed trait sets.
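One possible reading of that step is sketched below in Python. Each segment gets its own score for a customer, and the customer is prioritised into segments by relative score. The centroids, segment names and the distance-based scoring function are assumptions made for the example. A production scored model would more likely be fitted separately for each segment.

```python
import math

# Hypothetical centroids from an earlier statistical segmentation,
# on two scaled traits: (purchase frequency, average basket).
SEGMENT_CENTROIDS = {
    "bargain_hunters": (0.15, 0.15),
    "big_spenders": (0.85, 0.90),
    "steady_regulars": (0.70, 0.30),
}


def segment_score(traits, centroid):
    """Score in (0, 1]: 1.0 at the centroid, falling as distance grows."""
    return 1.0 / (1.0 + math.dist(traits, centroid))


def score_customer(traits):
    """One score per segment, so each segment acts as its own model."""
    return {name: segment_score(traits, c) for name, c in SEGMENT_CENTROIDS.items()}


def prioritise(scores):
    """Segment names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


scores = score_customer((0.75, 0.40))
ranking = prioritise(scores)
primary_segment = ranking[0]
```

Keeping the full ranking rather than only the top segment lets an automation platform fall back to the second-best treatment when the first is unavailable or already used.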

There are of course many additional things to consider when undertaking a segmentation initiative. However, if you carefully contemplate the above dimensions, the risk of producing a segmentation that is interesting but not particularly useful is considerably reduced.

About the Author

Anna Russell is a director at Polynomial, a Sydney-based analytics and strategy consultancy. Polynomial works with businesses to drive value from technology investment and develop effective data driven strategies for marketing and customer engagement.
