Customer segmentation is the process of separating customers into groups where the members of each group share a set of demographic, psychographic and/or behavioral characteristics. Customer segmentation is an ongoing exercise, and segments evolve dynamically to match changes in the customer portfolio.
Segmentation helps us understand the nature of our customer portfolio. It helps us improve customer service across channels, realize measurable ROI on marketing spend, lower operational cost, ensure increased visibility for regulatory compliance and increase revenues through up-selling and cross-selling opportunities.
Customer segmentation is a tool that can be used for a wide variety of functions, and it forms the basis of many decisions taken by a bank or financial services organization. These include marketing initiatives, targeted sales, customer life-cycle management, devising fee structures and charges, developing credit and collection strategies, portfolio and policy management, risk management, product innovation and more.
For segmentation to be successful, the segments need to be drawn using the data that is already in place, forming cohesive groups around a predetermined set of benefits that the bank seeks to either enhance or reduce within a group of customers.
The idea of sampling is core to segmentation: the quality of the segmentation often depends on the quality of the sample.
Getting Started with Sampling
In our last discussion (Is your data lying to you?), we introduced a few initial points about sampling, the concept of a sampling distribution and highlighted some initial reasons why sampling is pertinent to predictive analytics.
In this discussion, we will introduce a few more advanced sampling techniques, highlight the relevance of these techniques and address the question of sample size. We will conclude with an overview of statistical clustering for efficient customer segmentation.
Simple Random Sampling
The simple random sample is fundamental to the employment of statistical procedures. The goal of any statistical procedure is to either draw conclusions about a larger population or make future predictions for it. These two outcomes are commonly referred to as predictive analytics.
A representative sample must not introduce bias for any group or collection of observations from the population. If bias is present, the results obtained from the analysis will most likely be biased as well. Thus, a simple random sample is one in which every observation from the population has an equal chance of being chosen as part of the sample. This is very important, because analyzing the whole population is often not feasible in the time available to make relevant business decisions.
The simple random sample, or SRS, is most easily envisioned as the process of blindly and randomly choosing observations from a population up to a predetermined sample size. For example, if a bank is trying to analyze the performance of 10,000 clients acquired in 2014, then an SRS of size 100 would mean any of the 10,000 customers has the same opportunity of being selected, i.e. every client has the same probability of inclusion in the sample: 100 in 10,000, or 1 in 100.
Moreover, the resultant set of 100 clients should be spread evenly over the population of 10,000. We should not have more clients in the range 1 to 100 of the client list in the sample than we have for clients between 500 and 600, or between 9,800 and 9,900.
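The SRS described above can be sketched in a few lines of Python using the standard library. The client IDs here are hypothetical placeholders for the 10,000 clients in the example; the fixed seed is only for reproducibility.

```python
import random

# Hypothetical population: IDs for the 10,000 clients acquired in 2014
population = list(range(1, 10_001))

random.seed(42)  # fixed seed so the illustration is reproducible

# Simple random sample of 100 clients: every client has the same
# inclusion probability (100 / 10,000 = 1 in 100)
srs = random.sample(population, k=100)

print(len(srs))       # 100
print(len(set(srs)))  # 100 -- sampling without replacement, no duplicates
```

`random.sample` draws without replacement, which matches the SRS definition: no client can appear twice, and every client is equally likely to be included.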
Such an analysis is typically helpful for getting high-level information regarding a set of clients and for identifying attributes or behaviors. It is commonly used when sample sizes are not too large and where there are standard attributes across the entire portfolio.
Stratified Random Sampling
All statistical procedures rely on the quality of the representative sample to provide meaningful insights and drive business decisions. The method in which we draw a sample from the population depends on the outcomes that we seek. There exists an inherent assumption that we make when we draw a simple random sample. That assumption is that all of the observations, for instance all members of a credit union, are equally relevant to the outcome that we are trying to explain. Often, this assumption is false.
For instance, we may have customers who have individual attributes (characteristics) that make them more or less desirable. We may want to treat these customers differently by segmenting them out for our analyses. When we have inherent groups within a population that we wish to consider together we must create our sample sets so that these groups are represented in proportion. This type of sampling is called Stratified Random Sampling.
When constructing a stratified random sample we first 'stratify', or divide, our population into meaningful groups and then choose a simple random sample within each group. For instance, suppose you are a bank seeking to optimize your mortgage product offerings for a current portfolio that is a 30/70 mix of residential and commercial mortgage customers. Your sample must then also consist of 30% residential and 70% commercial accounts.
To avoid bias within the sample, you first draw a simple random sample from the residential group and then draw another simple random sample from the commercial group, with sample sizes in proportion to the overall portfolio. The idea of stratified sampling is to preserve the inherent 'structure' within your data to reduce bias in the results.
Permuted Block Design
After building a statistical model, we test and validate the proposed model on actual data to evaluate its efficacy. For instance, we may test a new marketing program against a new consumer loan offering and evaluate the quality of the results for each relative to the other. In this scenario, we have an 'A' vs. 'B' testing situation, or trial. We therefore wish to randomize customers to the trial, assigning each to either group 'A' or group 'B', and evaluate the outcome.
In this situation we often do not know the number of total customers we will have in advance and so the proportion of customers, in order to remain unbiased, needs to be evaluated often. The permuted block design which is a variant of the simple random sample allows this evaluation to happen in real time resulting in a more evenly distributed representative sample of customers to each testing group.
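One common way to implement this, sketched here under the assumption of a fixed block size of four, is to assign arriving customers using random permutations of balanced blocks. Within every completed block the 'A'/'B' split is exactly even, so the allocation stays balanced in real time even though the total number of customers is unknown in advance.

```python
import random

random.seed(1)

def permuted_block_assignments(block_size=4):
    """Generator of 'A'/'B' assignments. Each block is a random permutation
    of half 'A's and half 'B's, so groups stay balanced as customers arrive."""
    assert block_size % 2 == 0, "block size must be even to balance two groups"
    while True:
        block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
        random.shuffle(block)  # randomly permute the block's assignments
        yield from block

gen = permuted_block_assignments(block_size=4)
arms = [next(gen) for _ in range(20)]  # 20 customers arrive over time
print(arms.count("A"), arms.count("B"))  # 10 10 -- balanced at every block boundary
```

Because assignments come from the generator one at a time, this works even when customers trickle in: at any block boundary the two groups differ by zero, and mid-block they differ by at most half a block.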
Sample Size
For any predictive analytics exercise it is important to know the required sample size. Sample size governs the assessment of whether or not a model is indeed valid or a result obtained is verified. When making a 'yes' or 'no' decision there are four (4) groups involved. It is not simply the 'yes' (true positive) group and the 'no' (true negative) group. In addition, we also have false positives and false negatives in the mix.
Suppose, after conducting a predictive analytics exercise for a new consumer loan offering, the results of the analysis indicate a previously unidentified segment within your current customer portfolio that exhibit an abnormally low risk for the loan offer. With this knowledge we may quickly want to extend offers to these customers. After a period of time we may wish to verify if the decision to extend the additional offer was more profitable than extending the offer to the average customer. We test the decision by comparing the two groups.
Sample size now becomes important. Firstly, we must have enough data to verify that the results are not mere chance. Too much data or too little data can lead to spurious results. Sample size is therefore a balancing act. Beyond the decision itself, we must also take into account the ability to measure a positive result and control the probability of an adverse one. The chances of a false positive and a false negative are referred to as Type I and Type II error, respectively.
Beyond having enough of a sample, more is not always better. Increasing sample size past what is actually needed can lead to statistically significant conclusions that are not practically relevant to the target group; in other words, everything becomes significant. One must therefore be careful to have neither too much data nor too little, and the right size depends on the individual problem you are attempting to solve.
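The balancing act above can be made concrete with the standard sample-size formula for comparing two proportions. The default rates below (3% for the average customer vs. 1.5% for the hypothetical low-risk segment) are illustrative assumptions only; the z-values correspond to the conventional 5% two-sided significance level and 80% power.

```python
import math

def sample_size_two_proportions(p1, p2):
    """Per-group sample size to detect a difference between proportions
    p1 and p2 with a two-sided z-test at alpha = 0.05 and 80% power.
    z_{alpha/2} = 1.96, z_{beta} = 0.8416 (standard normal quantiles)."""
    z_alpha, z_beta = 1.96, 0.8416
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)

# Hypothetical: 3% default rate overall vs. 1.5% in the low-risk segment
n_per_group = sample_size_two_proportions(0.03, 0.015)
print(n_per_group)
```

Note how the required size grows as the effect shrinks: halving the gap between the two rates roughly quadruples the sample needed, which is exactly why "enough but not too much" has to be computed per problem rather than guessed.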
Statistical Clustering
Clustering can be described as performing a stratified random sample over dynamically changing attributes. The idea is to segment based on the problem at hand. Since we often do not know inherently which attributes will be more indicative of a given scenario, we need to rely on statistical procedures to find these hidden gems. Like needles in a haystack, significant attributes are often very difficult to find. Data is often elusive and misleading. The gems we seek are often hidden within a combination of things we can measure.
Clustering is often an exercise in humility. It is a tool of the expert. How we define segmentation is a custom process and often involves a large toolbox of statistical procedures to find the optimal segmentation to solve the problem at hand.
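As one concrete example from that toolbox, here is a minimal pure-Python sketch of k-means clustering. The customer attributes (average balance, monthly transactions) and the two synthetic groups are hypothetical; in practice one would use a vetted library implementation and far more careful feature selection and scaling.

```python
import math
import random

random.seed(3)

# Hypothetical customers: (avg_balance, monthly_transactions), drawn
# from two loosely separated groups purely for illustration
customers = [(random.gauss(2_000, 300), random.gauss(10, 2)) for _ in range(50)] + \
            [(random.gauss(20_000, 3_000), random.gauss(40, 5)) for _ in range(50)]

def kmeans(points, k, iters=20):
    """Minimal k-means: repeatedly assign each point to its nearest
    centroid, then move each centroid to its cluster's mean."""
    centroids = random.sample(points, k)  # initialize from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = kmeans(customers, k=2)
print([len(c) for c in clusters])  # cluster sizes over the 100 customers
```

Even this toy version shows why clustering demands expertise: the result depends on the chosen attributes, their scales (balance dwarfs transaction count here unless standardized), the number of clusters k, and the random initialization.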
In this discussion, we introduced customer segmentation using predictive analytics, the importance of sampling and several sampling strategies that can be used prior to segmentation. In our next article, we will discuss wallet-share maximization, using predictive analytics and customer segmentation to identify new opportunities within an existing customer base.