WHAT IS CLUSTER ANALYSIS?
Cluster analysis is the process of grouping similar data points within the application of business analytics and data mining. In its simplest form, you plot a data set on a set of axes and then map it into smaller groups based on the similarities between the data points.
Retail clustering groups data and transforms it into information that you can use and understand. It allows you as a retailer or supplier to implement any insights generated to improve and optimise your business processes. This practice aims to surface knowledge and previously unidentified patterns in your data as quickly as possible.
Anomaly detection, data classification and cluster analysis are typical tasks conducted in the data mining process.
Clustering can also be referred to as data segmentation because this process partitions the data points into homogeneous groups.
To achieve meaningful results, you should use a clustering algorithm tailored to the market environment. That means the algorithm must be able to process the amount and type of data you have in a time-efficient manner to produce accurate results.
As a retailer or supplier, you can use clustering to understand who your customer is and what drives their purchase decisions. It can help you to tailor your product offering and marketing strategies to your target market.
CLUSTER ANALYSIS APPLICATIONS AND BENEFITS
Within a retail environment, cluster analysis has many applications and benefits.
Firstly, you can use it for customer segmentation. It allows you to group consumers according to similarities, demographics and purchase behaviour.
You can use different techniques to create clusters of consumers and classify them differently according to demographic variables, needs, wants and purchasing patterns.
You can also use store and category-related attributes such as geographic location and available shelf space to cluster a product category. Once you create your clusters, you can better describe, understand and target your main shopper segments per cluster and develop category-level strategies to ensure a profitable outcome for your business.
When implementing cluster analysis in business, it’s critical to understand the difference between standardisation and localisation.
A standardised strategy focuses on a single-assortment, mass-market approach. Meanwhile, a localised strategy focuses on a store-specific approach. A clustered approach is the best mix between the two strategies since it allows you to understand your customers and the financial drivers along with effective resource management to create a profitable result.
The clustering model or algorithm you select should be capable of processing large datasets efficiently, effectively and timeously.
The most commonly used clustering algorithms for retail applications are:
- Partition-based clustering; and
- Hierarchical clustering.
After selecting your clustering model, you should analyse the category performance according to your Fact (sales), Market, Product and Period data. You can obtain this information through your point of sale (POS), loyalty and market data.
The clustering algorithm you choose must ensure that the data points within a cluster are similar while the data points in different clusters are dissimilar.
K-means clustering is an example of a partitioning algorithm most commonly used because it is simple, efficient and flexible. When using this algorithm, you will need to select the number of clusters ‘K’ that you would like to use.
You can use the Elbow method or industry knowledge to determine the optimal number of clusters for your product category.
THE ELBOW METHOD
The Elbow method estimates the number of clusters by looking at how the within-cluster sum of squares (WCSS) falls as you add clusters. You plot a graph where the X-axis represents the number of clusters and the Y-axis represents the total WCSS for that number of clusters.
As the number of clusters increases, the WCSS decreases. To begin with, the rate of decreasing WCSS is steep. When the rate slows, it is shown by an ‘elbow’ or curve in the plot. The number of clusters at the elbow in the plot represents the optimum number of clusters for the dataset.
For retail application, this number should be considered against industry-level knowledge of the market and your business requirements for you to make a final decision.
When K-means runs, it allocates each data point to the nearest centroid (the centre of a cluster) and then recalculates each centroid as the average of the data points assigned to it. The algorithm repeats these two steps until the assignments stop changing.
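The assign-to-nearest-centroid loop and the WCSS curve described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the store figures are hypothetical, the function names are our own, and the deterministic farthest-point starting centroids are an assumption made so that the example is repeatable (many tools start from random centroids instead).

```python
from math import dist


def initial_centroids(points, k):
    """Deterministic farthest-point start (an assumption; many tools start randomly)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point that lies farthest from its nearest chosen centroid.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids


def k_means(points, k, max_iter=100):
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 1: allocate each data point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Step 2: move each centroid to the average of its assigned points.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squares: squared distance of each point to its centroid."""
    return sum(dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))


# Hypothetical stores described by two scaled attributes (e.g. sales index, shelf space).
stores = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0),
          (1.0, 9.0), (1.5, 8.5), (2.0, 9.5)]

wcss_by_k = {}
for k in range(1, 6):
    labels, centroids = k_means(stores, k)
    wcss_by_k[k] = wcss(stores, labels, centroids)

for k, value in wcss_by_k.items():
    print(k, round(value, 2))
```

For this sample data the WCSS drops sharply up to three clusters and only marginally afterwards, so the elbow sits at K = 3. With real retail data the bend is usually less clear, which is why industry knowledge still matters.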
Hierarchical clustering groups the data points into a hierarchical tree, which can be drawn as a graph called a dendrogram.
When using this algorithm, you can select the number of clusters you would like to use based on the Elbow method or industry knowledge. Hierarchical clustering is worth considering when you know little about the structure of your dataset in advance, because the dendrogram it produces is easier to interpret than K-means clusters.
As the number of clusters increases, the accuracy of the hierarchical clustering algorithm tends to improve relative to the K-means algorithm, which can become less accurate as the number of clusters increases.
Agglomerative clustering is a bottom-up method that begins with each data point in its own cluster and then merges the most similar clusters step by step as you move up the hierarchy.
Divisive clustering is a top-down approach that begins with one initial cluster containing all the data points, which is then divided into smaller groups as you move down the hierarchy.
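The bottom-up (agglomerative) approach can be sketched as follows. This is an illustrative sketch only: the store coordinates are hypothetical, the function names are our own, and average linkage is just one of several ways to measure the distance between two clusters. The list of merge distances it records is the information a dendrogram would display.

```python
from itertools import combinations
from math import dist


def average_linkage(cluster_a, cluster_b):
    """Average distance between every pair of points drawn from the two clusters."""
    total = sum(dist(p, q) for p in cluster_a for q in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative(points, n_clusters):
    clusters = [[p] for p in points]  # every data point starts in its own cluster
    merge_heights = []                # distance at which each merge happened
    while len(clusters) > n_clusters:
        # Find the two closest clusters and merge them.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]))
        merge_heights.append(average_linkage(clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]               # safe because i < j
    return clusters, merge_heights


# Hypothetical stores described by two scaled attributes.
stores = [(1, 1), (1.5, 1), (5, 5), (5, 6), (9, 1)]
clusters, merge_heights = agglomerative(stores, n_clusters=3)
print(clusters)
print(merge_heights)
```

Stopping the merging early, as `n_clusters` does here, is equivalent to cutting the dendrogram at a chosen height.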
You can use different methods and variables to create clusters derived from data, reports, spreadsheets and speciality statistical analysis software.
When considering which method to use for your business, it is critical to consider your access to resources such as clean retail data, information technology (IT), marketing managers and buyers.
It is also critical to consider the integration and implementation of the cluster analysis. Once you receive the results, this information must be accessible to all business functions so that it can be used and implemented.
Before selecting your clustering technique, you must understand clustering principles:
- Stores/categories within the same cluster must be as similar as possible in terms of consumer behaviour.
- Clusters must be as far apart as possible in terms of consumer behaviour.
- You must group the maximum number of stores within each cluster for a particular product category.
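One way to check the first two principles numerically is the silhouette score, which compares how close each store is to its own cluster with how close it is to the nearest other cluster. The sketch below is our own illustration and is not part of any particular clustering tool: the store figures and labels are hypothetical, and it assumes at least two clusters and no identical duplicate stores.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette: near 1 means tight, well-separated clusters; near 0 or below means overlap."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-member clusters
            continue
        # a: average distance to the other members of the same cluster
        # (the distance from the point to itself is 0, so it does not affect the sum).
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: average distance to the members of the nearest other cluster.
        b = min(sum(dist(point, other) for other in members) / len(members)
                for other_label, members in clusters.items() if other_label != label)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical stores and two candidate groupings of them.
stores = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
sensible = silhouette_score(stores, [0, 0, 0, 1, 1, 1])
arbitrary = silhouette_score(stores, [0, 1, 0, 1, 0, 1])
print(round(sensible, 2), round(arbitrary, 2))
```

A grouping that keeps similar stores together scores much higher than an arbitrary one, which gives you a simple sanity check before you build assortments around the clusters.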
The selected clustering technique must reflect the variables most important to you. It must also reflect the strategic objectives of your organisation. For example, if you are a retailer and would like to focus on creating customer-centric product ranges, a product assortment-focused clustering technique would be most beneficial.
Below are a few clustering methods worth considering:
- Stores and/or categories are grouped based on seasonal weather patterns.
- Each store branch/category will receive the same product assortment.
- Stores and/or categories are grouped based on their available shelf and floor space. These figures may be measured in available floor space (m²), shelf space (length x height x depth), SKU count or other detailed space planning metrics.
- Store branches are grouped based on the store format (hypermarket, grocery store, convenience store, speciality store etc.).
- Sales outlets are grouped based on a salient characteristic of their local market.
- Stores and/or categories are clustered based on a combination of historical and forecasted data as well as capacity (available floor and shelf space).
- Each sales channel type (e.g. brick and mortar store, online store etc.) will be within the same cluster and receive the same product assortment.
- Store branches and/or categories are grouped based on the presence and intensity of competition within the same market.
- Store branches and/or categories are grouped based on statistical demographic data about the target market.
- Stores and/or categories are clustered based on historical sales data of the product assortment.
CLUSTER ANALYSIS DATA
Before conducting a cluster analysis, it’s best to select an algorithm and method that is in line with the organisational goals of your business as well as the availability of clean data and clustering software.
Some clustering algorithms are simple and require only one data type (e.g. POS data) or variable (e.g. sales). However, other algorithms are more advanced and require multiple data types (e.g. POS and loyalty data) or variables (e.g. sales and demographics).
Point of sale data is referred to as POS data. You can collect this data at your till point, where transactions occur. It is the type of data that is created by, and can be stored directly from, the retail POS system, which comprises software and hardware.
POS data can provide you with information about sales and units movement as well as the average retail selling price for specific products.
Store-related factors such as store code and store name may also be important to use for cluster analysis. If you only have access to POS data, you can use it to cluster products according to sales and units movement as well as the average retail selling price.
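As a rough sketch of how raw POS rows become clustering inputs, the snippet below totals sales and units per product and derives the average retail selling price. The row layout, store codes, product names and figures are all hypothetical; your POS export will have its own structure.

```python
from collections import defaultdict

# Hypothetical POS extract: (store_code, product, units_sold, sales_value)
pos_rows = [
    ("S01", "Shampoo 400ml", 10, 500.0),
    ("S02", "Shampoo 400ml", 30, 1400.0),
    ("S01", "Conditioner 400ml", 5, 300.0),
]


def summarise_pos(rows):
    """Total units and sales per product, plus the average retail selling price."""
    totals = defaultdict(lambda: {"units": 0, "sales": 0.0})
    for _store, product, units, sales in rows:
        totals[product]["units"] += units
        totals[product]["sales"] += sales
    # Average selling price = sales value / units sold (skip products with no units).
    return {product: {**t, "avg_price": t["sales"] / t["units"]}
            for product, t in totals.items() if t["units"]}


summary = summarise_pos(pos_rows)
for product, figures in summary.items():
    print(product, figures)
```

The resulting sales, units and average price figures per product are the kind of variables you could then feed into a clustering algorithm.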
You can collect loyalty data from your customers when they use their loyalty card at a point of sale. This data includes the demographic information that consumers provide when they sign up for the loyalty scheme.
You can use this information along with your POS data and shopper basket data to profile and segment consumers based on their demographics and purchasing patterns. Shopper basket data allows you to understand which consumers buy which products, and what products are frequently purchased together.
With such data, your buyers can develop a product offering that satisfies the wants and needs of shoppers. It can also help you with product bundling. You can use this type of data for almost all types of clustering, which means you can select the attributes that are most important to you and your business.
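To illustrate how shopper basket data reveals products that are frequently purchased together, the sketch below counts how often each pair of products appears in the same basket. The baskets and product names are hypothetical, and a simple pair count is only a starting point compared with a full market basket analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopper baskets, one set of products per transaction.
baskets = [
    {"shampoo", "conditioner"},
    {"shampoo", "conditioner", "soap"},
    {"soap", "toothpaste"},
    {"shampoo", "soap"},
]


def pair_counts(transactions):
    """Count how many baskets contain each pair of products."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always recorded in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with high counts are natural candidates for product bundling.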
You would typically use third-party market data providers to understand market conditions. You can use this type of data to cluster stores according to market conditions and competitor action.
STORE-BASED VS CATEGORY-BASED CLUSTERING
Store-based and category-based clustering are the predominant methods used by the retail sector to create customer-centric merchandising and product assortment tactics.
Originally, retailers adopted a store-based approach using top-down attributes such as store size, sales figures and geographic location to boost their operational efficiency. Store-based clustering can also be referred to as ‘store grouping’. This method is simple to understand and implement across a retail business. Entire stores are grouped based on similarities among them such as LSM (Living Standards Measure), size, store format and performance data. This method may work well for retailers with a few store branches with distinct characteristics. For example, a retailer that has convenience stores, grocery stores and hypermarkets may choose to group their stores according to store format as the customer base for each will be relatively consistent within each format.
However, this method does not consider the different categories within a store, which are approached differently due to the different customer purchasing patterns, wants and needs of each.
As you can see in the image below, eight store branches have been grouped into three clusters. Within each cluster, the categories and product ranges will be exactly the same. This is because stores within the same cluster are said to serve the same consumer market.
Example of store-based clustering
Today, retailers have moved towards a category-based approach to clustering which uses data across all store branches to cluster stores based on similarities in chosen variables. This means that each store may fall within a different cluster for each product category.
By using this method, you can create customer-focused assortment plans aimed at satisfying the needs of the target market. Category-based clustering is a more complicated method that takes more time and effort to implement. However, this will allow you to cater to the different customer markets that shop at various store branches, resulting in increased customer satisfaction and loyalty.
In the image below, you can see that we have clustered eight stores by category. This means that stores 1, 3, 4, 5 and 7 will receive the same product range for hair care. This is because these stores are similar in terms of performance data, target market, LSM etc. for the hair care category.
Example of category-based clustering
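The idea that a store can fall into a different cluster for each category can be sketched as follows. The sales figures, store names and categories are hypothetical, and the grouping rule (sort stores by category sales and cut at the largest gaps) is a deliberately simple one-variable stand-in for a full clustering algorithm such as K-means.

```python
def cluster_by_largest_gaps(value_by_store, n_clusters):
    """Sort stores by one variable and split them at the largest gaps between neighbours."""
    ordered = sorted(value_by_store.items(), key=lambda item: item[1])
    # Positions after which to cut: the (n_clusters - 1) largest gaps.
    gap_positions = sorted(range(len(ordered) - 1),
                           key=lambda i: ordered[i + 1][1] - ordered[i][1],
                           reverse=True)[:n_clusters - 1]
    cut_after = set(gap_positions)

    assignment, cluster_id = {}, 0
    for i, (store, _value) in enumerate(ordered):
        assignment[store] = cluster_id
        if i in cut_after:
            cluster_id += 1
    return assignment


# Hypothetical weekly sales per store for two categories.
sales_by_category = {
    "hair care": {"Store 1": 100, "Store 2": 20, "Store 3": 110, "Store 4": 95},
    "snacks": {"Store 1": 10, "Store 2": 15, "Store 3": 90, "Store 4": 85},
}

clusters_by_category = {
    category: cluster_by_largest_gaps(sales, n_clusters=2)
    for category, sales in sales_by_category.items()
}
for category, assignment in clusters_by_category.items():
    print(category, assignment)
```

In this sample, Store 1 groups with Stores 3 and 4 for hair care but with Store 2 for snacks, so it would receive a different range for each category.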
CLUSTER ANALYSIS IMPLEMENTATION
If you want to implement cluster analysis in your retail business, there are a few actions you can take. These are presented in phases, and each phase contains steps that can help you with your cluster analysis efforts.
PHASE 1: PREPARE THE DATA AND DEVELOP YOUR PLAN
- What data sources do you have access to and would like to use?
- What are the benefits you want to achieve by implementing this process into the business’ category management plan?
- What capabilities does the business have or lack to execute the clustering process?
PHASE 2: ANALYSE THE DATA / DETERMINE THE BEST CLUSTERING METHOD
- Select the clustering algorithm to be used
- Determine the clustering method by deciding which variables would be most effective to use for clustering (e.g. to maximise revenue, cluster based on store/category sales).
PHASE 3: EXECUTE
- Run the cluster analysis and evaluate the results
- Group stores into clusters for each product category
- Update store layouts, create new planograms and update strategies
- Clustering should be a dynamic process where clusters are re-evaluated periodically (e.g. 3/6 months).
RETAIL CLUSTERING MISTAKES TO AVOID
USING STORE-BASED INSTEAD OF CATEGORY-BASED CLUSTERING
Store-based clustering fails to consider the differing category performance and customer purchasing behaviour across the store’s categories.
Category-based clustering can help you to make strategic decisions based on consumer behaviour. You can use different strategies within the same store to ensure the performance of each category is optimised.
IGNORING CATEGORY PERFORMANCE
Category performance is an important indicator that you can use to cluster stores for the same product category. You can analyse category performance to identify patterns and trends across various store branches.
Once you have considered all the store-level factors, you must further analyse how each category performs to understand the shopping behaviour of your target market.
IGNORING STRATEGIC ALLIANCES AND SALES CHANNEL PARTNERS
Industry role players such as manufacturers and suppliers play an important role in the clustering process. They provide market-related category expertise and cross-retailer insights. Both will help you develop targeted assortments and merchandising strategies.
You usually base clusters on the most current internal and external information from across the market.
FAILING TO ANALYSE DATA TO CONDUCT A CLUSTERING EXERCISE
Many retailers have adopted a subjective approach to clustering where stores they think have similar customers are grouped for a particular product category.
However, cluster analysis conducted using factual sales and POS data will be the most accurate and effective for boosting operational efficiency and customer loyalty.
FAILING TO PRIORITISE CATEGORIES TO CLUSTER
Cluster analysis and implementation can be a time-consuming process. You can use the Pareto principle to help you understand where to begin.
This principle suggests that roughly 20% of your categories contribute around 80% of your income. Focusing on these categories first will have a noticeable impact on sales.
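A quick way to find those priority categories is to rank categories by sales and keep adding them until you reach the target share of income. The sketch below uses hypothetical category names and sales figures, and the 80% threshold is a rule of thumb rather than a fixed requirement.

```python
def priority_categories(sales_by_category, target_share=0.8):
    """Return the top-selling categories that together reach the target share of sales."""
    total = sum(sales_by_category.values())
    ranked = sorted(sales_by_category.items(), key=lambda item: item[1], reverse=True)

    selected, running_total = [], 0
    for category, sales in ranked:
        if running_total >= target_share * total:
            break
        selected.append(category)
        running_total += sales
    return selected


# Hypothetical annual sales per category (in thousands).
sales = {
    "Hair care": 450, "Snacks": 370, "Beverages": 40, "Stationery": 35,
    "Pet food": 30, "Baby care": 25, "Oral care": 20, "Toys": 15,
    "Hardware": 10, "Gift wrap": 5,
}

focus = priority_categories(sales)
print(focus)
```

In this sample, two of the ten categories account for 82% of sales, so they would be the first candidates for a clustering exercise.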
ASSUMING CLUSTERING IS A ONCE-OFF EXERCISE
Clustering is a dynamic exercise. Category performance, industry conditions, market trends and consumers are constantly changing.
Therefore, it’s best practice to redo a clustering exercise every 3 to 6 months to ensure that the strategies used are still applicable and will produce optimised category performance.
CLUSTER ANALYSIS INTERPRETATION
CLUSTERING FOR CRM
Focus on delivering a highly personalised shopping experience for customers.
Customer relationship management (CRM) uses information technology (IT) to acquire, maintain and grow customer segments of the target market. CRM also assists you in building relationships and boosting customer loyalty by implementing customer-centric strategies built on data analytics. You can collect information on each of your customers to analyse their purchase history and buying behaviour.
The practice of CRM focuses on customer identification, attraction, development and retention. You can identify customers using customer segmentation and data mining techniques such as clustering.
Customers are attracted and developed using predictive analysis that anticipates future consumer behaviour. You can retain customers because CRM is a customer-centric approach to category management where a business develops a deep understanding of its customers to target them effectively.
CLUSTERING FOR CONSUMER SEGMENTATION
You can use consumer segmentation to divide a market into smaller groups according to consumers’ needs, defining characteristics and buying behaviour. You can also use this practice to collect information about your target market so that you can offer the right products at the right time, place and price.
You can segment your consumers according to various characteristics such as demographics, geographic location, psychographics or behavioural variables.
Here, clustering algorithms are worth using for efficient and effective consumer segmentation to produce groups that exhibit similar consumer behaviour. You can, therefore, assume that consumers who fall within the same cluster will respond similarly to your strategies and tactics.
CLUSTERING FOR ASSORTMENT PLANNING
In a time when convenience and personalisation are becoming more important, the survival and profitability of a retail business depend on tools such as assortment planning.
Consumers are demanding more targeted offerings, which calls for data-driven assortment plans. Product ranges are also constantly changing. This highlights the need to move away from single-assortment and store-specific assortment plans, because both are capital, time and human-resource intensive.
A clustered approach to assortment planning is a more scalable, predictive and resource-efficient method to manage this function.
CLUSTERING FOR INVENTORY MANAGEMENT
Once you have optimised your assortment planning function, you can use cluster analysis to improve your inventory management.
After sending the product range to your stores, you can also use cluster analysis to predict and manage the stock turn of each product. It requires shelf space dimensions, product dimensions, weekly movement of each product and overall days of supply.
Stores within the same cluster are likely to require the same inventory management strategies due to the similar target market and their consumer behaviour for the particular category. Therefore, you can also predict and monitor stock movement to avoid stock-outs or overstocks, resulting in profit optimisation.
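The inputs listed above can be combined into a simple days of supply estimate. The sketch below is an assumption-laden illustration: it treats days of supply as shelf capacity divided by average daily movement, ignores product height and stacking, and uses hypothetical dimensions and sales figures.

```python
def shelf_capacity(shelf_width_cm, shelf_depth_cm, product_width_cm, product_depth_cm):
    """Units that fit on one shelf: facings across multiplied by units deep."""
    facings = int(shelf_width_cm // product_width_cm)
    units_deep = int(shelf_depth_cm // product_depth_cm)
    return facings * units_deep


def days_of_supply(capacity_units, weekly_movement_units):
    """Days a full shelf lasts at the average daily rate of sale."""
    if weekly_movement_units == 0:
        return float("inf")  # nothing sells, so the stock never runs out
    daily_movement = weekly_movement_units / 7
    return capacity_units / daily_movement


# Hypothetical product on a 120 cm wide, 40 cm deep shelf.
capacity = shelf_capacity(120, 40, product_width_cm=6, product_depth_cm=8)
supply = days_of_supply(capacity, weekly_movement_units=70)
print(capacity, supply)
```

Comparing this figure across the stores in a cluster highlights where a product is likely to run out or sit overstocked.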
CLUSTERING FOR PREDICTIVE ANALYSIS
Cluster analysis uses the principle of association to identify existing relationships and sequences between data points.
That means that by conducting a cluster analysis, you can draw inferences and make predictions about your business using historical sales data.
With this information, you can describe patterns in consumer behaviour over time. Therefore, you can better understand and anticipate the consumer behaviour of your target market and obtain a competitive advantage.
THE CASE FOR USING CLUSTER ANALYSIS
BENEFITS CUSTOMER RELATIONSHIP MANAGEMENT
Within the context of CRM, retailers who implement clustering in their business will enjoy decreased customer acquisition costs and improvements in customer understanding and service, resulting in increased customer satisfaction, retention and loyalty.
The profitability of lucrative market segments increases as your business identifies and targets them.
BENEFITS CONSUMER SEGMENTATION
With the increased understanding of the target market’s needs and wants through cluster analysis, your buyers can tailor the product assortment to the consumer and achieve a competitive advantage.
You can, therefore, offer the right products at the right place and price, using the right promotion techniques, which benefits both the retailer and the shopper.
BENEFITS THE ASSORTMENT PLANNING FUNCTION
Retailers and suppliers who implement cluster-based consumer segmentation will experience optimised profit and customer satisfaction due to the improved understanding of their consumer behaviour.
You can experience financial benefits, such as higher ROI on marketing schemes, increased customer retention, increased shopper in-store spend and overall basket share.
BENEFITS THE MICRO AND MACRO SPACE PLANNING FUNCTION
Using internal and external market data that you analyse and group using cluster analysis, your buyers can assess their product range in more detail.
You can remove poorly performing products from the range to provide more shelf space for profitable items. Overall, you can re-organise shelf and floor space for optimised space allocation and category profitability.
BENEFITS THE MARKETING FUNCTION
Using cluster analysis within the marketing function allows you to reduce your advertising costs since you can target your adverts and implement specific strategies.
You can also use clustering to generate customer profiles which you can analyse and target effectively for a profitable response.
IMPROVES THE ACCURACY OF DEMAND FORECAST DATA
Data aggregated across an entire cluster is typically more accurate than data for a single store. Therefore, with cluster analysis, you can identify existing relationships and correspondences among variables over a period, thus improving the accuracy of demand forecast data.
With cluster mapping, cluster analysis can present complex data groups and patterns in an easily understandable manner.
LET DOTACTIV HELP YOU BRING YOUR CATEGORY PLANS TO LIFE
Gain in-depth insights into the behaviour of your target market and improve your strategic decision-making.
With DotActiv's cluster optimization services, you'll get access to category-based cluster optimization opportunities, which will enable you to understand and adapt to shopper behaviour at scale.