Clustering is one of the simplest yet most powerful AI features in Power BI. It helps you discover the natural groupings in your dataset before you dig deeper into the details. Power BI clusters your data by finding meaningful similarities between data points. It can detect clusters automatically, and it also lets you specify your own clustering requirements.
Clustering can be performed in Power BI with:
- Scatter chart visual.
- Table visual.
- Custom visuals.
This blog post covers clustering with the scatter chart.
For this blog post we will be using a simple sales dataset. The data model consists of a fact table (Sales_Table) and three dimension tables (Product_Table, Country_Table, Date). An additional virtual table contains the measures that will be used for clustering.
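The measures themselves are not listed in this post, so the following is only a minimal Python sketch of what they might compute. It assumes that each fact row carries a product, a revenue amount and a COGS amount, and that Profit Margin is defined as (Revenue - COGS) / Revenue. The row layout, the sample values and the function name `measures_by_product` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, revenue, cogs). The layout and values are
# illustrative and are not taken from the Sales_Table used in this post.
sales_rows = [
    ("Alpha", 1200.0, 700.0),
    ("Alpha", 800.0, 500.0),
    ("Beta", 300.0, 90.0),
    ("Gamma", 5000.0, 4500.0),
]

def measures_by_product(rows):
    """Sum Revenue and COGS per product and derive Profit Margin."""
    totals = defaultdict(lambda: {"revenue": 0.0, "cogs": 0.0})
    for product, revenue, cogs in rows:
        totals[product]["revenue"] += revenue
        totals[product]["cogs"] += cogs

    result = {}
    for product, t in totals.items():
        # Assumed definition: margin = (revenue - cogs) / revenue.
        # Products with zero revenue get a margin of 0.0 to avoid dividing by zero.
        margin = (t["revenue"] - t["cogs"]) / t["revenue"] if t["revenue"] else 0.0
        result[product] = {
            "revenue": t["revenue"],
            "cogs": t["cogs"],
            "profit_margin": margin,
        }
    return result

measures = measures_by_product(sales_rows)
for product, m in measures.items():
    print(product, m["revenue"], m["cogs"], round(m["profit_margin"], 3))
```

Each product ends up with one value per measure, which is the shape the scatter chart needs: one point per product, positioned by Revenue and Profit Margin and sized by COGS.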
Before we move any further, let's define the business questions under consideration:
- Which products have a low revenue but a high profit margin?
- Which products have a low revenue and a low profit margin?
- Which products have a high revenue and a high profit margin?
- Which products have a high revenue but a low profit margin?
- Which products are performing the best?
- Which products are performing the worst?
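The first four questions amount to placing each product in one of four revenue and margin quadrants. As a rough illustration only, the sketch below does this by hand with fixed cutoffs. The median split, the sample values and the function name `quadrant_labels` are assumptions made for this example; clustering differs in that it derives the groups from the data instead of from cutoffs chosen in advance.

```python
from statistics import median

# Hypothetical (product, revenue, profit_margin) values, for illustration only.
products = [
    ("Alpha", 2000.0, 0.40),
    ("Beta", 300.0, 0.70),
    ("Gamma", 5000.0, 0.10),
    ("Delta", 250.0, 0.05),
]

def quadrant_labels(items):
    """Label each product low/high on revenue and margin, split at the median."""
    revenue_cut = median(revenue for _, revenue, _ in items)
    margin_cut = median(margin for _, _, margin in items)

    labels = {}
    for name, revenue, margin in items:
        revenue_part = "high revenue" if revenue > revenue_cut else "low revenue"
        margin_part = "high margin" if margin > margin_cut else "low margin"
        labels[name] = f"{revenue_part}, {margin_part}"
    return labels

labels = quadrant_labels(products)
for name, label in labels.items():
    print(name, "->", label)
```

A fixed split like this always produces quadrants, even when the data does not naturally fall into them, which is one reason to let a clustering algorithm find the groups instead.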
Clustering the data on the given metrics can answer all of the questions listed above. Let’s see how this can be done.
Begin by creating a scatter chart.
1. Click the Scatter chart visual in the Visualizations pane.
Next, drag the required fields from the Fields pane into the visual.
2. Drag and drop Product from Product_Table to Details.
3. Drag and drop Revenue from the All Measures table to the X Axis.
4. Drag and drop Profit Margin from the All Measures table to the Y Axis.
5. Drag and drop COGS from the All Measures table to the Size field.
Finally, let's find the clusters.
6. Click the three dots at the top right corner of the scatter chart.
7. Click Automatically find clusters.
8. Enter “4” in the blank field under Number of clusters.
9. Click OK.
A new field, Product (clusters), is created and added to the legend of the scatter chart.
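This post does not describe the algorithm Power BI runs behind Automatically find clusters, so the sketch below uses plain k-means as a stand-in to show the general idea of grouping products by Revenue and Profit Margin. It is not Power BI's implementation. The sample data, the min-max scaling step, the farthest-point starting centers and the names `min_max_scale` and `k_means` are all assumptions made for this example.

```python
import math

def min_max_scale(points):
    """Rescale each coordinate to [0, 1] so revenue does not dominate margin."""
    dims = list(zip(*points))
    lows = [min(d) for d in dims]
    spans = [(max(d) - min(d)) or 1.0 for d in dims]
    return [tuple((v - lo) / s for v, lo, s in zip(p, lows, spans)) for p in points]

def k_means(points, k, iterations=50):
    """Plain k-means (Lloyd's algorithm); returns one cluster index per point."""
    # Deterministic start: take the first point, then repeatedly add the point
    # that is farthest from the centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))

    assignment = None
    for _ in range(iterations):
        # Assign every point to its nearest center.
        new_assignment = [
            min(range(k), key=lambda i: math.dist(p, centers[i])) for p in points
        ]
        if new_assignment == assignment:
            break  # No point changed cluster, so the result is stable.
        assignment = new_assignment
        # Move each center to the mean of the points assigned to it.
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment

# Hypothetical (revenue, profit_margin) pairs, one per product.
raw_points = [
    (9000.0, 0.45), (9500.0, 0.50),  # high revenue, high margin
    (8800.0, 0.08), (9200.0, 0.05),  # high revenue, low margin
    (900.0, 0.48), (1100.0, 0.52),   # low revenue, high margin
    (800.0, 0.06), (1000.0, 0.04),   # low revenue, low margin
]

clusters = k_means(min_max_scale(raw_points), k=4)
print(clusters)
```

The cluster indices carry no meaning of their own. Only the grouping matters, which is also why the cluster names and colors in the visual should be read from the chart and not assumed.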
Let’s analyze each cluster separately.
10. Add a bar chart with Product (clusters) on the axis and Count of Product in the values.
11. Click the bar for Cluster 1.
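As a rough analogy, the bar chart counts the products in each cluster, and clicking a bar narrows the view to that cluster's products. The Python sketch below mimics both steps. The product names and cluster labels are made up for this example and do not come from the dataset in this post.

```python
from collections import Counter

# Hypothetical cluster assignments, similar to what the Product (clusters)
# field might hold for each product.
product_clusters = {
    "Alpha": "Cluster1",
    "Beta": "Cluster2",
    "Gamma": "Cluster1",
    "Delta": "Cluster3",
    "Epsilon": "Cluster4",
    "Zeta": "Cluster1",
}

# Equivalent of the bar chart: count of products per cluster.
counts = Counter(product_clusters.values())
for cluster, count in sorted(counts.items()):
    print(cluster, count)

# Equivalent of clicking the Cluster 1 bar: keep only that cluster's products.
selected = sorted(p for p, c in product_clusters.items() if c == "Cluster1")
print(selected)
```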
By clustering with the scatter chart, you have created four clusters:
- One cluster is colored green and contains products that have relatively high revenue and a high profit margin.
- One cluster is colored black and contains products that have lower revenue and a low profit margin.
- One cluster is colored red and contains products that have high revenue but a low profit margin.
- One cluster is colored yellow and contains products that have lower revenue but a high profit margin.
The colors assigned to the clusters can change each time the algorithm is run.
To edit clusters:
12. Click the arrow next to Product (clusters) in the legend value.
13. Click Edit Clusters.
A pop-up menu appears.
14. Type “3” in the number of clusters field.
You have now categorized products into three clusters: “Best Products”, “OK Products”, and “Poorly Performing Products”.
Clustering is a powerful technique for finding hidden relationships between data points. It acts like a magic wand, identifying meaningful relationships that are hard to spot by eye. Clustering can be performed in Power BI by using scatter charts and the table visual. This blog post has walked through a step-by-step guide to creating clusters with a scatter chart.