Power BI

Ignore Case Option in Fuzzy Matching

ignore-case-option-in-fuzzy-matching

Microsoft Power BI has a large user base in Australia. Power BI trainings in Sydney and other parts of the Australia focus on using real data for trainings. This makes it easier for Power BI beginners to understand power bi and relate its features to their work. One of the business solution taught in Power BI trainings in Sydney, Melbourne, Brisbane and other cities is the fuzzy Matching technique.

Fuzzy matching has various applications such as spelling correction, data similarity checking and data cleaning.

Fuzzy Matching is the technique of finding strings that match a pattern approximately (rather than exactly) and allows to create links between data sets that may be less than 100% perfect. It provides you with the option to tune matching algorithm.  

In this blog post, we will see how This feature in Power BI allows you to merge queries. For this we have two tables, one containing the clean data and other contains the unclean data. This blog post close looks into the tweaking the ignore case option for fuzzy matching. Fuzzy merge comes in with other options which can be tuned.

Let us look at the options Power BI provides you to tune your matching algorithm.

For this blog post we have two tables, one containing the clean data and the other containing the unclean data.

This table contains a single column with two values: Sydney and Melbourne. These are the target values.

Now let’s see the un-clean data.   

This table contains a City column containing many incorrect variations for the two cities. Each variation is assigned an ID to track its matching result in the ID column.

This blog post assumes that a simple fuzzy merge has been performed.

Link to Fuzzy Matching Basics:

https://powerbitraining.com.au/using-fuzzy-matching-to-merge-queries-and-find-data-similarity-in-power-bi/

Lets now tweak the fuzzy matching options..

1.Click on Fuzzy merge query in the Queries Pane.

2.Now click on Gear icon next to Source in the Applied Steps Pane.

3.Uncheck the box to the left of Matching by combining text parts. Click OK.

Make sure that the Ignore case is checked. 

4.Click on the double headed arrow next to Cities_UncleanData. Click OK.

5.In the pop-up, Click on

Notice that only 11 rows have been returned.

Exploring Fuzzy Merge options in Power BI

Power BI allows you to work with different fuzzy merge options. These options allow you to set the criteria for the matching that the algorithm performs.  

Let us look at the options Power BI provides you to tune your matching algorithm.

Similarity Threshold

This option allows you to choose the Range of unclean data that can be considered as true match.

Ignore Case

This option is used for String Matching where case sensitivity is not required. Using this option Sydney and SYDNEY will be considered as similar.

Match by combining Text Parts

Using this option, both the parts of the word are considered while algorithm looks for a match. Using this option, Sydney and Si dney will be considered similar.

Maximum number of Matches

This option allows you to limit the Number of Values that your algorithm returns. The value entered in this option returns results with the first result denoting the highest match and so on. If no value is entered the algorithm returns all rows in the same order.

Transformation Table

Transformation tables act as Mapping Tables for the algorithm when the dataset under consideration is highly dissimilar.

 

Conclusion:

Fuzzy matching is a technique in Microsoft power bi that is used to merge data sets that are slightly similar. The level of similarity between the two data sets is adjusted according to the input parameters by the user. These parameters include matching threshold, Ignore Case, Match by combining Text Parts, Maximum number of Matches and Transformation Table.

Are you a data analyst and want to learn more about Power BI?

Why not sign up for Power BI training in Australia.  We provide our services in the following regions:

 

Back to list

Leave a Reply

Your email address will not be published. Required fields are marked *