Power BI

The Data Preparation Stage of Data Modeling

We often spend a lot of time in our Power BI advanced training in Sydney focusing on the creation of a perfect data model before starting any reporting task. Data model consists of the right number of tables with the right relationship between them. A typical data model consists of a fact table and a dimension table.

If you are new to the concept of data modeling, here’s a list of blogs to support your understanding:

This blog is also from the archive focused on various data modeling problems and solutions. In the last blog post, the concept of data model was introduced and the first two stages in the creation of a perfect data model were discussed. Before we jump to the third step, here’s a quick recap of the first two steps in creating a perfect data model.

Getting Started

For this blog, we will disable the auto detection of relationships by Power BI. This is done to create an environment where you can dive deeply into model creation. Disabling relationship detection is done purely to enhance concept building and is not preferred in ideal conditions.

Here is a step-by-step guide on Getting the right number of tables for a good data model 

Importing Data

The Importing Data phase of a data modeling involved procuring the right tables that can answer the business question at hand.

To Gathered Data for Data Model contains:

  • Product Sheet: It consists of three columns addressing the product details, i.e., Product Code, Product and Product Type. Each Product has a distinct Product Code and falls under only one Product Type.
  • Product Line Prod Type Sheet: Each Product Type is a subcategory of Product Line. This sheet records the subcategory to category mapping.
  • Product List Sheet: Product list contains the details of each Product and the list price associated per unit.
  • Sales Person Per Product Type Sheet.: This sheet contains details of Sales Person associated with the Sales. There exists a many to many relationships between the sales person and the product.
  • Sales Sheet: Sales sheet contains transactional Data for the Sales.

Now you have five queries loaded in your Query Editor. At this stage your data model should contain five un connected tables as shown in the image below.

Data Model – Common Issues And Solution | Power BI Training In Sydney | Power BI Training In Melbourne

Data Preparation

Data Preparation involves ensuring that the data is capable of delivering value.

Data Preparation involves:

  • Checking the data types and data categories.
  • Making sure that all the required columns are present in data set.

Power BI guesses the data type of the column based on top 1000 Rows.

Continue using the Previous file for this exercise…

For Data Preparation:

1. Click on Edit Queries to open the Query Editor.

2. Click on Product Query. Click on Transform>Use First Row as Header.

3. Click on Product Line Prod Type Click on Transform>Use First Row as Header.

4. Click on Sales Person Per Product Type Click on Transform>Use First Row as Header.

This Query contains a many to many relationships between the Sales Person and Product. This many to many relationship is not suitable for dimension table.

5Right click on Sales Person Per Product Type Click on Reference. Rename this new query as Sales Person.

6. Right Click on Sales Person Per Product Type Click on the tick at the left of enable load to Disable Load.

Enabling load option Power BI training in Sydney

After the load is disabled, the name of query is displayed in italic font.

7.Click on Sales Person Remove the Product Type Column.

8. Now click on Email in the Sales person query, remove the Duplicates.

9. Now open the Report file. Click on Sales Table in the Fields Pan.

10. Create a new column Sales Amount by multiplying the Quantity Sold with Sale Price.

Data Enrichment in Power BI training in Sydney

Conclusion:

A data model consists of the right number of tables with the right relationship between them. Although the creating of data model may seem as easy as importing everything into Power BI, but in reality it is not. Many real data issues start to occur right after this stage. Therefore, we have dedicated an archive to address the common issues and the solution for those issues in Power BI. This blog is the second blog from the “Data Modeling – Common Issues and solutions” archive and focuses on Data Preparation.

Back to list

Leave a Reply

Your email address will not be published. Required fields are marked *