Duplicate vs. Reference Queries: What’s the Difference?

You are currently viewing Duplicate vs. Reference Queries: What’s the Difference?

Ever wondered if you could just copy all the data transformation steps from one table to another? If you’re someone who has been struggling with redoing the annoying data cleaning steps each time you want to create a report and are looking for a way fix this. Then this blog post is definitely for you!

Power BI provides multiple ways to create a copy of a query using Power Query Editor. The two most well-known approaches are Duplicate and Reference. However, there are many differences between Duplicate and Reference Queries. The explanation is straightforward, yet it is important to comprehend it correctly. For this reason, our every Power BI training in Australia covers all concepts related to Power BI Queries based on the level of the Power BI training. In this blog post, we will go over some main differences between Duplicate and Reference Queries and suggest the use case for each of these in Power BI.

Before we start, lets quickly understand the basics of Power BI Query and the essential steps to open Power BI Query Editor.

Power BI Query Editor

Transform Data in Power Query Editor:

A query in Power BI refers to the series of steps performed on the raw data to convert it into a format that is best suited for reporting and analysis. This process is known as data transformation. Data transformation is performed in the Query Editor.

To open the Query Editor:

  1. Click on the Home tab in the Ribbon Menu.
  2. In the Queries section, click on Transform Data.
  3. Now click on Transform Data to enter the Query Editor.

A new window opens showing the interface for Microsoft Power BI Query Editor.interface-micosoft power BI Query Editor

How To Make A Copy Of Query In Power BI Editor

It is often tedious to perform the same steps on different queries repeatedly. This is especially true if queries contain similar but independent data. There are two solutions to this:

  • Duplicate a query.
  • Reference a query.

In this blog post, we will walk through each of these methods, analyze a use case for each of these and compare the two methods towards the end.

Duplicating a Query

Power BI allows you to Duplicate a Query with its complete steps. This saves time and increases work efficiency.

Let’s see this in action.

 

To completely follow the below steps, download the Duplicate_vs_Reference.pbix from the following link and open it in Power Query Editor: https://powerbitraining.com.au/downloads-duplicate-vs-reference-queries/

A series of steps have been performed on the Sales table Query. The aim of this blog post is to demonstrate two methods to make a copy of these steps. The steps are listed under the Applied Steps pane in the Query Settings:

To duplicate a query:

  1. Right-click on Sales in the Queries Pane.
  2. Click on Duplicate.
  3. Rename Sales (2) to Duplicated Sales.

Under the query settings pane, observe that all the query steps are duplicated.

Duplicating a Query will be useful if you want to:

  • Replicate exact steps on similar data set.
  • Add more steps to an existing query without having to change the primary query.
  • Create a copy of the original query.

In order to better understand when a duplicate query is useful, let us see an example:

Changing the souce without altering the orignal query

In contrast to the Reference query, Duplicated query allows you to alter the query source without changing the original query.

 

To change the source of the Duplicate query:

  1. Right-click on Duplicated Sales in the Queries Pane.
  2. Click on Source option under the applied steps in the Query Settings Pane.
  3. Select the new path.
  4. Click Ok.

All the steps in the query will now be performed on the new data set.

Referencing a Query

By Referencing a Query, you create a link between the new query and an existing query. This allows the data to flow in sequential models. Any change in the original query will be reflected in the referenced query.

Let’s see this in action.

 

To reference a query:

  1. Right-click on Sales in the Query Pane.
  2. Click on Reference.
  3. Rename it as Referenced Sales.

Under the applied steps, observe query steps only refer to the source which cannot be changed.

Referenceing a Query will be useful when you want to:

  • Create a copy of the query and add more steps in the query without doing the custom steps again.
  • Create a copy of an initial query that is directly referenced but not isolated from it.
  • Don’t want to produce a new object in memory, because a reference query does not require any additional processing.

Let’s consider a use case of Reference query where changing data type in the original query is reflected in the referenced query.

Creating a change that propagates:

  1. Click on the Sales in the Queries Pane.
  2. Right Click on the Quantity Sold column header.
  3. Click on Remove Errors.
  4. Now click on Referenced Sales in the Queries Pane.

Note that the data has been automatically updated and all the errors from Quantity Sold column have been removed but there no step indicated in the Applied Steps Pane.

How Copy-Paste is neither a Reference nor a Duplicate.

It is commonly misunderstood that Copy and Paste are the same as Duplicate. Although it appears to be a duplicate query, it is neither a Reference nor a Duplicate action!

  • In practice, when you copy-paste a table from a simple query i.e., the one that does not rely on other queries, it will result in a Duplicate query.
  • But if you copy-paste a table that that is sourced from other queries, like reference table, every dependent query will be copied as well.

We will show you what happens if you choose the Copy-Paste option instead of the Duplicate or Reference options in the example below.

Case 1: When Copy-Pasting produces a duplicate query

To Copy-Paste from a Duplicate Query:

  1. Click on Duplicate Sales.
  2. Press Ctrl+c on the keyboard.
  3. Press Ctrl+v on the keyboard.

Or do it as shown in the image below.

Notice that the copy-paste produces results similar to Duplicate Query with all the steps copied.

Case 2: When Copy-Pasting does not produce a duplicate query

To Copy-Paste from a Reference Query:

  1. Click on Referenced Sales.
  2. Press Ctrl+c on the keyboard.
  3. Press Ctrl+v on the keyboard.

Or do it as shown in the image below.

Notice that the copy-paste for the reference query produces two tables i.e. one with the applied steps and one with the source information.

Difference between Duplicating and Referencing a Query

Duplicating and referencing is a form of querying in the Query Editor. Both of these create a new query as a result.  Duplicating a Query creates a copy of the original query with all the steps copied in the same order. Whereas, Referencing a Query creates a query with a single step referencing its parent source query. These differences can be visualized using the Query Dependency feature.

Duplicate

Reference

General Behavior

A duplicate query behaves as an autonomous query that is independent of its parent query.

Reference query is directly linked to its parent query. 

Change Propagation

Any changes in the parent query configuration does not impact the duplicate.

Any changes in the parent query directly impacts the referenced query.

Processing Power

Each step needs to be performed independently. This requires more processing power.

Referenced queries act as pointers in the memory. Therefore, less processing power is required.

Circular References

Duplicate queries act independently so Circular Referencing is easy to avoid.

Reference queries cannot be used extensively as they can result in looping or circular references.

Memory Requirements

Duplicating a query creates an independent object in memory.

Referenced query is dependent on the source used in the parent query.

Viewing Query Dependency

Viewing Query Dependency helps you to understand what is going behind your clicks in Power BI. Let’s use the Query dependency feature to understand the difference between Duplicate and Reference.

To open the query dependency feature:

  1. Click on the View tab in the Ribbon
  2. Click on Query Dependencies in the Dependencies section.

A pop-up menu appears.

Duplicate Query is at the same hierarchal level as the original query. Whereas Reference Query is one step lower.

Conclusion:

A query is a series of steps applied to the raw data to make it suitable for data analysis. These steps can be duplicated or referenced based on the requirement. Duplicating a query means creating an isolated copy of the steps performed. Whereas, referencing a query refers to creating a reference to another query. This blog post summarizes the concept of duplicating and referencing queries and compares both methods of query transformation.

Download Power BI desktop files for this blog from the link given below:

https://powerbitraining.com.au/downloads-duplicate-vs-reference-queries/

Are you a data analyst and want to learn more about Power BI? Why not sign up for Power BI training in Australia. We provide the following Power BI trainings:

Leave a Reply