Solving Data Explosion resulting from CROSSJOIN() in DAX

You are currently viewing Solving Data Explosion resulting from CROSSJOIN() in DAX

DAX is Data Analysis Expression language developed by Microsoft to interact with data in various ways. It is a formula language consisting of functions, operations and expressions that return some value. Although DAX is used for optimization, but sometimes inefficient coding techniques result in expensive operations which are not memory efficient. As an example,

CROSSJOIN() allows you to build Cartesian product between two tables and brings all possible combinations from both tables.CROSSJOIN()

To see how CROSSJOIN() works:

  1. Click on the Modeling tab in the Ribbon.
  2. Click on New Table.
  3. Type the following expression in the formula bar and press Enter.expression

In this Expression:

  • A new table containing the Cartesian product of Customer and Product table is calculated.
  1. Click on the Data tab in the left Ribbon.
  2. Click on Testing Cross Join (CT) in the FieldsTesting Cross Join
  3. Scroll down to the end of table.end-of-table

CROSSJOIN() with simple tables returns 7.4M rows and becomes an expensive operation to perform. We can use VALUES() function with Cross join to get all possible combinations of customer city and product subcategory.

Solving the problem of Data Explosion:

CROSSJOIN() and VALUES() can be used together to solve the problem of data explosion mentioned above. The first table function that you will be using to calculate tables is VALUES().Data Explosion

The result of VALUES() depends on the input argument:

  • When the input argument is a table, all the rows of the table are returned. Duplicate rows are preserved.
  • When the input argument is a column, a single-column table is returned containing distinct values. Duplicates are removed in this case and only unique values are returned.

To learn more about VALUES() in DAX: Using VALUES() in Calculated Tables – Power BI Training Australia

To use VALUES() with CROSSJOIN() to get all possible combination of customer city and product subcategory:

  1. Click on the Testing Cross Join (CT) tab in the Fields.

Table tools open.

  1. Type the following expression in the formula bar and press Enter.Testing Cross Join

In this Expression:

  • A new table containing the Cartesian product of distinct Customer cities and distinct Product Subcategories is calculated.
  1. Click on the Data tab in the left Ribbon.
  2. Click on Testing Cross Join (CT) in the Fields pan.
  3. Click on the drop down next to City.
  4. Click on Sort Descending.Sort Descending
  5. Scroll down to the end of table.Result

Notice the reducion in the number of rows.

Conclusion:

CROSSJOIN() as the name implies produces a cartesian product of all the rows of the tables used in the argument. This results in a data explosion as the number of rows in the resultant table are a cartesian product of all the rows in all the tables used. This operationally intensive calculation can be simplified using a simple DAX function VALUES(). In this blog post, the problem resulting from using CROSSJOIN() alone is addressed with a step by step guide.

Are you looking for an instructor-led Power BI or Power Apps training in Australia? We provide classroom as well as online live trainings for the following courses:

Leave a Reply