How to remove duplicates in MS SQL, Power Query & DAX

Removing duplicates is a crucial step when cleaning any new data set. Every dimension table needs unique key values before you can create one-to-many relationships in Power BI. Beyond that, it’s often useful to get only the unique values at any stage of the data visualization process.

In this post, let’s discuss how to remove duplicates in SQL, Power Query, and DAX.

There is no single correct stage of the data analysis process at which to remove duplicates. Choose among these methods according to the situation.

Table

We use a table of famous movie characters and their ages.

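How to remove duplicates in SQL Server

In SQL Server, removing duplicates from a query result is straightforward. The DISTINCT keyword in a SELECT statement returns only the distinct values of the selected column, or the distinct combinations of values when several columns are selected.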
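As a minimal sketch of the DISTINCT query described above, the following Python script runs it against an in-memory SQLite database. SQLite only stands in for SQL Server here so the example can be run anywhere; the SELECT DISTINCT syntax shown is the same in both. The Name and Age column names and the sample rows are assumptions made for illustration, not the actual contents of the Movies_char table.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies_char (Name TEXT, Age INTEGER)")

# Illustrative rows only; the Name and Age columns are hypothetical.
rows = [
    ("Character A", 30),
    ("Character B", 45),
    ("Character A", 30),  # exact duplicate of the first row
    ("Character C", 45),
]
conn.executemany("INSERT INTO Movies_char (Name, Age) VALUES (?, ?)", rows)

# DISTINCT on one column: each name is returned once.
distinct_names = [
    r[0]
    for r in conn.execute("SELECT DISTINCT Name FROM Movies_char ORDER BY Name")
]

# DISTINCT on several columns: each (Name, Age) combination is returned once.
distinct_rows = conn.execute(
    "SELECT DISTINCT Name, Age FROM Movies_char ORDER BY Name"
).fetchall()

print(distinct_names)
print(distinct_rows)
conn.close()
```

Note that SELECT DISTINCT only filters the query result. The rows stored in the table are left unchanged.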

How to remove duplicates in Power Query

Let’s try the same thing in Power Query.

Power Query has a built-in command for removing duplicates, so it can be done without writing any code.

Let’s use the Movies_char table and select the column we need to remove duplicates from.

Go to Home >> Remove Rows >> Remove Duplicates.

How to remove duplicates in DAX

Let’s try it again in Power BI DAX.

Here is the imported Movies_char table with duplicates.

Go to Home >> New Table.

Add this formula, which returns only the distinct rows of the table.


New_Table = DISTINCT(Movies_char)

Wrap-up

Duplicates can be removed from a column in a few simple steps:

  • in SQL
  • in Power Query
  • in DAX