Remove Duplicate Rows Based on Selected Columns in Google Sheets

0
136
Remove Duplicate Rows Based on Selected Columns

I am sure that until now you were in the dark on this type of duplicate elimination in Google Sheets. Here you can learn how to remove duplicate rows based on selected columns in Google Sheets. Let me make this concept more clearer to you. In a multi column table, with the help of this tutorial, you can remove duplicates based on one column, two columns etc. You can use SORTN instead of your familiar UNIQUE function in this case. Yup! You have heard it right. I am talking about SORTN formula, not SORT.

multi column duplicates and SORTN instead of Unique

If you use a UNIQUE formula in this range to remove duplicates, it would obviously return all the rows as it’s.

=UNIQUE(A2:C6)

For this Unique formula, there is no duplicates in the above range and it’s true. Actually the highlighted rows are duplicate entries only if the “year published” are same. Here the years are different.

But we may want the result in a different way. I want a formula that treats the above two rows as duplicates that based on the author’s name and book title. I want to just ignore the third column that containing the year of publication.

You can remove duplicate Rows based on selected columns in Google Sheets. I’ll come to that custom formula and the explanation part at later part of this tutorial. It’s very easy to learn. Before that I would like to bring your attention to two of my earlier tutorials regarding or similar to this subject.

I have already detailed different types of duplicate removal techniques in Google Sheets. First let’s see what are they.


Query in Duplicate Elimination

With the help of this tutorial using Query, you can remove duplicates in selected columns. But it has one limitation. What’s that? See the below image.

selected column duplicate removal using Query in Google Doc

Here if you apply the Query formula as per the tutorial, the result would be as below with the Qty. column summed.

Apple 03/02/2018 1450
Mango 03/02/2018 500

In our first example, we don’t want to sum the last column as it’s the year of publication of the book. So this type of selected column duplicate removal is not ideal in all situations.


Distinct Rows

Here is another type of duplicate elimination. If you want to remove all the above blue coloured rows and return row no.3, which is distinct, you can follow this tutorial. In this tutorial I’ve used a combination of formulas including Query, Split and ArrayFormula.


Now let’s back to our tutorial.

How to Remove Duplicate Rows Based on Selected Columns in Google Sheets

See the example below to understand how to remove duplicates in selected columns in Google Sheets.

Eliminate Duplicates Based on First One Column in Google Sheets

one column based duplicates in google sheets

In this example I want to find and remove duplicates based on book title in column 1 regardless of year of publication in column 2. Here as you know UNIQUE formula won’t work. So I’ve used SORTN here.

Must Read: How to Use Google Sheets SORTN Function (detailed tutorial)

Formula:

=sortn(A2:B6,5,2,A2:A6,FALSE)

Formula Explanation:

Here A2:B6 is the data range and 5 is the number of rows to return. You can use the total number of rows in your data as it is because we are unaware about the number of duplicates in the range. The formula would only return the correct number of rows after removing the duplicates.

Then the number 2 is the tie mode to remove duplicates in SORTN. There are different tie modes in SORTN. But for our duplicate removal technique we should use the number 2.

Now the important part is the sort range A2:A6. You should sensibly use this range as the SORTN formula removes duplicates based on this range.

Now finally you can use TRUE or FALSE to decide the sort order. As you may know TRUE denotes ascending order and FALSE descending. Sorting is must as the function SORTN is basically to use for sorting.

Remove Duplicates Based on First Two Columns in Google Sheets

Remove Duplicates Based on First Two Columns in Google Sheets

If you could understand how to remove duplicates based on one column, this two column duplicate removal would be easy for you.

Formula:

=sortn(A2:C6,5,2,A2:A6&B2:B6,FALSE)

Formula Explanation:

Here A2:C6 is the data range. Here also 5 is the number of rows to return and 2 is the tie mode. The tie mode will be always same.

Then the important part is the sort column. We want to remove duplicates based on two columns and they are A2:A6 and B2:B6. So just join them using an ampersand symbol. That’s all.

Conclusion

Hope some of you may require such type of duplicate removal techniques in your daily job. If you find the above tutorial on how to remove duplicate rows based on selected Columns in Google Sheets useful, pleas drop in comments. Enjoy!

LEAVE A REPLY

Please enter your comment!
Please enter your name here