HomeGoogle DocsSpreadsheetCompare Two Tables and Remove Duplicates in Google Sheets

Compare Two Tables and Remove Duplicates in Google Sheets

Published on

It’s not easy to provide a tailor-made formula for comparing tables and removing duplicates in Google Sheets.

It’s mainly because the requirements may vary from person to person.

Here I will share a formula to help you compare two tables and remove duplicates in Google Sheets.

This tutorial can help you in the following way.

Suppose you have two tables in a Google Sheets file containing the invoice raised, and payment received details.

Assume the corresponding tab names are ACCOUNT RECEIVABLE and PAYMENT COLLECTED.

How to match the invoice numbers and remove the matching ones?

You can compare two tables for duplicates and remove them using a formula in Google Sheets.

Let’s prepare the sample data first.

Sample Data for Comparing Two Tables and Removing Duplicates

Table 1: ACCOUNTS RECEIVABLE

I have the following records in the accounts receivable tab.

ACCOUNT RECEIVABLE - Table 1

Table 2: PAYMENT COLLECTED

I have just one record in the payment collected tab.

PAYMENT COLLECTED - Table 2

You can see now how to compare these two tables for duplicates and remove them.

As you can see Invoice Number II000025 repeats in both tables.

We can handle it in two ways!

  1. Remove all the occurrences of the matching rows.
  2. Keep one record of the matching row.

How do we match the records?

It’s up to you whether to consider only the invoice number for matching (method A) or the total row (method B).

Formulas for Comparing Two Tables and Removing Duplicates

1. Removing All Occurrences of the Matching Row

Matching Total Row (method B):

The following formula will remove matching rows from both tables.

=unique(vstack('ACCOUNTS RECEIVABLE'!A2:D,'PAYMENT COLLECTED'!A2:D),,true)

The VSTACK appends both tables vertically. The UNIQUE returns the distinct rows.

Table 1 and Table 2 comparison for duplicate rows

Here we can also use the following QUERY formula.

=query(query(vstack('ACCOUNTS RECEIVABLE'!A1:D,'PAYMENT COLLECTED'!A2:D),"Select Col1,Col2,Col3,Col4, Count(Col1) where Col1<>'' group by Col1,Col2,Col3,Col4"),"Select Col1,Col2,Col3,Col4 where Col5=1")

How Does this Query Formula Compare Two Tables (Datasets) and Remove Duplicates?

Before using this formula, it’s good for you to understand its logic first.

The above is a nested Query formula. The inner Query formula is the main one. What does it do?

=query(vstack('ACCOUNTS RECEIVABLE'!A1:D,'PAYMENT COLLECTED'!A2:D),"Select Col1,Col2,Col3,Col4, Count(Col1) where Col1<>'' group by Col1,Col2,Col3,Col4")

It groups all columns in the table and returns the count of records as follows.

The Query count technique to remove duplicate rows

The Outer Query filters out any row that contains >1 in the last column.

Matching Invoice Number (method A):

Assume the invoice amount is different for II000025 in both tables.

The above formulas won’t remove the said invoices because they are distinct.

In this case, we can modify our earlier Query formula by grouping only the invoice column.

=query(query(vstack('ACCOUNTS RECEIVABLE'!A1:D,'PAYMENT COLLECTED'!A2:D),"Select Col1,max(Col2),max(Col3),max(Col4), Count(Col1) where Col1<>'' group by Col1"),"Select Col1,Col2,Col3,Col4 where Col5=1")

2. Keeping the First Occurrence

Here also, we can use two types of formulas: Consider only the invoice number for matching (method A) or the total row (method B).

Matching Total Row (method B):

=let(ftr,unique(vstack('ACCOUNTS RECEIVABLE'!A2:D,'PAYMENT COLLECTED'!A2:D)),filter(ftr,choosecols(ftr,1)<>""))

Here the Unique does the job. The purpose of LET is to filter out blank rows in the result.

Matching Invoice Number (method A):

=let(ftr,sortn(vstack('ACCOUNTS RECEIVABLE'!A2:D,'PAYMENT COLLECTED'!A2:D),9^9,2,1,1),filter(ftr,choosecols(ftr,1)<>""))

Here the SORTN takes the place of Unique.

Related: SORTN Tie Modes.

The highlighted number in the formula represents the column for matching. Here it’s the invoice column, so # 1.

That’s all about how to compare two tables and remove duplicates in Google Sheets.

Prashanth KV
Prashanth KV
Your Trusted Google Sheets and Excel Guide Prashanth KV brings a wealth of experience in Google Sheets and Excel, cultivated through years of work with multinational corporations in Mumbai and Dubai. As a recognized Google Product Expert in Docs Editors, Prashanth shares his expertise through insightful blogging since 2012. Explore his blog for practical tips and guidance on maximizing your spreadsheet skills.

Creating Sequential Dates in Equally Merged Cells in Google Sheets

Do you know how to create sequential dates in equally merged cells across a...

Running Total By Month in Excel

This tutorial demonstrates how to calculate the running total by month in a range...

SORT and SORTBY – Excel Vs Google Sheets

While Excel offers the SORT and SORTBY functions for sorting a table, Google Sheets...

Get Top N Values Using Excel’s FILTER Function

When using the FILTER function in Excel to extract the top N values, it's...

More like this

Creating Sequential Dates in Equally Merged Cells in Google Sheets

Do you know how to create sequential dates in equally merged cells across a...

Interactive Random Task Assigner in Google Sheets

You have multiple tasks and multiple people. Here's a fun way to randomly assign...

Google Sheets Bar and Column Chart with Target Coloring

To color each data point in the column or bar chart according to the...

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.