One of my Google Sheets has two tabs with multiple columns of data. In both the data sets I’ve identical column or columns. How can I compare two sets of multi-column data for differences?
To do this your datasets should contain at least one common primary key column. So that you can compare your two sets of multi-column data. See the below examples:
How Do I Compare Two Sets of Multi-Column Data for Differences?
Let’s see how to compare two datasets in Google Sheets for differences. I am starting with a single column primary key comparison.
Single Primary Key Column to Compare Two Sets of Multi-Column Data:
Here I have two datasets. For example purpose, I’ve entered both the datasets in the same sheet tab. You can use datasets in two different tabs also.
In dataset 1 and dataset 2, there is a common primary key column labeled as “Product ID”. I am going to match the product IDs in dataset 1 with the product IDs in dataset 2.
Once find the match I want to check another column for any differences. It’s the “Quantity Received” column in both the datasets. If not matching, I want to return the dataset 1 rows that contain differences.
I have already the result entered in the above sheet in the range K2: M. There is only one row that contains the difference. The quantity of product ID M2001, that is Orange, is different in table 1 and table 2. So I want that row in table 1 returned.
Here is the formula that I have used to compare two sets of multi-column data for differences.
=ArrayFormula(SORT(if(countifs(F3:F&I3:I,A3:A&D3:D)=0,A3:D,)))
How can I use this Find Difference formula in my own Google Sheets?
See the generic version of the above formula to understand it better.
ArrayFormula(SORT(if(countifs(Product ID Data Set 2&Quantity Received Data Set2,Product ID Data Set 1&Quantity Received Data Set1)=0,Dataset 1,)))
Point # 1:
In concise, when you compare two sets of multi-column data for differences, first you should decide from which dataset (dataset 1 or dataset 2) you want to get the rows with differences.
If you want to get the result from dataset 1, you should compare dataset 2 with dataset 1 (not the reverse like dataset 1 with dataset 2) as above.
In other words, In COUNTIFS, the dataset 2 column should come first.
Point # 2:
In comparison, use the required columns as combined. For example, I am comparing Column A and D in dataset 1 with column F and I in dataset 2. I’ve combined the columns in my formula above.
Must Read: Google Sheets Functions Guide.
How to Use Two Primary Key Columns to Compare Two Sets of Multi-Column Data?
The above same logic you can apply to any number of columns. You only require to join the columns in COUNTIFS as below.
In example one, I have used one primary key column from each table for finding the differences. Here is how to use 2 or more columns.
Here I am matching Product ID, WH No., and Quantity Received. The relevant columns in dataset 1 are columns A, B, and D. You can say the primary key columns are column A and B.
In dataset 2, the relevant columns are the columns F, G, and I. The primary key columns are column F and G.
See the formula to understand how the columns combined.
=ArrayFormula(SORT(if(countifs(F3:F&G3:G&I3:I,A3:A&B3:B&D3:D)=0,A3:D,)))
Again I am repeating, in COUNTIFS, the dataset 2 columns should come first. You can use this logic with any number of columns.
That’s all. Follow the above steps to compare two sets of multi-column data for differences in Google Doc Spreadsheets.