To combine rows and get the latest values, we can depend on a combination formula.
If we want to extract the latest records (rows) from a dataset, then SORTN will do the task. It’s all about sorting using tie-modes.
But, here, our problem is entirely different.
We want to combine rows and get the latest values in the following way.
Rows to Merge:
Price 1 | Price 2 | Price 3 | Price 4 | |
Item 1 | 5.50 | 5.25 | 5.00 | |
Item 1 | 6.00 | 5.00 | 5.75 |
Latest Value Rows (expected result):
Price 1 | Price 2 | Price 3 | Price 4 | |
Item 1 | 5.50 | 6.00 | 5.00 | 5.75 |
Note:- There may be multiple items and more than two duplicate rows, and the above is just a sample.
The above may be a requirement of many spreadsheet users.
If there is a function to combine rows and get the latest values in Google Sheets, even a novice user could manipulate the data as above.
But sadly, there is no MERGE() array function with the optional parameters TRUE (latest) and FALSE (earliest) to achieve the same.
Further, we don’t want to use Apps Script or Add-on for the same. Then the only solution is to write a combination formula if supports.
As I have mentioned in the first paragraph, I have a formula-based solution to merge rows and get the latest values in Google Sheets. Let’s go to that.
How Does the Formula Determine the Latest Rows?
If you have a timestamp column, the formula would determine the latest entry based on it.
When you are working with source data fed by a Google Form, you may have a timestamp column.
Assume there is no timestamp column, then the last row will be considered as the latest row.
For example, if rows 5 and 10 have duplicates, then row 10 will be the latest row.
Based on the above, there will be a minor change in my Google Sheets formula that combines rows and return the latest values.
Merge Rows and Get the Latest Values When the Data Contain Timestamps
Sample Data and Expected Result:
As you can see, the first column in the above sample data contains timestamps. So it is easy for us to combine rows and get the latest values.
If I write the formula straight away, you may find it tough to adopt the same in your data range which might contain more columns than mine.
First, let’s understand the logic, then write the formula.
Before that one more thing!
The range C3:F can either contain numbers alone or text strings alone. Mixed type data in this range is not supported.
Logic in Five Steps
- We will first unpivot the dataset using the SORTN function.
- Then we will use the WHERE clause in Query to filter out rows that contain blanks in the price column.
- We will use the ORDER BY clause in the same Query to sort the data based on the timestamp in descending order.
- So we have an unpivoted dataset. We will only keep the first occurrence of rows based on the “Item” from this dataset. For that, we will use the SORTN function.
- Finally, we will pivot the same data using Query.
We can use this logic to combine rows and get the latest, or we can say the most recent values in Google Sheets.
You May Like: What is the Correct Clause Order in Google Sheets Query?
Formula in Three Steps
Let me write the formula in three easy-to-follow steps.
You will get logic points 1 to 3 in step 1, logic # 4 in step 2, and logic # 5 in step 3.
Step # 1
=ArrayFormula(
query(
split(flatten(A3:A&"|"&B3:B&"|"&C2:F2&"|"&C3:F),"|"),
"Select * where Col4 is not null order by Col1 desc"
)
)
The above SPLIT and FLATTEN functions unpivot the data, and the QUERY is for filtering out non-blanks based on the 4th column in the output and for sorting the data based on the first column.
The first column in the above output contains timestamps in number format, and that column is only for sorting.
Here onwards, we don’t want that column to combine rows and get the latest values.
So let’s modify the above Query slightly to remove the timestamp column.
Replace Select *
with Select Col2, Col3, Col4
, and here is that formula.
=ArrayFormula(
query(
split(flatten(A3:A&"|"&B3:B&"|"&C2:F2&"|"&C3:F),"|"),
"Select Col2,Col3,Col4 where Col4 is not null order by Col1 desc"
)
)
Step # 2
Using the SORTN function, we can remove duplicates from the above output.
It plays a vital role in writing the formula to combine rows and get the latest values.
It seems you want an explanation.
Please refer to the columns H and I in the below screenshot (columns H, I, and J contain the step # 1 formula output).
Let’s consider the item “c” and “Price 1”. It repeats thrice. We want the first row, delete the second and third occurrence.
Using SORTN, we can achieve it.
=sortn(H3:J,9^9,2,H3:H&I3:I,1)
Pivot back this output, and that will be equal to combine rows and get the latest values in Google Sheets. We will do that in the next step.
Step # 3
We can now pivot the above result.
=query(L3:N,"Select L, max(N) Where L is not null group by L pivot M")
Or
=query({L3:N},"Select Col1, max(Col3) Where Col1 is not null group by Col1 pivot Col2")
In the formula, I have used the Max aggregation function.
You can replace Max with Min, or Sum which won’t make any difference since we have already eliminated the duplicates.
However, I suggest you use the Max function because it will work with both numbers and strings.
Single Formula to Combine Rows and Get the Latest Values (All Three Steps Combined)
Combining formulas are simple if you start from the correct one.
In the above example, we must start from the step # 2 formula. Replace the range H3:J in that formula with the step # 1 formula.
Then what about H3:H and I3:I in that formula?
Here are the generic formulas.
For H3:H;
=index(step_1_formula,0,1)
For I3:I;
=index(step_1_formula,0,2)
Then go to step # 3 formula (the second one under the title “Step # 3” above) and replace L3:N with the above-modified Step # 2 formula.
You can remove all the ARRAYFORMULA functions from the formula. The SORTN alone will handle the array requirements.
For the final combined formula, please check cell H2 in Sheet1 in my sample sheet below.
Combine Rows and Get the Latest Values When No Timestamps Present
Please scroll up and refer to image # 1. You can see timestamps in A2:A.
Assume our column A is blank. In this scenario how can we write a formula to combine or merge rows and get the latest values in Google Sheets?
You can refer to Sheet2 of my above shared Google Sheet to get the data.
What are the changes required in the above timestamp-based data formula?
In our first example, the Flatten formula in step # 1 is as follows.
flatten(A3:A&"|"&B3:B&"|"&C2:F2&"|"&C3:F)
The range A3:A in that formula part refers to timestamp which is not available in our new dataset.
Replace A3:A with sequence(counta(C3:F),1)
which would return sequential numbers. Sorting this new column in descending order will meet our purpose.
flatten(sequence(counta(C3:F),1)&"|"&B3:B&"|"&C2:F2&"|"&C3:F)
There are no other changes.
In my sample sheet;
Just copy the Sheet1!H2 formula to Sheet2!H2 and paste. Then replace all the cell range reference A3:A in the formula with the just above Sequence formula.
I have already done that in my sample sheet.
That’s all about how to combine rows and get the latest values in Google Sheets.
Thanks for the stay. Enjoy!