Merge Duplicate Rows and Keep Latest Values in Excel

Published on

Here’s a dynamic array formula approach to merge duplicate rows and keep the latest values in your Excel spreadsheet. Before diving into the formula, let’s discuss when this method is useful.

Why Merge Duplicates and Keep Latest Values?

Assume you’re cleaning a student database. A student may have multiple entries, but you want to keep only the most recent one and remove the older records. However, the latest record might have missing data in fields, and you may want to fill those from the previous entries before deleting duplicates.

Generic Formula

=DROP(REDUCE("", SEQUENCE(n_col), LAMBDA(acc, val, HSTACK(acc, MAP(UNIQUE(id_col), LAMBDA(rowV, CHOOSEROWS(TOCOL(FILTER(INDEX(range, 0, val), id_col=rowV), 1), -1)))))), 0, 1)

Explanation of Parameters

  • n_col – Replace with the number of columns in your range.
  • id_col – Replace with the reference to the ID column.
  • range – The data range from which you want to merge duplicate rows and keep the latest values.

Prerequisites for Merging Duplicates and Keeping the Latest Values

  1. The dataset must have an ID column. If the same ID appears multiple times, only the second occurrence onward is treated as a duplicate.
  2. If there’s a date column that tracks entry updates, sort the data first by ID and then by date (ascending order). If there’s no date, keep the data as is. The formula assumes the latest records are at the bottom.

Now, let’s look at a sample dataset and apply the formula to merge duplicates while keeping the latest values.

Sample Data

Below is the sample data stored in the “student data” sheet (range A1:G5):

Student IDStudent NameAddressDate of BirthMajorGPAEnrollment Date
101Roy123, Main Street, Rohini2001-05-10CS3.92025-02-17
101Roy12, Inner Circle, Connaught Place
104ThanviB-45, Greater Kailash – II2001-07-05Physics3.82025-02-18
104Thanvi3.5

If you apply the merge duplicates and keep latest values formula, the output will be:

Student IDStudent NameAddressDate of BirthMajorGPAEnrollment Date
101Roy12, Inner Circle, Connaught Place10/05/2001CS3.917/02/2025
104ThanviB-45, Greater Kailash – II05/07/2001Physics3.518/02/2025

Example: Merging Duplicates and Keeping the Latest Values

To process this data dynamically, enter the following formula in cell A1 of a new worksheet:

=DROP(REDUCE("", SEQUENCE(7), LAMBDA(acc, val, HSTACK(acc, MAP(UNIQUE('student data'!A1:A5), LAMBDA(rowV, CHOOSEROWS(TOCOL(FILTER(INDEX('student data'!A1:G5, 0, val), 'student data'!A1:A5=rowV), 1), -1)))))), 0, 1)

This formula merges duplicates and keeps the latest values in Excel by:

  1. Extracting unique IDs from the dataset.
  2. Using FILTER, TOCOL, and CHOOSEROWS to retrieve the latest value for the first unique ID in the first column.
  3. Applying MAP to extend this logic across all unique IDs.
  4. Using BYCOL to apply this operation to each column individually.
  5. Using HSTACK to combine the processed columns horizontally, forming the final dataset.

Things to Know

  • The formula uses multiple LAMBDA functions, which may be resource-intensive for large datasets.
  • If your dataset is too large, consider merging duplicates in smaller chunks and then combining results using VSTACK: =VSTACK(merge_duplicates_1, merge_duplicates_2, ...)

FAQs on Merging Duplicate Rows and Keeping Latest Values in Excel

1. How do I merge duplicate rows in Excel and keep the latest values without VBA or Power Query?

You can use a dynamic array formula in Excel to merge duplicates and keep the latest values without using VBA or Power Query. The formula extracts unique IDs and retrieves the latest non-blank values for each column. See the formula in this guide above.

2. What if my dataset doesn’t have a date column to determine the latest entry?

If your dataset lacks a timestamp or date column, Excel assumes that the latest values are at the bottom of the dataset. Ensure that your data is sorted properly before applying the formula.

3. Will this formula work on large datasets?

This formula is resource-intensive due to multiple LAMBDA and FILTER functions. If you have thousands of rows, consider processing data in smaller chunks.

4. Can I use this formula to merge duplicate rows across multiple sheets?

Consolidate your data into one sheet before applying the formula.

5. How do I fill missing values from other duplicate rows?

The formula automatically fills missing values in the latest row using data from older entries.

6. Does this work in Excel 2019 or earlier versions?

No. The formula relies on dynamic array functions (MAP, REDUCE, TOCOL, CHOOSEROWS), which are available only in Excel 365 and Excel 2021.

Prashanth KV
Prashanth KV
Your Trusted Google Sheets and Excel Guide Prashanth KV brings a wealth of experience in Google Sheets and Excel, cultivated through years of work with multinational corporations in Mumbai and Dubai. As a recognized Google Product Expert in Docs Editors, Prashanth shares his expertise through insightful blogging since 2012. Explore his blog for practical tips and guidance on maximizing your spreadsheet skills.

Count Consecutive Duplicates in Excel (Dynamic Array Formula)

Counting consecutive duplicates in Excel is useful for analyzing patterns, detecting repetitive sequences, and...

How to Break RANK Ties Alphabetically in Google Sheets

The RANK function in Google Sheets is commonly used to assign rankings to numerical...

Google Sheets: Highlight an Entire Column If Any Cell Has an Error

Google Sheets allows you to highlight an entire column if any cell has an...

Filter Top N per Category in Excel (Using FILTER & COUNTIFS)

You can use a combination of FILTER and COUNTIFS to filter the top N...

More like this

Count Consecutive Duplicates in Excel (Dynamic Array Formula)

Counting consecutive duplicates in Excel is useful for analyzing patterns, detecting repetitive sequences, and...

Filter Top N per Category in Excel (Using FILTER & COUNTIFS)

You can use a combination of FILTER and COUNTIFS to filter the top N...

Rank per Group in Excel

You have two groups of 20 students each. How do you determine the rank...

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.