Formula to Combine Duplicate Rows in Google Sheets [No-Addon]

Published on

Is there any formula to combine duplicate rows in Google Sheets? I don’t want to use any Script or add-on.

Yes! I have a one-line Query formula to combine duplicate rows in Google Spreadsheets.

Before going for any free or paid add-on try whether my formula meets your Spreadsheet requirement.

The formula that I am going to provide you combines duplicate rows, irrespective of the number of columns in the dataset. First, understand what I meant.

Assume I have a dataset and in that, the first two columns contain the first names and last names of employees. Other columns contain “X” marks.

I want to merge the names and combine the “X” marks into one single row if the names (first and last) repeat. In other words, I wish to merge the duplicate names and combine values in adjoining columns into one row.

See below how my formula combines duplicate rows in Google Sheets.

Combine Duplicate Rows in Google Sheets

If you are looking for this kind of output, please check this tutorial: Merging Duplicate Rows Using Array Formulas in Google Sheets.

Combining Duplicates and Joining Values

Google Sheets Formula to Combine Duplicate Rows

With one Query formula, yes! I am using Google Sheets Query, we can combine multiple duplicate rows into one.

To polish the Query output like removing any blank rows and adding the header, I am using another Query as a nested. Here is the master Query formula that combines duplicate rows in Google Sheets.

={A1:N1;Query(ArrayFormula(query({A1:N},"Select Col1, Col2,"&ArrayFormula(join(", ","Max(Col"&column(C1:N1)&")"))&"group by Col1,Col2",1)),"Select * where Col1 is not null Offset 1",0)}

I know you can’t use this formula as it’s because your data range may be different than mine. So I am giving priority to how to modify this Query for any number of columns.

How Do I Modify The Query Formula That Combines Duplicate Rows?

Sample Data:

Sample Dataset to Combine Duplicate Rows

In this, we can find duplicates by looking into columns A and B. I want the formula to merge the duplicates in columns A and B. Then the values in columns C to N to be combined.

In my sample data, I have to look for duplicates in two columns – First Name and Last Name (columns A and B).

If you have only one column (First Name column) and no Last Name column the following changes must be made to the formula.

The data range is A1:M and in that, A1:M1 contains the headers as below.

header-row-for-duplicates

Now see my earlier formula.

tips to modify query that merge duplicate rows
  1. In this the column letter N is to be replaced by M.
  2. The column letter C is to be replaced by B because we want to combine columns B to M.
  3. Change the Query Select and Group by Clause as below.
    • Select Col1, Col2, to Select Col1,
    • Grop by Col1, Col2 to Group by Col1

Here is that modified formula for the range A1:M.

={A1:M1;Query(ArrayFormula(query({A1:M},"Select Col1,"&ArrayFormula(join(", ","Max(Col"&column(B1:M1)&")"))&"group by Col1",1)),"Select * where Col1 is not null Offset 1",0)}

I have provided you with the formula to combine duplicate rows in Google Sheets and explained how to modify it. See one more tip to modify the formula.

Sample Data: Check duplicates for first three columns.

Check duplicates for first three columns

Formula:

={A1:O1;Query(ArrayFormula(query({A1:O},"Select Col1, Col2,Col3,"&ArrayFormula(join(", ","Max(Col"&column(D1:O1)&")"))&"group by Col1,Col2,Col3",1)),"Select * where Col1 is not null Offset 1",0)}

So you have learned the trick to combining duplicate Rows without add-ons in Google Sheets. See how the formula works.

How the Query Merge Duplicates and Combine Rows – Formula Explanation

Some of you may have familiar with Query. If so I can help you to understand my above formula.

I have already mentioned at the beginning that the inner Query formula is only required to combine duplicates. Hope you have that already in your mind. Now read on to understand how the formula combines duplicate rows in Google Sheets.

Logic: To combine rows I have used the Max aggregate function in Query. It aggregates texts.Too make the data range flexible, I did some tricks in the Query Select Clause. I’ll explain that.

We can test the above formula in a relatively small dataset like A1:D7. In that, A1:D1 is the header row and A2:A7 contains the duplicate names. I want to merge the duplicates and combine rows in the range B2:C7 (refer to the below image)

This time I am only using the inner Query.

=ArrayFormula(query({A1:D7},"Select Col1,"&ArrayFormula(join(", ","Max(Col"&column(B1:D1)&")"))&"group by Col1",1))

That Query is equal to;

=ArrayFormula(query({A1:D7},"Select Col1, Max(Col2), Max(Col3), Max(Col4) group by Col1",1))
Query-Select-Clause-Automation in merging rows

Hope that makes everything clear.

So think twice before going for any costly add-on. Try this formula first to Combine Duplicate Rows in Google Sheets.

Example Sheet with Formula

Related Reading:

  1. How to Aggregate Strings Using Query in Google Sheets.
  2. Combine Similar Rows and Sum Values in Google Sheets.
  3. How to Sum, Avg, Count, Max, and Min in Google Sheets Query.
  4. Return All Values in Query Group By Clause in Google Sheets.
Prashanth KV
Prashanth KV
Your Trusted Google Sheets and Excel Guide Prashanth KV brings a wealth of experience in Google Sheets and Excel, cultivated through years of work with multinational corporations in Mumbai and Dubai. As a recognized Google Product Expert in Docs Editors, Prashanth shares his expertise through insightful blogging since 2012. Explore his blog for practical tips and guidance on maximizing your spreadsheet skills.

Lookup Values Under Categories in Excel and Google Sheets

We can use a combination of XLOOKUP and VLOOKUP in both Excel and Google...

Extract Items Under a Category from the Same Column in Excel

In Excel, you can use a combination of the XLOOKUP and DROP functions to...

How to Create a Searchable Table in Excel Using the FILTER Function

Finding specific records, or rows containing the required information, is straightforward in Excel using...

Time Sequences in Excel by Minute, Hour, or Second Increments

Creating time sequences, whether by hour, minute, or second increments in Excel, can be...

More like this

XMATCH Row by Row: Finding Values Across a Range in Google Sheets

Using the BYROW function with XMATCH in Google Sheets allows us to match values...

Limit Formula Expansion to a Specific Row in Google Sheets

In this tutorial, I’ll explain how to limit the expansion of an array formula...

3-D Referencing Structured Data Tables in Google Sheets

When you have several tables within a single sheet—not across multiple sheets in a...

29 COMMENTS

  1. Can you help me with my data? I have 4 columns, the first 3 have duplicates, and only the 4th column has different figures. I want to merge the duplicates and create a 4th column that is comma-separated. I checked out your other solutions using JOIN and FILTER, but I need to perform an array operation because I am importing the data from the IMPORTRANGE() function.

    • Hi Avishkar,

      Sure. When I coded those formulas, the LAMBDA function was not available in Google Sheets.

      That being said, I assume your sample data is in cells A1 to D and cells A1 to D1 contain the field labels IP, MAC, Hostname, and Anomaly.

      Here is the code that you can try:

      =LET(distinct,UNIQUE(A2:C),HSTACK(distinct,MAP(CHOOSECOLS(distinct,1),
      CHOOSECOLS(distinct,2),CHOOSECOLS(distinct,3),LAMBDA(ip,mac,host,
      TEXTJOIN(", ",TRUE,FILTER(D2:D,A2:A&B2:B&C2:C=ip&mac&host))))))

      I will try to post a custom function soon. Please watch for my upcoming post.

  2. Thanks, Prashant for the guidance above, it is so close to what I’m trying to achieve! Instead of taking the MAX(), I just need the column data from the latest row. How can I do that please? Been trying for a couple of hours, thanks Prashant.

    • Hi, Junnie Chen,

      To write a formula, I need more info like;

      Do you want only the latest rows?
      Do you have a timestamp column in your dataset?

      Better, you may please share the URL of a sample of your sheet (after replacing confidential data with dummy data and removing personal info) via reply. The file should be in either View or Edit mode.

      • Thanks, Prashanth for getting back. Basically, there is a form for students to key in their exam marks, they could submit the form more than 1 time, so yes there will be Timestamp.

        My goal is to show the summary of each student & their marks and grades in the “Subjects & Grades (Combined)” sheet. For each subject, I want to show the marks and grades from the latest row (as highlighted in orange in the “Subjects & Grades (ALL)” sheet). Currently, the formula I am using is showing the MAX() of marks and grade which is incorrect. Appreciate your guidance, thanks.

        This is the sheet: …link removed by admin…

        • Hi, Junnie Chen,

          I have created a new tab named “prashanth @infoinspired” in your shared Sheets.

          In that, the main formula is in cell B2 which dragged down (not to right).

          =ArrayFormula(IFERROR(REGEXEXTRACT(trim(query(filter(sort('Subjects & Grades (ALL)'!$B$2:$AC,row('Subjects & Grades (ALL)'!$A$2:$A),0),
          sort('Subjects & Grades (ALL)'!$A$2:$A,row('Subjects & Grades (ALL)'!$A$2:$A),0)=A2),,9^9)),"[A-Za-z0-9.]+")))

          That’s the only formula you have to pay attention to.

          Note:- The formula formats numbers to text. But you can use VALUE to convert in case you want to use it in aggregation.

          I’ll write a tutorial later and will leave the link below.

  3. Thank you for all of your helpful tutorials, I appreciate it.

    I have some data where I am trying to use the formula in this tutorial to output a single row with title, color, size (qty).

    I am having trouble joining all of the sizes to output in one line or in separate columns.

    Currently, it is outputting the last size it encounters in column H. Any suggestions are much appreciated.

    • Hi, Kevin Watson,

      It seems a pivot table is the best option to solve the problem. You can use the following Query Pivot.

      =query(A1:D,"Select A,B,sum(D) where A is not null group by A,B pivot C",1)

      The above formula will return title in the first column, color in the second column, and summed quantities under various sizes from column 3 onwards.

      The third column has no label as it represents “NIL” sizes. If you want to label it, use the below Query instead.

      =query(query(A1:D,"Select A,B,sum(D) where A is not null group by A,B pivot C",1),"Select * label Col3'Nil'",1)

      Your sheet is view only. So I haven’t inserted the formula in your sheet.

  4. Thanks for this awesome query. Noticed some odd behavior. I’m using the query that looks for dups only in the first column.

    Works beautifully as long as the entries in the first column are strings, not numbers. If the entries are numbers, it seems to drop some. Not sure if this is an issue with the query or Google Sheets.

    But if you try it out with just for rows with two dups, it’ll work with strings in the first column: it returns two rows with the rest of the columns merged.

    If you change the first column to numbers, it returns only one row, original one of the original rows.

    Any ideas what’s causing this?

  5. Hello,

    What if I’d like to combine rows in a way that I get two X where the same name has an X in the same column but on two different rows.

    Is there a way to accomplish that?

    Thanks

    Andrea

  6. Hi there,

    Thank you for this! Is there any way to not have the columns retitle with “Max”? Likewise, I’m getting an extra row added after the first row in the new array- is that supposed to happen and can I stop that from happening?

    Thank you!

    • Hi, Kristen,

      Actually, in my formula, the outer Query is to remove that titles. I don’t know why it’s not happening in your case.

      If I can see a copy of your Sheet, of course after removing personal/confidential/sensitive data, I can suggest a solution.

      Did you check my example sheet, that I have shared at the last part of my tutorial?

      Best,

      • I am also having the word “max” added which makes the output uninterpretable.

        This is so close to the perfect solution for me!

        Here is my sheet: URL removed by admin

        I would like to merge the duplicate rows in ‘alphatranspose’ tab in the separate ‘merge’ tab but without the “max” is added to one of the rows.

        What am I doing wrong?

        Thanks

        • Hi, Nick,

          You have perfectly followed my instructions to merge duplicate rows. The issue is not from your side.

          The Query function guesses the header row if you omit the header row (did not specify) number in the Query.

          QUERY(data, query, [headers])

          In my second Query formula (the outer Query formula) I have omitted the header and leave the Query to guess it. In your case, the Query guessed the header row number as 1. Please specify it to 0 to avoid the max row.

          You can do that by putting the value 0 just after the Offset clause in the second Query as below.

          "Select * where Col1 is not null Offset 1",0)}

          I’ll update the formula in my post and my shared sheet also to incorporate the change.

          Best,

          Prashanth KV

  7. Is there any way to get the data to stop adding in alphabetical order and just append to the bottom of the list?

    I have triggers set up for my data set, but somehow this formula formats A-Z and triggers all of my leads when a name like “Adam” gets appended to the top instead of the bottom of the list in chronological order.

      • Thanks for responding!

        Here is a Demo link. There are two tabs. The first tab is where the duplicate leads populate. The “Dedup” sheet is where the formula is and you can see they get formatted in alphabetical order. I don’t need them in alphabetical order, I need them to append to the bottom of the list when new leads populate.

        — link removed by admin —

        • Also, seem to be having issues w/ Upper and lower case not combining duplicates because the last name may be “som” or “Som”.

          • Hi, Ellysha,

            Thanks for sharing your example sheet. That helped me to well understand the problem.

            I have made a copy of your shared sheet, I hope you don’t mind since it’s a demo sheet, and entered my solution there. I’ve just edited my post and at the end, shared your demo sheet with my solution entered in cell J1 (tab name: “User Question”).

            Here is that formula.

            ={A1:I1;Query(ArrayFormula(query({A1:I},"Select upper(Col1),upper(Col2),Col3,lower(Col4),"&ArrayFormula(join(", ","Max(Col"&column(E1:I1)&")"))&"group by upper(Col1),upper(Col2),Col3,lower(Col4) order by Max(Col9)",1)),"Select * where Col1 is not null Offset 1",0)}

            In this, I have made the grouping columns, i.e. first name, last name and email address to upper/lower case letters. So it’ll solve the case sensitive problem.

            To arrange the data based on your new leads, I have sorted the data based on your timestamp column.

            Cheers!

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.