HomeGoogle DocsSpreadsheetHow to Remove Duplicates from Comma-Delimited Strings in Google Sheets

How to Remove Duplicates from Comma-Delimited Strings in Google Sheets

Published on

In this tutorial, you will get the array and non-array formulas to remove duplicates from comma-delimited strings in Google Sheets.

It must not necessarily be a comma delimiter. You can use the formula if the delimiter is a pipe, new line, etc.

I’ll explain what corrections you should make in the formula.

You may already have the formula to remove duplicates from comma-delimited strings in Google Sheets.

But do you have an array formula to expand for each row?

The purpose of the array formula here is to remove duplicates from comma (pipe or any other) delimited strings from each row by residing in the topmost row in the range.

I’ve two types of array formulas: An old-school one and a new one based on the Lambda functions.

Removing Duplicates from Comma-Delimited Strings: Non-Array Formula

Non-Formula:

=ArrayFormula(textjoin(", ",true,unique(trim(split(A2,",")),true)))
Non-Array Formula to Remove Duplicates from Comma-Delimited Strings

Insert the above formula in cell B2. Then copy and paste it down the column as far as you want.

How does it work?

The SPLIT divides the text around the comma and puts each fragment into a separate cell in the row.

The TRIM removes white spaces.

Then UNIQUE removes duplicates. The role of the TEXTJOIN is to join the fragments back into a comma-delimited string.

The ArrayFormula is to support the non-array TRIM function.

The above is a non-array formula to remove duplicates from comma-delimited strings.

If the delimiter is a pipe, replace split(A2,",") with split(A2,"|").

Also in the TEXJOINT, replace textjoin(", " with textjoin("| ".

Let’s move to the array formula to remove duplicates from the comma-delimited text.

Removing Duplicates from Comma-Delimited Strings: Array Formula

I suggest you use the Lambda formula which is clean and works better.

Old School Approach (Coded before Lambda): ✖

The array formula is a bit lengthier. Here it is.

Array Formula (Old School):

=ArrayFormula(transpose(split(textjoin(", ",1,IFERROR((if(row(indirect("A2:A"&counta(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))))))-match(REGEXEXTRACT(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))),"\w+"),REGEXEXTRACT(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&substitute(A2:A,", ",","&row(A2:A)&" "),","))<>""))),"\w+"),0)=1,"|",)&REGEXREPLACE(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))),"^(\w*\W*){1}","")))),"|")))

For the sample sheet with the formula entered (step-by-step), please scroll down to the end of the post.

How does this formula remove duplicate texts from the comma-delimited strings?

We can explain the above formula in 9 steps under three categories.

  1. Split Strings and Row Wise Unique (3 steps).
  2. Elements to Combine Strings Back to Its Rows (4 steps).
  3. Final Steps to Remove Duplicates from Comma Delimited Strings (2 steps).

1. Split Strings and Row Wise Unique

The below formulas go in cells B2, C2, and D2. Please see the image below to understand what they return.

1. Inserting Row Numbers and a White Space Immediately After the Delimiters.

=ArrayFormula(trim(substitute(A2:A,",",","&row(A2:A)&" ")))

2. Inserting Row Numbers at the Beginning of Values in Each Row.

=ArrayFormula(if(len(A2:A),row(A2:A),)&" "&B2:B)

The purpose of the above two formulas is to add row numbers to each string.

3. Split Strings Based on Delimiter, Unique, and Filter.

=ArrayFormula(to_text(unique(filter(flatten(split(C2:C,",")),flatten(split(C2:C,","))<>""))))
Steps 1 to 3 - Split Strings and Row Wise Unique

When you unique the strings this way, the formula will only remove duplicates from the same row.

We have solved one main issue we may encounter when trying to remove duplicates from comma-delimited strings.

2. Elements to Combine Strings Back to Their Rows

The below formulas go in cells E2, F2, G2, and H2.

The step # 3 formula in the previous step returns unique row number added values.

Below will separate the row numbers (4th step) and the rest of the values (5th step). Please refer to the image below.

4. Extracting Row Numbers.

=ArrayFormula(REGEXEXTRACT(D2:D16,"\w+"))

5. Extracting Strings.

=ArrayFormula(REGEXREPLACE(D2:D16,"^(\w*\W*){1}",""))

Related: Regex to Get All Words after Nth Word in a Sentence in Google Sheets.

The purpose of the following two (steps # 6 and 7) formulas is to return a separator to mark the end of each row based on column D.

We will achieve that by returning the running count of step # 4 output and replacing sequence number 1 with a pipe and the rest of the values with blanks.

6. Running Count Based on Extracted Row Numbers.

=ArrayFormula(row(A2:A16)-match(E2:E16,E2:E16,0))

7. Delimiter to Separate Rows.

=ArrayFormula(if(G2:G16=1,"|",))

Note:- Here, I have used the | character. If your strings are separated by the same, not a comma, then use here a ~ character instead.

Steps 4 to 7 - Elements to Combine Strings Back to Its Rows

3. Final Formula that Removes Duplicates from Comma-Delimited Strings

Please see columns F and H. We want those columns in the final step. First, we will combine the pipe (or tilde) separators in column H with the strings in column F.

Then join them using a comma separator, split at the pipe (or tilde) separator, and transpose.

The following formulas go in cells I2 and J2.

8. Add Separators with Extracted Strings.

=ArrayFormula(H2:H16&F2:F16)

9. Combine Strings Using a Comma Delimiter and Split Based on Separator.

=transpose(split(textjoin(", ",1,I2:I16),"|"))
Steps 8 and 9 - Array Formula to Remove Duplicates from Comma Delimited Strings

This way we can remove duplicates from comma-delimited strings in Google Sheets.

    Note: There shouldn’t be any blank rows in the range. Also, this resource-hungry formula may return errors in a larger data range.

    New Approach (Lambda): ✔

    Array Formula (Lambda – New):

    =map(A2:A,lambda(r,iferror(ArrayFormula(textjoin(", ",true,unique(trim(split(r,",")),true))))))

    Just insert this code in cell B2. Unlike the non-array formula, you won’t require to copy-paste it down.

    It will spill down automatically if there are blank cells below B2.

    It is the same non-array formula we used to remove duplicates from the comma-delimited strings.

    The only difference is the MAP Lambda helper function.

    The LHF helps the non-array formula to spill down.

    Sample_Sheet_15521

    Prashanth KV
    Prashanth KV
    Your Trusted Google Sheets and Excel Guide Prashanth KV brings a wealth of experience in Google Sheets and Excel, cultivated through years of work with multinational corporations in Mumbai and Dubai. As a recognized Google Product Expert in Docs Editors, Prashanth shares his expertise through insightful blogging since 2012. Explore his blog for practical tips and guidance on maximizing your spreadsheet skills.

    Creating Sequential Dates in Equally Merged Cells in Google Sheets

    Do you know how to create sequential dates in equally merged cells across a...

    Running Total By Month in Excel

    This tutorial demonstrates how to calculate the running total by month in a range...

    SORT and SORTBY – Excel Vs Google Sheets

    While Excel offers the SORT and SORTBY functions for sorting a table, Google Sheets...

    Get Top N Values Using Excel’s FILTER Function

    When using the FILTER function in Excel to extract the top N values, it's...

    More like this

    Creating Sequential Dates in Equally Merged Cells in Google Sheets

    Do you know how to create sequential dates in equally merged cells across a...

    Interactive Random Task Assigner in Google Sheets

    You have multiple tasks and multiple people. Here's a fun way to randomly assign...

    Google Sheets Bar and Column Chart with Target Coloring

    To color each data point in the column or bar chart according to the...

    LEAVE A REPLY

    Please enter your comment!
    Please enter your name here

    This site uses Akismet to reduce spam. Learn how your comment data is processed.