HomeGoogle DocsSpreadsheetHow to Remove Duplicates from Comma-Delimited Strings in Google Sheets

How to Remove Duplicates from Comma-Delimited Strings in Google Sheets

Published on

In this tutorial, you will get the array and non-array formulas to remove duplicates from comma-delimited strings in Google Sheets.

It must not necessarily be a comma delimiter. You can use the formula if the delimiter is a pipe, new line, etc.

I’ll explain what corrections you should make in the formula.

You may already have the formula to remove duplicates from comma-delimited strings in Google Sheets.

But do you have an array formula to expand for each row?

The purpose of the array formula here is to remove duplicates from comma (pipe or any other) delimited strings from each row by residing in the topmost row in the range.

I’ve two types of array formulas: An old-school one and a new one based on the Lambda functions.

Removing Duplicates from Comma-Delimited Strings: Non-Array Formula

Non-Formula:

=ArrayFormula(textjoin(", ",true,unique(trim(split(A2,",")),true)))
Non-Array Formula to Remove Duplicates from Comma-Delimited Strings

Insert the above formula in cell B2. Then copy and paste it down the column as far as you want.

How does it work?

The SPLIT divides the text around the comma and puts each fragment into a separate cell in the row.

The TRIM removes white spaces.

Then UNIQUE removes duplicates. The role of the TEXTJOIN is to join the fragments back into a comma-delimited string.

The ArrayFormula is to support the non-array TRIM function.

The above is a non-array formula to remove duplicates from comma-delimited strings.

If the delimiter is a pipe, replace split(A2,",") with split(A2,"|").

Also in the TEXJOINT, replace textjoin(", " with textjoin("| ".

Let’s move to the array formula to remove duplicates from the comma-delimited text.

Removing Duplicates from Comma-Delimited Strings: Array Formula

I suggest you use the Lambda formula which is clean and works better.

Old School Approach (Coded before Lambda): ✖

The array formula is a bit lengthier. Here it is.

Array Formula (Old School):

=ArrayFormula(transpose(split(textjoin(", ",1,IFERROR((if(row(indirect("A2:A"&counta(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))))))-match(REGEXEXTRACT(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))),"\w+"),REGEXEXTRACT(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&substitute(A2:A,", ",","&row(A2:A)&" "),","))<>""))),"\w+"),0)=1,"|",)&REGEXREPLACE(to_text(unique(filter(flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),",")),flatten(split(if(len(A2:A),row(A2:A),)&" "&trim(substitute(A2:A,",",","&row(A2:A)&" ")),","))<>""))),"^(\w*\W*){1}","")))),"|")))

For the sample sheet with the formula entered (step-by-step), please scroll down to the end of the post.

How does this formula remove duplicate texts from the comma-delimited strings?

We can explain the above formula in 9 steps under three categories.

  1. Split Strings and Row Wise Unique (3 steps).
  2. Elements to Combine Strings Back to Its Rows (4 steps).
  3. Final Steps to Remove Duplicates from Comma Delimited Strings (2 steps).

1. Split Strings and Row Wise Unique

The below formulas go in cells B2, C2, and D2. Please see the image below to understand what they return.

1. Inserting Row Numbers and a White Space Immediately After the Delimiters.

=ArrayFormula(trim(substitute(A2:A,",",","&row(A2:A)&" ")))

2. Inserting Row Numbers at the Beginning of Values in Each Row.

=ArrayFormula(if(len(A2:A),row(A2:A),)&" "&B2:B)

The purpose of the above two formulas is to add row numbers to each string.

3. Split Strings Based on Delimiter, Unique, and Filter.

=ArrayFormula(to_text(unique(filter(flatten(split(C2:C,",")),flatten(split(C2:C,","))<>""))))
Steps 1 to 3 - Split Strings and Row Wise Unique

When you unique the strings this way, the formula will only remove duplicates from the same row.

We have solved one main issue we may encounter when trying to remove duplicates from comma-delimited strings.

2. Elements to Combine Strings Back to Their Rows

The below formulas go in cells E2, F2, G2, and H2.

The step # 3 formula in the previous step returns unique row number added values.

Below will separate the row numbers (4th step) and the rest of the values (5th step). Please refer to the image below.

4. Extracting Row Numbers.

=ArrayFormula(REGEXEXTRACT(D2:D16,"\w+"))

5. Extracting Strings.

=ArrayFormula(REGEXREPLACE(D2:D16,"^(\w*\W*){1}",""))

Related: Regex to Get All Words after Nth Word in a Sentence in Google Sheets.

The purpose of the following two (steps # 6 and 7) formulas is to return a separator to mark the end of each row based on column D.

We will achieve that by returning the running count of step # 4 output and replacing sequence number 1 with a pipe and the rest of the values with blanks.

6. Running Count Based on Extracted Row Numbers.

=ArrayFormula(row(A2:A16)-match(E2:E16,E2:E16,0))

7. Delimiter to Separate Rows.

=ArrayFormula(if(G2:G16=1,"|",))

Note:- Here, I have used the | character. If your strings are separated by the same, not a comma, then use here a ~ character instead.

Steps 4 to 7 - Elements to Combine Strings Back to Its Rows

3. Final Formula that Removes Duplicates from Comma-Delimited Strings

Please see columns F and H. We want those columns in the final step. First, we will combine the pipe (or tilde) separators in column H with the strings in column F.

Then join them using a comma separator, split at the pipe (or tilde) separator, and transpose.

The following formulas go in cells I2 and J2.

8. Add Separators with Extracted Strings.

=ArrayFormula(H2:H16&F2:F16)

9. Combine Strings Using a Comma Delimiter and Split Based on Separator.

=transpose(split(textjoin(", ",1,I2:I16),"|"))
Steps 8 and 9 - Array Formula to Remove Duplicates from Comma Delimited Strings

This way we can remove duplicates from comma-delimited strings in Google Sheets.

    Note: There shouldn’t be any blank rows in the range. Also, this resource-hungry formula may return errors in a larger data range.

    New Approach (Lambda): ✔

    Array Formula (Lambda – New):

    =map(A2:A,lambda(r,iferror(ArrayFormula(textjoin(", ",true,unique(trim(split(r,",")),true))))))

    Just insert this code in cell B2. Unlike the non-array formula, you won’t require to copy-paste it down.

    It will spill down automatically if there are blank cells below B2.

    It is the same non-array formula we used to remove duplicates from the comma-delimited strings.

    The only difference is the MAP Lambda helper function.

    The LHF helps the non-array formula to spill down.

    Sample_Sheet_15521

    Prashanth KV
    Prashanth KV
    Your Trusted Google Sheets and Excel Guide Prashanth KV brings a wealth of experience in Google Sheets and Excel, cultivated through years of work with multinational corporations in Mumbai and Dubai. As a recognized Google Product Expert in Docs Editors, Prashanth shares his expertise through insightful blogging since 2012. Explore his blog for practical tips and guidance on maximizing your spreadsheet skills.

    Create a Calendar in Excel with a One-Line Dynamic Array Formula

    This tutorial explains how to create a calendar in Excel using a one-line formula...

    Excel: Month Name to Number & Number to Name

    This tutorial showcases the most efficient formulas for converting a month name to a...

    Get the First or Last Row/Column in a New Google Sheets Table

    If you've recently started using the new Google Sheets TABLE functionality, you may find...

    UNIQUE Function in Visible Rows in Google Sheets

    The UNIQUE function doesn't inherently include only visible rows when it returns values, discarding...

    More like this

    Get the First or Last Row/Column in a New Google Sheets Table

    If you've recently started using the new Google Sheets TABLE functionality, you may find...

    UNIQUE Function in Visible Rows in Google Sheets

    The UNIQUE function doesn't inherently include only visible rows when it returns values, discarding...

    Customizing Alternating Colors of a Table in Google Sheets

    Google Sheets' new table feature offers limited options for customizing colors. While you can...

    LEAVE A REPLY

    Please enter your comment!
    Please enter your name here

    This site uses Akismet to reduce spam. Learn how your comment data is processed.