HomeGoogle DocsSpreadsheetHow to Filter the Top 3 Most Frequent Strings in Google Sheets

How to Filter the Top 3 Most Frequent Strings in Google Sheets

Published on

This post explains how to filter the top 3 most frequent strings in Google Sheets.

Before going to an example, let me clarify one important point about the result that my formula will return.

Assume there are five strings, and the occurrences of them are as follows.

ValuesNumber of Times (Occurrences)
A4
B1
C4
D5
E2

Note:- For the sample data, please jump to image # 1 below (A2:A). We will find the occurrences as above using a COUNTIF later.

My formula is going to return the strings D (5 occurrences), A (4 occurrences), C (4 occurrences), and E (2 occurrences).

We want three strings, right? Then why does the formula return four strings?

It is because there are two strings in the second position, and they are A and C with four occurrences each.

One more thing!

Earlier, we have learned how to find the MODE of text values.

That post is related to finding the top most frequent string (MODE) / strings (MODE.MULT), not about finding the top 3 or n most frequent strings.

I’ll explain step by step how to filter the top 3 most frequent strings in Google Sheets.

Please note that you can use ‘n’ instead of 3 without making any major changes in the formula.

Example

Filtering the Top 3 Most Frequent Strings - Example
image # 1

In the above example, the sample texts are in A2:A, and my formula in C2 returns the most frequent three values and their occurrences in the number of times.

As you can see, the formula returns four values since the values “A” and “C” repeat twice, which I have already explained above.

Let’s go to the tutorial section.

Filter the Top 3 Most Frequent Strings in Google Sheets

We will use a few helper columns (additional columns) in the beginning steps, and on the course, we can remove those.

There are three main steps involved. Here are them.

Step # 1 – COUNTIF Array to Return the Count of Strings

Prepare a sample sheet as per the above image. Just enter the values in A2:A. Leave other values.

In that sheet, in cell B2 insert the below COUNTIF Array Formula to return the count of text strings in A2:A.

=ArrayFormula(if(len(A2:A),countif(A2:A,A2:A)))

To make you know what the COUNTIF returns, I am leaving the corresponding screenshot below.

Countif Array Result
image # 2

In the above formula, you may want to know the LEN use. You can find that here – LEN Function in Google Sheets and Practical Use of It.

The COUNTIF returns how many times each string occurs.

Step # 2 – Find the Number of Times the Most Frequent N Strings Occur

The next two Google Sheets formulas determine the number of most frequent strings that you want to filter.

First, you may enter the below SORTN, UNIQUE, and COUNTIF combination formula in cell D2.

=sortn(unique(countif(A2:A,A2:A)),3,0,1,0)

Formula Explanation

The COUNTIF enclosed within UNIQUE returns the unique values from the values in column B.

In the above formula, instead of referring to the COUNTIF result in B2:B, I have used the same formula within the UNIQUE itself.

The 3 in SORTN means return 3 values after sorting column 1 (there is only one column) in descending order.

If you want to filter the top 5 most frequent strings in Google Sheets, change 3 to 5 in the formula.

SORTN Explained Further

Now I am talking about the 0 present after the 3. It’s one of the SORTN display_ties_mode. It determines what to do in case of duplicates.

If you want, you can read more about ties mode here – SORTN Tie Modes in Google Sheets – The Four Tiebreakers.

The next 1 indicates the sort column, and 0 means to sort the column in descending order (Z-A).

Step # 2.1 – Coding a Regular Expression to Filter the Most Frequent 3 Strings

Now I am coming to the logic of my final formula that you will get after one more step below. So please pay extra attention.

Logic

We will filter the strings in A2:A based on the number of times they appear. But not just based on the number of times, but based on the top n number of times returned by the step # 2 formula.

So we will use the FILTER function to filter A2:A if B2:B (COUNTIF Array) values are equal to Step # 2 result.

To specify B2:B is equal to Step # 2 result in FILTER, we can use the REGEXMATCH in FILTER as below.

Generic Formula (You will get the final formula based on this generic formula in the next step)

=FILTER(A2:A,regexmatch(B2:B&"",step_2.1_formula))

Now, first, let’s understand the REGEXMATCH formula within the above Generic Formula.

The REGEXMATCH is one of the Text functions in Google Sheets. Since B2:B contains numbers, I’ve converted them to text by adding a null character. I mean B2:B is replaced by B2:B&"".

Otherwise, REGEXMATCH will return #VALUE! error stating something similar below.

Function REGEXMATCH parameter 1 expects text values. But ‘5’ is a number and cannot be coerced to a text.

Then what is the step_2.1_formula?

To match multiple numbers, in REGEXMATCH, we should combine the numbers in a specific way as below. That’s the step # 2.1 formula (a modified version of the D2 [Step # 2] formula).

="^"&textjoin("$|^",true,sortn(unique(countif(A2:A,A2:A)),3,1,1,0))&"$"

The above is as per the below generic formula.

="^"&textjoin("$|^",true,step_2_formula)&"$"
Regular Expression to Filter the Top 3 Most Frequent Strings
image # 3

Time to filter the most frequent 3 strings using the above number of times.

Step # 3 – Filter the Most Frequent 3 Strings Using Number of Times (Step 2.1 Result)

Once again, please pay attention to the generic formula to filter the most frequent 3 strings in Google Sheets.

=FILTER({A2:A,B2:B},regexmatch(B2:B&"",step_2.1_formula))

I’ve already explained all the required parameters to use in the FILTER. So I am going straightaway to the formula (not to the final formula).

Copy-paste the above generic formula to cell C2. Cut D2 formula (step_2.1_formula) and replace step_2.1_formula in the C2 generic formula with that.

Formula

=filter({A2:A,B2:B},regexmatch(B2:B&"","^"&textjoin("$|^",true,sortn(unique(countif(A2:A,A2:A)),3,1,1,0))&"$"))

It’s high time to remove the helper columns. For that, replace B2:B (it appears twice in the formula) with the formula in B2 itself.

=filter({A2:A,if(len(A2:A),countif(A2:A,A2:A))},regexmatch(if(len(A2:A),countif(A2:A,A2:A))&"","^"&textjoin("$|^",true,sortn(unique(countif(A2:A,A2:A)),3,1,1,0))&"$"))

Finally, you may enclose this formula with the UNIQUE to remove duplicates (multiple occurrences of rows).

=unique(filter({A2:A,if(len(A2:A),countif(A2:A,A2:A))},regexmatch(if(len(A2:A),countif(A2:A,A2:A))&"","^"&textjoin("$|^",true,sortn(unique(countif(A2:A,A2:A)),3,1,1,0))&"$")))

Then SORT the second column in descending order.

=sort(unique(filter({A2:A,if(len(A2:A),countif(A2:A,A2:A))},regexmatch(if(len(A2:A),countif(A2:A,A2:A))&"","^"&textjoin("$|^",true,sortn(unique(countif(A2:A,A2:A)),3,1,1,0))&"$"))),2,0)

That’s all. Thanks for the stay, enjoy!

Sample_Sheet_7121

Prashanth KV
Prashanth KV
Your Trusted Google Sheets and Excel Guide Prashanth KV brings a wealth of experience in Google Sheets and Excel, cultivated through years of work with multinational corporations in Mumbai and Dubai. As a recognized Google Product Expert in Docs Editors, Prashanth shares his expertise through insightful blogging since 2012. Explore his blog for practical tips and guidance on maximizing your spreadsheet skills.

Excel: Hotel Room Availability and Booking Template (Free)

Presenting a free Excel template with advanced features for hoteliers to check room availability...

Dynamic Sum Column in SUMIF in Google Sheets

To obtain a dynamic sum column (sum_range argument) in SUMIF, we can utilize the...

Create a Calendar in Excel with a One-Line Dynamic Array Formula

This tutorial explains how to create a calendar in Excel using a one-line formula...

Excel: Month Name to Number & Number to Name

This tutorial showcases the most efficient formulas for converting a month name to a...

More like this

Dynamic Sum Column in SUMIF in Google Sheets

To obtain a dynamic sum column (sum_range argument) in SUMIF, we can utilize the...

Get the First or Last Row/Column in a New Google Sheets Table

If you've recently started using the new Google Sheets TABLE functionality, you may find...

UNIQUE Function in Visible Rows in Google Sheets

The UNIQUE function doesn't inherently include only visible rows when it returns values, discarding...

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.