Group a Column by the First Few Characters in Google Sheets

Published on

Grouping a column in Google Sheets is easier than you might think. All you need is a basic understanding of the QUERY function. However, sometimes you may want to group data based on the first few characters in a column. Let me explain how to do that step by step.

When working with spreadsheet applications, the best way to generate ideas is by learning as many functions as possible. This is because you can combine or nest functions to achieve your desired results.

For grouping a column by the first few characters, you’ll need to use a combination of functions—specifically, the LEFT and QUERY functions.

Sample Data for Grouping by the First Few Characters

The sample data consists of item codes in the first column and spans across 4 columns.

Item CodeProductUnitQty
AE2565Apple – U.S. FancyKg10
AE2566Apple – U.S. No. 1Kg5
AE2567Apple – U.S. UtilityKg10
AE2566Apple – U.S. No. 2Kg10
OE1979Orange – U.S. FancyKg20
OE1980Orange – U.S. No. 1Kg20
OE1982Orange – U.S. No. 1 GoldenKg10

In this dataset, the goal is to group the data based on the first two letters in column A and sum the quantities in column D.

A standard QUERY formula that groups by item codes would look like this:

=QUERY(A1:D, "SELECT Col1, SUM(Col4) WHERE Col1<>'' GROUP BY Col1", 1)

But how do you group this data by the first few characters of the item codes? Let’s learn how to do it step by step.

Step 1: Extracting the First Few Characters

Our original data is in the range A1:D, where A1:D1 contains the field labels. In cell E1, enter the following array formula to extract the first two characters from the item codes:

=ArrayFormula(VSTACK("Title", LEFT(A2:A, 2)))
Extracting the First Two Characters for Grouping

The LEFT function returns the first two characters from the range A2:A, and the VSTACK function appends the column name “Title” to it. You can replace “Title” with any name you prefer, such as “Group Item.”

Step 2: Grouping Data by the First Two Characters

Next, extend the data range to include column E and group by column E instead of column A. The formula will be:

=QUERY(A1:E, "SELECT Col5, SUM(Col4) WHERE Col5<>'' GROUP BY Col5", 1)

This will group the data by the first two characters. If you enter this formula in cell G1, you will get the output in the range G1:H3.

Titlesum Qty
AE35
OE50

Step 3: Removing the Helper Column

If you prefer not to keep the helper column, modify the data used in the QUERY function.

Currently, the data range is A1:E. You should use it as follows:

HSTACK(ArrayFormula(VSTACK("Title", LEFT(A2:A, 2))), B1:D)

The HSTACK function horizontally stacks the helper column formula with the data in B1:D.

You can use this formula with the standard QUERY formula mentioned above Step 1 as follows:

=QUERY(HSTACK(ArrayFormula(VSTACK("Title", LEFT(A2:A, 2))), B1:D), "SELECT Col1, SUM(Col4) WHERE Col1<>'' GROUP BY Col1", 1)

Note: You can move the ARRAYFORMULA to the beginning, which is a common practice. The formula would then look like this:

=ArrayFormula(QUERY(HSTACK(VSTACK("Title", LEFT(A2:A, 2)), B1:D), "SELECT Col1, SUM(Col4) WHERE Col1<>'' GROUP BY Col1", 1))

That’s how you group data by the first few characters in Google Sheets.

Resources

Prashanth KV
Prashanth KV
Your Trusted Google Sheets and Excel Guide Prashanth KV brings a wealth of experience in Google Sheets and Excel, cultivated through years of work with multinational corporations in Mumbai and Dubai. As a recognized Google Product Expert in Docs Editors, Prashanth shares his expertise through insightful blogging since 2012. Explore his blog for practical tips and guidance on maximizing your spreadsheet skills.

Hierarchical Numbering Sequences in Excel

Creating hierarchical numbering sequences in an Excel spreadsheet can significantly improve the way you...

How to Easily Repeat a Sequence of Numbers in Excel

Excel offers multiple ways to accomplish tasks, and the simplicity of each approach depends...

Create a Sequence of Dates at Every Nth Row in Excel (Dynamic Array)

Would you like to create a sequence of dates in every nth row in...

XMATCH Row by Row: Finding Values Across a Range in Google Sheets

Using the BYROW function with XMATCH in Google Sheets allows us to match values...

More like this

XMATCH Row by Row: Finding Values Across a Range in Google Sheets

Using the BYROW function with XMATCH in Google Sheets allows us to match values...

Limit Formula Expansion to a Specific Row in Google Sheets

In this tutorial, I’ll explain how to limit the expansion of an array formula...

3-D Referencing Structured Data Tables in Google Sheets

When you have several tables within a single sheet—not across multiple sheets in a...

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.