HomeGoogle DocsSpreadsheetHow to Use Regexextract Function in Google Sheets and RE2 Regular Expressions

How to Use Regexextract Function in Google Sheets and RE2 Regular Expressions

Published on

Google Sheets Regex functions are a challenge to learn for even for its regular advanced users. So this time I dedicate my topic to Regexextract function in Google Sheets. This function is one among the three Regex functions in Google Sheets.

You cannot find enough resources or discussion connected with Google Sheets Regex functions anywhere. The reason, it’s a bit complicated. I think users are skipping it mainly due to the lack of proper guidance anywhere.

There are three built-in REGEX functions in Google Sheets. They are Regexextract, Regexmatch, and Regexreplace. These functions are similar in use as the secret lies in the RE2 regular expressions. So learning any one of these function is enough to understand the rest. Here I will explain to you how to use the Regexextract function in Google Sheets.

I have selected Regexextract for my tutorial because instead of true or false, you can see the value as result in this.

You can use the REGEX function in combination with the array, trim, transpose, query like other functions. Understanding Regex comes important at this point. First, learn Regex and then start using it with other functions.

Why Regex function seems complected? As already told it is purely connected with the RE2 expressions in it in use. If you go through the expressions you will not understand much. So with the help of few formulas, I will try to explain the most frequently using RE2 expressions in Google Sheets Regexextract. It is also applicable to the Regexmatch as well as the Regexreplace functions.

How to Use Regexextract Function in Google Sheets

Google Sheets Regexextract function can only work with text. If any extracted value from the text is numeric you can use the value() function to convert it to numeric. As already told, you can use Regex function together with lots of other Google Sheets functions. I am not going to that part this time.

Let us begin with the syntax:

REGEXEXTRACT(text, regular_expression)

What are the two elements in the syntax?

text – The input text.

regular_expression – The first part of text that matches this expression will be returned.

Google products use RE2 for regular expressions. That part is the complicated one with the functions and it’s pathetic that there is no proper guide to use these expressions within functions in Google Sheets.

Now it’s time to go through different scenarios.

Formulas to Learn How to Use Regexextract Function in Google Sheets

1. When you want to extract any plain text using regex, use that directly within quotes

Cell value in A2 is Info Inspired 2017, India

Formula:

=REGEXEXTRACT(A2, "Info Inspired")

Result: Info Inspired

2. To extract either of the character or word  (this or that)

Cell value in A3 is Info Inspired 2017, India

Formula:

=REGEXEXTRACT(A3, "y|o")

Result: o

Formula:

=REGEXEXTRACT(A3, "blog|India")

Result: India

Remember the syntax this time. What does it say? “The first part of the text that matches this expression will be returned.”

3. A single period extracts single character, two periods extract 2 characters and so on.

Cell value in A12 is Info Inspired 2017

Formula:

=REGEXEXTRACT(A12, "....")

Result: Info

Formula:

=REGEXEXTRACT(A12, "In..")

Result: Info

The above formula will be applicable to the first part of the text. When you want to extract a certain number of characters from the end, use the formula as below.

Formula:

=REGEXEXTRACT(A12,"(....\z)")

Result: 2017

4. Google Sheets Regexextract function to extract decimal digits

Cell value in A6 is Info Inspired 2017 Blog

A single decimal digit, not the whole number.

Formula:

=REGEXEXTRACT(A6, "(\d)")

Result: 2

A whole number, the “+” make the difference.

Formula:

=REGEXEXTRACT(A6, "(\d+)")

Result: 2017

5. When you want to extract characters

Cell value in A7 is Info Inspired 2017 Blog

Formula:

=REGEXEXTRACT(A7, "(\w)")

Result: I

Formula:

=REGEXEXTRACT(A7, "(\w+)")

Result: Info

6. When you want to extract both characters and numbers

Cell value in A8 is Info Inspired 2017 Blog

Formula:

=REGEXEXTRACT(A8, "(\w*)\s(\d+)")

Result: Inspired in one cell, 2017 in another, horizontally.

RE2 expression in this formula “\s” used to match the white space between Inspired and 2017.

7. Regexextract function to extract the whole text

Cell value in A17 is Info Inspired 2017

Formula:

=REGEXEXTRACT(A17, "(.+)")

Result: Info Inspired 2017

7.1. Extract Whole Word Based on Partial Match.

I want to extract the whole word based on a partial match.

Cell value in A1 is Google Sheets Tutorials

Formula:

=REGEXEXTRACT(A1,"Tut\w+")

Result: Tutorials

8. Characters grouped in square bracket 

Cell value in A5 is Info Inspired 2017 Blog

Formula:

=REGEXEXTRACT(A5, "[dxy]")

Result: d

9. Expression [a-z] can extract any small case letters in a sequence

Cell value in A9 is info inspired 2017

Formula:

=REGEXEXTRACT(A9, "[a-z]")

Result: i

Formula:

=REGEXEXTRACT(A9, "[a-z]+")

Result: info

10. Expression [A-Z] can extract any uppercase letters in a sequence

Cell value in A10 is INFO INSPIRED 2017

Formula:

=REGEXEXTRACT(A10, "[A-Z]")

Result: I

Formula:

=REGEXEXTRACT(A10, "[A-Z]+")

Result: INFO

11. [A-Za-z] Use this expression in Regexextract when you are not case sensitive

Cell value in A11 is Info Inspired 2017

Formula:

=REGEXEXTRACT(A11, "([A-Za-z])")

Result: I

Formula:

=REGEXEXTRACT(A11, "([A-Za-z]+)")

Result: Info

12. Expression [0-9] can match any number in a sequence from a text

Cell value in A19 is Info Inspired 2017

Formula:

=REGEXEXTRACT(A19, "[0-9]")

Result: 2

Formula:

=REGEXEXTRACT(A19, "[0-9]+")

Result: 2017

If you could learn the above use of RE2 expressions in Regexextract, you can continue to the few more advanced usage. The below examples are a mix of the above expressions.

13. Extracted content which is in between certain characters

Cell value in A21 is a ?Content? between question marks

Formula:

=REGEXEXTRACT(A21, "\?([A-Za-z]+)\?")

Result: Content

Cell value in A22 is a (Content) between open and close brackets

Formula:

=REGEXEXTRACT(A22, "\(([A-Za-z]+)\)")

Result: Content

Cell value in A24 is email me <info@infoinspired.com>

Formula:

=REGEXEXTRACT(A24, "<(.+)>")

Result: info@infoinspired.com

The above are the most common RE2 expressions in Regex Google Sheet functions.

Related:

1. How to Extract Multiple Words Using Regexextract in Google Sheets

2. How to Restrict Entering Special Characters in Google Sheets Using Regex

3. Extract Username from Email Address Using Regex in Google Sheets

4. Google Sheets LEFT, RIGHT Functions, and Alternative REGEXEXTRACT

Prashanth KV
Prashanth KV
Your Trusted Google Sheets and Excel Guide Prashanth KV brings a wealth of experience in Google Sheets and Excel, cultivated through years of work with multinational corporations in Mumbai and Dubai. As a recognized Google Product Expert in Docs Editors, Prashanth shares his expertise through insightful blogging since 2012. Explore his blog for practical tips and guidance on maximizing your spreadsheet skills.

XLOOKUP in Excel: Working with Visible Rows in a Table

In Excel, the XLOOKUP function isn't inherently designed to exclusively work with visible rows;...

SUMPRODUCT Differences: Excel vs. Google Sheets

The core purpose of the SUMPRODUCT function remains consistent across both Excel and Google...

Sum Values by Month and Category in Excel

With the help of EOMONTH, UNIQUE, and SUMIFS functions in Excel, you can efficiently...

Sum Values by Categories in Excel

In Excel, the easiest way to sum values in a range by categories is...

More like this

Interactive Random Task Assigner in Google Sheets

You have multiple tasks and multiple people. Here's a fun way to randomly assign...

Google Sheets Bar and Column Chart with Target Coloring

To color each data point in the column or bar chart according to the...

Analyzing Column A Between Non-Adjacent Values in Column B: Google Sheets

This tutorial addresses a common scenario encountered in Google Sheets: how to sum, average,...

68 COMMENTS

  1. I use your site often to get answers for various Google Sheets functionality and find it very helpful. Have a question about the Regexextract function for the following situation

    I want to extract my and other employees’ names, which are of different lengths, from the following format.

    “NEFT-P15031008433074-1-Jonathan Lance”

    I can extract the date after the first “-” (hyphen).

    How do I set up the Regexextract formula to extract the employee’s name after the third hyphen? Thanks.

    Regards,
    Jonathan

  2. Hi, I would like to extract everything after the first dash but before a word.

    Thanks for any help you can provide. I’ve given you editing privileges with the URL below.

    —URL removed by admin—

    • Hi, Spencer,

      Thanks for sharing your sheet with a clear example. Please find the array version of the following formula there.

      =regexextract(regexreplace(A2,"^(?:.*?word_here)","$0@"), "\-(.*)\@")

  3. Hello Prashanth,

    I have 3 lines(ctrl+enter) in one cell:

    “D1ACA
    Cu 4×10/10
    Max 63A”

    I want to extract line 1 (D1ACA) to one cell. 2 to another cell … and so on. Can you help me?

    Thanks in advance from Sweden

  4. Hi,

    Let’s say I would like to pull the month digit out from a date like this>> 23/04/2021.

    May I know how should I form my formula?

    • Hi, Alex Kwan,

      If the date in question is formatted as a text value, then use the below Regex.

      =REGEXEXTRACT(to_text(E3), "\/(.+)\/")

      If the date is actually a date formatted value, the below Month function may help.

      =month(E3)

      You can check the format using =ISDATE(E3)

  5. Thank you. this has partially solved the issue but also created a new issue. If I share a google sheet with you so you can see the cells I’m working with – is that possible? Also, I’m curious, what the “fishes” are in the formula. I’ve never seen that before.

    • Hi, Jesse Yuen,

      The fish is just a delimiter (character code). Instead, you can use the available symbols (characters) on your keyboard such as a ~, | etc.

      Feel free to share the link via reply.

  6. Hi – I’m trying to pull Instagram handles out of a google sheet. this formula: =REGEXEXTRACT(C5,"\@[A-Za-z0-9]+") works for handles that begin with a number or a letter, but it doesn’t work in the following scenarios:

    1. If there are multiple handles in a cell (e.g. @abc and @123), the formula only captures 1 handle.

    2. if the handle includes any punctuation like a “.” or “-” (those are the most common), the formula captures up to the “.” or “-“. I tried this: =REGEXEXTRACT(C4,"\@[A-Za-z0-9]+\.[A-Za-z0-9]+"), but this only works if a handle has “.”, otherwise it returns “#N/A”

    3. If the handle begins with a “.” or “-“, the formula creates an error.

    I’m hoping that I can use 1 formula for all these scenarios, but I have listed the scenarios in priority so I would be happy if 1 formula could solve for the base + scenario 1.

    • Hi, Jesse Yuen,

      This combo formula may help.

      =textjoin(", ",1,filter(split(REGEXreplace(C5,"(\@[\w]+)","🐠$0🐠"),"🐠"),search("@",split(REGEXreplace(C5,"(\@[\w]+)","🐠$0🐠"),"🐠"))))

  7. I really dig your tutorials as you don’t spoon-feed everything to your readers but make them figure some things out on their own via the examples. Great way to learn if you ask me 🙂

    Now I have this situation at work I need help with 😀

    I have the following info in one cell (without quotes):

    “(ABC) Joe Smith, (DEF) Jane Doe, (GHI) Michael Jordan, (ABC) Wayne Gretzky”

    where the letters in parentheses are their job codes

    The result I’m looking to output in another cell is this (without quotes):

    “Joe Smith, Wayne Gretzky”

    where I want to extract and display the names of only those with the job code (ABC) and disregard the rest.

    Hoping you can help Prasanth! Thanks.

    • Hi, Arthur,

      Assume the names to extract without designations are in cell A1. Then you can try the following formula in cell B1.

      =textjoin(",",1,filter(split(trim(REGEXREPLACE(A1, "\(([a-zA-Z]+)\)","")),","),search("(ABC)",split(A1,","))))

      Another formula as an alternative.

      =ArrayFormula(trim(textjoin(",",1,regexreplace(filter(SPLIT(A1,","),search("(ABC)",SPLIT(A1,","))),"\(.*\)",""))))

  8. Hi,

    I would like to extract a date value and price in a specific string.

    I’m creating a trading journal for options. Here is the string.

    Ex: abc201224p00025000

    abc=ticket
    201224 = date (yymmdd)
    p = option strategy (always one letter p or c)
    00025000 = price (always 8 digits) use only first 7 digits to determine price. $25.00.

    Can you help me? I have tried and get stuck with a number and not a date value or price value.

    Thank you,

    Daniel

    • Hi, Daniel,

      The date you can extract and reformat using the below combo formula.

      =date("20"&left(REGEXEXTRACT(A1,"(\d+)"),2),mid(REGEXEXTRACT(A1,"(\d+)"),3,2),right(REGEXEXTRACT(A1,"(\d+)"),2))

      Regarding the price, use the below Regexextract.

      =REGEXEXTRACT(A1,"(\d+)$")

      It would return 00025000. You can further use the Regex or MID function to return the amount from this.

  9. Hi,

    I have to loop thru a range of cells and use REGEXEXTRACT. How do I pass the cell value dynamically to REGEXEXTRACT inside the app script?

    Thanks

  10. Really helpful thank you!

    I have the following:

    2019 Nurture.02 - B2C - BRSM Commerce - Sample: Virtual Booth with Forrester.Email A

    I only need Sample: Virtual Booth with Forrester.Email A what is the formula for this?

    • Hi, Ian Burgers,

      I’ve modified part of your text due to a ‘possible’ SEO issue.

      I could understand that you want to extract the strings after the last hyphen. This may help.

      =REGEXREPLACE(A1, "(.*)\-","")

      To trim any white space, use TRIM with REGEX formula as below.

      =trim(REGEXREPLACE(A1, "(.*)\-",""))

  11. Hi Prashanth,

    Thanks for putting all this useful info together!

    If I have some numbers separated by quotation marks. How can I split them so as to return the number of each group in different cells horizontally?

    For example:

    Input : “2”,”2″,”2″,”2″
    Desired output: 4 cells horizontally with number 2

    Another example:

    Input: “4-0,13″,”1″,”2-3,6″,”3”
    Desired output: 4 cells
    first cel: 4-0,13
    second cell: 1
    third cell: 2-3,6
    last cell: 3

    THANK YOU!

  12. Hi Prashanth,

    First of all, thank you very much for this guide.

    I have the following….. “Communication Services | Telecom Services | USA”

    How would I extract everything up to first | so in the above “Communication Services”

          • hi Prashanth. Many thanks! it works well until I face a more complicated example when we have several “|”.

            For example in “Communication Services | Telecom Services | USA|Germany|Angola”

            REGEXEXTRACT(A1, "\|(.+)\|")

            This returns everything between the first “|” and the last one ( Telecom Services | USA | Germany). Could you please help with advice on how this should be changed to get back “Telecom Services” only???

  13. Hello Prashanth!

    How would I go about extracting a number that duplicated itself in a cell (and it may be worth noting that the number may be different lengths in each cell)?

    For example:
    A1 = 33
    I want to extract just ‘3’

    Or,
    A1 = 123123
    I want to extract just ‘123’

    Thank you in advance.

    • Hi, Nadia,

      This regex-based combo formula will do the job for you.

      =join("",unique(TRANSPOSE(split(REGEXREPLACE(A1&"",".{1}", "$0,"),","))))*1

      But there might be an easier way.

  14. Hi Prashanth,

    What should I do if I want to Extract initials from names?

    Eg: “Rita Ruiz”
    result: “RR”

    Eg: “Don Miguel Rukeyser”
    result: “DMR”

    Eg: “Donald Runnels”
    result: “DR”

    • Hi, Nguyễn Thanh Tuyền,

      You may try this, please.

      =REGEXREPLACE(A2,"[a-z\s]","")

      If the names are in A2:A, then use the below REGEX array formula in B.

      =ArrayFormula(if(len(A2:A),REGEXREPLACE(A2:A,"[a-z\s]",""),))

  15. Thank you, Prashanth. This was really helpful, but I’m still having some trouble.

    I am trying to create a phone log with a tab for each letter of the alphabet. I should note that information comes from a Google Form. Can you provide any guidance? Thank you.

  16. Hi Prashanth,

    For this formula;

    =query(transpose(split(A1,char(10))),"Select * where Col1 contains 'Item'")

    If the item details are in more than one line?

    For example:
    Item: *2 x big
    ball*

    Is there any solution for this problem?

  17. Thanks, its also work 🙂

    =REGEXEXTRACT(clean(SUBSTITUTE(A1,char(10),"~")),"(\d+)~~Congratulations")

    This will returns the number before a specific string. So what should I change if I want to return number after a specific string?

    • It depends on the pattern of value. Here is one example.

      String in cell A1: 10 students scored 500 points

      The formula in cell B1 to return the number present just after the substring “scored”

      =REGEXEXTRACT(A1,"scored (\d+)")

      Result: 500

    • Hi, Sterben,

      Assume the above text with numbers is in cell A1. Then try this Regexextract.

      =REGEXEXTRACT(clean(A1),"(\d+)Congratulations")

      Included the CLEAN function assuming the text entered in multiple lines in cell A1. If not, you can remove the use of it.

        • The new line character makes the issue. Here are two more formulas.

          Option 1: Returns the second number.

          =regexextract(A1,"(?:\D*(\d+)){2}")

          Option 2: Returns the number before a specific string.

          =REGEXEXTRACT(clean(SUBSTITUTE(A1,char(10),"~")),"(\d+)~~Congratulations")

          • Thank you, Prashanth. It work like a charm!

            I have another question.

            Example:
            Service: XXXXX
            Item: *4 × ruler*
            Total price: XXXX

            XXXXX
            XXXXX

            Date: XXXX
            Time: XXXX
            Type: XXXX
            Service: XXXXXX
            Item: *2 ball*
            Total price: XXXXX

            I want to extract both “Item” ‘ s detail

            Result:
            Item: *2 ball*
            Item: *4 × ruler*

            Is it possible?

  18. I have pulled this formula from the web;

    =(regexextract(index(IMPORTHTML("http://finviz.com/quote.ashx?t="&"ggn","table", 11),7,2),"\d{1,}\.\d{1,}") + 0)/12

    As long as I use the stock symbol “ggn” it works fine but when I try to reference “ggn” with the cell reference in the spread sheet it gives me an error. any ideas?

    • Hi, Steve,

      It works perfectly for me.

      =(regexextract(index(IMPORTHTML("http://finviz.com/quote.ashx?t="&A1,"table", 11),7,2),"\d{1,}\.\d{1,}") + 0)/12

      Cell A1 contains the string “ggn” without quotes.

      Best,

  19. Prashanth,

    Thanks for the examples.

    I am trying to extract an options OPRA code into 4 different cells. I cannot use split text because there are no delimiters, and the char or number lengths are never consistent.

    Examples:

    The code has 4 parts: stock symbol, date, C or P, price

    .SCHL191018P22.5 – would split to: SCHL – 191018 – P – 22.5
    .MU190920P46 – would split to MU – 190920 – P – 46
    .NIO19092CP3 – would split to – NIO – 190927 – C – 3

    With the code in A1, the period is not needed, so I can use:=RIGHT(A1, LEN(A5)-1) = SCHL191018P22.5

    With that result in B2, I used =LEFT(B1,SEARCH("19",B1)-1) to get the code of whatever is left of the year 19.

    I don’t think this is a very clean formula since the year will change and I would have to redo the formula every year. Do you have a better suggestion?

    Then I used =REGEXEXTRACT(B1, "[0-9]+") to get the date = 191018. The date is always 6 numbers long. Sheets doesn’t read this as a date, and I have not figured out how to switch the year,month,date around.

    If you have any suggestions on the remaining parts, I would greatly appreciate a point in the right direction.

    !!!EDIT!!!

    As soon I post this, I go back and figure it out…. here is what I came up with:

    =REGEXEXTRACT(A3, "([A-Za-z]+)([0-9]+)([A-Za-z]+)((\d*\.)?\d+)")

    It generates an extra column, but I can just hide it.

      • Yes, it does. Much cleaner, thank you.

        I’ve now moved on to complex option codes, they add a dash or plus, so I can use the SPLIT function.

        Example:

        .AAPL190927C207.5-.AAPL190927C210+.AAPL190927P205-.AAPL190927P202.5

        Formula

        =split(A13, "-|+")

        I wanted to use truncate to expand down rows, but I got errors for the rows below that already had formula data in. So, split will expand out columns instead.

        I still never solved the date problem.

        Using DATE function converts to 1900 rather than 2000, throwing off other calculations:

        W13 cell’s data is “190927”

        =DATE((left(W13,2)),(mid(W13,3,2)),(right(W13,2)))

        This shows 09/27/1919 instead of 09/27/2019. Any ideas?

  20. How would I extract characters between a / and a |? I’m having some trouble?

    Example: electrical/wire-management/cable-ties | Cable Tie Design:Standard

    Also, extract Characters After or Before –like After |

    Any help is appreciated!

    • Hi, Calvin,

      Formula to extract the string between a forward slash and a pipe.

      =REGEXEXTRACT(C2, "\/(.+)\|")

      Since there are multiple forward slashes, replacing the first forward slash with any other character (blank) will give you a more accurate result.

      =REGEXEXTRACT(SUBSTITUTE(C2,"/","",1), "\/(.+)\|")

      To extract the strings before the pipe, use this Regexextract formula.

      =REGEXEXTRACT(C2, "(.+)\|")

      After the pipe, you can try this.

      =REGEXEXTRACT(C2, "\|(.+)")

      Best,

  21. Hi Prashanth,

    I’m trying to extract one or multiple hashtags from a long text cell. I’m able to extract one but I’m not sure how to extract the 2nd, 3rd, 4th, etc. info a row.

    For example, here is my formula:

    =REGEXEXTRACT(D2,"#\w+")

    I also tried this formula:

    =iferror({REGEXEXTRACT(D2,"#\w+"),REGEXEXTRACT(D2,"#\w+")})

    Thanks!

    • Hi, Dontae Mears,

      To extract multiple hashtags, you can possibly use this REGEXREPLACE based formula in Google Sheets.

      =SPLIT(REGEXREPLACE(D2, "(^|\s)[^#]\S*", "")," ")

      Best

  22. Hi I’m trying to regEx a date like this: 2/20/2019, and I’m having trouble with the year: using this did not work REGEXEXTRACT(F2,"\/([0-9]+)"),"") because it extracts the 20

    Any ideas of what I should do?

  23. I have a Google Sheets file that pulls all of my Amazon orders from my Gmail for further processing. I want a second cell that pulls just the total order cost.

    A snip of the email body text in A1:

    Total Before Tax:
    Estimated Tax:
    Order Total:
    $37.61
    $1.92
    $39.53
    We hope to see you again soon.

    I cannot seem to get a REGEXEXTRACT formula to pull the $37.61 figure (I am ok with ignoring the actual total including tax).

  24. Hi, How can I extract characters between @ and . from below email id.
    “corey@g4designhouse.com” OR “accounting@g4designhouse.com”

    Output needed is g4designhouse

    Please help

    • Hi, Shagun Chaudhary,

      To extract the domain name without TLD from email address, I mean the text between “@” and “.”, you can use the below Regexreplace formula.

      =REGEXEXTRACT(A1, "@(.+)\.")

      Replace the cell address.

      Best,

  25. I have the value $1,200.00 at M274

    =REGEXEXTRACT(M274, "[0-9]+")

    only extracts “1”.

    Perhaps your tutorial could use an update? or perhaps I misread something.

  26. The page seems really helpfull. Can you please tell me how to use =REGEXEXTRACT(A22, "\(([A-Za-z]+)\)") if multiple content between special character are to be extracted.

    • Hi, Laura,
      I fear I don’t interpret your question correctly. I guess you want to try something like this.
      Text in Cell A1: “This is a (relatively new tech) blog.”
      Result: “relatively new tech”
      If this is the case, you can use either of the below REGEX formulas.
      Formula 1:
      =REGEXEXTRACT(A1, "\((.+)\)")
      Formula 2:
      =REGEXEXTRACT(A1, "\(([A-Za-z\s]+)\)")

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.