REGEXP_CONTAINS function: How to use & example
Explore the power of text string search using REGEXP_CONTAINS function in Looker Studio, master its syntax, and learn how to extract specific data patterns.
Google Data Studio Function: REGEXP_CONTAINS
Looker Studio (formerly Google Data Studio) offers a wide range of functions for manipulating and filtering data. One of them is the REGEXP_CONTAINS function, which searches text strings using regular expressions and gives you an efficient way to identify values that contain a specific pattern.
Function Syntax and How It Works
The REGEXP_CONTAINS function uses the following syntax:
REGEXP_CONTAINS(X, regular_expression)
Where:
- 'X' is a field or an expression that you wish to evaluate.
- 'regular_expression' is the pattern you want to search for.
This function will return Boolean values, i.e., true or false. If the pattern you've entered in 'regular_expression' is found within 'X', the function will return true. Otherwise, it will return false.
One key difference between REGEXP_CONTAINS and the similar REGEXP_MATCH function is that REGEXP_CONTAINS returns true when the pattern matches any part of the value in 'X', while REGEXP_MATCH returns true only when the pattern matches the entire value.
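The difference between partial and full matching can be sketched outside Looker Studio. The Python snippet below is only an illustration: the helper names `regexp_contains` and `regexp_match` and the sample value are assumptions made for this example, and Python's `re` module is not the RE2 engine, although it behaves the same way for simple patterns like these.

```python
import re

def regexp_contains(value: str, pattern: str) -> bool:
    """Rough stand-in for REGEXP_CONTAINS: true if the pattern matches anywhere in value."""
    return re.search(pattern, value) is not None

def regexp_match(value: str, pattern: str) -> bool:
    """Rough stand-in for REGEXP_MATCH: true only if the pattern matches the whole value."""
    return re.fullmatch(pattern, value) is not None

sample = "Spring Sale 2024"  # hypothetical field value

print(regexp_contains(sample, "Sale"))   # True: "Sale" appears inside the value
print(regexp_match(sample, "Sale"))      # False: the whole value is not just "Sale"
print(regexp_match(sample, ".*Sale.*"))  # True: wildcards cover the rest of the value
```

In other words, a full-match function needs wildcards such as `.*` around the pattern to behave like a partial match.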
Examples
Let's say you have a data set of monthly sales reports. Each report is labeled with text that includes the sales representative's name ("rep") and the month ("mon"), e.g., "rep_John_mon_Jan."
Now, suppose you want to find all sales reports by the representative "John." You could use REGEXP_CONTAINS to achieve this:
REGEXP_CONTAINS(report_label, 'rep_John')
This will evaluate the 'report_label' field and return true for any record containing 'rep_John' and false for all others.
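To see which records such a formula would flag, the same check can be approximated in Python. This is a sketch under assumptions: the list of labels is made up for illustration, and `re.search` stands in for REGEXP_CONTAINS (Python's `re` is not RE2, but the two agree on a literal pattern like this one).

```python
import re

# Hypothetical report labels following the "rep_<name>_mon_<month>" convention.
labels = [
    "rep_John_mon_Jan",
    "rep_Mary_mon_Jan",
    "rep_John_mon_Feb",
    "rep_Johnny_mon_Mar",
]

# Partial match, as REGEXP_CONTAINS(report_label, 'rep_John') would do.
loose = {label: re.search("rep_John", label) is not None for label in labels}

# A tighter pattern that also requires the "_mon" part right after the name.
strict = {label: re.search("rep_John_mon", label) is not None for label in labels}

for label in labels:
    print(label, loose[label], strict[label])
```

Because the match is partial, 'rep_John' also flags "rep_Johnny_mon_Mar". Including the following separator in the pattern, as in the second check, is one way to avoid that kind of accidental match.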
Function Limitations
This function detects patterns based on RE2 regular expression syntax, so it's important to familiarize yourself with that syntax to produce accurate results. Patterns containing escape characters, such as \, may require additional escaping in Google Data Studio, because the backslash may also be treated as an escape character inside the quoted pattern text.
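Why a backslash can need extra escaping is easier to see with an analogy. The Python sketch below is not Looker Studio syntax; it only shows the general idea that a pattern written inside a quoted string passes through the string's own escaping rules before it reaches the regular expression engine. The exact escaping rules in Looker Studio should be checked against its documentation.

```python
import re

# In an ordinary Python string, "\\" is one backslash, so this is the regex \d+
escaped = "\\d+"
# A raw string keeps the backslash as written, giving the same regex \d+
raw = r"\d+"
print(escaped == raw)  # True: both hold the three characters \ d +

# An unescaped dot matches any character; an escaped dot matches only a literal dot.
print(re.search(r"1.2", "1x2") is not None)   # True: "." matches "x"
print(re.search(r"1\.2", "1x2") is not None)  # False: "\." needs a real "."
print(re.search(r"1\.2", "1.2") is not None)  # True
```

If a pattern with a backslash returns unexpected results, checking whether the backslash survived the quoting step is a reasonable first thing to test.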
Helpful Tips
Despite its limitations, REGEXP_CONTAINS is a powerful function that allows for complex pattern matching. Beyond detecting simple substrings, a well-crafted regular expression lets you recognize specific character sequences, repeated patterns, and much more.
Be meticulous with the regular expressions you use, as incorrect syntax can lead to unexpected results, and always test your expressions before finalizing your reports. Happy data analyzing!