In regular expressions, round brackets (()) are used for grouping. Grouping allows you to apply operators to multiple tokens at once. For example, you can make an entire group optional or repeat the entire group using repetition operators.


Basic Usage

For example:

Set(Value)?

This pattern matches:

  • "Set"

  • "SetValue"

The round brackets group "Value", and the question mark makes it optional.
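As a quick check, the optional group can be tried in Python's re module. Python is chosen here only for illustration, and the third test string, "SetVal", is a made-up counterexample that is not in the text above.

```python
import re

# The group (Value) is made optional by the trailing "?".
pattern = re.compile(r"Set(Value)?")

# fullmatch succeeds only if the whole string fits the pattern.
results = {text: bool(pattern.fullmatch(text))
           for text in ["Set", "SetValue", "SetVal"]}

for text, matched in results.items():
    print(text, matched)
```

"SetVal" is rejected because the question mark applies to the whole group: either all of "Value" is present or none of it.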

Note:

  • Square brackets ([]) define character classes.

  • Curly braces ({}) specify repetition counts.

  • Only round brackets (()) are used for grouping.


Backreferences

Round brackets not only group parts of a regex but also create backreferences. A backreference stores the text matched by the group, allowing you to reuse it later in the regex or replacement text.

Example:

Set(Value)?

If "SetValue" is matched, the backreference \1 will contain "Value". If only "Set" is matched, the backreference will be empty.

To prevent creating a backreference, use non-capturing parentheses:

Set(?:Value)?

The (?: ... ) syntax groups without capturing. The engine does not have to store the matched text, which can make the regex slightly more efficient when backreferences are not needed.
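The difference between the two kinds of group can be inspected directly. This is a small sketch using Python's re module, where a compiled pattern reports its number of capturing groups in the groups attribute; the variable names are illustrative.

```python
import re

capturing = re.compile(r"Set(Value)?")
non_capturing = re.compile(r"Set(?:Value)?")

# Number of capturing groups in each pattern.
print(capturing.groups)       # 1
print(non_capturing.groups)   # 0

# With the capturing version, the matched text is available as group 1.
with_value = capturing.fullmatch("SetValue")
print(with_value.group(1))    # Value

# If the optional group takes no part in the match, Python reports None.
without_value = capturing.fullmatch("Set")
print(without_value.group(1)) # None
```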


Using Backreferences in Replacement Text

Backreferences are often used in search-and-replace operations. The exact syntax for using backreferences in replacement text varies between tools and programming languages.

For example, in many tools:

  • \1 refers to the first capturing group.

  • \2 refers to the second capturing group, and so on.

In replacement text, you can use these backreferences to reinsert matched text:

Find:  (\w+)\s+\1
Replace:  \1

This pattern finds doubled words like "the the" and replaces them with a single instance.
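In Python, for instance, re.sub accepts \1 in the replacement string. The sketch below applies the find-and-replace pair above to a sample sentence chosen for illustration.

```python
import re

text = "the the cat"

# Find a word, whitespace, and the same word again; keep one copy.
result = re.sub(r"(\w+)\s+\1", r"\1", text)

print(result)  # the cat
```

Other tools write the same reference differently, so check the documentation of the tool you are using before copying the replacement text.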


Using Backreferences in the Regex

Backreferences can also be used within the regex itself to match the same text again.

Example:

<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>

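This pattern matches an HTML tag and its corresponding closing tag. The opening tag name is captured in the first backreference, and \1 is used to ensure the closing tag matches the same name. As written, the character classes accept only uppercase tag names; turn on case-insensitive matching to handle lowercase tags as well.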
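A minimal sketch of the tag pattern in Python's re module follows. The sample strings are invented for illustration, and the pattern is used exactly as given above.

```python
import re

tag_pair = re.compile(r"<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>")

# Opening and closing names agree, so the pair matches.
same_name = tag_pair.search("<B>bold</B>")
print(same_name.group(0))    # <B>bold</B>

# The closing tag has a different name, so \1 cannot match it.
wrong_close = tag_pair.search("<B>bold</I>")
print(wrong_close)           # None

# Attributes are skipped by [^>]*, and the name may contain digits.
with_attr = tag_pair.search("<H1 CLASS='x'>Title</H1>")
print(with_attr.group(1))    # H1

# [A-Z] is case sensitive here, so lowercase tags are not matched.
lowercase = tag_pair.search("<b>bold</b>")
print(lowercase)             # None
```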


Numbering Backreferences

Backreferences are numbered based on the order of opening brackets in the regex:

  • The first opening bracket creates backreference \1.

  • The second opening bracket creates backreference \2.

Non-capturing groups do not count toward the numbering.

Example:

([a-c])x\1x\1

This pattern matches:

  • "axaxa"

  • "bxbxb"

  • "cxcxc"

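If the contents of a group are optional and match nothing, as in (q?), the backreference is empty and the regex can still match. A backreference to a group that did not take part in the match at all behaves differently depending on the regex flavor.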
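The numbering rules can be checked with a short Python sketch. The nested pattern ((a)(b)) and the string "SetValue-Value" are illustrative additions, not examples from the text above.

```python
import re

# \1 must repeat whatever the single group captured.
repeated = re.compile(r"([a-c])x\1x\1")
results = {text: bool(repeated.fullmatch(text))
           for text in ["axaxa", "bxbxb", "axbxa"]}
print(results)

# Groups are numbered by the position of their opening bracket,
# so the outer group is 1 and the inner groups are 2 and 3.
nested = re.fullmatch(r"((a)(b))", "ab")
print(nested.groups())        # ('ab', 'a', 'b')

# The non-capturing group is skipped, so (Value) is group 1.
skipped = re.fullmatch(r"(?:Set)(Value)-\1", "SetValue-Value")
print(skipped.group(1))       # Value
```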


Looking Inside the Regex Engine

Let’s see how the regex engine processes the following pattern:

<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>

when applied to the string:

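Testing <B><I>bold italic</I></B> text

  1. The engine tries each starting position in turn until < matches at the start of <B>. The group ([A-Z][A-Z0-9]*) matches the tag name and stores "B" in the first backreference, and [^>]*> consumes the closing ">".

  2. The lazy .*? expands one character at a time. After each step the engine tries </\1> at the current position. At </I> the attempt fails, because \1 holds "B", not "I".

  3. At </B> the attempt succeeds: the backreference \1 ensures the closing tag has the same name as the opening tag.

  4. The entire match is <B><I>bold italic</I></B>.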
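The outcome of this walk-through can be reproduced with Python's re module. The second search, which starts one character later so that the engine cannot begin at <B>, is an addition made here for illustration.

```python
import re

pattern = re.compile(r"<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>")
subject = "Testing <B><I>bold italic</I></B> text"

outer = pattern.search(subject)
print(outer.group(0))   # <B><I>bold italic</I></B>
print(outer.group(1))   # B
print(outer.start())    # 8

# Starting the search after the "<" of <B> makes <I> the first
# opening tag the engine can find, so \1 now holds "I".
inner = pattern.search(subject, 9)
print(inner.group(0))   # <I>bold italic</I>
print(inner.group(1))   # I
```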


Backreferences to Failed Groups

There’s a difference between a backreference to a group that matched nothing and one to a group that did not participate at all:

Example:

(q?)b\1

This pattern matches "b" because the optional q? matched nothing.

In contrast:

(q)?b\1

This pattern fails to match "b" because the group (q) did not participate in the match at all.

In most regex flavors, a backreference to a non-participating group causes the match to fail. However, in JavaScript, backreferences to non-participating groups match an empty string.
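Python's re module is one of the flavors in which a backreference to a non-participating group fails, so it can be used to see the difference. The string "qbq" is an extra test case added for illustration.

```python
import re

# The group participates and captures an empty string,
# so \1 matches the empty string too.
empty_capture = re.fullmatch(r"(q?)b\1", "b")
print(empty_capture is not None)    # True

# The whole group is skipped, so \1 has nothing to refer to
# and the match fails in Python.
skipped_group = re.fullmatch(r"(q)?b\1", "b")
print(skipped_group)                # None

# When the group does participate, both patterns behave the same.
participating = re.fullmatch(r"(q)?b\1", "qbq")
print(participating is not None)    # True
```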


Forward References and Invalid References

Some modern regex flavors, like .NET, Java, and Perl, allow forward references. A forward reference is a backreference to a group that appears later in the regex.

Example:

(\2two|(one))+

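This pattern matches "oneonetwo". On the first iteration the forward reference \2 fails because group 2 has not captured anything yet, so the second alternative matches "one" and stores it. On the second iteration \2 holds "one", and \2two matches "onetwo".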

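In most flavors, referencing a group that doesn’t exist, as in (one)\7, results in an error. JavaScript is an exception: without the u flag it reads such an escape as an octal escape or a literal digit, and with the u flag it reports a syntax error.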
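Python's re module is not among the flavors that allow forward references, so it is a convenient way to see the error case. The helper compiles below is a hypothetical name introduced for this sketch; it only reports whether re.compile accepts a pattern.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Forward reference: \2 appears before group 2 is opened.
print(compiles(r"(\2two|(one))+"))   # False

# Reference to a group that does not exist.
print(compiles(r"(one)\7"))          # False

# Ordinary backreference to an earlier group.
print(compiles(r"(one)\1"))          # True
```

To try the "oneonetwo" example itself, use one of the flavors that support forward references.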


Repetition and Backreferences

The regex engine doesn’t permanently substitute backreferences in the regex. Instead, it uses the most recent value captured by the group.

Example:

([abc]+)=\1

This pattern matches "cab=cab".

In contrast:

([abc])+\1

This pattern does not match "cab". Each iteration of the group overwrites the previous capture, so after consuming "c", "a", and "b" the group holds only "b", and no second "b" follows. The same pattern does match "cabb".
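Both patterns can be tried in Python's re module to see which text the group holds when the backreference is evaluated. The string "cabb" is an extra test case added here for illustration.

```python
import re

# The quantifier is inside the group, so the group captures "cab".
whole = re.fullmatch(r"([abc]+)=\1", "cab=cab")
print(whole.group(1))                 # cab

# The quantifier is outside the group, so each iteration
# overwrites the capture and only the last character is kept.
no_match = re.search(r"([abc])+\1", "cab")
print(no_match)                       # None

# With a doubled final letter, \1 finds the "b" it needs.
last_only = re.search(r"([abc])+\1", "cabb")
print(last_only.group(0), last_only.group(1))   # cabb b
```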


Useful Example: Checking for Doubled Words

You can use the following regex to find doubled words in a text:

\b(\w+)\s+\1\b

In your text editor, replace the doubled word with \1 to remove the duplicate.

Example:

  • Input: "the the cat"

  • Output: "the cat"
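The word boundaries matter more than they may appear to. The sketch below, in Python's re module, compares the pattern with and without \b on a sample sentence made up for this purpose; the names loose and strict are illustrative.

```python
import re

loose = r"(\w+)\s+\1"         # no word boundaries
strict = r"\b(\w+)\s+\1\b"    # whole words only

text = "This is the the cat"

# Without \b, the "is" inside "This" pairs up with the word "is".
loose_result = re.sub(loose, r"\1", text)
print(loose_result)    # This the cat

# With \b, only the genuinely doubled word is collapsed.
strict_result = re.sub(strict, r"\1", text)
print(strict_result)   # This is the cat
```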


Limitations

  • Round brackets cannot be used for grouping inside character classes, where they are literal characters. For example:

[(a)b]

This pattern matches the literal characters "a", "b", "(", and ")".

  • Backreferences also cannot be used inside character classes. Depending on the flavor, \1 inside a character class is an error, a literal "1", or an octal escape sequence.

Example:

(a)[\1b]

In flavors that read \1 as an octal escape, such as JavaScript and Python, this pattern matches "a" followed by either \x01 or "b".
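Both limitations can be observed in Python's re module, which treats numeric escapes inside a character class as character codes. The test strings are invented for illustration.

```python
import re

# Inside a character class, round brackets are ordinary characters.
literals = re.findall(r"[(a)b]", "(ab)c")
print(literals)                            # ['(', 'a', 'b', ')']

# Inside a character class, \1 is the character with code 1,
# not a reference to group 1.
pattern = re.compile(r"(a)[\1b]")
print(bool(pattern.fullmatch("ab")))       # True
print(bool(pattern.fullmatch("a\x01")))    # True
print(bool(pattern.fullmatch("aa")))       # False
```

The last line shows that the class does not repeat what the group captured: "aa" would match if \1 were a backreference.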

Grouping with round brackets allows you to:

  • Apply operators to entire groups of tokens.

  • Create backreferences for reuse in the regex or replacement text.

Use non-capturing groups (?: ... ) to avoid creating unnecessary backreferences and improve performance. Be mindful of the limitations and differences in behavior across various regex flavors.
