Table of Contents

  1. Regular Expression Tutorial

  2. Different Regular Expression Engines

  3. Literal Characters

  4. Special Characters

  5. Non-Printable Characters

  6. First Look at How a Regex Engine Works Internally

  7. Character Classes or Character Sets

  8. The Dot Matches (Almost) Any Character

  9. Start of String and End of String Anchors

  10. Word Boundaries

  11. Alternation with the Vertical Bar or Pipe Symbol

  12. Optional Items

  13. Repetition with Star and Plus

  14. Grouping with Round Brackets

  15. Named Capturing Groups

  16. Unicode Regular Expressions

  17. Regex Matching Modes

  18. Possessive Quantifiers

  19. Understanding Atomic Grouping in Regular Expressions

  20. Understanding Lookahead and Lookbehind in Regular Expressions (Lookaround)

  21. Testing Multiple Conditions on the Same Part of a String with Lookaround

  22. Understanding the \G Anchor in Regular Expressions

  23. Using If-Then-Else Conditionals in Regular Expressions

  24. XML Schema Character Classes and Subtraction Explained

  25. Understanding POSIX Bracket Expressions in Regular Expressions

  26. Adding Comments to Regular Expressions: Making Your Regex More Readable

  27. Free-Spacing Mode in Regular Expressions: Improving Readability

Welcome to this comprehensive guide on Regular Expressions (Regex). This tutorial is designed to equip you with the skills to craft powerful, time-saving regular expressions from scratch. We'll begin with foundational concepts, ensuring you can follow along even if you're new to the world of regex. However, this isn't just a basic guide; we'll delve deeper into how regex engines operate internally, giving you insights that will help you troubleshoot and optimize your patterns effectively.

What Are Regular Expressions? — Understanding the Basics

At its core, a regular expression is a pattern used to match sequences of text. The term originates from formal language theory, but for practical purposes, it refers to text-matching rules you can use across various applications and programming languages.

You'll often encounter the abbreviations regex and regexp. This tutorial uses "regex" because it pluralizes naturally as "regexes." Throughout the tutorial, regex patterns are displayed within guillemets: «pattern». This notation clearly separates the regex from surrounding text or punctuation.

For example, the simple pattern «regex» is a valid regex that matches the literal text "regex". The term match refers to the segment of text that the regex engine identifies as conforming to the specified pattern. Matches are shown in double quotation marks, such as "match".
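As a minimal sketch of the literal pattern described above, here it is in Python's standard `re` module. The sample sentence is an assumption chosen for illustration; the tutorial itself is not tied to any one language.

```python
import re

# The regex «regex» consists only of literal characters.
pattern = re.compile(r"regex")

# Hypothetical sample string, chosen for illustration.
text = "A regex is a pattern."

# search() scans the string and returns the first match, or None.
m = pattern.search(text)

print(m.group(0))  # the matched text: regex
print(m.span())    # start and end offsets of the match: (2, 7)
```

The match object reports both the matched text and where it was found in the string.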

A First Look at a Practical Regex Example

Let's consider a more complex pattern:

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b

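This regex describes an email address pattern. Because its character classes list only uppercase letters, it needs to be applied with case-insensitive matching to find lowercase addresses. Breaking it down: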

  • \b: Denotes a word boundary, so the match starts at the beginning of a word rather than in the middle of one.

  • [A-Z0-9._%+-]+: Matches one or more letters, digits, dots, underscores, percent signs, plus signs, or hyphens. This is the part before the at-sign.

  • @: The literal at-sign.

  • [A-Z0-9.-]+: Matches the domain name: one or more letters, digits, dots, or hyphens.

  • \.: A literal dot. The backslash escapes the dot, which would otherwise match almost any character.

  • [A-Z]{2,4}: Matches the top-level domain (TLD) consisting of 2 to 4 letters. Longer TLDs, such as .museum, will not match.

  • \b: Ensures the match ends at a word boundary.
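The breakdown above can be tried out directly. The sketch below uses Python's `re` module, and the sample string is an assumption for illustration. It also shows why case sensitivity matters: the character classes list only `A-Z`, so a lowercase address is found only when case-insensitive matching is switched on.

```python
import re

# The pattern from the text, written as a raw string so the
# backslashes reach the regex engine unchanged.
EMAIL = r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"

# Hypothetical sample string.
sample = "Contact: jane.doe@example.com"

# Case-sensitive search: the lowercase letters are not in [A-Z],
# so no match is found.
strict = re.search(EMAIL, sample)
print(strict)  # None

# Case-insensitive search: [A-Z] now also matches a-z.
relaxed = re.search(EMAIL, sample, re.IGNORECASE)
print(relaxed.group(0))  # jane.doe@example.com
```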

With this pattern, you can:

  • Search text files to identify email addresses.

  • Validate whether a given string resembles a legitimate email address format.
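Both uses can be sketched in a few lines of Python. The helper names `find_emails` and `looks_like_email` and the sample text are hypothetical, introduced only for this illustration. The check confirms that a string resembles an email address under this particular pattern, not that the address exists.

```python
import re

# The pattern from the text, compiled once with case-insensitive matching.
EMAIL = re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", re.IGNORECASE)


def find_emails(text: str) -> list[str]:
    """Return every substring of text that the pattern matches."""
    return EMAIL.findall(text)


def looks_like_email(candidate: str) -> bool:
    """Return True only if the whole string matches the pattern."""
    return EMAIL.fullmatch(candidate) is not None


# Hypothetical sample text for the search use case.
text = "Write to alice@example.com or bob.smith@mail.example.org today."
print(find_emails(text))
# ['alice@example.com', 'bob.smith@mail.example.org']

print(looks_like_email("alice@example.com"))    # True
print(looks_like_email("not an email"))         # False
print(looks_like_email("user@example.museum"))  # False: TLD longer than 4 letters
```

Searching uses `findall`, which scans for matches anywhere in the text, while validation uses `fullmatch`, which requires the pattern to cover the entire string.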

In this tutorial, we'll refer to the text being processed as a string. This term is commonly used by programmers to describe a sequence of characters. Strings will be denoted using regular double quotes, such as "example string."

Regex patterns can be applied to any data that a programming language or software application can access, making them an incredibly versatile tool in text processing and data validation tasks.

Next, we'll explore how to construct regex patterns step by step, starting from simple character matches to more advanced techniques like capturing groups and lookaheads. Let's dive in!
