
Welcome to CodeNameJessica


  • Entries: 47
  • Comments: 0
  • Views: 21260

Entries in this blog

Regular Expressions Tutorial Table of Contents Regular Expression Tutorial pg 1 Word Boundaries pg 10 Understanding Atomic Grouping in Regular Expressions pg 19 Different Regular Expression Engines pg 2 Alternation with the Vertical Bar or Pipeline Symbol pg 11 Understanding Lookahead and Lookbehind in Regular Expressions (Lookaround) pg 20 Literal Characters pg 3 Optional Items pg 12 Testing Multiple Conditions on the Same Part of a String with Lookaround pg 21 Special Characters pg 4 Repetitio
Table of Contents Regular Expression Tutorial Different Regular Expression Engines Literal Characters Special Characters Non-Printable Characters First Look at How a Regex Engine Works Internally Character Classes or Character Sets The Dot Matches (Almost) Any Character Start of String and End of String Anchors Word Boundaries Alternation with the Vertical Bar or Pipe Symbol Optional Items Repetition with Star and Plus Grouping with Round Brackets Named Capturing Groups Unicode Re
A regular expression engine is a software component that processes regex patterns, attempting to match them against a given string. Typically, you won’t interact directly with the engine. Instead, it operates behind the scenes within applications and programming languages, which invoke the engine as needed to apply the appropriate regex patterns to your data or files. Variations Across Regex Engines As is often the case in software development, not all regex engines are created equal. Different
The simplest regular expressions consist of literal characters. A literal character is a character that matches itself. For example, the regex «a» will match the first occurrence of the character "a" in a string. Consider the string "Jack is a boy": this pattern will match the "a" after the "J". It’s important to note that the regex engine doesn’t care where the match occurs within a word unless instructed otherwise. If you want to match entire words, you’ll need to use word boundaries, a concep
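The "Jack is a boy" example from this excerpt can be checked with a minimal sketch. It assumes Python's built-in `re` module, which the excerpt itself does not name; the variable names are illustrative only.

```python
import re

text = "Jack is a boy"

# search() returns only the leftmost match of the literal pattern "a".
match = re.search(r"a", text)

# The first "a" is the one right after the "J", at index 1,
# not the standalone word "a" later in the sentence.
first_a_index = match.start()
print(first_a_index, repr(match.group()))
```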
To go beyond matching literal text, regex engines reserve certain characters for special functions. These are known as metacharacters. The following characters have special meanings in most regex flavors discussed in this tutorial: [ \ ^ $ . | ? * + ( ) If you need to use any of these characters as literals in your regex, you must escape them with a backslash (\). For instance, to match "1+1=2", you would write the regex as: 1\+1=2 Without the backslash, the plus sign would be interpreted as a q
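A short sketch of the escaping rule described in this excerpt, assuming Python's `re` module (the excerpt is flavor-neutral). `re.escape` is a real helper in that module; the variable names are illustrative.

```python
import re

text = "1+1=2"

# Escaped plus: matches the literal text "1+1=2".
escaped_found = re.search(r"1\+1=2", text) is not None

# Unescaped plus: "1+" means "one or more 1s", so the pattern
# looks for text such as "11=2" and does not match "1+1=2".
unescaped_found = re.search(r"1+1=2", text) is not None
matches_111 = re.search(r"1+1=2", "111=2") is not None

# re.escape() adds the backslashes for you when the literal comes from data.
auto_escaped = re.escape(text)
auto_found = re.fullmatch(auto_escaped, text) is not None

print(escaped_found, unescaped_found, matches_111, auto_found)
```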
Regular expressions can also match non-printable characters using special sequences. Here are some common examples: \t: Tab character (ASCII 0x09) \r: Carriage return (ASCII 0x0D) \n: Line feed (ASCII 0x0A) \a: Bell (ASCII 0x07) \e: Escape (ASCII 0x1B) \f: Form feed (ASCII 0x0C) \v: Vertical tab (ASCII 0x0B) Keep in mind that Windows text files use "\r\n" to terminate lines, while UNIX text files use "\n". Hexadecimal and Unicode Characters You can include any character in your regex usin
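The line-ending difference mentioned in this excerpt is a common practical use of these escapes. The following is a minimal sketch assuming Python's `re` module; the sample strings are made up for illustration.

```python
import re

# A string mixing Windows (\r\n) and UNIX (\n) line endings.
mixed = "first\r\nsecond\nthird"

# "\r?\n" matches an optional carriage return followed by a line feed,
# so it handles both conventions.
lines = re.split(r"\r?\n", mixed)

# "\t" in the pattern matches a literal tab character in the subject.
has_tab = re.search(r"\t", "name\tvalue") is not None

print(lines, has_tab)
```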
Understanding how a regex engine processes patterns can significantly improve your ability to write efficient and accurate regular expressions. By learning the internal mechanics, you’ll be better equipped to troubleshoot and refine your regex patterns, reducing frustration and guesswork when tackling complex tasks. Types of Regex Engines There are two primary types of regex engines: Text-Directed Engines (also known as DFA - Deterministic Finite Automaton) Regex-Directed Engines (also known as
Character classes, also known as character sets, allow you to define a set of characters that a regex engine should match at a specific position in the text. To create a character class, place the desired characters between square brackets. For instance, to match either an a or an e, use the pattern [ae]. This can be particularly useful when dealing with variations in spelling, such as in the regex gr[ae]y, which will match both "gray" and "grey." Key Points About Character Classes: A character
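The `gr[ae]y` pattern from this excerpt can be tried out as follows. This is a sketch assuming Python's `re` module; the test sentence is invented for illustration.

```python
import re

# A character class matches exactly one character from the set,
# so "griy" (wrong letter) and "graey" (two letters) are not matched.
sample = "gray grey griy graey"
spellings = re.findall(r"gr[ae]y", sample)

print(spellings)
```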
The dot, or period, is one of the most versatile and commonly used metacharacters in regular expressions. However, it is also one of the most misused. The dot matches any single character except for newline characters. In most regex flavors discussed in this tutorial, the dot does not match newlines by default. This behavior stems from the early days of regex when tools were line-based and processed text line by line. In such cases, the text would not contain newline characters, so the dot could
In previous sections, we explored how literal characters and character classes operate in regular expressions. These match specific characters in a string. Anchors, however, are different. They match positions in the string rather than characters, allowing you to "anchor" your regex to the start or end of a string or line. Using the Caret (^) Anchor The caret (^) matches the position before the first character of the string. For example: ^a applied to "abc" matches "a." ^b does not match "abc"
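The two caret examples in this excerpt behave as described when run through Python's `re` module (an assumption; the excerpt does not name a flavor).

```python
import re

# ^a: "a" must appear at the very start of the string.
caret_a = re.search(r"^a", "abc")

# ^b: there is a "b" in "abc", but not at the start, so there is no match.
caret_b = re.search(r"^b", "abc")

print(caret_a.group(), caret_b)
```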
The \b metacharacter is an anchor, similar to the caret (^) and dollar sign ($). It matches a zero-length position called a word boundary. Word boundaries allow you to perform “whole word” searches in a string using patterns like \bword\b. What is a Word Boundary? A word boundary occurs at three possible positions in a string: Before the first character if it is a word character. After the last character if it is a word character. Between two characters where one is a word character and the ot
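A "whole word" search of the `\bword\b` form can be sketched like this, assuming Python's `re` module. The word "cat" and the sample sentence are illustrative choices, not taken from the tutorial.

```python
import re

sample = "cat concat cats bobcat cat."

# \b requires a word/non-word transition on each side, so "cat" embedded in
# "concat", "cats", or "bobcat" is skipped. The final "cat" is followed by
# ".", which is not a word character, so it counts.
starts = [m.start() for m in re.finditer(r"\bcat\b", sample)]

print(starts)
```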
Previously, we explored how character classes allow you to match a single character out of several possible options. Alternation, on the other hand, enables you to match one of several possible regular expressions. The vertical bar or pipe symbol (|) is used for alternation. It acts as an OR operator within a regex. Basic Syntax To search for either "cat" or "dog," use the pattern: cat|dog You can add more options as needed: cat|dog|mouse|fish The regex engine will match any of these options. Fo
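The `cat|dog` alternation from this excerpt, tried against an invented sentence. This sketch assumes Python's `re` module.

```python
import re

sample = "About cats and dogs"

# Each alternative is tried at every position; findall() collects
# every non-overlapping match from left to right.
pets = re.findall(r"cat|dog", sample)

# More alternatives can be appended with further pipes.
more_pets = re.findall(r"cat|dog|mouse|fish", "a fish chased a mouse")

print(pets, more_pets)
```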
The question mark (?) makes the preceding token in a regular expression optional. This means that the regex engine will try to match the token if it is present, but it won’t fail if the token is absent. Basic Usage For example: colou?r This pattern matches both "colour" and "color." The u is optional due to the question mark. You can make multiple tokens optional by grouping them with round brackets and placing a question mark after the closing bracket: Nov(ember)? This regex matches both "Nov"
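Both optional-item patterns from this excerpt can be checked with a small sketch, assuming Python's `re` module. The sample strings are illustrative.

```python
import re

# u? allows zero or one "u", so "colouur" (two u's) is not matched.
spellings = re.findall(r"colou?r", "color colour colouur")

# (ember)? makes the whole group optional. finditer() is used here because
# findall() would return the group's contents rather than the full match.
months = [m.group() for m in re.finditer(r"Nov(ember)?", "Nov November")]

print(spellings, months)
```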
In addition to the question mark, regex provides two more repetition operators: the asterisk (*) and the plus (+). Basic Usage The * (star) matches the preceding token zero or more times. The + (plus) matches the preceding token one or more times. For example: <[A-Za-z][A-Za-z0-9]*> This pattern matches HTML tags without attributes: <[A-Za-z] matches the first letter. [A-Za-z0-9]* matches zero or more alphanumeric characters after the first letter. This regex will match tags like: <
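The tag pattern from this excerpt can be exercised as below. This is a sketch assuming Python's `re` module; the HTML snippet is made up for illustration.

```python
import re

html = "<b>bold</b> <h1>Title</h1> <1> <a href='x'>link</a>"

# First character must be a letter; then zero or more letters or digits;
# then ">" immediately. Closing tags, "<1>", and tags with attributes
# therefore do not match.
tags = re.findall(r"<[A-Za-z][A-Za-z0-9]*>", html)

print(tags)
```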
In regular expressions, round brackets (()) are used for grouping. Grouping allows you to apply operators to multiple tokens at once. For example, you can make an entire group optional or repeat the entire group using repetition operators. Basic Usage For example: Set(Value)? This pattern matches: "Set" "SetValue" The round brackets group "Value", and the question mark makes it optional. Note: Square brackets ([]) define character classes. Curly braces ({}) specify repetition counts. Only ro
Named capturing groups allow you to assign names to capturing groups, making it easier to reference them in complex regular expressions. This feature is available in most modern regular expression engines. Why Use Named Capturing Groups? In traditional regular expressions, capturing groups are referenced by their numbers (e.g., \1, \2). As the number of groups increases, it becomes harder to manage and understand which group corresponds to which part of the match. Named capturing groups solve th
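As a rough illustration of why names help, here is a sketch using Python's `re` module, where the named-group syntax is `(?P<name>...)`. Other flavors spell this differently, and the date pattern and group names below are hypothetical examples rather than anything from the tutorial.

```python
import re

# Three named groups instead of anonymous groups 1, 2 and 3.
date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

m = date_pattern.search("Released on 2024-03-15.")

# Groups can be read by name, or all at once as a dictionary.
year = m.group("year")
parts = m.groupdict()

print(year, parts)
```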
Unicode regular expressions are essential for working with text in multiple languages and character sets. As the world becomes more interconnected, supporting Unicode is increasingly important for ensuring that software can handle diverse text inputs. What is Unicode? Unicode is a standardized character set that encompasses characters and glyphs from all human languages, both living and dead. It aims to provide a consistent way to represent characters from different languages, eliminating the ne
Most regular expression engines discussed in this tutorial support the following four matching modes:
Modifier   Description
/i         Makes the regex case-insensitive.
/s         Enables "single-line mode," making the dot (.) match newlines.
/m         Enables "multi-line mode," allowing caret (^) and dollar ($) to match at the start and end of each line.
/x         Enables "free-spacing mode," where whitespace is ignored, and # can be used for comments.
Specifying Modes Inside The Regular Expression
You can specify these mode
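The four modes can be tried out in Python, where they are passed as flags to the `re` functions rather than written as `/i`-style modifiers. This mapping to `re.IGNORECASE`, `re.DOTALL`, `re.MULTILINE` and `re.VERBOSE` is specific to Python and is an assumption about which flavor you are using. The sample strings are illustrative.

```python
import re

# /i  -> re.IGNORECASE
case_insensitive = re.search(r"hello", "HELLO", re.IGNORECASE) is not None

# /s  -> re.DOTALL: the dot also matches a newline.
dot_default = re.search(r"a.b", "a\nb") is not None
dot_single_line = re.search(r"a.b", "a\nb", re.DOTALL) is not None

# /m  -> re.MULTILINE: ^ matches at the start of every line.
starts_default = re.findall(r"^\w+", "one\ntwo")
starts_multi_line = re.findall(r"^\w+", "one\ntwo", re.MULTILINE)

# /x  -> re.VERBOSE: whitespace in the pattern is ignored, # starts a comment.
free_spacing = re.search(r"""
    \d+   # one or more digits
    \s*   # optional whitespace
    kg    # the literal unit
""", "weighs 12 kg", re.VERBOSE)

# A mode can also be switched on inside the pattern itself.
inline_flag = re.search(r"(?i)hello", "HELLO") is not None
```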
When working with repetition operators (also known as quantifiers) in regular expressions, it’s essential to understand the difference between greedy, lazy, and possessive quantifiers. Greedy and lazy quantifiers affect the order in which the regex engine tries to match permutations of the pattern. However, both types still allow the regex engine to backtrack through the pattern to find a match. Possessive quantifiers take a different approach—they do not allow backtracking once a match is made,
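The difference between greedy and lazy quantifiers is easiest to see side by side. The sketch below assumes Python's `re` module and an invented HTML fragment. Possessive quantifiers are left out because Python's `re` only accepts them from version 3.11 onward.

```python
import re

html = "<em>text</em>"

# Greedy: .+ first grabs as much as possible, then backtracks only as far
# as needed to find a ">", which is the last one in the string.
greedy = re.search(r"<.+>", html).group()

# Lazy: .+? first grabs as little as possible, then expands only as far
# as needed, so it stops at the first ">".
lazy = re.search(r"<.+?>", html).group()

print(greedy, lazy)
```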
Atomic grouping is a powerful tool in regular expressions that helps optimize pattern matching by preventing unnecessary backtracking. Once the regex engine exits an atomic group, it discards all backtracking points created within that group, making it more efficient. Unlike regular groups, atomic groups are non-capturing, and their syntax is (?>group). Lookaround assertions like (?=...) and (?!...) are inherently atomic as well. Atomic grouping is supported by many popular r
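The effect of discarding backtracking points can be demonstrated with a small sketch. It assumes Python's `re` module. Because the native `(?>...)` syntax is only accepted by Python 3.11 and later, the sketch instead uses a well-known emulation: a lookahead containing a capturing group, followed by a backreference. This relies on the point made above that lookaround is itself atomic. The pattern and test strings are illustrative.

```python
import re

# Ordinary group: after "bc" is tried and the final "c" fails, the engine
# backtracks into the group, tries "b" instead, and the match succeeds.
plain = re.search(r"a(?:bc|b)c", "abc")

# Emulated atomic group: the lookahead settles on "bc", \1 consumes it,
# and the engine cannot go back to try "b". The final "c" has nothing
# left to match, so the overall match fails.
atomic_like = re.search(r"a(?=(bc|b))\1c", "abc")

# With one more "c" in the subject, the atomic version does match.
atomic_like_ok = re.search(r"a(?=(bc|b))\1c", "abcc")

print(plain.group(), atomic_like, atomic_like_ok.group())
```

On Python 3.11 or later, `a(?>bc|b)c` should behave the same way as the emulated pattern.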
Lookahead and lookbehind, often referred to collectively as "lookaround," are powerful constructs introduced in Perl 5 and supported by most modern regular expression engines. They are also known as zero-width assertions because they don’t consume characters in the input string. Instead, they simply assert whether a certain condition is true at a given position without including the matched text in the overall match result. Lookaround constructs allow you to build more flexible and efficient reg
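The zero-width nature of lookaround shows up in what ends up in the match result. The sketch below assumes Python's `re` module; the patterns and sample strings are illustrative and not taken from the tutorial.

```python
import re

# Negative lookahead: a "q" that is NOT followed by "u".
q_without_u = re.search(r"q(?!u)", "Iraq")
q_without_u_in_quit = re.search(r"q(?!u)", "quit")

# Positive lookahead: a "q" that IS followed by "u". The "u" is checked
# but not consumed, so the match is just "q".
q_before_u = re.search(r"q(?=u)", "quit").group()

# Positive lookbehind: digits preceded by "$". The "$" is not in the match.
price = re.search(r"(?<=\$)\d+", "cost: $42").group()

print(q_without_u.group(), q_without_u_in_quit, q_before_u, price)
```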
In regular expressions, it’s common to need a match that satisfies multiple conditions simultaneously. This is where lookahead and lookbehind, collectively known as lookaround assertions, come in handy. These zero-width assertions allow the regex engine to test conditions without consuming characters in the string, making it possible to apply multiple requirements to the same portion of text. Why Lookaround Is Essential Let’s say you want to match a six-letter word that contains the sequence “ca
The \G anchor is a powerful tool in regular expressions, allowing matches to continue from the point where the previous match ended. It behaves similarly to the start-of-string anchor \A on the first match attempt, but its real utility shines when used in consecutive matches within the same string. How the \G Anchor Works The anchor \G matches the position immediately following the last successful match. During the initial match attempt, it behaves like \A, matching the start of the string. On s
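Python's `re` module does not support `\G`, so the sketch below only emulates the behavior described here: each attempt must start exactly where the previous match ended. It uses `Pattern.match(string, pos)`, which anchors the attempt at `pos`. The function name `contiguous_words` and the token pattern are hypothetical.

```python
import re

# One word followed by an optional single whitespace character.
token = re.compile(r"(\w+)\s?")

def contiguous_words(text):
    """Collect words from the start of text until the chain of matches breaks."""
    words = []
    pos = 0
    while pos < len(text):
        # match() only succeeds if the pattern matches starting exactly at pos,
        # which mimics \G (and \A on the first iteration, when pos is 0).
        m = token.match(text, pos)
        if m is None:
            break
        words.append(m.group(1))
        pos = m.end()
    return words

# The ";" breaks the chain, so "dd" is never reached.
print(contiguous_words("aa bb cc; dd"))
```

In flavors that do support `\G`, a pattern such as `\G(\w+)\s?` applied repeatedly would be expected to stop at the same point.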
Conditional logic isn’t limited to programming languages — many modern regular expression engines allow if-then-else conditionals. This feature lets you apply different matching patterns based on a condition. The syntax for conditionals is: (?(condition)then|else) If the condition is met, the then part is attempted. If the condition is not met, the else part is applied instead. You can omit the else part if it’s not needed. Conditional Syntax and How It Works The syntax for if-then-else conditio
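A small sketch of the `(?(condition)then|else)` form, assuming Python's `re` module, which accepts a group number as the condition. The pattern below is a hypothetical example: it accepts a number that is either bare or fully wrapped in parentheses, with the else part omitted.

```python
import re

# Group 1 optionally captures "(". The conditional (?(1)\)) then says:
# if group 1 took part in the match, require ")"; otherwise require nothing.
number = re.compile(r"(\()?\d+(?(1)\))")

candidates = ["12", "(12)", "(12", "12)"]
results = {s: number.fullmatch(s) is not None for s in candidates}

print(results)
```

`fullmatch()` is used so that unbalanced input is rejected outright instead of matching on a substring.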
XML Schema introduces unique character classes and features not commonly found in other regular expression flavors. These classes are particularly useful for validating XML names and values, making XML Schema regex syntax essential for working with XML data. Special Character Classes in XML Schema In addition to the six standard shorthand character classes (e.g., \d for digits, \w for word characters), XML Schema introduces four unique shorthand character classes designed specifically for XML na
