RegEx Metacharacters - Special Characters
Metacharacters are characters that carry a special meaning in regular expressions instead of matching themselves literally. They are used to describe patterns in text. Here are some of the most commonly used metacharacters in Python (the examples use the text 'Hello 123 World 456 Hello World'):
Metacharacter | Description | Example | Output |
. | Matches any character except a newline | re.search('. World', text) | <re.Match object; span=(8, 15), match='3 World'> |
^ | Matches the start of a string | re.search('^Hello', text) | <re.Match object; span=(0, 5), match='Hello'> |
$ | Matches the end of a string (or just before a trailing newline) | re.search('Hello World$', text) | <re.Match object; span=(20, 31), match='Hello World'> |
[] | Matches any single character listed within the square brackets | re.search('[0123456789]', text) | <re.Match object; span=(6, 7), match='1'> |
[^ ] | Matches any single character not listed within the square brackets | re.search('[^0123456789]', text) | <re.Match object; span=(0, 1), match='H'> |
( ) | Groups the expression within the parentheses and captures the text it matches | re.search('(Hello) World', text) | <re.Match object; span=(20, 31), match='Hello World'> |
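To try the rows above yourself, here is a minimal sketch that runs them against the same sample text. The variable names are only illustrative, and `[0-9]` is used as a shorter spelling of `[0123456789]`. The last two lines show what the parentheses add: `group(1)` returns only the part matched inside them.

```python
import re

text = 'Hello 123 World 456 Hello World'

# '.' stands for any single character except a newline.
dot = re.search('. World', text)
print(dot.span(), dot.group())            # (8, 15) 3 World

# '^' and '$' anchor the pattern to the start and end of the string.
start = re.search('^Hello', text)
end = re.search('Hello World$', text)
print(start.span(), end.span())           # (0, 5) (20, 31)

# A character class matches one character from (or, with '^', outside) the set.
digit = re.search('[0-9]', text)          # same set as [0123456789]
non_digit = re.search('[^0-9]', text)
print(digit.group(), non_digit.group())   # 1 H

# Parentheses group part of the pattern and capture what it matched.
grouped = re.search('(Hello) World', text)
print(grouped.group(0))                   # Hello World (the whole match)
print(grouped.group(1))                   # Hello (the first group only)
```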
Special sequences
Special sequences are backslash-escaped (\) sequences that have a special meaning within regular expressions. Some of the most popular ones include:
Sequence | Description | Example | Output |
\d | Matches any decimal digit | re.search(r'\d+', text) | <re.Match object; span=(6, 9), match='123'> |
\D | Matches any non-digit character | re.search(r'\D+', text) | <re.Match object; span=(0, 6), match='Hello '> |
\s | Matches any whitespace character | re.search(r'\s+', text) | <re.Match object; span=(5, 6), match=' '> |
\S | Matches any non-whitespace character | re.search(r'\S+', text) | <re.Match object; span=(0, 5), match='Hello'> |
\w | Matches any word character (letter, digit, or underscore) | re.search(r'\w+', text) | <re.Match object; span=(0, 5), match='Hello'> |
\W | Matches any non-word character | re.search(r'\W+', text) | <re.Match object; span=(5, 6), match=' '> |
\b | Matches a word boundary | re.search(r'\bHello\b', text) | <re.Match object; span=(0, 5), match='Hello'> |
\B | Matches a position that is not a word boundary | re.search(r'\BHello\B', text) | None |
\A | Matches only at the start of the string | re.search(r'\AHello', text) | <re.Match object; span=(0, 5), match='Hello'> |
\Z | Matches only at the end of the string | re.search(r'World\Z', text) | <re.Match object; span=(26, 31), match='World'> |
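The sketch below shows a few of these sequences in action on the same sample text. It writes every pattern as a raw string (the `r` prefix), so Python leaves the backslash alone and hands it to the regex engine. It also uses `re.findall` instead of `re.search` in two places, simply to list every match rather than only the first one. The `\Bell\B` pattern is an extra illustration, not one of the rows above.

```python
import re

text = 'Hello 123 World 456 Hello World'

# The r prefix keeps the backslash intact for the regex engine.
numbers = re.findall(r'\d+', text)
print(numbers)                          # ['123', '456']

# \w+ collects runs of word characters (letters, digits, underscore).
tokens = re.findall(r'\w+', text)
print(tokens)                           # ['Hello', '123', 'World', '456', 'Hello', 'World']

# \b needs a word edge at that position; \B needs the opposite.
whole_word = re.search(r'\bHello\b', text)
inner = re.search(r'\Bell\B', text)     # 'ell' sits strictly inside 'Hello'
print(whole_word.span(), inner.span())  # (0, 5) (1, 4)

# \A and \Z pin the match to the very start and very end of the string.
first = re.search(r'\AHello', text)
last = re.search(r'World\Z', text)
print(first.span(), last.span())        # (0, 5) (26, 31)
```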
Quantifiers
Quantifiers in Python allow you to specify the number of times a character or pattern should be matched:
Quantifier | Description | Example | Output |
* | Matches zero or more occurrences of the preceding character or pattern | re.search('Hel*o', text) | <re.Match object; span=(0, 5), match='Hello'> |
+ | Matches one or more occurrences of the preceding character or pattern | re.search('He+llo', text) | <re.Match object; span=(0, 5), match='Hello'> |
? | Matches zero or one occurrence of the preceding character or pattern (makes it optional) | re.search('Helll?o', text) | <re.Match object; span=(0, 5), match='Hello'> |
{m,n} | Matches from m to n occurrences of the preceding character or pattern | re.search(r'Hello{1,3} World', text) | <re.Match object; span=(20, 31), match='Hello World'> |
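A point that is easy to miss in the table: a quantifier applies only to the item right before it, which is a single character unless you wrap a longer pattern in parentheses. The sketch below illustrates this on the sample text. The patterns are made up for this illustration and do not appear in the table. It also shows that, by default, a quantifier takes as many repetitions as its limit allows.

```python
import re

text = 'Hello 123 World 456 Hello World'

# By default a quantifier takes as many repetitions as it is allowed to.
two_digits = re.search(r'\d{1,2}', text)
double_l = re.search(r'l+', text)
print(two_digits.group(), double_l.group())   # 12 ll

# Without parentheses, {2} applies only to the space before it, so this
# asks for a word followed by two spaces, which the sample text never has.
no_group = re.search(r'\w+ {2}', text)
print(no_group)                               # None

# With parentheses, the whole "word plus space" group is repeated twice.
with_group = re.search(r'(\w+ ){2}', text)
print(repr(with_group.group()))               # 'Hello 123 '
```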
💡
It’s not mandatory to memorize all of the special characters. The main goal is to understand the principles of how regular expressions work, their potential use cases, and how to compose and interpret them.
You can keep this page as a reference and come back to it whenever needed.