Special Sequence Regular Expressions
In this class, we discuss special sequences in regular expressions.
The reader should have prior knowledge of regular expressions.
We discuss the special symbols used in regular expressions.
Special Sequence
\b Symbol
Let us take an example to understand the special symbol ‘\b’.
import re
string="hello how are you hello"
x=re.findall(r'\bhel',string)
print(x)
Output:
['hel', 'hel']
In the above example, the symbol ‘\b’ is written before the expression.
The symbol ‘\b’ matches a word boundary, so here it checks for the pattern at the beginning of a word. It matches a position, not a character, so only ‘hel’ appears in the output.
We used the character ‘r’ before the expression because the expression is written as a string.
In an ordinary Python string, ‘\b’ is treated as the backspace character. We use ‘r’ (a raw string) to avoid that interpretation.
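As a small sketch of why the ‘r’ prefix matters, the snippet below compares the same pattern written with and without it. The variable names are only illustrative.

```python
import re

text = "hello how are you hello"

# Without the r prefix, Python turns '\b' into one backspace character.
plain_pattern = '\bhel'
# With the r prefix, the backslash and the 'b' both reach the regex engine.
raw_pattern = r'\bhel'

print(len(plain_pattern), len(raw_pattern))  # 4 5

plain_matches = re.findall(plain_pattern, text)
raw_matches = re.findall(raw_pattern, text)
print(plain_matches)  # [] because the text contains no backspace character
print(raw_matches)    # ['hel', 'hel']
```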
To check the pattern at the end of a word, we write the symbol ‘\b’ after the expression.
The below example shows the program using ‘\b’ after the expression.
string="hello how are you hello"
x=re.findall(r'llo\b',string)
print(x)
Output:
['llo', 'llo']
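The two uses of ‘\b’ can also be combined. The sketch below, with a sample string chosen only for illustration, places ‘\b’ on both sides of the expression so that only whole words match.

```python
import re

text = "hello othello hello"

# \b on both sides: 'hello' must start and end at a word boundary.
whole_words = re.findall(r'\bhello\b', text)
# Without \b, the 'hello' inside 'othello' is matched as well.
anywhere = re.findall(r'hello', text)

print(whole_words)    # ['hello', 'hello']
print(len(anywhere))  # 3
```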
\B Symbol
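The symbol ‘\B’ is the opposite of ‘\b’: it matches only where there is no word boundary.
Used before the expression, it checks that the pattern is not at the beginning of a word. Similarly, used after the expression, it checks that the pattern is not at the end of a word.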
The below example shows the program using the symbol ‘\B’.
string="hello bhell how are you hello"
x=re.findall(r'\Bhel',string)
print(x)
Output:
['hel']
string="hello bhell how are you hello"
x=re.findall(r'hel\B',string)
print(x)
Output:
['hel', 'hel', 'hel']
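To see which occurrences are matched, the sketch below uses re.finditer on the same string and prints the position of each match. It is only meant to illustrate the two outputs above.

```python
import re

text = "hello bhell how are you hello"

# \B before the expression: 'hel' must not start a word.
inside_spans = [m.span() for m in re.finditer(r'\Bhel', text)]
# \B after the expression: 'hel' must not end a word.
not_end_starts = [m.start() for m in re.finditer(r'hel\B', text)]

print(inside_spans)    # [(7, 10)] which is the 'hel' inside 'bhell'
print(not_end_starts)  # [0, 7, 24]
```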
\d Symbol
The symbol ‘\d’ matches any single digit in the string.
The below example shows the program using the symbol ‘\d’.
string="hello bhell how are you its 9oclock we go at 10"
x=re.findall(r'\d',string)
print(x)
Output:
['9', '1', '0']
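Because ‘\d’ matches one digit at a time, the number 10 is split into '1' and '0'. A minimal sketch, assuming we want whole numbers instead, adds the ‘+’ quantifier:

```python
import re

text = "hello bhell how are you its 9oclock we go at 10"

# \d+ matches one or more consecutive digits as a single result.
numbers = re.findall(r'\d+', text)
print(numbers)  # ['9', '10']
```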
\D Symbol
The symbol ‘\D’ matches any character other than a digit.
The below example shows the program using the symbol ‘\D’.
string="hello bhell how are you its 9oclock we go at 10"
x=re.findall(r'\D',string)
print(x)
Output:
['h', 'e', 'l', 'l', 'o', ' ', 'b', 'h', 'e', 'l', 'l', ' ', 'h', 'o', 'w', ' ', 'a', 'r', 'e', ' ', 'y', 'o', 'u', ' ', 'i', 't', 's', ' ', 'o', 'c', 'l', 'o', 'c', 'k', ' ', 'w', 'e', ' ', 'g', 'o', ' ', 'a', 't', ' ']
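One possible use of ‘\D’, shown here only as a sketch, is to remove everything except the digits with re.sub:

```python
import re

text = "its 9oclock we go at 10"

# Replace every non-digit character with an empty string.
digits_only = re.sub(r'\D', '', text)
print(digits_only)  # 910
```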
\s Symbol
The symbol ‘\s’ matches a single whitespace character, such as a space, newline, or tab.
The below example shows the program using the symbol ‘\s’.
string="hello bhell how are you its 9oclock we go at 10"
x=re.findall(r'\s',string)
print(x)
Output:
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
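The string above contains only spaces. The sketch below uses a sample string with a tab and a newline to show that ‘\s’ matches those as well, and then splits the text on any run of whitespace.

```python
import re

text = "hello\tworld\nhow are you"

# \s matches the tab, the newline, and the two spaces.
spaces = re.findall(r'\s', text)
# \s+ treats any run of whitespace as one separator.
parts = re.split(r'\s+', text)

print(spaces)  # ['\t', '\n', ' ', ' ']
print(parts)   # ['hello', 'world', 'how', 'are', 'you']
```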
\S Symbol
The symbol ‘\S’ matches any non-whitespace character.
The below example shows the program using the symbol ‘\S’.
string="hello bhell how are you its 9oclock we go at 10"
x=re.findall(r'\S',string)
print(x)
Output:
['h', 'e', 'l', 'l', 'o', 'b', 'h', 'e', 'l', 'l', 'h', 'o', 'w', 'a', 'r', 'e', 'y', 'o', 'u', 'i', 't', 's', '9', 'o', 'c', 'l', 'o', 'c', 'k', 'w', 'e', 'g', 'o', 'a', 't', '1', '0']
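As with ‘\d’, adding the ‘+’ quantifier groups the characters together. A minimal sketch that collects each run of non-whitespace characters:

```python
import re

text = "its 9oclock we go at 10"

# \S+ matches one or more consecutive non-whitespace characters.
tokens = re.findall(r'\S+', text)
print(tokens)  # ['its', '9oclock', 'we', 'go', 'at', '10']
```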
\w Symbol
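The symbol ‘\w’ matches a “word” character: a letter, digit, or underscore. For ASCII text this is the same as [a-zA-Z0-9_]. In Python 3, string patterns also match Unicode letters and digits by default, unless the re.ASCII flag is used.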
The below example shows the program using the symbol ‘\w’.
string="hello bhell how are you its 9_oclock we go at 10 $"
x=re.findall(r'\w',string)
print(x)
Output:
['h', 'e', 'l', 'l', 'o', 'b', 'h', 'e', 'l', 'l', 'h', 'o', 'w', 'a', 'r', 'e', 'y', 'o', 'u', 'i', 't', 's', '9', '_', 'o', 'c', 'l', 'o', 'c', 'k', 'w', 'e', 'g', 'o', 'a', 't', '1', '0']
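The sketch below illustrates two points about ‘\w’: the effect of the re.ASCII flag on a non-ASCII letter, and the use of ‘\w+’ to collect whole words. The sample strings are chosen only for illustration.

```python
import re

text = "na\u00efve_1"  # the word "naive" written with a non-ASCII letter, plus "_1"

# By default, \w in a str pattern also matches Unicode letters.
unicode_chars = re.findall(r'\w', text)
# With re.ASCII, \w is limited to [a-zA-Z0-9_].
ascii_chars = re.findall(r'\w', text, re.ASCII)

print(len(unicode_chars))  # 7
print(ascii_chars)         # ['n', 'a', 'v', 'e', '_', '1']

# \w+ groups consecutive word characters into whole words.
words = re.findall(r'\w+', "its 9_oclock we go at 10 $")
print(words)  # ['its', '9_oclock', 'we', 'go', 'at', '10']
```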
\W Symbol
The symbol ‘\W’ matches any non-word character.
The below example shows the program using the symbol ‘\W’.
string="hello bhell how are you its 9_oclock we go at 10 $"
x=re.findall(r'\W',string)
print(x)
Output:
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '$']
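One way ‘\W’ might be used, sketched here with a sample sentence, is to split text on punctuation and spaces. Empty strings left over at the edges are filtered out.

```python
import re

text = "hello, world! how_are you?"

# Split on runs of non-word characters and drop empty pieces.
pieces = [p for p in re.split(r'\W+', text) if p]
print(pieces)  # ['hello', 'world', 'how_are', 'you']
```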
\A Symbol
The symbol ‘\A’ matches the pattern at the start of the string.
The below example shows the program using the symbol ‘\A’.
string="hello bhell how are you its 9_oclock we go at 10 hello"
x=re.findall(r'\Ahello',string)
print(x)
Output:
['hello']
\Z Symbol
The symbol ‘\Z’ matches the pattern at the end of the string.
The below example shows the program using the symbol ‘\Z’.
string="hello bhell how are you its 9_oclock we go at 10 hello"
x=re.findall(r'hello\Z',string)
print(x)
Output:
['hello']
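The symbols ‘\A’ and ‘\Z’ look similar to ‘^’ and ‘$’, but they always refer to the whole string. The sketch below uses a sample two-line string to show the difference.

```python
import re

text = "hello world\nhello again\n"

# With re.MULTILINE, ^ matches at the start of every line.
line_starts = re.findall(r'^hello', text, re.MULTILINE)
# \A matches only at the start of the whole string, even with re.MULTILINE.
string_start = re.findall(r'\Ahello', text, re.MULTILINE)

# $ also matches just before a newline at the end of the string.
dollar_end = re.findall(r'again$', text)
# \Z matches only at the very end, and here the string ends with a newline.
strict_end = re.findall(r'again\Z', text)

print(line_starts)   # ['hello', 'hello']
print(string_start)  # ['hello']
print(dollar_end)    # ['again']
print(strict_end)    # []
```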