JavaScript, webdev

RegEx 101

A regular expression (regex for short) is a string of text that defines a pattern used to match, locate, and manage text. It’s an important tool in a wide variety of computing applications, from programming languages like JS, Java, and Perl to text-processing tools like grep, sed, and vim.

Here are a few helpers to refresh your memory when you need some ‘simple’ regex to do the job.

Characters

Characters      Legend                                                    Example    Sample Match
[abc], [a-c]    Matches any one of the given characters or range          abc[abc]   abca, abcb, abcc
[^abc], [^a-c]  Matches any one character not in the given set or range   abc[^abc]  abcd, abce, abc1
.               Any character except a line break                         bc.        bca, bcd, bc1, bc.
\d              Any digit (equivalent to [0-9])                           c\d        c1, c2, c3
\D              Any non-digit (equivalent to [^0-9])                      c\D        ca, c., c*
\w              Any word character (equivalent to [A-Za-z0-9_])           a\w        aa, a1, a_
\W              Any non-word character (equivalent to [^A-Za-z0-9_])      a\W        a), a$, a?
\s              Any whitespace character (space, tab, line break, etc.)   a\s        a followed by a space
\S              Any non-whitespace character                              a\S        aa
\t              Matches a horizontal tab                                  T\tab      T, a tab, then ab
\r              Matches a carriage return                                 AB\r\nCD   AB and CD on separate lines (CRLF)
\n              Matches a linefeed                                        AB\r\nCD   AB and CD on separate lines (CRLF)
\               Escapes a special character or starts a special sequence  \d         0, 1
x|y             Matches either “x” or “y”                                 a|b        a, b
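
To see a few of these character classes in action, here is a minimal JavaScript sketch. The sample strings and variable names are made up for illustration and are not part of the table above.

```javascript
// Hypothetical sample string, chosen only to exercise each class.
const text = "cat c4t c_t c t";

// With the g flag, match() returns every match as an array.
const anyChar = text.match(/c.t/g);    // . accepts any character
const digit = text.match(/c\dt/g);     // \d accepts only 0-9
const wordChar = text.match(/c\wt/g);  // \w accepts letters, digits, _
const nonWord = text.match(/c\Wt/g);   // \W accepts everything \w rejects
const space = text.match(/c\st/g);     // \s accepts whitespace

// A character set and its negation.
const inSet = "abca abcd".match(/abc[abc]/g);
const notInSet = "abca abcd".match(/abc[^abc]/g);

// A backslash makes the dot literal; | picks either side.
const literalDot = "3.14 3x14".match(/3\.14/g);
const either = "a b c".match(/a|b/g);

console.log({ anyChar, digit, wordChar, nonWord, space });
console.log({ inSet, notInSet, literalDot, either });
```

Here digit holds only "c4t", wordChar holds "cat", "c4t" and "c_t", and both nonWord and space hold "c t", since the space is the only character that \w rejects.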

Assertions

Characters  Legend                                                        Example     Sample Match
^           Start of string, or start of line in multiline mode           ^abc.*      abc, abcd, abcde
$           End of string, or end of line in multiline mode               .*xyz$      xyz, wxyz, abcdxyz
\b          Word boundary: between a word and a non-word character        My.*\bpie   My apple pie
\B          Non-word boundary: any position that is not a \b              c.*\Bcat    copycat
x(?=y)      Lookahead: matches “x” only if followed by “y”                \d+(?=€)    98 in “$1 = 0.98€”
x(?!y)      Negative lookahead: matches “x” only if not followed by “y”   \d+\b(?!€)  1 and 0 in “$1 = 0.98€”
(?<=y)x     Lookbehind: matches “x” only if preceded by “y”               (?<=\d)\d   8 in “$1 = 0.98€”
(?<!y)x     Negative lookbehind: matches “x” only if not preceded by “y”  (?<!\d)\d   1, 0 and 9 in “$1 = 0.98€”
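
The lookaround rows are easier to follow when you run them. The sketch below applies the four lookaround patterns to the same price string, then shows \b, \B and the anchors. It assumes a JavaScript engine with lookbehind support (any recent Node.js or browser); the variable names are illustrative.

```javascript
const price = "$1 = 0.98€";

// Lookarounds check the neighbouring text without consuming it.
const beforeEuro = price.match(/\d+(?=€)/g);       // digits right before €
const notBeforeEuro = price.match(/\d+\b(?!€)/g);  // whole numbers not before €
const afterDigit = price.match(/(?<=\d)\d/g);      // digit preceded by a digit
const notAfterDigit = price.match(/(?<!\d)\d/g);   // digit not preceded by one

// \b needs a word edge, \B needs the middle of a word.
const wholeWord = "cat copycat".search(/\bcat\b/); // index of the standalone "cat"
const insideWord = "cat copycat".search(/\Bcat/);  // index of "cat" inside "copycat"

// ^ and $ anchor to each line only when the m flag is set.
const lines = "abc\nabd\nabcd";
const perLine = lines.match(/^abc.*$/gm);
const wholeString = /^abc.*$/.test(lines);

console.log({ beforeEuro, notBeforeEuro, afterDigit, notAfterDigit });
console.log({ wholeWord, insideWord, perLine, wholeString });
```

Without the m flag the last test is false: the dot stops at the first line break, so $ never reaches the end of the string.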

Groups

Characters  Legend                                                         Example            Sample Match
(x)         Capturing group: matches x and remembers the match             A(nt|pple)         Ant (remembers “nt”)
(?<name>x)  Named capturing group: matches x and stores it under “name”    A(?<m>nt|pple)     Ant (m = “nt”)
(?:x)       Non-capturing group: matches x without remembering the match   A(?:nt|pple)       Ant
\n          Backreference to the text matched by the n-th capturing group  (\d)\+(\d)=\2\+\1  5+6=6+5
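
Here is a short JavaScript sketch of how the group types differ in what they hand back. The variable names are illustrative. Note that the plus signs in the backreference pattern are escaped so they match a literal +.

```javascript
// Capturing group: the remembered text lands at index 1 of the result.
const plain = "Ant".match(/A(nt|pple)/);

// Named group: the same text is also available under groups.m.
const named = "Apple".match(/A(?<m>nt|pple)/);

// Non-capturing group: groups the alternatives but stores nothing extra.
const nonCapturing = "Ant".match(/A(?:nt|pple)/);

// Backreferences: \1 and \2 must repeat what groups 1 and 2 matched.
const swapped = /(\d)\+(\d)=\2\+\1/;
const isSwapped = swapped.test("5+6=6+5");
const isNotSwapped = swapped.test("5+6=5+6");

console.log(plain[0], plain[1]);     // full match, then group 1
console.log(named.groups.m);         // text stored under the name m
console.log(nonCapturing.length);    // only the full match is stored
console.log(isSwapped, isNotSwapped);
```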

Quantifiers

Characters  Legend                                                                      Example   Sample Match
x*          Matches the preceding item “x” 0 or more times                              a*        a, aa, aaa
x+          Matches the preceding item “x” 1 or more times, equivalent to {1,}          a+        aa, aaa, aaaa
x?          Matches the preceding item “x” 0 or 1 time                                  ab?       a, ab
x{n}        Matches the preceding item “x” exactly n times (n = non-negative integer)   ab{5}c    abbbbbc
x{n,}       Matches the preceding item “x” at least n times (n = non-negative integer)  ab{2,}c   abbc, abbbc, abbbbc
x{n,m}      Matches the preceding item “x” at least n and at most m times (n ≤ m)       ab{2,3}c  abbc, abbbc
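
One way to compare the quantifiers is to run each against the same list of candidates. This is only a sketch: the candidates array and the matching helper are hypothetical names, and the patterns are anchored with ^ and $ so that each string must match in full.

```javascript
// Candidate strings with zero to four b's between a and c.
const candidates = ["ac", "abc", "abbc", "abbbc", "abbbbc"];

// Hypothetical helper: keep the candidates that a pattern accepts.
const matching = (re) => candidates.filter((s) => re.test(s));

const zeroOrMore = matching(/^ab*c$/);     // any number of b's, including none
const oneOrMore = matching(/^ab+c$/);      // at least one b
const optional = matching(/^ab?c$/);       // zero or one b
const exactlyTwo = matching(/^ab{2}c$/);   // exactly two b's
const twoOrMore = matching(/^ab{2,}c$/);   // two or more b's
const twoToThree = matching(/^ab{2,3}c$/); // two or three b's

console.log({ zeroOrMore, oneOrMore, optional });
console.log({ exactlyTwo, twoOrMore, twoToThree });
```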

NOTE

By default, quantifiers are greedy: they match as much of the string as possible.
Adding ? after a quantifier makes it non-greedy (lazy): it matches as few characters as possible.

For example, given the test string 12345, \d+? matches only 1, but \d+ matches the entire string 12345.
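
The same difference in JavaScript, as a small sketch. The HTML-like string is a made-up example of a case where greedy matching tends to surprise people.

```javascript
// Greedy takes every digit; lazy stops after the first one.
const greedyDigits = "12345".match(/\d+/)[0];
const lazyDigits = "12345".match(/\d+?/)[0];

// Hypothetical markup: greedy .* runs on to the LAST closing tag,
// lazy .*? stops at the FIRST one.
const html = "<b>one</b> and <b>two</b>";
const greedyTag = html.match(/<b>.*<\/b>/)[0];
const lazyTag = html.match(/<b>.*?<\/b>/)[0];

console.log(greedyDigits, lazyDigits);
console.log(greedyTag);
console.log(lazyTag);
```

Here greedyTag is the whole string, while lazyTag is just "<b>one</b>".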

Flags

Flags are placed after the closing delimiter of the regular expression and modify how it behaves.

For example, /a/ matches only a, but adding the i flag (/a/i) matches both a and A.

Flag  Legend
d     Generate indices for substring matches
g     Global search
i     Case-insensitive search
m     Multi-line search
s     Allows . to match newline characters
u     Treats a pattern as a sequence of Unicode code points
y     Perform a sticky search that matches starting at the current position in the target string
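
A quick JavaScript sketch with one line or two per flag. The sample strings are illustrative, and the d flag is assumed to be available, which needs a fairly recent engine (Node.js 16 or later, to my knowledge).

```javascript
// i: ignore case.
const withoutI = /a/.test("A");
const withI = /a/i.test("A");

// g: return every match instead of only the first.
const allMatches = "a A a".match(/a/gi);

// m: ^ and $ also match at line breaks.
const perLine = "one\ntwo".match(/^\w+$/gm);

// s: the dot also matches a newline.
const dotPlain = /a.b/.test("a\nb");
const dotAll = /a.b/s.test("a\nb");

// u: an emoji counts as one code point instead of two code units.
const emojiPlain = /^.$/.test("😀");
const emojiUnicode = /^.$/u.test("😀");

// y: match only at lastIndex, with no scanning ahead.
const sticky = /foo/y;
sticky.lastIndex = 3;
const stickyAtThree = sticky.test("barfoo");
sticky.lastIndex = 0;
const stickyAtZero = sticky.test("barfoo");

// d: expose [start, end] index pairs for the match and each group.
const indices = /b(c)/d.exec("abc").indices;

console.log({ withoutI, withI, allMatches, perLine });
console.log({ dotPlain, dotAll, emojiPlain, emojiUnicode });
console.log({ stickyAtThree, stickyAtZero }, indices[0], indices[1]);
```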

If you wish to test your knowledge:

Have a good weekend! 👊🏽
