Here is the video version for this topic. You can read the written version which comes after the video section.

So far, we've seen how to create regular expressions, what flags are, and how regular expressions work under the hood. But all we've seen are simple string expressions like /code/ and /name/.

In this lesson, and the lessons after this, we'll learn the different advanced or complex patterns we can create in regex.

For this lesson, we'll look at character classes.

A character class in regex, also called a character set, allows you to specify a set of characters, for which the pattern can match one of those characters that apply to a string.

You create a character class using square brackets [...]:

/[characters]/

In these square brackets, you specify the different characters for which one of them could match a string.

Let's look at a character class for letters.

For example, let's say you want to match an a or an e before β€œpple” in a string, you can do this:

/[ae]pple/g

In this pattern, you have the character class which contains "a" and "e", then after the character class, you have "pple". I applied the g flag so we can get all matches.

Let's say we have a string such as β€œIs it spelt epple or apple?”:

Regex/[ae]pple/g
Input
Match

As you can see, "epple" and "apple" are matches for the pattern.

For this pattern, the regex engine looks for an "a" or an "e". Upon finding either of these characters, the regex engine checks if the character is followed by "pple", before it considers the substring a match.

This pattern will match "epple", which begins with "e", and "apple" which begins with "a".

Note that this regular expression will not match β€œaepple”:

Regex/[ae]pple/g
Input
Match

You see, "aepple" is not a match--only the "epple" part is a match. This is because a character class specifies a "this OR that" value not a "this and that". It doesn't match β€œae”, it only matches β€œa” or β€œe”.

You can create character classes for digits too. For example:

/[246]-people/g

This character class in this pattern specifies 2 or 4 or 6. Remember, not 2 followed by 4 followed by 6. This pattern means, 2 or 4 or 6, followed by "-", then "people".

Let's say we have a string like β€œHe says there are 4-people, 6-people and 46-people”:

Regex/[246]-people/g
Input
Match

You see that "4-people" and "6-people" are matched.

You also see that the string "46-people" is not matched.

You can create character classes for symbols too. For example:

/[&-]s/g

The character class in this pattern specifies the ampersand "&"" or hyphen symbol "-". And the whole pattern means "&" or "-" followed by the letter s.

Let's say we have a string like β€œHe used 5&s, 6%s and 7-s”:

Regex/[&-]s/g
Input
Match

You see that "&s" and "-s" are matched.

"%s" is not matched because "%"" is not part of the character class. But if we add it to our character class, "%s" becomes a match:

Regex/[&%-]s/g
Input
Match

In character classes, you can also specify a range.

Let's say we want to match "bing", "sing", "ring", "king", and "ping". You can use a character set in a regex pattern like this:

/[bsrkp]ing/g

What if you want to match "ALL four letter words that end in 'ing'", then your character set would look like this:

/[abcdefghijklmnopqrstuvwxyz]ing/g

While this works, there's a way to improve it--making it shorter and more concise. You do this with ranges. With a range, instead of typing a, b, c, d, up all to z, you can simply specify the range of a to z like this:

/[a-z]ing/g

This range in the character class will match any word that starts with a letter between a to z, followed by "ing".

Let's say we have a string like β€œHe is a king, he likes to sing. He rings the bell. He pings the phone. He uses bing. Oh what a thing. He doesn't know the word ling”.

Feel my rhymes? 🌝

Regex/[a-z]ing/g
Input
Match

For the matches, you see that all letters between a to z that are followed by ing, are matched. So using ranges makes your pattern shorter and concise.

you don't always have to use full ranges like a to z. You can also have shorter ranges like:

/[a-m]ing/g

This pattern will match a substring which starts with a letter between "a" and "m" followed by "ing".

Here's another example:

/[r-w]ing/g

This pattern will match a substring which starts with a letter between "r" and "w" followed by "ing".

You can also combine multiple ranges

/[b-jr-x]ses/g

The ranges we have looked at so far are lowercase ranges. You can have uppercase ranges too:

/[A-Z]ool/g

And if you do not want case strictness, you can simply add the case insensitive flag i as we saw in the previous videos:

/[a-z]est/i

The character class in this pattern will match any letter between a and z whether in lowercase or uppercase.

You can also apply ranges to digits. For example 0-9 will match all digits between 0 to 9:

/[0-9]/g

[2-5] will match all digits between 2 and 5 and so on.

Unlike letters and digits, symbols do not have ranges like $-#, so you have to directly specify the symbols as we've seen earlier.

You can mix letter and digit ranges in the character set:

/[a-z0-9]/g

You can also add symbols to the character set:

/[a-z0-9&-]ing/g

Don't forget, this character set will match any letter between "a" and "z", OR (emphasis on OR), any number between 0 and 9, OR the ampersand symbol "&", OR the hyphen symbol "-", followed by "ing".


A character class as we have seen allows us to match either-or characters in a set. But we can also create a class to match either-or characters that are not in a set. This is where a Negated Character Class comes in.