Previously, we looked at character classes, which allow you to specify "this or that" for a single character. Another similar concept in regular expressions is alternation, also called alternating characters.

Alternation allows you to specify a "this" or "that" character or group of characters. You create it with the pipe symbol, |. For example:

/this|that/g

This pattern will match the characters "this" or "that":

Regex: /this|that/g

Because we have the g (global) flag, this pattern matches every occurrence of "this" and "that" in the input, not just the first one.
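To see the effect of the g flag in code, here is a minimal sketch in JavaScript. The sample sentence is made up for illustration and is not the input used in this lesson.

```javascript
// Hypothetical sample input, chosen only for illustration.
const sampleText = "I like this book more than that one, but this will do.";

// With the g flag, match() returns every occurrence of either alternative.
const allMatches = sampleText.match(/this|that/g);

// Without the g flag, match() stops at the first occurrence.
const firstMatch = sampleText.match(/this|that/);

console.log(allMatches);    // ["this", "that", "this"]
console.log(firstMatch[0]); // "this"
```

Notice that "than" is not matched: alternation picks between the whole alternatives "this" and "that", not between individual letters.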

How does this differ from character classes?

The difference is:

Important

A character class only allows you to provide characters: [abc]. It does not allow you to provide other expressions like groups and quantifiers.

For example, let's say you put the \w metacharacter and the + quantifier (which normally means "one or more") inside a character class:

/[\w+]/

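Passing plus + in a character class will treat it as a “normal character” and not a quantifier. Most special characters, such as +, *, ?, ., (, ) and |, are treated as normal characters in a character class. The exceptions are the backslash (which is why \w still means "word character" here), the closing bracket ], a ^ at the start, and a - between two characters.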

Regex: /[\w+]/g

In the example above, every character except "$" and "&" is matched. These are the only characters in the input that are neither \w (word characters) nor the literal + in the character class.
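The difference between + inside and outside a character class is easy to check in JavaScript. This is a small sketch with a made-up input string (an assumption, not the lesson's original input):

```javascript
// Hypothetical input containing word characters, plus signs, "$" and "&".
const classInput = "ab+c $5 & d++";

// Inside the brackets, + is a literal plus sign, so each match is ONE character
// that is either a word character or "+".
const classMatches = classInput.match(/[\w+]/g);

// Outside the brackets, + is a quantifier, so each match is a RUN of word characters.
const quantifierMatches = classInput.match(/\w+/g);

console.log(classMatches);      // ["a", "b", "+", "c", "5", "d", "+", "+"]
console.log(quantifierMatches); // ["ab", "c", "5", "d"]
```

In both cases "$", "&" and the spaces are skipped, but only the character class picks up the plus signs.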

Also, if you attempt to put a group in a character class, like this:

/[(ha)]ing/

It also won't work, as the parentheses would be treated as literal parenthesis characters, not as a group. So, this character class would mean “opening parenthesis (” or “h” or “a” or “closing parenthesis )”.

Regex: /[(ha)]ing/g

From the matches, you see "(ing", "aing", and ")ing", because they begin with a character from the character class followed by "ing".
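Here is a short JavaScript sketch of the same behavior. The input string is hypothetical, and the second pattern, which uses a real (non-capturing) group, is added only for contrast:

```javascript
// Hypothetical input, chosen so that each member of the class appears before "ing".
const groupInput = "(ing hing aing )ing haing";

// Character class: ONE character out of "(", "h", "a", ")" followed by "ing".
const classResult = groupInput.match(/[(ha)]ing/g);

// A real group: the two characters "ha" followed by "ing".
const groupResult = groupInput.match(/(?:ha)ing/g);

console.log(classResult); // ["(ing", "hing", "aing", ")ing", "aing"]
console.log(groupResult); // ["haing"]
```

The last "aing" in the first result comes from inside "haing": the class can only consume the single "a" right before "ing", never "ha" as a unit.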

This is how alternation is different.

Important

Unlike character classes, alternation allows you to combine other special characters in your "this or that" pattern.

With alternation, you can provide expressions like characters, groups, quantifiers or any other valid regular expression. For example:

/th(is|at)/

Here we have a group and in it, we use alternation to indicate "is" or "at". So the pattern here means:

"t" followed by "h" followed by "is" or "at":

Regex: /th(is|at)/g

The orange highlights signify the groups that are captured: the group holds whichever alternative matched, "is" or "at". Remember you can turn off capturing with a question mark followed by a colon at the start of the group: (?:...)
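If you want to inspect the captured groups in code, the following JavaScript sketch may help. The input strings are made up for illustration. It also shows why the group matters: without it, the pipe splits the whole pattern into "this" or "at".

```javascript
// Hypothetical input for illustration.
const groupedInput = "this and that";

// Each matchAll() result holds the full match at index 0 and the captured group at index 1.
const captured = [...groupedInput.matchAll(/th(is|at)/g)].map((m) => [m[0], m[1]]);

// With (?:...) nothing is captured, so each result only holds the full match.
const resultLengths = [...groupedInput.matchAll(/th(?:is|at)/g)].map((m) => m.length);

// Without the group, the alternatives are "this" and "at", so "at" alone matches.
const ungrouped = "cat".match(/this|at/);
const grouped = "cat".match(/th(is|at)/);

console.log(captured);      // [["this", "is"], ["that", "at"]]
console.log(resultLengths); // [1, 1]
console.log(ungrouped[0]);  // "at"
console.log(grouped);       // null
```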

Let's see another example. Let's say we had a string like:

"I grabbed a refreshing can of Coca-Cola, or as some people call it, Coke, while others prefer to refer to it as Coca Cola"

And you want to match "Coca-Cola", "Coke" and "Coca Cola". Here are a few ways you can use alternation for this:

Regex: /Coca-Cola|Coke|Coca Cola/ig
The i flag makes the match case-insensitive.

Here we basically said "Coca-Cola" or "Coke" or "Coca Cola". Very easy to read, but we can make this shorter.


Regex: /Co(ca-Cola|ke|ca Cola)/ig

Very slightly shorter, but what we're doing here is "Co" followed by:

  • "ca-Cola" or
  • "ke" or
  • "ca Cola"

And of course, the groups are captured.

Now let's make this shorter.


Regex: /Co(ca(-| )Cola|ke)/ig

Here we say, "Co" followed by:

  • "ca" followed by
    • "-" or " " (a space)
    • followed by "Cola"
  • or "ke"

Though shorter, this pattern is more complex and harder to read. Like I mentioned at the beginning of this course, there are many patterns you can write to achieve the same thing, but always keep readability in mind.

In this pattern, we have a nested group (-| ), which is an alternation between "-" and a space. Combined with the "ca" before it, it matches "ca-" or "ca ". This way, we can match Coca-Cola and Coca Cola. While this does the job, I'd rather use our earlier solution:

/Coca-Cola|Coke|Coca Cola/ig

This is easier to read and makes it pretty clear what we're trying to achieve.
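To convince yourself that the three patterns from this section find the same text, you can run them side by side. This is a minimal JavaScript sketch using the sentence from above:

```javascript
const colaText =
  "I grabbed a refreshing can of Coca-Cola, or as some people call it, Coke, while others prefer to refer to it as Coca Cola";

// The three patterns from this section, from most readable to shortest.
const colaPatterns = [
  /Coca-Cola|Coke|Coca Cola/gi,
  /Co(ca-Cola|ke|ca Cola)/gi,
  /Co(ca(-| )Cola|ke)/gi,
];

// With the g flag, match() returns only the full matches and ignores captured groups.
const colaResults = colaPatterns.map((pattern) => colaText.match(pattern));

for (const result of colaResults) {
  console.log(result); // each pattern finds: "Coca-Cola", "Coke", "Coca Cola"
}
```

Since the results are identical, the choice between the patterns comes down to readability, not correctness.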


Try to solve these exercises on your own. You can share your solutions on Twitter and tag me @iamdillion.

To share your solution on Twitter, click on Edit. After entering your solution, click on the checkmark, then Copy Regex URL. Then, share.

Match all filenames with their extensions here.

Regex: /your-regex/g


Match all domains in this string.

Regex: /your-regex/g

Now let's move on to a more interesting concept known as lookaheads.