Introduction to Regular Expressions using JavaScript - Part 4

Published on 10th of September 2008. Copyright Tavs Dokkedahl.

Special characters

As we have seen, a number of characters, like + and ., have a special meaning when used inside a regex. Below is a list of all of them. We haven't used all of them yet, but don't worry about that for now.

^ $ [ ] . ? * + | { } = ( ) : \ / !

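To match one of these characters literally you need to escape it, so that it is not interpreted as it normally would be. This is done by putting a backslash (\) in front of it. Strictly speaking =, : and ! are only special directly after (? and also match themselves when left unescaped, but escaping them does no harm in the patterns used here.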

If you want to literally match a colon (:) you would write

// Match a string which contains a ':'
var rgx = /\:/;
// Will match
'23:56'

The same goes for any of the other special characters. If you want to match the backslash itself you simply write

// Match a string which contains a '\'
var rgx = /\\/;
// Will match (the backslash is doubled because \ is also
// the escape character inside a JavaScript string literal)
'c:\\windows'
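To see what escaping changes in practice, here is a minimal sketch comparing an unescaped and an escaped dot. It assumes the standard test method of a regex object, which returns true or false; the variable names loose, strict and dotResults are made up for this illustration.

```javascript
// Unescaped: the dot matches any single character
var loose = /1.5/;
// Escaped: the dot only matches a literal '.'
var strict = /1\.5/;

var dotResults = [
  loose.test('1.5'),   // true
  loose.test('125'),   // true, the '.' matched the '2'
  strict.test('1.5'),  // true
  strict.test('125')   // false
];
```

The second result is the one that usually surprises: forgetting the backslash makes the pattern accept more than intended rather than less.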

Grouping and alternation

Different components of the regex can be grouped together. For this we use parentheses, ( and ).

Grouping is useful in a variety of situations. If we want to match the strings 'select' and 'selection' we can write

// Match 'select' optionally ending in 'ion'
var rgx = /select(ion)?$/;
// Will match
'select'
'selection'
// but not
'selected'

Now the ? (zero or one) applies to the previous item, which is the whole expression inside the parentheses.
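The effect of the group is easiest to see next to the same pattern without it. The following is a small sketch, assuming the standard test method; the names ungrouped, grouped and groupResults are only used for this illustration.

```javascript
// Without a group the ? applies only to the 'n'
var ungrouped = /selection?$/;
// With a group the ? applies to all of 'ion'
var grouped = /select(ion)?$/;

var groupResults = [
  ungrouped.test('selectio'),  // true, only the 'n' is optional
  ungrouped.test('select'),    // false
  grouped.test('select'),      // true, the whole 'ion' is optional
  grouped.test('selectio')     // false
];
```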

With alternation we can extend the match further. Alternation is written with the pipe (|) and is equivalent to a logical OR.

// Match 'select' optionally ending in 'ion' or 'ed'
var rgx = /select(ion|ed)?$/;
// Will match
'select'
'selection'
'selected'
// but not
'selects'
'selector'
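The parentheses also decide how far the | reaches. As a rough sketch (again assuming the standard test method, with variable names invented for this example), compare an alternation with and without a group:

```javascript
// Without a group the | splits the whole pattern in two:
// either 'selection' anywhere, or 'ed' at the end of the string
var wholeSplit = /selection|ed$/;
// With a group the | only chooses between 'ion' and 'ed'
var groupedAlt = /select(ion|ed)$/;

var altResults = [
  wholeSplit.test('red'),         // true, the string ends in 'ed'
  wholeSplit.test('selections'),  // true, it contains 'selection'
  groupedAlt.test('red'),         // false
  groupedAlt.test('selections'),  // false
  groupedAlt.test('selected')     // true
];
```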

Groups can be nested to any level, but try not to overdo nesting as the pattern quickly becomes very hard to read.
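As an illustration of one level of nesting, the sketch below extends the 'select' pattern with an optional plural 's' that is only allowed after 'ion'. The pattern and the names nested and nestedResults are assumptions made for this example, not part of the earlier listings.

```javascript
// Outer group: the optional ending 'ion' or 'ed'
// Inner group: an optional 's', only allowed after 'ion'
var nested = /select(ion(s)?|ed)?$/;

var nestedResults = [
  nested.test('select'),      // true
  nested.test('selection'),   // true
  nested.test('selections'),  // true
  nested.test('selected'),    // true
  nested.test('selecteds')    // false, 's' is not allowed after 'ed'
];
```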

Grouping can be used for repeating parts of a regex. If we want to match the pattern 'abc_' 2 or 3 times we can write

// Match a string which contains 'abc_' 2 or 3 times in a row
var rgx = /(abc_){2,3}/;
// Will match
'abc_abc_'
'abc_abc_abc_'
'abc_abc_abc_abc_'
// but not
'abc_'

Note how the string 'abc_abc_abc_abc_' is also matched, as it contains the pattern we are looking for. The ',3' in {2,3} does limit a single match to at most 3 repetitions, but because the pattern is not anchored the regex will still match any string containing 2 or more repetitions of 'abc_'.
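If the upper limit should apply to the whole string, one option is to anchor the pattern at both ends. The sketch below assumes ^ and $ mark the start and end of the string and uses the standard test and match methods; the variable names are invented for this example.

```javascript
var unanchored = /(abc_){2,3}/;
// Anchored: the whole string must be 2 or 3 repetitions of 'abc_'
var anchored = /^(abc_){2,3}$/;

var repeatResults = [
  unanchored.test('abc_abc_abc_abc_'),  // true, it contains 3 repetitions
  anchored.test('abc_abc_abc_abc_'),    // false, 4 repetitions is too many
  anchored.test('abc_abc_abc_'),        // true
  anchored.test('abc_')                 // false, 1 repetition is too few
];

// The part of the string the unanchored regex actually matched
var matchedPart = 'abc_abc_abc_abc_'.match(unanchored)[0]; // 'abc_abc_abc_'
```

The last line shows that the unanchored regex stops after three repetitions and simply ignores the fourth.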

Stay tuned for more content...
