To read the article online, visit http://www.4GuysFromRolla.com/webtech/031302-1.shtml

# Formatting with Regular Expressions

By Scott Mitchell

Introduction
If you're familiar with regular expressions, you know that they can be used for pattern matching and replacement in strings. (If you are not familiar with regular expressions, you're missing out on a powerful technology that has a myriad of applications. Read An Introduction to Regular Expressions to learn more!) The true power of regular expressions lies in their ability to search for certain patterns and replace them. For example, say that you have the contents of an HTML page in a string, and you want to find every occurrence of a particular word and highlight it in the user's browser. This can be done, rather simply, with regular expressions; check out this FAQ and this demo to learn more.

Regular expression pattern matching can also be used to format strings. For example, perhaps in a database you have phone numbers stored as 10 consecutive digits, such as 8005553211. When displaying this in a Web page, though, you'd like to apply formatting so that the phone number appears as: (800) 555-3211. While this can be accomplished using string manipulation functions, you can perform the task more succinctly using regular expressions.

What are you Looking For and What Should it be Replaced With?
When performing this style of formatting with regular expressions you must know what it is you want to format and how. In the phone number example from above, we want to find runs of 10 consecutive digits and format each one so that we have an opening parenthesis followed by the first three digits, followed by a closing parenthesis, followed by a space, followed by the next three digits, followed by a hyphen, and finally followed by the remaining four digits.

Once you've identified what to look for, you can write your regular expression pattern. For the phone number example, the pattern would be:

 `(\d{3})(\d{3})(\d{4})`

which specifies that we are looking for 10 consecutive digits. You may wonder why in the world the pattern is not simply `\d{10}`. Keep in mind that to perform this style of formatting we have to specify how to replace the pattern (the 10 consecutive digits) with our desired replacement string. Note that above, when describing the desired format, I kept saying: "such-and-such character, then the first three digits from the pattern, then such-and-such character, then the next three digits from the pattern," and so on. The replacement string therefore needs a way to refer to individual pieces of the match, which a single `\d{10}` does not provide.

One of the nifty things regular expressions allow for is what are called back references. Back references allow you to refer back to the portion of the string matched by a part of the pattern. In the replacement string, back references take the form `$N`, where N specifies which back reference you're interested in: the first back reference in the pattern is accessed via `$1`, the second via `$2`, and so on. To specify that a particular portion of the pattern can be used as a back reference, simply surround that portion with parentheses in the pattern.

So, in our pattern we have the first three digits surrounded by parentheses (which we'll reference in our replacement string as `$1`), our next three digits surrounded by parentheses (which we'll reference in our replacement string as `$2`), and our final four digits surrounded by parentheses as well (which, not surprisingly, we'll reference in our replacement string as `$3`). That being said, our replacement string follows naturally, as:

 `($1) $2-$3`

[View a Live Demo!]

Examining the Code
The above description included no code for one very good reason: I didn't want to tie the concept to classic ASP or ASP.NET. Formatting using regular expressions as shown above will work with the languages used by either technology. (Note that for classic ASP you will need to have VBScript version 5.0 or higher installed; you can determine what version you have by following the instructions at: Determining the Server-Side Scripting Language and Version.) While the live demo accomplishes the task with VBScript via an ASP page, it can also be done using VB.NET (or C#) via an ASP.NET Web page.

Note that the idea here is to use a regular expression, specify its pattern as what you want to look for (using proper parenthesization to enable back referencing), and then use the `Replace` method to replace the pattern with the format string. Below you will find short code examples for doing this in both classic ASP (VBScript) and ASP.NET (VB.NET). The code examples assume that there is a string named `strUnformattedPN` that contains the unformatted phone number (10 consecutive digits, like 8005553121); when the code snippet completes, `strFormattedPN` contains the formatted phone number (that is, (800) 555-3121).

Using VBScript through a Classic ASP Page
```
'Create a regular expression object
Dim re
Set re = New RegExp

'Specify the pattern
re.Pattern = "(\d{3})(\d{3})(\d{4})"

'Use the replace method to perform the formatting
Dim strFormattedPN
strFormattedPN = re.Replace(strUnformattedPN, "($1) $2-$3")
```

Using VB.NET through an ASP.NET Page
```
'Perform the formatting by calling the Replace
'method of the Regex class
Dim strFormattedPN As String
strFormattedPN = Regex.Replace(strUnformattedPN, _
                               "(\d{3})(\d{3})(\d{4})", _
                               "($1) $2-$3")
```

Notice that, essentially, the two code snippets do exactly the same thing: they specify a pattern, search the contents of `strUnformattedPN`, replace the pattern with the format string, and assign the resulting string to `strFormattedPN`. Where the VBScript version has to create a `RegExp` object and set its `Pattern` property, the VB.NET version can do it all in one statement by using the static `Replace` method of the `Regex` class. (The `Regex` class can be found in the `System.Text.RegularExpressions` namespace.)
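To try the VB.NET approach outside of a Web page, the following is a minimal sketch written as a console program rather than an ASP.NET page. The module name `PhoneFormatDemo`, the helper `FormatPhoneNumber`, and the two constants are hypothetical names chosen for illustration; only the pattern, the replacement string, and the `Regex.Replace` call come from the discussion above. It also prints what each parenthesized group captured, to show what `$1`, `$2`, and `$3` stand for.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PhoneFormatDemo
    ' Pattern and replacement string from the article.
    Private Const PhonePattern As String = "(\d{3})(\d{3})(\d{4})"
    Private Const PhoneFormat As String = "($1) $2-$3"

    ' Hypothetical helper: formats every run of 10 digits found in the input.
    Function FormatPhoneNumber(ByVal strUnformattedPN As String) As String
        Return Regex.Replace(strUnformattedPN, PhonePattern, PhoneFormat)
    End Function

    Sub Main()
        Dim strUnformattedPN As String = "8005553121"

        ' Show which part of the input each parenthesized group captured.
        Dim m As Match = Regex.Match(strUnformattedPN, PhonePattern)
        If m.Success Then
            Console.WriteLine("$1 = " & m.Groups(1).Value)   ' $1 = 800
            Console.WriteLine("$2 = " & m.Groups(2).Value)   ' $2 = 555
            Console.WriteLine("$3 = " & m.Groups(3).Value)   ' $3 = 3121
        End If

        ' Apply the replacement string to produce the formatted number.
        Console.WriteLine(FormatPhoneNumber(strUnformattedPN))   ' (800) 555-3121
    End Sub
End Module
```

Because the pattern is not anchored, this sketch formats any run of 10 digits it finds, even inside a longer string. If the input should be exactly 10 digits and nothing else, one option is to wrap the pattern in `^` and `$`.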

Summary
The uses of regular expressions seem never-ending. The article Common Applications of Regular Expressions provides a good look at the various tasks that can be accomplished with regular expressions. Add to that list the formatting shown in this article. For more on regular expressions, be sure to check out the Regular Expressions FAQ Category on ASPFAQs.com, or the Regular Expressions Information page. For more information on using regular expressions in ASP.NET, check out this FAQ.

Happy Programming!


Article Information
• Article Title: Formatting with Regular Expressions
• Article Author: Scott Mitchell
• Published Date: Wednesday, March 13, 2002
• Article URL: http://www.4GuysFromRolla.com/webtech/031302-1.shtml