To read the article online, visit http://www.4GuysFromRolla.com/webtech/043001-1.2.shtml

Utilizing Regular Expression SubMatches, Part 2

By Scott Mitchell


    In Part 1 we examined how to use the $N notation to refer to matched substrings when using the Replace method. As we noted there, however, it would sometimes be nice to access those same parenthesized matches from the Match objects returned by the Execute method. Fortunately, we can do so through the SubMatches property of the Match object.

    Using the SubMatches Collection Property
    When the Execute method of the regular expression object runs, it returns a series of Match objects via the Matches collection. For example, we could extend the code from Part 1 to:

    Dim strHTML
    strHTML = "<html><body><a href=""http://www.aspfaqs.com/"">" & _
              "ASPFAQs.com</a><br><a href=""http://www.aspmessageboard.com/"">" & _
              "ASPMessageboard.com</a></body></html>"
    
    'First, create a reg exp object
    Dim objRegExp
    Set objRegExp = New RegExp
    
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    objRegExp.Pattern = "<a\s+href=""http://(.*?)"">\s*((\n|.)+?)\s*</a>"
    
    'Display all of the matches
    Dim objMatch
    For Each objMatch in objRegExp.Execute(strHTML)
      Response.Write("<xmp>" & objMatch.Value & "</xmp><br>")
    Next
    

    The output of the above code would be:

    <a href="http://www.aspfaqs.com/">ASPFAQs.com</a>
    <a href="http://www.aspmessageboard.com/">ASPMessageboard.com</a>
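
    The Matches collection that Execute returns can also be stored in a variable and inspected directly. The following is a minimal sketch, not code from the original article: the names strSample, objRE, colMatches, and i are assumptions, and the pattern is a simplified version of the one above. It uses the Count property of the Matches collection and the FirstIndex, Length, and Value properties of each Match object.

```vbscript
' A sketch only: strSample, objRE, colMatches and i are illustrative names,
' and the pattern is a simplified stand-in for the one used in the article.
Dim strSample, objRE, colMatches, i
strSample = "<a href=""http://www.aspfaqs.com/"">ASPFAQs.com</a>" & _
            "<a href=""http://www.aspmessageboard.com/"">ASPMessageboard.com</a>"

Set objRE = New RegExp
objRE.IgnoreCase = True
objRE.Global = True      ' without Global, Execute returns only the first match
objRE.Pattern = "<a\s+href=""http://(.*?)"">(.*?)</a>"

' Execute returns a Matches collection; keep a reference so we can index into it
Set colMatches = objRE.Execute(strSample)
Response.Write("Found " & colMatches.Count & " matches<br>")

For i = 0 To colMatches.Count - 1
  ' FirstIndex is the zero-based offset of the match within strSample
  Response.Write("Match " & i & " starts at " & colMatches(i).FirstIndex & _
                 " and is " & colMatches(i).Length & " characters long: " & _
                 Server.HTMLEncode(colMatches(i).Value) & "<br>")
Next
```

    Server.HTMLEncode is used here in place of the <xmp> tag so that the matched markup is displayed rather than rendered; either approach should work for a quick test page.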

    If we want just the URL or the link-text portion of the anchor tag, we could use the $1 or $2 notation, respectively, in a Replace statement. To access these values through the Match object, however, we have to use its SubMatches property. So, to list just the URL portion of each match in the above code, we could alter our For Each ... Next loop to output the value of objMatch.SubMatches(0) instead of objMatch.Value:

    '... continued from above ...
    
    For Each objMatch in objRegExp.Execute(strHTML)
      Response.Write("http://" & objMatch.SubMatches(0) & "<br>")
    Next
    

    which will give us the following output:

    	http://www.aspfaqs.com/
    	http://www.aspmessageboard.com/
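
    The link text can be read the same way through SubMatches(1). The sketch below is an illustration rather than part of the original article: strLinks, objLinkRE, and objLink are assumed names, while the pattern is the one used above.

```vbscript
' A sketch only: strLinks, objLinkRE and objLink are illustrative names.
Dim strLinks, objLinkRE, objLink
strLinks = "<html><body><a href=""http://www.aspfaqs.com/"">" & _
           "ASPFAQs.com</a><br><a href=""http://www.aspmessageboard.com/"">" & _
           "ASPMessageboard.com</a></body></html>"

Set objLinkRE = New RegExp
objLinkRE.IgnoreCase = True
objLinkRE.Global = True
objLinkRE.Pattern = "<a\s+href=""http://(.*?)"">\s*((\n|.)+?)\s*</a>"

For Each objLink In objLinkRE.Execute(strLinks)
  ' SubMatches(0) is the first parenthesized group (the URL after http://)
  ' SubMatches(1) is the second parenthesized group (the link text)
  Response.Write(Server.HTMLEncode(objLink.SubMatches(1)) & ": http://" & _
                 objLink.SubMatches(0) & "<br>")
Next
```

    Each line of output pairs a link's text with its URL, which is the same information $2 and $1 would supply in a Replace call.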
    

    The SubMatches property is really a collection that's indexed starting at zero. With the dollar-sign notation we started at $1 and worked up incrementally for each parenthetical match; with SubMatches we start at zero and work up, so $1 corresponds to SubMatches(0), $2 to SubMatches(1), and so on. Pretty neat, eh?
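
    To see the two numbering schemes side by side, the sketch below runs Replace and Execute against the same input. It is a hedged illustration, not the article's own code: strInput, objGroupRE, objHit, and n are assumed names, and the pattern is simplified to two parenthesized groups.

```vbscript
' A sketch only: strInput, objGroupRE, objHit and n are illustrative names,
' and the pattern is simplified so that it has exactly two groups.
Dim strInput, objGroupRE, objHit, n
strInput = "<a href=""http://www.aspfaqs.com/"">ASPFAQs.com</a>"

Set objGroupRE = New RegExp
objGroupRE.IgnoreCase = True
objGroupRE.Global = True
objGroupRE.Pattern = "<a\s+href=""http://(.*?)"">(.*?)</a>"

' Replace: parenthesized groups are numbered starting at $1
Response.Write(objGroupRE.Replace(strInput, "$2 lives at $1") & "<br>")

' Execute: the same groups are numbered starting at 0 in SubMatches
For Each objHit In objGroupRE.Execute(strInput)
  For n = 0 To objHit.SubMatches.Count - 1
    Response.Write("$" & (n + 1) & " = SubMatches(" & n & ") = " & _
                   objHit.SubMatches(n) & "<br>")
  Next
Next
```

    SubMatches.Count reports one entry per parenthesized group in the pattern. With the article's original pattern, the nested (\n|.) group would be expected to show up as a third entry, so it is worth checking Count rather than assuming two.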

    Conclusion
    This article examined some advanced features of regular expressions: using the dollar-sign notation to refer back to a matched substring with the Replace method, and using the SubMatches property of the Match object to access the same information with the Execute method. For more information on regular expressions, be sure to check out the Regular Expressions Article Index.

    Happy Programming!


    Article Information
    Article Title: Utilizing Regular Expression SubMatches, Part 2
    Article Author: Scott Mitchell
    Published Date: Monday, April 30, 2001
    Article URL: http://www.4GuysFromRolla.com/webtech/043001-1.2.shtml


    Copyright 2017 QuinStreet Inc. All Rights Reserved.