Listing the URLs in a Web Page

This demo illustrates how to use regular expressions to display all of the URLs referenced in the HREF attributes of a Web page's anchor tags. (In this example, a URL counts only if it appears in an HREF attribute and starts with http://.) The list below shows the URLs found in the start page at http://www.asp.net/.


List of URLs in www.ASP.NET


Source Code
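<%
  ' Map the virtual path of the saved page to a physical file path
  Dim strFileName
  strFileName = Server.MapPath("/demos/asp.net.html")

  ' Read the entire HTML file into a string
  Dim objFSO, objTS, strHTML
  Set objFSO = Server.CreateObject("Scripting.FileSystemObject")
  Set objTS = objFSO.OpenTextFile(strFileName)
  strHTML = objTS.ReadAll
  objTS.Close
  Set objTS = Nothing
  Set objFSO = Nothing

  ' objMatch is declared so the script also runs under Option Explicit
  Dim objRegExp, objMatch
  Set objRegExp = New RegExp

  ' Find every match in the page, ignoring the case of the tag and attribute
  objRegExp.IgnoreCase = True
  objRegExp.Global = True
  ' SubMatches(0) is the URL after http://; SubMatches(1) is the link text
  objRegExp.Pattern = "<a\s+href=""http://(.*?)"">\s*((\n|.)+?)\s*</a>"

  ' Write each matched URL as a list item
  For Each objMatch In objRegExp.Execute(strHTML)
    Response.Write("<li>http://" & objMatch.SubMatches(0) & "<br>")
  Next

  Set objRegExp = Nothing
%>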
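
To try the pattern without a saved HTML file, the same regular expression can be run against a short inline string. The following is only a sketch: the function name ListHttpUrls and the sample markup (the example.com and example.org links) are made up for illustration and are not part of the original demo.

<%
  ' Hypothetical helper (not in the original demo): returns the http://
  ' URLs matched by the demo's pattern, one per line.
  Function ListHttpUrls(strHTML)
    Dim objRegExp, objMatch, strResult
    Set objRegExp = New RegExp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    objRegExp.Pattern = "<a\s+href=""http://(.*?)"">\s*((\n|.)+?)\s*</a>"

    strResult = ""
    For Each objMatch In objRegExp.Execute(strHTML)
      If Len(strResult) > 0 Then strResult = strResult & vbCrLf
      strResult = strResult & "http://" & objMatch.SubMatches(0)
    Next

    Set objRegExp = Nothing
    ListHttpUrls = strResult
  End Function

  ' Made-up sample markup: two absolute links and one relative link
  Dim strSample
  strSample = "<a href=""http://www.example.com/"">Example</a> " & _
              "<a href=""/local/page.asp"">Local</a> " & _
              "<A HREF=""http://www.example.org/docs/"">Docs</A>"

  Response.Write("<pre>" & ListHttpUrls(strSample) & "</pre>")
%>

For this sample the script should write http://www.example.com/ and http://www.example.org/docs/. The relative link is skipped because its HREF does not start with http://. Note that the pattern expects the closing quote of the HREF to be followed immediately by >, so an anchor with extra attributes after the HREF would have those attributes swept into the captured URL.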

