To read the article online, visit http://www.4GuysFromRolla.com/webtech/120499-1.4.shtml

Searching Through the Text of Each File on a WebSite, Part 4


In Part 3 we examined the source code for the recursive function GetFiles. Now we still need to look at the function FormatURL. This function translates a physical path into a pseudo-URL path. For example, C:\InetPub\wwwroot\scott\test.asp would be translated into /scott/test.asp.

Function FormatURL(strPath)
    'Cut off everything before wwwroot and replace all \ with /
    Dim iPos
    iPos = InStr(1, strPath, "wwwroot", 1)

    Dim str
    str = Mid(strPath, iPos + 7, Len(strPath))

    FormatURL = Replace(str, "\", "/")
End Function
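FormatURL depends on the physical path containing the text "wwwroot". As a minimal sketch of an alternative, assuming the site root is known to the caller, the variant below takes the root as a second argument. The name FormatURLFromRoot and the strRoot parameter are hypothetical and do not appear in the original search.asp; inside an ASP page the root could plausibly come from Server.MapPath("/").

```vbscript
' Hypothetical variant of FormatURL: the site root is passed in
' rather than located by searching for the text "wwwroot".
Function FormatURLFromRoot(strPath, strRoot)
    ' Case-insensitive check that strPath begins with strRoot.
    If StrComp(Left(strPath, Len(strRoot)), strRoot, vbTextCompare) <> 0 Then
        ' The path is not under the root, so there is no URL to build.
        FormatURLFromRoot = ""
        Exit Function
    End If

    ' Drop the root prefix, then turn every \ into /.
    FormatURLFromRoot = Replace(Mid(strPath, Len(strRoot) + 1), "\", "/")
End Function
```

Called with C:\InetPub\wwwroot\scott\test.asp and a root of C:\InetPub\wwwroot, this returns /scott/test.asp, the same result as the article's example. Note that the prefix test is a plain string comparison, so a root passed without a trailing backslash would also match a sibling folder whose name merely starts with the same text.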

Finally, we need to display the results of the search. This is accomplished by a single call to the GetFiles function. Before calling GetFiles, we should take a moment to check whether the strLastFile parameter was passed in. If it was, we pass False for bolLFFound, so that GetFiles first looks for that file; otherwise we pass True, since we do not need to look for a particular file first. Here is the code:

Below are the results of your search in no particular order...

Dim iResults
iResults = 0

'Now, recurse the directories
If Len(strLastFile) = 0 then
    GetFiles objFolder,termsArray,strLastFile,True,bolAnd,iResults
Else
    GetFiles objFolder,termsArray,strLastFile,False,bolAnd,iResults
End If

Set objFolder = Nothing

Note that the variable iResults will contain the number of records listed by GetFiles. If iResults is less than 10, then we know that there are no more files that match the search terms entered by the user. However, if iResults equals 10, then there might be more results. In this case, we'll show the "Show more results..." link. This link sends the user to search.asp, passing all of the form field variables through the querystring, including the last visited file. If, on the other hand, iResults equals 0, then no results were found, and we should display a message to the user. The following code accomplishes these tasks.

If iResults = 10 then
    'Show next page link
%>
    <P><HR><P><LI><FONT SIZE=2><B>
    <A HREF="search.asp?terms=<%=Server.URLEncode(strKeywords)%>&boolean=<%=Request("boolean")%>&selSearchWhere=<%=Request("selSearchWhere")%>&lf=<%=Server.URLEncode(strLastFile)%>">
       Show more results...
     </A></FONT>
    <P>
<% Elseif iResults = 0 then
    'No results found %>
    <B>No results found!</B><BR>
    <FONT SIZE=2><A HREF="/search/">Try another search...</A></FONT>
<% End IF %>

Note that we use Server.URLEncode to ensure that the variables we are passing through the querystring are properly formatted. If you are unfamiliar with Server.URLEncode, be sure to read the technical documentation.
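The markup above encodes the terms and lf values but writes the boolean and selSearchWhere values into the link as they arrive. As a cautious sketch, assuming all four values are ordinary strings, the helper below runs every value through Server.URLEncode. The function name BuildMoreResultsURL and its parameter names are hypothetical and not part of the original search.asp, and the code only runs inside an ASP script block, where the Server object is available.

```vbscript
' Hypothetical helper: builds the "Show more results..." URL with
' every querystring value passed through Server.URLEncode.
' All four arguments are assumed to be strings.
Function BuildMoreResultsURL(strTerms, strBoolean, strWhere, strLastFile)
    BuildMoreResultsURL = "search.asp" & _
        "?terms=" & Server.URLEncode(strTerms) & _
        "&boolean=" & Server.URLEncode(strBoolean) & _
        "&selSearchWhere=" & Server.URLEncode(strWhere) & _
        "&lf=" & Server.URLEncode(strLastFile)
End Function
```

Keeping the link in one function means the parameter names (terms, boolean, selSearchWhere, lf) are written in a single place, which may make it easier to keep the link and the form in step.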

That wraps up the code for the search algorithm! The complete source is available at the end of this article. Before we wrap things up, though, let's take a moment to analyze this algorithm. Recall from earlier in the article that this approach is not ultra-efficient. However, to see its efficiency more precisely, a thorough analysis is needed.

If we choose N to be the number of files that need to be searched through, then our analysis needs to concentrate on GetFiles, since the remainder of the code takes constant time regardless of the number of files to be searched. Clearly, since we have to iterate at most N times, the search algorithm is in linear time asymptotically. What search engine isn't, though? In the best case, we only have to iterate through C files, where C is the number of links we are showing per page of results. In the worst case, though, we have to iterate through N-C files before reaching the ones to display. Of course, when iterating through these N-C files, we don't have to open them; we just pass on by. Still, as N gets large we have to step through a large number of files. Also, if we choose C too large, we will have to open and close many files.
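To make the best and worst cases concrete, here is a small simulation. It is only a sketch under loud assumptions: files are numbered 1 to N, every file matches the search, and the loop merely imitates the skip-then-search behaviour described for GetFiles rather than calling it. The names CountWork, iLastShown, iSkipped and iOpened are hypothetical.

```vbscript
' Hypothetical simulation of one page request.
'   N          - total number of files
'   C          - results shown per page
'   iLastShown - number of the last file listed on the previous page
'                (0 when requesting the first page)
' On return, iSkipped holds the files stepped past without being opened
' and iOpened holds the files opened and searched. VBScript passes
' arguments ByRef by default, so the caller's variables are updated.
Sub CountWork(N, C, iLastShown, iSkipped, iOpened)
    Dim i, bolLFFound, iResults
    iSkipped = 0
    iOpened = 0
    iResults = 0
    bolLFFound = (iLastShown = 0)

    For i = 1 To N
        If iResults = C Then Exit For

        If Not bolLFFound Then
            ' Still looking for the last file of the previous page.
            iSkipped = iSkipped + 1
            If i = iLastShown Then bolLFFound = True
        Else
            ' Assumption: every file matches, so each one opened is a result.
            iOpened = iOpened + 1
            iResults = iResults + 1
        End If
    Next
End Sub
```

With N = 100 and C = 10, requesting the first page (iLastShown = 0) skips 0 files and opens 10, while requesting the final page (iLastShown = 90) skips 90 files, which is N-C, and opens 10. If only a few files matched, the number opened would be larger than C, since non-matching files must still be opened and read.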

This is a simple analysis. I leave it to you to extend and apply it! For a more thorough discussion on the efficiency of this and what other options apply, be sure to check out this messageboard post.

Happy Programming!

  • Read Part 3
  • Read Part 2
  • Read Part 1


    Attachments

  • The HTML interface to the search engine, in text format
  • The source code for search.asp, in text format


  • Article Information
    Article Title: Searching Through the Text of Each File on a WebSite, Part 4
    Article Author: Scott Mitchell
    Published Date: Saturday, December 04, 1999
    Article URL: http://www.4GuysFromRolla.com/webtech/120499-1.4.shtml


    Copyright 2017 QuinStreet Inc. All Rights Reserved.