
Published: Thursday, November 30, 2000

Searching 4GuysFromRolla.com


Have you ever performed a search on 4Guys? If not, take a moment to visit the search page and try a search. I've received a number of questions from users asking how to search a Web site, so I thought it would make a great article to describe how I search 4Guys!


Originally, when 4Guys was a lot smaller and received less traffic on a daily basis, my search engine used the FileSystemObject to search through the text of each file on the Web server whenever a search was performed. In fact, I wrote up an article on how to do this back in December, 1999: Searching Through the Text of Each File on a WebSite.

There are two common techniques for storing articles on content-rich sites like 4Guys. One method is to store the contents of each article in a database and to create a single ASP page that displays any of them. Sites like ASPWatch.com and SQLTeam.com use this technique. The other approach is to have a separate Web page for each article. Sites like 4Guys and ASP101.com follow this model.

On 4Guys, each article exists as its own file; this makes a textual search using the FileSystemObject a plausible solution. However, as the number of articles and visitors on 4Guys grew (as of 11/28/00 there are over 725 total articles and over 100,000 daily page views), the FileSystemObject approach slowed down considerably. I looked at using Index Server, but had fits getting it set up; I also wanted to create some sort of custom database repository of the content on 4Guys. I then looked at using a product like XCache. For those unfamiliar with XCache, it is an application that allows a Webmaster to build a database of the site's content. Then, on a regular schedule, XCache will go through the database and turn it into a series of static HTML pages. This approach is useful for enhancing performance, since you remove all database calls (and all ASP execution time) from the site.

Rather than go with any of these solutions, I decided to create my own. Since I already (at the time) had about 300 articles (and I was very comfortable with the process I had for adding new content to the site), I didn't want to make any changes that would disrupt existing content (or my methodologies for adding new articles). Therefore, I decided to sort of do the inverse of what XCache does: rather than creating a database of my site's content and scheduling the creation of a static version of the site, I decided to write a script that would iterate through my existing (and future) static content and build up a database of this information.

With that in mind, I created a database table, tblArticleIndex, with the following format:

tblArticleIndex
Column          Type
ArticleIndexID  int, PK
Title           varchar(100)
Description     varchar(255)
URL             varchar(100)
Contents        text
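
As a rough sketch of the layout described above, the table could be declared as follows. Python's built-in sqlite3 module is used here purely as a stand-in: the site itself runs on classic ASP, and the database engine behind tblArticleIndex is not named in this article. The helper name create_article_index is made up for illustration, and SQLite treats the declared column lengths loosely.

```python
import sqlite3

def create_article_index(conn):
    """Create a table mirroring the tblArticleIndex layout (hypothetical helper)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tblArticleIndex (
            ArticleIndexID INTEGER PRIMARY KEY,  -- int, PK
            Title          VARCHAR(100),
            Description    VARCHAR(255),
            URL            VARCHAR(100),
            Contents       TEXT
        )
        """
    )
    conn.commit()

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
create_article_index(conn)

# List the column names SQLite recorded for the table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(tblArticleIndex)")]
print(columns)
```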

For each article on 4Guys, I'd add a row to the table. I automated this process by creating a simple script that would iterate through the ASP pages that make up the articles on 4Guys and use the FileSystemObject to populate each of the columns. I then used the task scheduler to schedule this script to execute once a day, late at night. Each time it runs, it obliterates the entire contents of the tblArticleIndex table and then rebuilds the table by iterating through all of the articles. While this may seem like a waste of time and resources, I've found it to be no big deal, seeing as the entire operation takes under fifteen seconds. (So, yes, for ~15 seconds in the middle of the night, a search of the 4Guys site may not return all of the results that are really there, since they are still being populated into the database.)
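
The wipe-and-rebuild idea can be sketched roughly like this. It is not the actual 4Guys script, which is classic ASP using the FileSystemObject; it is a Python and sqlite3 stand-in, and the function name rebuild_index, the sample rows, and the sample URLs are all hypothetical.

```python
import sqlite3

def rebuild_index(conn, articles):
    """Empty tblArticleIndex, then insert one row per article.

    `articles` is an iterable of (title, description, url, contents) tuples.
    In the real script these values come from reading each article file.
    """
    conn.execute("DELETE FROM tblArticleIndex")
    conn.executemany(
        "INSERT INTO tblArticleIndex (Title, Description, URL, Contents) "
        "VALUES (?, ?, ?, ?)",
        articles,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblArticleIndex ("
    "ArticleIndexID INTEGER PRIMARY KEY, Title TEXT, Description TEXT, "
    "URL TEXT, Contents TEXT)"
)

# Hypothetical sample data, not real 4Guys articles.
sample = [
    ("First article", "A short description", "/articles/first.asp", "body text one"),
    ("Second article", "Another description", "/articles/second.asp", "body text two"),
]

rebuild_index(conn, sample)
rebuild_index(conn, sample)  # a second nightly run replaces the rows rather than duplicating them

row_count = conn.execute("SELECT COUNT(*) FROM tblArticleIndex").fetchone()[0]
print(row_count)
```

Rebuilding from scratch means the script never has to work out which articles are new, changed, or deleted, which is a reasonable trade when the whole run takes only seconds.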

The script that builds up the database each night borrows a lot of its code from Searching Through the Text of Each File on a WebSite. The same code presented in Part 3 of that article is used in the database-building script. Some code has been added, though, to insert a row into the database for each article found. I am not going to go into detail on how the database-building script works, for I think it is pretty self-explanatory if you've thoroughly read Searching Through the Text of Each File on a WebSite. The database-building script's source can be viewed here.

Please do take a moment and check out the database-building script. It is important to realize that each article on 4Guys has an HTML header. Go ahead and do a View/Source on this article and you will see what I mean. The title of the article is wedged between a <!--TITLE: and --> pair of delimiters, while the description for an article is slapped between a pair of <!--DESC: and --> delimiters. (This dates back to the days when a search blasted through the entire contents of each file: the titles and descriptions were stored at the top of the files so that, when a match was found in the contents of a file, I could intelligently list the title and description of the article in the search results.) Note that the database-building script picks out the included title and description and stuffs those in the Title and Description columns, respectively.
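
To illustrate how a title and description might be picked out of such a header, here is a minimal Python sketch. The real script is classic ASP, so this is only an approximation of the idea; the exact whitespace inside the delimiters, the sample description, and the function name extract_header are assumptions.

```python
import re

# Match the text between the delimiters described above.
TITLE_RE = re.compile(r"<!--TITLE:(.*?)-->", re.DOTALL)
DESC_RE = re.compile(r"<!--DESC:(.*?)-->", re.DOTALL)

def extract_header(page_text):
    """Return (title, description); a missing part comes back as None."""
    title = TITLE_RE.search(page_text)
    desc = DESC_RE.search(page_text)
    return (
        title.group(1).strip() if title else None,
        desc.group(1).strip() if desc else None,
    )

# Hypothetical page source; the description text is invented for the example.
sample_page = """<!--TITLE: Searching 4GuysFromRolla.com -->
<!--DESC: How the search page on 4Guys works -->
<html><body>Article text...</body></html>"""

print(extract_header(sample_page))
```

Returning None for a missing piece lets the caller decide whether to skip the file or index it with an empty title or description.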

Now that we have the tblArticleIndex table built up on a nightly basis, all we need is an ASP page that will accept some search terms and intelligently search through this database table, returning paged results. We'll examine this page, search.asp, in Part 2 of this article!

  • Read Part 2!

