Published: Monday, June 24, 2002
Spell Check a String with ASP and Microsoft Word Automation
By Eric Blanpain
Introduction
More than 50% of search queries on my site return no results simply because of misspelled words in the
search query. Search engines like Google provide helpful suggestions on new searches if you happen
to misspell a word in your search query. Perhaps you've wondered how you could
do the same for the search engine on your own Web site?
In this article we'll look at a server-side solution that uses Microsoft Word to provide spell checking
and suggestions for your search engine.
Essentially, the code to perform a spell check opens the Word object on the server, adds the user's
search query to it, and then uses the proofing capabilities of Word, returning the suggested spelling
corrections (if any). We want to do this server-side, so we cannot use Word's interactive spell check
(the document's CheckSpelling method), since it opens a dialog box and waits until a user manually
validates each correction. Instead we will use the SpellingErrors property, which returns a collection
of the spelling errors found in the document; suggestions for each error can then be retrieved separately.
Using Microsoft Word on the server-side has some performance and security implications, as discussed
later in the article. Note that for this code to work you will need to have Word installed on the Web
server and set up so that the anonymous Web user (IUSR_machinename) can perform
Word automation (how to do this is discussed later on in this article). Performance is a
bit sluggish, too: a spell check can take several seconds for larger search queries. You should
consider using this approach only for lightly loaded Web servers.
Spell Checking with Microsoft Word
To spell check the user's search engine query, we will first create an instance of the Microsoft Word
object, like so:
Set objWord = CreateObject("Word.Application")
|
Next, we must create a document in the Word object whose content is the user's query that we wish to
spell check. This is accomplished by creating a new Word document and adding to the document the
user's query, like so:
Set objDocument = objWord.Documents.Add
objDocument.Content = QueryText
|
(Here QueryText is a string variable that holds the user's query...)
Now we want to check if there are any spelling errors in the document. To do this we read the
SpellingErrors property, which returns a collection of misspelled words. The collection's Count
property indicates how many words were misspelled. Hence, if the Count property is
greater than 0, then we know we have misspelled words.
'Get the number of words in the document
NumberOfWords = objDocument.Words.Count
'How many spelling errors are there in the document?
NumberOfErrors = objDocument.SpellingErrors.Count
If NumberOfErrors = 0 Then
  'There are no spelling errors...
Else
  'There is at least one spelling error...
End If
|
If there are misspelled words, we want to loop through each word in our document and determine whether
it is spelled correctly. If it is, we want to display that it was spelled correctly;
otherwise we want to get the suggestions for the correct spelling (if any exist). To do this
we check each word's own SpellingErrors collection. If the word contains an error, we call the
GetSpellingSuggestions method, which returns a collection
of suggested spellings for the misspelled word. We opt to select the first item in this collection as
a suggestion to the user on how to re-enter their search query.
If NumberOfErrors = 0 Then
  'There are no spelling errors...
Else
  'There is at least one spelling error...
  'Loop through each word in the document. Words.Count also counts the
  'document's closing paragraph mark, so the loop stops one item short of it.
  i = 1
  While i < NumberOfWords
    'See if there are any spelling errors for this word
    If objDocument.Words(i).SpellingErrors.Count > 0 Then
      'Yes, there are errors, see if there are any suggestions
      If objDocument.Words(i).GetSpellingSuggestions.Count > 0 Then
        'Yes, there is one or more suggestions. You can grab the first
        'suggested spelling by referencing:
        'objDocument.Words(i).GetSpellingSuggestions.Item(1).Name
      Else
        'There were no suggested spellings for this misspelled word
      End If
    Else
      'The word was spelled correctly
    End If
    i = i + 1
  Wend
End If
|
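The pieces above can be pulled together into a single routine. The following is a minimal sketch, not the code from the download: the function name SuggestQuery and its local variable names are hypothetical, and it assumes Word is installed and configured as described in the next section. It rebuilds the query word by word, swapping in the first suggestion for each misspelled word, and then closes the document and quits Word so that no Word process is left behind on the server.

```vbscript
' Sketch only: SuggestQuery and its variable names are illustrative assumptions.
' Returns the query with each misspelled word replaced by Word's first suggestion.
Function SuggestQuery(QueryText)
    Dim objWord, objDocument, objRange, objSuggestions
    Dim i, NumberOfWords, strWord, strTrail, strResult

    Set objWord = CreateObject("Word.Application")
    Set objDocument = objWord.Documents.Add
    objDocument.Content.Text = QueryText

    strResult = ""
    NumberOfWords = objDocument.Words.Count
    i = 1
    ' Stop before the last item, which is the closing paragraph mark
    While i < NumberOfWords
        Set objRange = objDocument.Words(i)
        strWord = objRange.Text
        ' Each item carries its trailing spaces; remember them so that
        ' the spacing of the original query is preserved
        strTrail = Mid(strWord, Len(RTrim(strWord)) + 1)

        If objRange.SpellingErrors.Count > 0 Then
            Set objSuggestions = objRange.GetSpellingSuggestions()
            If objSuggestions.Count > 0 Then
                strWord = objSuggestions.Item(1).Name & strTrail
            End If
        End If

        strResult = strResult & strWord
        i = i + 1
    Wend

    ' Clean up: 0 = wdDoNotSaveChanges
    objDocument.Close 0
    objWord.Quit 0
    Set objSuggestions = Nothing
    Set objRange = Nothing
    Set objDocument = Nothing
    Set objWord = Nothing

    SuggestQuery = Trim(strResult)
End Function
```

A search page might call it as strSuggestion = SuggestQuery(Request.QueryString("q")), where the parameter name q is also just an assumption, and offer the result to the user only when it differs from what they typed.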
That's all there is to it code-wise. Of course, to get the code to work it is imperative that you set up
the security settings for Word properly.
Setting Up Word Security
To run this code you will need Microsoft Word installed on the Web server with the proper security
settings. These security settings allow the anonymous Internet user to access Word on the server.
There is a good Microsoft article on how to achieve this, available at:
http://support.microsoft.com/default.aspx?scid=kb;en-us;Q288367.
Make sure you read this article! If you don't properly specify the security settings the code will not work,
resulting in an (0x800A175D) Could not open macro storage... error message.
I've found it best to create an MSWordUser group account and make the IUSR_machinename
account a member. Alternatively, you may set up Word so that it works with the interactive user
(faster to set up, but less safe); information on setting up Word to use the interactive user
can be found at: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q288366.
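While you are getting the permissions right, it can help to trap the automation error instead of letting it break the search page. The following is a rough sketch under that assumption; the variable names blnSpellCheckReady and strSpellError are hypothetical, and what you do with the message (log it, show it, ignore it) is up to you.

```vbscript
' Sketch only: variable names are illustrative assumptions.
Dim objWord, objDocument, blnSpellCheckReady, strSpellError

On Error Resume Next
Set objWord = CreateObject("Word.Application")
' Only try to add a document if Word itself started
If Err.Number = 0 Then Set objDocument = objWord.Documents.Add

If Err.Number <> 0 Then
    blnSpellCheckReady = False
    ' Hex() shows the number in the same form as 0x800A175D
    strSpellError = "0x" & Hex(Err.Number) & " " & Err.Description
    Err.Clear
    ' If Word started but the document could not be created, shut Word down
    If IsObject(objWord) Then objWord.Quit 0
Else
    blnSpellCheckReady = True
End If
On Error GoTo 0

If Not blnSpellCheckReady Then
    ' Fall back to running the search without spelling suggestions
End If
```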
Feel free to modify the script to fit your needs. Word automation with VBScript is not well documented, but
good documentation for VBA is available.
Performance of this Approach
At each Web request to the ASP page, Word is opened and closed, which can be costly; hence, this is best
suited for sites with limited load. In my informal tests I've found that
the execution time for a three-word query is typically 0.5 seconds once the objects are created
(usually between 0.5 and 1 second). I have not run any formal performance benchmarks, but would be
delighted if someone were interested in doing this and sharing their results.
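If you want to take similar informal measurements on your own server, VBScript's Timer function is enough for a first look. The sketch below is an assumption about how one might do it, not the method used for the figures above; the sample query and variable names are made up. It times object creation and the spell check separately.

```vbscript
' Sketch only: sample query and variable names are illustrative assumptions.
Dim objWord, objDocument, NumberOfErrors
Dim sngStart, sngCreated, sngChecked

' Timer returns the number of seconds since midnight, so these
' readings are only meaningful if the test does not span midnight
sngStart = Timer
Set objWord = CreateObject("Word.Application")
Set objDocument = objWord.Documents.Add
sngCreated = Timer

objDocument.Content.Text = "serch engin optimisation"
NumberOfErrors = objDocument.SpellingErrors.Count
sngChecked = Timer

' Clean up: 0 = wdDoNotSaveChanges
objDocument.Close 0
objWord.Quit 0
Set objDocument = Nothing
Set objWord = Nothing

Response.Write "Creating objects: " & FormatNumber(sngCreated - sngStart, 2) & " s<br>"
Response.Write "Spell check: " & FormatNumber(sngChecked - sngCreated, 2) & " s"
```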
Since many users will likely search using the same search queries, one potential optimization would be
to save a search string when it has been spell checked and shown to contain no errors. Such a "correct" query
could be stored in a database. When a user did a search, a quick check against this database table could
be performed to determine if the spell checker needed to be invoked or not.
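The lookup logic might be sketched as follows. To keep the example self-contained, a Scripting.Dictionary stands in for the database table, which is purely an assumption for illustration: a dictionary created in the page lives only for the current request, so a real site would replace the two routines with queries against a table. The names dictKnownGood, NeedsSpellCheck, and RememberCorrectQuery are hypothetical.

```vbscript
' Sketch only: the dictionary is a stand-in for the database table.
Dim dictKnownGood
Set dictKnownGood = CreateObject("Scripting.Dictionary")
dictKnownGood.CompareMode = 1   ' 1 = vbTextCompare, so keys ignore case

' True when the query has not yet been recorded as correctly spelled
Function NeedsSpellCheck(QueryText)
    NeedsSpellCheck = Not dictKnownGood.Exists(Trim(QueryText))
End Function

' Record a query that the spell checker found no errors in
Sub RememberCorrectQuery(QueryText)
    If Not dictKnownGood.Exists(Trim(QueryText)) Then
        dictKnownGood.Add Trim(QueryText), True
    End If
End Sub

' Usage: skip Word entirely for queries already known to be correct
If NeedsSpellCheck(QueryText) Then
    ' ...run the Word-based check here, then, if NumberOfErrors = 0:
    ' RememberCorrectQuery QueryText
End If
```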
Conclusion
Using Microsoft Word server-side is not for everyone (be sure to read
this Microsoft article first),
but there are certain situations where server-side Office automation is really helpful. Just
be sure to do adequate performance testing before deploying your Office automation-based Web site,
and be certain that you have the security settings set up properly to allow you to use Office automation
through an ASP page.
Happy Programming!
View a live demo!
Download the code/references (in ZIP format)
Eric Blanpain, just turned 40, runs a company that markets scientific instrumentation, based in
Paris, France. As a Marketing and Internet/e-commerce/ASP consultant he has helped many companies in
the field expand their sales dramatically! Inquiries welcome.