Wednesday, June 4, 2008

Controlling SQL Text Indexing Services with SQL Managed Objects and VB.NET (1)

By Paul Kimmel

Some applications create whole databases and tables dynamically. Entity Diagramming tools such as ERwin generate databases. Application builders or customizers—like ACT—modify databases, and applications that permit customized data storage sometimes generate new tables or even whole databases.

The challenge with dynamic generation is that your application is already running, and it is unlikely that you will be there to perform administrative tasks, such as enabling full text indexing.

Full text indexing uses an indexing service and permits you to index text fields in SQL Server. The benefit is that you can use functions like FREETEXT in SQL Server queries to query one or more fields for fragments of text. The net effect is that full text indexing makes your database capable of performing Google-like searches. That is, with full text indexing you can let the user enter any text data and SQL Server can search multiple fields for that fragment with expressing a WHERE predicate for every combination. Instead of WHERE CustomerName LIKE 'somedata' OR ADDRESS CONTAINS 'somedata', and so forth, you write one predicate using the FREETEXT function and SQL Server searches all of the indexed columns for that text.

Here is where full text searching is really useful. Many, many applications support search and find capabilities. Many of these applications typically have a label indicating the kind of data that will be used to build the filter and an input control of some sort. Based on the controls containing values a dynamic search is assembled with as many WHERE predicates as there are kinds of input data. All of these labels, controls, and dynamic assemblage of queries are time consuming.

The no-brainer is that Google is a metaphor that everyone seems to get; hence its popularity. So, why not have one input field every time where the user can input any data and let the indexing service and full text indexing find the matches for you? Full text indexing makes this possible. My article from August 2005, FREETEXT Searches with SQL Server and ADO.NET, demonstrates how to configure indexing services manually and perform full text searches. (This article builds on that concept in a different way.)

The aforementioned article walks you through FREETEXT searches and manually configuring indexing services. This article shows you how to enable full text indexing and FREETEXT searches dynamically.

With SQL Server 2005, the SQL Management Objects (SMO) capabilities were added to SQL Server and the .NET framework. The result is that many of the things that one could do manually using SQL Server tools can now be accomplished programmatically using SMO and VB.NET. In this article, you will learn how to support full text searches on SQL Server Express 2005—it's already supported in non-Express versions—enable full text indexing with SMO, and write FREETEXT queries against indexed columns, all with .NET code.
Preparing the Demo

When you install Visual Studio 2005 or up, SQL Server Express 2005 is installed. Some of you may have access only to SQL Express and some more of you will be developing on your workstation with SQL Server Express. Unfortunately, the out-of-the-box install doesn't support full text indexing for SQL Server Express. Therefore, if you are using SQL Server Express and you want to try the full text indexing demo, read the next paragraph. Otherwise, if you are using SQL 2005 developer edition or above, you can skip the next paragraph.

Querying the Database Using FREETEXT

As written, the code supports immediate querying using the indexes. The code in Listing 1, starting with the connectionString variable, contains the vanilla ADO.NET code. The description of the SQL code is provided after the listing.

Listing 1: The complete list creates the full text index with SMO and uses the index to perform a FREETEXT query.

Imports Microsoft.SqlServer.Management.Smo
Imports Microsoft.SqlServer.Management.Common
Imports System.Data
Imports System.Data.SqlClient

Module Module1

Sub Main()
Dim server As Server = New Server(".\SQLExpress")
Dim db As Database = server.Databases("StockHistory")
Dim table As Table = db.Tables("Company")

Dim catalogName As String = "Company_Catalog"

If (db.FullTextCatalogs(catalogName) Is Nothing) Then
Dim catalog As FullTextCatalog = _
New FullTextCatalog(db, catalogName)
catalog.IsDefault = True
catalog.Create()
End If
If (table.FullTextIndex Is Nothing) Then
Dim index As FullTextIndex = New FullTextIndex(table)
Dim column1 As FullTextIndexColumn = _
New FullTextIndexColumn(index, "CompanyName")
index.IndexedColumns.Add(column1)
index.ChangeTracking = ChangeTracking.Automatic
index.UniqueIndexName = "PK_Company_1"
index.CatalogName = catalogName
index.Create()

index.StartPopulation(IndexPopulationAction.Incremental)
End If
Dim connectionString As String = _
"Data Source=.\SQLExpress;Initial Catalog=StockHistory;" + _
"Integrated Security=True;Pooling=False"

Using connection As SqlConnection = _
New SqlConnection(connectionString)
connection.Open()
Dim command As SqlCommand = New SqlCommand( _
"SELECT * FROM COMPANY WHERE FREETEXT(*, 'MICROSOFT')", _
connection)

Dim reader As SqlDataReader = command.ExecuteReader

While (reader.Read())
Console.WriteLine("{0}, {1}, {2}", reader("CompanyID"), _
reader("CompanyName"), reader("CompanySymbol"))
End While

Console.ReadLine()
End Using
End Sub
End Module

In the second half of the example, a connectionString is declared. The connectionString is used to initialize a SqlConnection object in a Using statement. The Using statement ensures the SqlConnection is closed and disposed of at the end of the using block. The connection is opened. A SqlCommand object is created. The SqlCommand object contains the SQL text and the connection object. Notice that the WHERE clause contains the FREETEXT predicate.

The FREETEXT predicate supports specifying the columns to search. The asterisk (*) means searching all indexed columns. The text—in the listing, 'MICROSOFT'—indicates the text to look for. Finally, a SqlDataReader is used to read and display the resultset.

As the example, you could have solved the same problem with the following query:

WHERE CompanyName LIKE 'MICROSOFT%'

Clearly, this works. Where LIKE begins to fail is when the text can appear anywhere, when there are multiple columns to check search, and those columns are dependent on specific columns to search as expressed by the user. For example, if you have ten possible fields to search, building the WHERE clause becomes very clumsy; by using FREETEXT and full text Indexing, the WHERE clause changes very little regardless of the fields to search.
Summary

SQL Management Objects (SMO) is the ability to manage SQL programmatically. This capability was added to SQL Server 2005 and the .NET Framework 2.0. In the example, you see how Full text Indexing can be enabled, indexes configured, and the index populated and used—all in the same chunk of code.

There is a lot more to SMO then shown here, but this article should get you started. If nothing else, perhaps you can save time by simply adding a single point of entry search to your applications and letting the Indexing Service and SQL Server figure out how to construct the actual search against the indexes.
About the Author

Paul Kimmel is the VB Today columnist for www.codeguru.com and has written several books on object-oriented programming and .NET. Check out his upcoming book, LINQ Unleashed for C#, due in July 2008. Paul Kimmel is an Application Architect for EDS. You may contact him for technology questions at pkimmel@softconcepts.com.

Lansing is having a free day of .NET training at Lansing Community College on June 21st. Check out the web site for details. The group likes to think of it as a poor man's TechEd (because it's free), but the content will be excellent.

Copyright © 2008 by Paul T. Kimmel. All Rights Reserved.

No comments: