Keyword search using a tally or sequence table

By Damian Maclennen on 8 October 2001 | Tags: SELECT

Ryan writes "I am looking for a way of creating a stored procedure that will carry out a keyword search without building up the SQL dynamically in the stored procedure. What I'm after is a method of doing this that doesn't require me to build up a huge string."

"When I have to match a list of IDs I would typically parse the comma delimted list of IDs and insert them in to a temporary table and then use an IN clause as follows:
SELECT @strEmployeeID = @strEmployeeID + ','

CREATE TABLE #employees (EmployeeID int not null)

IF CHARINDEX(',', @strEmployeeID) > 0 BEGIN
     WHILE CHARINDEX(',',@strEmployeeID) > 0 BEGIN
	INSERT #employees VALUES(SUBSTRING(@strEmployeeID, 1, CHARINDEX(',', @strEmployeeID) - 1))
	SELECT @strEmployeeID = STUFF(@strEmployeeID, 1, CHARINDEX(',', @strEmployeeID),'')

SELECT * FROM Job WHERE EmployeeID IN (SELECT EmployeeID FROM #employees)

Can anyone offer me a similar solution where I parse the keywords in to a temporary table of some sort or is my only option to build a monster string ?"

Sure Ryan. There are a few ways to do this. One way is used on SQL Team, Graz wrote a great article about it here. A few months ago, I was building a similar site to this one and had an idea of a different way to go about it.

First some background. Rob Volk wrote an article at the beginning of this year on Parsing CSV Values Into Multiple Rows using a "tally" or "sequence" table method. If you are not familiar with this article and/or technique I strongly suggest you go read that now. If it is new to you, read it a few times, it may take a bit to get your head around it. Essentially, it splits out a comma delimited list and returns a set.

The technique I am going to demonstrate takes this approach.

Firstly, we need our tally table. I am going to call my table "Sequence".

Create Table Sequence(
       Seq int

--Generate some data for it
Declare @i int
Set @i = 1
WHILE @i <= 8000
         Insert Into Sequence Values (@i)
         Set @i = @i + 1

Now take our list of keywords (in this case, they are space delimited) and convert them into a Set. Paste this code into Query Analyzer and have a look.

Declare @Keywords varchar(2000)
Select @Keywords = 'my space delimited string'

		Substring(' ' + @keywords + ' ',seq,
		CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq)
	 seq <= len(' ' + @keywords + ' ') and
	 Substring(' ' + @keywords + ' ', seq - 1, 1) = ' ' and
	 CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq > 0 

Pretty cool huh ?

Next up, we JOIN the article table to this set based on the body text. Where a string in the article body matches one of the words in this set, the join condition will match and the table will join. Where the word appears more than once, the table will join multiple times. If you group these and and count the number of joins ... voila! ... instant ranked search results.

Here is some code...

--Assuming we have an Articles table
Create Table Articles(
	ArticleID int NOT NULL,
	ArticleTitle VarChar(200) NOT NULL,
	ArticleDescription VarChar(500) NOT NULL,
	ArticleBody Text

Declare @Keywords varchar(2000)
Select @Keywords = 'my space delimited string'

	Select ArticleID, ArticleTitle, ArticleDescription, count(AA_ID) hits
	INNER JOIN Articles ON Articles.ArticleBody like '%' +  
		Substring(' ' + @keywords + ' ',seq,
		CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq)
		+ '%'
	 seq <= len(' ' + @keywords + ' ') and
	 Substring(' ' + @keywords + ' ', seq - 1, 1) = ' ' and
	 CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq > 0 
	Group by ArticleID, ArticleTitle, ArticleDescription

So there you have it, a simple way to use the "tally" table to produce ranked keyword searches.

The "tally" table technique (say that 6 times fast) is a very powerful tool and a good one to have in your arsenal whenever you need to split delimited strings of words or numbers.

Good luck...and happy coding.

Related Articles

Joining to the Next Sequential Row (2 April 2008)

Writing Outer Joins in T-SQL (11 February 2008)

How to Use GROUP BY with Distinct Aggregates and Derived tables (31 July 2007)

How to Use GROUP BY in SQL Server (30 July 2007)

SQL Server 2005: Using OVER() with Aggregate Functions (21 May 2007)

Server Side Paging using SQL Server 2005 (4 January 2007)

Using XQuery, New Large DataTypes, and More (9 May 2006)

Counting Parents and Children with Count Distinct (10 January 2006)

Other Recent Forum Posts

Wildcard In A Date Field? (16h)

Select Top 1 from a nested CTE (1d)

Problem with Inner Join (1d)

Query performance date range (2d)

Upload a CSV file to sql using a query (3d)

Column type for case ... end value (4d)

Inserting a default value for a datetimeoffset column from a json data field that has a null value (5d)

Problem with Inner Join (5d)

- Advertisement -