Keyword search using a tally or sequence table

By Damian Maclennen on 8 October 2001 | 8 Comments | Tags: SELECT


Ryan writes "I am looking for a way of creating a stored procedure that will carry out a keyword search without building up the SQL dynamically in the stored procedure. What I'm after is a method of doing this that doesn't require me to build up a huge string."

"When I have to match a list of IDs I would typically parse the comma delimted list of IDs and insert them in to a temporary table and then use an IN clause as follows:
SELECT @strEmployeeID = @strEmployeeID + ','

CREATE TABLE #employees (EmployeeID int not null)

IF CHARINDEX(',', @strEmployeeID) > 0 BEGIN
     WHILE CHARINDEX(',',@strEmployeeID) > 0 BEGIN
	INSERT #employees VALUES(SUBSTRING(@strEmployeeID, 1, CHARINDEX(',', @strEmployeeID) - 1))
	SELECT @strEmployeeID = STUFF(@strEmployeeID, 1, CHARINDEX(',', @strEmployeeID),'')
	END
END

SELECT * FROM Job WHERE EmployeeID IN (SELECT EmployeeID FROM #employees)

Can anyone offer me a similar solution where I parse the keywords in to a temporary table of some sort or is my only option to build a monster string ?"

Sure Ryan. There are a few ways to do this. One way is used on SQL Team, Graz wrote a great article about it here. A few months ago, I was building a similar site to this one and had an idea of a different way to go about it.

First some background. Rob Volk wrote an article at the beginning of this year on Parsing CSV Values Into Multiple Rows using a "tally" or "sequence" table method. If you are not familiar with this article and/or technique I strongly suggest you go read that now. If it is new to you, read it a few times, it may take a bit to get your head around it. Essentially, it splits out a comma delimited list and returns a set.

The technique I am going to demonstrate takes this approach.

Firstly, we need our tally table. I am going to call my table "Sequence".


Create Table Sequence(
       Seq int
)

--Generate some data for it
SET NOCOUNT ON
Declare @i int
Set @i = 1
WHILE @i <= 8000
     BEGIN
         Insert Into Sequence Values (@i)
         Set @i = @i + 1
     END
SET NOCOUNT OFF   

Now take our list of keywords (in this case, they are space delimited) and convert them into a Set. Paste this code into Query Analyzer and have a look.


Declare @Keywords varchar(2000)
Select @Keywords = 'my space delimited string'

	Select
		Substring(' ' + @keywords + ' ',seq,
		CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq)
	FROM SEQUENCE
	
	WHERE 
	 seq <= len(' ' + @keywords + ' ') and
	 Substring(' ' + @keywords + ' ', seq - 1, 1) = ' ' and
	 CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq > 0 

Pretty cool huh ?

Next up, we JOIN the article table to this set based on the body text. Where a string in the article body matches one of the words in this set, the join condition will match and the table will join. Where the word appears more than once, the table will join multiple times. If you group these and and count the number of joins ... voila! ... instant ranked search results.

Here is some code...

--Assuming we have an Articles table
Create Table Articles(
	ArticleID int NOT NULL,
	ArticleTitle VarChar(200) NOT NULL,
	ArticleDescription VarChar(500) NOT NULL,
	ArticleBody Text
)

Declare @Keywords varchar(2000)
Select @Keywords = 'my space delimited string'


	Select ArticleID, ArticleTitle, ArticleDescription, count(AA_ID) hits
	FROM SEQUENCE
	INNER JOIN Articles ON Articles.ArticleBody like '%' +  
		Substring(' ' + @keywords + ' ',seq,
		CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq)
		+ '%'
	
	WHERE 
	 seq <= len(' ' + @keywords + ' ') and
	 Substring(' ' + @keywords + ' ', seq - 1, 1) = ' ' and
	 CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq > 0 
	
	Group by ArticleID, ArticleTitle, ArticleDescription
	
	ORDER BY Hits DESC	

So there you have it, a simple way to use the "tally" table to produce ranked keyword searches.

The "tally" table technique (say that 6 times fast) is a very powerful tool and a good one to have in your arsenal whenever you need to split delimited strings of words or numbers.

Good luck...and happy coding.

Discuss this article: 8 Comments so far. Print this Article.

If you like this article you can sign up for our weekly newsletter. There's an opt-out link at the bottom of each newsletter so it's easy to unsubscribe at any time.

Email Address:

Related Articles

Joining to the Next Sequential Row (2 April 2008)

Writing Outer Joins in T-SQL (11 February 2008)

How to Use GROUP BY with Distinct Aggregates and Derived tables (31 July 2007)

How to Use GROUP BY in SQL Server (30 July 2007)

SQL Server 2005: Using OVER() with Aggregate Functions (21 May 2007)

Server Side Paging using SQL Server 2005 (4 January 2007)

Using XQuery, New Large DataTypes, and More (9 May 2006)

Counting Parents and Children with Count Distinct (10 January 2006)

Other Recent Forum Posts

Creating an instance of COM component issue (0 Replies)

Output search criteria for multiple OR (0 Replies)

same stored procedures on similar DB delay (0 Replies)

How to add a dropdown to a texbox inside a report. (0 Replies)

Take substrings and put them in IN function (7 Replies)

sql server global variables (0 Replies)

vm to azre disk db copy issue (2 Replies)

question on 'lag' funtion (3 Replies)

Subscribe to SQLTeam.com

Weekly SQL Server newsletter with articles, forum posts, and blog posts via email. Subscribers receive our white paper with performance tips for developers.

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -