Keyword search using a tally or sequence table

By Damian Maclennen on 8 October 2001 | Tags: SELECT


Ryan writes "I am looking for a way of creating a stored procedure that will carry out a keyword search without building up the SQL dynamically in the stored procedure. What I'm after is a method of doing this that doesn't require me to build up a huge string."

"When I have to match a list of IDs I would typically parse the comma delimted list of IDs and insert them in to a temporary table and then use an IN clause as follows:
SELECT @strEmployeeID = @strEmployeeID + ','

CREATE TABLE #employees (EmployeeID int not null)

IF CHARINDEX(',', @strEmployeeID) > 0 BEGIN
     WHILE CHARINDEX(',',@strEmployeeID) > 0 BEGIN
	INSERT #employees VALUES(SUBSTRING(@strEmployeeID, 1, CHARINDEX(',', @strEmployeeID) - 1))
	SELECT @strEmployeeID = STUFF(@strEmployeeID, 1, CHARINDEX(',', @strEmployeeID),'')
	END
END

SELECT * FROM Job WHERE EmployeeID IN (SELECT EmployeeID FROM #employees)

Can anyone offer me a similar solution where I parse the keywords in to a temporary table of some sort or is my only option to build a monster string ?"

Sure Ryan. There are a few ways to do this. One way is used on SQL Team, Graz wrote a great article about it here. A few months ago, I was building a similar site to this one and had an idea of a different way to go about it.

First some background. Rob Volk wrote an article at the beginning of this year on Parsing CSV Values Into Multiple Rows using a "tally" or "sequence" table method. If you are not familiar with this article and/or technique I strongly suggest you go read that now. If it is new to you, read it a few times, it may take a bit to get your head around it. Essentially, it splits out a comma delimited list and returns a set.

The technique I am going to demonstrate takes this approach.

Firstly, we need our tally table. I am going to call my table "Sequence".


Create Table Sequence(
       Seq int
)

--Generate some data for it
SET NOCOUNT ON
Declare @i int
Set @i = 1
WHILE @i <= 8000
     BEGIN
         Insert Into Sequence Values (@i)
         Set @i = @i + 1
     END
SET NOCOUNT OFF   

Now take our list of keywords (in this case, they are space delimited) and convert them into a Set. Paste this code into Query Analyzer and have a look.


Declare @Keywords varchar(2000)
Select @Keywords = 'my space delimited string'

	Select
		Substring(' ' + @keywords + ' ',seq,
		CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq)
	FROM SEQUENCE
	
	WHERE 
	 seq <= len(' ' + @keywords + ' ') and
	 Substring(' ' + @keywords + ' ', seq - 1, 1) = ' ' and
	 CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq > 0 

Pretty cool huh ?

Next up, we JOIN the article table to this set based on the body text. Where a string in the article body matches one of the words in this set, the join condition will match and the table will join. Where the word appears more than once, the table will join multiple times. If you group these and and count the number of joins ... voila! ... instant ranked search results.

Here is some code...

--Assuming we have an Articles table
Create Table Articles(
	ArticleID int NOT NULL,
	ArticleTitle VarChar(200) NOT NULL,
	ArticleDescription VarChar(500) NOT NULL,
	ArticleBody Text
)

Declare @Keywords varchar(2000)
Select @Keywords = 'my space delimited string'


	Select ArticleID, ArticleTitle, ArticleDescription, count(AA_ID) hits
	FROM SEQUENCE
	INNER JOIN Articles ON Articles.ArticleBody like '%' +  
		Substring(' ' + @keywords + ' ',seq,
		CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq)
		+ '%'
	
	WHERE 
	 seq <= len(' ' + @keywords + ' ') and
	 Substring(' ' + @keywords + ' ', seq - 1, 1) = ' ' and
	 CharIndex(' ' , ' ' + @keywords + ' ' , seq) - seq > 0 
	
	Group by ArticleID, ArticleTitle, ArticleDescription
	
	ORDER BY Hits DESC	

So there you have it, a simple way to use the "tally" table to produce ranked keyword searches.

The "tally" table technique (say that 6 times fast) is a very powerful tool and a good one to have in your arsenal whenever you need to split delimited strings of words or numbers.

Good luck...and happy coding.


Related Articles

Joining to the Next Sequential Row (2 April 2008)

Writing Outer Joins in T-SQL (11 February 2008)

How to Use GROUP BY with Distinct Aggregates and Derived tables (31 July 2007)

How to Use GROUP BY in SQL Server (30 July 2007)

SQL Server 2005: Using OVER() with Aggregate Functions (21 May 2007)

Server Side Paging using SQL Server 2005 (4 January 2007)

Using XQuery, New Large DataTypes, and More (9 May 2006)

Counting Parents and Children with Count Distinct (10 January 2006)

Other Recent Forum Posts

As I gain experience and get older, I'm working much slower, but producing better quality, but (29m)

Master DB 2019 problem (15h)

Please help, I can't login remote to MS SQL 2019 without sysadmin role (22h)

SSMS Cannot Connect to Newly Installed Instance (2017) (1d)

SQL server 2019 alwayson problem (2d)

Finding Possible Duplicates (4d)

SQL Agent Service will not start - timeout error (5d)

Adding a SQL connection to Microsoft Visual Studio (5d)

- Advertisement -