Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 Old Forums
 CLOSED - General SQL Server
 Need suggestion and help!

Author  Topic 

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-05-31 : 20:12:46
Hello,

I have two tables. The first tables contains a set of keywords and looks like this:

It has approx 40 rows.

CREATE TABLE [REBLOGGER_KEYWORDS] (
[id] [int] IDENTITY (1, 1) NOT NULL ,
[keyword_display] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[keyword1] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[keyword2] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[keyword3] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[keyword4] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[keyword5] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[keyword6] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[priority] [int] NULL
) ON [PRIMARY]
GO


The second table looks like this and has approx 6000 rows.

CREATE TABLE [REBLOGGER_POSTS] (
[ID] [int] IDENTITY (1, 1) NOT NULL ,
[BLOG_ID] [int] NOT NULL ,
[POST_PATH] [varchar] (500) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[POST_URL] [varchar] (500) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[POST_TITLE] [varchar] (500) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[POST] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[POST_DATE] [datetime] NOT NULL ,
[POST_ACTIVE] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL CONSTRAINT [DF_REBLOG_POSTS_POST_ACTIVE] DEFAULT ('Y'),
[keyword_id] [int] NULL ,
CONSTRAINT [PK_REBLOG_POSTS] PRIMARY KEY NONCLUSTERED
(
[ID]
) WITH FILLFACTOR = 90 ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO



The first problem is that when I do a select * from REBLOGGER_POSTS it takes over 40 secs to just return 3000 rows.

The second problem is that I need to check each keyword with the posts table to see if a keyword is available in a post or post_title - where post is a text field. If it has a keyword then I just add the keywordid to the posts table. I need to run this so that it checks each post regardless if it already has a keywordid or not. The idea is to reply the keywords to the posts. For that purpose I am using the following sp, which I am sure is totally not correct.

CREATE procedure pr_REBLOGGER_RankPosts 
as

set nocount on
declare @mId int, @mKeyWord1 varchar(25),@mKeyWord2 varchar(25),@mKeyWord3 varchar(25),@mKeyWord4 varchar(25),@mKeyWord5 varchar(25),@mKeyWord6 varchar(25)
declare @updSQL nvarchar(4000)
declare rate_crsr CURSOR FAST_FORWARD FOR select id, keyword1, keyword2, keyword3, keyword4, keyword5, keyword6 from reblogger_keywords order by priority

/*open cursor*/
open rate_crsr

/* get first row from cursor to start with*/

fetch next from rate_crsr into @mid, @mKeyWord1, @mKeyWord2, @mKeyWord3, @mKeyWord4, @mKeyWord5,@mKeyWord6

/* loop through all records fetching one row at a time*/
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_id = NULL'

exec sp_executesql @updSQL

while @@fetch_status = 0
begin
/* update reblogger_posts for matching keyword in the post title*/
if len(@mkeyword1) > 0
begin
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_id = ''' + cast(@mid as varchar(10)) + ''' where post_title like ''%' + @mkeyword1 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL

/* update reblogger_posts for matching keyword in the post body*/
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_ID = ''' + cast(@mid as varchar(10)) + ''' where post like ''%' + @mkeyword1 + '%'' and KEYWORD_ID is null'

/*select @updsql*/
exec sp_executesql @updSQL
end
if len(@mkeyword2) > 0
begin
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_id = ''' + cast(@mid as varchar(10)) + ''' where post_title like ''%' + @mkeyword2 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL

/* update reblogger_posts for matching keyword in the post body*/

set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_ID = ''' + cast(@mid as varchar(10)) + ''' where post like ''%' + @mkeyword2 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL
end
if len(@mkeyword3) > 0
begin
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_id = ''' + cast(@mid as varchar(10)) + ''' where post_title like ''%' + @mkeyword3 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL

/* update reblogger_posts for matching keyword in the post body*/
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_ID = ''' + cast(@mid as varchar(10)) + ''' where post like ''%' + @mkeyword3 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL
end

if len(@mkeyword4) > 0
begin
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_id = ''' + cast(@mid as varchar(10)) + ''' where post_title like ''%' + @mkeyword4 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL

/* update reblogger_posts for matching keyword in the post body*/
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_ID = ''' + cast(@mid as varchar(10)) + ''' where post like ''%' + @mkeyword4 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL
end

if len(@mkeyword5) > 0
begin
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_id = ''' + cast(@mid as varchar(10)) + ''' where post_title like ''%' + @mkeyword5 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL

/* update reblogger_posts for matching keyword in the post body*/
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_ID = ''' + cast(@mid as varchar(10)) + ''' where post like ''%' + @mkeyword5 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL
end

if len(@mkeyword6) > 0
begin
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_id = ''' + cast(@mid as varchar(10)) + ''' where post_title like ''%' + @mkeyword6 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL

/* update reblogger_posts for matching keyword in the post body*/
set @updSQL = 'UPDATE REBLOGGER_POSTS SET KEYWORD_ID = ''' + cast(@mid as varchar(10)) + ''' where post like ''%' + @mkeyword6 + '%'' and KEYWORD_ID is null'
/*select @updsql*/
exec sp_executesql @updSQL
end

/* fetch next record from cursor */
fetch next from rate_crsr into @mid, @mKeyWord1, @mKeyWord2, @mKeyWord3, @mKeyWord4, @mKeyWord5, @mKeyWord6
end
/* mark post for deletion if no matching keyword found */
set @updSQL = 'UPDATE REBLOGGER_POSTS SET POST_ACTIVE = ''' + 'D' + ''' where KEYWORD_ID is null'
exec sp_executesql @updSQL
set @updSQL = 'UPDATE REBLOGGER_POSTS SET POST_ACTIVE = ''' + 'Y' + ''' where KEYWORD_ID is NOT null'
exec sp_executesql @updSQL
/*select @updsql*/

/* close cursor*/
deallocate rate_crsr


Now is there anything that I can boost this sp or the table? I know I should not use cursors, but I dont know how to check each keyword with the posts rows.

Any ideas or suggestions are very appreciated.

raclede
Posting Yak Master

180 Posts

Posted - 2005-05-31 : 20:36:44
I think cursors really hurts the performance. Well by looking at your Code hmm.. esp the Update part you can simplify it.

"If the automobile had followed the same development cycle as the computer, a Rolls-Royce would today cost $100, get a million miles per gallon, and explode once a year, killing everyone inside. "

raclede
Go to Top of Page

Merkin
Funky Drop Bear Fearing SQL Dude!

4970 Posts

Posted - 2005-05-31 : 21:13:34
Your main problem is your design.
The keyword1, keyword2 etc in one table isn't very normalized. I'd put each keyword as a separate row in a table which will make it much easier to write a set based query to find and update keywords.

Is it too late to change your design ?



Damian
Ita erat quando hic adveni.
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-05-31 : 21:22:26
Hello

thanks for the reponse to both. Merkin, how would adding the keywords into one row each make it different? The reason why we put several keyword columns is that a certain keyword can belong in several categories. For example keyword java.xml and jaxp is the same. We are already using the tables as they are, but I am willing to change them, if that it what it makes it faster. I am open to hear about the changes you would do.

Again thanks!
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-05-31 : 21:23:03
quote:

Is it too late to change your design ?



Its never too late to fix crap like that....

Sorry... what am I thinking... NULLs are GOOD............

DavidM

A front-end is something that tries to violate a back-end.
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-05-31 : 21:23:41
Furthermore how do I make the post table faster? It takes over 40 secs to retrieve 3.000 posts and there are approx. 6.000.
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-05-31 : 21:30:26
quote:

Furthermore how do I make the post table faster?



Would it be the fact that you have stored the Smithsonian Institute in you TEXT column?

Seriously...

SELECT MAX(DATALENGTH(POST)) as MaxBytes, AVG(DATALENGTH(POST)) as AvgBytes
FROM REBLOGGER_POSTS

DavidM

A front-end is something that tries to violate a back-end.
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-05-31 : 21:45:24
well its a post column. How else I am supposed to store the post content, where the lentgth can be very long?

MaxBytes 170796
AvgBytes 1545
Go to Top of Page

Merkin
Funky Drop Bear Fearing SQL Dude!

4970 Posts

Posted - 2005-05-31 : 21:48:05
Why do you need to SELECT * from that table with no filtering ? If you have lots of large posts the slowdown is probably lots of IO and network traffic, there isn't anything you can do about that, except design your application better so that your queries don't select EVERYTHING in one go.

As for your design, if I get some time later I'll try to explain it.



Damian
Ita erat quando hic adveni.
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-05-31 : 21:48:18
Not excessive numbers...

Try the "TEXT IN ROW" option on this table..

DavidM

A front-end is something that tries to violate a back-end.
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-05-31 : 21:52:30
sorry for the stupid question, but how do I set that option when using:

SELECT MAX(DATALENGTH(POST)) as MaxBytes, AVG(DATALENGTH(POST)) as AvgBytes
FROM REBLOGGER_POSTS
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-05-31 : 22:25:16
It a configuration setting...

BOL.. sp_tableoption

DavidM

A front-end is something that tries to violate a back-end.
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-05-31 : 23:01:11
Thanks, but it returns the same number.

sp_tableoption N'REBLOGGER_POSTS', 'text in row', 'ON'
SELECT MAX(DATALENGTH(POST)) as MaxBytes, AVG(DATALENGTH(POST)) as AvgBytes
FROM REBLOGGER_POSTS
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2005-05-31 : 23:08:21
You missed several vital aspects..

1) The amount to place in row
2) How SQL deals with..
Keep reading....

sp_tableoption N'REBLOGGER_POSTS', 'text in row', '2048'

It will not alter the data.. only its physical location..
You MUST update exisitng data for it to be "re located".

UPDATE REBLOGGER_POSTS SET POST = POST
WHERE DATALENGTH(POST) < = 2048

DavidM

A front-end is something that tries to violate a back-end.
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-05-31 : 23:16:17
Sorry for another stupid question, but what exactly will the 'table in row' option do? I dont understand it. Thanks for helping.
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-06-26 : 12:27:53
Is there any possibility to change the sp to not use cursors?
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-06-26 : 20:09:40
Can anybody help me please? I can change the design if necessary.
Go to Top of Page

vishal_7
Posting Yak Master

127 Posts

Posted - 2005-06-26 : 20:21:56
Would it help to enable full text search on that database and then using FreeText?
Go to Top of Page
   

- Advertisement -