Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 MSIDXS OpenQuery, Paged, Very Slow

Author  Topic 

ferr
Starting Member

4 Posts

Posted - 2008-06-12 : 20:59:14
I have a paging query that uses OpenQuery to access the MSIDXS indexing service and return records where text is matched. The query looks like this:

select top 25 * from
OpenQuery(FileSystem, 'SELECT DocTitle, FileName, Rank, Size, Create from Scope() where contains(''report'') ORDER BY Rank DESC')
as Q, LookupDocuments_dbv AS v where Q.FileName=v.Loc_cst and catname_cst='Main'
and (v.docid_cin not in (select top 600 v.docid_cin from
OpenQuery(FileSystem, 'SELECT DocTitle, FileName, Rank, Size, Create from Scope() where contains(''report'') ORDER BY Rank DESC')
as Q, LookupDocuments_dbv AS v where Q.FileName=v.Loc_cst order by v.docid_cin)) order by v.docid_cin

Not the exact query, but it is just as slow as the query in my program. This query will retrieve the top 25 records starting at record 601 where 'report' is found in a document and where the category = 'Main'

What is really weird about this is that in management studio this query will sometimes return in like 3 seconds, while the exact same query will take 30 seconds using ASP.Net / System.Data.SqlClient (and I mean it actually sits on sqlCommand.ExecuteReader() for a long period of time). Also, does management studio cache the table randomly, because sometimes it'll take forever and other times it'll be fast.

I do not have response issues with this paging query in non-MSIDXS queries.. I can for example return the 5,000th - 5,025th of 50,000 records in less than a second while the OpenQuery manages to get progressively slower very quickly (like 3 extra seconds per page of 25 records, so it takes like 20 seconds to return 150-175 of 600).

ferr
Starting Member

4 Posts

Posted - 2008-06-13 : 10:00:09
I think I may have found something to speed it up. I can use the Path field from msidxs to set a where clause that narrows down what documents to return from the openquery. However I don't know if you can do this with a variable.. isn't there something about not being able to use variables in OpenQuery? So I would not be able to do something like:

OpenQuery(FileSystem, 'SELECT DocTitle, path, FileName, Rank, Size, Create from Scope() where contains(''report'') and path like ''%' + @CatName + '%'' ORDER BY Rank DESC')

I can just put the text directly into it instead using a variable.. this is a security problem, but I guess I can do some pre-processing for size/etc.

Edit: Nope, even when setting a where clause for path to narrow down results I still get progressively slower at the same rate.

Edit2: If I remove the outer where clause for the category I get results much faster. It seems that some constraints have a huge impact while other do not. This is only true when using the openquery to get to MSIDXS, I'm sure there is some slow down as you get further into a normal query, but it appears to be negligible.
Go to Top of Page

ferr
Starting Member

4 Posts

Posted - 2008-06-13 : 10:33:21
This takes 2 seconds to return in Management Studio, but 45 seconds to return after firing SqlCommand.ExecuteReader()
Why? Is there some caching going on in MS? Can I turn that off so that I can experience similar results that I experience with ExecuteReader??

declare @CatName as varchar(80)
set @CatName='Main'
declare @Limit as int
set @Limit = 25
declare @Start as int
set @Start = 555


SELECT TOP (@Limit) DocID_cin, DateCreated_cdt, DocTitle_cst, ProgID_cin, DocTypeID_cin, Loc_cst, AOPG_cst,
OfficeDocNum_cst, DocDescription_cst, ProgName_cst, CurrentProgram_cby, LName_cst, FName_cst, ContactName_cst,
DocTypeAcronym_cst, FilePath_cst, UploadedBy_cst, CatID_cin, CatName_cst, CatAcronym_cst, DateRecordCreated_cdt,
CalendarYear_cin, DocTypeDesc_cst, ProjCode_cst, Project_cst, OfficeID_cin, DocContactID_cin, DocProject_cst, DocContactName_cst, ContentType_cst
FROM OpenQuery(FileSystem, 'SELECT DocTitle, Path, FileName, Rank, Size, Create from Scope() where contains(''report'') ORDER BY Rank DESC')
as Q, LookupDocuments_dbv AS v WHERE (CatName_cst = @CatName AND Q.FileName=v.Loc_cst) AND (docid_cin NOT IN (SELECT TOP (@Start) docid_cin
FROM OpenQuery(FileSystem, 'SELECT DocTitle, Path, FileName, Rank, Size, Create from Scope() where contains(''report'') ORDER BY Rank DESC')
as Q, LookupDocuments_dbv AS v WHERE CatName_cst = @CatName AND Q.FileName=v.Loc_cst ORDER BY DocID_cin DESC)) ORDER BY DocID_cin DESC



This, on the other hand, a non-msidxs/openquery query returns with my software in 300 milliseconds according to firebug, so it can't be something with the paging query style, there's something up with MSIDXS/OpenQuery..

declare @CatName as varchar(80)
set @CatName='Main'
declare @Limit as int
set @Limit = 25
declare @Start as int
set @Start = 455
declare @title1 as varchar(80)
set @title1 = 'life'

SELECT TOP (@Limit) DocID_cin, DateCreated_cdt, DocTitle_cst, ProgID_cin, DocTypeID_cin, Loc_cst, AOPG_cst,
OfficeDocNum_cst, DocDescription_cst, ProgName_cst, CurrentProgram_cby, LName_cst, FName_cst, ContactName_cst,
DocTypeAcronym_cst, FilePath_cst, UploadedBy_cst, CatID_cin, CatName_cst, CatAcronym_cst, DateRecordCreated_cdt,
CalendarYear_cin, DocTypeDesc_cst, ProjCode_cst, Project_cst, OfficeID_cin, DocContactID_cin, DocProject_cst, DocContactName_cst, ContentType_cst
FROM LookUpDocuments_dbv WHERE (CatName_cst = @CatName AND (DocTitle_cst like '%' + @title1 + '%' or DocDescription_cst like '%' + @title1 + '%'))
AND (docid_cin NOT IN (SELECT TOP (@Start) docid_cin FROM LookUpDocuments_dbv WHERE CatName_cst = @CatName AND (DocTitle_cst like '%' + @title1 + '%'
or DocDescription_cst like '%' + @title1 + '%') ORDER BY DocID_cin DESC)) ORDER BY DocID_cin DESC
Go to Top of Page

ferr
Starting Member

4 Posts

Posted - 2008-06-13 : 11:52:54
I've decided to throw out the idea of paging if I'm using OpenQuery.. so now I just retrieve every record within where clause constraints without a top/limit, and I do paging on the C# side after getting all of the records. I now get results in the 200ms range. yay
Go to Top of Page
   

- Advertisement -