LIKE statement on large numbers of rows in table

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

LIKE statement on large numbers of rows in table

Author

Topic

vladicaognjanovic
Starting Member

26 Posts

Posted - 2006-07-05 : 09:51:39

Hy,
I need suggestion
how to performe SELECT statement with %LIKE% in WHERE on large number of records in table.
Thanks.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-07-05 : 09:58:01

It is not recommended since LIKE '%Text%' will not use any index. Otherwise, just write

SELECT * FROM MyTable WHERE MyField LIKE '%slow%'

Peter Larsson
Helsingborg, Sweden

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2006-07-06 : 02:37:13

Although it cant make use of index, you can make it to use it if you have primary and use it in where clause



Declare @t table(i int primary key clustered, data varchar(100))
insert into @t
	select 1,'test' union all
	select 2,'Testing' union all
	select 3,'tes' union all
	select 4,'Moretest' union all
	select 5,NULL

Select * from @t where data like '%test%' --No index Usage

Select * from @t where data like '%test%' and i>=1 --Index Usage

Madhivanan

Failing to plan is Planning to fail

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-07-06 : 03:56:57

Great tip with and i >= 1

Does it also work with and i = i? Or any other comparison? I have noticed that some deletions took about 30 minutes for me. When I added and uid = uid to the DELETE clause, the query ran in less than 12 seconds. Didn't bother to know why then, but this might be the answer?

Peter Larsson
Helsingborg, Sweden

vladicaognjanovic
Starting Member

26 Posts

Posted - 2006-07-06 : 04:24:46

Thanks for replying,
He is my problem. I have table:
CREATE TABLE [dbo].[t](
[id] [int] IDENTITY(1,1) NOT NULL,
[str] [nvarchar](500) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
CONSTRAINT [PK_like] PRIMARY KEY CLUSTERED
and this table can have more than million records (sinhornization with exchange, that why so many records) and I have some string (say string A). My task is to find if that string(A) is substring in column [str] or if a string from [str] is substring of my string(A). Big problem is speed and I need to ask this question very often.
Any suggestion is very much appeciated.

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2006-07-06 : 05:10:58

quote:
Originally posted by Peso

Great tip with and i >= 1

Does it also work with and i = i? Or any other comparison? I have noticed that some deletions took about 30 minutes for me. When I added and uid = uid to the DELETE clause, the query ran in less than 12 seconds. Didn't bother to know why then, but this might be the answer?

Peter Larsson
Helsingborg, Sweden

I dont think using i=i will make use of index. You need to compare it against Literal value. But I am not sure if it is ok with Delete statement. Set the execution plan and see

Madhivanan

Failing to plan is Planning to fail

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-07-06 : 08:48:38

Peso, this is why:
http://weblogs.sqlteam.com/mladenp/articles/9502.aspx

Go with the flow & have fun! Else fight the flow

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-07-06 : 08:51:54

vladicaognjanovic:
you have a few choices:
1. use full text indexing
2. create a table helper that contains split words in your text field. then you can just use like 'sometext%'
3. Horizontaly partition your table and union all query results from all partitioned tables.

Go with the flow & have fun! Else fight the flow

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-07-06 : 09:24:49

quote:
Originally posted by spirit1

Peso, this is why:
http://weblogs.sqlteam.com/mladenp/articles/9502.aspx

Thanks for the reading. Now i am somewhat wiser than a few minutes ago.

I did some testing. Look at following example

CREATE TABLE	#Test
		(
			v TINYINT
		)

INSERT	#Test
SELECT	DISTINCT Number
FROM	master..spt_values
WHERE	Number BETWEEN 0 AND 255

CREATE UNIQUE INDEX IX_Test ON #Test (v)

select * from #test where v > -1
select * from #test where v >= 0

The WHERE does EXACTLY the same thing but look at the execution plans

Peter Larsson
Helsingborg, Sweden

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-07-06 : 09:35:17

interesting...
my guess would because TINYINT is from 0-255 so -1 isn't in that range
so it can't be in the index value and has to calculated
if you changed tinyint to int it gives the same execution plan in both cases.

Go with the flow & have fun! Else fight the flow

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2006-07-06 : 09:35:29

It is because tinyInt doesnt allow negative numbers that may cause Scanning. Change the datatype to SMALLINT or INT and see the difference

Madhivanan

Failing to plan is Planning to fail

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2006-07-06 : 09:36:16

Somewhat

Madhivanan

Failing to plan is Planning to fail

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-07-06 : 09:47:43

Yes I know that TINYINT is 0-255. That's why I forced the WHERE to an invalid number for that datatype.
I believe that compute scalar (since -1) makes SQL server cast column v to smallint on the fly.

But why does SQL Server do a nested loop? Is it because the casting of column v and SQL server has to loop through all "new" v's?

Peter Larsson
Helsingborg, Sweden

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-07-06 : 09:53:52

yes.

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

vladicaognjanovic
Starting Member

26 Posts

Posted - 2006-07-06 : 10:04:23

spirit1:
Thanks,
I have performed some testing on table with 230000 rows. One thing first string is just one word, exctualy its a identifier.
My testing showed that query

select * from [t]
where [t].[str] LIKE 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==%' or 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==' LIKE [like].[str] + '%'
and id >= 1

execute in 1-3 seconds.
I must say that I'm very suspicious, because its imposible or microsoft is a genious.
:)
Are these resuolts rely posible or not.

harsh_athalye
Master Smack Fu Yak Hacker

5581 Posts

Posted - 2006-07-06 : 10:08:53

quote:
Originally posted by vladicaognjanovic

spirit1:
select * from [t]
where [t].[str] LIKE 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==%' or 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==' LIKE [like].[str] + '%'
and id >= 1

execute in 1-3 seconds.
I must say that I'm very suspicious, because its imposible or microsoft is a genious.
:)
Are these resuolts rely posible or not.

I think that the above speed is achieved because you gave expression 'xyz%' and not '%xyz%' as stated in your original post. In the first case, Index scan is possible but not in the second case.

Harsh Athalye
India.
"Nothing is Impossible"

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-07-06 : 10:09:49

yes it is possible.
you've used what it's know i think... as a "reverse like operation"

even better results would be if you changed your OR to UNION ALL:
select * from [t]
where [t].[str] LIKE 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==%' and id >= 1
Union all
select * from [t]
where 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==' LIKE [like].[str] + '%' and id >= 1

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

vladicaognjanovic
Starting Member

26 Posts

Posted - 2006-07-06 : 10:15:55

spirit1:
I must try this

select * from [t]
where [t].[str] LIKE 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==%' and id >= 1
Union all
select * from [t]
where 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==' LIKE [like].[str] + '%' and id >= 1

I get back with results in a minute.

vladicaognjanovic
Starting Member

26 Posts

Posted - 2006-07-06 : 10:20:25

spirit1:
Wait I need to put some more data in table

vladicaognjanovic
Starting Member

26 Posts

Posted - 2006-07-06 : 10:31:36

So here it is:
Now I have 330000 string in table.
First time query

select * from [like]
where [like].[str] LIKE 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==%' and id >= 1
Union
select * from [like]
where 'AJQJRKLLPOOPRKKKQWMMPOIA1JKLL==' LIKE [like].[str] + '%' and id >= 1

exectuted in 2 min and 40 seconds.

after that it needed 2-5 seconds even if I add 10 more rows on every execution.

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-07-06 : 10:35:43

well the server cached the plan.
run
DBCC DROPCLEANBUFFERS

and try it again

Go with the flow & have fun! Else fight the flow

blog thingie: http://weblogs.sqlteam.com/mladenp

Next Page

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources