Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 query help

Author  Topic 

ssunny
Posting Yak Master

133 Posts

Posted - 2009-03-20 : 16:17:07
Hi there,
I have a table called dbo.test as follows:

create table dbo.test
(
id int not null,
value text null
)

smaple data:

id value

1 blah. blah.blah ABC blah blah ABC
2 ABC blah ABC DEFG blah blah blah
3 blah blah blah blah HAHA blah blah
4 NULL
..........
1000

Now I want to know how many times ABC appears in value column of dbo.test table.I want the total count of ABC.
I am using sql 2000.

Thanks.

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2009-03-20 : 16:46:04
What is the length of your largest [value] value? Assuming it is < 8000 characters then you could do this:

use junk

create table #test
(
[id] int not null
,[value] text null
)
go


insert #test ([id], [value])
select 1, 'blah. blah.blah ABC blah blah ABC' union all
select 2, 'ABC blah ABC DEFG blah blah blah' union all
select 3, 'blah blah blah blah HAHA blah blah' union all
select 4, NULL


select sum((len(v) - len(replace(v, 'ABC', '')))/3)
from (
select convert(varchar(8000), [value]) v from #test
) d
go
drop table #test

OUTPUT:
-----------
4


If any of your values are longer than 8000 characters then this will be a little more difficult.
If no value is that long then why did you choose the [TEXT] datatype?

Be One with the Optimizer
TG
Go to Top of Page

ssunny
Posting Yak Master

133 Posts

Posted - 2009-03-20 : 17:00:51
TG,
Thanks for the quick reply. The max length of values field is 24049 characters. So I guess this solution will not work in that case. So what are my options now?

Thanks.
Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2009-03-20 : 18:17:27
well, if you have to use a method from within sql server then I guess you could first use my original suggestion for all values < 8000.
For the remaining rows I think you'll need to abandon the set based approach.

PATINDEX and SUBSTRING both work with text data. So one idea would be to return an index from PATINDEX. While that index > 0 you'll get another patindex from the substring starting from the 1 character after the previous index. That would be in a loop. The while condition would be while the index > 0. From within the loop you'd need to increment a counter to know how many occurances were found for that value. Then reset your index and move on the the next row.

We're talking about 2 loops (one occurance counter nested inside a row loop) It won't be fast. I'm not at a server now so I can't provide tested code. I would also be surprised if there's not a (much) better way. Let me think on it - maybe someone else will offer a suggestion.

Be One with the Optimizer
TG
Go to Top of Page

ssunny
Posting Yak Master

133 Posts

Posted - 2009-03-21 : 11:05:55
TG,

Will try your approach over the weekend and post back the result. Thanks for the help.

Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2009-03-21 : 13:59:24
Here is a tested version. I commented out the >8000 condition (in two places) so my little test table would work. So both results now return 4:

create table #test
(
[id] int not null
,[value] text null
)
go


insert #test ([id], [value])
select 1, 'blah. blah.blah ABC blah blah ABC' union all
select 2, 'ABC blah ABC DEFG blah blah blah' union all
select 3, 'blah blah blah blah HAHA blah blah' union all
select 4, NULL

select sum((len(v) - len(replace(v, 'ABC', '')))/3) as [totalcount<=8000]
from (
select convert(varchar(8000), [value]) v
from #test
where datalength([value]) <= 8000
) d

----------------------------------------
--for values > 8000
declare @i int
,@totcnt int
,@id int
,@pos int
,@str varchar(200)
set @str = 'ABC'

select @id = min([id])
,@totcnt = 0
from #test
--where datalength([value]) > 8000

while @id is not null
begin
--first check in current row
select @i = patindex('%' + @str + '%', [value])
,@pos = @i
from #test
where id = @id

--if any found then increment the counter and look for another occurance
while @i > 0
begin
--increment count
set @totcnt = @totcnt + 1

--look for index of next occurance starting from first character after last occurance
select @i = patindex('%' + @str + '%', substring([value], @pos + len(@str) + 1, datalength([value])))
from #test
where id = @id

set @pos = @pos + @i
end

--go to next row
select @id = min([id])
from #test
where id > @id
--and datalength([value]) > 8000
end
select @totcnt [totalCount>8000]

go
drop table #test

OUTPUT:
totalcount<=8000
----------------
4

totalCount>8000
---------------
4



Be One with the Optimizer
TG
Go to Top of Page

ssunny
Posting Yak Master

133 Posts

Posted - 2009-03-23 : 10:32:29
Hey TG,

Thanks so much for the code. I'll have to modify it according to my scenario. I'll test it and update you.

Thanks again.
Go to Top of Page

ssunny
Posting Yak Master

133 Posts

Posted - 2009-03-23 : 12:37:30
Hurrayyyyyyyyyy!!!! This works great TG. Thanks a ton. I always had trouble in the past working with text data. Now I can use this approach in many reports.
Thank you so so much.

Go to Top of Page

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2009-03-23 : 12:46:37
Glad its working for you Thanks for the update.




Be One with the Optimizer
TG
Go to Top of Page
   

- Advertisement -