 SQL Query for Most Probable Sequence text in a Column


AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2006-11-06 : 07:59:03
Sandip writes "I am working on a project to detect advertisements in a web page.

For that I use one column called ImageText.

To train my database I stored (say) 100000 records in ImageText.

Now I want to find the first n sequences of text that occur most often.

For example:

ImageText
---------
ABc
ABcd
ABABABABAB
ABcdxyAB
ABcdABcdABcdcdAB

Here,
String   Number of occurrences
------   ---------------------
AB 13
cd 6
ABc 6
ABcd 5
cdA 3
.....
....

ABc 1
ABcd 1
ABABABABAB 1
ABcdxyAB 1
ABcdABcdABcdcdAB 1

That is, I also want to find sequences inside a string.

For the first 3 most popular sequences:

Output
------
AB
cd
ABc

Please reply.
If you do not understand the question, send me mail."

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2006-11-06 : 08:28:08
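[code]CREATE FUNCTION dbo.fnFindWord
(
    @SearchIn VARCHAR(8000),
    @SearchFor VARCHAR(8000)
)
RETURNS INT
AS

BEGIN
    -- Counts how often @SearchFor occurs in @SearchIn, overlapping matches included.
    DECLARE @Position INT,
            @Items INT

    SELECT @Items = 0,
           @Position = CHARINDEX(@SearchFor, @SearchIn)

    -- CHARINDEX returns 0 when nothing is found and NULL for NULL input;
    -- either value ends the loop, so NULL arguments cannot loop forever.
    WHILE @Position > 0
    BEGIN
        -- Count the hit, then look for the next one a single character further on.
        SELECT @Items = @Items + 1,
               @Position = CHARINDEX(@SearchFor, @SearchIn, @Position + 1)
    END

    RETURN @Items
END[/code]
Call with:
SELECT dbo.fnFindWord('ababab', 'abab') -- returns 2, because overlapping matches are counted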
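
To rank sequences over the whole column, the function can be summed per candidate. The following is only a sketch: @Images and @Candidates are hypothetical table variables standing in for the real ImageText table and for a list of sequences to test, neither of which is named in the question.

[code]-- Hypothetical stand-in for the table from the question, loaded with its sample rows.
DECLARE @Images TABLE (ImageText VARCHAR(8000))
INSERT INTO @Images (ImageText)
SELECT 'ABc' UNION ALL
SELECT 'ABcd' UNION ALL
SELECT 'ABABABABAB' UNION ALL
SELECT 'ABcdxyAB' UNION ALL
SELECT 'ABcdABcdABcdcdAB'

-- Hypothetical list of sequences to count; this sketch does not generate them.
DECLARE @Candidates TABLE (Seq VARCHAR(8000))
INSERT INTO @Candidates (Seq)
SELECT 'AB' UNION ALL
SELECT 'cd' UNION ALL
SELECT 'ABc'

-- Pair every candidate with every row and add up the per-row counts.
SELECT c.Seq,
       SUM(dbo.fnFindWord(i.ImageText, c.Seq)) AS Occurrences
FROM @Candidates AS c
CROSS JOIN @Images AS i
GROUP BY c.Seq
ORDER BY Occurrences DESC[/code]

On the sample rows this gives 13 for AB and 6 each for cd and ABc, matching the question. It only counts sequences that are already in the candidate list, and CHARINDEX follows the column collation, so matching may be case-insensitive.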


Peter Larsson
Helsingborg, Sweden

samuelclay
Yak Posting Veteran

71 Posts

Posted - 2006-11-06 : 12:13:23
Wow, what is supposed to be used to define a word (or sequence of text)? It looks like the OP has it returning almost every possible sequence of 2 chars or longer... but it misses some... they have AB and ABc and cd, but they don't have Bc or dA (though those might be included in the ...).

So they want to go through 100000 records and count the occurrences of every possible 2+ char sequence... wow... Maybe more information would help here...
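
For illustration only, counting every sequence of 2 or more characters could be sketched with a numbers table, as below. @Images and @Numbers are hypothetical table variables (the thread names neither), and this is not presented as what the OP actually needs. A string of length L produces L * (L - 1) / 2 such sequences, so 100000 long rows would generate a very large intermediate set.

[code]-- Hypothetical stand-in for the OP's table, loaded with the sample rows.
DECLARE @Images TABLE (ImageText VARCHAR(8000))
INSERT INTO @Images (ImageText)
SELECT 'ABc' UNION ALL
SELECT 'ABcd' UNION ALL
SELECT 'ABABABABAB' UNION ALL
SELECT 'ABcdxyAB' UNION ALL
SELECT 'ABcdABcdABcdcdAB'

-- Hypothetical numbers table; 16 covers the longest sample string.
DECLARE @Numbers TABLE (N INT PRIMARY KEY)
DECLARE @n INT
SET @n = 1
WHILE @n <= 16
BEGIN
    INSERT INTO @Numbers (N) VALUES (@n)
    SET @n = @n + 1
END

-- One joined row per (record, start position, length), i.e. one row per
-- occurrence of a sequence, overlapping occurrences included.
SELECT TOP 3 WITH TIES
       SUBSTRING(i.ImageText, p.N, l.N) AS Seq,
       COUNT(*) AS Occurrences
FROM @Images AS i
JOIN @Numbers AS p                          -- start position
  ON p.N <= LEN(i.ImageText) - 1
JOIN @Numbers AS l                          -- sequence length, 2 or more
  ON l.N BETWEEN 2 AND LEN(i.ImageText) - p.N + 1
GROUP BY SUBSTRING(i.ImageText, p.N, l.N)
ORDER BY COUNT(*) DESC[/code]

WITH TIES is used because several sequences can share a count: on the sample rows Bc scores 6, the same as cd and ABc. Grouping follows the column collation, so under a case-insensitive collation AB and ab would be counted together, and LEN ignores trailing spaces.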