Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Remove duplicates (Special version)

Author  Topic 

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-25 : 17:25:58
Hi, I found a function on the site that was pretty handy (http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=19647) but now I want more...

let's say that the result of the query is something like this:

NewText:
Home, Home, Info, contact, info, contact, contact
Home, contact, contact, info, info
...


Now I want these result to come out like this:

Home, info, contact, info, contact
Home, contact, info
...


So I need a query that removes, or does not add, if the previous page (it's a clickpath) is the same as the one that's going to be added...

Any idea's?

Thanks

Bjorn


skillile
Posting Yak Master

208 Posts

Posted - 2003-01-25 : 22:18:24
How are you stroring your paths, what is your data structure?

slow down to move faster...
Go to Top of Page

ValterBorges
Master Smack Fu Yak Hacker

1429 Posts

Posted - 2003-01-26 : 00:32:43
You mean something like this?
Can't do any better withou ddl.

CREATE TABLE #Test (letter nvarchar(1))

INSERT INTO #Test (letter) VALUES ('a')
INSERT INTO #Test (letter) VALUES ('a')
INSERT INTO #Test (letter) VALUES ('b')
INSERT INTO #Test (letter) VALUES ('b')


DECLARE @CSV NVARCHAR(50)
SET @CSV = ''

DECLARE @Lastletter NVARCHAR(1)
SET @LastLetter = ''

SELECT @CSV =
CASE letter
WHEN @LastLetter THEN @CSV + ''
ELSE @CSV + ',' + letter
END,
@LastLetter = letter
FROM #Test


SET @CSV = SUBSTRING(@CSV, 2,LEN(@CSV))


PRINT @CSV

DROP TABLE #Test

Results:
a,b


Edited by - ValterBorges on 01/26/2003 00:37:04
Go to Top of Page

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-26 : 04:14:40
yeah thanks ValterBorges, It works. I didn't knew how I could store the last letter so I was trying something with RIGHT(@CSV, 3) but it wasn't working, this does.
Thanks again

Bjorn

Go to Top of Page

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-26 : 05:02:18
The function I created looks like this:

alter function Getklikpad
(@i int)
returns varchar(8000)
as
begin
declare @String varchar(8000)
DECLARE @lastPage int

SET @lastPage = ''

Select @String = CASE paginaID
WHEN @lastPage THEN @String + ''
ELSE @String + (select alias from pagina_258679 where pagina_258679.id = paginaID) + ', '
END,
@lastPage = paginaID

from log2_258679 where bezoekerID = @I order by id
return left(@String, len(@String)-1)
end
go


The 2 tables:
log2_258679:
id, bezoekerID, paginaID

pagina_258679
id, alias

The 258679 is the user id and there's the problem.

I need to convert this function to a dynamic function. But when I do so, It returns the error "Invalid use of 'EXECUTE' within a function."


select @String ='CASE paginaID
WHEN @lastPage THEN @String + ''''
ELSE @String + (select alias from pagina_258679 where pagina_258679.id = paginaID) + '', ''
END,
@lastPage = paginaID
from log2_258679 where bezoekerID = @I order by id'

exec(@String)

What is wrong and even better, how do I solve this.

Bjorn

Go to Top of Page

ValterBorges
Master Smack Fu Yak Hacker

1429 Posts

Posted - 2003-01-26 : 19:00:48
first I would probably use a join the log2 and pagina tables instead of a correlated subquery.

You can join on pagina_258679.id = paginaID

Next I would built the whole select into a string @SQL
use debug or print statements to make sure the syntax is correct.

Finally I would use EXEC sp_executesql @SQL to execute it.

http://www.sqlteam.com/item.asp?ItemID=4599
http://www.sqlteam.com/item.asp?ItemID=4619


Go to Top of Page

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-28 : 08:25:09
MMM it still doens't work. If I change it as you sugested, I'll get another Error back:

'Only functions and extended stored procedures can be executed from within a function.'

Here's the script:

alter function Getklikpad
(
@i int,
@siteID int
)
returns varchar(8000)
as
begin
declare @String varchar(8000)
DECLARE @lastPage int
declare @SQL nvarchar(1000)

SET @lastPage = ''

select @SQL = 'select @String =CASE paginaID
WHEN @lastPage THEN @String + ''''
ELSE @String + alias + '', ''
END,
@lastPage = paginaID
from log2_258679 join pagina_258679 on paginaID = pagina_258679.id where bezoekerID = @I order by log2_258679.id'

Exec sp_executesql @SQL, N'@String varchar(8000) output', @String output


return left(@String, len(@String)-1)
end
go


set concat_null_yields_null off
select klikpad, count(*) as totaal from (
Select bezoekerID, dbo.Getklikpad(bezoekerID, 258679) as klikpad from log2_258679
group by bezoekerID) a group by klikpad order by count(*) desc

set concat_null_yields_null on



So is it even possible to use sp_executesql in a UDF?
Thanks



Edited by - bjornh on 01/28/2003 08:26:05
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-01-28 : 09:58:07
Why not just use the regular methods to concatenate strings, but just pass in a query that already has duplicates removed?

That is, instead of:

SELECT ID, dbo.fn_Concat(Fieldname)
FROM Table

Use:

SELECT ID, dbo.fn_Concat(Fieldname)
FROM
(SELECT ID, FieldName FROM Table GROUP BY ID, FieldName) A

Doesn't that make things much simplier?


- Jeff

Edited by - jsmith8858 on 01/28/2003 09:58:42
Go to Top of Page

ValterBorges
Master Smack Fu Yak Hacker

1429 Posts

Posted - 2003-01-28 : 10:09:03
No it's not possible but you could move it to a stored procedure instead of a function.



Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-01-28 : 10:47:56
Why is EXEC and dynamic SQL even being used?


- Jeff
Go to Top of Page

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-28 : 11:22:58
because the 258679 will be replaced by ' + convert(varchar(20), @siteID) + '

Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-01-28 : 12:28:19
How many possible siteID's are there? Why are they all in different tables?

- Jeff
Go to Top of Page

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-28 : 13:06:50
It is a website statistic service. (just like Nedstat or Webtrendslive, etc) so I hope I'll have a lot of siteID's... So, with that thought (thousands of sites, with milions of hits, yeah, I'am a dreamer.. :D) It wouldn't be smart to put them in 1 table... So I use 3 tables a site.

Go to Top of Page

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-29 : 14:21:08
quote:

No it's not possible but you could move it to a stored procedure instead of a function.





Can you give me an example or a few more tips because I don't know how I could manage it..

Thanks
Bjorn

Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-01-29 : 14:45:17
quote:

It is a website statistic service. (just like Nedstat or Webtrendslive, etc) so I hope I'll have a lot of siteID's... So, with that thought (thousands of sites, with milions of hits, yeah, I'am a dreamer.. :D) It wouldn't be smart to put them in 1 table... So I use 3 tables a site.



Is this accurate? I'm not sure that it is better to have 1000 tables with 10,000 records each than it is to have 1 table with 10,000,000 records ... can one of the Guru's comment on this? I really have no idea what the best answer is, but I think I would lean towards 1 big table with proper indexes.

With differnt tables, how do you query this database? You would always need dynamic SQL. It seems like big headaches ...

If the data is all in 1 table, then you don't need dynamic SQL, and then you could use a UDF, and you are good to go.

- Jeff
Go to Top of Page

ValterBorges
Master Smack Fu Yak Hacker

1429 Posts

Posted - 2003-01-29 : 17:39:45
From what I understand a b+ tree search runs in log(n) time steps.

Which means it would theoretically take twice as long at best to search 10,000,000 than it would take to search 1,000 even with the proper indexing. Now considering i/o issues it probably slow that down even further depending on what kind of system you have.




Go to Top of Page

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-29 : 17:50:40
Ok, i've done some testing, And here are ther results:

ps. I've used server trace in query analyser to get these numbers

I created 2 stored procedures. one that was useing the old way (dynamic SQL) and one that had 1 table with more records in it.

The table with the dynamic SQL had 10.000 records.
The static table had 500.000 records.

Both SP retrieve the same information only one is using the dynamic SQL and the other uses 'where siteID ='.

The dynamic SP:
total duration was 40 + 50 + 50 = 140ms
And the total of reads was 220 + 220 + 226 = 666 reads

Statich SP:
duration: 115 + 115 = 230ms
reads: 19 + 23 = 42 reads



So I don't really know what to do now.
dynamic is faster, but gives more reads and there isn't a way to use my clickpath function. But a static table takes more time...

What option is the best option..

Thanks again.
Bjorn

Go to Top of Page

bjornh
Yak Posting Veteran

87 Posts

Posted - 2003-01-29 : 17:59:54
quote:

From what I understand a b+ tree search runs in log(n) time steps.

Which means it would theoretically take twice as long at best to search 10,000,000 than it would take to search 1,000 even with the proper indexing. Now considering i/o issues it probably slow that down even further depending on what kind of system you have.





So you're saying, keep the dynamic table structure?

Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-01-29 : 19:13:34
quote:

From what I understand a b+ tree search runs in log(n) time steps.

Which means it would theoretically take twice as long at best to search 10,000,000 than it would take to search 1,000 even with the proper indexing. Now considering i/o issues it probably slow that down even further depending on what kind of system you have.



I would agree. Two things, though:

1) SQL requires some time to parse the dynamic SQL and locate the table among the thousand tables in it's database.

2) The log of 10,000,000 (base 2) is about 23 .... so it requires 23 links to find 1 record in 10,000,000. With number such as 23, twice as fast or slow isn't really a factor -- I don't know exactly how long it takes to traverse a node in the index tree, but it can't be so long that the difference between 23 jumps and 12 jumps is too much. And, as you know, the LOG function grows very slowly ....

However, I really have no idea what I am talking about, just thinking out loud .....

But, assuming it IS twice as slow just to locate the records, consider the fact that EVERY SQL statment in your database must be dynamic and can't truly be optimized, and the overall headache of never being able to write a VIEW, use a UDF or a plain old SELECT statement in your stored procs, and I think it is worth trying to keep everything in one table.

Just my $.02

- Jeff
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2003-01-29 : 19:22:58
One more thing to consider when deciding on your structure --

Think of the following tasks, where each "Group" is in a different table (potentially hundreds of tables) or not:

a) Return the # records for each group
b) Archive records from each group into another based on date
c) Which group has the most activity in a date range?
d) update records for all groups
...etc...

Think of writing T-SQL to perform tasks such as those.

In the case of multiple tables, you'd be looping though every table in the database and using temp tables and/or cursors and such for the above. In the other case, you'd be running single SELECT or UPDATE queries to handle all data in your database at once.


- Jeff

Edited by - jsmith8858 on 01/29/2003 19:23:51
Go to Top of Page

ValterBorges
Master Smack Fu Yak Hacker

1429 Posts

Posted - 2003-01-29 : 20:50:57
I agree Jay,

Plus, I believe that the bottle neck is all in the i/o.
As far as the downside of dynamic sql I also agree but I believe this can be solved by using the horizontal partioning scheme as described here.

http://www.sqlteam.com/item.asp?ItemID=684

Go to Top of Page
    Next Page

- Advertisement -