Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 comparing rows in a table

Author  Topic 

deadfish
Starting Member

38 Posts

Posted - 2004-01-14 : 22:59:32
Hi,
I have a table like this

lineno group item1 item2 item3
1 a i1 i2 i3
2 a i1 i3 i4
3 a i1 i4 i4
4 a i4 i5 i6
5 b i4 i5 i7
6 b i4 i6 i8

I have to compare the value of "item" within each group, and set it to null if the same item appears in the first record of the group (in the same group, same column too)

the result shd be like this

lineno group item1 item2 item3
1 a i1 i2 i3
2 a null i3 i4
3 a null i4 null
4 a i4 i5 i6
5 b i4 i5 i7
6 b null i6 i8

There are 50 "item" columns in each row, it will be tedious to declare 50 variables for storing the item of the first record, and compare other reocrds one by one. Is there any other way to achieve this?

Thanks a lot!

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-15 : 01:19:12
If I understand you correctly, the first row in each group is the row in the group with the smallest value of lineno:

SELECT group, MIN(lineno) AS LineNo
FROM MyTable
Group by [Group]

A rowset of "first rows" of each group can be SELECTed by using the above subquery to select every group's row where the lineno matches the first lineno like this:

SELECT A.*
FROM MyTable A
INNER JOIN (
SELECT group, MIN(lineno) AS LineNo
FROM MyTable
Group by [Group]
) B on B.group = A.group AND B.lineno = A.lineno

then an UPDATE can be written that will NULL item(s) in a row which match the corresponding items in the first row of that group like this:

UPDATE X
SET X.item1 = CASE WHEN X.item1 = Y.item1 THEN NULL ELSE X.item1 END ,
.. REPEAT FOR item 2 and 3
FROM MyTable X
INNER JOIN (
-- the select above follows here
SELECT A.*
FROM MyTable A
INNER JOIN (
SELECT group, MIN(lineno) AS LineNo
FROM MyTable
Group by [Group]
) B on B.group = A.group AND B.lineno = A.lineno
) Y ON Y.group = X.group
WHERE Y.lineno > X.lineno
Go to Top of Page

deadfish
Starting Member

38 Posts

Posted - 2004-01-17 : 11:39:59
Thank you very much SamC.

This is a much better way!

But it seems doesn't work for the following case:

lineno group item1 item2 item3
1 a i5 i2 i3
2 a i1 i3 i4
3 a i1 i4 i4
4 a i4 i5 i6
5 b i4 i5 i7
6 b i4 i6 i8


For group a, the repeated item starts from lineno = 2. The item1 in lineno = 3 cannot be updated.....

Any way to solve this?

Thanks in advance!


Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-17 : 11:44:45
quote:
I have to compare the value of "item" within each group, and set it to null if the same item appears in the first record of the group (in the same group, same column too)


I had taken it a step further by stating it was the smallest value of lineno:

quote:
If I understand you correctly, the first row in each group is the row in the group with the smallest value of lineno:



Seems I've interpeted the problem incorrectly. Could you be more exact about what is meant by "first record of the group"?

Go to Top of Page

deadfish
Starting Member

38 Posts

Posted - 2004-01-17 : 11:57:58
I have got the correct result. Actually the "first record" means the first time the item appears in the same time slot. I got what I really need after remove the "select min" in the sql. Thank you very much for the detailed explaination!
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-17 : 12:29:26
Could we have a look at the final query?
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-17 : 13:03:56

UPDATE X
SET X.item1 = CASE WHEN X.item1 = Y.item1 THEN NULL ELSE X.item1 END
FROM MyTable X
INNER JOIN (
SELECT group, item1, MIN(lineno) AS LineNo
FROM MyTable
Group by [Group], item1
) Y ON Y.group = X.group AND Y.item1 = X.item1
WHERE Y.lineno > X.lineno


This might work, but it needs to be run for each itemN column.
Go to Top of Page

HansVE
Starting Member

2 Posts

Posted - 2004-01-17 : 14:18:13
select a.[lineno], a.[group], nullif(a.item1,b.item1), nullif(a.item2,b.item2), nullif(a.item3,b.item3)
from mytable a left join mytable b on b.[lineno] =
(select max([lineno])
from mytable
where [group] = a.[group] and [lineno] < a.[lineno])
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2004-01-17 : 21:44:15
I'd choose this way:

if object_id('tempdb..#t')>0 drop table #t
create table #t (ln int primary key, gr char(1),
i1 char(2), i2 char(2), i3 char(2))
insert into #t
select 1, 'a', 'i5', 'i2', 'i3' union all
select 2, 'a', 'i1', 'i3', 'i4' union all
select 3, 'a', 'i1', 'i4', 'i4' union all
select 4, 'a', 'i4', 'i5', 'i6' union all
select 5, 'b', 'i4', 'i5', 'i7' union all
select 6, 'b', 'i4', 'i0', 'i0' union all
select 7, 'b', 'i4', 'i5', 'i7'


update #t set

i1=case when (select count(*) from #t t where
t.gr=#t.gr and t.ln<#t.ln and t.i1=#t.i1)>0 then null else i1 end,

i2=case when (select count(*) from #t t where
t.gr=#t.gr and t.ln<#t.ln and t.i2=#t.i2)>0 then null else i2 end,

i3=case when (select count(*) from #t t where
t.gr=#t.gr and t.ln<#t.ln and t.i3=#t.i3)>0 then null else i3 end

select * from #t
drop table #t

ln gr i1 i2 i3
----------- ---- ---- ---- ----
1 a i5 i2 i3
2 a i1 i3 i4
3 a NULL i4 NULL
4 a i4 i5 i6
5 b i4 i5 i7
6 b NULL i0 i0
7 b NULL NULL NULL

and HansVE's result is (I'm not abs. sure which one is "better"):

ln gr
----------- ---- ---- ---- ----
1 a i5 i2 i3
2 a i1 i3 i4
3 a NULL i4 NULL
4 a i4 i5 i6
5 b i4 i5 i7
6 b NULL i0 i0
7 b NULL i5 i7
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-17 : 22:53:03
Good solution, but slow for large numbers of rows.
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-17 : 23:11:53
This UPDATE will be faster without a select in each SET, but we need a recordset of every Group/Item and it's minimum row (lineno)

With this recordset, MyTable can be Joined so each row has access to all 3 minimum linenos for each of the 3 items...

UPDATE A
SET A.Item1 = NULLIF(A.Item1, B.Item1) ,
A.Item2 = NULLIF(A.Item2, B.Item2) ,
A.Item3 = NULLIF(A.Item3, B.Item3)
FROM MyTable A
LEFT OUTER JOIN (
SELECT Group, Item1, MIN(lineno) as Minlineno
FROM MyTable
) B on A.Group = B.Group and A.Item1 = B.Item1 AND A.lineno <> B.Minlineno -- omit join on same row
LEFT OUTER JOIN (
SELECT Group, Item2, MIN(lineno) as Minlineno
FROM MyTable
) B on A.Group = B.Group and A.Item2 = B.Item2 AND A.lineno <> B.Minlineno -- omit join on same row
LEFT OUTER JOIN (
SELECT Group, Item3, MIN(lineno) as Minlineno
FROM MyTable
) B on A.Group = B.Group and A.Item1 = B.Item3 AND A.lineno <> B.Minlineno -- omit join on same row
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2004-01-18 : 14:54:38
Worth comparing. Really interesting. Is the optimizer soooooo stupid?

Cheers, Sam!
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2004-01-18 : 15:34:36
Sam, where gone GROUP BYs in your derived tables (which all aliased as "B")?

Could you post a working query or did I miss something here?
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-18 : 19:26:41
It's fulla bugs. Here's the debugged version.

CREATE TABLE dbo.MyTable (
[lineno] int not null,
[group] varchar(10) not null,
item1 varchar(10) null,
item2 varchar(10) null,
item3 varchar(10) null
)
GO
SET NOCOUNT ON

INSERT INTO MyTable ([lineno], [group], item1, item2, item3)
SELECT 1, 'a', 'I5', 'I2', 'I3'
union all
SELECT 2, 'a', 'I1', 'I3', 'I4'
union all
SELECT 3, 'a', 'I1', 'I4', 'I4'
union all
SELECT 4, 'a', 'I4', 'I5', 'I6'
union all
SELECT 5, 'b', 'I4', 'I5', 'I7'
union all
SELECT 6, 'b', 'I4', 'I5', 'I8'

SELECT * FROM MyTable


UPDATE A
SET A.Item1 = NULLIF(A.Item1, B.Item1) ,
A.Item2 = NULLIF(A.Item2, C.Item2) ,
A.Item3 = NULLIF(A.Item3, D.Item3)
FROM MyTable A
LEFT OUTER JOIN (
SELECT [group], Item1, MIN([lineno]) as Minlineno
FROM MyTable
GROUP BY [group], Item1
) B on A.[group] = B.[group] and A.Item1 = B.Item1 AND A.[lineno] <> B.Minlineno -- omit join on same row
LEFT OUTER JOIN (
SELECT [group], Item2, MIN([lineno]) as Minlineno
FROM MyTable
GROUP BY [group], item2
) C on A.[group] = C.[group] and A.Item2 = C.Item2 AND A.[lineno] <> C.Minlineno -- omit join on same row
LEFT OUTER JOIN (
SELECT [group], Item3, MIN([lineno]) as Minlineno
FROM MyTable
GROUP BY [group], item3
) D on A.[group] = D.[group] and A.Item3 = D.Item3 AND A.[lineno] <> D.Minlineno -- omit join on same row

SELECT [lineno], [group], IsNull(item1, 'NULL') item1, IsNull(item2, 'NULL') item2, IsNull(item3, 'NULL') item3
FROM MyTable

DROP TABLE dbo.MyTable



Here's the results:


lineno~group~item1~item2~item3
1~a~I5~I2~I3
2~a~I1~I3~I4
3~a~I1~I4~I4
4~a~I4~I5~I6
5~b~I4~I5~I7
6~b~I4~I5~I8

lineno~group~item1~item2~item3
1~a~I5~I2~I3
2~a~I1~I3~I4
3~a~NULL~I4~NULL
4~a~I4~I5~I6
5~b~I4~I5~I7
6~b~NULL~NULL~I8
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-18 : 20:23:34
quote:
Really interesting. Is the optimizer soooooo stupid?


I'm not sure I really know what the SQL optimizer this query does but...

update #t set

i1=case when (select count(*) from #t t where
t.gr=#t.gr and t.ln<#t.ln and t.i1=#t.i1)>0 then null else i1 end,

i2=case when (select count(*) from #t t where
t.gr=#t.gr and t.ln<#t.ln and t.i2=#t.i2)>0 then null else i2 end,

i3=case when (select count(*) from #t t where
t.gr=#t.gr and t.ln<#t.ln and t.i3=#t.i3)>0 then null else i3 end


The optimizer must generate code to calculate COUNT(*) for each item in each row. I've had experiences of queries taking minutes to run when there are just 30,000 rows or so. After rewriting to use JOINs to derived rowsets and impressively, query times fall from several minutes to a couple of seconds.


----------------------------------

After comparing the execution plans of the two queries, they don't look much different.

But it's just a plan...
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2004-01-18 : 22:10:31
Sam--

Great! Many thanks.
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2004-01-18 : 23:42:26
Wait ... I may be wrong about all of this...
Go to Top of Page
   

- Advertisement -