SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 duplicates
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

ovince
Starting Member

Hungary
32 Posts

Posted - 11/03/2006 :  11:03:32  Show Profile  Click to see ovince's MSN Messenger address  Reply with Quote
Hello to All Forumers,

this may be a trivial question but my knowledge of SQL is not enough to solve it.

I have 2 databeses and I would like to match them using 3 different IDs. I have noticed that 'ginfo' table is full of rows with duplicated IDs (or maybe even truplicated). As a result, in the output I obtain planty of not-wanted data. This is the query I have used:


select g.*, D.*
from ginfo g, Dhisn D
where
D.pID1 = g.pID1 and
D.pID2 = g.pID2 and
D.pID3 = g.pID3


I have tried somethig like this:

select g.*, D.*
from ginfo g, Dhisn D
where
D.pID1 = g.pID1 and
D.pID2 = g.pID2 and
D.pID3 = g.pID3

in order to avoid duplicates in the output by does not work. what to do?

It is important to have all data from both columns in the output and with uniq IDs

thanks in advance
oliver

rockmoose
SQL Natt Alfen

Sweden
3279 Posts

Posted - 11/03/2006 :  12:30:07  Show Profile  Reply with Quote
You have to know which is the logical primary key in each table.
Make sure that there is a primary key or unique constraint on those columns,
talk to the persosns responsible for the database.

If you are going to code around duplicates and other design faults,
then there is will be no end to your troubles.
Go to Top of Page

ovince
Starting Member

Hungary
32 Posts

Posted - 11/03/2006 :  12:46:26  Show Profile  Click to see ovince's MSN Messenger address  Reply with Quote
thank you for quick prompt

I made the databases by myself so I can modify both. What I should do?

Thanks
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30111 Posts

Posted - 11/03/2006 :  12:56:47  Show Profile  Visit SwePeso's Homepage  Reply with Quote
select distinct g.*, D.*
from ginfo g
inner join Dhisn D on D.pID1 = g.pID1 and D.pID2 = g.pID2 and D.pID3 = g.pID3

But you should only select the columns you really want.


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

ovince
Starting Member

Hungary
32 Posts

Posted - 11/03/2006 :  13:36:04  Show Profile  Click to see ovince's MSN Messenger address  Reply with Quote
hello Peter,


your proposition for query does not work in a way we want :)

This are the 2 tables with their columns:
1st
=====================
tableName: Dhisn
=====================
D4HdID ------------> primary key
pID1
pID2
pID3
D4n
D4nerr
HdA
HdAerr
SN



2nd
=====================
tableName: ginfo
=====================
catID -------------> primary key
pID1
pID2
pID3
ragar
decgar


The idea is to join two tables using their pID1,pID2 and pID3. But there are duplicated pID1, pID2 and pID3 in the ginfo table that makes a mess in the output.

So this does not work

select distinct g.*, D.*
from ginfo g
inner join Dhisn D on D.pID1 = g.pID1 and D.pID2 = g.pID2 and D.pID3 = g.pID3


thank you



Go to Top of Page

rockmoose
SQL Natt Alfen

Sweden
3279 Posts

Posted - 11/03/2006 :  13:48:32  Show Profile  Reply with Quote
In the tables should the combination of pID1,pID2,pID3 be unique?
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30111 Posts

Posted - 11/03/2006 :  14:35:42  Show Profile  Visit SwePeso's Homepage  Reply with Quote
most probably the pk is also identity... no wonder!


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

rockmoose
SQL Natt Alfen

Sweden
3279 Posts

Posted - 11/03/2006 :  16:28:55  Show Profile  Reply with Quote
I knew that from the beginning, just trying to set things right.
There are workarounds to get the wanted resultset using distinct and group by,
but in many cases they are just not needed if the data model and constraints are set up properly.

Dirty data is like wading through mud, a correct data model is like carving through butter.
I think ovince is doing the former, when he could be doing the latter.

rockmoose
Go to Top of Page

ovince
Starting Member

Hungary
32 Posts

Posted - 11/03/2006 :  18:15:08  Show Profile  Click to see ovince's MSN Messenger address  Reply with Quote
>In the tables should the combination of pID1,pID2,pID3 be unique?

yes, the combination of pID1,pID2,pID3 shuold NOT repeat i.e. one combination of these numbers should corespond to ONE and ONLY ONE row.

Maybe I should clean it first. Not manually I hope because there are over million rows. Is there an easy way to clean it first?


Go to Top of Page

rockmoose
SQL Natt Alfen

Sweden
3279 Posts

Posted - 11/03/2006 :  18:48:41  Show Profile  Reply with Quote
Run something like this:

delete ginfo
where catID in
(
select min(catID) from ginfo
group by pID1,pID2,pID3
having count(*) > 1
)

You may need to run the query several times if there are combinations with more than 2 occurrences.

PS.
Understand and test the query before running it.

PPS.
After cleaning the table, create a unique constraint on pID1,pID2,pID3 so that duplicates can't be entered in the future.

rockmoose
Go to Top of Page

samuelclay
Yak Posting Veteran

USA
71 Posts

Posted - 11/03/2006 :  18:59:29  Show Profile  Visit samuelclay's Homepage  Reply with Quote
And as always, backup your data before deleting large amounts of data
Go to Top of Page

rockmoose
SQL Natt Alfen

Sweden
3279 Posts

Posted - 11/03/2006 :  19:06:37  Show Profile  Reply with Quote
quote:
Originally posted by samuelclay

And as always, backup your data before deleting large amounts of data



Yes, we don't want a sequel to this...
Go to Top of Page

rockmoose
SQL Natt Alfen

Sweden
3279 Posts

Posted - 11/03/2006 :  19:12:52  Show Profile  Reply with Quote
quote:
Originally posted by samuelclay

And as always, backup your data before deleting large amounts of data

Go to Top of Page

madhivanan
Premature Yak Congratulator

India
22744 Posts

Posted - 11/03/2006 :  19:52:47  Show Profile  Send madhivanan a Yahoo! Message  Reply with Quote

http://sqlteam.com/forums/topic.asp?TOPIC_ID=6256

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

ovince
Starting Member

Hungary
32 Posts

Posted - 11/04/2006 :  00:32:13  Show Profile  Click to see ovince's MSN Messenger address  Reply with Quote
hi

I think deleting helps. I did checking and seems to work ok


As a new user of the forum, I have noticed that the "insert image" option in a Post New Reply doest not work. Is it temporary?

thanks for help
oliver
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.09 seconds. Powered By: Snitz Forums 2000