Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Getting duplicate records, don't know why

Author  Topic 

meef
Posting Yak Master

113 Posts

Posted - 2014-11-26 : 14:23:37
I am trying to pull a dataset here and keep getting duplicate records. I have a few joins so I thought maybe they were the problem but no matter what join I use I still get the same problem. I'm not sure what other information you may need so if I'm missing something let me know, maybe it's something simple that someone who knows more can pick out just by looking at the query.

SELECT c.bol_id,c.bol_number,a.filename, a.scac, a.pro, CONVERT(varchar(20), CONVERT(date, CONVERT(varchar(8), a.delivered_date), 112),110)as date_delivered, b.status_code, convert

(time(0),LEFT(b.status_time,2) +':'+ RIGHT(b.status_time,2)) as status_time, a.match_method, a.date_last_updated, a.message_date

FROM tbl_214_datatable AS a inner JOIN

tbl_214_status AS b ON b.bol_id = a.bol_id

inner join tbl_bol as c on b.bol_id=c.bol_id

WHERE c.dba_id = '09acea7e-4cb5-4dba-a829-d1f8c411909b' AND (b.date_added between '20140901' and '20140930') AND (b.status_code = 'D1')

ORDER BY date_delivered DESC

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-11-26 : 14:29:24
You are probably getting duplicates because of the 1:many relationship between the tables. Show us some sample data and expected output for us to help.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

meef
Posting Yak Master

113 Posts

Posted - 2014-11-26 : 14:59:03
Sample data from above query:

bol_id					bol_number				filename		scac	   pro	     date_delivered status_code status_time
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 7:59:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 9:46:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 10:12:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 10:36:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 11:28:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 11:38:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 11:53:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 12:44:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 13:44:00
4392625C-3385-48F2-A9ED-13CE4B350B6F 26676919 O:\FTP\Carrier214\PYLE\PROC\VIG-11106139.TXT PYLE 298826108 11/7/2014 D1 16:38:00


Expected data would be to just have one single status. Each of these get a D1 status which means delivered, this is telling me that the same exact shipment has been delivered dozens of times at different times, and looking at the file where it's getting that data, there is indeed only one record it should be pulling.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-11-26 : 15:19:17
But there are multiple rows in tbl_214_datatable that match the join criteria.

Run this to check:

select * from tbl_214_datatable where bol_id = '4392625C-3385-48F2-A9ED-13CE4B350B6F'



Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-11-26 : 15:19:47
Are you sure that only joining on bol_id is correct?

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

meef
Posting Yak Master

113 Posts

Posted - 2014-12-01 : 08:33:16
There aren't multiple records though, that only returns one record if I try that query you posted.

All of these tables have a unique BOL_ID field, that's what I usually join on. As far as I can tell it should be correct.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-12-01 : 11:34:06
Show us the output of these:

select * from tbl_214_datatable where bol_id = '4392625C-3385-48F2-A9ED-13CE4B350B6F'
select * from tbl_214_status where bol_id = '4392625C-3385-48F2-A9ED-13CE4B350B6F'
select * from tbl_bol where bol_id = '4392625C-3385-48F2-A9ED-13CE4B350B6F'



Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

meef
Posting Yak Master

113 Posts

Posted - 2014-12-01 : 11:46:41
Funny you should say that, I was just replying to post results of that :)

I get one record for everything but the tbl_214_status, there I get 889 records. Now I was expecting to have multiple records as each shipment has different statuses throughout its delivery, but not 889 spanned over a period of months. I think there's a data issue here with that table, would you agree?
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-12-01 : 12:17:06
I wouldn't be able to answer that for you as it would required business rules knowledge.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

meef
Posting Yak Master

113 Posts

Posted - 2014-12-01 : 12:46:05
Ok, well I'm sure that's the issue. At most there should be maybe 10-15 status records for one BOL_ID, there's no way a shipment would be out there for 6 months before being delivered.

Is there any sort of bandaid effect I can apply to ignore these duplicates? Here is the output from the tbl_214_status:

tbl_214_status_id	                              tbl_214_id	                       bol_id	               status_date    status_time   status_code
26C78EC5-D6C7-4654-87FF-896346191917 35EC1F00-A494-4100-AC36-FA42D7B9F0EE 4392625C-3385-48F2-A9ED-13CE4B350B6F 20140723 1138 D1
A7DF47FD-D584-4561-B782-A7E4C2C9A9E5 35EC1F00-A494-4100-AC36-FA42D7B9F0EE 4392625C-3385-48F2-A9ED-13CE4B350B6F 20140723 1236 D1
BA21EB80-24CA-4FB5-B7A0-E4642B043D8D 6AB9969C-7C94-4686-8949-FA07FE44D8E1 4392625C-3385-48F2-A9ED-13CE4B350B6F 20140715 1200 NULL
45D8D171-357D-42A1-AF2A-940AE487E20E 6AB9969C-7C94-4686-8949-FA07FE44D8E1 4392625C-3385-48F2-A9ED-13CE4B350B6F 20140710 0 AG
3FB5F19B-7A2B-4251-9428-9C27E982E264 6AB9969C-7C94-4686-8949-FA07FE44D8E1 4392625C-3385-48F2-A9ED-13CE4B350B6F 20140709 1550 AF
892F7CF9-8AB4-4DF1-973B-3CA62ABF0013 6AB9969C-7C94-4686-8949-FA07FE44D8E1 4392625C-3385-48F2-A9ED-13CE4B350B6F 20140710 1150 D1
F0DF59C1-8B7C-48FB-9A5A-0B15F936DA73 6AB9969C-7C94-4686-8949-FA07FE44D8E1 4392625C-3385-48F2-A9ED-13CE4B350B6F 20140710 1153 D1


I think I could use the tbl_214_id field to group things together and maybe only show one record per group.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-12-01 : 12:52:57
For the sample data you posted, which one do you want to show? You'll likely need to use ROW_NUMBER() function.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

meef
Posting Yak Master

113 Posts

Posted - 2014-12-01 : 13:13:23
Looking at it again, maybe just each unique tbl_214_status_id? That's actually the primary key of that table. I think this is a problem of working with someone else who designed the tables and populates them based on a dozen different methods so when something goes wrong they don't even know how to correct the data which leaves me trying to clean it up on the fly. I don't even know how to tackle this anymore.
Go to Top of Page
   

- Advertisement -