Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 SQL Query confusion

Author  Topic 

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-07 : 16:57:37
I have a table that has 1643870 rows. There are massive dupes and that is normal because I have 1 ID that might have 10 different names attached to it.

So...I have in my table: custname, id, add1, add2, city, state, zip, phone, month, year.

I can do a query that is having query to find out how many custid have a count of >=5 custname. But I am trying to get the rest of the data to also appear when I run the query.

So say I have id 11111 and there are 7 names attached I want it to display like this

11111 john doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 jane doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 dave doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 june doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tina doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tara doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008

This is so we can see how many dupe cust IDs we have with distinct names going to the same address so we can start sending 1 invoice instead of 7 invoices. Cutting down on paper. We can't say ok John we will delete the others and make you the only because each name and cust connect to other tables that have distinct data by the name. And this is just a fake example of the data. The real data is the TaxID number that has multiple provider names attached to it that have the same address. So you know you go to the doc and there might be 7 providers working at that same office but doctor john doe see's you so when his claim is paid the money goes into his pot. But they all have the same Tax ID at the provider office so it isn't a simple method of deleting all the dupes Tax IDs because it will delete the names except John Doe too. I know it seems kind of dumb but just the way health insurance works. got to make sure the right doctor gets paid. would be easier if the govt just said each doctor on the planet needs their own tax id.

tina m miller

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2009-05-07 : 20:33:44
[code]
select t.*
from table1 t
inner join
(
select id
from table1
group by id
having count(*) > 1 5
) m on t.id = m.id
[/code]

[edit]didn't read the requirement properly[/edit]


KH
[spoiler]Time is always against us[/spoiler]

Go to Top of Page

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-08 : 08:22:46
This didn't work. I only have 1 table that has 1643870 rows. So in essence this is what I am trying to do:

Select cosmos*
from cosmos
inner join
(
select id, count(custname) as total
from cosmos
group by id
having count(custname) >= 5)

I need all ID's that have = or more than 5 custnames attached but I also need the rest of the columns listed in the table that include the add, city, state...etc. So the end result if say I have the following:

11111 john doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 jane doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 dave doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 june doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tina doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tara doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008
22222 mary bun 2030 green ave greenbay wi 2222 2222 10 2008
22222 sara bun 2030 green ave greenbay wi 2222 2222 10 2008

When I run the query it needs to return >=5. So it would return this:
11111 john doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 jane doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 dave doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 june doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tina doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tara doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008

It wouldn't return the bun family because they only have 2 custnames and don't mee the criteria.

tina m miller
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2009-05-08 : 08:31:18
Try this


select t.*
from cosmos t
inner join
(
select id
from cosmos
group by id
having count(custname) >=5
) m on t.id = m.id

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-08 : 09:57:09
That worked. Thanks so much!!!

tina m miller
Go to Top of Page

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-08 : 11:21:07
Can I do this:
select distinct custname, t.*
from cosmos t
inner join
(
select id
from cosmos
group by id
having count(custname) >=5
) m on t.id = m.id


I am asking becuase I noticed when I run this I also get duped custnames. So I will have the custnbr listed 10 times with same custname, custadd and all the fields. I can't delete this. I used custname and id as an example. This is a healthcare company I work for and I can't delete any dupes at all since I am not admin. I am not sure why we have dupes but we do. Drives me nuts no one cleans up the data.

tina m miller
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-05-08 : 11:28:11
quote:
Originally posted by tinamiller1

Can I do this:
select distinct custname, t.*
from cosmos t
inner join
(
select id
from cosmos
group by id
having count(custname) >=5
) m on t.id = m.id


I am asking becuase I noticed when I run this I also get duped custnames. So I will have the custnbr listed 10 times with same custname, custadd and all the fields. I can't delete this. I used custname and id as an example. This is a healthcare company I work for and I can't delete any dupes at all since I am not admin. I am not sure why we have dupes but we do. Drives me nuts no one cleans up the data.

tina m miller


you can use it but wont necessarily provide what you're loking. please note distinct just return distinct set of entire resultset not a single field alone, so you wont be getting distinct values of custname field alone. What you need as per explanation above is to group on custname and apply some kind of agregate function like min,max on other field to get unique set of custname in output. if you can post sample data of what you want, i can provide you with sample query.
Go to Top of Page

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-08 : 11:36:08
column names:

id custname address city ST ZIP fone mth year

11111 john doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 jane doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 dave doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 june doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tina doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tara doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008
22222 mary bun 2030 green ave greenbay wi 2222 2222 10 2008
22222 sara bun 2030 green ave greenbay wi 2222 2222 10 2008
11111 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
11111 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
11111 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
11111 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
11111 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
11111 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008

So as you see if I run the above query, it returns all the does and the tibs. But they are not wanting dupe stuff. Even though the does have dupe addresses they have unique custname. So with this above data as example when I run the query based on I want all the fields in the entire table, the ID information and custname not duped, and I want any custname having the count >=5, the result I should get from the above example needs to be:

11111 john doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 jane doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 dave doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 june doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tina doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tara doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008

the 2 bun wouldn't be included because they are less than the 5 and the tib wouldn't be included even though it mees the >=5, it is entirely a dupe. I only want all the information within the whole table including id and then unique custnames.


tina m miller
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-05-08 : 12:38:16
sorry didnt understand reasoning behind how buns got excluded
Go to Top of Page

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-08 : 12:56:06
because bun doesn't meet criteria. bun only have 2 custnames. I should have put tibs with diff id number. so let me try this again. Now this table has over 1.6million rows. massive dupes. the columns are id, custname, address, city, state, zip, fone, month, year.

11111 john doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 jane doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 dave doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 june doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tina doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 tara doe 1111 apple str greenbay wi 1111 1111 10 2008
11111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008
22222 mary bun 2030 green ave greenbay wi 2222 2222 10 2008
22222 sara bun 2030 green ave greenbay wi 2222 2222 10 2008
33333 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
33333 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
33333 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
33333 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
33333 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008
33333 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008

The goal is to find all id's that have unique custnames >=5 listed within the id. so bun for instance there are only 2 distinct custnames so that doesn't meet the criteria. doe does because there are 7 distinct custnames. tib meets the criteria but you can see it is an exact dupe. they don't want that. they only want all the data within the table but only want anything that is >=5 and want only distinct cust names returned. the reason is right now we send 1 explanation of benefits form to every provider name listed under a tax id. i am using this as an example cuz i can't give you my real data. healthcare is very secure. but no one in reporting dept seems to be able to help me. i know the basic query i need but not sure how to get rid of the duped custname or in this case dupe provname. they want this because instead of sending 1 EOB to john doe, 1 eob to jane doe, 1 eob to dave joe etc etc etc....they want to combine them into 1 eob that goes to the whole group. but before they can program their target list of providers, they need to know who they are. we have to contact them b4 we just flip a switch. healthcare already pisses people off we dont want to add fuel to the fire. this is part of our go green stuff. when we run across a dupe like with tib because all rows are the same, only 1 eob is generated because all rows are exact. everything is exactly the same. so it views it as 1. i know it would be easier to get rid of dupes and it would be easier if the govt would say all docs have to have their own tax id, but that isn't the case. thnks and i hope this makes sense.

tina m miller
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-05-08 : 12:59:59
ok.. i think this is what you want

SELECT t.*
FROM YourTable t
INNER JOIN (SELECT id
FROM YourTable
GROUP BY id
HAVING COUNT(DISTINCT custname) >=5) t1
ON t1.id=t.id
Go to Top of Page

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-08 : 13:01:22
ah yeah i wasn't sure where to put that distinct

tina m miller
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-05-08 : 13:03:21
ok. check and get back...
Go to Top of Page

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-11 : 08:21:15
It doesn't like the distinct there either.

tina m miller
Go to Top of Page

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2009-05-11 : 08:24:26
tina,

you are not using MS SQL Server right ?


KH
[spoiler]Time is always against us[/spoiler]

Go to Top of Page

tinamiller1
Yak Posting Veteran

78 Posts

Posted - 2009-05-11 : 09:45:42
Right. I am using Access and the SQL section of the design.

tina m miller
Go to Top of Page

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2009-05-11 : 09:52:07
You should post your question over at the Access forum. visakh's query will work fine in a SQL Server but not necessary in MS Access.


KH
[spoiler]Time is always against us[/spoiler]

Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-05-11 : 10:27:06
ah...so you were using Access then
Go to Top of Page
   

- Advertisement -