| Author |
Topic |
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-07 : 16:57:37
|
| I have a table that has 1643870 rows. There are massive dupes and that is normal because I have 1 ID that might have 10 different names attached to it.So...I have in my table: custname, id, add1, add2, city, state, zip, phone, month, year.I can do a query that is having query to find out how many custid have a count of >=5 custname. But I am trying to get the rest of the data to also appear when I run the query.So say I have id 11111 and there are 7 names attached I want it to display like this11111 john doe 1111 apple str greenbay wi 1111 1111 10 200811111 jane doe 1111 apple str greenbay wi 1111 1111 10 200811111 dave doe 1111 apple str greenbay wi 1111 1111 10 200811111 june doe 1111 apple str greenbay wi 1111 1111 10 200811111 tina doe 1111 apple str greenbay wi 1111 1111 10 200811111 tara doe 1111 apple str greenbay wi 1111 1111 10 200811111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008This is so we can see how many dupe cust IDs we have with distinct names going to the same address so we can start sending 1 invoice instead of 7 invoices. Cutting down on paper. We can't say ok John we will delete the others and make you the only because each name and cust connect to other tables that have distinct data by the name. And this is just a fake example of the data. The real data is the TaxID number that has multiple provider names attached to it that have the same address. So you know you go to the doc and there might be 7 providers working at that same office but doctor john doe see's you so when his claim is paid the money goes into his pot. But they all have the same Tax ID at the provider office so it isn't a simple method of deleting all the dupes Tax IDs because it will delete the names except John Doe too. I know it seems kind of dumb but just the way health insurance works. got to make sure the right doctor gets paid. would be easier if the govt just said each doctor on the planet needs their own tax id.tina m miller |
|
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2009-05-07 : 20:33:44
|
[code]select t.*from table1 t inner join ( select id from table1 group by id having count(*) > 1 5 ) m on t.id = m.id[/code][edit]didn't read the requirement properly[/edit] KH[spoiler]Time is always against us[/spoiler] |
 |
|
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-08 : 08:22:46
|
| This didn't work. I only have 1 table that has 1643870 rows. So in essence this is what I am trying to do:Select cosmos*from cosmos inner join ( select id, count(custname) as total from cosmos group by id having count(custname) >= 5)I need all ID's that have = or more than 5 custnames attached but I also need the rest of the columns listed in the table that include the add, city, state...etc. So the end result if say I have the following:11111 john doe 1111 apple str greenbay wi 1111 1111 10 200811111 jane doe 1111 apple str greenbay wi 1111 1111 10 200811111 dave doe 1111 apple str greenbay wi 1111 1111 10 200811111 june doe 1111 apple str greenbay wi 1111 1111 10 200811111 tina doe 1111 apple str greenbay wi 1111 1111 10 200811111 tara doe 1111 apple str greenbay wi 1111 1111 10 200811111 sara doe 1111 apple str greenbay wi 1111 1111 10 200822222 mary bun 2030 green ave greenbay wi 2222 2222 10 200822222 sara bun 2030 green ave greenbay wi 2222 2222 10 2008When I run the query it needs to return >=5. So it would return this:11111 john doe 1111 apple str greenbay wi 1111 1111 10 200811111 jane doe 1111 apple str greenbay wi 1111 1111 10 200811111 dave doe 1111 apple str greenbay wi 1111 1111 10 200811111 june doe 1111 apple str greenbay wi 1111 1111 10 200811111 tina doe 1111 apple str greenbay wi 1111 1111 10 200811111 tara doe 1111 apple str greenbay wi 1111 1111 10 200811111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008It wouldn't return the bun family because they only have 2 custnames and don't mee the criteria.tina m miller |
 |
|
|
madhivanan
Premature Yak Congratulator
22864 Posts |
Posted - 2009-05-08 : 08:31:18
|
| Try thisselect t.*from cosmos t inner join ( select id from cosmos group by id having count(custname) >=5 ) m on t.id = m.idMadhivananFailing to plan is Planning to fail |
 |
|
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-08 : 09:57:09
|
| That worked. Thanks so much!!!tina m miller |
 |
|
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-08 : 11:21:07
|
| Can I do this:select distinct custname, t.*from cosmos tinner join(select idfrom cosmosgroup by idhaving count(custname) >=5) m on t.id = m.idI am asking becuase I noticed when I run this I also get duped custnames. So I will have the custnbr listed 10 times with same custname, custadd and all the fields. I can't delete this. I used custname and id as an example. This is a healthcare company I work for and I can't delete any dupes at all since I am not admin. I am not sure why we have dupes but we do. Drives me nuts no one cleans up the data.tina m miller |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-05-08 : 11:28:11
|
quote: Originally posted by tinamiller1 Can I do this:select distinct custname, t.*from cosmos tinner join(select idfrom cosmosgroup by idhaving count(custname) >=5) m on t.id = m.idI am asking becuase I noticed when I run this I also get duped custnames. So I will have the custnbr listed 10 times with same custname, custadd and all the fields. I can't delete this. I used custname and id as an example. This is a healthcare company I work for and I can't delete any dupes at all since I am not admin. I am not sure why we have dupes but we do. Drives me nuts no one cleans up the data.tina m miller
you can use it but wont necessarily provide what you're loking. please note distinct just return distinct set of entire resultset not a single field alone, so you wont be getting distinct values of custname field alone. What you need as per explanation above is to group on custname and apply some kind of agregate function like min,max on other field to get unique set of custname in output. if you can post sample data of what you want, i can provide you with sample query. |
 |
|
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-08 : 11:36:08
|
| column names:id custname address city ST ZIP fone mth year11111 john doe 1111 apple str greenbay wi 1111 1111 10 200811111 jane doe 1111 apple str greenbay wi 1111 1111 10 200811111 dave doe 1111 apple str greenbay wi 1111 1111 10 200811111 june doe 1111 apple str greenbay wi 1111 1111 10 200811111 tina doe 1111 apple str greenbay wi 1111 1111 10 200811111 tara doe 1111 apple str greenbay wi 1111 1111 10 200811111 sara doe 1111 apple str greenbay wi 1111 1111 10 200822222 mary bun 2030 green ave greenbay wi 2222 2222 10 200822222 sara bun 2030 green ave greenbay wi 2222 2222 10 200811111 john tib 2011 lemon ave greenbay wi 2222 1111 10 200811111 john tib 2011 lemon ave greenbay wi 2222 1111 10 200811111 john tib 2011 lemon ave greenbay wi 2222 1111 10 200811111 john tib 2011 lemon ave greenbay wi 2222 1111 10 200811111 john tib 2011 lemon ave greenbay wi 2222 1111 10 200811111 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008So as you see if I run the above query, it returns all the does and the tibs. But they are not wanting dupe stuff. Even though the does have dupe addresses they have unique custname. So with this above data as example when I run the query based on I want all the fields in the entire table, the ID information and custname not duped, and I want any custname having the count >=5, the result I should get from the above example needs to be:11111 john doe 1111 apple str greenbay wi 1111 1111 10 200811111 jane doe 1111 apple str greenbay wi 1111 1111 10 200811111 dave doe 1111 apple str greenbay wi 1111 1111 10 200811111 june doe 1111 apple str greenbay wi 1111 1111 10 200811111 tina doe 1111 apple str greenbay wi 1111 1111 10 200811111 tara doe 1111 apple str greenbay wi 1111 1111 10 200811111 sara doe 1111 apple str greenbay wi 1111 1111 10 2008the 2 bun wouldn't be included because they are less than the 5 and the tib wouldn't be included even though it mees the >=5, it is entirely a dupe. I only want all the information within the whole table including id and then unique custnames.tina m miller |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-05-08 : 12:38:16
|
| sorry didnt understand reasoning behind how buns got excluded |
 |
|
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-08 : 12:56:06
|
| because bun doesn't meet criteria. bun only have 2 custnames. I should have put tibs with diff id number. so let me try this again. Now this table has over 1.6million rows. massive dupes. the columns are id, custname, address, city, state, zip, fone, month, year.11111 john doe 1111 apple str greenbay wi 1111 1111 10 200811111 jane doe 1111 apple str greenbay wi 1111 1111 10 200811111 dave doe 1111 apple str greenbay wi 1111 1111 10 200811111 june doe 1111 apple str greenbay wi 1111 1111 10 200811111 tina doe 1111 apple str greenbay wi 1111 1111 10 200811111 tara doe 1111 apple str greenbay wi 1111 1111 10 200811111 sara doe 1111 apple str greenbay wi 1111 1111 10 200822222 mary bun 2030 green ave greenbay wi 2222 2222 10 200822222 sara bun 2030 green ave greenbay wi 2222 2222 10 200833333 john tib 2011 lemon ave greenbay wi 2222 1111 10 200833333 john tib 2011 lemon ave greenbay wi 2222 1111 10 200833333 john tib 2011 lemon ave greenbay wi 2222 1111 10 200833333 john tib 2011 lemon ave greenbay wi 2222 1111 10 200833333 john tib 2011 lemon ave greenbay wi 2222 1111 10 200833333 john tib 2011 lemon ave greenbay wi 2222 1111 10 2008The goal is to find all id's that have unique custnames >=5 listed within the id. so bun for instance there are only 2 distinct custnames so that doesn't meet the criteria. doe does because there are 7 distinct custnames. tib meets the criteria but you can see it is an exact dupe. they don't want that. they only want all the data within the table but only want anything that is >=5 and want only distinct cust names returned. the reason is right now we send 1 explanation of benefits form to every provider name listed under a tax id. i am using this as an example cuz i can't give you my real data. healthcare is very secure. but no one in reporting dept seems to be able to help me. i know the basic query i need but not sure how to get rid of the duped custname or in this case dupe provname. they want this because instead of sending 1 EOB to john doe, 1 eob to jane doe, 1 eob to dave joe etc etc etc....they want to combine them into 1 eob that goes to the whole group. but before they can program their target list of providers, they need to know who they are. we have to contact them b4 we just flip a switch. healthcare already pisses people off we dont want to add fuel to the fire. this is part of our go green stuff. when we run across a dupe like with tib because all rows are the same, only 1 eob is generated because all rows are exact. everything is exactly the same. so it views it as 1. i know it would be easier to get rid of dupes and it would be easier if the govt would say all docs have to have their own tax id, but that isn't the case. thnks and i hope this makes sense.tina m miller |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-05-08 : 12:59:59
|
ok.. i think this is what you wantSELECT t.*FROM YourTable tINNER JOIN (SELECT id FROM YourTable GROUP BY id HAVING COUNT(DISTINCT custname) >=5) t1ON t1.id=t.id |
 |
|
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-08 : 13:01:22
|
| ah yeah i wasn't sure where to put that distincttina m miller |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-05-08 : 13:03:21
|
| ok. check and get back... |
 |
|
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-11 : 08:21:15
|
| It doesn't like the distinct there either.tina m miller |
 |
|
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2009-05-11 : 08:24:26
|
tina,you are not using MS SQL Server right ? KH[spoiler]Time is always against us[/spoiler] |
 |
|
|
tinamiller1
Yak Posting Veteran
78 Posts |
Posted - 2009-05-11 : 09:45:42
|
| Right. I am using Access and the SQL section of the design.tina m miller |
 |
|
|
khtan
In (Som, Ni, Yak)
17689 Posts |
Posted - 2009-05-11 : 09:52:07
|
You should post your question over at the Access forum. visakh's query will work fine in a SQL Server but not necessary in MS Access. KH[spoiler]Time is always against us[/spoiler] |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-05-11 : 10:27:06
|
| ah...so you were using Access then |
 |
|
|
|