Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
boris37
Starting Member
12 Posts |
Posted - 2009-06-23 : 15:36:10
|
| So i am trying to remove duplicate DATE columns from a table...I have looked on examples upon examples but can't find something with a date issue and for some reason just can't figure out a way around....SELECT CONVERT(VARCHAR(10), StoredDateTime, 111) AS storeddatetime, COUNT(*) AS Expr1FROM NFLiquidsHistGROUP BY StoredDateTimeHAVING (COUNT(*) > 1)kind of works. but i need it for a specific meterid. If i throw in "where meterid =8283" gives me that ridiculous aggregate function error...(why i don't know).So heres the problem. We need 1 entry per day per meterid. For some reason time is included in this date column and the entries might be doubled with a 1minute in between the timestamp. My solution was to remove the hours and just find per day per meterid.SELECT CONVERT(VARCHAR(10), storeddatetime, 111) AS datetime, storeddatetime, meteridFROM NFLiquidsHistWHERE (meterid = 8283)ORDER BY datetime DESCThis query works but it gives me every day and not just the duplicates. I can manually go through and remove the days but if i have 12+ meterid's and 7 years of records it becomes daunting.The first example works but gives me every meterid for that date which is kind of not want i want since there will ALWAYS be duplicates.Lastly yes i know this table is convoluted and no i did not create this DB i just manage it... lolThanks :)Edit. There are a lot of other columns but how do i tell what is the primary key for this table?... lol |
|
|
nathans
Aged Yak Warrior
938 Posts |
Posted - 2009-06-23 : 18:52:17
|
| You can find PK of table using: sp_help [NFLiquidsHist]Regarding your aggregate error, you probably added the MeterId to your select statement as well as where. If that is the case you will need to add to your group by clause:SELECT CONVERT(VARCHAR(10), StoredDateTime, 111) AS storeddatetime, meterid , COUNT(*) AS Expr1FROM NFLiquidsHistWHERE (meterid = 8283)GROUP BY CONVERT(VARCHAR(10), StoredDateTime, 111), meterid HAVING (COUNT(*) > 1)edit: added where |
 |
|
|
eonmantra
Starting Member
11 Posts |
Posted - 2009-06-23 : 20:57:42
|
quote: Originally posted by boris37So heres the problem. We need 1 entry per day per meterid. For some reason time is included in this date column and the entries might be doubled with a 1minute in between the timestamp. My solution was to remove the hours and just find per day per meterid.
If all the information you need is one entry per day per meterid, and you do not need to include any other columns, then here are at least two different ways you can do this.The CTE method:WITH C AS (SELECT DATEADD(day, DATEDIFF(day, 0, storeddatetime), 0) AS DateOnly, meteridFROM NFLiquidsHist)SELECT DateOnly, meteridFROM CGROUP BY DateOnly, meterid; The derived table method:SELECT DateOnly, meteridFROM (SELECT DATEADD(day, DATEDIFF(day, 0, storeddatetime), 0) AS DateOnly, meterid FROM NFLiquidsHist ) AS CGROUP BY DateOnly, meterid; Here is the output of some made up data I used.DateOnly meterid----------------------- ----------2009-06-20 00:00:00.000 82832009-06-20 00:00:00.000 82842009-06-23 00:00:00.000 82832009-06-23 00:00:00.000 82852009-06-23 00:00:00.000 8286 The "SELECT DATEADD(day, DATEDIFF(day, 0, storeddatetime), 0) AS DateOnly" code zeros out the time for you, which makes the grouping easier. Once you have it this far you can add a WHERE clause on the meterid or date no problem.Now if you want only records with duplicate entries for each date then add back the "HAVING Count(*) > 1" clause.example with the derived table:SELECT DateOnly, meteridFROM (SELECT DATEADD(day, DATEDIFF(day, 0, storeddatetime), 0) AS DateOnly, meterid FROM NFLiquidsHist ) AS CGROUP BY DateOnly, meteridHAVING COUNT(*) > 1; Well I hope this helps. If I missed the point, try providing some sample data and an example of the desired output you are looking for. |
 |
|
|
boris37
Starting Member
12 Posts |
Posted - 2009-06-29 : 12:06:38
|
quote: Originally posted by nathans You can find PK of table using: sp_help [NFLiquidsHist]Regarding your aggregate error, you probably added the MeterId to your select statement as well as where. If that is the case you will need to add to your group by clause:SELECT CONVERT(VARCHAR(10), StoredDateTime, 111) AS storeddatetime, meterid , COUNT(*) AS Expr1FROM NFLiquidsHistWHERE (meterid = 8283)GROUP BY CONVERT(VARCHAR(10), StoredDateTime, 111), meterid HAVING (COUNT(*) > 1)edit: added where
Perfect works great. Now i am having trouble deleting 1 of the 2 lines i get output like:2008/08/01 8283 22008/08/02 8283 22008/08/03 8283 22008/08/04 8283 22008/08/05 8283 22008/09/23 8283 22008/09/24 8283 22008/09/30 8283 2meaning their are duplicates on those days. Now i tried a subquery to delete but it keeps giving me way more rows than i need. The logic they used was, select only the rows you don't need using a subquery and then once the query is working remove select with DELETE(makes sense). However i think i am not subquerying properly. SELECT bad_rows.*FROM NFLiquidsHist AS bad_rows INNER JOIN (SELECT CONVERT(VARCHAR(10), StoredDateTime, 111) AS storeddatetime, MeterID, COUNT(*) AS Expr1 FROM NFLiquidsHist WHERE (MeterID = 8283) GROUP BY CONVERT(VARCHAR(10), StoredDateTime, 111), MeterID HAVING (COUNT(*) > 1)) AS good_rows ON good_rows.storeddatetime <> bad_rows.StoredDateTimebut it is giving me... all the rows in the database with time regardless if they are duplicates or not and for every single meter. I'm sure i screwed up my query, anyone care to show me where? haha :)Thanks.P.S Eonmantra i know what you are trying to do but i already had a query 90% done i just needed to figure out why it wasn't working correctly. |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-06-29 : 12:15:49
|
seems like this is what you want:-SELECT bad_rows.*FROM NFLiquidsHist AS bad_rows LEFT JOIN(SELECT dateadd(dd,datediff(dd,0, StoredDateTime),0) AS storeddatetime, MeterID,MIN(PKCol) AS MinRecFROM NFLiquidsHistWHERE (MeterID = 8283)GROUP BY dateadd(dd,datediff(dd,0, StoredDateTime),0), MeterIDHAVING (COUNT(*) > 1)) AS good_rows ON good_rows.storeddatetime = dateadd(dd,datediff(dd,0, bad_rows.StoredDateTime),0)AND good_rows.MeterID=bad_rows.MeterIDAND good_rows.MinRec=bad_rows.PKColWHERE good_rows.MinRec IS NULL change select to delete if this gives you correct records |
 |
|
|
boris37
Starting Member
12 Posts |
Posted - 2009-06-29 : 16:19:15
|
| uhm there is no primary key in this table so i'm not sure how i should work around that.Also what is dateadd?i changed it and its spitting out everything againSELECT bad_rows.StoredDateTime, bad_rows.MeterID, bad_rows.Total, bad_rows.yday, bad_rows.sg, bad_rows.kfactor, bad_rows.cid, bad_rows.aid, bad_rows.modified, bad_rows.dateModifiedliq, bad_rows.modifiedByliq, bad_rows.reasonModifiedliq, bad_rows.EEC, bad_rows.dateExported, bad_rows.exportedByID, bad_rows.dateSchedExportedFROM NFLiquidsHist AS bad_rows LEFT OUTER JOIN (SELECT DATEADD(dd, DATEDIFF(dd, 0, StoredDateTime), 0) AS storeddatetime, MeterID, COUNT(*) AS Expr1 FROM NFLiquidsHist WHERE (MeterID = 8283) GROUP BY DATEADD(dd, DATEDIFF(dd, 0, StoredDateTime), 0), MeterID HAVING (COUNT(*) > 1)) AS good_rows ON good_rows.storeddatetime = DATEADD(dd, DATEDIFF(dd, 0, bad_rows.StoredDateTime), 0) AND good_rows.MeterID = bad_rows.MeterID AND good_rows.storeddatetime = bad_rows.StoredDateTime |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2009-06-29 : 16:23:50
|
Something similar to this?DELETE fFROM (SELECT ROW_NUMBER() OVER (PARTITION BY MeterID ORDER BY DATEDIFF(DAY, StoredDateTime, 0)) AS recIDFROM Table1) AS fWHERE recID > 1 N 56°04'39.26"E 12°55'05.63" |
 |
|
|
boris37
Starting Member
12 Posts |
Posted - 2009-06-29 : 17:54:12
|
quote: Originally posted by Peso Something similar to this?DELETE fFROM (SELECT ROW_NUMBER() OVER (PARTITION BY MeterID ORDER BY DATEDIFF(DAY, StoredDateTime, 0)) AS recIDFROM Table1) AS fWHERE recID > 1 N 56°04'39.26"E 12°55'05.63"
Yes sort of. In my initial query i count row to see if there are duplicates(just to verify any other query i will use). Then it finds the duplicates and I run a subquery to take 1 of the rows and display/get rid of it. It makes sense but i don't know quite how to code it. If i had a PK on this table it wouldn't be quite so difficult but the closest thing is the storeddatetime if you remove the time and only leave date(therefore you have more than one entry existing).Sumarized version:No PKDate closest to a PK i got.(exclude time)Duplicate entries on date PER METER, 1 needs to be removedThis is on multiple meters so a query must be done. Thanks for all the help guys appreciate it! |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2009-06-29 : 18:25:06
|
DELETE f-- SELECT *FROM (SELECT ROW_NUMBER() OVER (PARTITION BY MeterID ORDER BY StoredDateTime DESC) AS recIDFROM NFLiquidsHist) AS fWHERE recID > 1 N 56°04'39.26"E 12°55'05.63" |
 |
|
|
boris37
Starting Member
12 Posts |
Posted - 2009-06-29 : 18:33:48
|
quote: Originally posted by Peso DELETE f-- SELECT *FROM (SELECT ROW_NUMBER() OVER (PARTITION BY MeterID ORDER BY StoredDateTime DESC) AS recIDFROM NFLiquidsHist) AS fWHERE recID > 1 N 56°04'39.26"E 12°55'05.63"
Doesn't work if i add meterid like so:SELECT *FROM (SELECT ROW_NUMBER() OVER (PARTITION BY MeterID ORDER BY StoredDateTime DESC) AS recIDFROM NFLiquidsHist where (MeterID = 8283)) AS fWHERE recID > 1it gives me 2604 random rows that mean nothing. If i remove the meterid it crashes sql server... lol |
 |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2009-06-29 : 18:43:03
|
quote: Originally posted by boris37
quote: Originally posted by Peso DELETE f-- SELECT *FROM (SELECT ROW_NUMBER() OVER (PARTITION BY MeterID ORDER BY StoredDateTime DESC) AS recIDFROM NFLiquidsHist) AS fWHERE recID > 1 N 56°04'39.26"E 12°55'05.63"
Doesn't work if i add meterid like so:SELECT *FROM (SELECT ROW_NUMBER() OVER (PARTITION BY MeterID ORDER BY StoredDateTime DESC) AS recIDFROM NFLiquidsHist where (MeterID = 8283)) AS fWHERE recID > 1it gives me 2604 random rows that mean nothing. If i remove the meterid it crashes sql server... lol
Peso's query will delete the duplicates for you (apparently 2604) of them. You can change the query to see what it would delete:SELECT *FROM ( SELECT MeterID, StoredDateTime, ROW_NUMBER() OVER (PARTITION BY MeterID ORDER BY StoredDateTime DESC) AS recID FROM NFLiquidsHist where MeterID = 8283) AS fWHERE recID > 1 |
 |
|
|
boris37
Starting Member
12 Posts |
Posted - 2009-06-30 : 11:48:56
|
| No... it is giving me all the rows not just duplicates. 2604 is all the rows for this meter. I get rows like this with yours:8283 2008-09-30 13:01:00.000 2738283 2008-09-30 13:00:00.000 2748283 2008-09-29 13:01:00.000 2758283 2008-09-28 13:01:00.000 2768283 2008-09-27 13:01:00.000 2778283 2008-09-26 13:01:00.000 2788283 2008-09-25 13:01:00.000 2798283 2008-09-24 13:01:00.000 2808283 2008-09-24 13:00:00.000 2818283 2008-09-23 13:01:00.000 2828283 2008-09-23 13:00:00.000 2838283 2008-09-22 13:00:00.000 2848283 2008-09-21 13:00:00.000 2858283 2008-09-20 13:00:00.000 2868283 2008-09-19 13:00:00.000 287As you can see i am getting all the dates not just diplicates. On this one meter i should only have 6 or 7 duplicate entries. |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2009-06-30 : 12:52:21
|
| so as per your rule you want a single row out of above set? or one for each day? |
 |
|
|
boris37
Starting Member
12 Posts |
Posted - 2009-06-30 : 16:04:49
|
quote: Originally posted by visakh16 so as per your rule you want a single row out of above set? or one for each day?
one for each day... i'm pretty sure i have said that... We don't have a PK in this table so the closest thing is day date. Not my table... |
 |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2009-06-30 : 16:27:54
|
Just a slight change should fix that:SELECT *FROM ( SELECT MeterID, StoredDateTime, ROW_NUMBER() OVER (PARTITION BY MeterID, StoredDateTime ORDER BY StoredDateTime DESC) AS recID FROM NFLiquidsHist where MeterID = 8283) AS fWHERE recID = 1 |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2009-06-30 : 16:40:22
|
(PARTITION BY MeterID, DATEDIFF(DAY, 0, StoredDateTime) ORDER BY StoredDateTime DESC) N 56°04'39.26"E 12°55'05.63" |
 |
|
|
X002548
Not Just a Number
15586 Posts |
|
|
boris37
Starting Member
12 Posts |
Posted - 2009-07-02 : 12:25:31
|
| Thanks X002548 helps a lot but i got the query working....SELECT *FROM ( SELECT MeterID, StoredDateTime, ROW_NUMBER() OVER (PARTITION BY MeterID, DATEDIFF(DAY, 0, StoredDateTime) ORDER BY StoredDateTime DESC) AS recID FROM NFLiquidsHist where MeterID = 8283) AS fWHERE recID = 2seems to output the correct data (=1 outputs all rows minus duplicates)I have bookmarked your thing though just to clarify my next post i make(which yes i will make one eventually). Thanks again everyone*Thread Closed* please :) |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2009-07-02 : 12:31:38
|
If you have a "MeterID = 8232" in the derived table,there is no need to keep the "PARTITION BY MeterID".And if you want all duplicates, change "recID = 2" to "recID > 1". Microsoft SQL Server MVPN 56°04'39.26"E 12°55'05.63" |
 |
|
|
|
|
|
|
|