Author |
Topic |
bmsra79
Starting Member
24 Posts |
Posted - 2012-09-26 : 07:18:44
|
I have a table like this:ID, QTRYR, Col1, Col2, Col3, Col4, Col5301, 1Q2012, Not Provided, 1, <3, ND, 5301, 1Q2012, 1, NULL, NULL, 4, 1301, 1Q2012, 2, NULL, NULL, NULL, NULL101, 2Q2012, NA, 1, NULL, ND, NA101, 2Q2012, Not Done, NULL, NULL, OK, 1101, 2Q2012, 2, NULL, NULL, NULL, NULLExpected output:----------------ID, QTRYR, Col1, Col2, Col3, Col4, Col5301, 1Q2012, 1, 1, <3, 4, 5101, 2Q2012, 2, 1, NULL, OK, 1Take the first valid value among records of the same ID and QTRYR. |
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-09-26 : 07:29:40
|
You can do what you described only if you have a way of ordering the rows in the table - because by definition, the rows in a database table are an unordered collection. So if you take the first 3 rows in your example with ID=301, QTRYR=1Q2012, SQL Server has no concept of which comes first - second, or third. So you have to tell it how to determine the order. If there is another column or columns (such as a sequence id, timestamp etc.) that can be used to order the data. If there is, then we can write a query to get what you described relatively easily.Alternatively, if you don't care about which value among the non-null values you pick, then you don't need a way to order the rows. |
|
|
bmsra79
Starting Member
24 Posts |
Posted - 2012-09-26 : 07:40:15
|
There is a datetime field in this table too.ID, DATE1, QTRYR, Col1, Col2, Col3, Col4, Col5 |
|
|
bmsra79
Starting Member
24 Posts |
Posted - 2012-09-26 : 07:49:24
|
Updated with Date:ID, DATE1, Col1, Col2, Col3, Col4, Col5301, 4/30/12 7:36 PM, Not Provided, 1, <3, ND, 5301, 5/3/12 4:59 PM, 1, NULL, NULL, 4, 1301, 5/31/12 6:09 PM, 2, NULL, NULL, NULL, NULL101, 5/17/12 7:21 PM, NA, 1, NULL, ND, NA101, 5/24/12 2:38 PM, Not Done, NULL, NULL, OK, 1101, 6/8/12 7:22 PM, 2, NULL, NULL, NULL, NULLExpected output:----------------ID, QTRYR, Col1, Col2, Col3, Col4, Col5301, 2Q2012, 1, 1, <3, 4, 5101, 2Q2012, 2, 1, NULL, OK, 1 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-09-26 : 07:50:36
|
In your example, for ID=101 and QTRYR, why did you decided to pick "OK" for col4 rather than ND? What are the rules on what to pick and what not to pick? When I saw this initially, I thought you wanted to pick the first numeric value that you come across. |
|
|
bmsra79
Starting Member
24 Posts |
Posted - 2012-09-26 : 07:53:44
|
Preference is for Numeric values, then <1, >1, then +,-,positive, negative, OKThere are other valid values too, but this is just an indicator. |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-09-26 : 07:57:08
|
That makes it much harder. I don't know of a simple way to do this. Hopefully someone else will have some thoughts. |
|
|
bmsra79
Starting Member
24 Posts |
Posted - 2012-09-26 : 08:02:31
|
Basically look for all valid values and take the first available one in that Quarter for that ID.Valid Values are Integers, and text containing Integers,+,-, positive, negative, ok |
|
|
bmsra79
Starting Member
24 Posts |
Posted - 2012-09-26 : 13:29:31
|
How would you do it if only variables containing numeric values were to be considered. |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-09-26 : 13:45:16
|
Here is an example of how you can write the query if you had to consider only numeric values. You can copy and paste this code to an SSMS query window to run it and see what it does.The problem when you have a variety of valid values is that the case expressions will need to consider all of those conditions. It can be done, but the query gets ugly very fast.CREATE TABLE #tmp ( ID INT, DATE1 DATETIME, Col1 VARCHAR(32), Col2 VARCHAR(32), Col3 VARCHAR(32), Col4 VARCHAR(32), Col5 VARCHAR(32));INSERT INTO #tmp VALUES ('301','4/30/12 7:36 PM','Not Provided','1','<3','ND','5'),('301','5/3/12 4:59 PM','1',NULL,NULL,'4','1'),('301','5/31/12 6:09 PM','2',NULL,NULL,NULL,NULL),('101','5/17/12 7:21 PM','NA','1',NULL,'ND','NA'),('101','5/24/12 2:38 PM','Not Done',NULL,NULL,'OK','1'),('101','6/8/12 7:22 PM','2',NULL,NULL,NULL,NULL);;WITH cte1 AS( SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date1) AS RN FROM #tmp),cte2 AS( SELECT * FROM cte1 WHERE RN = 1 UNION ALL SELECT c2.ID, c1.Date1, CASE WHEN ISNULL(c2.col1,'') NOT LIKE '%[0-9]%' THEN c1.col1 ELSE c2.col1 END, CASE WHEN ISNULL(c2.col2,'') NOT LIKE '%[0-9]%' THEN c1.col2 ELSE c2.col2 END, CASE WHEN ISNULL(c2.col3,'') NOT LIKE '%[0-9]%' THEN c1.col3 ELSE c2.col3 END, CASE WHEN ISNULL(c2.col4,'') NOT LIKE '%[0-9]%' THEN c1.col4 ELSE c2.col4 END, CASE WHEN ISNULL(c2.col5,'') NOT LIKE '%[0-9]%' THEN c1.col5 ELSE c2.col5 END, c1.RN FROM cte2 c2 INNER JOIN cte1 c1 ON c1.RN = c2.RN+1 AND c1.ID = c2.ID),cte3 AS( SELECT *,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RN DESC ) AS RN2 FROM cte2)SELECT ID, DATE1, Col1, Col2, Col3, Col4, Col5 FROM cte3 WHERE RN2 = 1;DROP TABLE #tmp; |
|
|
bmsra79
Starting Member
24 Posts |
Posted - 2012-11-01 : 14:01:10
|
Slight change required in code.Get only those records having DATE1 within 6 months of DATE2 or DATE3. Where can I add these conditions into the code.Final Output:ID, DATE1, Col1, Col2, Col3, Col4, Col5, DATE2, DATE3 |
|
|
|
|
|