Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 CASE Creates Duplicate Rows

Author  Topic 

skinfreak
Starting Member

2 Posts

Posted - 2011-04-18 : 12:15:30
I am by no means a SQL programmer but trying my best. I have a problem where using the Case Expression introduces duplicate rows to the returned data set.

If I take out the code with the CASE expression, the query executes correctly. I have tried using a MAX function but this returns an error because of thel ogic witihn the CASE expression I think. I am stuck!

My code is as follows:

SELECT DISTINCT vwTMActualOperation.OperationRecordID,
vwTMActualOperation.SpecialtyDescription,
vwSPPlannedSession.StartDate,
vwTMActualOperation.LocationOfOperationDescription,
vwSPActualSession.ConsultantName,
/* Lots more to add....*/

(CASE
WHEN (CheckListID = 'PREOP' and QuestionID in (3,5) and RelatedQuestionID = 2 and QuestionText IS NULL) THEN
(SELECT (vwTMPatientChecklist.QuestionID))

WHEN (CheckListID = 'PREOP' and vwTMPatientChecklist.QuestionID = 25 AND vwTMPatientChecklist.RelatedQuestionID = 24) THEN
(SELECT (vwTMPatientChecklist.QuestionID))

/*Ignore ELSE if happy with NULL values*/
/*ELSE 0*/
END) Completed,


vwTMActualOperation.CasenoteNumber

FROM (((((vwSPPlannedSession
INNER JOIN vwTMPlannedOperation ON vwSPPlannedSession.SessionRecordID=vwTMPlannedOperation.SessionRecordID)
INNER JOIN vwFMLocation ON vwSPPlannedSession.DepartmentLocationCode=vwFMLocation.LocationCode)
LEFT OUTER JOIN vwSPActualSession ON vwSPPlannedSession.SessionRecordID=vwSPActualSession.SessionRecordID)
INNER JOIN vwTMActualOperation ON vwTMPlannedOperation.OperationRecordID=vwTMActualOperation.OperationRecordID)
INNER JOIN vwTMPatientChecklist ON vwTMActualOperation.OperationRecordID=vwTMPatientChecklist.OperationRecordID)
INNER JOIN vwSYHospitalSite ON vwFMLocation.HospitalSiteCode=vwSYHospitalSite.HospitalSiteCode


WHERE (vwTMPlannedOperation.StartDate BETWEEN '01/03/2011' AND '31/03/2020')
AND vwSYHospitalSite.HospitalSiteCode IN (1,2,4,7)

Many thanks

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-04-18 : 13:12:29
Since you have a DISTINCT clause in your select query, you will not get duplicate rows. Rows that may look like they are duplicates will have at least one column that differ. So, to diagnose the problem, take two rows that appear to be duplicates, look at each column and see where they differ. Once you know that, then debugging would be easier.
Go to Top of Page

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2011-04-18 : 15:08:09
>> I am by no means a SQL programmer but trying my best. I have a problem where using the CASE expression introduces duplicate rows to the returned data set. <<

You have a lot of classic noob errors IN the data elements and failed to post DDL. But you knew that it IS a CASE expression AND not a statement. But a good SQL programmer will always use an ELSE clause. It might be a simple “ELSE NULL” or “ELSE CAST (<data type> AS NULL)” if you need to be safe and document the data type of the expression.



Since you used a SELECT DISTINCT, redundant duplicate rows are impossible. The SELECT clause IS applied last.

Please post real DDL. Learn to use ISO-11179 rules for the data element names, avoid needless dialect and use ISO-8601 temporal formats, codes and so forth. People cannot read your mind, so post your code and clear specs if you really want help.

The first classic error is the “Volkswagen” table name. Noobs who do not understand that VIEWs are tables or know the ISO-11179 rules or basic data modeling, do this silly prefix. Likewise, we laugh at “tmp_” prefixes, but they are worse because all too often they are being used as scratch tapes in a step-by-step procedural process; we use derived tables and CTEs that can be optimized.

The SELECT DISTINCT is rarely used in a properly designed schema. It is a “code smell”, symptom and not proof. If we had DDL, we would know. Likewise names like
“operation_record_id” are also suspect. Is it an id or a number? Does it identify a (paper) _records that then hold the information about the operation? Or is this data about the operation itself?

“location_of_operation_description” is a total nightmare; why not add “_value” to the end of it? This is the location, isn't it? So call it “operation_location”! You have a start_date, but it is the start of nothing in particular. The exam? The operation? The lawsuit that follows? Something else?

The word “completed” is a status value and not an attribute. Your singular table names say that you have only one row in them; an SQL programmer will use plural or collective names (I.e Employee = bad, Employees = better, Personnel = good).

“specialty_description” is weird; we used Med-Data and Phoenix codes for specialties and not a narrative text. A look-up table could provide that from the code.

Those Parens are a bitch to read – unless you are a LISP programmer :)SQL Programmers tend to not build chains of INNER and OUTER JOINs. The rule is that infixed joins are (effectively) done in left to right order, so an odd OUTER JOIN in the list is hard to maintain.

Expert SQL programmers use the list notation for INNER JOINs. I can go into the mindset later. We also look for joins that can be put into a derived table expression that can be given a name. This makes it easy to pull it out as a VIEW or CTE. It also gives us a table name to put on the columns – this can document the code much better.

Consider this from your code:

(SELECT SP_SS.session_record_id, FM_L.location_code,
..
FROM SP_Planned_Sessions AS SP_SS,
TM_Planned_Operation AS TM_PO,
FM_Locations AS FM_L – why do you need this table?
WHERE SP_SS.department_location_code = FM_L.location_code)
AND SP_SS.session_record_id = TM_PO.session_record_id)
AS Planned_Operation_Locations (session_record_id, location_code, ..)

Does this help? It will take years to learn SQL, so be patient :)


--CELKO--
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Go to Top of Page

skinfreak
Starting Member

2 Posts

Posted - 2011-04-20 : 09:26:36
quote:
Originally posted by jcelko


You have a lot of classic noob errors IN the data elements and failed to post DDL. But you knew that it IS a CASE expression AND not a statement. But a good SQL programmer will always use an ELSE clause. It might be a simple “ELSE NULL” or “ELSE CAST (<data type> AS NULL)” if you need to be safe and document the data type of the expression.


I think that is included in the /*ELSE 0*/ ?

I am interogating a large system - including the table or view so I am not sure how much detail you want with regards to the makeup of the views I am pulling data from. I then intend to feed this into Crystal Reports for further manipulation.

The rows differ based on the two entries in the CASE expression. If I comment out one of the WHEN conditions, only one row per OperationRecordID is returned.

quote:
“location_of_operation_description” is a total nightmare; why not add “_value” to the end of it? This is the location, isn't it? So call it “operation_location”! You have a start_date, but it is the start of nothing in particular. The exam? The operation? The lawsuit that follows? Something else?

The schema was created by the software vendors so I have no control over it unfortunately. The start_date is the start date of the operation and "location_of_operation_description" is the description with a comanion field "location_of_operation" featuring the code of the location.

The codes are in there but for my query I want to exclude them as I will be more or less outputting straight to Crystal.

quote:
Those Parens are a b*tch to read – unless you are a LISP programmer :)SQL Programmers tend to not build chains of INNER and OUTER JOINs. The rule is that infixed joins are (effectively) done in left to right order, so an odd OUTER JOIN in the list is hard to maintain.

Outer join should not be in there. Should be inner joins mainly - oops.
Go to Top of Page
   

- Advertisement -