| Author |
Topic |
|
nmg196
Yak Posting Veteran
70 Posts |
Posted - 2003-12-04 : 12:40:06
|
Hi,We recently had a problem whereby one of our projects was deleting all the data from a couple of tables for seeming no reason whatsoever. After many many hours of searching, we narrowed the bug down to this very simple bit of code:SELECT VersionID INTO #versions FROM tResource_tResVersion WHERE ResourceID IN ( SELECT ResourceID FROM tResource WHERE SectionID = 56) The problem was caused by the part shown in red. This should be ID rather than ResourceID (there is no column called ResourceID in the tResource table).The wierd and annoying thing is, this validates OK in query analyser!!! WHY?! It also validates OK if you click the "Check Syntax" button on the stored procedure that had this code in. In fact the query runs and returns with no errors despite the fact that that column doesn't exist (but does the wrong thing). How on earth can this happen?I'm not sure what the precise resultant effect was, but it seemed to be producing a list of ALL of the VersionIDs in the tResource_tResVersion table, instead of the ones specified by the subquery.This twice resulted in all data being lost.WHY WHY WHY can this happen? Does the query analyser not validate subqueries? How can it execute completely invalid SQL with no errors?If I change the table name, then it throws an error, but it really doesn't care if the columns do or don't exist - either when validating or executing the query. Am I being stupid?We are using SQL Server 2000 SP3a.Any thoughts appreciated.Nick... |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-04 : 12:54:07
|
| I would put a call into MS about this. It sounds like a bug that hasn't been corrected yet.Tara |
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2003-12-04 : 12:55:27
|
| doesn't seem plausable...can you post the DDL for the 2 tables so we can replicate your problem?Brett8-) |
 |
|
|
drymchaser
Aged Yak Warrior
552 Posts |
Posted - 2003-12-04 : 12:59:13
|
| Does the sub-query run successfully as a stand-alone command? |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-04 : 13:15:51
|
| I just did the check in Query Analyzer and it checks fine. I did not setup the tables first. So the check syntax option does not check that the objects exist.Tara |
 |
|
|
nmg196
Yak Posting Veteran
70 Posts |
Posted - 2003-12-04 : 16:30:05
|
quote: Originally posted by X002548 doesn't seem plausable...can you post the DDL for the 2 tables so we can replicate your problem?
Sure - run this:CREATE DATABASE [TestDB]GOuse [TestDB]GOCREATE TABLE [dbo].[tResource] ( [ID] [int] NOT NULL , [SectionID] [int] NOT NULL ) ON [PRIMARY]GOCREATE TABLE [dbo].[tResource_tResVersion] ( [VersionID] [int] NOT NULL , [ResourceID] [int] NOT NULL ) ON [PRIMARY]GOALTER TABLE [dbo].[tResource] WITH NOCHECK ADD CONSTRAINT [PK_tResource] PRIMARY KEY CLUSTERED ([ID]) ON [PRIMARY] GOALTER TABLE [dbo].[tResource_tResVersion] WITH NOCHECK ADD CONSTRAINT [PK_tResource_tResVersion] PRIMARY KEY CLUSTERED ([VersionID]) ON [PRIMARY] GOCREATE PROCEDURE TestProc ASSELECT VersionIDFROM tResource_tResVersion WHERE ResourceID IN ( SELECT ResourceID FROM tResource WHERE SectionID = 10)GOINSERT INTO tResource (ID,SectionID) VALUES (1,10)INSERT INTO tResource (ID,SectionID) VALUES (2,11)INSERT INTO tResource (ID,SectionID) VALUES (3,12)INSERT INTO tResource_tResVersion (VersionID,ResourceID) VALUES (1,1)INSERT INTO tResource_tResVersion (VersionID,ResourceID) VALUES (2,1)INSERT INTO tResource_tResVersion (VersionID,ResourceID) VALUES (3,1)INSERT INTO tResource_tResVersion (VersionID,ResourceID) VALUES (4,2)INSERT INTO tResource_tResVersion (VersionID,ResourceID) VALUES (5,2)INSERT INTO tResource_tResVersion (VersionID,ResourceID) VALUES (6,2)GO You'll see that if you run TestProc, it executes fine with no errors even though there is no column called ResourceID in table tResource. Instead of throwing an error, it returns ALL rows from tResource_tResVersion. Because these rows were being used to decide what should be deleted, we lost all our data every time someone called the stored procedure with the erroneous code in it.It seems to be validating the column names for the inner query on the OUTER table! ie, it accepts ResourceID and VersionID, but not "DifferentID" which isn't in either table. A very annoying bug.Nick... |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-04 : 16:35:19
|
| Oh my! I wouldn't have believed it unless I ran it myself. I've got service pack 3a as well. Can someone not on service pack 3a run this?It is supposed to return only 1, 2, 3 in the result set, but it returns all of them.Tara |
 |
|
|
Granick
Starting Member
46 Posts |
Posted - 2003-12-04 : 17:15:48
|
| I ran this on our a test box here, and it doesn't have SP3a on it, and it displays the same results. If you run the SELECT in the IN clause by itself, it gives the errror, otherwise, it just returns all the results.If you put the TableName in front of columns, it will correctly spit out the error you are expecting about invalid column name. Example:SELECT zResource_tResVersion.VersionIDFROM zResource_tResVersionWHERE ResourceID IN (SELECT zResource.ResourceID FROM zResource WHERE SectionID = 10)It errors with if you alias the tables and use the alias as well.I guess this is just another reminder of why I always try to preface columns with the tablenames.Shannon |
 |
|
|
nmg196
Yak Posting Veteran
70 Posts |
Posted - 2003-12-04 : 17:18:19
|
quote: Originally posted by tduggan It is supposed to return only 1, 2, 3 in the result set, but it returns all of them.
Well as it stands it's supposed to throw an error and not even run! But somehow it does run, and returns everything for some unknown reason. But that's probably what you meant...Nick... |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-04 : 17:21:15
|
| Yes that's what I meant.Tara |
 |
|
|
chadmat
The Chadinator
1974 Posts |
Posted - 2003-12-04 : 17:36:03
|
I think I know what is going on here, and I can explain why it's not a bug SELECT VersionIDFROM tResource_tResVersion WHERE ResourceID IN ( SELECT ResourceID FROM tResource WHERE SectionID = 10)Above is the code in question. ResourceID does exist in the query (Not in the tResource table mind you) for each record it is the value of ResourceID on the outter query.So, for example, for the first row, ResourceID is 1, so, the inner query is translated to:SELECT 1 FROM tResource WHERE SectionID = 10Which returns 1. Get it?-Chadhttp://www.clrsoft.comSoftware built for the Common Language Runtime. |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2003-12-04 : 17:42:21
|
| Well that explains it. If you've got columns that have the same meaning, then name the same. We've got legacy systems here that do the same thing. In the parent table the column is called ID, but in the child table the column is called TableName_ID. Stop the insanity! Name them both TableName_ID or TableNameID or something similar.Tara |
 |
|
|
nmg196
Yak Posting Veteran
70 Posts |
Posted - 2003-12-04 : 19:00:39
|
| I don't agree.All you have explained is what is happening (which I had already stated) and not why this is not a bug.Read the code - it says "SELECT ResourceID FROM tResource" and that is exactly what you would expect it to do. I think 99% or more of people would not expect that code to do what it does - they would expect it to be a compile-time error (run-time at worst).I'm sure it should not be using ResourceID from the outer query just because it can't find it in the inner query. The clue is: this line of code is erroneous if you run it on it's own and if you fix it so it says ID instead of ResourceID, then it works perfectly (despite the two being in totally different tables). It should not pick a random table from another part of the query from which to pick it's values. |
 |
|
|
chadmat
The Chadinator
1974 Posts |
Posted - 2003-12-04 : 19:25:09
|
| It doesn't pick at random. ResourceID is "In-Scope" because it is valid in the outer query. If there were two tables in which the column existed in the outer query I bet you would get an ambiguous reference error. I realize what the code says, and if you understand what I tried to explain you would see what it is doing. The reason it gives an error if you just run the inner query is because ResourceID is not in scope anymore. Without the Outer portion, there is no ResourceID. Effectively, the query you are running is:SELECT VersionIDFROM tResource_tResVersion WHERE ResourceID IN (1,1,1,2,2,2)Run the following in pubs:select * from authorswhere au_id in (select au_id from titles)Same thing there, your inner query returns all of the au_ids. This is an expected result if you really think about it. At first glance, I agree it looks weird, and I thought it was a bug as well, but when I really thought about what was happening, it made sense.-Chadhttp://www.clrsoft.comSoftware built for the Common Language Runtime. |
 |
|
|
chadmat
The Chadinator
1974 Posts |
Posted - 2003-12-04 : 19:29:20
|
| Exactly as I suspected, the following returns an ambiguos reference error:select * from authors a join titleauthor t on a.au_id = t.au_idwhere a.au_id in (select au_id from titles)-Chadhttp://www.clrsoft.comSoftware built for the Common Language Runtime. |
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2003-12-05 : 09:32:27
|
| Check out the plan...It resolves the IN as a join...and it says (I can't cut and paste it, but):WHERE ([tResource_tResVersion].[ResourceID]) = ([tResource_tResVersion].[ResourceID])Meaning it's trying to coorelate it...I think (Like in I think I'll order a margaritta)..Need to look up on the rules for coorelation...I don't know (boy that happens a lot) but I'm not sure I'd say it's a bug...Brett8-) |
 |
|
|
nmg196
Yak Posting Veteran
70 Posts |
Posted - 2003-12-05 : 09:38:07
|
| Well if it's not a bug, then it's incredibly unintuitive. I certainly don't think that it should behave like that. It should force you to fully qualify the table if more than one table is in scope.Nick... |
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2003-12-05 : 09:44:59
|
quote: Originally posted by nmg196 Well if it's not a bug, then it's incredibly unintuitive. I certainly don't think that it should behave like that. It force you to fully qualify the table if more than one table is in scope.Nick...
I'm not saying it forces you to fully qualify the table, but the plan shows what it's "thinking".It "thought" that's what you wanted, because hey...must be because, there ain't no column like that in the inner query....Is it the code or the optimizer that's confusing?Brett8-) |
 |
|
|
nmg196
Yak Posting Veteran
70 Posts |
Posted - 2003-12-05 : 09:51:47
|
| I meant "It should force you to fully qualify..." - I've corrected that post now :)I find the language syntax confusing. You would think that it should cause an error. The subquery should be standalone, but in this context, is invalid if not used within the outer query. I don't really think that's how most people would expect it to work. It may not be a bug as such, but it's very unintuitive. |
 |
|
|
chadmat
The Chadinator
1974 Posts |
Posted - 2003-12-05 : 11:20:35
|
| It is just a repercusion of a mistake in code. I have had similar things happen when writing C# or C++ code, I make a mistake that compiles, then it takes a long time to figure out what the problem is. Just because the optimizer doesn't know what you "Intend" to do, doesn't make it a bug if you put valid syntax in, and the output is unexpected.-Chadhttp://www.clrsoft.comSoftware built for the Common Language Runtime. |
 |
|
|
LarsG
Constraint Violating Yak Guru
284 Posts |
Posted - 2003-12-07 : 09:52:23
|
| The query is not valid according to the ANSI SQL specification. |
 |
|
|
Next Page
|