Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
MrQuizzles
Starting Member
20 Posts |
Posted - 2010-01-13 : 16:52:39
|
Ok, another question:I have two tables with the following structuresDataCatalog-ID-nameuserDataAccess-swkyasgn_id-dataCatalogItemThere is a one-to-many relation from DataCatalog and userDataAccess. Each item in DataCatalog can have many items in userDataAccess.What I need to do is, given a specific swkyasgn_id, list all items in DataCatalog and whether or not an entry exists for each item in userDataAccessExample:DataCatalogID name-----------1 item12 item23 item34 item45 item5userDataAccessswkyasgn_id dataCatalogItem------------------------------1 31 41 52 12 32 4Desired Result Set given a swkyasgn_id of 1name exists---------------item1 0item2 0item3 1item4 1item5 1Desired Result Set given a swkyasgn_id of 2name exists---------------item1 1item2 0item3 1item4 1item5 0 The query I'm trying right now isSELECT name, IF EXISTS(SELECT * FROM userDataAccess uda WHERE dc.ID = uda.dataCatalogItem AND uda.swkyasgn_id = 1) (SELECT 1 AS accessible)ELSE (SELECT 0 AS accessible)FROM DataCatalog dc But that doesn't work. It doesn't like the IF statement, which isn't surprising considering I've rarely ever used them before. Can I plop them in the middle of SELECT statements like that? I don't know. |
|
|
vijayisonly
Master Smack Fu Yak Hacker
1836 Posts |
Posted - 2010-01-13 : 17:29:32
|
Sample datadeclare @DataCatalog table(ID int, [name] varchar(10))insert @DataCatalogselect 1 , 'item1'union all select 2, 'item2'union all select 3, 'item3'union all select 4 , 'item4'union all select 5, 'item5'declare @userDataAccess table (swkyasgn_id int, dataCatalogItem int)insert @userDataAccessselect 1, 3union all select 1, 4union all select 1 , 5union all select 2 , 1union all select 2 , 3union all select 2 , 4 Queryselect a.[name], case when b.dataCatalogItem is null then 0 else 1 end as [exists]from @DataCatalog aleft join @userDataAccess b on a.ID = b.dataCatalogItemand b.swkyasgn_id = 1order by a.[name] Result for id = 1name exists---------- -----------item1 0item2 0item3 1item4 1item5 1 Result for id = 2name exists---------- -----------item1 1item2 0item3 1item4 1item5 0 |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2010-01-14 : 00:15:20
|
| just to make it clear. you need to replace b.swkyasgn_id = 1 with b.swkyasgn_id = @id in previous query and you can pass values to @id like 1,2 etc to get result for each. |
 |
|
|
MrQuizzles
Starting Member
20 Posts |
Posted - 2010-01-14 : 12:10:33
|
| Hmmm, it just occurred to me, though, that this is an On^2 operation. In the actual database, each table has about 1000 records, meaning that such a query will cause 1,000,000 checks.I think I'm going to have to find a different way of getting what I want, meaning a bit of clever C#. I'm pretty sure I could get this done in linear time otherwise. |
 |
|
|
vijayisonly
Master Smack Fu Yak Hacker
1836 Posts |
Posted - 2010-01-14 : 12:20:59
|
| Did you atleast try the query in your DB before coming to a conclusion? |
 |
|
|
X002548
Not Just a Number
15586 Posts |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2010-01-14 : 15:30:19
|
quote: Originally posted by MrQuizzles Hmmm, it just occurred to me, though, that this is an On^2 operation. In the actual database, each table has about 1000 records, meaning that such a query will cause 1,000,000 checks.I think I'm going to have to find a different way of getting what I want, meaning a bit of clever C#. I'm pretty sure I could get this done in linear time otherwise.
If you are inferring that it is O(2) by looking at the query plan for Vijay's query and seeing the table scans and nested loops that should not stop you from trying it out. Vijay was demonstrating an example with table variables that are small in size and have no indexes. If you look at the query plan when running against the actual data, you might find it to be very different. |
 |
|
|
MrQuizzles
Starting Member
20 Posts |
Posted - 2010-01-15 : 14:52:36
|
Well, after further analysis, I'll admit I'm wrong. The database engine should be capable of performing the operation at something much closer to O(nlogn). Of course, I really don't know the inner workings of the database engine, so if anyone can tell me the actual cost of the operation, I'd be interested.My plan in C# was to perform two queries with some intermediate processing. What I need to do is create a bunch of checkboxes that are either checked or not dependent on the database (yes, this will create quite a lot of checkboxes, but this is what I'm being asked for). I was going to first do a (SELECT name, ID FROM DataCatalog ORDER BY ID ASC) and generate the list of checkboxes with their associated names from that, setting the ID of the checkbox to the associated ID from the query. Then I would perform a second query (SELECT dataCatalogItem FROM userDataAccess WHERE swkyasgn_id = *some number* ORDER BY userDataAccess ASC) and iterate through the checkboxes, checking them if their ID shows up in the second query.I figure the first query is linear, the creation of the checkboxes is linear, the second query is linear, and the checking of the boxes, due to its sequential nature (both the checkboxes and the result set are in the same sequence), is also linear, so the whole deal never breaks linear time.So which method, then, would be quicker?Theoretical efficiency points to the second method, but is it this way in practicality?Also, from a purely academic standpoint, would joining the DataCatalog table to a derived table be quicker than the given query.Is this:select a.[name], case when b.dataCatalogItem is null then 0 else 1 end as [exists]from DataCatalog aleft join userDataAccess b on a.ID = b.dataCatalogItem and b.swkyasgn_id = *some number* Faster, slower or the same as this:select a.[name], case when b.dataCatalogItem is null then 0 else 1 end as [exists]from DataCatalog aleft join (SELECT * FROM userDataAccess WHERE swkyasgn_id = *some number*) AS b on a.ID = b.dataCatalogItem and why?I'm a computer science undergrad who's working as a web development intern, so I'm genuinely interested in knowing these things and I certainly don't mind having my mistakes pointed out to me as long as it's explained why they're wrong. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-01-16 : 05:17:13
|
I haven't read the thread, so I'm just looking at your last question comparing the two queries.You can get SQL Server to show you the Query Plan for each. You might find that they are identical. SQL Server can also tell you the logical I/O involved in each, which would equate to "cost given an average world " ("Logical" I/O being different to "Physical", because in the real world all the data you need might be cached in memory anyway).-- Comment in the SHOWPLAN to see the Query Plan, OR the STATISTICS to see the logical stats-- SET SHOWPLAN_TEXT ONGO-- SET STATISTICS IO ON; SET STATISTICS TIME ON*** put query here ***SET STATISTICS IO OFF; SET STATISTICS TIME OFFGOSET SHOWPLAN_TEXT OFFGO If they are not the same I would expect (1) to be faster.You have a SELECT * in your sub-select in (2) - I think the optimiser will fix that to the columns you actually need, but it might not (and you could obviously fix that yourself)If (2) you are forcing how you want the query planner to work. A newer version of SQL, or patch, could come up with a better query plan to (1), or the query plan for (1) might change over time - as the number of rows, or the distribution of the data, or whether you add/drop indexes, impacts things.Personally I would always use (1) and if it was slow 99.99% of the time I would "influence" (well "tune" would be a better word) it in other ways. Usually by adding indexes. |
 |
|
|
MrQuizzles
Starting Member
20 Posts |
Posted - 2010-01-19 : 13:56:40
|
| Oooh, I did not see that I could get it to tell me the query plan. That will be a useful tool in the future. I'm fairly new to Transact-SQL, having been taught with MySQL. Both are ANSI SQL compliant, so they're largely the same. I do very much so enjoy using stored procedures for queries that include user input, though. I'll have to consider Postgres for future open source projects.The queries don't end up being identical. It seems that the creation of the subquery table causes it to compute a scalar for that table, which is causing that query to cost ever so slightly more. |
 |
|
|
|
|
|
|
|