Author |
Topic |
coolest
Starting Member
6 Posts |
Posted - 2007-09-27 : 03:54:49
|
Please help me in this case,We are having one base table and audit table, The Audit table table structure is it will have all the key columns from the base table and the audit timestamp and audit action. For exBase tableKey1 Key2 Nonkey1 Nonkey2 AA BB NK NKAudit TableKey1 Key2 Audit_Time Audit_Action AA BB 2007-09-27 00:00:00 InsertAA BB 2007-09-27 09:00:00 Update The idea is whenever an update or insert is happening in Basetable one row with the corresponding Action will be inserted into the Audit table. This is done using a trigger.And when we select the rows for the transaction we want to select from the base table, if we do a join between the base table and audit table , we will get two rows for the given example. So for avoiding this i worte a query like Select distinct A.* From BaseTable A, AuditTable BWhere A.Keys = B.keys But i am having a problem when my base table is having Text Datatype.Again I rewrite my query asSelect A.* from Basetable AWhere Exists ( Select 'X' From AuditTable B where A.keys = B.Keys)But I feel , this will result in performance issues as it is like using a Corelated SUbquery. And Our Basetable and Audit table will be always huge.IS there any alternative ways for acheving this?? |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-09-27 : 04:52:08
|
Well, EXISTS is very smart about that.It "stops" looking for more records when first match is found.And if you look at the execution plan, it is very similar to your first attempt. E 12°55'05.25"N 56°04'39.16" |
 |
|
eyechart
Master Smack Fu Yak Hacker
3575 Posts |
Posted - 2007-09-27 : 04:57:43
|
how are you capturing the Text column in your trigger? I didn't think that the inserted pseudo table supported text or blobs. anyway, that might be beside the point. do what peter says :)-ec |
 |
|
coolest
Starting Member
6 Posts |
Posted - 2007-09-27 : 06:03:28
|
Thanks for the replies.We are not capturing Text in the Audit tables, we are capturing only the Key columns. Is there any other better way to achieve the Same Functionality? |
 |
|
coolest
Starting Member
6 Posts |
Posted - 2007-09-27 : 06:07:18
|
Thanks for the replies.We are not capturing Text in the Audit tables, we are capturing only the Key columns. Is there any other better way to achieve the Same Functionality? |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-09-27 : 06:40:24
|
Won't there ALWAYS be a row in the Audit table for every row in the Base table (i.e. the one with the "Insert" action)Kristen |
 |
|
coolest
Starting Member
6 Posts |
Posted - 2007-09-27 : 06:46:31
|
Yes definetly there will be a row for insert in the Audit table, But for update also we will be having a row withe the Audit Action as Update. So if two updates happen on the row , there will be three rows in the Audit table and one row in the base table. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-09-27 : 08:35:09
|
In that case surelyWhere Exists ( Select 'X' From AuditTable B where A.keys = B.Keys)will always be TRUE, and is therefore redundant, or have I missed something?Kristen |
 |
|
coolest
Starting Member
6 Posts |
Posted - 2007-09-28 : 00:02:38
|
Actually , I have missed something. We are using the Audit table for reducing the load, It is actually sending the data to a Datawarehouse on monthly basis.Let me explain my scenario.We will load the Transactional data by monthly wise, as we dont have the Timestamp in the Base table we are forced to use the Audit table to identify when a record is modified.So let me assume , we are sending the data on Septempber 1st,2007, So i will write the query like, Select A.* From Base table Awhere exists(Select 'X' From AuditTable B where A.keys = B.Keys andB.Audit_Time > '2007-08-01 00:00:00') So that it will send the August month data only.And when we run it in October, the query will be like Select A.* From Base table Awhere exists(Select 'X' From AuditTable B where A.keys = B.Keys andB.Audit_Time > '2007-09-01 00:00:00'), So it will fetch all the records modified and inserted in the Base table After September 1.Now , i hope the requirement is clear.So , Now when a record is inserted in the base table after september 1, it will be having a entry in the Audit table , and whenever it is modified it will have the corresponding entry in the Audit table. So if i assume , if it is modified 2 times, we will 2 reords in the Audit table and in total we will have 3 records in the Audit table.So when we do a normal join with the base table 3 records will be selected as the Key columns will be always same.For avoiding this duplicates, i want a better performant query.ThanksAnand |
 |
|
coolest
Starting Member
6 Posts |
Posted - 2007-09-28 : 00:14:13
|
Actually , i am having something else in mind for better performance, but i dont know how to implement it.Let me assume , i am having 10000 rows in my base table and only 10 rows inserted in the month of September.The Query isSelect A.* from Basetable Awhere exists (Select 'X' from Audit table B where A.keys = B.Keys and B.AuditTime > '2007-09-01 00:00:00')So for identifying this 10 rows, my query will loop 10000 times as it is a Co-related Subquery based on Base Table.What i am having in mind is , somehow i have to use the AuditTable in the Outer Query and fetch corresponding Non keyfields in the Basetable.So that now the query will loop only for 10 times.As i explained earlier , I will be having all the Keycolumns in the Audit table and I need to select the distinct from the audit table and select the corresponding Nonkey fields from the Base Table.Guide me in implementing this logic.ThanksAnand |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-09-28 : 03:16:11
|
"What i am having in mind is , somehow i have to use the AuditTable in the Outer Query and fetch corresponding Non keyfields in the Basetable. So that now the query will loop only for 10 times."I don't see why, the Query Optimiser is smart enough to correlate the data in the inner EXISTS with the outer query.Make sure you have an index on AuditTable.AuditTime & keys (possible the other way round, try it an see which performs better).Note that the perceived wisdom is to use the conventionEXISTS ( SELECT * FROM ...this is written as giving the optimiser the choice of which column/index to base the Exists on, and although I think that SELECT 'X' and SELECT 1 etc. will work, because of the frequency of SELECT * usage in this construction there might be some shortcuts that the optimiser takes when it sees itAlso, you should leave the hyphens out of string dates like '2007-09-01 00:00:00' - there are configurations, would you believe!, where that won't work whereas '20070901 00:00:00', or even just '20070901', will work.Kristen |
 |
|
|