Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
matty1stop
Starting Member
37 Posts |
Posted - 2007-04-19 : 08:50:11
|
| I'm attempting to create a stored procedure that would transfer data daily from a data warehouse to a data mart. I'm running into slow performance so I'm trying to find the best way to structure the feed to avoid adding duplicate data. The query I'm running is more complex than my example, but just to start out simple lets say I have a table in the warehouse called Transactions with two fields: UserID (varchar(32)) and ProcessTime(datetime). There is one record in this table for every transaction a user processes. I want to populate a table (DailyTranCount, UserID (varchar(32), ProcessDate (smalldatetime), TransactionCount (int) ) in the data mart with a daily count of the user's transactions. The insert statement would look something like this: insert into DailyTranCountselect userid, convert(smalldatetime, processtime, 101), count(*)from Transactionsgroup by userid, convert(smalldatetime, processtime, 101)To avoid adding duplicate data is it best to:A) use NOT EXISTS B) use NOT INC) use a left join to the DailyTranCount including a null check in the where clauseThanks for reading and please let me know if there is any other information I should have provided.Matt |
|
|
harsh_athalye
Master Smack Fu Yak Hacker
5581 Posts |
Posted - 2007-04-19 : 09:08:11
|
| My best bet would be to use LEFT JOIN with check on NULL.Harsh AthalyeIndia."The IMPOSSIBLE is often UNTRIED" |
 |
|
|
LoztInSpace
Aged Yak Warrior
940 Posts |
Posted - 2007-04-19 : 23:19:37
|
| Try it & see the query plans.I recon the optimiser will make IN the same as a join but it depends on other factors |
 |
|
|
Vinnie881
Master Smack Fu Yak Hacker
1231 Posts |
Posted - 2007-04-20 : 00:51:33
|
| The general concensous is NOT EXISTS is better to use in this scenerio (It's been explained to me in a previous post that it's normally better practice to not need to add and join another table unless it's required for the query), but LoztinSpace is correct, check the execution plan on both to see what performs best for you. |
 |
|
|
|
|
|
|
|