albertkohl
Aged Yak Warrior
740 Posts |
Posted - 2009-02-23 : 13:31:19
|
hello all, i was wondering if there is a better way to update approx 222 million rows in a table that's 313gb large. currently i'm using:

update t1
set areacode    = substring(phone, 1, 3),
    citystate   = rtrim(t1.city) + ' ' + state,
    countystate = rtrim(t1.county_name) + ' ' + state,
    ltv = case when HMVAL > '0' AND HMVAL <> '0000'
               then cast(mortgage_amount as float) / cast(HMVAL as float)
          end
from mytable t1

i know the citystate and countystate columns are redundant, but we use them for a webapp i wrote a while ago. using better code for that is on the list of things to do, but it's a very long list :(

is there a better way for me to update all these rows? i'm looking for speed; i'm not really worried about locks or up-time. the database model is simple. Thanks!

(also, it might be worth mentioning that a) this is an olap table we update only quarterly, and b) i have a uid column that is an incremented identity, so updating in batches was one idea.)
|
|
sodeep
Master Smack Fu Yak Hacker
7174 Posts |
Posted - 2009-02-23 : 13:43:26
|
Yes, do it in batches so you don't fill up the log.
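One caveat worth spelling out: batching only keeps the log small if the log space can actually be reclaimed between batches. Under the full recovery model that means backing up the log; under simple recovery a checkpoint is enough. A minimal sketch (the database name and backup path are hypothetical):

```sql
-- run between batches to keep the transaction log from growing:

-- full recovery model: back up the log (hypothetical path)
BACKUP LOG mydb TO DISK = 'E:\logbackups\mydb_log.trn';

-- simple recovery model: a checkpoint lets committed log space be reused
CHECKPOINT;
```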
 |
|
|
albertkohl
Aged Yak Warrior
740 Posts |
Posted - 2009-02-23 : 13:46:49
|
can you give me an idea of how to write a loop or something that'll do the job? i imported the flat text file in 500,000-record increments, and that appeared to work well while growing the logs a minimal amount.
 |
|
|
mfemenel
Professor Frink
1421 Posts |
Posted - 2009-02-23 : 14:00:32
|
Something like this would run the statement in million-row chunks, up to the max of the identity column in your table.

declare @max int
declare @counter int

select @max = MAX(uid) from tablename
set @counter = 1

while @counter <= @max
begin
    update tablename
    set field = xyz
    where uid between @counter and @counter + 999999

    set @counter = @counter + 1000000
    select @counter
end

Mike
"oh, that monkey is going to pay"
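A common alternative, in case the uid values are sparse or you'd rather not track range boundaries yourself, is to update a fixed-size chunk with TOP and loop until @@ROWCOUNT reports nothing left. This is a sketch, not the poster's code; it assumes some predicate that identifies not-yet-updated rows (here the hypothetical condition `areacode IS NULL`):

```sql
-- repeat fixed-size batches until no rows qualify
WHILE 1 = 1
BEGIN
    UPDATE TOP (1000000) t1
    SET areacode = SUBSTRING(phone, 1, 3)
    FROM mytable t1
    WHERE areacode IS NULL;   -- hypothetical "not yet updated" predicate

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left to update
END
```

The trade-off: no dependence on a contiguous identity, but each batch must be able to find its rows cheaply, so the WHERE predicate should be indexed or the scans will get progressively more expensive.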
 |
|
|
albertkohl
Aged Yak Warrior
740 Posts |
Posted - 2009-02-23 : 14:09:22
|
SWEET! thanks. now, last question: i know it's a HUGE SQL no-no, but is there any way to run this update w/o locking the table and creating blocks? until the update is done and i finish indexing, i pull orders off the table manually. none of the fields i use will be affected by this update, so i'm not worried about dirty reads. Thanks!
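Since dirty reads are explicitly acceptable here, one standard option is to have the SELECT side read past the update's locks with the READ UNCOMMITTED isolation level or the equivalent NOLOCK table hint. A sketch with hypothetical column names and lookup value, not the poster's actual queries:

```sql
-- readers opt into dirty reads so the big UPDATE doesn't block them
SELECT uid, phone, city, state      -- hypothetical column list
FROM mytable WITH (NOLOCK)
WHERE uid = 12345;                  -- hypothetical lookup

-- equivalently, for every read in the session:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
```

The usual caveat applies: NOLOCK reads can see uncommitted, duplicated, or missing rows while the update is in flight, which is exactly the trade the poster says is fine for these columns.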
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
albertkohl
Aged Yak Warrior
740 Posts |
Posted - 2009-02-23 : 14:35:55
|
that's what i was talking about; sorry i wasn't clear. but i would be running select statements against that table while it's updating. i wouldn't really be worried about the columns being updated, though, because like i said my web-app uses those, and i haven't cut over from our old to our new update yet. so do you think if i do 1000 rows at a time i would cut down on locking either way?
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
mfemenel
Professor Frink
1421 Posts |
Posted - 2009-02-23 : 15:12:16
|
In my earlier post I didn't mention timing. If you don't have a maintenance window to run this in, then at least choose an off-peak time (late night, early morning, or a weekend). It will run faster, your users will notice it less, and generally everyone will be happier (except maybe you, working off hours... but that's the job).

Mike
"oh, that monkey is going to pay"
 |
|
|
albertkohl
Aged Yak Warrior
740 Posts |
Posted - 2009-02-23 : 15:22:50
|
thanks guys!!!

so, one more quick question: will larger batches cause the entire job to finish sooner, or does the difference between large and small batches really only matter for locking?

Thanks!
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
albertkohl
Aged Yak Warrior
740 Posts |
Posted - 2009-02-23 : 15:32:45
|
got ya. okay, another one :D (i know, i'm full of them). should i index the table by the UID and the columns i'm updating? right now it's a raw table, no indexing or partitioning... so it seems like finding the UID blocks to update is taking longer.
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
mfemenel
Professor Frink
1421 Posts |
Posted - 2009-02-23 : 15:54:43
|
IMHO, indexing the UID makes sense if it's part of your where or join clause. I would not create a clustered index, though, or you'll slow yourself down quite a bit, because you'd end up updating both the table and the index. I don't see any benefit to indexing the columns you're updating.

Mike
"oh, that monkey is going to pay"
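A minimal sketch of that advice, with a hypothetical index name: a nonclustered index on uid lets each batch's `WHERE uid BETWEEN ...` do an index seek instead of scanning the whole heap, and it can be dropped once the one-off update is finished:

```sql
-- hypothetical name; nonclustered so the 313gb table itself isn't rebuilt
CREATE NONCLUSTERED INDEX IX_mytable_uid ON mytable (uid);

-- ... run the batched update ...

-- optional: drop it afterward if it was only needed for this job
DROP INDEX IX_mytable_uid ON mytable;
```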
 |
|
|
albertkohl
Aged Yak Warrior
740 Posts |
Posted - 2009-02-23 : 16:48:16
|
sounds good. thank you all again; i'm in the process of running it now. Much appreciated!
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
albertkohl
Aged Yak Warrior
740 Posts |
Posted - 2009-02-23 : 17:25:50
|
well, i have it running on one server now (since before i started this post), and it's been running for 5hrs; it should be done soon. once it is, i have the same exact table in a test DB on the same server, and i'm gonna try it this way and see if i can shave some time off. hopefully i can. switching from the import/export wizard to bulk insert cut my import time from 8hrs down to 2, so i'm sure this will have a huge impact as well. not to mention i'm not always gonna have a few terabytes free on the server for log expansion during huge updates like this, so cutting down the logging will help a lot in the future. and this has a number of other applications for me too...

Thanks again. i'll give you guys some numbers when i'm done, if you're curious.
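For reference, the bulk-insert approach mentioned above usually looks something like this. The file path, terminators, and batch size here are hypothetical placeholders, not the poster's actual values; TABLOCK is what allows a minimally logged load into a heap under the simple or bulk-logged recovery models:

```sql
-- hypothetical path and format; adjust terminators to match the flat file
BULK INSERT mytable
FROM 'D:\imports\quarterly_data.txt'
WITH (
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR   = '\n',
    BATCHSIZE       = 500000,  -- commit in chunks, as the poster did
    TABLOCK                    -- needed for minimal logging on a heap
);
```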
 |
|
|
|