Author |
Topic |
Gemeen_Aapje
Yak Posting Veteran
54 Posts |
Posted - 2008-04-24 : 07:07:58
|
Hi,I have a problem with the performance of a search query, and I'm hoping you can help me make it better. I've had no formal SQL training, only what I've taught myself to make my sites.So here go, I have a query that says...Select a, b, c from tblTableNameWhere a = "x" AND b = "y" AND c IN ('value1','value2','value3')A simple enough query I'm sure you will agree. It searches over 10,000 records. It used to perform ok with about 2,000 records, now it's taking 30/40+ seconds.The problem is with C IN(....). The values in the brackets can be anywhere from 5 values to 2,000 values (Postcodes they are, where a user might select the nearest 5 mile radius or the nearest 500 mile radius).The field is a text box, and has a maxium of 4 characters in. It's indexed also. The search works fine with anything up to about 100 postcodes, but anything more and it really kills the SQL server.Please can you advise what the best option is for a query like this?Many thanks~Matt |
|
harsh_athalye
Master Smack Fu Yak Hacker
5581 Posts |
Posted - 2008-04-24 : 07:23:59
|
Values in (..) are hard-coded or are they result of some query?Harsh AthalyeIndia."The IMPOSSIBLE is often UNTRIED" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-24 : 07:46:50
|
Don't forget to ask "dynamic sql". E 12°55'05.25"N 56°04'39.16" |
 |
|
Gemeen_Aapje
Yak Posting Veteran
54 Posts |
Posted - 2008-04-24 : 08:03:47
|
Hey folks. What/who's Dynamic SQL?The values differ and are part of a query.The script first searches through the tblPostcodes, calculates in about half a second which postcodes are within the user's specified range from their specified postcode, then adds them to an array. The script next splits the array into the next query as per above where it finds all the users/profiles that have specified a postcode within the list found in the first query.Thanks for the very prompt reply. |
 |
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2008-04-24 : 14:22:28
|
Why put the results in an array? If you put them into a table (variable or temp) then you can just join to that table without having to use an IN clause. |
 |
|
cat_jesus
Aged Yak Warrior
547 Posts |
Posted - 2008-04-24 : 15:16:58
|
Also, have you checked the indexes on these tables? Are they appropriate and are statistics up to date?But I agree with peso and Lamprey, put the results into a temp table and then join on that. It's not only faster, but easier to code. |
 |
|
Gemeen_Aapje
Yak Posting Veteran
54 Posts |
Posted - 2008-04-24 : 15:28:13
|
actually it's not an array, i remember i changed it when i changes it to an IN clause. It's not just a VB variable string like " 'pc1','pc2','pc3' " which just gets inserted into the IN query.These searches are running like every second or two. I like to avoid writing in the database if i can help it, its a lot of work for the system aint it? (no thats a serious question, cos im not sure lol).If I'm going to write the results to a new table, wont that take a long time - only to search from that temp table, then delete it all again etc. ???Here's a rundown on how it works, a sort of overview...select pc from tblPostcodes where pc <> 12345 (users postcode)....for each postcode, calculate the distance from users postcode (done in VB).if postcode is < users specified distance thenadd to string variableUsing the results from above...Select a, b, c from tblTableNameWhere a = "x" AND b = "y" AND c IN (insert variable from above here)Thanks again |
 |
|
Gemeen_Aapje
Yak Posting Veteran
54 Posts |
Posted - 2008-04-24 : 15:31:51
|
sorry yea ive checked the indexes etc theyre all fine. I did a lot of work reorganising them all and learning about clustered/non-clustered etc just for this one issue. Cheers |
 |
|
cat_jesus
Aged Yak Warrior
547 Posts |
Posted - 2008-04-24 : 15:46:51
|
Consider that if you are using a query rather than a stored procedure you are recompiling the code every time. You don't get the benefit of caching either because you are putting literal values in the query thus changing it and requiring a recompile. If you use a temp table you may be able to at least avoid recompiles for multiple executions of the exact code.The bottom line though is if you are running a query every couple of seconds you really need to put it in a stored procedure if you can. |
 |
|
Gemeen_Aapje
Yak Posting Veteran
54 Posts |
Posted - 2008-04-24 : 17:30:07
|
Its the method of calculating the distance between 2 postcodes i can't figure out how to do in SP. It's really complex. Take a look...'POSTCODE DISTANCE CALCULATION'=============================const pi = 3.14159265358979323846'::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::'::: This function get the arccos function from arctan function :::'::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::Function acos(rad) If Abs(rad) <> 1 Then acos = pi/2 - Atn(rad / Sqr(1 - rad * rad)) ElseIf rad = -1 Then acos = pi End IfEnd function'::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::'::: This function converts decimal degrees to radians :::'::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::Function deg2rad(Deg) deg2rad = cdbl(Deg * pi / 180)End Function'::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::'::: This function converts radians to decimal degrees :::'::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::Function rad2deg(Rad) rad2deg = cdbl(Rad * 180 / pi)End FunctionFunction PCdistance(lat1, lon1, lat2, lon2, unit) Dim theta, dist theta = lon1 - lon2 dist = sin(deg2rad(lat1)) * sin(deg2rad(lat2)) + cos(deg2rad(lat1)) * cos(deg2rad(lat2)) * cos(deg2rad(theta)) dist = acos(dist) dist = rad2deg(dist) distance = dist * 60 * 1.1515 Select Case ucase(unit) Case "M" PCdistance = distance Case "K" PCdistance = distance * 1.609344 Case "N" PCdistance = distance * 0.8684 End SelectEnd FunctionIt's the sine, cosine, etc that get me. |
 |
|
cat_jesus
Aged Yak Warrior
547 Posts |
Posted - 2008-04-29 : 12:58:49
|
Why not create user defined functions for this piece of your problem? That way you don't have to wait for VB to come back with a list of postcodes to search.In other words your SP would have as input the user's postcode and the distance and then you return the list. There's no point in going back and forth if you can do it all in one stored procedure.Depending on the number of postcodes, you might even want to consider precalculating all of the distances and put the results in an optimised table and then do straight SQL queries against that.An infinite universe is the ultimate cartesian product. |
 |
|
cat_jesus
Aged Yak Warrior
547 Posts |
Posted - 2008-04-29 : 13:24:48
|
Also know that the functions you list above already exist in SQL server.They are in the order you list themACOSRADIANSDEGREESYou can see how they function in BOL. (there is also PI())And for the distance function you can use thishttp://www.zipcodeworld.com/samples/distance.mssql.htmlAn infinite universe is the ultimate cartesian product. |
 |
|
Gemeen_Aapje
Yak Posting Veteran
54 Posts |
Posted - 2008-05-09 : 08:54:25
|
thanks for your replies guys, ill look at making a SP.Sorry for the delay getting back to you, ive been offline for a couple of weeks.Thanks again, ill let you know how i get on |
 |
|
Gemeen_Aapje
Yak Posting Veteran
54 Posts |
Posted - 2008-05-09 : 15:26:21
|
Well i've started trying a few ideas tonight.I first tried the temp table option, creating, inserting, selecting, dropping etc... but it was quite slow still (Although faster). There are thousands of records to insert, and just the inserting of the records was taking 6+ seconds.What i have now done is replaced that with a derived table. ie inner join (select....) TempTable ON.... etc.My script is now 316% faster!!!!! And that's without converting it all to a stored procedure, all i've done is move the WHERE IN(....) past of the query to an inner join (select pc from tblPostcodes where pc IN(...)). The reason being i assume that the postcodes table is much, much smaller than the member profiles (Which is getting bigger and bigger every day, whereas the postcodes one isn't).I'm gonna keep working on it, and would just like to say thank-you all so much for helping me.Take care,~Matt |
 |
|
Gemeen_Aapje
Yak Posting Veteran
54 Posts |
Posted - 2008-05-09 : 15:30:02
|
oh, and because the postcodes table is smaller and doesn't change... the sql server can more easily cache the results of the inner select statement.... woo hoo |
 |
|
cat_jesus
Aged Yak Warrior
547 Posts |
Posted - 2008-05-09 : 16:39:08
|
I'm glad I could be of assistance. Sometimes all you need is a nudge in the right direction.316%. Nice.Be sure to show us the new and improved version.An infinite universe is the ultimate cartesian product. |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-05-09 : 17:35:02
|
There are better approaches.You can start to search for a "bounding box". This is very fast.For those records within the box, you then apply the "circle" distance for.Then most "box" records will be returned.Or apply the Haversine formula. E 12°55'05.25"N 56°04'39.16" |
 |
|
|