Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2008-12-29 : 08:47:13
|
Hi,I have a table full of geo data as seen below. I have a SP that I am trying to write that is going to bring back the X closest cities when I pass it a lat/long locatoin.I have found the code for calculating distance, but nothing that will bring me back a list of closest citiessomething like this :CREATE PROCEDURE [dbo].[select_closest_Cities] ( @topRecords int, @latitude float, @longitude float )ASSET NOCOUNT ON SELECT TOP (@topRecords) * FROM [GEODATASOURCE_CITIES_GOLD] WHEREGONote: Below the table structure I have also included a function that was used to calculate distance between 2 points. Perhaps its helpful in this situation , not sure.Any help greatly appreciated!thanks again,mike123CREATE TABLE [dbo].[GEODATASOURCE_CITIES_GOLD]( [REGION] [int] NULL, [SUB_REGION] [varchar](2) NULL, [UFI] [int] NULL, [UNI] [int] NULL, [DSG] [varchar](5) NULL, [CC_FIPS] [varchar](2) NULL, [CC_ISO] [varchar](2) NULL, [FULL_NAME] [varchar](200) NULL, [FULL_NAME_ND] [varchar](200) NULL, [SORT_NAME] [varchar](200) NULL, [ADM1_CODE] [varchar](2) NULL, [ADM1_FULL_NAME] [varchar](200) NULL, [ADM1_FULL_NAME_ND] [varchar](200) NULL, [ADM2_CODE] [varchar](3) NULL, [ADM2_FULL_NAME] [varchar](200) NULL, [ADM2_FULL_NAME_ND] [varchar](200) NULL, [LATITUDE] [float] NULL, [LONGITUDE] [float] NULL, [DMS_LATITUDE] [int] NULL, [DMS_LONGITUDE] [int] NULL, [UTM] [varchar](4) NULL, [JOG] [varchar](7) NULL) ON [PRIMARY]GOcreate function dbo.Distance( @lat1 float , @long1 float , @lat2 float , @long2 float) returns floatasbegindeclare @DegToRad as floatdeclare @Ans as floatdeclare @Miles as floatset @DegToRad = 57.29577951set @Ans = 0set @Miles = 0if @lat1 is null or @lat1 = 0 or @long1 is null or @long1 = 0 or @lat2 isnull or @lat2 = 0 or @long2 is null or @long2 = 0beginreturn ( @Miles )endset @Ans = SIN(@lat1 / @DegToRad) * SIN(@lat2 / @DegToRad) + COS(@lat1 / @DegToRad ) * COS( @lat2 / @DegToRad ) * COS(ABS(@long2 - @long1 )/@DegToRad)set @Miles = 3959 * ATAN(SQRT(1 - SQUARE(@Ans)) / @Ans)set @Miles = CEILING(@Miles)return ( @Miles )end |
|
|
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts |
Posted - 2008-12-29 : 09:33:42
|
Something like,SELECT TOP (@topRecords) dbo.Distance (@latitude,@longitude,LATITUDE,DMS_LONGITUDE),* FROM [GEODATASOURCE_CITIES_GOLD] order by 1 desc |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2008-12-29 : 09:42:11
|
| Hi Sakets_2000,Appreciate the help but what are the params(@latitude,@longitude,LATITUDE,DMS_LONGITUDE) ?Since I am not calculating the distance between 2 points, I only have 1 set of points to pass to the function. I am not sure this function will work as it is currently designed?Thanks,Mike123 |
 |
|
|
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts |
Posted - 2008-12-29 : 09:47:21
|
| @latitude,@longitude are your parameters. LATITUDE,DMS_LONGITUDE are the values it'll pick up from your base table and calculate distance between every geography and the passed co-ordinates. I have just given you the base code of the function, and not the entire body. |
 |
|
|
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts |
Posted - 2008-12-29 : 09:54:03
|
Your proc can be something like this:CREATE PROCEDURE [dbo].[select_closest_Cities]--exec select_closest_Cities 100,10,20(@topRecords int,@latitude float,@longitude float)ASSET NOCOUNT ONdeclare @sql varchar(4000)set @sql= 'SELECT TOP '+convert(varchar,@topRecords)+' dbo.Distance ('+convert(varchar,@latitude)+','+convert(varchar,@longitude)+',LATITUDE,DMS_LONGITUDE),* FROM [GEODATASOURCE_CITIES_GOLD] order by 1 desc'exec (@sql)GO |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2008-12-29 : 09:54:53
|
| Ah ok, I understand better now :) I am running the query now and it is taking quite some time. (Over 1 minute so far), also it is not bringing back desired results at all.The function has to be changed I assume ? I'm hoping to be able to execute a query such as this < 1 second. Not sure if this is possilbe or if there is a better approach to take.Thanks,Mike123 |
 |
|
|
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts |
Posted - 2008-12-29 : 10:19:24
|
Ah, I'm sorry. I had posted the function to list you the farthest cities first.CREATE PROCEDURE [dbo].[select_closest_Cities]--exec select_closest_Cities 100,10,20(@topRecords int,@latitude float,@longitude float)ASSET NOCOUNT ONdeclare @sql varchar(4000)set @sql= 'SELECT TOP '+convert(varchar,@topRecords)+' dbo.Distance ('+convert(varchar,@latitude)+','+convert(varchar,@longitude)+',LATITUDE,DMS_LONGITUDE),* FROM [GEODATASOURCE_CITIES_GOLD] order by 1 /*desc*/ ascexec (@sql)GO |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2008-12-29 : 10:39:37
|
| Hi Sakets,I'm getting pretty strange results it seems on running it with ASC and DESC. I can't make sense of either result, definately neither are the closest cities. Perhaps one of them could be the furthest point on earth.Any idea ?Thanks again, much appreciated! :)mike123 |
 |
|
|
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts |
Posted - 2008-12-29 : 10:58:29
|
Mike,Whats the difference between DMS_LATITUDE & LATITUDE, and DMS_LONGITUDE and LONGITUDE in your table ? I think, Our case is just that we are considering the wrong columns for our calculations.The formulae you are using looks good.Can you check for results after running this,alter PROCEDURE [dbo].[select_closest_Cities]--exec select_closest_Cities 100,10,20(@topRecords int,@latitude float,@longitude float)ASSET NOCOUNT ONdeclare @sql varchar(4000)set @sql= 'SELECT TOP '+convert(varchar,@topRecords)+' dbo.Distance ('+convert(varchar,@latitude)+','+convert(varchar,@longitude)+',LATITUDE,LONGITUDE),* FROM [GEODATASOURCE_CITIES_GOLD] order by 1 asc'exec (@sql)Also, How are you accounting for east-west measurement and north-souht measurement? |
 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2008-12-29 : 11:29:05
|
quote: Originally posted by sakets_2000 Mike,Whats the difference between DMS_LATITUDE & LATITUDE, and DMS_LONGITUDE and LONGITUDE in your table ? I think, Our case is just that we are considering the wrong columns for our calculations.The formulae you are using looks good.
Here are the represented values in the database. One appears to be formulated into an INT variable. Not sure exactly why to be honest.LATITUDE / LONGITUDE / DMS_LATITUDE / DMS_LONGITUDE12.516667 -69.95 123100 -69570012.616667 -70.066667 123700 -70040012.533333 -69.983333 123200 -69590012.483333 -69.966667 122900 -69580012.516667 -70.016667 123100 -70010012.533333 -70.983333 123200 -70590012.483333 -69.95 122900 -69570012.45 -69.916667 122700 -69550012.483333 -69.95 122900 -69570012.55 -70.033333 123300 -70020012.55 -69.966667 123300 -695800 |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2008-12-29 : 11:39:23
|
quote: Originally posted by Michael Valentine Jones You might want to look at the function on this link for the distance calculation:http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=81360CODO ERGO SUM
appreciated, and helpful alternative to have... Will look into it more... accuracy to this level is not necessarily critical but im glad I read this link...looking for a fast efficient way to return a list of closest cities.. if you have any input would be greatly appreciated too!Thanks!mike123 |
 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2008-12-29 : 12:13:36
|
| You might also want to look into this link for ideas on how to limit the search boundaries and allow you to use an index on the latitude and longitude columns for a more efficient search.Function F_FIND_SEARCH_LIMITShttp://www.sqlteam.com/forums/topic.asp?TOPIC_ID=112369CODO ERGO SUM |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2008-12-29 : 12:26:19
|
| Hey Michael,Again helpful link .. reading over it now.. This query of bringing back the "10 closest cities" is quite an expensive and advanced query it seems, am I correct ? (Alot more complicated than just calculating the distance anyways right ? )Thanks again,mike123 |
 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2008-12-29 : 17:03:37
|
| You might be able to pre-calculate the distance if you have a fairly small fixed set of from/to locations, but it gets out of hand fairly quickly as the number of combinations increases. For example, to calculate the distance from 80,000 5 digit US Zip codes to every other US Zip code would take 6.4 billion rows, and give you a table well over 100 GB in size. If you only have 2,000 combinations to worry about, you could have a table that is a much more reasonable 150 MB in size.You might get away with loading only the combinations that were within maybe 100 miles of each other, depending on the requirements of your application, but it could still be a very large table. If you only need to find the 10 or 20 closest store locations to every Zip code, a table holding those should be fairly reasonable in size, maybe 30 MB.The idea of defining the East/West and North/South limits of the query is the best plan for a more general case where there are large numbers of possible locations. If you can accurately define these limits, you should be able to get a query with reasonable performance on a table with indexes on latitude and longitude, and then use the great circle calculations to find the rows within the limits of the circle.CODO ERGO SUM |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2008-12-30 : 05:44:44
|
| Hi Michael,Wow yes 6.4 billion rows is definately out of control. I was originally planning on just using LAT / LONG co-ordinates and finding the closest 20 cities in any direction. Ideally I wouldn't want to limit by distance, but rather just cut off the query at 20 cities. I think this could cause a performance penalty? I wasn't intially planning on using zipCodes to locate the starting position, just GPS. Do I need zipcode? Should I use zipcode? I am trying to make this function for the whole world so thought GPS was a better starting point.When you say holding these values in a table, are you talking about creating a seperate table with the results to the query, and then querying that from the application?Appreciate all the help, this is substantially more difficult than I intially anticipated :)thanks again,mike123 |
 |
|
|
mike123
Master Smack Fu Yak Hacker
1462 Posts |
Posted - 2009-01-04 : 12:45:39
|
| Also, I want to query against this table quite often, and would like to see the best performance I can get. I plan to cache it the result sets in my application so that I hit this table as little as possible, but it brings up another question.What would be a good way to "precalculate" or somehow store the calculated values? Since this data is completely static, perhaps there is a good technique I can employ to populate another table with desired results.For example, I could create a table like below, and store the top 20 results for each city. This would make the query a heck of alot faster of course. Since its virtually static data this approach would be ok. (the data is updated monthly, i could drop, recreate, and repopulate the table 1x per month)Any guidance much appreciated! I'm having alot of difficulty with this query, and I think performance wise to get it working fast is even more of a challenge.Thanks once again!mike123cityName / relatedCity / CitiesAwayNEWYORK / JERSEY CITY / 1NEWYORK / NEWARK / 2NEWYORK / YONKERS / 3NEWYORK / ELIZABETH / 4NEW YORK / PATERSON / 5 |
 |
|
|
|
|
|
|
|