Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Ch #5 - Sum of Distances

Author  Topic 

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-21 : 18:55:11
create table points (x decimal, y decimal)
go

Find the point (not necessarily from the given arbitrary
set of points stored in the table) with the minimal sum of
distances from this sought point to each point from the set..

Just in case memo:
distance between two points (x, y) and (x0, y0) is
sqrt(square(x-x0)+square(y-y0))

MichaelP
Jedi Yak

2489 Posts

Posted - 2003-11-21 : 19:36:18
Hmmm....yikes....
The "(not necessarily from the given arbitrary
set of points stored in the table)" part throws me. How in the world are you going to do that? Maybe find the two closest, and find the point exactly half way betweem those two points?

<Yoda>Smells like Yahoo Maps this does. Meditate on this I must.</Yoda>

Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
Go to Top of Page

MichaelP
Jedi Yak

2489 Posts

Posted - 2003-11-21 : 19:47:07
Is this the answer? WIth my dataset the answer is 1,5 at 2.8 away.


create table #points (x decimal, y decimal)
INSERT INTO #points(x, y) VALUES(1, 5)
INSERT INTO #points(x, y) VALUES(1, 6)
INSERT INTO #points(x, y) VALUES(1, 7)
INSERT INTO #points(x, y) VALUES(2, 10)
INSERT INTO #points(x, y) VALUES(2, 20)
INSERT INTO #points(x, y) VALUES(2, 30)

INSERT INTO #points(x, y) VALUES(3, 3) -- Our Sought Point

DECLARE @DestX DECIMAL
DECLARE @DestY DECIMAL

SELECT @DestX = 3, @DestY = 3
--
-- SELECT p1.x, p1.y, p2.x, p2.y, sqrt(square(p1.x-p2.x)+square(p1.y - p2.y))
-- FROM #Points p1
-- CROSS JOIN #Points p2
-- WHERE p2.x = @DestX
-- AND p2.y = @DestY
-- AND sqrt(square(p1.x-p2.x)+square(p1.y - p2.y)) <> 0.0
-- ORDER BY p1.x, p1.y


SELECT MIN(sqrt(square(p1.x-p2.x)+square(p1.y - p2.y)))AS MinSum
FROM #Points p1
CROSS JOIN #Points p2
WHERE p2.x = @DestX
AND p2.y = @DestY
AND sqrt(square(p1.x-p2.x)+square(p1.y - p2.y)) <> 0.0
DROP TABLE #points

--The closest point to 3,3 is 1,5


Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-21 : 19:49:12
Your choice, Michael, your choice..

BTW try to google it.. it's almost useless..
Go to Top of Page

MichaelP
Jedi Yak

2489 Posts

Posted - 2003-11-21 : 19:50:54
I think I read that problem wrong, my solution finds the closest point, but it doesn't get the sum of the distances to all the points.

<Yoda>Wrong it is. Think on this some more I must</Yoda>

Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
Go to Top of Page

MichaelP
Jedi Yak

2489 Posts

Posted - 2003-11-21 : 19:58:00
I think this is the answer:


create table #points (x decimal, y decimal)
INSERT INTO #points(x, y) VALUES(1, 5)
INSERT INTO #points(x, y) VALUES(1, 6)
INSERT INTO #points(x, y) VALUES(1, 7)
INSERT INTO #points(x, y) VALUES(2, 10)
INSERT INTO #points(x, y) VALUES(2, 20)
INSERT INTO #points(x, y) VALUES(2, 30)

INSERT INTO #points(x, y) VALUES(3, 3) -- Our Sought Point

DECLARE @DestX DECIMAL
DECLARE @DestY DECIMAL

SELECT @DestX = 3, @DestY = 3


SELECT p1.x, p1.y, SUM(sqrt(square(p1.x-p2.x)+square(p1.y - p2.y))) AS SumDistance
FROM #Points p1
CROSS JOIN #Points p2
WHERE p2.x = @DestX
AND p2.y = @DestY
AND sqrt(square(p1.x-p2.x)+square(p1.y - p2.y)) <> 0.0

GROUP BY p1.x, p1.y

SELECT MIN(SumDistance) FROM
(
SELECT p1.x, p1.y, SUM(sqrt(square(p1.x-p2.x)+square(p1.y - p2.y))) AS SumDistance
FROM #Points p1
CROSS JOIN #Points p2
WHERE p2.x = @DestX
AND p2.y = @DestY
AND sqrt(square(p1.x-p2.x)+square(p1.y - p2.y)) <> 0.0
GROUP BY p1.x, p1.y

) AS A


DROP TABLE #points




Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
Go to Top of Page

MichaelP
Jedi Yak

2489 Posts

Posted - 2003-11-21 : 20:01:22
Do you "start" are your given point, then travel to all the points, and total that sum?

Got an example of inputs and the correct output? I think my last answer is wrong.

I think this is what you are looking for:


SELECT p1.x, p1.y, SUM(sqrt(square(p1.x-p2.x)+square(p1.y - p2.y))) AS SumDistance
FROM #Points p1
CROSS JOIN #Points p2
WHERE sqrt(square(p1.x-p2.x)+square(p1.y - p2.y)) <> 0.0
AND p1.x <> @DestX
AND p1.y <> @DestY
GROUP BY p1.x, p1.y



Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-22 : 06:51:38
1.
What is it >> SELECT @DestX = 3, @DestY = 3??
No any input initial values!

2.
Here the simplest set of only 4 given points:

INSERT INTO #points(x, y) VALUES (0, 0)
INSERT INTO #points(x, y) VALUES (0, 1)
INSERT INTO #points(x, y) VALUES (1, 0)
INSERT INTO #points(x, y) VALUES (1, 1)

Almost obviously the sought point is (0.5, 0.5)..
But how to find it for the set of 40,000 points??
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-11-22 : 19:11:44
My proposed solution did not work. It was close, but not the minimum sum of all points.

I am now thinking about a divide and conquer technique using median and mean? Any thoughts..

Can you provide 5 arbitrary points and a solution other than your initial coordinates.

Questions:

Are we dealing with points located in Q1 only (x>=0, y>=0) ??
Also, what maximum scale result you are looking for ?
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-23 : 09:04:22
quote:
Can you provide 5 arbitrary points and a solution other than your initial coordinates.

E.g., pentagon.. Obviously(?) the sought point is the pentagon's center
and it is not in the 5 points set..
quote:
Are we dealing with points located in Q1 only (x>=0, y>=0) ??

It does not matter. Just think of shifting of coordinate origin.
In short, you can assume that x>=0 and y>=0.. if it helps..
quote:
Also, what maximum scale result you are looking for ?

Any reasonable.. with possibility easily increase accuracy.
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-11-23 : 13:01:04
My assumption was that the desired coordinate would lie some where between then mean and median of the given coordinates.

DROP PROC up_DivideConquer
DROP PROC up_FindMinSumCoordinate
GO

CREATE PROC up_DivideConquer
--Used float to speed up algorithm
@Step int,
@UBound_x float,
@UBound_y float,
@LBound_x float,
@LBound_y float,
@UBoundValueType varchar(20),
@LBoundValueType varchar(20)
AS
SET NOCOUNT ON

declare @x float, @y float, @i int
create table #x (x float)
create table #y (y float)
create table #bounds (n int identity(0,1) primary key,x float,y float)
create table #bound_sum (x float,y float, bound_sum float)

--Build boundary coordinate values
set @i = 1
select @x = (@UBound_x + @LBound_x) / (@step+2)/(3.333 + (@i*0.058)) --(@i*0.056))
while @i <=40
begin
select @x = ( @x + @LBound_x ) / (( @i+@step)/@step )
insert into #x select @x
set @i = @i + 1
end

set @i = 1
select @y = (@UBound_y + @LBound_y) /(@step+2)/(3.333 + (@i*0.058)) --(@i*0.056))
while @i <=40
begin
select @y = ( @y + @LBound_y ) / (( @i+@step)/@step )
insert into #y select @y
set @i = @i + 1
end

insert into #bounds
select x,y
from #x, #y

--Calculate Boundary sums
declare @n int; set @n=0

while @n <= (select max(n) from #bounds)
begin
select @x = x,@y=y from #bounds where n = @n

insert into #bound_sum
select @x,@y, sum(sqrt(square(x-@x)+square(y-@y)))
from points

set @n = @n+1
end

insert into ##results
select top 1 'B_Result' + cast(@Step as varchar(20)),x,y,bound_sum
from #bound_sum
where bound_sum = (select min(bound_sum) from #bound_sum)

drop table #x; drop table #y; drop table #bounds; drop table #bound_sum
GO

--****************************************************************************
--****************************************************************************

CREATE Proc up_FindMinSumCoordinate
AS
SET NOCOUNT ON
create table ##results (n int identity(0,1),valuetype varchar(20),x float,y float, distancesum float)

--x coordinates
select identity(int,1,1) n, x
into #x
from points
order by x

--y coordinates
select identity(int,1,1) n, y
into #y
from points
order by y

declare @med_x0 float, @med_y0 float, @med_sum float,
@mean_x0 float, @mean_y0 float, @mean_sum float

--calculate median of n-coordinates
if (select count(*)%2 from points) = 0
begin
--even num of coordinates
select @med_x0 = sum(x)/2,
@med_y0 = (select sum(y)/2 from (select top 2 y from #y where n >= (select (count(n)+1)/2 from #y)) a )
from (select top 2 x from #x where n >= (select (count(n)+1)/2 from #x)) a
end
else
--odd num of coordinates
select @med_x0 = x ,
@med_y0 = (select y from #y where n = (select (count(n)+1)/2 from #y))
from #x
where n = (select (count(n)+1)/2 from #x)

--calculate mean of n-coordinates
select @mean_x0 = sum(x)/count(x) ,
@mean_y0 = sum(y)/count(y) from points

--Select Min Boundary Sum and coordinates
insert into ##results select 'Median', @med_x0, @med_y0, sum(sqrt(square(x-@med_x0)+square(y-@med_y0))) from points
insert into ##results select 'Mean', @mean_x0,@mean_y0, sum(sqrt(square(x-@mean_x0)+square(y-@mean_y0))) from points

--Divide and Conquer
declare @i int; set @i = 0
declare @UBound_x float,
@UBound_y float,
@LBound_x float,
@LBound_y float,
@UBoundValueType varchar(20),
@LBoundValueType varchar(20)

begin
while @i <= 10--num or iterations through divide/conquer
BEGIN
set @i = @i + 1
select @UBound_x = x from (select top 1 x from ##RESULTS order by distancesum ) a
select @LBound_x = x from (select top 2 x from ##RESULTS order by distancesum ) a
select @UBound_y = y from (select top 1 y from ##RESULTS order by distancesum) a
select @LBound_y = y from (select top 2 y from ##RESULTS order by distancesum) a
select @UBoundValueType = valuetype from (select top 1 valuetype from ##results order by distancesum) a
select @LBoundValueType = valuetype from (select top 2 valuetype from ##results order by distancesum) a

exec up_DivideConquer @i,@UBound_x,@UBound_y,@LBound_x,@LBound_y,@UBoundValueType,@LBoundValueType

end
end

select top 1 x,y,DistanceSum
from ##results
order by distancesum

drop table #x; drop table #y; drop table ##results
GO

--****************************************************************************
--****************************************************************************

--EXEC up_FindMinSumCoordinate
GO
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-23 : 21:59:13
Consider the following two triangles:

(0, 0)(2, 0)(1, 1) and (0, 0)(2, 0)(1, 1000)

For the 2nd triangle your result point is (1, 333.3.....)

I.e., your result point starts to move after the 3rd vertex..

Meanwhile, it should stay unmoved(!!)..
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-11-24 : 07:53:06
Now you've utterly confused me!!

Consider the following:


create table points (x decimal, y decimal)

insert into points select 0,0
insert into points select 2,0
insert into points select 1,1000

select 1 x,333.333 y,sum(sqrt(square(x-1)+square(y-333.3333333))) distancesum from points
select 1 x,0 y,sum(sqrt(square(x-1)+square(y-0))) distancesum from points

drop table points


Are you saying that the point 1,333.3333... would be the solution
to the preceeding coordinates??

Because the point 1,0 produces a smaller sum to all points.
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-24 : 09:09:14
quote:
Because the point 1,0 produces a smaller sum to all points.

No!! I didn't say that :)
In the 1st triangle the sought point is (1, 0.33333...)..
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-11-24 : 09:41:23
quote:
Originally posted by Stoad

quote:
Because the point 1,0 produces a smaller sum to all points.

No!! I didn't say that :)
In the 1st triangle the sought point is (1, 0.33333...)..


Ok, I'll concede the typo as I have been know to produce a typo or two in my day. But, Is it 0.33333...?

select 1 x,0.333 y,sum(sqrt(square(x-1)+square(y-0.3333333))) distancesum from points
select 1 x,0.651 y,sum(sqrt(square(x-1)+square(y-0.651))) distancesum from points

EDIT: Better yet (1,0.5774...)

Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-24 : 11:35:04
wow..
Jay.. I just was in a hurry and took this number - 0.3333..
- from my memory.. But this is not the point. The point is:
what results your code produces for the following two different
sets of points:

1. (0, 0)(2, 0)(1, 1)

and

2. (0, 0)(2, 0)(1, 1000)

Just post them here..
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-11-24 : 11:41:55
1 , 0.5773500302265816
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-24 : 12:05:45
F$$$ me..

Jay, you are just a genius!!.. Quite seriously..
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-24 : 13:49:14
Just one thing.. Why on my machine (7.0) I get this result
for those two cases? Wrong table points?

x y DistanceSum
----------- -------------------- -------------------
1.0 0.33333333333333331 2.7748517734455866


x y DistanceSum
-------------------- -------------------- -------------------
1.0053916655068034 0.58327290317504832 1001.7320799216654

create table points(x float, y float)
go
insert into points
select 0, 0 union all
select 2, 0 union all
select 1, 1000
go
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-11-24 : 17:20:04
When you have outer lying points (ie. (1,1000) ) which are far away from the median the number may be innaccurate based on the resolution setting (See below). So while the first answer (0, 0)(2, 0)(1, 1) was correct as the points were closely grouped, the second group (0, 0)(2, 0)(1, 1000) was sweked because (1,1000) falls far outside the median.

In the Divide and Conquer iteration of the up_FindMinSumCoordinate there is a value @i which will set the number of iterations to perform and determine the level of resolution in the solution. If you increase @i you will generate more accuracy because of the direct coorelation between the number of iterations performed and the resolution of the result. But the cost is speed.

You can also adjust the @i iterations in the up_DivideConquer proc which may create a larger sample of boundary values per iteration.

Take a look at the ##results table to see how it is moving towards the min sum coordinates.

Anyways, as mentioned above, I have not fully tested this through many data-sets and I am sure there are more elegant / correct / optimized solutions than this. If fact, I was hoping more solutions would be posted as I thought this was a very interesting challenge.

ps. can you call my wife, work, friends, collegues, etc... and tell them I am a genius.

Jay
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-25 : 02:39:45
AAAAAHHHHHHHHH.... Jay.... iteration.... of course.... fully shaped OK....
quote:
ps. can you call my wife, work, friends, collegues, etc...... and tell them I am a genius.

Hm.. the calls from my political black hole will be quite expensive for me..
maybe just spam them??.. :)
Go to Top of Page
    Next Page

- Advertisement -