SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 General SQL Server Forums
 Script Library
 Curve Fitting (Trend Analysis and Prediction)
 New Topic  Reply to Topic
 Printer Friendly
Next Page
Author Previous Topic Topic Next Topic
Page: of 2

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 01/09/2007 :  08:01:16  Show Profile  Visit SwePeso's Homepage  Reply with Quote
For a set of data points (x, y), this algorithm can be used to fit the data to any of the following curves:

1. Straight line (linear regresion); y = A + b*x
2. Exponential curve; y = A*EXP(b*x); nb a > 0
3. Logarithmic curve; y = A + b*LN(x)
4. Power curve; y = A*x^b; nb a > 0

The coefficient of determination is R2 (how well does the curve fit)
-- Prepare test data
CREATE TABLE	cf
		(
			x decimal(38, 10),
			y decimal(38, 10)
		) 

-- Calculate Linear regression
INSERT	cf
SELECT	40.5, 104.5 UNION ALL
SELECT	38.6, 102 UNION ALL
SELECT	37.9, 100 UNION ALL
SELECT	36.2, 97.5 UNION ALL
SELECT	35.1, 95.5 UNION ALL
SELECT	34.6, 94

SELECT	'Linear regression' AS Type, A, b, R2
FROM	dbo.fnCurveFitting(1)
UNION ALL
SELECT	'Bestfit = ' + CAST(Type AS VARCHAR), A, b, R2
FROM	dbo.fnBestFit()

-- Calculate Exponential regression
DELETE
FROM	cf

INSERT	cf
SELECT	.72, 2.16 UNION ALL
SELECT	1.31, 1.61 UNION ALL
SELECT	1.95, 1.16 UNION ALL
SELECT	2.58, .85 UNION ALL
SELECT	3.14, .5

SELECT	'Exponential regression' AS Type, A, b, R2
FROM	dbo.fnCurveFitting(1)
UNION ALL
SELECT	'Bestfit = ' + CAST(Type AS VARCHAR), A, b, R2
FROM	dbo.fnBestFit()

-- Calculate Logarithmic regression
DELETE
FROM	cf

INSERT	cf
SELECT	3, 1.5 UNION ALL
SELECT	4, 9.3 UNION ALL
SELECT	6, 23.4 UNION ALL
SELECT	10, 45.8 UNION ALL
SELECT	12, 60.1

SELECT	'Logarithmic regression' AS Type, A, b, R2
FROM	dbo.fnCurveFitting(1)
UNION ALL
SELECT	'Bestfit = ' + CAST(Type AS VARCHAR), A, b, R2
FROM	dbo.fnBestFit()

-- Calculate Power regression
DELETE
FROM	cf

INSERT	cf
SELECT	10, .95 UNION ALL
SELECT	12, 1.05 UNION ALL
SELECT	15, 1.25 UNION ALL
SELECT	17, 1.41 UNION ALL
SELECT	20, 1.73 UNION ALL
SELECT	22, 2 UNION ALL
SELECT	25, 2.53 UNION ALL
SELECT	27, 2.98 UNION ALL
SELECT	30, 3.85 UNION ALL
SELECT	32, 4.59 UNION ALL
SELECT	35, 6.02

SELECT	'Power regression' AS Type, A, b, R2
FROM	dbo.fnCurveFitting(1)
UNION ALL
SELECT	'Bestfit = ' + CAST(Type AS VARCHAR), A, b, R2
FROM	dbo.fnBestFit()

DROP TABLE cf

Peter Larsson
Helsingborg, Sweden

Edited by - SwePeso on 01/10/2007 05:00:31

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 01/09/2007 :  08:02:03  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Here are the functions.
When using a type that is not valid, the function defaults to linear regression.
CREATE FUNCTION dbo.fnCurveFitting
(
	@Type TINYINT
)
RETURNS @p TABLE (A DECIMAL(38, 10), b DECIMAL(38, 10), R2 DECIMAL(38, 10))
AS
/*
Type = 1	Linear		y = a + b*x
Type = 2	Exponential	y = a*e^(b*x)   nb a > 0
Type = 3	Logarithmic	y = a + b*ln(x)
Type = 4	Power		y = a*x^b	nb a > 0
*/
BEGIN
	DECLARE	@n DECIMAL(38, 10),
		@x DECIMAL(38, 10),
		@x2 DECIMAL(38, 10),
		@y DECIMAL(38, 10),
		@xy DECIMAL(38, 10),
		@y2 DECIMAL(38, 10),
		@d DECIMAL(38, 10),
		@a DECIMAL(38, 10),
		@b DECIMAL(38, 10),
		@r2 DECIMAL(38, 10)

	SELECT	@n =	COUNT(*),
		@x =	CASE
				WHEN @Type = 2 THEN SUM(x)
				WHEN @Type = 3 THEN SUM(LOG(x))
				WHEN @Type = 4 THEN SUM(LOG(x))
				ELSE SUM(x)
			END,
		@x2 =	CASE
				WHEN @Type = 2 THEN SUM(x * x)
				WHEN @Type = 3 THEN SUM(LOG(x) * LOG(x))
				WHEN @Type = 4 THEN SUM(LOG(x) * LOG(x))
				ELSE SUM(x * x)
			END,
		@y =	CASE
				WHEN @Type = 2 THEN SUM(LOG(y))
				WHEN @Type = 3 THEN SUM(y)
				WHEN @Type = 4 THEN SUM(LOG(y))
				ELSE SUM(y)
			END,
		@xy =	CASE
				WHEN @Type = 2 THEN SUM(x * LOG(y))
				WHEN @Type = 3 THEN SUM(LOG(x) * y)
				WHEN @Type = 4 THEN SUM(LOG(x) * LOG(y))
				ELSE SUM(x * y)
			END,
		@y2 =	CASE
				WHEN @Type = 2 THEN SUM(LOG(y) * LOG(y))
				WHEN @Type = 3 THEN SUM(y * y)
				WHEN @Type = 4 THEN SUM(LOG(y) * LOG(y))
				ELSE SUM(y * y)
			END,
		@d =	@n * @x2 - @x * @x		
	FROM	cf

	IF @d = 0
		RETURN

	SELECT	@a = (@x2 * @y - @x * @xy) / @d,
		@b = (@n * @xy - @x * @y) / @d,
		@r2 = (@a * @y + @b * @xy - @y * @y / @n) / (@y2 - @y * @y / @n)

	INSERT	@p
	SELECT	CASE
			WHEN @Type = 2 THEN EXP(@a)
			WHEN @Type = 3 THEN @a
			WHEN @Type = 4 THEN EXP(@a)
			ELSE @a
		END,
		@b,
		@r2

	RETURN
END

CREATE FUNCTION dbo.fnBestFit
(
)
RETURNS @p TABLE (Type TINYINT, A DECIMAL(38, 10), b DECIMAL(38, 10), R2 DECIMAL(38, 10))
AS

BEGIN
	INSERT	@p
	SELECT	1,
		A,
		b,
		R2
	FROM	dbo.fnCurveFitting(1)

	INSERT	@p
	SELECT	2,
		A,
		b,
		R2
	FROM	dbo.fnCurveFitting(2)

	INSERT	@p
	SELECT	3,
		A,
		b,
		R2
	FROM	dbo.fnCurveFitting(3)

	INSERT	@p
	SELECT	4,
		A,
		b,
		R2
	FROM	dbo.fnCurveFitting(4)

	DELETE
	FROM	@p
	WHERE	R2 <> (SELECT MAX(R2) FROM @p)

	RETURN
END


Peter Larsson
Helsingborg, Sweden

Edited by - SwePeso on 01/09/2007 08:44:09
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 01/09/2007 :  08:02:51  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Harsh, it seems that doing this SQL wise would be faster than doing it in front-end, since SQL Server does this set-based.


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

harsh_athalye
Flowing Fount of Yak Knowledge

India
5581 Posts

Posted - 01/09/2007 :  08:08:07  Show Profile  Visit harsh_athalye's Homepage  Click to see harsh_athalye's MSN Messenger address  Send harsh_athalye a Yahoo! Message  Reply with Quote
Is it? Oh wow! That's a wonderful observation then.

I think you are right, if you know the correct way to implement the solution, the language hardly matters.

Harsh Athalye
India.
"The IMPOSSIBLE is often UNTRIED"
Go to Top of Page

spirit1
Cybernetic Yak Master

Slovenia
11751 Posts

Posted - 01/09/2007 :  08:33:13  Show Profile  Visit spirit1's Homepage  Reply with Quote
running those log running queries again, are we peter?
Boredom on the horizon? Too much free time?






Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 01/09/2007 :  08:42:45  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Free time? When writing algorithms here? No no...


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

daydreamer82
Starting Member

3 Posts

Posted - 04/11/2007 :  16:46:38  Show Profile  Reply with Quote
hello,

This algorithm is providential for me. Im a student experimenting path discovering on Internet. I succeeded to put some traceroute data into a mysql database. And now, i tried to find a way to find shortests paths...

Thanks a lot ;)



PS. If you have the mysql version ... ;)

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 04/11/2007 :  16:56:51  Show Profile  Visit SwePeso's Homepage  Reply with Quote
I think you mean this topic
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=77262


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

daydreamer82
Starting Member

3 Posts

Posted - 04/11/2007 :  17:00:36  Show Profile  Reply with Quote
Ho yes, you are right. I lost myself during the registration ;)

Soo sorry. Admin? hem...
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

USA
7020 Posts

Posted - 04/11/2007 :  17:05:39  Show Profile  Reply with Quote
Peter,

Why stop half way?

Shouldn't your script insert the data into an Excel spreadsheet, and create a graph with the proper trend line too?




CODO ERGO SUM
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 04/11/2007 :  17:07:53  Show Profile  Visit SwePeso's Homepage  Reply with Quote
I love you too, Michael


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

USA
7020 Posts

Posted - 04/11/2007 :  17:30:10  Show Profile  Reply with Quote
Sometimes, I just have to let out my inner twit.



CODO ERGO SUM
Go to Top of Page

blindman
Flowing Fount of Yak Knowledge

USA
2365 Posts

Posted - 04/11/2007 :  19:05:52  Show Profile  Reply with Quote
To keep the outer one company?
e4 d5 xd5 Nf6

Edited by - blindman on 04/11/2007 19:07:10
Go to Top of Page

HumanJHawkins
Starting Member

USA
4 Posts

Posted - 11/02/2010 :  17:41:49  Show Profile  Reply with Quote
This is wonderful. It looks like it will be greatly helpful toward some trend calculation that I hope to do. I think I can use this as is, but I would like to learn from and understand it better. So, apologies in advance for the noob questions that follow...

Can anyone point to places in the code if/where intermediate data (such as slope, intercept, or "c") are calculated?

I think it's clear that r2 is R squared. But I've never seen "a" as a variable name in this sort of equation. So that makes me wonder if "b" is just a variable that was needed, or if it is the "b" that is commonly used in calculating regressions.

Thanks for the code, and thanks in advance for any further explanation.

HumanJHawkins
Go to Top of Page

HumanJHawkins
Starting Member

USA
4 Posts

Posted - 11/02/2010 :  19:03:07  Show Profile  Reply with Quote
One more follow-up... I'm using MS SQL Server 2005. In my dataset, my y-column includes negative numbers. I am getting the error:
"A domain error occurred."

I believe this is due to the use of "LOG(y)", where LOG() is expecting a positive number. Is there a common strategy for dealing with this issue?

Again, thanks in advance,
HumanJHawkins
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 11/03/2010 :  18:40:40  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Are you using the BestFit function? Well... That hasn't been optimized for this situations.
However I have a SQLCLR that has! See http://www.developerworkshop.net/software.html
There are scripts and samples too.


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Humate
Posting Yak Master

101 Posts

Posted - 11/09/2010 :  14:35:00  Show Profile  Reply with Quote
EXEC sp_configure 'clr enabled', 1;

RECONFIGURE WITH OVERRIDE;

With override option detailed here.
http://msdn.microsoft.com/en-us/library/ms176069.aspx
Go to Top of Page

HumanJHawkins
Starting Member

USA
4 Posts

Posted - 11/10/2010 :  12:53:57  Show Profile  Reply with Quote
I guess I should get more sleep... I appear to add confusion with every visit. Anyway, here is a thread that describes how to get around issues one may encounter with enabling CLR:

http://social.msdn.microsoft.com/Forums/en/sqlnetfx/thread/50c11a95-046e-472e-b788-d12c091da1f5

Thanks for all of your help.

Edited by - HumanJHawkins on 11/10/2010 18:21:30
Go to Top of Page

gingerninja
Starting Member

United Kingdom
2 Posts

Posted - 12/08/2010 :  06:10:24  Show Profile  Reply with Quote
Hi,

Bit of a long shot this one, but I have stumbled on this forum post. I have a requirement to replicate some of the Excel Solver functionality within SQL. Ideally I wanted a purely TSQL function, but this looks unlikely now, so I'm also looking at CLR options.

What I need to do is analyse some financial fund data (36 price values) against somewhere between 2-5 benchmarks (also 36 price values each) and determine the sensitivities (weights) of the benchmarks that best follow the fund. In other words find a set of benchmark weights which minimizes the tracking error between the resulting benchmark and the fund. It's explained perfectly here: - http://www.andreassteiner.net/performanceanalysis/?External_Performance_Analysis:Style_Analysis

This is done with a few clicks in Excel, which is why it's so frustrating that I can't find a SQL Server contained solution. However, I do appreciate that it's a quadratic problem, so may not be so easily portable to SQL Server. I've have looked at the Frontline Solver (http://www.solver.com) and building a C# Dll, but I'd rather avoid that if possible.

Anyone got any thoughts or experience that might assist?

Many thanks,
Stephen
Go to Top of Page

AvinashPatwari
Starting Member

1 Posts

Posted - 10/12/2011 :  01:51:31  Show Profile  Reply with Quote
Swepeso,
I am so grateful for your post. IT saved my life. In my college project , I have to use similar kind of functionality for forecasting of events. But again, I will be more helpful if I get any more insight on Polynomial regression .. I am not able to find fool-proof and suitable method like yours. Your expertise is needed.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30265 Posts

Posted - 12/08/2013 :  09:25:55  Show Profile  Visit SwePeso's Homepage  Reply with Quote
No, it's the final EXP.

e^820.97 can not be represented in SQL Server.



Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA
Go to Top of Page
Page: of 2 Previous Topic Topic Next Topic  
Next Page
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.5 seconds. Powered By: Snitz Forums 2000