
 A simple formula


ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2011-05-05 : 15:03:26
Is it possible to find a simple math formula for the following scenario?
I want to check whether 5 sequential values are strictly increasing.
For example

row_num     v1          v2          v3          v4          v5
----------- ----------- ----------- ----------- ----------- -----------
1           50          55          60          90          150
2           50          60          49          70          89

Row number 1 is increasing, so I want to select it, because:
50 < 55 < 60 < 90 < 150

A simple method is:

SELECT *
FROM TableName
WHERE v1 < v2 AND v2 < v3 AND v3 < v4 AND v4 < v5;

Or another method:

SELECT *
FROM TableName
WHERE (SIGN(v2 - v1) - 1) + (SIGN(v3 - v2) - 1) + (SIGN(v4 - v3) - 1) + (SIGN(v5 - v4) - 1) = 0;
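
An equivalent way to write the same SIGN idea (just a sketch against the same TableName columns) is to require all four signs to be +1; since each SIGN term is at most 1, the sum can only reach 4 when every consecutive pair is strictly increasing:

SELECT *
FROM TableName
WHERE SIGN(v2 - v1) + SIGN(v3 - v2) + SIGN(v4 - v3) + SIGN(v5 - v4) = 4;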

Is there a simpler and shorter formula for this?
Thanks


______________________

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-05 : 15:05:29
I'm pretty sure your first solution is the shortest. It's certainly the most logical.

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2011-05-06 : 07:57:14
Thanks,
Now how do I solve a new problem?
The values are stored vertically instead of horizontally.

Please see:

row_num     nbr         v
----------- ----------- -----------
1           1           50
1           2           55
1           3           60
1           4           90
1           5           150

Now how can I compare the values to check that they are strictly increasing?
Should I pivot the table again and use my first solution?!


______________________

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-06 : 08:26:35
Try this:
DECLARE @Sample TABLE
(
    row_num INT,
    nbr INT,
    v INT
)

INSERT @Sample
VALUES  (1, 1, 50),
        (1, 2, 55),
        (1, 3, 60),
        (1, 4, 90),
        (1, 5, 150)

;WITH cte
AS (
    SELECT  row_num,
            ROW_NUMBER() OVER (PARTITION BY row_num ORDER BY v) - nbr AS Yak
    FROM    @Sample
)
SELECT  row_num
FROM    cte
GROUP BY row_num
HAVING  MAX(Yak) = 0



N 56°04'39.26"
E 12°55'05.63"

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-06 : 09:42:51
Here's a pivoted version based on Peter's setup:
SELECT row_num, v1, v2, v3, v4, v5
FROM (SELECT row_num, 'v'+CAST(nbr AS VARCHAR) n, v FROM @Sample) a
PIVOT(MAX(v) FOR n IN (v1,v2,v3,v4,v5)) b
WHERE v1 < v2 AND v2 < v3 AND v3 < v4 AND v4 < v5

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2011-05-06 : 13:17:25
Values can be equal, so I need to use RANK() instead of ROW_NUMBER().
Also, I think MIN(Yak) = 0 AND MAX(Yak) = 0 is necessary in the HAVING clause.

That is:

;WITH cte
AS (
    SELECT  row_num,
            RANK() OVER (PARTITION BY row_num ORDER BY v) - nbr AS Yak
    FROM    @Sample
)
SELECT  row_num
FROM    cte
GROUP BY row_num
HAVING  MAX(Yak) = 0 AND MIN(Yak) = 0



______________________

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-06 : 14:34:05
No, I don't think you need both MAX and MIN.
Look at the values: if all values are in increasing order, the subtraction is 0 for every row.
BUT! If at least one value is out of order, you get both a positive value and a matching negative value in the series, so MAX(Yak) alone is enough to detect it.
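
A quick way to see this (a sketch that reuses the @Sample table variable from above, with the out-of-order row 2 from the first post added) is to list the Yak values directly:

DECLARE @Sample TABLE (row_num INT, nbr INT, v INT)

INSERT @Sample
VALUES  (1, 1, 50), (1, 2, 55), (1, 3, 60), (1, 4, 90), (1, 5, 150),
        (2, 1, 50), (2, 2, 60), (2, 3, 49), (2, 4, 70), (2, 5, 89)

-- Row 1 produces Yak = 0, 0, 0, 0, 0, so MAX(Yak) = 0.
-- Row 2 produces Yak = 1, 1, -2, 0, 0: the positive values are balanced by a
-- negative one, and MAX(Yak) > 0 flags the row as not strictly increasing.
SELECT  row_num, nbr, v,
        ROW_NUMBER() OVER (PARTITION BY row_num ORDER BY v) - nbr AS Yak
FROM    @Sample
ORDER BY row_num, nbr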



N 56°04'39.26"
E 12°55'05.63"

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-06 : 14:55:44
And change RANK to ROW_NUMBER(), otherwise the duplicate value will return 1 in the subtraction.
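
For what it's worth, here is a small sketch (a hypothetical row 3 that contains the value 55 twice) showing how the two ranking functions differ on a tie:

DECLARE @Dup TABLE (row_num INT, nbr INT, v INT)

INSERT @Dup
VALUES (3, 1, 50), (3, 2, 55), (3, 3, 55), (3, 4, 90), (3, 5, 150)

SELECT  row_num, nbr, v,
        RANK()       OVER (PARTITION BY row_num ORDER BY v) - nbr AS YakRank,
        ROW_NUMBER() OVER (PARTITION BY row_num ORDER BY v) - nbr AS YakRowNumber
FROM    @Dup
ORDER BY nbr
-- RANK gives both 55s rank 2, so YakRank is 0, 0, -1, 0, 0 (MAX = 0, MIN = -1).
-- ROW_NUMBER breaks the tie arbitrarily, so YakRowNumber is either all zeroes
-- or contains a +1/-1 pair, depending on which 55 is numbered first.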



N 56°04'39.26"
E 12°55'05.63"

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-07 : 13:33:58
I am not sure about performance, but the following method looks simpler to me:

select  a.row_num,
        a.v as v1,
        b.v as v2,
        c.v as v3,
        d.v as v4,
        e.v as v5
from    tableName a
join    tableName b on b.row_num = a.row_num and a.v < b.v
join    tableName c on c.row_num = b.row_num and b.v < c.v
join    tableName d on d.row_num = c.row_num and c.v < d.v
join    tableName e on e.row_num = d.row_num and d.v < e.v


Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2011-05-07 : 15:53:45
Good!
But the query needs more conditions:
quote:
Originally posted by mmarovic

I am not sure about performance, but the following method looks simpler to me:

select  a.row_num,
        a.v as v1,
        b.v as v2,
        c.v as v3,
        d.v as v4,
        e.v as v5
from    tableName a
join    tableName b on b.row_num = a.row_num and a.v < b.v and a.nbr = 1 and b.nbr = 2
join    tableName c on c.row_num = b.row_num and b.v < c.v and c.nbr = 3
join    tableName d on d.row_num = c.row_num and c.v < d.v and d.nbr = 4
join    tableName e on e.row_num = d.row_num and d.v < e.v and e.nbr = 5


Mirko

My blog: http://mirko-marovic-eng.blogspot.com/



______________________

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-08 : 02:40:38
Yes, that's correct; I forgot about the sequencing conditions. With the sequencing and the right indexing (a covering index starting with nbr), this method should be the most efficient one.
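
A sketch of what such an index might look like (the index name is hypothetical; the columns are the row_num/nbr/v layout from the earlier posts):

-- nbr leads so each nbr = <constant> filter can seek,
-- and INCLUDE (v) makes the index cover the query.
CREATE NONCLUSTERED INDEX IX_TableName_nbr_rownum
    ON TableName (nbr, row_num)
    INCLUDE (v)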

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-08 : 04:40:19
[code]CREATE TABLE #Sample
(
    RowNum INT NOT NULL,
    ColNum TINYINT NOT NULL,
    Value INT NOT NULL
)

CREATE UNIQUE CLUSTERED INDEX UCX_Sample ON #Sample (RowNum, ColNum)

INSERT #Sample
(
    RowNum,
    ColNum,
    Value
)
SELECT 1000 * v1.Number + v2.Number AS RowNum,
v3.Number AS ColNum,
ABS(CHECKSUM(NEWID())) AS Value
FROM master..spt_values AS v1
INNER JOIN master..spt_values AS v2 ON v2.Type = 'P'
AND v2.Number BETWEEN 0 AND 999
INNER JOIN master..spt_values AS v3 ON v3.Type = 'P'
AND v3.Number BETWEEN 1 AND 6
WHERE v1.type = 'P'
AND v1.number BETWEEN 0 AND 999

SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

-- Peso 1
SELECT  RowNum
FROM    (
            SELECT  RowNum,
                    ROW_NUMBER() OVER (PARTITION BY RowNum ORDER BY Value) - ColNum AS Yak
            FROM    #Sample
        ) AS d
GROUP BY RowNum
HAVING  MAX(Yak) = 0
GO 2
/*
(1365 row(s) affected)
Table 'Worktable'. Scan count 1, logical reads 13389, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 19110 ms, elapsed time = 16381 ms.
*/

-- mmarovic
SELECT  a.RowNum
FROM    #Sample AS a
INNER JOIN #Sample AS b ON b.RowNum = a.RowNum
    AND b.ColNum = 2
    AND b.Value > a.Value
INNER JOIN #Sample AS c ON c.RowNum = b.RowNum
    AND c.ColNum = 3
    AND c.Value > b.Value
INNER JOIN #Sample AS d ON d.RowNum = c.RowNum
    AND d.ColNum = 4
    AND d.Value > c.Value
INNER JOIN #Sample AS e ON e.RowNum = d.RowNum
    AND e.ColNum = 5
    AND e.Value > d.Value
INNER JOIN #Sample AS f ON f.RowNum = e.RowNum
    AND f.ColNum = 6
    AND f.Value > e.Value
WHERE a.ColNum = 1
GO 2
/*
(1365 row(s) affected)
Table 'Worktable'. Scan count 6, logical reads 80329, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 4867 ms, elapsed time = 4992 ms.
*/

SET STATISTICS TIME OFF
SET STATISTICS IO OFF

DROP TABLE #Sample[/code]

N 56°04'39.26"
E 12°55'05.63"

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-08 : 05:40:28
Now try the index I mentioned, the one that starts with ColNum.
CREATE UNIQUE CLUSTERED INDEX UCX_Sample ON #Sample (ColNum, RowNum)


Also, you have one more value and one more join in the query you posted as mine.

quote:
SELECT  a.RowNum
FROM    #Sample AS a
INNER JOIN #Sample AS b ON b.RowNum = a.RowNum
    AND b.ColNum = 2
    AND b.Value > a.Value
INNER JOIN #Sample AS c ON c.RowNum = b.RowNum
    AND c.ColNum = 3
    AND c.Value > b.Value
INNER JOIN #Sample AS d ON d.RowNum = c.RowNum
    AND d.ColNum = 4
    AND d.Value > c.Value
INNER JOIN #Sample AS e ON e.RowNum = d.RowNum
    AND e.ColNum = 5
    AND e.Value > d.Value
INNER JOIN #Sample AS f ON f.RowNum = e.RowNum
    AND f.ColNum = 6
    AND f.Value > e.Value
WHERE a.ColNum = 1
Anyway, I like the query you and ms65g produced. It is short and efficient enough. The only drawback is that it is not quite obvious from the code what you are trying to achieve.

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-08 : 06:39:50
I forgot to mention: I did not expect my query to run significantly faster even with the index you used. However, with an index starting with ColNum, I expect the logical reads to drop.

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-10 : 07:37:55
quote:
Originally posted by mmarovic

.....
However, with index starting with colNum, I expect logical reads to drop.
....


How do you just LOOK at some code and say "this would have fewer logical reads" or "that would be more efficient"!?! I admit that I don't know even the basics of query optimization - can you point me to some books/articles that would help me get a working knowledge?

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-10 : 08:05:44
Read the blog at http://blogs.msdn.com/b/conor_cunningham_msft/ and his posts about how indexes work.



N 56°04'39.26"
E 12°55'05.63"

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-10 : 08:27:32
Thanks, Peso!

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-10 : 09:51:17
There is also SqlInTheWild, http://sqlinthewild.co.za/


N 56°04'39.26"
E 12°55'05.63"

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-10 : 15:02:52
quote:
Originally posted by sunitabeck

quote:
Originally posted by mmarovic

.....
However, with index starting with colNum, I expect logical reads to drop.
....


How do you just LOOK at some code and say "this would have fewer logical reads" or "that would be more efficient"!?! I admit that I don't know even the basics of query optimization - can you point me to some books/articles that would help me get a working knowledge?


In this case it is not that complex. Whichever data set the query optimizer chooses to read first, it uses the criterion colNum = <some constant>.
If the clustered index does not start with that column, SQL Server has to read every row of the table to figure out whether the condition is satisfied.
If the first column of the clustered index is colNum, it can "jump" to the first row having the value from the condition. It then reads row by row until it finds the first row with a higher value. At that point it knows the rows are sorted in ascending order, so no further row can have the required colNum value. It stops reading and continues with the next step...
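
To make that concrete, here is a minimal sketch against Peso's #Sample table from the benchmark above; rerunning the join query with SET STATISTICS IO ON before and after the rebuild shows the difference in logical reads:

-- With the original clustered index on (RowNum, ColNum), a ColNum = 1 predicate
-- cannot seek, so every row of the table has to be examined.
-- Rebuilding with ColNum leading stores all ColNum = 1 rows together, so the
-- engine can seek to the first one and stop at the first row with ColNum > 1.
CREATE UNIQUE CLUSTERED INDEX UCX_Sample ON #Sample (ColNum, RowNum)
    WITH (DROP_EXISTING = ON)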

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-10 : 20:59:32
Thanks Mirko! That is a simple explanation - does make sense even to me :)

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-14 : 11:42:44
You are welcome. :)

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/