
 A simple formula


ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2011-05-05 : 15:03:26
Is it possible to find a simple math formula for the following scenario?
I want to check whether 5 sequential values are strictly increasing.
For example

row_num     v1          v2          v3          v4          v5
----------- ----------- ----------- ----------- ----------- -----------
1           50          55          60          90          150
2           50          60          49          70          89

Row number 1 is increasing, so I want to select it, because:
50 < 55 < 60 < 90 < 150

A simple method is:

SELECT *
FROM TableName
WHERE v1 < v2 AND v2 < v3 AND v3 < v4 AND v4 < v5;

Or another method:

SELECT *
FROM TableName
WHERE (SIGN(v2 - v1) - 1) + (SIGN(v3 - v2) - 1) + (SIGN(v4 - v3) - 1) + (SIGN(v5 - v4) - 1) = 0;
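
An equivalent way to write the same SIGN idea (just a sketch against the same TableName columns) is to require all four signs to be +1; since each SIGN term is at most 1, the sum can only reach 4 when every consecutive pair is strictly increasing:

SELECT *
FROM TableName
WHERE SIGN(v2 - v1) + SIGN(v3 - v2) + SIGN(v4 - v3) + SIGN(v5 - v4) = 4;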

Is there a simpler and shorter formula for this?
Thanks


______________________

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-05 : 15:05:29
I'm pretty sure your first solution is the shortest. It's certainly the most logical.

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2011-05-06 : 07:57:14
Thanks,
Now how do I solve a new problem?
The values are stored vertically instead of horizontally.

Please see:

row_num     nbr         v
----------- ----------- -----------
1           1           50
1           2           55
1           3           60
1           4           90
1           5           150

Now how can I compare the values to check that they are strictly increasing?
Should I pivot the table again and use my first solution?!


______________________

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-06 : 08:26:35
Try this:
DECLARE @Sample TABLE
(
    row_num INT,
    nbr INT,
    v INT
)

INSERT @Sample
VALUES  (1, 1, 50),
        (1, 2, 55),
        (1, 3, 60),
        (1, 4, 90),
        (1, 5, 150)

;WITH cte
AS (
    SELECT  row_num,
            ROW_NUMBER() OVER (PARTITION BY row_num ORDER BY v) - nbr AS Yak
    FROM    @Sample
)
SELECT  row_num
FROM    cte
GROUP BY row_num
HAVING  MAX(Yak) = 0



N 56°04'39.26"
E 12°55'05.63"

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-06 : 09:42:51
Here's a pivoted version based on Peter's setup:
SELECT row_num, v1, v2, v3, v4, v5
FROM (SELECT row_num, 'v'+CAST(nbr AS VARCHAR) n, v FROM @Sample) a
PIVOT(MAX(v) FOR n IN (v1,v2,v3,v4,v5)) b
WHERE v1 < v2 AND v2 < v3 AND v3 < v4 AND v4 < v5

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2011-05-06 : 13:17:25
Values can be equal, so I need to use RANK() instead of ROW_NUMBER().
Also, I think MIN(Yak) = 0 AND MAX(Yak) = 0 is necessary in the HAVING clause.

That is:

;WITH cte
AS (
    SELECT  row_num,
            RANK() OVER (PARTITION BY row_num ORDER BY v) - nbr AS Yak
    FROM    @Sample
)
SELECT  row_num
FROM    cte
GROUP BY row_num
HAVING  MAX(Yak) = 0 AND MIN(Yak) = 0



______________________

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-06 : 14:34:05
No, I don't think you need both MAX and MIN.
Look at the values: if all values are in increasing order, the subtraction is 0 for every row.
BUT! If at least one value is out of order, you get both a positive value and a matching negative value in the series, so MAX(Yak) alone is enough to detect it.
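
A quick way to see this (a sketch that reuses the @Sample table variable from above, with the out-of-order row 2 from the first post added) is to list the Yak values directly:

DECLARE @Sample TABLE (row_num INT, nbr INT, v INT)

INSERT @Sample
VALUES  (1, 1, 50), (1, 2, 55), (1, 3, 60), (1, 4, 90), (1, 5, 150),
        (2, 1, 50), (2, 2, 60), (2, 3, 49), (2, 4, 70), (2, 5, 89)

-- Row 1 produces Yak = 0, 0, 0, 0, 0, so MAX(Yak) = 0.
-- Row 2 produces Yak = 1, 1, -2, 0, 0: the positive values are balanced by a
-- negative one, and MAX(Yak) > 0 flags the row as not strictly increasing.
SELECT  row_num, nbr, v,
        ROW_NUMBER() OVER (PARTITION BY row_num ORDER BY v) - nbr AS Yak
FROM    @Sample
ORDER BY row_num, nbr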



N 56°04'39.26"
E 12°55'05.63"

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-06 : 14:55:44
And change RANK to ROW_NUMBER(), otherwise the duplicate value will return 1 in the subtraction.
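
For what it's worth, here is a small sketch (a hypothetical row 3 that contains the value 55 twice) showing how the two ranking functions differ on a tie:

DECLARE @Dup TABLE (row_num INT, nbr INT, v INT)

INSERT @Dup
VALUES (3, 1, 50), (3, 2, 55), (3, 3, 55), (3, 4, 90), (3, 5, 150)

SELECT  row_num, nbr, v,
        RANK()       OVER (PARTITION BY row_num ORDER BY v) - nbr AS YakRank,
        ROW_NUMBER() OVER (PARTITION BY row_num ORDER BY v) - nbr AS YakRowNumber
FROM    @Dup
ORDER BY nbr
-- RANK gives both 55s rank 2, so YakRank is 0, 0, -1, 0, 0 (MAX = 0, MIN = -1).
-- ROW_NUMBER breaks the tie arbitrarily, so YakRowNumber is either all zeroes
-- or contains a +1/-1 pair, depending on which 55 is numbered first.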



N 56°04'39.26"
E 12°55'05.63"

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-07 : 13:33:58
I am not sure about performance, but the following method looks simpler to me:

select  a.row_num,
        a.v as v1,
        b.v as v2,
        c.v as v3,
        d.v as v4,
        e.v as v5
from    tableName a
join    tableName b on b.row_num = a.row_num and a.v < b.v
join    tableName c on c.row_num = b.row_num and b.v < c.v
join    tableName d on d.row_num = c.row_num and c.v < d.v
join    tableName e on e.row_num = d.row_num and d.v < e.v


Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2011-05-07 : 15:53:45
Good!
But the query needs more conditions:
quote:
Originally posted by mmarovic

I am not sure about performance, but the following method looks simpler to me:

select  a.row_num,
        a.v as v1,
        b.v as v2,
        c.v as v3,
        d.v as v4,
        e.v as v5
from    tableName a
join    tableName b on b.row_num = a.row_num and a.v < b.v and a.nbr = 1 and b.nbr = 2
join    tableName c on c.row_num = b.row_num and b.v < c.v and c.nbr = 3
join    tableName d on d.row_num = c.row_num and c.v < d.v and d.nbr = 4
join    tableName e on e.row_num = d.row_num and d.v < e.v and e.nbr = 5


Mirko

My blog: http://mirko-marovic-eng.blogspot.com/



______________________

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-08 : 02:40:38
Yes, that's correct; I forgot about the sequencing conditions. With the sequencing and the right indexing (a covering index starting with nbr), this method should be the most efficient one.
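
A sketch of what such an index might look like (the index name is hypothetical; the columns are the row_num/nbr/v layout from the earlier posts):

-- nbr leads so each nbr = <constant> filter can seek,
-- and INCLUDE (v) makes the index cover the query.
CREATE NONCLUSTERED INDEX IX_TableName_nbr_rownum
    ON TableName (nbr, row_num)
    INCLUDE (v)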

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-08 : 04:40:19
[code]CREATE TABLE #Sample
(
    RowNum INT NOT NULL,
    ColNum TINYINT NOT NULL,
    Value INT NOT NULL
)

CREATE UNIQUE CLUSTERED INDEX UCX_Sample ON #Sample (RowNum, ColNum)

INSERT #Sample
(
    RowNum,
    ColNum,
    Value
)
SELECT 1000 * v1.Number + v2.Number AS RowNum,
v3.Number AS ColNum,
ABS(CHECKSUM(NEWID())) AS Value
FROM master..spt_values AS v1
INNER JOIN master..spt_values AS v2 ON v2.Type = 'P'
AND v2.Number BETWEEN 0 AND 999
INNER JOIN master..spt_values AS v3 ON v3.Type = 'P'
AND v3.Number BETWEEN 1 AND 6
WHERE v1.type = 'P'
AND v1.number BETWEEN 0 AND 999

SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

-- Peso 1
SELECT  RowNum
FROM    (
            SELECT  RowNum,
                    ROW_NUMBER() OVER (PARTITION BY RowNum ORDER BY Value) - ColNum AS Yak
            FROM    #Sample
        ) AS d
GROUP BY RowNum
HAVING  MAX(Yak) = 0
GO 2
/*
(1365 row(s) affected)
Table 'Worktable'. Scan count 1, logical reads 13389, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 19110 ms, elapsed time = 16381 ms.
*/

-- mmarovic
SELECT  a.RowNum
FROM    #Sample AS a
INNER JOIN #Sample AS b ON b.RowNum = a.RowNum
    AND b.ColNum = 2
    AND b.Value > a.Value
INNER JOIN #Sample AS c ON c.RowNum = b.RowNum
    AND c.ColNum = 3
    AND c.Value > b.Value
INNER JOIN #Sample AS d ON d.RowNum = c.RowNum
    AND d.ColNum = 4
    AND d.Value > c.Value
INNER JOIN #Sample AS e ON e.RowNum = d.RowNum
    AND e.ColNum = 5
    AND e.Value > d.Value
INNER JOIN #Sample AS f ON f.RowNum = e.RowNum
    AND f.ColNum = 6
    AND f.Value > e.Value
WHERE a.ColNum = 1
GO 2
/*
(1365 row(s) affected)
Table 'Worktable'. Scan count 6, logical reads 80329, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 4867 ms, elapsed time = 4992 ms.
*/

SET STATISTICS TIME OFF
SET STATISTICS IO OFF

DROP TABLE #Sample[/code]

N 56°04'39.26"
E 12°55'05.63"

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-08 : 05:40:28
Now try the index I mentioned, the one that starts with ColNum.
CREATE UNIQUE CLUSTERED INDEX UCX_Sample ON #Sample (ColNum, RowNum)


Also, you have one more value and one more join in the query you posted as mine.

quote:
SELECT  a.RowNum
FROM    #Sample AS a
INNER JOIN #Sample AS b ON b.RowNum = a.RowNum
    AND b.ColNum = 2
    AND b.Value > a.Value
INNER JOIN #Sample AS c ON c.RowNum = b.RowNum
    AND c.ColNum = 3
    AND c.Value > b.Value
INNER JOIN #Sample AS d ON d.RowNum = c.RowNum
    AND d.ColNum = 4
    AND d.Value > c.Value
INNER JOIN #Sample AS e ON e.RowNum = d.RowNum
    AND e.ColNum = 5
    AND e.Value > d.Value
INNER JOIN #Sample AS f ON f.RowNum = e.RowNum
    AND f.ColNum = 6
    AND f.Value > e.Value
WHERE a.ColNum = 1
Anyway, I like the query you and ms65g produced. It is short and efficient enough. The only drawback is that it is not quite obvious from the code what you are trying to achieve.

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-08 : 06:39:50
I forgot to mention: I did not expect my query to run significantly faster even with the index you used. However, with an index starting with ColNum, I expect the logical reads to drop.

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-10 : 07:37:55
quote:
Originally posted by mmarovic

.....
However, with index starting with colNum, I expect logical reads to drop.
....


How do you just LOOK at some code and say "this would have fewer logical reads" or "that would be more efficient"!?! I admit that I don't know even the basics of query optimization - can you point me to some books/articles that would help me get a working knowledge?

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-10 : 08:05:44
Read the blog at http://blogs.msdn.com/b/conor_cunningham_msft/ and his posts about how indexes work.



N 56°04'39.26"
E 12°55'05.63"

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-10 : 08:27:32
Thanks, Peso!

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-05-10 : 09:51:17
There is also SqlInTheWild, http://sqlinthewild.co.za/


N 56°04'39.26"
E 12°55'05.63"

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-10 : 15:02:52
quote:
Originally posted by sunitabeck

quote:
Originally posted by mmarovic

.....
However, with index starting with colNum, I expect logical reads to drop.
....


How do you just LOOK at some code and say "this would have fewer logical reads" or "that would be more efficient"!?! I admit that I don't know even the basics of query optimization - can you point me to some books/articles that would help me get a working knowledge?


In this case it is not that complex. Whichever data set the query optimizer chooses to read first, it uses the criterion colNum = <some constant>.
If the clustered index does not start with that column, SQL Server has to read every row of the table to figure out whether the condition is satisfied.
If the first column of the clustered index is colNum, it can "jump" to the first row having the value from the condition. It then reads row by row until it finds the first row with a higher value. At that point it knows the rows are sorted in ascending order, so no further row can have the required colNum value. It stops reading and continues with the next step...
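
To make that concrete, here is a minimal sketch against Peso's #Sample table from the benchmark above; rerunning the join query with SET STATISTICS IO ON before and after the rebuild shows the difference in logical reads:

-- With the original clustered index on (RowNum, ColNum), a ColNum = 1 predicate
-- cannot seek, so every row of the table has to be examined.
-- Rebuilding with ColNum leading stores all ColNum = 1 rows together, so the
-- engine can seek to the first one and stop at the first row with ColNum > 1.
CREATE UNIQUE CLUSTERED INDEX UCX_Sample ON #Sample (ColNum, RowNum)
    WITH (DROP_EXISTING = ON)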

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-10 : 20:59:32
Thanks Mirko! That is a simple explanation - does make sense even to me :)

mmarovic
Aged Yak Warrior

518 Posts

Posted - 2011-05-14 : 11:42:44
You are welcome. :)

Mirko

My blog: http://mirko-marovic-eng.blogspot.com/