Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 WHILE statement in Stored Procedure

Author  Topic 

AndyC
Yak Posting Veteran

53 Posts

Posted - 2004-07-27 : 12:58:29
I am trying to calling a stored procedure from Cold Fusion
The following code works fine and returns the required data for 1970

CREATE PROCEDURE MagicNumbersforHOF (@cat_Value int)
AS
BEGIN


SELECT 1970,COUNT(list) AS dataOut
FROM (SELECT playerID list
FROM baseball.Batting
WHERE yearID < 1970
GROUP BY playerID
HAVING (SUM(HR) > @cat_Value)) DERIVEDTBL


END
GO


However, I wish to return a dataset for every year between 1940 and 2003. My feeble attempt is


CREATE PROCEDURE MagicNumbersforHOF (@cat_Value int)
AS

DECLARE @Year int
SET @Year = 1939

WHILE @Year < 2003
SET @Year = @Year+1

BEGIN

SELECT @Year AS Season,COUNT(list) AS dataOut
FROM (SELECT playerID list
FROM dbo.Batting
WHERE yearID < @Year
GROUP BY playerID
HAVING (SUM(HR) > @cat_Value)) DERIVEDTBL

END
GO


Using this code I just get the last calculated value returned i.e for 2003

I presumably need to place the calculated values(Season and dataOut) within the loop
into some sort of temporary table and then access that to return the data
but I'm new to this procedure

Thanks




Andrew Clark
www.majorleaguecharts.com

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-07-27 : 13:13:14
SQL returns sets of rows by nature, you don't have to resort to any tricks to force it.

EDIT! I see now what you need. A little more complicate than I thought.

you need a table of all years, or a query that returns all years. if you have a table, use it, if not use (select distinct yearID from Batting), or some other table that has all the years in it. In the solution below, replace the table "Years" with however you can return that list of all years.

Then, you have:


SELECT Year AS Season,COUNT(list) AS dataOut
FROM
(SELECT
Years.yearId, playerID list
FROM
Batting
INNER JOIN
Years ON Batting.YearID < Years.YearID
WHERE
Years.YearID between 1940 and 2003
GROUP BY
Years.yearId, playerID
HAVING
SUM(HR) > @cat_Value
) DERIVEDTBL
GROUP BY YearID


and as always, if you are learning, don't start by creating procedures -- just write simple SQL statements in QA that you can execute directly.



- Jeff
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-07-27 : 13:22:32
by the way -- interesting web site, though it doesn't seem to work just yet.

I'd love to help out any way I can, I'm a huge baseball fan and would like to consider myself fairly adept at T-SQL and writing stat gathering/calculating SQL statements.

- Jeff
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2004-07-27 : 13:33:17
quote:
Originally posted by jsmith8858

I'd love to help out any way I can, I'm a huge baseball fan



And that's saying something when your a saux fan....

What's it still? 7 back?



Brett

8-)
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-07-27 : 13:41:41
we're SLOWLY chipping away ! took 2/3 from the yanks last weekend ... actually, we're playing the Orioles right now ... they always give us trouble!

- Jeff
Go to Top of Page

AndyC
Yak Posting Veteran

53 Posts

Posted - 2004-07-27 : 15:09:49
Jeff,

Thanks for the reply and tip re QA

I created a table Years and after a bit of tidying up got this to work

SELECT yearID,COUNT(list) AS dataOut
FROM
(SELECT
Years.yearId , playerID list
FROM
Batting
INNER JOIN
Years ON Batting.yearID < Years.yearID
WHERE
Years.yearID between 1940 and 2003
GROUP BY
Years.yearId, playerID
HAVING
SUM(HR) > 499
) DERIVEDTBL
GROUP BY yearID


Unfortunately, it took a minimum of 45 seconds to run which as it is intended to run on a web site is unacceptable. Is there something in the subquery that could be improved or should I be setting some indices somewhere to speed things along.

As you are a baseball fan, you may realize that what I am attempting to do is to calculate - in this instance - how many players had reached the 500 Home run mark by year. So for instance in 1940, the dataout figure is 1 (ruth), in 1950 3 (ruth,foxx,ott) etc

I want the user to be able to vary the category and number e.g. 300 wins so need to be able to calculate the data on-th-fly

Re the website, I am not aware of any problems. What error did you get
or was it just a long wait - you need a fast connection
Try http://www.majorleaguecharts.com/mlc/index.cfm

Thanks for the offer on help. Much appreciated. I'll contact you directly

Andrew Clark
www.majorleaguecharts.com
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-07-27 : 15:30:42
(yes, it is working -- sorry about that. i figured it out -- i thought the drop-downs weren't working for the names but then i realized you have to type in a name and click "GO" to fill up the drop-downs).

yeah, that's a lot of data to crunch. you might want to create a table that has accumulated totals by year by player, and query that directly for this. it's a classic "running totals" problem and those are historically inefficient using SQL.

the query I gave you should help you in calculating the values for that new table; just create the table, fill it up, make sure it has some good indexes, and then you should be good to go. (luckily, this data is static and all you will have to do is add to this table as each new year of data comes in).

- Jeff
Go to Top of Page

AndyC
Yak Posting Veteran

53 Posts

Posted - 2004-07-27 : 15:52:57
( I have tried to make it a bit clearer in the Chart Hints now)

I'll try that cumulative table approach. Alternatively I may just use selected years, say once each decade

Andrew Clark
www.majorleaguecharts.com
Go to Top of Page

AndyC
Yak Posting Veteran

53 Posts

Posted - 2004-07-27 : 20:13:46
OK I'm working on the cumulative table. This code produces the correct output for a specific player


USE mlb
GO

DECLARE
@playerID nvarchar(10),
@yearID smallint,
@HR smallint,
@yearStart smallint,
@yearEnd smallint

SET @playerID = 'sosasa01'
SELECT @playerID

SELECT @yearStart = MIN(yearID) FROM dbo.Batting GROUP BY playerID HAVING (playerID = @playerID)
SELECT @yearEnd = MAX(yearID) FROM dbo.Batting GROUP BY playerID HAVING (playerID = @playerID)


SELECT dbo.Years.yearID, dbo.Batting.playerID, SUM(dbo.Batting.HR) AS HR
FROM dbo.Batting INNER JOIN
dbo.Years ON dbo.Batting.yearID <= dbo.Years.yearID
WHERE (dbo.Years.yearID BETWEEN @yearStart AND @yearEnd)
GROUP BY dbo.Years.yearID, dbo.Batting.playerID
HAVING (dbo.Batting.playerID = @playerID)


So I now need to
a) do the same for all players. I have a seperate table MasterPlayers
which has unique playerIDs if I need that for a join
b) Enter the data into a table BattingCum which has the similar columns to Batting e.g playerID, yearID, HR

Can you assist on this please



Andrew Clark
www.majorleaguecharts.com
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-07-29 : 09:50:46
Andy --

(here's some questions/notes/thoughts/techniques that I think will help you out ....)

How would you like to handle people who have retired, and the years AFTER that. For example, should we keep repeating Babe Ruth's final stats all the way up to 2003 ?

That's one thing I realized when I thought about storing the cumulative stats over the years for players -- each year the # of players you have to track would grow and grow, and in 2003, you would have a record for everyone who's ever played the sport !

Something to think about. Either way, here's some ideas for you:

To calculate the cumulative totals for each player/year, here's one way to do it:

1) first, you need a sql statement that returns the first year/last year that each player played:


select playerID, min(year) as MinYear, max(year) as MaxYear
from Batting
group by playerID


it might pay to insert those results into a temp table for speed purposes ... call the resultset of the above #PlayersYearRange.

From that resultset, you then join to the "years" table like this:


select
p.playerID, y.year
from
#PlayersYearRange p
inner join
Years Y
on
y.year between MinYear and MaxYear


To store stats for all players and ALL years (not just for the years in which they played but until 2003), change the join condition from "between" to simply "on y.year >= MinYear".

As before, at any point you can put these results in a temp table. Call the above results #PlayersYears.

Now, to insert the cumulative totals for all players stats for each year they played, you would execute something like:


insert into CumulativeStats (PlayerId, Year, ....)
select
py.playerID, py.Year, SUM( ... all stats ....)
from
#PlayersYears py
inner join
Batting B
on
B.PlayerID = py.playerID and
B.Year <= py.Year


Notes:

1) if you save this as a stored procedure, you can run it each year to add new cumulative totals by adding a parameter called @Year and adding a WHERE clause to the end: WHERE py.Year = @Year.

2) You can add further where clauses to limit the years you calculate overall. For example, to calculate only 1 year per decade, and only for stats since 1940, you would add: WHERE py.Year > 1940 AND Py.Year % 10 = 0

- - - -

If you decided to only store years in which a player actually played, you can then "expand out" other years in a query as follows:

1) first, we need to calculate, for each player/year combination, what the LAST year before that year has the player's cumulative totals:


select
p.Player, y.Year, Max(C.Year) as CumulativeStatsYear
from
Players P
cross join
Years y
inner join
CumulativeStats C <-- this has a PK of player/Year
on
P.Player = C.Player and
y.Year >= C.Year


If you have that, then you can get all players cumulative stats for each year as follows:



select
a.player, a.year, c.*
from
(<the above SQL statement>) a
inner join
cumulativeStats c
on
a.CumulativeStatsYear = c.Year AND
a.Player = c.player


However, you will have to test the speed of this to see if this works.

- Jeff
Go to Top of Page

AndyC
Yak Posting Veteran

53 Posts

Posted - 2004-07-29 : 16:30:57
Jeff,
Thanks a bundle. It worked without any real problem
I just want to hold the player for the years he played eg Ruth 1914 -1935. so I just worked on your first option. It needs a GROUP By py.playerID, py.Year for the insert into my final table. It only took a couple of minutes to run all players for the batting stats - there are 88,000 rows, I think. I have also been addressing another thread I have had outstanding so am a bit brain dead. Would you mind amending the code we had previously to take account of the fact we now have players cumulative stats. I call it BattingCum
It was

SELECT yearID,COUNT(list) AS dataOut
FROM
(SELECT
Years.yearId , playerID list
FROM
Batting
INNER JOIN
Years ON Batting.yearID < Years.yearID
WHERE
Years.yearID between 1940 and 2003
GROUP BY
Years.yearId, playerID
HAVING
SUM(HR) > 499
) DERIVEDTBL
GROUP BY yearID



Andrew Clark
www.majorleaguecharts.com
Go to Top of Page

AndyC
Yak Posting Veteran

53 Posts

Posted - 2004-07-31 : 15:03:33
I now have the site up using both individual year and cumulative data
thanks Jeff and Cory
The code I use in Cold Fusion to obtain the cumulative data is of the form

Select
Age,
playerA = sum(case when playerId = '#playerA#' then cat else 0 end),
yearA = max(case when playerId = '#playerA#' then YearId else 0 end),
playerB = sum(case when playerId = '#playerB#' then cat else 0 end),
yearB = max(case when playerId = '#playerB#' then YearId else 0 end)

From
(
Select
B.PlayerId,
Age = Case
When MP.birthMonth < 7 then (B.yearID - MP.birthYear)
Else (B.yearID - MP.birthYear-1) End,
B.YearId,

B.#category# AS cat

From BattingCum as B
Left Join MasterPlayers as MP
On B.playerId = MP.playerId
Where B.playerId in ('#playerA#','#playerB#') AND (B.yearID - MP.birthYear) between '#minAge#' AND '#maxAge#'

) as byAge
Group By Age

where all the variables sent are in hash marks and BattingCum provides the cumulative totals of categories e.g. HR by year
However, this is not ideal as what I obtain is the cumulative career totals between the ages of say 30 and 33 rather than the total career obtained over that time period

e.g age, HRshown, HR what I want

30, 123, 0
31, 147, 24
32, 168, 45
33, 202, 79

I have played around with something along the line
SELECT @baseLevelA =
B.HR
From PitchingCum as B
Left Join MasterPlayers as MP
On B.playerId = MP.playerId
Where B.playerId in (@playerIDA) AND (B.yearID - MP.birthYear) = #minAge#
[/code]

And then having
B.#category# - @baseLevelA AS cat in the output query but
am not sure how to extend this to a series of players + there may be a more elegant way to do it

Cheers


Andrew Clark
www.majorleaguecharts.com
Go to Top of Page
   

- Advertisement -