Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
AndyC
Yak Posting Veteran
53 Posts |
Posted - 2006-06-26 : 10:13:10
|
I'm tryng to use sql2005 to determine maximum sequences for baseball recordsThe code below may not be the most elegant but it does give the correct answer for the longest run of 100rbi seasons that willie mays had(8)select top 1 count(seqs) as maxyearseqs from(select playerid,yearID+1-(SELECT top 1 yearidFROM BattingGROUP BY playerID, yearIDHAVING (playerID = N'mayswi01') AND (SUM(RBI) > 99))- row_number() over(order by yearid) as seqsfrom battinggroup by playerid,yearidhaving playerid=N'mayswi01' and sum(rbi)>99) as derivedgroup by seqs,playeridorder by count(seqs) desc However, I'm having a problem generalizing it so that I get the best by any player - presumably something to do with linking playerids but cannot fathom it outIt would also be useful to add teamid and yearid filters so that I coulf find out,say, which yankee player had the longest sequence of seasons between 1960 and 1980Thanks for any helpAndrew Clarkwww.majorleaguecharts.com |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-06-26 : 14:12:39
|
| If you expect some answers, you should post some DDL here together with some test data and some indication of the expected output.Peter LarssonHelsingborg, Sweden |
 |
|
|
rob_farley
Yak Posting Veteran
64 Posts |
Posted - 2006-06-27 : 04:13:57
|
| Does this help:/*drop table dbo.batting;goselect * into dbo.batting from(select 'player1' playerid, 1999 yearid, 101 rbiunion allselect 'player1' playerid, 2000 yearid, 99 rbiunion allselect 'player1' playerid, 2001 yearid, 101 rbiunion allselect 'player1' playerid, 2002 yearid, 101 rbiunion allselect 'player1' playerid, 2003 yearid, 101 rbiunion allselect 'player1' playerid, 2004 yearid, 99 rbiunion allselect 'player2' playerid, 2000 yearid, 99 rbiunion allselect 'player2' playerid, 2001 yearid, 99 rbiunion allselect 'player2' playerid, 2002 yearid, 101 rbiunion allselect 'player2' playerid, 2003 yearid, 101 rbiunion allselect 'player2' playerid, 2004 yearid, 99 rbi) batting*/select starts.playerid, starts.yearid startyear, ends.yearid endyear, ends.yearid-starts.yearid+1 from(select b2.playerid, b2.yearid, row_number() over(order by b2.playerid, b2.yearid) seq_nofrom batting b1 right join batting b2 on b2.playerid = b1.playerid and b2.yearid = b1.yearid + 1where isnull(b1.rbi,0) < 100 and b2.rbi >= 100) startsjoin(select b1.playerid, b1.yearid, row_number() over(order by b1.playerid, b1.yearid) seq_nofrom batting b1 left join batting b2 on b2.playerid = b1.playerid and b2.yearid = b1.yearid + 1where b1.rbi >= 100 and isnull(b2.rbi,0) < 100) ends on starts.seq_no = ends.seq_no--Obviously this batting table has fewer columns than yours. You might want to put a CTE in there as a wrapper to your table, that can alias batting to 'select playerid, yearid, sum(rbi) rbi from batting group by playerid, yearid'To extend it to include teams, just join the resultset to another column which says who each player played for and when. And if you want to filter on years, put that in your CTE definition "where yearid between 1960 and 1980".I hope this helps... Good luck with your stats.Rob Farleyhttp://robfarley.blogspot.com |
 |
|
|
AndyC
Yak Posting Veteran
53 Posts |
Posted - 2006-06-27 : 14:09:17
|
Thanks Rob. Your code works fine with data provided.However, I am having a bit of a problem applying it to my statsI appear to have identified it to the fact that for some players, there are a different number of resultsets for each of the sub queries.For example for this player's data, playerid='becklja01'1888 271889 971890 1201891 731892 961893 1061894 1201895 1101896 321896 381897 111897 761898 721899 991900 941901 791902 691903 811904 671905 571906 441907 7*nb more than one set of data for some years e.g 1896 as there are seperate entries for different teams he played with (hence the SUM in my original entry)and the codeselect b2.playerid, b2.yearid, row_number() over(order by b2.playerid, b2.yearid) seq_nofrom batting b1right joinbatting b2on b2.playerid = b1.playerid and b2.yearid = b1.yearid + 1where isnull(b1.rbi,0) < 100and b2.rbi >= 100and b1.playerid='becklja01' I get the result:-playerid yearid seq_nobecklja01 1890 1becklja01 1893 2which is correctbut for the other queryselect b1.playerid, b1.yearid, row_number() over(order by b1.playerid, b1.yearid) seq_nofrom batting b1left joinbatting b2on b2.playerid = b1.playerid and b2.yearid = b1.yearid + 1where b1.rbi >= 100and isnull(b2.rbi,0) < 100and b1.playerid='becklja01' playerid yearid seq_nobecklja01 1890 1becklja01 1895 2becklja01 1895 3which has two sequences ending in 1895. This throws out all the resultsCan you elucidate?Andrew Clarkwww.majorleaguecharts.com |
 |
|
|
rob_farley
Yak Posting Veteran
64 Posts |
Posted - 2006-06-27 : 19:54:47
|
| Ok. But if you're not using the teamid field, then you should be aggregating the data so that there is only one RBI record for Player/Year. However you're producing that aggregated section, only "group by playerid, yearid".I assume you have a bunch of raw data, and you're producing the 'batting' CTE with something like:with batting (playerid, yearid, rbi) as ( select playerid, yearid, sum(rbi) from raw_batting_data group by playerid, yearid --This is the key line here. Don't have teamid unless you're catering for it in the next section too!)select starts.playerid, starts.yearid startyear, ends.yearid endyear, ends.yearid-starts.yearid+1from(select b2.playerid, b2.yearid, row_number() over(order by b2.playerid, b2.yearid) seq_nofrom batting b1right joinbatting b2on b2.playerid = b1.playeridand b2.yearid = b1.yearid + 1where isnull(b1.rbi,0) < 100and b2.rbi >= 100) startsjoin(select b1.playerid, b1.yearid, row_number() over(order by b1.playerid, b1.yearid) seq_nofrom batting b1left joinbatting b2on b2.playerid = b1.playeridand b2.yearid = b1.yearid + 1where b1.rbi >= 100and isnull(b2.rbi,0) < 100) endson starts.seq_no = ends.seq_noRob Farleyhttp://robfarley.blogspot.com |
 |
|
|
AndyC
Yak Posting Veteran
53 Posts |
Posted - 2006-06-28 : 11:04:55
|
Thanks Rob - I'm slowly getting there. I now have code which correctly provides the aggregated data by year - but I now have problems when I add in the teamidI currently use the wrapperwith newbattingteam (playerid, yearid, teamid, rbi) as(select playerid, yearid, teamid, rbifrom batting) but then this subqueryselect b2.playerid,b2.teamid, b2.yearid, row_number() over(order by b2.playerid, b2.teamid, b2.yearid) seq_nofrom newbattingteam b1right joinnewbattingteam b2on b2.playerid = b1.playerid and b2.teamid = b1.teamidand b2.yearid = b1.yearid + 1where isnull(b1.rbi,0) < 100and b2.rbi >= 100and b1.playerid='foxxji01'group by b2.playerid,b2.teamid,b2.yearid -- no diff if i leave this out I only get a reference to the first team's sequenceAndrew Clarkwww.majorleaguecharts.com |
 |
|
|
rob_farley
Yak Posting Veteran
64 Posts |
Posted - 2006-06-28 : 20:41:50
|
| Yes, you don't need the group by in that part of the query. Try this:with newbatting (playerid, teamid, yearid, rbi) as(select playerid, teamid, yearid, sum(rbi)from battinggroup by playerid, teamid, yearid --This is the key line here. Don't have teamid unless you're catering for it in the next section too!)select starts.playerid, starts.teamid, starts.yearid startyear, ends.yearid endyear, ends.yearid-starts.yearid+1from(select b2.playerid, b2.teamid, b2.yearid, row_number() over(order by b2.playerid, b2.teamid, b2.yearid) seq_nofrom newbatting b1right joinnewbatting b2on b2.playerid = b1.playerid and b2.teamid = b1.teamidand b2.yearid = b1.yearid + 1where isnull(b1.rbi,0) < 100and b2.rbi >= 100) startsjoin(select b1.playerid, b1.teamid, b1.yearid, row_number() over(order by b1.playerid, b1.teamid, b1.yearid) seq_nofrom newbatting b1left joinnewbatting b2on b2.playerid = b1.playerid and b2.teamid = b1.teamidand b2.yearid = b1.yearid + 1where b1.rbi >= 100and isnull(b2.rbi,0) < 100) endson starts.seq_no = ends.seq_noRob Farleyhttp://robfarley.blogspot.com |
 |
|
|
AndyC
Yak Posting Veteran
53 Posts |
Posted - 2006-06-29 : 08:04:10
|
| yes Rob. i was making a mistake filtering for a specific player in the subqueries. Putting it in a HAVING clause in the cte was the correct routeThanks again for all your help. I should be set nowAndrew Clarkwww.majorleaguecharts.com |
 |
|
|
rob_farley
Yak Posting Veteran
64 Posts |
Posted - 2006-06-29 : 20:32:46
|
| No, that should be in a WHERE clause. HAVING is for filters that cannot be applied before the grouping.Rob Farleyhttp://robfarley.blogspot.com |
 |
|
|
|
|
|
|
|