Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
Yida12
Starting Member
6 Posts |
Posted - 2011-07-20 : 08:35:25
|
| Hi Everyone,I have a table that has revenues for firms per year like so:FIRM YEAR REVENUEFirmA 1991 125FirmA 1992 130FirmA 1993 160FirmB 1991 150FirmB 1992 175FirmC 1992 130FirmC 1993 120The timeline is 1991-1993 inclusive.I want to add to the table so that if a firm does not have revenue for a year, SQL will try to use a revenue from the closest year with revenue.So FirmB 1993 would have revenue of 175, and FirmC 1991 would have revenue of 130.Modified Table would look like:FIRM YEAR REVENUEFirmA 1991 125FirmA 1992 130FirmA 1993 160FirmB 1991 150FirmB 1992 175FirmB 1993 175FirmC 1991 130FirmC 1992 130FirmC 1993 120If you need more detail, please let me know. Thank you for the help. |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-07-20 : 08:44:36
|
| [code]SELECT p.* INTO #MissingYrsFROM(SELECT t.FIRM,y.YrFROM (SELECT DISTINCT FIRM FROM table)tCROSS JOIN (SELECT 1991 AS Yr UNION ALL SELECT 1992 UNION ALL SELECT 1993)y)pLEFT JOIN Table t1ON t1.FIRM = p.FIRMAND t1.YEAR = p.YrWHERE t1.FIRM IS NULLINSERT INTO tableSELECT m.FIRM,m.Yr,COALESCE(y1.REVENUE,y2.REVENUE)FROM #MissingYrs mOUTER APPLY (SELECT TOP 1 REVENUEFROM TableWHERE FIRM = m.FIRMAND YEAR >m.YrORDER BY YEAR ASC) y1OUTER APPLY (SELECT TOP 1 REVENUEFROM TableWHERE FIRM = m.FIRMAND YEAR < m.YrORDER BY YEAR DESC) y2DROP TABLE #MissingYrs[/code]------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2011-07-20 : 08:49:11
|
[code]Declare @t table ( Firm varchar(10), Yr int, Rev int)Insert Into @t Select 'FirmA', 1991, 125Insert Into @t Select 'FirmA', 1992, 130Insert Into @t Select 'FirmA', 1993, 160Insert Into @t Select 'FirmB', 1991, 150Insert Into @t Select 'FirmB', 1992, 175Insert Into @t Select 'FirmC', 1992, 130Insert Into @t Select 'FirmC', 1993, 120Select * From @tDeclare @rs int, --Range Start @re int --Range EndSet @rs = 1991Set @re = 1993 Select A.Firm, A.Yr, A.Rev From ( Select Z.*, Y.Rev, r = ROW_NUMBER() Over(Partition By Z.Firm, Z.Yr Order By abs(Z.Yr-Y.Yr), Z.Yr-Y.Yr) From ( Select * From (Select Distinct Firm From @t) A Cross Join (Select Yr=number From master..spt_values Where number between @rs and @re) B ) Z Left Join @t Y On Z.Firm = Y.Firm ) A Where A.r = 1 Order By A.Firm, A.Yr[/code]Corey I Has Returned!! |
 |
|
|
Yida12
Starting Member
6 Posts |
Posted - 2011-07-20 : 10:24:58
|
| Thanks for the help guys.Do your codes work for a table with some firms not having any revenue data for any years? In this case, I just want to skip the firm entirely. Also, I am applying the code to thousands of firms with years 2004-2010. |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2011-07-20 : 10:49:07
|
Oh... I realized I was making mine more difficult than necessary:EDIT: Forgot to say... this should work fine on thousands of firms, and I added in a Firm that did not fall inside the Yr range... and it is excluded from the results.Drop Table #tCreate Table #t ( Firm varchar(10), Yr int, Rev int, Primary Key (Firm, Yr))Insert Into #t Select 'FirmA', 1991, 125Insert Into #t Select 'FirmA', 1992, 130Insert Into #t Select 'FirmA', 1993, 160Insert Into #t Select 'FirmB', 1991, 150Insert Into #t Select 'FirmB', 1992, 175Insert Into #t Select 'FirmC', 1992, 130Insert Into #t Select 'FirmC', 1993, 120Insert Into #t Select 'FirmD', 1990, 100 -- Test Firm with no Rev for 91-93-- Make a bunch of copiesInsert Into #tSelect Firm + CONVERT(varchar,number), Yr, Rev From #t A, (Select number From master..spt_values Where type = 'P') BSelect * From #tDeclare @rs int, --Range Start @re int --Range EndSet @rs = 1991Set @re = 1993 Select Firm, Yr, Rev From ( Select A.Firm, B.Yr, A.Rev, r = ROW_NUMBER() Over(Partition By A.Firm, B.Yr Order By abs(A.Yr - B.Yr)) From #t A Cross Join (Select Yr=number From master..spt_values Where number between @rs and @re) B Where A.Yr between @rs and @re ) Z Where r = 1 Corey I Has Returned!! |
 |
|
|
Yida12
Starting Member
6 Posts |
Posted - 2011-07-20 : 11:15:07
|
| Seventhnight,Could you contact me via an email? I want to send you a cleaned sample of the data. I tried implementing your code but am getting errors. |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2011-07-20 : 12:35:30
|
I don't have access to my regular email here... sorry.Why don't you post what you have tried here, along with the errors you are getting.Also, you could try to recreate sample data that you think reproduces the issue...Corey I Has Returned!! |
 |
|
|
Yida12
Starting Member
6 Posts |
Posted - 2011-07-22 : 14:18:33
|
So for my actual table, I need to update the table so that each FIRM has a revenue for every year (2004-2010).If there is a tie in the years, pick the year that is the later one.Here is my code:-- table variabledeclare @data as table ( FIRM nvarchar(100), YEAR int, REVENUE float))-- sample datainsert into @data(FIRM, YEAR, REVENUE)Select a.FIRM, a.YEAR, a.REVENUE from [excel table] as a-- SQLselect sq.FIRM, sq.yr, coalesce(d.revenue, closest_revenue.closest_revenue) as revenuefrom ( -- generate a result set that matches firms to years select distinct d.FIRM, yrs.yr from @data AS d cross apply (values(2004),(2005),(2006), (2007), (2008), (2009), (2010)) yrs(yr)) AS sqleft outer join @data d on sq.FIRM=d.FIRM and sq.YEAR=d.YEARouter apply ( -- look up the closest revenue -- how do I deal with ties? select top 1 d_closest.revenue as closest_revenue from @data as d_closest where sq.FIRM=d.FIRM order by abs(sq.yr - d_closest.DATAYEAR)) AS closest_revenueorder by sq.ORG_KEY, sq.SEGMENT_KEY, sq.yr The code runs instantly before "left outer join @data d on sq.FIRM=d.FIRM and sq.YEAR=d.YEAR", but this step takes so much time that I just stop it. Any suggestions on this and on how to deal with ties? |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-07-23 : 03:57:40
|
| how many rows does @data contain?------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
Yida12
Starting Member
6 Posts |
Posted - 2011-07-25 : 15:57:48
|
| @data contains tens of thousands of rows. |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2011-07-25 : 16:52:17
|
Maybe a temp table would be better (can be indexed). Also, I'd assign an Id to a 'Firm':-- temp tableCreate Table #firms table ( FirmId int identity(1,1) FIRM nvarchar(100), Primary Key (FirmId))Create Table #data table ( FirmId int, FIRM nvarchar(100), Yr int, REVENUE float)-- sample datainsert into #firms (FIRM)Select distinct a.FIRM from [excel table] as a Order By A.Firminsert into #data(FIRM, Yr, REVENUE)Select FirmId = (Select FirmId From #firms Where Firm = A.Firm), a.Firm, a.Yr, a.REVENUE from [excel table] as aSelect Z.FirmId, Y.Firm, Z.Yr, Z.RevFrom ( Select A.FirmId, B.Yr, A.Rev, r = ROW_NUMBER() Over(Partition By A.FirmId, B.Yr Order By abs(A.Yr - B.Yr)) From #data A Cross Join (values(2004),(2005),(2006), (2007), (2008), (2009), (2010)) B(Yr) Where A.Yr between 2004 and 2010 ) ZInner Join #firms YOn Z.FirmId = Y.FirmIdWhere r = 1Order By Z.FirmId, Z.Yr Seeing this as far as output on my sample data:(18441 row(s) affected)Table '#t_____000000006850'. Scan count 1, logical reads 69, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 125 ms, elapsed time = 446 ms. Corey I Has Returned!! |
 |
|
|
Yida12
Starting Member
6 Posts |
Posted - 2011-07-26 : 09:40:03
|
| So using a temp table is faster than using a table variable? And also what is the reason for the ID on firms? |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-07-26 : 13:07:23
|
| it can be based on amount of data and indexes.------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2011-07-26 : 16:17:03
|
By generating the Id in the temp table, and making it a primary key, I get an indexed integer. Even though the rest of the query isn't really using the #firms temp table much, it is much easier(more efficient) to work with the integer than the nvarchar(100)... you could put an index on the #data temp table and it might run even better. I didn't test that though.Corey I Has Returned!! |
 |
|
|
|
|
|
|
|