Calculating Running Totals

By Garth Wells on 07 May 2001 | 21 Comments | Tags: SELECT


Roger writes "Is there a way to retrive a field value from the previously read row in order to use it to calculate a field in the current row . . ."

For example:
day   sales   cumu_total
1     120     120
2     60      180
3     125     305
4     40      345

In order to calculte the cumulative total I need to know the previous cumulative total (cumulative total could be any other calculation). I did this in MySQL by using variables to temporarily hold values between rows but SQL server doesn't assinging variables in retrieval statements. Is it possible to do this in a Select statement?"


The answer to the questions is "yes", you can solve this problem with a single SELECT statement. Instead of just showing the solution that came to mind, though, I want to take a look at three ways to solve the problem.

Creating the Test Data

I used the code shown below to create the base table and a few thousand rows of test data. It is important to add enough data so that the efficiency of the three solutions can be accurately measured. In other words, with only a few rows of data they all seem efficient.
CREATE TABLE Sales (DayCount smallint, Sales money)
CREATE CLUSTERED INDEX ndx_DayCount ON Sales(DayCount)
go
INSERT Sales VALUES (1,120)
INSERT Sales VALUES (2,60)
INSERT Sales VALUES (3,125)
INSERT Sales VALUES (4,40)

DECLARE @DayCount smallint, @Sales money
SET @DayCount = 5
SET @Sales = 10

WHILE @DayCount < 5000
 BEGIN
  INSERT Sales VALUES (@DayCount,@Sales)
  SET @DayCount = @DayCount + 1
  SET @Sales = @Sales + 15
 END

The Three Solutions

The three different solutions I tested are shown below. The execution time with and without a clustered index on DayCount is shown at the top of each batch.

Solution 1: Temp Table/Cursor

(NoIndex = 2 secs, Index = 2 secs)

CREATE TABLE #Sales (DayCount smallint, Sales money, RunningTotal money)

DECLARE @DayCount smallint,
        @Sales money,
        @RunningTotal money

SET @RunningTotal = 0

DECLARE rt_cursor CURSOR
FOR
SELECT DayCount, Sales
FROM Sales

OPEN rt_cursor

FETCH NEXT FROM rt_cursor INTO @DayCount,@Sales

WHILE @@FETCH_STATUS = 0
 BEGIN
  SET @RunningTotal = @RunningTotal + @Sales
  INSERT #Sales VALUES (@DayCount,@Sales,@RunningTotal)
  FETCH NEXT FROM rt_cursor INTO @DayCount,@Sales
 END

CLOSE rt_cursor
DEALLOCATE rt_cursor

SELECT * FROM #Sales ORDER BY DayCount

DROP TABLE #Sales

 

Solution 2: The "Celko" Solution

(NoIndex = 25 secs, Index = 20 secs)

SELECT DayCount,
       Sales,
       Sales+COALESCE((SELECT SUM(Sales) 
                      FROM Sales b 
                      WHERE b.DayCount < a.DayCount),0)
                         AS RunningTotal
FROM Sales a
ORDER BY DayCount

 

Solution 3: The "Guru's Guide" Solution

(NoIndex = 38 secs, Index = 17 secs)

SELECT a.DayCount,
       a.Sales,
       SUM(b.Sales)
FROM Sales a
CROSS JOIN Sales b
WHERE (b.DayCount <= a.DayCount) AS RunningTotal
GROUP BY a.DayCount,a.Sales
ORDER BY a.DayCount,a.Sales

Solution 1 uses a temporary table and a cursor and executes extremely fast. I must admit that my original intent was to show this solution as the way *not* to solve the problem. In general, I tell developers to avoid cursors at all costs. In this particular case, however, the cursor approach is much quicker than the single SELECT solutions.

Solution 2 is the one that came to mind when I originally read the question. I attribute the solution to Joe Celko, because it can be found in his SQL for Smarties book and I'm sure I learned about it in one of his posts in the SQL Server newsgroups.

The tricky part about this solution is the use of a correlated subquery and the COALESCE function. A correlated subquery is one that executes for each row in the outer query. The inner query sums the Sales for rows with a DayCount value less than the current row.

The COALESCE function returns the first non-NULL value in its list of values. The first row processed returns a NULL for SUM(Sales) because there is no DayCount less than it's value. When this happens a 0 is returned and added to the Sales value.

Solution 3 is labeled the "Guru's Guide" because it can be found in Ken Henderson's enormously popular "The Guru's Guide to Transact-SQL." I would have never come up with this solution, but I guess that's what makes Ken the Guru and me a Guru wannabee.

This solution uses a CROSS JOIN and table aliases to join the Sales table with itself. This causes each row in the left table (Sales a) to be joined with each row in the right table (Sales b) where the DayCount in b in less than the DayCount in a. The SUM(b.Sales) and the GROUP BY a.DayCount, a.Sales then allow the running total for each row to be calculated. If you are having trouble figuring out how this works draw a layout of two instances of Sales side-by-side, and then draw lines from a row in the left table to the rows in the right table that meet the WHERE condition.

When I first ran the tests I did not have a clustered index on DayCount. Upon realizing DayCount is a prime candidate for a clustered index, I added it and re-ran the tests. The execution time for Solution 3 was significantly reduced after the index was created because of the number of rows joined by the CROSS JOIN.

And as an FYI, the server I used to tests these solutions is a Compaq ML370 with dual 600's and 1 GB RAM--your results may vary.

Discuss this article: 21 Comments so far. Print this Article. This page has been read 84,517 times.

If you like this article you can sign up for our newsletter. We send it out each week that we post a new article. There's an opt-out link at the bottom of each newsletter so it's easy to unsubscribe at any time.

Email Address:

Email ThisSubscribe to this feedKick itSave to del.icio.usView blog reactions

Related Articles

Joining to the Next Sequential Row (2 April 2008)

Writing Outer Joins in T-SQL (11 February 2008)

How to Use GROUP BY with Distinct Aggregates and Derived tables (31 July 2007)

How to Use GROUP BY in SQL Server (30 July 2007)

SQL Server 2005: Using OVER() with Aggregate Functions (21 May 2007)

Server Side Paging using SQL Server 2005 (4 January 2007)

Using XQuery, New Large DataTypes, and More (9 May 2006)

Counting Parents and Children with Count Distinct (10 January 2006)

Other Recent Forum Posts

Max() question (6 Replies)

SQL Newbie with a join question (3 Replies)

converting time question (4 Replies)

How to fix a table with a single bad record/index (6 Replies)

Update using two tables (3 Replies)

sub data in table (16 Replies)

Sample database (5 Replies)

foreign key problem.. (6 Replies)

Subscribe to SQLTeam.com

Weekly SQL Server newsletter with articles, forum posts, and blog posts via email:

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

- Sponsor's Message -

SQLShare.com Videos

Renaming a Database

You won't do it often, but it's nice to know how, and you're not still using sp_renamedb are you? Join us for a quick look at how to use the alter syntax to change the db name along with a tip on how to quickly disconnect any remaining users from the database.

File Share Subscriptions in Reporting Services

Whether you want to generate PDF invoices for customers or do a daily export that will get processed by one of your vendors, the ability to deliver reports to a file share is a useful and simple feature baked in to Reporting Services. In this lesson Devin will show you how to do it and how to set most of the common options.

File Share Subscriptions in Reporting Services

Whether you want to generate PDF invoices for customers or do a daily export that will get processed by one of your vendors, the ability to deliver reports to a file share is a useful and simple feature baked in to Reporting Services. In this lesson Devin will show you how to do it and how to set most of the common options.

Using DB_ID and DB_Name Functions

Simple but effective, DB_ID and DB_Name give you a concise way to look up the id of a database from a name, or look up the name of a database from an id. There are times when you'll need to write the join to sys.sysdatabases, but when all you need is a quick conversion, these functions get it done.

Using @@Total_Read and @@Total_Write

Ever wondered how many physical reads and writes your SQL service is doing? Yes, it may seek a bit geeky, but it's often useful to have an idea if a server is read or write heavy. @@Total_Read and @@Total_Write show you the number of physical (not logical) reads since the last service restart, and from there we can easily calculate a percentage of reads if needed.