Query performance: multiple sums on 40,000 rows...

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

Query performance: multiple sums on 40,000 rows...

Author

Topic

PITU
Starting Member

4 Posts

Posted - 2009-10-26 : 19:14:36

Hello everyone,

I would like a few tips on how to boost performance of a set of queries.

I have a table containing at least 40,000 transactions so far, and the table is growing fast. The table contains fields such as [TransactionId], [TransactionType], [TransactionDate], [InPunchTime], [OutPunchTime], [WeekId], etc. I have a .NET web app that serves as the DB interface. The database is designed to store transactions that are later evaluated and grouped by date or some other ID, such as [WeekId].

Somewhere in code, I have need to come up with various totals of elapsed time between [InPunchTime] and [OutPunchTime], which I accomplish using the datediff function. Rather than store the time elapsed between In and Out, I use SQL to calculate it. I have to run many function-based queries to get numbers that I later sum together.

So for example, let's say I insert several transactions like this:
/*query 1*/
INSERT INTO
[TransactionTable]
(TransactionId, TransactionType, TransactionDate,
InPunchTime, OutPunchTime, WeekId)
VALUES
(1, '1', #1/1/2001#, #15:00#, #17:00#, 1);
and then:
(2, '1', #1/2/2001#, #11:00#, #17:00#, 1);
and then:
(3, '2', #1/1/2001#, #15:00#, #17:00#, 1);
and then:
(4, '2', #1/1/2001#, #14:00#, #17:00#, 1);

[TransactionType] can be grouped; you can see that many rows can have the same [TransactionType]. Also, I can use the DateDiff function to get each row's elapsed time between [InPunchTime] and [OutPunchTime] as [TotalDailyTime], and then I can use the SUM function to get [GrandTotalTime]. Right now, I am doing as described and am performing DateDiff and the SUM function in the same query with a WHERE clause as follows:

/*query 2*/
SELECT round(Sum((DateDiff('n', [InPunchTime], [OutPunchTime]) / 60)),2) AS [TotalDailyTime] FROM [TransactionTable] WHERE [TransactionType]=1 AND [WeekId]=1;

My .NET application fetches a distinct list of WeekId's; it then executes a loop for each WeekId and runs the queries, as this pseudo-code example shows:

1. SELECT Distinct(WeekId) FROM [TransactionTable];
2. Store result in an array/dataset/whatever
3. For Each Row In array/dataset/whatever:
- open database connection
- execute query 2
- do some logic on the number returned from the query
- close database connection
Loop
4. End Pseudo Code

So really, let's say there are 30 WeekId's, and there can be many transactions for each [WeekId]. So there is the overhead of having to loop through each [WeekId], of having to SUM all of the transactions for that [WeekId], AND of having to open and close a connection to the database!

I believe there is another way that does not require opening and closing a connection 8 billion times. I am working on that. I also believe there may be a better way to do the queries. Your opinions are welcome regarding this! And I believe that migrating from Access to SQL Express 2008 will also provide a performance boost.

The app and the code ran fine 30,000 transactions ago. Rather than archive transactions, I would like to boost efficiency.

Does anyone have any specific tips that I might use to boost performance? I can be more specific if need be.

Basically, I have other queries that have to run also, such as modifying the WHERE clause like this:
... WHERE [TransactionType]=2 AND [WeekId]=1;
... WHERE [TransactionType]=4 AND [WeekId]=3;

The end result is, I think, a lot of unecessary overhead/processing/repeating of queries! The combined queries take like 60 seconds and I'd like to reduce that. I think *most* of the problem is in opening and closing the database connection and 2nd to that, how the queries are defined. And I haven't even discussed indexing yet.

Any help is appreciated.

PITU
Starting Member

4 Posts

Posted - 2009-10-27 : 10:55:00

Let's assume I rewrite the code so that the connection to the DB stays alive until all querying is done - rather than opening/closing.... Would there be much benefit in combining the queries? Such as:

SELECT SUM(SELECT SUM(Field1) AS [SumField0]) AS [SumField1] or something like that?

Then how would I get the original Field1 value (prior to SUMMATION)?

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources