Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Extremely slow query

Author  Topic 

Carleon123
Starting Member

9 Posts

Posted - 2004-07-13 : 15:55:12
Hello everyone, I was hoping someone would be able to help me come up with a solution for the task at hand. I'm sorry if a similar topic exists elsewhere, I was not able to find one. The situation is as follows:

I need to create reports on our company's sales data based on selected criteria. The problem I have is that there are two main tables being used (this years sales data and last years) that are about 10,000,000 records each. On top of that, I need to join them with about 4-5 lookup tables before querying them. At the moment, I am just using an inline sql query to do this. I am aware that this probably not even close to the best option, but I do not know what is. The current query may take a number of minutes (2-3) before returning the data (which, btw, is not a big set since all the fields are Sum values except the GROUP BY field).

Could anyone suggest ways of seriously improving these query speeds? (Sorry if this is a very simple question for most of you).

Any help is greatly appreciated!

Mike

kselvia
Aged Yak Warrior

526 Posts

Posted - 2004-07-13 : 16:00:53
Post the query you are using and we will see what we can do.


--Ken
Your Kung-Fu is not strong. -- 'The Core'
Go to Top of Page

Carleon123
Starting Member

9 Posts

Posted - 2004-07-13 : 16:23:02
Well, here is a slightly simplified version of the query (it does not include last year's data). This one, by itself is running very slow, I can only imagine how much slower it would be when UNIONized with last years.

SELECT     dbo.lu_branchesTbl.sales_centre, SUM(dbo.Mike_SalesCYCN.volume) AS total_vol, SUM(dbo.Mike_SalesCYCN.cogs) AS total_cogs, 
SUM(dbo.Mike_SalesCYCN.dnnsi) AS total_nsi, SUM(dbo.Mike_SalesCYCN.dngp) AS total_gp, SUM(dbo.Mike_SalesCYCN.cma) AS total_cma
FROM dbo.lu_channelsTbl INNER JOIN
dbo.Mike_SalesCYCN INNER JOIN
dbo.lu_carbonatedTbl ON dbo.Mike_SalesCYCN.carbonated = dbo.lu_carbonatedTbl.carbonated ON
dbo.lu_channelsTbl.channel_key = dbo.Mike_SalesCYCN.channel_key INNER JOIN
dbo.lu_5brandsTbl ON dbo.Mike_SalesCYCN.brand_key = dbo.lu_5brandsTbl.brnd_key INNER JOIN
dbo.lu_bustypeTbl ON dbo.Mike_SalesCYCN.bustyp = dbo.lu_bustypeTbl.business_type INNER JOIN
dbo.lu_branchesTbl ON dbo.Mike_SalesCYCN.branch_key = dbo.lu_branchesTbl.branch_key INNER JOIN
dbo.lu_packagesTbl ON dbo.Mike_SalesCYCN.pkg_key = dbo.lu_packagesTbl.package_key INNER JOIN
dbo.lu_smf2003xTbl ON dbo.Mike_SalesCYCN.carbonated = dbo.lu_smf2003xTbl.carbonated AND
dbo.Mike_SalesCYCN.pkg_key = dbo.lu_smf2003xTbl.[Package Code] AND
dbo.Mike_SalesCYCN.brand_key = dbo.lu_smf2003xTbl.[Brand Code]
GROUP BY dbo.lu_branchesTbl.sales_centre



Did I say 4-5 JOINS? There are 7 here - and thats after de-normalizing 3 of the look_up tables. One of the bigger obstacles, I think, is the lu_smf2003xtbl table because the values of the corresponding field of that table has to be multiplied to the main sales data to get the proper results. I.e. for a region of, say, Ontario, the sales data will have to be multiplied by a different value than for Quebec.

I hope I'm explaining things well enough, please let me know if I am leaving anything out! Thanks again.
Go to Top of Page

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2004-07-13 : 16:46:30
do you have single column indexes on all the keys you join on?
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2004-07-13 : 16:52:20
What does the execution plan show you?

And which are the 10 million row tables?



Brett

8-)
Go to Top of Page

Carleon123
Starting Member

9 Posts

Posted - 2004-07-13 : 17:21:51
quote:
Originally posted by eyechart

do you have single column indexes on all the keys you join on?



No, I don't believe so. I have just started working on this project (inherited it, if you will) and there are many things I still need to do - such as indexing.

quote:
And which are the 10 million row tables?

The Mikes_SalesCYCN is the main data table (10 million rows). The other is Mikes_SalesPYCN (which is the same structure with Past Years data) - and is not included in the query above.
Go to Top of Page

gwhiz
Yak Posting Veteran

78 Posts

Posted - 2004-07-13 : 17:47:02
I would start with indexing especially with the 10 million row table make sure that every column used in the joins has an index built.
Go to Top of Page

Carleon123
Starting Member

9 Posts

Posted - 2004-07-15 : 13:02:18
Alright, all the fields used have been indexed. I also simplified the query a bit to the following:

SELECT     dbo.lu_branches.branch_name, SUM(dbo.SalesCYCN.cases) AS total_vol, SUM(dbo.SalesCYCN.cost) AS total_cogs, 
SUM(dbo.SalesCYCN.DeadNet_NSI) AS total_dnnsi, SUM(dbo.SalesCYCN.DeadNet_GP) AS total_gp, SUM(dbo.SalesCYCN.CMA_on_inv) AS total_cma
FROM dbo.lu_channels INNER JOIN
dbo.SalesCYCN ON dbo.lu_channels.channel_key = dbo.SalesCYCN.channel_key INNER JOIN
dbo.lu_brands ON dbo.SalesCYCN.brand_key = dbo.lu_brands.brand_key INNER JOIN
dbo.lu_bustype ON dbo.SalesCYCN.bustyp = dbo.lu_bustype.bustyp INNER JOIN
dbo.lu_branches ON dbo.SalesCYCN.branch_key = dbo.lu_branches.branch_key INNER JOIN
dbo.lu_packages ON dbo.SalesCYCN.pkg_key = dbo.lu_packages.pkg_key
GROUP BY dbo.lu_branches.branch_name



Here is the execution plan for the above query:

  |--Compute Scalar(DEFINE:([Expr1012]=If ([globalagg1018]=0) then NULL else [globalagg1020], [Expr1013]=If ([globalagg1022]=0) then NULL else [globalagg1024], [Expr1014]=If ([globalagg1026]=0) then NULL else [globalagg1028], [Expr1015]=If ([globalagg1030]
|--Stream Aggregate(GROUP BY:([lu_branches].[branch_name]) DEFINE:([globalagg1018]=SUM([partialagg1017]), [globalagg1020]=SUM([partialagg1019]), [globalagg1022]=SUM([partialagg1021]), [globalagg1024]=SUM([partialagg1023]), [globalagg1026]=SUM([parti
|--Sort(ORDER BY:([lu_branches].[branch_name] ASC))
|--Nested Loops(Inner Join, OUTER REFERENCES:([SalesCYCN].[bustyp]))
|--Hash Match(Aggregate, HASH:([SalesCYCN].[bustyp], [lu_branches].[branch_name]), RESIDUAL:([SalesCYCN].[bustyp]=[SalesCYCN].[bustyp] AND [lu_branches].[branch_name]=[lu_branches].[branch_name]) DEFINE:([partialagg1017]=COUNT_BIG([Sa
| |--Hash Match(Inner Join, HASH:([lu_channels].[channel_key])=([SalesCYCN].[channel_key]))
| |--Index Scan(OBJECT:([Sales].[dbo].[lu_channels].[IX_lu_channels_1]))
| |--Hash Match(Inner Join, HASH:([lu_brands].[brand_key])=([SalesCYCN].[brand_key]))
| |--Index Scan(OBJECT:([Sales].[dbo].[lu_brands].[IX_lu_brands]))
| |--Hash Match(Inner Join, HASH:([lu_packages].[pkg_key])=([SalesCYCN].[pkg_key]))
| |--Index Scan(OBJECT:([Sales].[dbo].[lu_packages].[IX_lu_packages_1]))
| |--Hash Match(Inner Join, HASH:([lu_branches].[branch_key])=([SalesCYCN].[branch_key]))
| |--Index Scan(OBJECT:([Sales].[dbo].[lu_branches].[IX_lu_branches]))
| |--Table Scan(OBJECT:([Sales].[dbo].[SalesCYCN]))
|--Index Seek(OBJECT:([Sales].[dbo].[lu_bustype].[PK_lu_bustype]), SEEK:([lu_bustype].[bustyp]=[SalesCYCN].[bustyp]) ORDERED FORWARD)



If I run the above query for the year to date (all rows), it takes about 2 1/2 minutes to run. This does not even include the last year sales data. If anyone has any ideas, I would be greatly appreciative! Feel free to suggest any ideas (such as Stored Procedurs, etc).

Thanks again.
Go to Top of Page
   

- Advertisement -