Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Indexes, Case Statements and Group Bys

Author  Topic 

davidpardoe
Constraint Violating Yak Guru

324 Posts

Posted - 2002-01-04 : 06:59:31
I am trying to figure out the most efficient method to do the following...

I have a table of raw data, eg.
URN CATEGORY
1 A
2 A
3 B
4 B
5 C
6 D

Now I want to run a group by query on this table where the group by variable is a derivation of the CATEGORY field, namely a CASE statement ie.-

SELECT
case
when CATEGORY='A' or CATEGORY='B' then 'XX'
when CATEGORY='C' or CATEGORY='D' then 'YY'
else 'ZZ' end
,count(*)
FROM RAW_DATA
GROUP BY
case
when CATEGORY='A' or CATEGORY='B' then 'XX'
when CATEGORY='C' or CATEGORY='D' then 'YY'
else 'ZZ' end

For ease of coding I am keen to create a temporary table that runs the CASE statement first on which I can then run the group by query.

Anyone got any thoughts as to which will be most efficient and where I should be creating indexes?

My actual application is far more complicated than this - complicated case statements, multiple raw data tables which I want to append to each other, lots of records (3 million+), lots of fields (200+).

Any pointers would be most helpful!

============
The Dabbler!

AjarnMark
SQL Slashing Gunting Master

3246 Posts

Posted - 2002-01-04 : 16:59:45
David,

Have you considered creating a computed column on your table? I haven't used these, so I don't know of any performance implications, but here is a clip out of the Alter Table sectin of BOL (SQL 7).

quote:
computed_column_expression
Is an expression that defines the value of a computed column. A computed column is a virtual column that is not physically stored in the table, but it is computed from an expression using other columns in the same table. For example, a computed column could have the definition: cost AS price * qty. The expression can be a noncomputed column name, constant, function, variable, and any combination of these connected by one or more operators. The expression cannot be a subquery.

Computed columns can be used in select lists, WHERE clauses, ORDER BY clauses, or any other locations where regular expressions can be used, with these exceptions:

- A computed column cannot be used as a key column in an index or as part of any PRIMARY KEY, UNIQUE, FOREIGN KEY, or DEFAULT constraint definition.

- A computed column cannot be the target of an INSERT or UPDATE statement.



--------------------------------------------------------------
1000 Posts, Here I come! I wonder what my new title will be...
Go to Top of Page

mono
Starting Member

36 Posts

Posted - 2002-01-07 : 10:55:36
Alternatively create a table for the mapping you do in the CASE and then join.


table MyCats:
CATEGORY mycat
--------------
A XX
B XX
C YY
D YY


SELECT
mycats.mycat,count(*)
FROM RAW_DATA inner join mycats on RAW_DATA.Category = mycats.category
GROUP BY mycats.category



Index on RAW_DATA.CATEGORY (need not be clustered).

michael

Go to Top of Page

davidpardoe
Constraint Violating Yak Guru

324 Posts

Posted - 2002-01-07 : 11:00:32
Thanks mono this is a good idea for categorical variable.

However, some of my variables are numeric which I wish to band (ie. my case statement would be something like:

CASE WHEN variable < 100 then '<100' WHEN variable < 200 then '101-200' ... etc.

There may be a way of doing this with a table in a similar fashion - I'll have a think.



============
The Dabbler!
Go to Top of Page
   

- Advertisement -