SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 General SQL Server Forums
 Database Design and Application Architecture
 Dealing with large number of columns
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

dextrous
Starting Member

15 Posts

Posted - 02/03/2010 :  04:01:29  Show Profile  Reply with Quote
Hello,
I'm struggling with a very large table in my application. There's one column with unique values (around 24 million rows) but there are around 120 other columns in the same database. Depending upon the client's input 2 or 3 or multiple values are compared (its a zipcode profile column) and it outputs the highest value.

When a client gives around a million of these, we are facing performance issues. At first I thought it might be large number of rows in the db, so I created views (each range has around only 500,000 records), so around 48 views. But the problem, I think, is also with the large number of columns in the table.

Is there an efficient way to handle large # of columns in a db like there's the views concept of large # of rows?

harsh_athalye
Flowing Fount of Yak Knowledge

India
5581 Posts

Posted - 02/03/2010 :  04:07:32  Show Profile  Visit harsh_athalye's Homepage  Click to see harsh_athalye's MSN Messenger address  Send harsh_athalye a Yahoo! Message  Reply with Quote
Do you need all the columns as output always? Is the table normalized? 120 seems to be quite a large number of columns

Harsh Athalye
http://www.letsgeek.net/
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 02/03/2010 :  04:11:59  Show Profile  Reply with Quote
Need to see a Query Plan for a "typical" query that is "too slow" please
Go to Top of Page

Lumbago
Norsk Yak Master

Norway
3271 Posts

Posted - 02/03/2010 :  07:24:13  Show Profile  Reply with Quote
Are you on 2008? If so you can look in to sparse columns...if you have a lot of NULLs in these 120 columns they could prove really handy. Sparse columns are stored as xml, hence if there is no value for one of the 120 columns nothing will be stored.

If this is not suitable, you could look at both vertical and horizontal partitioning...

- Lumbago
If the facts don't fit the theory, change the facts. Albert Einstein
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 02/03/2010 :  07:50:06  Show Profile  Reply with Quote
@Lumbago: I don't think the presence of 120 columns is making performance bad (particularly if the values in them are sparse!

I reckon it will be an optimisation issue, but if that doesn't work then you are right about Partitioning.

24M rows is not really a "huge" table, I'd class it as "big".

If most of those 120 columns are TEXT/VARCHAR(MAX), and never part of the WHERE clause, I might want to shift them out of the main table into a 1:1 sibbling-table ....
Go to Top of Page

Lumbago
Norsk Yak Master

Norway
3271 Posts

Posted - 02/03/2010 :  08:18:35  Show Profile  Reply with Quote
Well...I agree that the sparse columns might not be what dextrous is looking for after thinking a little more about it. It's more of a space saver than a performance too so I guess partitioning is the keyword to remember.

@dextrous: maybe you can post a part of your table and the queries you usually run...maybe there is room for optimizations?

- Lumbago
If the facts don't fit the theory, change the facts. Albert Einstein
Go to Top of Page

dextrous
Starting Member

15 Posts

Posted - 02/04/2010 :  02:12:38  Show Profile  Reply with Quote
ALTER

Procedure [dbo].[STR_SP_GetCultureByZcode] (

@Zipcode

INT,
@Querystr

Varchar(200),
@TblName

Varchar(100),
@existsZip

INT OUTPUT )

AS

Begin

SET

NOCOUNT ON;

DECLARE @Result VARCHAR(8000),
@CondStmt1

nvarchar(200)

--check Whether record exist in table with this zipcode


Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = ' + Convert(varchar(30),@Zipcode)

EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT


If(@existsZip > 0) -- If Exists get the data

Begin

Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM (SELECT * FROM ' + @TblName + ' WHERE ZipData =' + Convert(varchar(30),@Zipcode) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

End

else -- If not exist take only 5 digits from zipcode and pass as input to query
Begin

if(Exists(Select ZipData from CountryOrigin_View0 Where Left(ZipData,5) = @Zipcode))

Begin

Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM (SELECT * FROM CountryOrigin_View0 WHERE ZipData =' + Convert(varchar(30),Left(@Zipcode,5)) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

End

End

Exec(@Result) End

GO

SET

ANSI_NULLS OFF
GO

SET

QUOTED_IDENTIFIER OFF
GO


Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 02/04/2010 :  03:01:46  Show Profile  Reply with Quote

Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = @Zipcode' + Convert(varchar(30),@Zipcode) 

EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT, @Zipcode INT', @existsZip OUTPUT, @Zipcode

this will improve the chances of the query plan being cached

Change:

if(Exists(Select ZipData from CountryOrigin_View0 Where Left(ZipData,5) = @Zipcode))

to first reduce @Zipcode to 5 characters - e.g.

SELECT @Zipcode = CONVERT(int, CONVERT(varchar(5), @Zipcode))

then do a range test on ZipData (you will need an index on ZipData column) and assuming I have understood ZipData usage correctly and my maths is reasonable

if(Exists(Select * from CountryOrigin_View0 
Where ZipData >= @Zipcode * 10
      AND ZipData < (@Zipcode + 1) * 10

Then, because you modified @Zipcode to 5 digits already, you can use the same query for both routes:

Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM ([blue]SELECT * FROM[blue] ' + @TblName + ' WHERE ZipData = @Zipcode ) AS t1
UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC' 

and then use sp_ExecuteSQL instead of EXEC :

EXEC sp_executesql @Result, N'@Zipcode INT', @Zipcode

again, this will improve the chances of the query plan being cached

Can you replace SELECT * with

SELECT ' + @Querystr + ' FROM

thus avoiding pulling all 120 columns? (There may be some extra columns you need in addition to the ones in @Querystr)

I also question whether your logic is right for the second IF

You do IF EXISTS based on LEFT(ZipData, 5), and then your actual UNPIVOT WHERE clause is ZipData = LEFT(@ZipCode, 5) - which is a very different thing!

Edited by - Kristen on 02/04/2010 03:07:50
Go to Top of Page

Lumbago
Norsk Yak Master

Norway
3271 Posts

Posted - 02/04/2010 :  03:19:08  Show Profile  Reply with Quote
And I don't really see the need for the derived table in the "SET @Result"-query. Unless I'm missing something here the following will be far more efficient:

Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM CountryOrigin_View0 WHERE ZipData =' + Convert(varchar(30),Left(@Zipcode,5)) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ') ORDER BY theValue DESC'

- Lumbago
If the facts don't fit the theory, change the facts. Albert Einstein
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 02/04/2010 :  03:27:49  Show Profile  Reply with Quote
I've never used UNPIVOT (I'm normal!!!) so have no idea what works in that regard. Useful to know Lumbago, thanks.
Go to Top of Page

satish.gorijala
Posting Yak Master

India
182 Posts

Posted - 02/04/2010 :  03:55:18  Show Profile  Reply with Quote
Hi kristen,

I didn't understand your statement. check the bold one below.
I am getting error when i try to execute this statment

Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = @Zipcode' + Convert(varchar(30),@Zipcode)EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT, @Zipcode INT', @existsZip OUTPUT, @Zipcode

Exec STR_SP_GetCultureByZcode_New 85260,'INDI,CHIN,JPAN','CountryOrigin_View0',1

Msg 137, Level 15, State 2, Line 1
Must declare the scalar variable "@Zipcode85260".

developer :)

Edited by - satish.gorijala on 02/04/2010 04:44:14
Go to Top of Page

dextrous
Starting Member

15 Posts

Posted - 02/04/2010 :  04:02:17  Show Profile  Reply with Quote
satish works with me...so he will also have questions :)

thanks for your help, kristen & lumbago
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 02/04/2010 :  09:56:51  Show Profile  Reply with Quote
Did you delete the bit in Blue?

Put a PRINT statement in the SProc so you can see what is in @CondStmt1 and @Result before it is executed (comment out the EXEC / SP_ExecuteSQL temporarily if the error is preventing you seeing the PRINT debug message)
Go to Top of Page

satish.gorijala
Posting Yak Master

India
182 Posts

Posted - 02/07/2010 :  04:58:46  Show Profile  Reply with Quote
Hi kristen, i didn't delte the blue one from your suggestion. bcz i am getting some syntax error. I have refined stored proc with some of ur suggestions. please give suggestions from below stored proc what can i do more to optimize this thing and get results faster.


ALTER Procedure [dbo].[STR_SP_GetCultureByZcode]          
(          
  @Zipcode INT,                
  @Querystr Varchar(200),            
  @TblName Varchar(100),            
  @existsZip INT OUTPUT       
)                
AS
Begin       
SET NOCOUNT ON;              
 DECLARE @Result VARCHAR(8000),      
  @CondStmt1 nvarchar(200),
  @Zip5 varchar(50),
  @Zcode Varchar(50)

Set @Zcode = Convert(varchar(30),@Zipcode)
Set @Zip5 = Convert(varchar(30),Left(@Zipcode,5))         
 
 --check Whether record exist in table with this zipcode        
Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = ' + @Zcode
EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT        
      
If(@existsZip > 0)  -- If Exists get the data
Begin      
   Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM ' + @TblName + '  WHERE ZipData =' + @Zcode + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'                 
End      
else  -- If not exist take only 5 digits from zipcode and pass as input to query    
Begin       
    Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM CountryOrigin_View0  WHERE ZipData =' + @Zip5 + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'             
End          

Exec(@Result)
End
GO


quote:
Originally posted by Kristen

Did you delete the bit in Blue?

Put a PRINT statement in the SProc so you can see what is in @CondStmt1 and @Result before it is executed (comment out the EXEC / SP_ExecuteSQL temporarily if the error is preventing you seeing the PRINT debug message)



developer :)
Go to Top of Page

Kristen
Test

United Kingdom
22415 Posts

Posted - 02/08/2010 :  02:39:27  Show Profile  Reply with Quote
"i didn't delte the blue one from your suggestion. bcz i am getting some syntax "

Yeah, well you will do if you don't delete the piece marked strikethrough.

I marked it up so you would be able to see the changes I made and from that understand the changes. ...
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.12 seconds. Powered By: Snitz Forums 2000