Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Database Design and Application Architecture
 Dealing with large number of columns

Author  Topic 

dextrous
Starting Member

15 Posts

Posted - 2010-02-03 : 04:01:29
Hello,
I'm struggling with a very large table in my application. There's one column with unique values (around 24 million rows) but there are around 120 other columns in the same database. Depending upon the client's input 2 or 3 or multiple values are compared (its a zipcode profile column) and it outputs the highest value.

When a client gives around a million of these, we are facing performance issues. At first I thought it might be large number of rows in the db, so I created views (each range has around only 500,000 records), so around 48 views. But the problem, I think, is also with the large number of columns in the table.

Is there an efficient way to handle large # of columns in a db like there's the views concept of large # of rows?

harsh_athalye
Master Smack Fu Yak Hacker

5581 Posts

Posted - 2010-02-03 : 04:07:32
Do you need all the columns as output always? Is the table normalized? 120 seems to be quite a large number of columns

Harsh Athalye
http://www.letsgeek.net/
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-02-03 : 04:11:59
Need to see a Query Plan for a "typical" query that is "too slow" please
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2010-02-03 : 07:24:13
Are you on 2008? If so you can look in to sparse columns...if you have a lot of NULLs in these 120 columns they could prove really handy. Sparse columns are stored as xml, hence if there is no value for one of the 120 columns nothing will be stored.

If this is not suitable, you could look at both vertical and horizontal partitioning...

- Lumbago
If the facts don't fit the theory, change the facts. Albert Einstein
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-02-03 : 07:50:06
@Lumbago: I don't think the presence of 120 columns is making performance bad (particularly if the values in them are sparse!

I reckon it will be an optimisation issue, but if that doesn't work then you are right about Partitioning.

24M rows is not really a "huge" table, I'd class it as "big".

If most of those 120 columns are TEXT/VARCHAR(MAX), and never part of the WHERE clause, I might want to shift them out of the main table into a 1:1 sibbling-table ....
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2010-02-03 : 08:18:35
Well...I agree that the sparse columns might not be what dextrous is looking for after thinking a little more about it. It's more of a space saver than a performance too so I guess partitioning is the keyword to remember.

@dextrous: maybe you can post a part of your table and the queries you usually run...maybe there is room for optimizations?

- Lumbago
If the facts don't fit the theory, change the facts. Albert Einstein
Go to Top of Page

dextrous
Starting Member

15 Posts

Posted - 2010-02-04 : 02:12:38
ALTER

Procedure [dbo].[STR_SP_GetCultureByZcode] (

@Zipcode

INT,
@Querystr

Varchar(200),
@TblName

Varchar(100),
@existsZip

INT OUTPUT )

AS

Begin

SET

NOCOUNT ON;

DECLARE @Result VARCHAR(8000),
@CondStmt1

nvarchar(200)

--check Whether record exist in table with this zipcode


Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = ' + Convert(varchar(30),@Zipcode)

EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT


If(@existsZip > 0) -- If Exists get the data

Begin

Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM (SELECT * FROM ' + @TblName + ' WHERE ZipData =' + Convert(varchar(30),@Zipcode) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

End

else -- If not exist take only 5 digits from zipcode and pass as input to query
Begin

if(Exists(Select ZipData from CountryOrigin_View0 Where Left(ZipData,5) = @Zipcode))

Begin

Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM (SELECT * FROM CountryOrigin_View0 WHERE ZipData =' + Convert(varchar(30),Left(@Zipcode,5)) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

End

End

Exec(@Result) End

GO

SET

ANSI_NULLS OFF
GO

SET

QUOTED_IDENTIFIER OFF
GO


Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-02-04 : 03:01:46
[code]
Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = @Zipcode' + Convert(varchar(30),@Zipcode)

EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT, @Zipcode INT', @existsZip OUTPUT, @Zipcode
[/code]
this will improve the chances of the query plan being cached

Change:
[code]
if(Exists(Select ZipData from CountryOrigin_View0 Where Left(ZipData,5) = @Zipcode))
[/code]
to first reduce @Zipcode to 5 characters - e.g.
[code]
SELECT @Zipcode = CONVERT(int, CONVERT(varchar(5), @Zipcode))
[/code]
then do a range test on ZipData (you will need an index on ZipData column) and assuming I have understood ZipData usage correctly and my maths is reasonable
[code]
if(Exists(Select * from CountryOrigin_View0
Where ZipData >= @Zipcode * 10
AND ZipData < (@Zipcode + 1) * 10

[/code]
Then, because you modified @Zipcode to 5 digits already, you can use the same query for both routes:
[code]
Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM ([blue]SELECT * FROM[blue] ' + @TblName + ' WHERE ZipData = @Zipcode ) AS t1
UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'
[/code]
and then use sp_ExecuteSQL instead of EXEC :
[code]
EXEC sp_executesql @Result, N'@Zipcode INT', @Zipcode
[/code]
again, this will improve the chances of the query plan being cached

Can you replace SELECT * with
[code]
SELECT ' + @Querystr + ' FROM
[code]
thus avoiding pulling all 120 columns? (There may be some extra columns you need in addition to the ones in @Querystr)

I also question whether your logic is right for the second IF

You do IF EXISTS based on LEFT(ZipData, 5), and then your actual UNPIVOT WHERE clause is ZipData = LEFT(@ZipCode, 5) - which is a very different thing!
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2010-02-04 : 03:19:08
And I don't really see the need for the derived table in the "SET @Result"-query. Unless I'm missing something here the following will be far more efficient:

Set @Result = 'SELECT TOP 1 WITH TIES theCol FROM CountryOrigin_View0 WHERE ZipData =' + Convert(varchar(30),Left(@Zipcode,5)) + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ') ORDER BY theValue DESC'

- Lumbago
If the facts don't fit the theory, change the facts. Albert Einstein
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-02-04 : 03:27:49
I've never used UNPIVOT (I'm normal!!!) so have no idea what works in that regard. Useful to know Lumbago, thanks.
Go to Top of Page

satish.gorijala
Posting Yak Master

182 Posts

Posted - 2010-02-04 : 03:55:18
Hi kristen,

I didn't understand your statement. check the bold one below.
I am getting error when i try to execute this statment

Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = @Zipcode' + Convert(varchar(30),@Zipcode)EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT, @Zipcode INT', @existsZip OUTPUT, @Zipcode

Exec STR_SP_GetCultureByZcode_New 85260,'INDI,CHIN,JPAN','CountryOrigin_View0',1

Msg 137, Level 15, State 2, Line 1
Must declare the scalar variable "@Zipcode85260".

developer :)
Go to Top of Page

dextrous
Starting Member

15 Posts

Posted - 2010-02-04 : 04:02:17
satish works with me...so he will also have questions :)

thanks for your help, kristen & lumbago
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-02-04 : 09:56:51
Did you delete the bit in Blue?

Put a PRINT statement in the SProc so you can see what is in @CondStmt1 and @Result before it is executed (comment out the EXEC / SP_ExecuteSQL temporarily if the error is preventing you seeing the PRINT debug message)
Go to Top of Page

satish.gorijala
Posting Yak Master

182 Posts

Posted - 2010-02-07 : 04:58:46
Hi kristen, i didn't delte the blue one from your suggestion. bcz i am getting some syntax error. I have refined stored proc with some of ur suggestions. please give suggestions from below stored proc what can i do more to optimize this thing and get results faster.


ALTER Procedure [dbo].[STR_SP_GetCultureByZcode]
(
@Zipcode INT,
@Querystr Varchar(200),
@TblName Varchar(100),
@existsZip INT OUTPUT
)
AS
Begin
SET NOCOUNT ON;
DECLARE @Result VARCHAR(8000),
@CondStmt1 nvarchar(200),
@Zip5 varchar(50),
@Zcode Varchar(50)

Set @Zcode = Convert(varchar(30),@Zipcode)
Set @Zip5 = Convert(varchar(30),Left(@Zipcode,5))

--check Whether record exist in table with this zipcode
Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = ' + @Zcode
EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT

If(@existsZip > 0) -- If Exists get the data
Begin
Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM ' + @TblName + ' WHERE ZipData =' + @Zcode + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'
End
else -- If not exist take only 5 digits from zipcode and pass as input to query
Begin
Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM CountryOrigin_View0 WHERE ZipData =' + @Zip5 + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'
End

Exec(@Result)
End
GO


quote:
Originally posted by Kristen

Did you delete the bit in Blue?

Put a PRINT statement in the SProc so you can see what is in @CondStmt1 and @Result before it is executed (comment out the EXEC / SP_ExecuteSQL temporarily if the error is preventing you seeing the PRINT debug message)



developer :)
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-02-08 : 02:39:27
"i didn't delte the blue one from your suggestion. bcz i am getting some syntax "

Yeah, well you will do if you don't delete the piece marked strikethrough.

I marked it up so you would be able to see the changes I made and from that understand the changes. ...
Go to Top of Page
   

- Advertisement -