Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Raptor Drives?

Author  Topic 

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-21 : 14:10:10
So i'm not really new, but got a question. i've recently been looking to to the Western Digital Raptor Drives.

as far as performance, it's always been my understanding that the speed of the hard drive is just about always the bottle-neck of a computer. i'm currently running 2 stripped WD 500gb SATA drives for my SQL server (dual Xeon 2.8 with 2GB memory).

i'm thinking of upgarding to 4 WD Raptors (10k RPM) drives, the new 150GB models. anyone have an opionion? do you think i'll get a large performance increase?

the database that i run queries off of now is about 125 Million names, with about 80 fields in width. so it's rather large, and usally takes a fair abount of time to get my results back (we're talking anywhere from a minute, to half the day.)

do you think the raptors will slim that down significantly?

Kristen
Test

22859 Posts

Posted - 2007-02-21 : 14:25:41
"do you think the raptors will slim that down significantly?"

Like what? Would "from half-a-minute to quarter-of-a-day" be a good improvement?

Are you searching for Names on a wildcard basis? Selecting just a few hits, or a truckload [e.g. to bulk Email]

Are you equally likely to retrieve any one of the Names, or could they be partitioned?

I think you would get more improvement by:

Optimising the application
Increasing RAM
Increasing CPU
Increasing the NUMBER of drives

ahead of Increasing the rotational speed.

But its a bit "it depends" without knowing more about the specifics of the Application

Kristen
Go to Top of Page

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-21 : 14:45:07
well, that would mean a 50% improvement :o) i am database guy (dont quite want to claim the database admin title yet, i'm a far reach from you guys) for a marketing company so yeah, i get a truck load of hits (for example, one i pulled today was 31 million.) they usually range from a few thousand, do a few hundred thousand.

i have the names partitioned out according to how often i pull data out of them, (e.g. i hae one table of people who have mortgages, one of homeowner (no mortgage), and one of people with a phone number)

also, with a 2.8 dual xeon, 2gb of memory, you dont think that was already overkill? i almost never max out the CPU, (usually it's running at around 2-5% capacity, and the memory isnt even half way used. 588 of the total 2gb.

as i said, i'd also be running 4 drives on a stripe, with faster seeks, and a way faster spindle, and less disk surface to scan.

also, when you say, optimise the application. i just to strait pulls out of query analyzer, the only overhead other then actully finding the data, is pulling it from server to client; i'm on a gigabit network, so not much bottle-neck there.

the drives i'm also thinking about getting are FULL-Duplex, as opposed to the half duplex drives i have. (although 99.999% of the quries i run are read)
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-21 : 15:27:44
"and the memory isnt even half way used. 588 of the total 2gb."

That doesn't sound good - given that you have 125M rows. Unless the database is tiny. Errmmm .. how big is the database?

Assuming that Database > Memory then Memory should be all used up (well, not straight after a Reboot, but a day or two later it should be)

"as i said, i'd also be running ..."

Well, the data should be flowing faster into SQL Server. However, why isn't it sort-of fast-enough already?

Maybe the Controller is cheap-and-nasty? !!

"i just to strait pulls out of query analyzer"

OK, but maybe some more appropriate indexes would help, or maybe you are doing:

WHERE ZipCode LIKE @MyZipCode
OR Name LIKE @MyName
OR City LIKE @MyCity

We have queries like that one millions of rows. They are all sub-1-second because we have hand optimised the scenarios where they might be used:

IF @MyZipCode IS NULL AND @MyName IS NOT NULL AND @MyCityIS NOT NULL
BEGIN
... Do this hand tuned Name+City query ...
END
ELSE
...

but if many/most of your queries are pulling 100,000's of rows then there isn't much that will help. That's a lot of data to shift around the place.

Kristen
Go to Top of Page

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-21 : 15:39:22
well, we have all the fields indexed, and split into smaller tables, all in all, i'd say the entire DB is about 500gb
we do however index every column (i know it's dumb, but we dont really modify the data at all, we update once a quarter, so after that, the data is all ready to go, and indexed)

why wouldnt the server be utilizing more memory then? i have 3 servers running, and they are all in the same boat as far as how much memory is being used. all 3 servers are also only used to house the database, i have a copy of the same db on each server, that way i can kinda load balance it when things get busy.

as for the controller, it's an INTEL server board, specifically designed for a server class processor. so i would HOPE it's not a POOP controller. :o)

here is an example of a query i just pulled:


where

state_abbrv='fl'
and est_inc not in ('a','b','c')
and countY_name in ('volusia','seminole','orange','osceola','lake','polk','hillsborough','pinellas','pasco','hernando')


this is off a table that is about 64m names strong. the query netted 783,000 names in about 19 minutes.

it's not BAD considering what i'm doing (atleast i dont think it is) but if i can cut that down to 5 minutes, or even "Sub-1-second" i'd be jumping for JOY!


oh yeah, and that:IF @MyZipCode IS NULL AND @MyName IS NOT NULL AND @MyCityIS NOT NULL
BEGIN
... Do this hand tuned Name+City query ...
END
ELSE
...


thanks for the head ache :\
Go to Top of Page

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-21 : 16:10:11
another quick example:

where

city='shawnee'
and state_abbrv='ok'
and wealth_rating>=6

this one took 11 minutes, and only netted 4000 names. the table i was pulling this out of is about 30m
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-21 : 16:28:18
"i'd say the entire DB is about 500gb"

At that size I reckon it ought to be using every last byte of memory you have available then ...

.. suggest you pursue that a bit further here to see why you aren't KILLING your existing servers RAM - I would hope the learned folk here might be able to shed some light on it.

"we do however index every column"

I think its worth reconsidering that. Every index you have that won't EVER be used (e.g. if its a column with only a YES/NO choice and the data is spread 50:50) will add to the time that the Query Optimiser takes to consider the best plan. Indexes that are not being used should be dropped.

Conversely, it may be worthwhile setting up some composite indexes (multiple keys in a single index) if they will "cover" some commonly-used queries.

SELECT MyName
FROM NamesTable
WHERE MyZipCode = 'xxx'

will perform extremely well with a composite index on "MyZipCode, MyName"

Also VERY important to make sure that the indexes are well maintained (optimal density, and defragged often; plus Statistics need to be updated regularly so they provide the best "hits" to optimiser - you may need to use FULL_SCAN on some of the Update Statistics to get a good enough sample.

Of course there is a law of diminishing returns if you have lots of different queries! But on the 80:20 rule its worth looking at your most-commonly-used-queries

where
state_abbrv='fl'
and est_inc not in ('a','b','c')
and countY_name in ('volusia','seminole','orange','osceola','lake','polk','hillsborough','pinellas','pasco','hernando')

I realise that these are probably largely "ad-hoc", so a bit hard to plan for.

But I think this one might be interesting to try some variations on. IME the Query Planner does a fine job, but as the database gets bigger and the queries get more varied, its often possible to outwit it.

Assuming that

state_abbrv='fl'
and est_inc not in ('a','b','c')

occurs quite "commonly" in the data (you'll know the answer to that far better than my Pure-Guess!) then it might be quicker to do:

DECLARE @TempTable TABLE
(
MyID int NOT NULL,
PRIMARY KEY
(
MyID
)
)

INSERT INTO @TempTable(MyID)
SELECT MyID
FROM dbo.MyNamesTable
WHERE countY_name in ('volusia','seminole','orange','osceola','lake','polk','hillsborough','pinellas','pasco','hernando')

SELECT ... columns ...
FROM dbo.MyNamesTable AS N
JOIN @TempTable AS T
ON T.MyID = N.MyID
WHERE N.state_abbrv='fl'
AND N.est_inc not in ('a','b','c')

(If I've got the "commonly occurring" round the wrong way then just swap what goes into @TempTable, and put the more-common bit in the second stage filter)

So basically put the "least-matches" data into a TempTable, and then JOIN that with the main table and the rest of the WHERE clause

NOTES:

1. Make sure you pre-create the table

(for your data volume "SELECT MyID INTO #MyTempTable FROM MyTable WHERE ..." type coding is going to be REALLY bad on locking in TEMPDB)

2. Make sure you have a Primary key on the Temp Tables

3-ish. If you are expecting lots of rows in the resultset then use #TempTable rather than @TableVar - I'm not sure what the sweet-spot is, you'll need to experiment.

"thanks for the head ache :\"

Yeah, well I suppose I was expecting that reply!

The issue is whether it is worth spending, say, 1/2 a day experiment and "building" knowledge/shell-queries, or fire-fighting the requirements instead!

I'm in the "Do prudent development as a priority over getting quick results" camp ...

Kristen
Go to Top of Page

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-21 : 16:48:57
okay, one thing at a time. wow, now my head really hurts.

SELECT MyName
FROM NamesTable
WHERE MyZipCode = 'xxx'


will perform extremely well with a composite index on "MyZipCode, MyName"


i tried to figure out how to create composiie indexes, and ended up crying my self to sleep. what would i do to do one for the zip field? alot of counts we run will have a bunch of zipcodes, rarely do we pull one zipcode. will this tweak have the same affect?


also, the second query is assuming my original table as a primary key correct?
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-21 : 17:07:53
If SQL Server is using no more than 500 MB of memory, then you are probably not using the disk, except when data is first read into memory. I find this a little hard to believe with the size of the tables you described. Frankly, the numbers you have posted don't really seem likely.

What is output you get when you run this script in your database?
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=61762








CODO ERGO SUM
Go to Top of Page

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-21 : 17:29:34
the page file usage is at 1.86gb (and that's of what's totally available, physical memory, and swap) , it's a strait line, and i've been watching it all day, i dont think it's moved a milimeter.

as for what i output, i dump about 13 of the 83 available fields.

firstname,lastname,fullname,address,city,state,zip,phone,mortgage date,lender,mortgage amount,wealth rating,income,age
Go to Top of Page

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-21 : 17:31:10
i'll run that script to. is it gunna kill my server? should i wait until tonight and run it while everyone's gone?
Go to Top of Page

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2007-02-21 : 19:06:20
will you see a large performance improvement? possibly.

The raptors now support command queueing which will definitely help. TCQ is something that is available on most (all?) SCSI drives these days and is without question a requirement for server type workloads. The 150GB raptors support native command queueing which WD claims to be better than the TCQ implementation used in the last generation of raptors.

Don't skimp on the the RAID controller. Also, you will have to use a driver/controller combination that support AHCI mode in order to enable the TCQ/NCQ features. Make sure system supports that otherwise performance will suffer.

Or you could go with a SCSI solution and not have any problems at all as well as having the fastest possible configuration. See this thread over at SQL-Server-Performance.com for info http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=16995



-ec

Go to Top of Page

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-21 : 20:28:43
8 dbo.Consumer 126898918 72291544 72282448 152 8944 70597.2109 68.9426 583.35045 583.27705 .00123 .07217
9 dbo.Consumer2 113598889 66615896 66597200 536 18160 65054.5859 63.5299 600.48719 600.31866 .00483 .16370
19 dbo.mailable 109833517 64111440 64098104 464 12872 62608.8281 61.1414 597.72387 597.59953 .00433 .12001
6 dbo.condnc 65836629 43809792 43798216 384 11192 42783.0000 41.7803 681.40225 681.22220 .00597 .17408
5 dbo.con2phone 72197400 36760024 36759880 88 56 35898.4609 35.0571 521.37978 521.37774 .00125 .00079
7 dbo.conphone 66945895 31583960 31583848 88 24 30843.7109 30.1208 483.10617 483.10446 .00135 .00037
15 dbo.homeowner 46866686 27148200 27146216 104 1880 26511.9141 25.8905 593.16669 593.12334 .00227 .04108
3 dbo.allpro 187286913 12963952 12962280 48 1624 12660.1094 12.3634 70.88102 70.87188 .00026 .00888
11 dbo.delorme 65930982 11908968 11908872 40 56 11629.8516 11.3573 184.96286 184.96137 .00062 .00087
14 dbo.heritage 32003534 6986600 6986536 32 32 6822.8516 6.6629 223.54651 223.54446 .00102 .00102
20 dbo.mortgage 3940526 6892592 6704624 48 187920 6731.0469 6.5733 1791.13504 1742.28897 .01247 48.83360
12 dbo.dnc 134940732 3505560 3504960 24 576 3423.3984 3.3432 26.60200 26.59745 .00018 .00437
22 dbo.NewMovers 22215073 3123928 3123472 24 432 3050.7109 2.9792 143.99693 143.97591 .00111 .01991
16 dbo.ims 3255521 1850704 1849952 16 736 1807.3281 1.7650 582.12523 581.88869 .00503 .23150
17 dbo.jara 4162380 1805920 1802160 24 3736 1763.5938 1.7223 444.27997 443.35497 .00590 .91910
10 dbo.craftmat 2263125 1443728 1443528 16 184 1409.8906 1.3768 653.24605 653.15556 .00724 .08325
21 dbo.NewHomeowners 5496447 1355856 1355608 16 232 1324.0781 1.2930 252.59891 252.55271 .00298 .04322
25 dbo.tf013107 1460843 325584 325512 16 56 317.9531 .3105 228.22303 228.17256 .01122 .03925
2 dbo.alldelorme 2291838 225480 225456 8 16 220.1953 .2150 100.74513 100.73441 .00357 .00715
1 dbo.acsmtg 97698 24280 24248 16 16 23.7109 .0232 254.48546 254.15005 .16770 .16770
18 dbo.kall8 130963 3152 3128 16 8 3.0781 .0030 24.64550 24.45784 .12510 .06255
4 dbo.celeb 12071 2712 2672 16 24 2.6484 .0026 230.06280 226.66954 1.35730 2.03595
24 dbo.spanish 13242 344 280 24 40 .3359 .0003 26.60142 21.65232 1.85591 3.09319
23 dbo.sic 1395 40 32 8 0 .0391 .0000 29.36201 23.48961 5.87240 .00000
13 dbo.dtproperties 0 0 0 0 0 .0000 .0000 NULL NULL NULL NULL
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-21 : 21:06:02
"will this tweak have the same affect?"

Yes, it should do, provided that the Columns in the Query are "covered" by the index.

SELECT MyName
FROM NamesTable
WHERE MyZipCode IN ('xxx', 'yyy', 'zzz')

only uses the two columns [MyName] and [MyZipCode], so is covered by the index.

There are probably instances where you have shed-loads of values in the IN clause such that the optimiser decides its better just to trawl sequentially through the table itself, but for small-ish numbers of values that should be true.

I still think it is fishy that your system isn't using more RAM

Kristen
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-21 : 23:29:37
Very strange numbers from the script output.

Your tables are very large with 3 tables above 60 GB, and another 6 tables over 10 GB. The thing that is strange is that other than possibly having a clustered index on the tables, there seem to be no indexes at all on any of the tables. The script output shows average index bytes per row to be less than 1.

I can't believe that your SQL Server would not be using more than 500 MB of memory, unless you have it configured to not use more than that, or there is some other problem, like you are running some other application on this server that is using the memory.

The lack of useful indexes and mis-configured memory is likely the cause of your problems. I doubt that faster hard drives will make much impact on your performance problems. I would work on those before looking at replacing disk drives.

With a database that size, you might benefit by moving to SQL Server 2005 64-bit, and loading your server up with lots of memory and fast processors so that you can scan those large tables rapidly.

Without seeing the query workload, table structure, data distribution, and more detailed performance analysis, I can't really say more. If you don't have the expertise to take this on, probably your best investment would be hiring a good database consultant to look into the performance issues.




CODO ERGO SUM
Go to Top of Page

albertkohl
Aged Yak Warrior

740 Posts

Posted - 2007-02-22 : 00:15:35

I can't believe that your SQL Server would not be using more than 500 MB of memory, unless you have it configured to not use more than that, or there is some other problem, like you are running some other application on this server that is using the memory.

i installed SQL with the default setting, and installed sp4. other then that. that's it. is there a config file anywhere or something i can change to get SQL to utilize more memory? again this machine is a DUAL 2.8Ghz Xeon (thought those were good) and 2gb of memory. not enough u think?

Go to Top of Page

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2007-02-22 : 01:38:33
fire up perfmon and look at look at SQL Server:Memory Manager:Target Server Memory and SQL Server:Memory Manager:Total Server Memory counters. What are the values you see?


-ec
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-22 : 04:21:36
"again this machine is a DUAL 2.8Ghz Xeon (thought those were good)"

Yeah. 4 CPUs would be better, of course . Lots of on-CPU-cache is good too ... I think our servers have 512 (but that's from my aging memory!)

"2gb of memory. not enough u think?"

The first issue is whether SQL is using "all" of it (obviously the O/S needs some ...)

to use more than 2GB you would need Enterprise version of SQL Server 2000, or SQL Server 2005 [which has much more liberal memory limits within the license]

"there seem to be no indexes at all on any of the tables"

MVJ: Particularly given that albertkohl said earlier "we do however index every column"

Kristen
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-22 : 09:36:17
quote:
Originally posted by Kristen
...
"there seem to be no indexes at all on any of the tables"

MVJ: Particularly given that albertkohl said earlier "we do however index every column"

Kristen


I am speculating that they might have clustered indexes that include all the columns. Not likely to be a useful index, except on queries that need to lookup the data in the exact order of the clustered index.

The data rows in the large tables are fairly wide, 450 to 700 bytes, so I expect most queries would be using table scans. If that is the case, SQL Server should be using close to available system memory.



CODO ERGO SUM
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-22 : 09:43:22
"I am speculating that ... "

Ah ... I wouldn't have thought of that, but I can see that NOOBs might think that "ups" performance.

Kristen
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-22 : 09:59:45
quote:
Originally posted by Kristen

"I am speculating that ... "

Ah ... I wouldn't have thought of that, but I can see that NOOBs might think that "ups" performance.

Kristen


That's the trouble with knowing what you're doing; you don't have the imagination to try the really bad ideas.




CODO ERGO SUM
Go to Top of Page
    Next Page

- Advertisement -