Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SQL Server Administration (2005)
 Steps to identify a bottleneck

Author  Topic 

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2009-08-28 : 09:31:10
I wish I were posting about a nice tall, cold Budweiser...

Anyway, we are having a serious performance degradation on our testing databases. We have 5 testing database servers that all share the same external storage (an iSCSI SAN). Our implemenation of the iSCSI SAN is such that:
- each of the 5 servers have their own disks isolated - no sharing period
- for each server, each logical drive is comprised of its own physical disks - again - no sharing.

Now we have been running this way for quite sometime (2ish years) and over the past month something has begun to bring our database response times to a crawl.

Now, I'm looking for suggestions on how to troubleshoot and pinpoint what this problem is. From both a database perspective and a hardware perspective. At this point I am not going to rule anything out - it could be the database, it could be the hardware.

As an FYI - we have ran various profiles, performance counters, etc and we are leaning towards it being a hardware issue (but I'm not ruling out anything). We have taken measures to ensure none of the data files are fragmented on the file system, nor is there any internal fragmentation out of the ordinary.

DeepGround
Yak Posting Veteran

73 Posts

Posted - 2009-08-28 : 13:03:15
Have your databases grown beyond the cache size on your SAN? For example if your database is 100GB and your cache is 50GB the cache will not be utilized at all. You can test this with SQLIO, create a test file larger than your cache size, use random write and watch your IOPS drop by 10x.
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2009-08-28 : 14:07:10
I'll look into that. We are definitely seeing the average disk queue length staying at between 40-100 (versus 2). Thanks for the tip.
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2009-08-28 : 14:17:03
To follow up, what cache exactly are you referring to? I work with the individual that deals with the hardware and the only cache he can think of is the cache on the array controller (which is 128 MB). This is obviously smaller than our databases. There are 15 databases that range from 5 MB to 60 GB. In total, all the databases are approximately 120 GB. I'm assuming these sizes are perfectly within reason. And to go even further, the data files are broken into 3 groups - data, indexes and logs.
Go to Top of Page

DeepGround
Yak Posting Veteran

73 Posts

Posted - 2009-08-31 : 11:49:39
Your SAN only has 128MB on the controller? That is less than my Perc5 card has. Usually a SAN will have in the GB range. What matters is the size of any particular file that the database works with. If the .mdf file is smaller than the cache on your SAN you will get greatly improved IO performance. Same with the .log file.
Go to Top of Page

tfountain
Constraint Violating Yak Guru

491 Posts

Posted - 2009-08-31 : 22:07:01
I don't think because the SAN has a small cache is the reason we are now experiencing a drastic performance decrease. I think this avenue is a dead end.
Go to Top of Page

YellowBug
Aged Yak Warrior

616 Posts

Posted - 2009-09-01 : 07:10:10
You said that you have ran profiler traces and performance counters. I'd suggest you correlate them to get an overall view. If you haven't done this before, here's a good guide: http://www.mssqltips.com/tip.asp?tip=1212

Are all the databases suffering? Any specific queries? Any specific time of day?
What changes were implemented a month ago?
Go to Top of Page
   

- Advertisement -