Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SQL Server Administration (2005)
 data file performance recommendations

Author  Topic 

nic
Posting Yak Master

209 Posts

Posted - 2009-03-06 : 12:43:23
Hi,
I have read from a Dell SQL Server whitepaper that the optimal number of database data files should equal the number of cores per CPU socket. So if you have a single quad proc, I believe they are recommending 4 data files (1 MDF and 3 NDFs). Unfortunately I am struggling to find much more about the topic. I'm curious how others determine the number of data files to be used. I can easily create the database to have a single MDF and 3 NDF files but would be interested in other people's opinions. Does this improve performance? Would this be a general recommendation?

Overall the server has 3 RAID1 virtual disks so TempDB, Data and Logs are all on their own physical disks, but the data file allocation seems like the next step.

After creating the additional data files, is there any other maintenance recommendations etc (I assume I'd do the same with the tempDB data files)?

Here is the article:
http://www.dell.com/downloads/global/solutions/micro_2005_micro_2008_on_dss.pdf

Note: the article appears to be a little inconsistent. At one point it states "due to a balance in contention of multiple files against the increased I/O load, it recommends to break the number of data files into a number of file equal to half the number of cores". So this would imply having 2 data files for a single quad proc. But the rest of the whitepaper (diagrams etc) have 4 data files for a single quad core processor (page 10 in the article if you are curious)

Thanks,

Nic

guptam
Posting Yak Master

161 Posts

Posted - 2009-03-06 : 12:57:02
Hi Nic,

That is the general recommendation by Microsoft. Microsoft recommends that you create number of files equal to the number of Cores. So a Quad Core CPU = 4 files.

However, you should not need to do this for every database. It is strongly recommended you do this change for SQL Server 2005 tempdb because it can get used alot (in SQL 2005 it is used heavily as compared to SQL 2000).

So as for recommendation, I would look at how much a database is getting hit? Are you running into I/O block issues on physical level? Then splitting the database into multiple files is helpful to split the IO from for the database and gain improvement.

I have consider splitting database for following reasons:
- The database size is getting to big; having it in multiple files i can do file group backups and easier to restore.
- Our Disk Management guys tell me it is not a good idea to have hard disks blocks on SAN/NAS because larger the disk longer it takes to chkdsk it and such. So if the disk is getting full, I'll get another disk added and add file group to write to new disk.
- I am having I/O contention issues where writes blocked by reads on the raid controller level; I create multiple file groups distribute the reads and writes.
- In SQL Server 2005 you can use table partioning on large tables that is another case where you can use file groups.

I like following structure:
File Group: Primary
Objects: System Objects
File Group: User_Data
Objects: User Tables, Views, Clustered Indexes
File Group: NonCluster_Indexes
Objects: Non Clustered Indexes

As I said like ;-), don't have to though. On a database I worked on recently I broke it up to 13 files ... I had created file groups based on the functions of the database tables. And reorganized tables to locate it in their proper file group.

--
Mohit K. Gupta
B.Sc. CS, Minor Japanese
MCITP: Database Administrator
MCTS: SQL Server 2005
http://sqllearnings.blogspot.com/
Go to Top of Page

saurabhsrivastava
Posting Yak Master

216 Posts

Posted - 2009-03-06 : 13:18:03
Mohit is correct. Workload characterization is important before you take such decision. I remember on a 64-proccessor server I created just 4 datafiles for some databases and it worked fine for me because that server was hosting around 100 databases with a mix of low and heavy workload. If I would have follwed generic recommendation I would mess up whole system by creating 64 datafiles for each database. Keep in mind that processor landscape is changing very fast and 4-6 core socket are becoming norm so you have to be careful when you decide how many data files you need. Test.. test and Test ..that is my mantra because every project, workload is different so I like to benchmark system based on my customers specific requirement.
Go to Top of Page

nic
Posting Yak Master

209 Posts

Posted - 2009-03-06 : 14:04:21
Thanks for the recommendations, they are very helpful and I will do some more testing.

On a basic level, if I went with 4 data files, do I just create the 4 (1 MDF and 3 LDF) and for a basic implementation have just one Primary File group? By having the 4 data files will sql automatically use them? Since there is only one file group I wouldn't need to change the filegroups on individual objects etc. I can then test this scenario vs 2 data files etc.

For tempDB I assume the logic would be the same.

Finally, as I mentioned above the new server has 3 RAID1 virtual disks (so I can split the data, log and tempDB). Unfortunatley one of those disks will also contain the OS (all one server; no seperate storage).

here was my plan
disk1 (300GB): OS and tempDB
disk2 (300GB): data
disk3 (146GB): log

If tempDB is used extensively, would it be better to put it on its own disk and have the OS and log share? I know this is a different question but I am just trying to optimize things as much as possible. Which gets more use tempDB or log (our product uses a fair amount of temp/dervived tables)

Hopefully my questions make sense. I realize my knowledge of this is limited so any suggestions are helpful.

Nic
Go to Top of Page

sodeep
Master Smack Fu Yak Hacker

7174 Posts

Posted - 2009-03-06 : 14:15:26
SQL Server 2005 uses more TempDB features than SQL 2000.It has tendency to create n number of objects so you should have space allocated(Which is done by internal system pages GAM and SGAM).Inorder to reduce contention issue,you should create no of data files as number of processors.It is even best to put in RAID 10 where you can get maximum write performance.Also I wouldn't put TempDB in drives where system resources like OS,system metadata are located.You have to spread Filegroups across multiple drives to reduce I/O issues for user database.
Go to Top of Page

guptam
Posting Yak Master

161 Posts

Posted - 2009-03-06 : 14:18:40
Hi Nic,

Your configuration is do-able; ideally it is preferred to OS and SQL data files are not mixed. But if you can't get more space, I think you should do that better then having tempdb mixed with your regular db files.

If you create one file group and assign it 4 files; SQL Server will use all four files.

I would not recommend sharing OS and Log; LOG gets used by every database. As the data first gets written to LOG then Database. What the I/O counters, like AVG DISK QUEUE LENGTH for Reads and Writes. After that you can make better judgment.

There is point of no return in optimization; it is a fine art IMO. Looking for a one cut answer doesn't always work; each server usually requires slightly different configuration. But you can apply best practices to minimize the risks.

--
Mohit K. Gupta
B.Sc. CS, Minor Japanese
MCITP: Database Administrator
MCTS: SQL Server 2005
http://sqllearnings.blogspot.com/
Go to Top of Page
   

- Advertisement -