Mar 14 2012
Just add another one on there somewhere Bubba… wherever you can find room.
By default, new databases are created with the following basic file settings:
- Data File: 2 MB 1 MB unrestricted growth
- Log File 1Mb 10% unrestricted growth
Just exactly why is this bad?
- It will assuredly increase the frequency at which files will grow automatically… A transaction log file, if configured for autogrowth, will expand when it does not have enough free space to record a transaction. This could be due to a large transaction or cumulative transactions where the log is not truncated due to a combination of the recovery model and transaction log backup schedule.
- It holds up transaction processing… While the log file is expanded, the transaction that is causing the growth to occur is in a wait state, as is any other transaction that would need to write to the transaction log. This will increase the time required to process and commit the transaction(s) and may result in a timeout to the end user depending upon the connection settings for any affected requests.
- It causes external file fragmentation… Repeated file growth (automatic or manual) can result in external file fragmentation which in-turn affects performance negatively. SQL Server loves sequential reads and writes. A fragmented log file can result in an increase in non-sequential writes; a fragmented data file or index file will result in more non-sequential reads. This is because as SQL Server is required to increase the file size, it does so scattered on the disk as space allows. A logical file with bits all over the disk.
- Instant File Initialization is only applicable to Data Files… Instant file initialization exists in SQL Server 2005 and newer. This alleviates the issue with a file being inaccessible while it is being expanded. However, this only applies to data files – log files are not affected by instant initialization.
- VLF Pandalarium… Virtual Log Files. Think of ‘em as mini logs inside your transaction log file. Each time your log file increases in size you add more of them whether you know it or not. VLF sizes and numbers are based upon the size the log file grow at. In the case of using a percentage metric rather than a flat rate this will change as the file grows in size:
- Growth increments of < 64Mb will create 4 VLFs of equal size
- Growth increments between 64Mb and 1Gb will create 8 VLFs
- All growth above and including 1Gb will create 16 equi-sized VLFs
Why should you care? Inside of the transaction log, as log backups occur (under Full and Bulk-Logged recovery) inactive VLFs are cleared and marked for eventual re-use (the same occurs upon checkpoints under Simple recovery). This allows for the “round-robining” of the transaction log file after the last VLF in the physical log file is is consumed. If there are no inactive VLFs and autogrowth is enabled AND there is more space on disk than the amount of autogrowth that the log is set to grow at then physical log file will encounter an autogrowth event. If you grow using a percent you end up with differently-sized VLFs throughout your physical log file and the behavior of inactivating portions of your log file, and core behavior of how your log file operates is affected. For more on the subject (much, much more) I point you to yet another great article by Kimberly Tripp of SQLskills on the subject of VLFs.
Do you want to get more SQL Server training with a side of fun, networking, and rejuvenation? Then join Tim Ford, Allen White, Kevin Kline, Brent Ozar, Kendra Little, and Jeremiah Peschka in Alaska this May for SQL Cruise Alaska 2012! All registrants received a 1 year license for Quest Software’s SQL Server tools – estimated value of $8000 – for free. See sqlcruise.com for details.
So What Do You Do? – File Sizes, Backup Frequency, and Recovery Models
Configure A Baseline for Missed Manual Configuration Opportunities File growth settings are propogated via the model database. Configure your model database and each new database you create on the instance will use it as a template. Each database is different in its file requirements. However one thing I always advocate is setting a baseline just in case manual settings during a database creation are overlooked for one reason or another. DBA gets busy, you’re performing a go-live while 5 drinks in at the bar, the glare from the pool is making it hard to see what you’re doing while creating a new database…Personally I set a minimum database data file size that is appropriate based upon consultation with the Independent Software Vendor (ISV), internal development team, or other responsible party that has information about projected growth for the database prior to creating/migrating it to production. I also cap the maximum size for both files at an appropriate value based upon consultation with those same individuals. I’d rather cap the file size and cause a potential issue for one database rather than fill a data or log volume and ruin everyone’s day!
Note though that these are only a stopgap. It should not be taken as a crutch, but rather a starting point.
Ask Questions and Be Proactive Due diligence before creating a database is my preferred method of keeping external fragmentation low. When I am tasked with bringing a new database into our domain either from an ISV or internal development department there are certain questions I am sure to have answered by the individual responsible for development – be that one of my Developers or a vendor representative:
- Estimate for largest transaction
- Estimate for initial database size
- Estimate for annual growth in either MB/GB or percentage
- Frequency of transaction log backups or a scale of transaction workload so I can guage the frequency of log backups.
All of these questions taken as a whole provide me with a picture that determines what size I set the data and log files (as well as any index or secondary data files to) as well as the backup frequency. I size all my data files to what I expect them to grow to at the end of three years. Why? Because that is the estimated lifespan of our hardware and I would be likely migrating the database to a new host at that time and would re-assess the settings once again at that time. Granted in the age of SAN this three-year mark tends to become moot, but I still find it a good practice in reducing external fragmentation by reducing the number of times I expect to have to grow the database files.
Recovery Models Play An Important Role
I also have a key question for the customer representative: If there is a failure that requires recovery, is it expected that the database will need to be recovered to a point just before the failure or to the last good database backup? This is key for determining the recovery model for the database and as a result determining how the transaction log is re-used. If I’m told that the expectation is to restore to a point-in-time then I will place the database into Full Recovery; otherwise I will use Simple Recovery for the database in question.
You may notice that I’ve not addressed Bulk Logged recovery in the discussion until now. It behaves identically to Full Recovery in how the log file is truncated. The main difference between Full and Bulk Logged recovery is that certain bulk processes: BCP, SELECT INTO, Index rebuilds, and others are not logged. A complete list of bulk logged transaction are available from Microsoft. Any of these processes that occur since the last transaction log backup would need to be re-run after recovering a database. I personally don’t like using Bulk Logged recovery and only advocate it if you run into situations where bulk processes in your database are impacting performance negatively and you have exhausted all other optimization options.
So What Do You Do? – Autogrowth Settings
Even though I pre-size my database files for three-years of transactions I do still set autogrowth to on and configure growth increments just in case. It is critical for a DBA to understand the databases they support. You also need an understanding of how robust your disk substructure is when it comes to file extension. The key is to set a growth size in MB that will allow for growth in an emergent situation that is large enough to not have to endure multiple expansions to satisfy a need, but small enough not to cause undue time to expand properly – preventing transactions from being logged (transaction log file) or reads to occur (if a data or index file.) You’ll see that I stated these increments should be in MB. Never use a percentage of file size for a growth setting. Never. Never. And in case you are still thinking of setting this value as a percentage I will send an intern over to your office to drive you to a zoo to pick a cute little baby animal. The intern is then instructed to kick the baby animal in the face.
When you use percentage as an auto growth factor and the database is smaller in size you’ll probably encounter many repeated growth instances. At large file size or percentages you may encounter a timeout or long period of blocking while the file is grown.
So What Do You Do? – Monitor
I monitor my SQL Server databases for situations where my files have only 15% remaining space. I have a report from SQL Server Reporting Services that is on my laptop when I walk into my office in the morning that gives me my marching orders for that day’s file maintenance. This information is returned from a scheduled stored procedure on each SQL Server and then sent to a central DBA metadata repository nightly. Since I go through the hassle of doing this work I also save off a record every other day to a physical table in order to provide a history of database file sizes for trending analysis for the volatility of my data and log files.
The code for the stored procedure is available here. The code for the query to determine files of interest meeting criteria of your chosing is available here. Note that I’m using template parameters in this script. Use Ctl+Shift+M to replace the template parameter with a threshold value of your choosing. I use 85% by default.
Of course it goes without saying that you should always run your code in test first. I’m trustworthy, but sometimes stupid.
#9 – Service Accounts
Join me in Alaska!
Do you want to get more SQL Server training with a side of fun, networking, and rejuvenation? Then join Tim Ford, Allen White, Kevin Kline, Brent Ozar, Kendra Little, and Jeremiah Peschka in Alaska this May for SQL Cruise Alaska 2012!