Oct 3 2012
Why a photo of my cats as a featured image in this post?
- It illustrates that different base ingredients (mommy cat, daddy cat, cat eggs, and cat sperm, catnip) can result in different outcomes as George and Kip show here.
- Data states that posts with photos of cats receive more traffic.
- It’s the Internet. They’re cats. I think it’s the law.
- I was able to use the phrase “cat sperm” in a technical post about databases…. FIRSTIES!
Lazy Sunday Wake Up Well Before Noon…
I awoke far too early on a recent Sunday with an idea for a blog post in my head that would compare IO overhead between SQL Servers built on the default Windows 4Kb NTFS cluster size and the 64Kb NTFS cluster size recommended for Microsoft SQL Server databases.
I can’t turn this stuff off no matter how much I’d love to. At any rate my intentions included loading test data into identically-created databases on two different instances of SQL with those varying NTFS cluster sizes and then capturing the IO information available through the sys.dm_io_virtual_file_stats Dynamic Management Function. My plan was to show that loading records into these separate databases would result in higher IO activity on the server using a 4Kb block size. I set up my test with a table with a single record filling one row per page and the results were as expected (and forthcoming in my next post soon.) I then made the mistake that I usually do of SEEING WHAT HAPPENS WHEN WE DO MOAR…
I decided I would repeat the results for a record size that should yield ten records per row: a record size of 806 bytes.
What I was finding that when entering in data in 806 byte record sizes that I was not getting two records per data page row as I should expect. Instead of ending up with 99+% page density I was ending up with around 50% page density. I see you scratching your head and I had the exact same reaction. Keep in mind I was using a shortcut to enter data into a table. I was using the GO <some_number> command after the INSERT statement in order to re-run the same line of code <some_number> of times. Ultimately I come to you now to tell you that this will result in different page storage behavior than other methods of inserting data into tables within Microsoft SQL Server.
The code below illustrates the outcomes of different methods of loading tables and the ending page density (courtesy of my favoritest (it is TOO a word) of all the SQL Server Dynamic Management Objects: sys.dm_db_index_physical_stats.
--================================================================== -- CREATE FOO DATABASE TO STORE AND FIGHT FOO --================================================================== CREATE DATABASE [FOO] ON PRIMARY ( NAME = N'FOO_4' , FILENAME = N'C:\FOO.mdf' , SIZE = 100MB , FILEGROWTH = 0) LOG ON ( NAME = N'FOO_4_log' , FILENAME = N'C:\FOO_log.ldf' , SIZE = 100MB , FILEGROWTH = 0); GO --================================================================== -- SHOULD BE 10 RECORDS/PAGE ROW BASED UPON ROW SIZE OF 8060 BYTES --================================================================== --CREATE 3 IDENTICAL TABLES------------------------------- CREATE TABLE FOO.dbo.load_values ( column_1 char(799) NOT NULL) ON [PRIMARY]; CREATE TABLE FOO.dbo.load_select ( column_1 char(799) NOT NULL) ON [PRIMARY]; CREATE TABLE FOO.dbo.load_go ( column_1 char(799) NOT NULL) ON [PRIMARY]; --============================================================================== --INSERT 10 ROWS PER TABLE USING DIFFERENT METHODS; SHOULD ADD 1 PAGES TO TABLE --============================================================================== -- First As Values into dbo.load_values INSERT INTO FOO.dbo.load_values (column_1) VALUES ('1'), ('2'),('3'), ('4'),('5'), ('6'),('7'), ('8'),('9'), ('10'); GO -- Then using a SELECT statement to populate dbo.load_select INSERT INTO FOO.dbo.load_select (column_1) SELECT TOP 10 [BusinessEntityID] FROM [AdventureWorks2012].Person.Person; GO -- Finally using a GO 1000 construct As Values into dbo.load_go INSERT INTO FOO.dbo.load_go (column_1) VALUES ('1') GO 10 --================================================================== --OBSERVE THE WONDERMENTS OF INCONSISTENCY --================================================================== SELECT object_name(ixPS.[object_id]) AS [table_name] ,ixPS.[page_count] , ixPS.[record_count] , ixPS.[avg_record_size_in_bytes] , ixPS.[avg_page_space_used_in_percent] FROM sys.[dm_db_index_physical_stats] (DB_ID('FOO'), NULL, NULL, NULL, 'detailed') ixPS ;
Entering data in using subsequent, multiple VALUES or via a SELECT store data identically, however when I was using the shortcut of GO 10 to load ten identical records I ended up with a page density far lower than expected and not in conformity with what is expected. Just a reminder that not all shortcuts get you to your goals any sooner.
Do you want to get more SQL Server training with a side of free consulting, professional development, networking, and rejuvenation? Then join me and other leaders in the SQL Server Community (including Buck Woody, Aaron Bertrand, Kevin Kline, Stacia Misner, and Allen White) in 2013 for SQL Cruise 2013 where we will cover not just the technical side of SQL Server, Powershell, and Business Intelligence, but also the aspects of your skill set that make you a more valuable resource for your company and yourself.