Tag: Performance

FusionIO and the Go Faster Button

Nobody has ever said that FusionIO cards are slow (because they aren’t), especially if you compare their performance to regular spinning media or high-performance SANs. After all, no SAN will let you measure storage write latency in microseconds.

Anyone who has had their database reside on a FusionIO card has had nothing but good things to say about them. The thing is, a lot of people are probably not making the most of the cards they have. FusionIO cards actually have an artificial performance restriction placed on them, and can go even faster than they do now.

Continue reading “FusionIO and the Go Faster Button”

Improving Performance When Querying Multiple DMVs

A couple of days ago I posted a stored procedure (sp_GetAGInformation) that queried multiple DMVs to pull together a bunch of Availability Group information. If you took a look at the code you would have seen that it used a couple of CTEs (Common Table Expressions).

CTEs are a great way to do recursive work, and they can also make code much easier to read. A CTE without recursion is really nothing more than a nicely wrapped subquery.
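For instance, a minimal recursive CTE to generate the numbers 1 through 10 looks something like this (a quick illustrative sketch):

-- Recursive CTE: the anchor member seeds the result set and the
-- recursive member references the CTE itself until n reaches 10.
WITH Numbers (n) AS
(
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM Numbers WHERE n < 10
)
SELECT n FROM Numbers;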

For example, something like this (the snippets below use the Sales.SalesOrderHeader table from the AdventureWorks sample database for illustration):

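-- The aggregate is defined once in the CTE, then queried by name.
WITH OrderCounts AS
(
    SELECT SalesPersonID, COUNT(*) AS OrderCount
    FROM Sales.SalesOrderHeader
    GROUP BY SalesPersonID
)
SELECT SalesPersonID, OrderCount
FROM OrderCounts
WHERE OrderCount > 100;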

Is the same thing as:

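-- The same aggregate, written as an inline subquery.
SELECT SalesPersonID, OrderCount
FROM (
    SELECT SalesPersonID, COUNT(*) AS OrderCount
    FROM Sales.SalesOrderHeader
    GROUP BY SalesPersonID
) AS OrderCounts
WHERE OrderCount > 100;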

This can easily lead you down the path towards poor performance. It is quite easy to define a CTE once and reference it multiple times, not realizing that every reference to the CTE performs that subquery again, meaning it has to be evaluated and executed each time.

For smaller queries this is not usually a problem, but for larger queries, and more importantly when working with DMVs, it can be a serious performance problem.
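To illustrate the difference (a sketch using sys.dm_os_wait_stats), compare referencing a CTE twice with materializing the DMV output into a temp table first:

-- The CTE is referenced twice, so the DMV underneath is queried twice.
WITH Waits AS
(
    SELECT wait_type, wait_time_ms
    FROM sys.dm_os_wait_stats
)
SELECT w.wait_type, w.wait_time_ms
FROM Waits AS w
WHERE w.wait_time_ms > (SELECT AVG(wait_time_ms) FROM Waits);

-- Materialize the DMV output once; both later references hit the
-- temp table rather than re-executing the DMV query.
SELECT wait_type, wait_time_ms
INTO #Waits
FROM sys.dm_os_wait_stats;

SELECT w.wait_type, w.wait_time_ms
FROM #Waits AS w
WHERE w.wait_time_ms > (SELECT AVG(wait_time_ms) FROM #Waits);

DROP TABLE #Waits;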

Continue reading “Improving Performance When Querying Multiple DMVs”

Stop Bad Database Design

Every year that goes by I sit in hope that I won’t see bad database design.

Every year I am disappointed.

 

As an example, here is a table create statement that I saw the other day (table and column names have been changed to protect the innocent):

CREATE TABLE [dbo].BestestTableEVAR(
    Col1 [int] IDENTITY(1,1) NOT NULL,
    Col2 [uniqueidentifier] NULL,
    Col3 [uniqueidentifier] NOT NULL,
    Col4 [smallint] NULL,
    Col5 [smallint] NOT NULL,
    Col6 [bit] NOT NULL,
    Col7 [xml] NULL,
    Col8 [xml] NULL,
    ColA [xml] NULL,
    ColB [xml] NULL,
    ColC [datetime2](2) NULL,
    ColD [datetime2](2) NULL,
    ColE [datetime2](2) NULL,
    ColF [datetime2](2) NULL,
    CONSTRAINT [PK_BestestTableEVAR] PRIMARY KEY CLUSTERED
    (
        Col3 ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY];

 

So what’s wrong with this?

The clustered primary key on this table is a GUID. Is that bad? That’s an unequivocal YES! Read all that Kimberly Tripp (blog|twitter) has to say about GUIDs in database design.

What makes this all the more crazy is that the table has an identity column. That’s a natural clustering key ready and waiting to be used and yet for some reason it’s not.

This table is going to fragment like crazy; it won’t scale, and performance will be hideous. Additionally, thanks to the XML columns, this table can’t even be rebuilt online, so there’s no way to address the fragmentation or performance without taking the table offline, during which time it can’t handle any transactions. That’s a real problem for a table in an OLTP system.
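You can see the damage for yourself; a quick fragmentation check against a table like this (a sketch, using the table above) would be:

-- Report fragmentation for every index on the table.
SELECT index_id, index_level, avg_fragmentation_in_percent, page_count
FROM sys.dm_db_index_physical_stats(
    DB_ID(), OBJECT_ID(N'dbo.BestestTableEVAR'), NULL, NULL, 'LIMITED');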

 

I would go back and change some things. Let’s say you want to keep the table structure the same; that’s fine, but let’s be smart about the keys and indexes.

It would make sense to make the identity column the clustering key (I would also make it the primary key) and then, to ensure uniqueness on Col3, the current primary key, add a unique index.

CREATE TABLE [dbo].BestestTableEVAR(
    Col1 [int] IDENTITY(1,1) NOT NULL,
    Col2 [uniqueidentifier] NULL,
    Col3 [uniqueidentifier] NOT NULL,
    Col4 [smallint] NULL,
    Col5 [smallint] NOT NULL,
    Col6 [bit] NOT NULL,
    Col7 [xml] NULL,
    Col8 [xml] NULL,
    ColA [xml] NULL,
    ColB [xml] NULL,
    ColC [datetime2](2) NULL,
    ColD [datetime2](2) NULL,
    ColE [datetime2](2) NULL,
    ColF [datetime2](2) NULL,
    CONSTRAINT [PK_BestestTableEVAR] PRIMARY KEY CLUSTERED
    (
        Col1 ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY];

 

CREATE UNIQUE NONCLUSTERED INDEX UI_BestestTableEVAR_Col3 ON dbo.BestestTableEVAR (Col3);

Sure, we still won’t be able to rebuild the indexes online, but we won’t have the same crazy levels of fragmentation that we would have had before.

 

I know I’ll be seeing a lot of bad design this year, and I know that I’ll be forced to push some of that bad design into production. That doesn’t stop me from trying to change things, however. Help your devs out; let them know when their design is a problem. Who knows, maybe you’ll change things for the better.

Optimizing Database Restores

You can Google/Bing/AltaVista and get a lot of results for speeding up your backups. Those results will include using third-party tools, performing file and filegroup backups, differential backups and the like. Some of the more technical blogs will talk about adjusting some of the backup parameters to improve throughput.

There’s really not much written that talks about tuning database restores (at least that I’ve found; please add comments with links to any posts you’ve come across).

I started messing with some restore parameters last week and found that it’s possible to make some dramatic performance improvements.

 

How this started

We were performing a data center migration at work. As a part of this migration we were moving some of our databases from SQL 2000 to SQL 2008R2.

The database, which we’ll call MyDB, is 620GB in size. There are 5 data files and a single log file. Current backups are performed using Snap technology, which wasn’t going to help us with the migration; we needed to get the data over and set up log shipping, so we had to perform a regular backup.

Third-party tools were not an option, as this SQL instance was running SP3a and we weren’t able to find anything that would work below SP4.

A full backup was taken to local disk; it took a little over 9 hours (server and disk limitations prevented any work to improve on that).

The backup file was transferred to a USB drive, taken to the new datacenter and copied over to a CIFS share where it would be available for performing restores.

 

Hardware/software setup

It’s going to help to have an idea of the configuration that’s being used here.

Hardware:

  • Server: HP DL585 G7 (clustered)
  • CPU: 4 x AMD 6176SE (2294MHz, 12 cores) for a total of 48 cores
  • RAM: 128GB
  • Storage: NetApp SAN with WAFL filesystem, fiber connected with 2 x 4Gb HBAs, MPIO configured for least queue depth
  • Network: 10Gb/s (HP NC375i), 2 connections, configured for failover only, no load balancing
  • One LUN defined (as a mount point) for each file in the database to be restored

 

Software:

  • SQL 2008R2 CU7 Enterprise Edition
  • Max server memory: 100GB
  • Affinity masks not set
  • Max degree of parallelism: 6
  • 4 x tempdb files

 

 

Basic restore

The initial restore was performed with the default options; no optimization was done.

RESTORE DATABASE MyDB FROM DISK = '\\CIFSDevice\Backup\SQLBackups\ProductionForRestore.bak'
WITH MOVE 'MyDB' TO 'L:\MyDB_Data\MSSQLData\MyDB_Data.mdf'
    , MOVE 'MyDB2' TO 'L:\MyDB_Data2\MSSQLData\MyDB_Data2.ndf'
    , MOVE 'MyDB3' TO 'L:\MyDB_Data3\MSSQLData\MyDB_Data3.ndf'
    , MOVE 'MyDB4' TO 'L:\MyDB_Data4\MSSQLData\MyDB_Data4.ndf'
    , MOVE 'MyDB5' TO 'L:\MyDB_Data5\MSSQLData\MyDB_Data5.ndf'
    , MOVE 'MyDB_Log' TO 'L:\MyDB_Logs\MSSQLData\MyDB_Log.ldf'
    , STATS = 5
    , NORECOVERY;

 

Restore time was 2 hours 48 minutes.

 

Optimization techniques

I knew that adjusting the MaxTransferSize and BufferCount settings could provide significant improvements in backup performance (go read the fantastic SQLCAT article on this) and figured that the same would be true for restores.

Before I could start changing values I needed to know what values were being used by the basic restore. To do this I enabled trace flags 3213 (trace backup/restore activity) and 3605 (send the trace output to the SQL error log).

DBCC TRACEON (3213, -1);
DBCC TRACEON (3605, -1);

 

With the flags enabled I performed another restore which provided the following information:

[Image: SQL error log output showing the buffer configuration used by the restore]

So by default we’re running with a BufferCount of 6 and a MaxTransferSize of 1,024 KB (1 MB).

 

The MaxTransferSize value is the largest unit of transfer to be used between SQL Server and the backup media. This is specified in bytes and can range in value from 65536 (64 KB) to 4194304 (4 MB).

The BufferCount specifies the number of I/O buffers to be used for the restore operation. Any positive number can be used; however, you have to be careful with larger numbers, as each buffer consumes memory and you could potentially cause out of memory errors.
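As a rough rule of thumb, the memory dedicated to the restore buffers is approximately BufferCount × MaxTransferSize. For example, the final settings used below (75 buffers at 4 MB each) work out to around 300 MB, trivial on this 128GB server but worth checking before trying the same thing on a smaller box.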

 

Changing the MaxTransferSize

Not having any obvious hardware restrictions, I decided that, as the restore was being performed across the network, adjusting the MaxTransferSize might provide the biggest benefit.

Feeling confident I just bumped it to the max and performed the restore.

RESTORE DATABASE MyDB FROM DISK = '\\CIFSDevice\Backup\SQLBackups\ProductionForRestore.bak'
WITH MOVE 'MyDB' TO 'L:\MyDB_Data\MSSQLData\MyDB_Data.mdf'
    , MOVE 'MyDB2' TO 'L:\MyDB_Data2\MSSQLData\MyDB_Data2.ndf'
    , MOVE 'MyDB3' TO 'L:\MyDB_Data3\MSSQLData\MyDB_Data3.ndf'
    , MOVE 'MyDB4' TO 'L:\MyDB_Data4\MSSQLData\MyDB_Data4.ndf'
    , MOVE 'MyDB5' TO 'L:\MyDB_Data5\MSSQLData\MyDB_Data5.ndf'
    , MOVE 'MyDB_Log' TO 'L:\MyDB_Logs\MSSQLData\MyDB_Log.ldf'
    , STATS = 5
    , NORECOVERY
    , MAXTRANSFERSIZE = 4194304;

 

This time the restore took 45 minutes, a reduction of just over 2 hours from the default.

 

Changing the BufferCount

Having already reduced the restore time by over 70%, I wanted to see if I could get a little more performance out of it, so I adjusted the BufferCount value and set it to 75.

RESTORE DATABASE MyDB FROM DISK = '\\CIFSDevice\Backup\SQLBackups\ProductionForRestore.bak'
WITH MOVE 'MyDB' TO 'L:\MyDB_Data\MSSQLData\MyDB_Data.mdf'
    , MOVE 'MyDB2' TO 'L:\MyDB_Data2\MSSQLData\MyDB_Data2.ndf'
    , MOVE 'MyDB3' TO 'L:\MyDB_Data3\MSSQLData\MyDB_Data3.ndf'
    , MOVE 'MyDB4' TO 'L:\MyDB_Data4\MSSQLData\MyDB_Data4.ndf'
    , MOVE 'MyDB5' TO 'L:\MyDB_Data5\MSSQLData\MyDB_Data5.ndf'
    , MOVE 'MyDB_Log' TO 'L:\MyDB_Logs\MSSQLData\MyDB_Log.ldf'
    , STATS = 5
    , NORECOVERY
    , MAXTRANSFERSIZE = 4194304
    , BUFFERCOUNT = 75;

 

This restore took a little less time, 38 minutes, shaving another 7 minutes off the already shortened time. At this point I was down about 78% from the default restore time.

I tried pushing the BufferCount to 100 to see if I could trim things further, but found no appreciable difference, so I left it at 75.

 

Comparing backup types

As a part of the move, log shipping was to be set up to offload reporting for users, so I needed to perform a restore of the database from the new SQL 2008R2 server to another server with exactly the same specifications.

The native SQL 2000 backup file was 520GB in size; with no optimizations it restored in 2 hours 48 minutes, and optimized it restored in just 38 minutes (not bad for a network restore).

The compressed SQL 2008R2 backup file was just 121GB in size. An optimized restore of this database took just 23 minutes. That’s an 86% saving over the unoptimized SQL 2000 based restore.
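For reference, producing the compressed backup is as simple as adding one option (a sketch; the backup path here is illustrative):

-- Native backup compression (supported by the Enterprise Edition instance used here).
BACKUP DATABASE MyDB
TO DISK = 'R:\Backups\MyDB_Compressed.bak'
WITH COMPRESSION, STATS = 5;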

 

Summary

There are times when you are not able to perform partial restores, or when you need to restore an entire database to a new environment in as short a time as possible. Don’t just accept the defaults that SQL Server gives you in these situations. There are things you can do to improve the performance of your restores (and your backups), and it’s worth spending some time now, in your own environment, optimizing them.

A little time playing with these things now could be the difference between your database being down for 20 minutes as opposed to 3 hours.

How much is 2 1/2 hours worth to your business?

 

 

Such fun.