Twitter

Entries in Indexes (1)

Monday
Oct242011

PASS Summit–Instant ROI

When you turn to your boss and tell them that you want to attend the PASS Summit then you are more likely than not to be asked the dreaded “What’s the return on investment that we would get by paying for you to attend or by giving you the time off to attend?”

Sometimes that can be a little hard to answer, especially if you are asking for extra money to attend a precon or two.

I thought I’d share an immediate return on investment that happened the day after I got home from the Summit where I used information that I had learned from a precon to fix a production problem in a fraction of the time than it would have taken me to figure out the problem before…

 

Houston, we have a problem

I received a call from the NOC saying that there were timeouts happening in production and this was impacting customers. They needed someone to take a look and figure out the issue.

I VPN’d in and saw that CPU on the system in question was running at 99% and there was a great deal of database blocking and a whole lot of spids waiting for resources. Knowing that this server usually sat there quietly humming along with only around 20% of it’s CPU being used I knew there was a serious problem. A quick ask and I learned that there had been no code changes released on either the database or application side so it was time to dig in.

 

What’s running?

With there being no changes it was time to start digging further. I turned to the comfort blanket of sp_whoisactive to dig further.

One of my favorite aspect of sp_whoisactive is that by default it shows you the sql_text for the running spids. This allowed me to very quickly identify that the same piece of code from one of our stored procedures was causing a problem. The update looked pretty simple:

UPDATE TOP (@rcdcount)
        [dbo].[TableWithIssue]
SET     [Session_Id] = @Session_Id
      , [Dtm] = @Date
OUTPUT  inserted.[Id]
      , inserted.[Val]
      , inserted.[Dtm]
        INTO
  @tblOutput
WHERE   [Id] = @Id
        AND [Used_Dtm] IS NULL
        AND ([Dtm] IS NULL
             OR ([Dtm] IS NOT NULL
                 AND [Dtm] < @Date
                )
            ) ;

When I threw a sample execution of the proc into SSMS and took a look at the plan things appeared to be very much off

image

 

There’s a lazy spool that has a cost of 304% and the clustered index update costs 1228%? That’s a little off.

Time to break out another favorite… SQL Sentry Plan Explorer

image

Ok, this was something much more readable. It adjusted the associated costs and made it quick and easy to understand. Because Plan Explorer is very flexible it’s actually possible to just show costs based upon CPU. As this was where my problem was I decided to adjust it to show more relevant data.

 

image

Aha, all of a sudden the worst offender has changed from the clustered index update to the clustered index scan. That’s where I need to focus on the changes.

 

Breaking things down

Time to eliminate the clustered index scan. That should resolve everything.

The first thing was to take a look at the predicate. Hitting F4 in SSMS brings up the properties window of whatever object you have selected, in this case by choosing the clustered index scan I was quickly able to grab the predicate:

[mydb].[dbo].[TableWithIssue].[Id]=[@Id]
AND [mydb].[dbo].[TableWithIssue].[Used_Dtm] IS NULL
AND ([mydb].[dbo].[TableWithIssue].[Dtm] IS NULL
OR [mydb].[dbo].[TableWithIssue].[Dtm] IS NOT NULL
AND [mydb].[dbo].[TableWithIssue].[Dtm]<[@Buffer_Date])

Removing the duplicate columns from this I wound up with just three columns for the predicate:

Id
Used_Dtm
Dtm

 

Next I checked the output list:

Id
Session_Id
Dtm

 

The Id and Dtm columns are in the predicate list so this left me with just the Session_Id column in the output list.

This indicated to me that creating the following index would eliminate the index scan and provide me with a vastly improved index seek:

CREATE INDEX idx_TableWithIssue_Used_Dtm 
    ON dbo.TableWithIssue (Id, Used_Dtm, Dtm) 
        INCLUDE (Session_Id)

 

Before creating the index I went through and confirmed that there were no other indexes that might potentially cover this query with a slight change, however there was just a single clustered primary key in place with a column that would not be included in this new index.

I added the index, which took about two minutes and then recompiled the stored procedure.

CPU immediately dropped to 10% and things started humming along. I pulled up a new query plan and threw it into Plan Explorer:

 

image

 

Much better.

 

That was easy

Certainly seems that way. Without the stuff I picked up in Gail Shaw (blog|twitter) and Grant Fritchey’s (blog|twitter) precon it would have taken me a lot longer to figure out exactly where the problem was and what was needed to be done to fix it. Their guidance on reading execution plans really helped guide me quickly to the problem, understand the data that was presented and implement a solution in a minimal amount of time.

Return on investment? Fixing a production down issue in 20 minutes as opposed to two hours. Is that worth the price of a precon? Yeah, I’m pretty sure it is.

 

Final notes

When I first dealt with this issue I mistakenly thought that the lazy spool was the reason for the high CPU load, just because when I added the index it went away. After doing some further reading and really understanding how they work in a little more detail I realize that it’s vanishing was just a byproduct of adding the index (and indeed it’s replaced by an eager spool in a different part of the plan).

While this was an easy way to show the ROI for that particular precon there was so much more at the Summit than can just be realized by this one incident. Ongoing items while at work which may not be as visible are also improved by the knowledge that I gained. An understanding of SQL 2012 features like AlwaysOn groups will help decide whether we should implement the next version when it comes out as well as provide guidance on where we should or should not deploy certain features. It all adds up.

The fact is that my job didn’t pay for me to attend the Summit this year, nor did it pay for me to go to either of the precons that I attended. Given that there was a lot of out of pocket expense did I find a return on investment for myself? I sure did. I know that I’d rather being doing something fun than spending 2 hours in front of my laptop on a Sunday afternoon trying to figure stuff out.

 

 

Such fun!