Improving Performance When Querying Multiple DMVs

A couple of days ago I posted a stored procedure (sp_GetAGInformation) which queried multiple DMVs to pull together a bunch of AvailabilityGroup information. If you took a look at the code you would see that it used a couple of CTEs (Common Table Expressions).

CTEs are a great way to do recursive work, and they can also greatly simplify reading code. A CTE without recursion is really nothing more than a subquery that is nicely wrapped.

For example:

Basic+CTE

Basic CTE

Is the same thing as:

Basic+Subquery

Basic Subquery

This can easily lead you down the path towards poor performance. It is quite easy to define a CTE once and use it multiple times, not realizing that every time you use the CTE then you are performing that subquery, meaning is has to be evaluated and executed.

For smaller queries this is not usually a problem, but for larger queries and more importantly when working with DMVs this can be a serious performance problem.

Continue reading “Improving Performance When Querying Multiple DMVs”

Gathering AG Information – sp_GetAGInformation

The release of SQL 2012 brought about SQL Server AlwaysOn Availability Groups (AGs) as a new way to manage HA for databases.

With AGs came a whole lot of new DMVs to give you information. They also provided a nice dashboard which gives a view into the status of a particular AG
AG+Dashboard

AG Dashboard

This can be quite useful, however it is missing a great deal of information, that as a DBA, I would find useful, like read routing and listener configurations. On top of that the dashboard only provides information on one of the AGs at a time. If you have more than one AG then you have to open up an entirely new dashboard.

This just wasn’t working out for me, and so I wrote a stored procedure (sp_GetAGInformation) to provide me with the configuration information for all the AGs running on a server.

When executed it provides:

  • Availability Group Name
  • Listener Name (if exists)
  • Primary Replica Name
  • Automatic Failover Partner (if exists)
  • Sync Secondary Replicas (if any)
  • Async Secondary Replicas (if any)
  • Read Routing Replicas (if any, in routing order)
  • List of Databases in Availability Group

sp_GetAGInformation+Results
Results of executing sp_GetAGInformation

As you can quickly see in the above example the AGAdvWrks AG has a listener, an auto-failover partner and two servers in the read routing order. It also contains two databases. AGTestAG doesn’t have any sync secondaries, or a listener, and only contains a single database.

If you have several AGs running in your environment this can be a real time saver. What’s also great is to pull this data centrally and report against it.

For example, right now I have a PowerShell process that queries every server, pulls the data back to a central location and reports on any changes in the configuration (if a servers gets pulled out for some reason, or a database added or removed from an AG). This can be an a real timesaver, in particular when you need to connect to a primary, but aren’t sure which server it is (given that neither SQLPS nor SSMS support multisubnet failover connection settings).

One of the limitations is that the data can only be obtained from the primary in an AG as certain sets of the data only reside there, and the read routing configuration can be (and should be) set differently on each server.

Give sp_GetAGInformation as try and let me know what you think. Any ideas for improvements are warmly welcomed.

A Better Way To Get SQL File Space Information

I wish this was actually a better way, but really it’s an idea for a better way to get at the data.

Something that has come up at every place I’ve worked is the desire to monitor growth on data and log files for databases. If you aren’t using some 3rd party monitoring software this is a pretty painful operation which forces us to jump back to cursor type operations where we have to gather information one database at a time (although this does not apply to transaction logs).

 

Gathering Log Space

To currently gather log space you would use DBCC SQLPERF(‘LOGSPACE’) which will give you size and used percentage for every log file on a server.

Dumping that information into a table allows you to manipulate the data to figure out KB/MB/GB usage (rather than just plain percent).

As an example (GatherLogSpace.sql):
download (1)

Gathering Log Space

This is a reasonable amount of work, but what if you want to get space used in data files? Well that gets a whole lot more complicated.

 

Gathering Data File Space

In order to gather information on space used in data files you actually have to go to each and every database and run a query using sys.dm_db_file_space_usage.

As an example (GatheringDataFileSpace.sql):
download (2)

Gathering the space available in data files

Strangely the sys.dm_db_file_space_usage DMV only contains the data file information. The log file data can only be obtained using the first process. This means that there is no single place that you can go to get file usage information for a database.

 

What about PowerShell?

PowerShell could give you the data, but it is still a cursor type operation, and you have to work through each database, drilling down to the log information, the filegroup and file inside each of those filegroups.

What we really need is a nice set based query that we could run in TSQL to give us all the information that is needed.

 

What I Would Like To See

Right now you could go and grab the size for each and every file for every database inside of SQL Server by querying sys.master_files:

download (3)
Basic information from sys.master_files

There’s a whole lot more information in sys.master_files, including growth data, so you can know exactly how much each file will grow the next time it does.

Even though there’s this great data there is nothing to give us space used data. It would probably take too much overhead to keep that information up to date, but there’s no reason that we shouldn’t be able to easily get at the data when we want to.

Ideally we would see the existing DMV changed into a function, which would accept a database_id and file_id, and return the same data that it does now.

If that were the case it would be simple to run one query that would give us the size of each file, used space, free space, and how much the file would grow the next time an autogrowth hit.

This could quickly and easily give us all the data we want:

download (4) Mock up of easily obtained file space information

As a hope and desire to see this happen I’ve created a Microsoft Connect item – sys.dm_db_file_space_usage Should Be A Function

If you manage databases, and it’s something that you’d really like to have (who wouldn’t?) then please go and give it an upvote. There’s a chance it might make it into the product at some point, and that would make a DBAs life a little bit easier.

 

Traffic Flow With Read-Intent Routing

One of the big advantages to using SQL Server Availability Groups is the ability to automatically push read traffic over to a secondary server. This is particularly useful for larger queries that would take a few seconds to run and consume large amounts of resources. It’s not something recommended for short, fast queries, just because the additional latency of connecting to the secondary could slow down the overall response time for the query.

The Microsoft documentation on setting up Read-Only Routing in SQL AGs is pretty solid and explains how to get this up and running.

 

Firewall and traffic routing

In secure environments there is usually a firewall that resides between the front end web, application or mid-tier servers and the backend database server. This firewall would block all traffic to the backend except for specific ports to specific IP addresses. This is one of the defense in depth items that helps to keep your databases secure.

When using a firewall in conjunction with SQL Server Availability Groups (AGs) it is common to just open up the firewall to the AG Listener. That way there is a single IP open for all the database servers that reside in the AG and any machine that is not acting as the AG primary is not available through the firewall (reducing attack vectors again, a nice side effect).

Given this you might well expect that when routing traffic off to a readable secondary in the AG that it would follow the flow of:

download (5)

Here the client (either directly or through a web, app, or mid-tier) performs an action that does a read query against the AG Listener. The expected traffic flow would be (from what we would see IP address wise, the AG Listener would actually connect to the primary, in this case SQL1):

Client – AG Listener – Readable Secondary – AG Listener – Client
so
Client – SQLAG01 – SQL2 – SQLAG01 – Client

This way the primary server (in this case SQL1) would arbitrate all the traffic for the query that comes in. In fact read routing does not function this way.

In order to perform the expected task of reducing the load on the primary the primary actually tells the client to redirect to the secondary server, and so the process goes:

download (6)

The correct communication is

Client – AG Listener – Secondary – AG Listener – Client – Secondary – Client
or
Client – SQLAGL01 – SQL2 – SQLAGL01 – Client – SQL2 – Client

When the client request comes in SQL has to check that the readable secondary is available to accept the query (otherwise it will go to the next server in the routing list, which is why you should always have the primary as the last server in the routing list, just in case every other server is out of service).

This means the query will take a little longer to execute as the arbitration and network changes will take additional milliseconds to complete (why it is not ideal for small, fast selects).

 

Where does the firewall come in?

Using a firewall and only opening up the IP of the Listener is the best way to handle security, but if you want to use readable secondary server and read-intent routing that’s not going to work. Due to the way that the traffic is routed you would need to open up the firewall to each individual server and port that would be a secondary.

So in our above example the firewall would need to be opened to SQLAGL01, SQL1 & SQL2 in order to support client requests. If those rules aren’t opened then you’re client traffic will be blocked and you’ll get the dreaded “Named Pipes Provider: Error 40” error, which isn’t much of a help.

 

Testing your read-intent connections

A really useful way of testing your read-intent connections is to use a quick PowerShell script from your front end server (if running Windows) prior to putting it into rotation. Download Check-ReadRouting.PS1 and enter the AG Listener name, or IP Address and the name of a database in the AG. If things are working correctly it will return the name of the primary and first server in your read-only routing list.

If you get a timeout then you have either not set the read-intent URL correctly for your secondary, or you are having firewall issues connecting, and so should investigate further.

 

Read-routing can be really powerful and useful, you just have to be careful of the gotchas in getting it working correctly.

 

Querying Change Tracking Tables Against a Secondary AG Replica

If you aren’t familiar with Change Tracking I would recommend heading out and reading Kendra Little’s Change Tracking master post which contains a wealth of information. 

I’ve been using CT for a while now and it does what it says on the box. The performance can be painful when querying tables that have changed a lot (the changetable function performs a huge amount of aggregation and seems to be optimized for 1000 rows of data). One of the things that I’ve always wanted to do is perform loads into an ODS from a DR site. 

I use AvailabilityGroups to ensure that a near real-time copy of the data is kept in another data center in another part of the country. I’ve tried a couple of times to query the change information from one of the secondary replicas, but sadly it’s not supported and so I would get the error

Msg 22117, Level 16, State 1, Line 1
For databases that are members of a secondary availability replica, change tracking is not supported. Run change tracking queries on the databases in the primary availability replica.

 

Yesterday I was messing around with database snapshots and was really happy to discover that it is possible to use the changetable function against a snapshot and not receive any errors. This will only work against readable secondary replicas (as the database needs to be online in order to be able to take the snapshot).

This is also the case with log shipped copies of databases. If the database is in standby then you can access the changetable function directly, or do so off a snapshot.

 

It doesn’t seem like this is a big deal, but if you like to load data into an ODS or Data Warehouse server and it’s not located in the same location as your AG primary, then this is huge as you can asynchronously write data over a WAN and then do all your data loads local to the ODS. This is far more efficient (and your network admin will like you a lot more) than pulling the data over in a large chunk nightly and saturating a part of the network.

Just another way that you can make your DR system work for you.

Rolling Upgrades With Availability Groups – A Warning

One of the great options provided by Availability Groups, in SQL Server 2012 Enterprise Edition and newer, is the ability to perform rolling upgrades to new Service Packs or Cumulative Updates.

The basic idea is that you apply the update to one of the AG secondary servers and then perform a failover of SQL to that server which then does the necessary things on the user databases to bring them up to the level of the update. The big advantage to this is that it minimizes the outage required to get the SP/CU applied, so that you are down for a few seconds instead of 40 minutes.

 

This is works really well for your regular user databases, however there is a problem when applying a CU or SP to a secondary server where a Integration Services (typically called SSISDB) is a member of an Availability Group. If you attempt to apply the CU/SP then it can fail and the SSISDB be left in an offline state. 

In order to apply the CU/SP you would first have to remove SSISDB from the Availability Group and recover it on each server you want to patch. Once you have completed patching all the servers you can add SSISDB back to the AG. But for that period of time you will be at risk, so get through and patch a couple of the machines and get the AG working for those as soon as possible.

Interestingly this does not apply for all CU/SP releases. Some do not make changes to SSISDB and this isn’t required. You can only find this out by patching, so be sure to get it going in your test environments first.

An Open Letter To SQLSaturday & User Group Organizers

Dear event organizers;

 While this letter is directed at the SQL Server community the points I present here are valid in other situations.

 

The SQL Community

One of the things that makes SQL Server stand out so much from other products (not just RDBMS) is the level of support and caring that it receives from its user base, frequently known as the SQL Community. There are many levels to this: from folks that read a daily newsletter, to those that help run a professional association dedicated to the product. 

Sitting between those two areas are those that like to give back to the community by writing blog posts, answering peoples questions, and presenting at one of the many events that are run throughout the world. These SQL related events are run at several levels: from small local user groups which may attract 20-30 people, to SQL Saturday events that pull in a few hundred, all the way up to the large conferences where attendees number in the thousands.

 

 Starting Speaking 

Everybody has to start somewhere. The people that speak at the large conferences have perfected their craft at the smaller events. The smaller arenas make for a tighter audience where material can be first presented, then improved over time. As important as the adjustments in the material are the changes in the presenters. They figure out the things that do, and do not, work for them. They learn how to read the crowd, handle questions, push through when problems arise with demos, and pace themselves to give a great and informative session which gives the attending audience great things to walk away with.

Breaking into this speaking arena is as difficult as it is daunting. This is why things like SQL Saturday and local user groups are so great. They allow a fairly safe opportunity for people to get their feet wet. Despite the relatively low visibility a presenter may get from an event like this it is still incredibly scary for them to step up in front of complete strangers that first time.

 

Anna’s Story

Anna (not a real person, name randomly generated) has worked with SQL Server for 10 years and knows the ins and outs of optimizing stored procedure performance. She makes the decision that she would like to share some of this knowledge with other people, and so spends a few hours making an outline of a presentation with all the things she wants to put in there. After figuring out the key items she puts together an abstract for her local SQL Saturday event and it gets accepted. After another 40 hours of work Anna has a presentation that she feels will wow the crowd. She shows up Saturday morning and walks into a room. There’s a projector in front of her, a screen behind her, and 50 people seated, waiting for her to start. Another 15 people file in, there are no chairs, so they are standing around, leaning against the walls. Anna looks around trying to find a friendly face, but there is nobody she knows there.

 Anna manages to get her laptop connected and is projecting things at the right resolution on the screen behind her. That’s never an easy task, especially when people are watching. The clock ticks round and she starts her presentation. As she welcomes everyone to the room and gives a brief outline of what people should expect from her session she sees a couple of people get up and leave. 

 Anna now wonders what she did wrong. What happened that these people left? What did she just say? 

A little rattled, Anna tries to remain on track and gets into her slide deck. She explains the concepts that are up on the screen and a man at the back of the room raises his hand and asks a question. Fortunately it’s a quick one and she’s able to answer briefly and jump into a demo. The first demo goes well, but the guy is raising his hand again and asking another question. This one is a little more complicated. Anna tries to explain but the guy isn’t getting the answer and is starting to get a little impatient with her. Other people are turning to look at him and giving looks, but nobody is saying a word. Isolated, Anna tries to handle it as best she can and return back to the presentation.

 As Anna progresses through her slide deck she can hear the guy at the back mumbling to himself. Then he opens up and instead of asking a question blurts out loudly that she’s wrong and he knows it. Anna isn’t sure how to respond to this. She tries her best to push on but she is feeling intimidated by this loud individual who seems to want to do nothing more than prove how much smarter he is than her. 

Then there is an issue with one of the demos. A query doesn’t behave how it should and Anna is confused as to why. She pushes on, hoping that nobody noticed what happened. Of course the one obnoxious guy is quick to point it out. 

 Exhausted from this Anna wraps up the remaining slides quickly, thanks the room, and shuts down her laptop. People slowly file out, dropping slips of paper containing feedback on a desk as they do so. A couple of people stop by to say thank you for the presentation and ask where they can get copies of the slides and demos. She points them to her website, they thank her and leave. Someone else stops to ask a question, but exhausted she can barely answer.

The next presenter comes wandering in the room getting ready to setup for her session. She asks how it went. Anna just looks at her and shakes her head dejectedly. 

 Anna walks out of the room vowing to never present again. It was a horrible experience that she never wants to repeat. She leaves behind 40 slips of paper that rate her session as a 5 out of 5. She misses the comments that state that this was one of the best sessions that they’ve seen and how her explanation of parameter sniffing finally made them get it and that they can’t wait to take all that they’ve learned back to the office and explain it to the devs and DBAs so that they can get it too.

The community just lost a great speaker who can reach people on topics and that has a wealth of experience behind her. 

  

To SQLSaturday and User Group Event Organizers

Anna is just a single example of what new speakers run into constantly. I’ve seen this happen first hand to new presenters. It’s not pretty and it’s not right. Public speaking is a deep held fear for most people, so standing up in front of an audience is a very scary proposition.

I would like to make a suggestion that, in my opinion, should be implemented at all events. When someone is new (or newer) to speaking given them a “buddy.” Pair them up with a more experienced speaker who has done the circuit, is comfortable in front of a room, and knows how to handle a crowd or a question. Give that new speaker a friendly face in the room.

 Does it make a difference? Ask Anna what it would have been like if she had been able to look out and see someone give her a smile and a nod, just letting her know that she was doing a good job. 

How much easier would it have been if Anna had someone to help her get setup on the projector?

What about someone to stand up when and ask if the intense questions could be taken offline so that a more in depth discussion could take place to provide the answers that were being asked for?

How would Anna have felt if someone had asked the gentleman berating her to please stop and raise any big issues after the presentation? Or backed up what Anna had been saying?

 If someone had been there would Anna have been as exhausted when things were done? Would she have quickly skipped over peoples questions? Or would she have been engaged, and thrilled that people wanted to hang out and ask about the things she presented? Would she have walked out of the room deciding to never present again, or would she have eagerly grabbed the feedback papers and gone through them trying to gauge peoples feelings?

The chances are that Anna would have walked out feeling tired, but great about what she did. She would have been talking away with her “buddy” about what worked, what didn’t, and asking if they had any idea what happened with that demo that went wrong. They would walk away, tweak the session some, add some things, remove some things, fix that demo and move on to submit and present at another event.

On top of that Anna would have made at least one friend that day. The “buddy” never really goes away. It’s someone that they keep in contact with, talk to every once in a while, and bounce new ideas off of. The community grows by one new fully engaged member. 

  

To New Speakers

If you are a first time or reasonably new speaker and are feeling nervous about giving that presentation, speak to the event organizers. Ask them ahead of time if there is someone who might be able to come sit in your session to provide some support.

Should you get no assistance or response from the organizers then try to find someone yourself. For a local user group look to find folks who go regularly and drop them a line, see if you can get together beforehand. For a SQL Saturday ask people in the speaker room; scary though they may seem they are really decent people who would be more than happy to help.

 Don’t ever feel like you have to go out there alone. There are always options to get someone on your side.

  

TL:DR

  • Organizers: don’t let new speakers go out and present without giving them individual support 
  • Presenters: if the organizer doesn’t help you, find someone who can be in the room when you present to give you some support

 

 

SHOWPLAN Permission Denied For A Sysadmin

I came across a curious issue the other day when trying to look at the actual execution plan of a stored procedure that was being developed. Strangely I couldn’t get an estimated plan either. All I got was the confusing error:

 

SHOWPLAN permission denied in database ‘TestDB’

As a sysadmin on the system in question I was very confused by this and it took a little bit of investigation to find out that the problem was actually being caused by the execution context of the stored procedure. 

Here’s a quick repro:

 

First, create a new user and table, then load that table up with a few rows (just for demo purposes)


IF (SELECT name FROM sys.databaseprincipals WHERE name = 'TestUser') IS NULL
  
CREATE USER [TestUser] WITHOUT LOGIN;
GO

IF (SELECT OBJECTID('dbo.SomeData', N'T')) IS NULL
BEGIN
   CREATE TABLE
dbo.SomeData
      
(
          
Id INT IDENTITY(1,1) NOT NULL
           ,
NothingNess CHAR(10) DEFAULT ''
          
, CONSTRAINT PK_SomeData PRIMARY KEY(Id)
       )
END
GO

INSERT INTO dbo.SomeData DEFAULT VALUES;
GO 100

Now we’ll create a couple of procs to pull data from the table. The first will execute under the context of the calling user, and the other under the context of the user TestUser that was just created.

CREATE PROCEDURE dbo.GetSomeData @Id INT
AS
SELECT
Id, NothingNess
FROM dbo.SomeData
WHERE Id = @Id
GO

CREATE PROCEDURE dbo.GetSomeDataTestUser @Id INT
WITH EXECUTE AS
'TestUser'
AS
SELECT
Id, NothingNess
FROM dbo.SomeData
WHERE Id = @Id
GO

Now we can execute both of these procedures and get results just fine.

 

EXEC dbo.GetSomeData 1;
EXEC dbo.GetSomeDataTestUser 1;

However if we attempt to run while grabbing the actual execution plan, or attempt to grab an estimated plan then the second call dbo.GetSomeDataTestUser will fail with the SHOWPLAN permission denied error.

What happens is that when the stored procedure runs it does so under the context of TestUser, and that user does not have SHOWPLAN permissions in the database. This means that you cannot display an execution plan, not could you gather TIME or IO statistics. 

This didn’t make much sense to me at first, just because I was a sysadmin, but after thinking for a while it is actually the correct way to handle things. All the permissions are contained for execution of the code within TestUser. So if TestUser does not have the rights to do something, or access something then the calling user would not be able to get to that information.

As a workaround for this all that needs to happen is to grant the showplan permission to TestUser and everything will be fine. For example:

GRANT SHOWPLAN TO TestUser;

It’s worth noting that this permission is database scoped, so don’t expect to execute it in one database and have it work everywhere.

Technet has a good write-up about all the things covered by SHOWPLAN, it is worth a read for more information.


TL;DR – If you get SHOWPLAN denied errors check that the context for the user actually execution the query has the requisite permissions, and grant showplan if needed. 

SQL Server 2012 Data Loss Bug

Microsoft have just posted a hotfix for a very serious bug in SQL Server 2012

This bug could potentially cause you to lose data during online index rebuilds.

 

At this time I would not recommend downloading the hotfix (wait for a CU which will address the problem). In the meantime follow the posted workaround which is to perform online index rebuilds with a MAXDOP of 1. For example:

 

ALTER INDEX <idxname> ON <objectname> REBUILD WITH (ONLINE=ON, MAXDOP=1);