A Better Way To Get SQL File Space Information

I wish this were actually a better way; really it's an idea for a better way to get at the data.

Something that has come up at every place I've worked is the desire to monitor growth of the data and log files for databases. If you aren't using third-party monitoring software this is a pretty painful process that forces us back into cursor-type operations, gathering information one database at a time (although this does not apply to transaction logs).


Gathering Log Space

To gather log space today you would use DBCC SQLPERF('LOGSPACE'), which gives you the size and used percentage for every log file on the server.

Dumping that information into a table allows you to manipulate it to figure out usage in KB/MB/GB (rather than just a plain percentage).

As an example (GatherLogSpace.sql):

Gathering Log Space
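The original script isn't shown inline here, but a minimal sketch of the approach looks like this (the table and column names are mine, not necessarily those in GatherLogSpace.sql):

-- Capture DBCC SQLPERF('LOGSPACE') into a temp table, then turn
-- the used percentage into actual MB used and MB free.
CREATE TABLE #LogSpace
(
    DatabaseName    sysname,
    LogSizeMB       float,
    LogSpaceUsedPct float,
    Status          int
);

INSERT INTO #LogSpace
EXEC ('DBCC SQLPERF(''LOGSPACE'')');

SELECT  DatabaseName,
        LogSizeMB,
        LogSpaceUsedPct,
        LogSizeMB * (LogSpaceUsedPct / 100.0) AS LogSpaceUsedMB,
        LogSizeMB * (1.0 - LogSpaceUsedPct / 100.0) AS LogSpaceFreeMB
FROM    #LogSpace
ORDER BY LogSpaceUsedMB DESC;

DROP TABLE #LogSpace;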

This is a reasonable amount of work, but what if you want to get the space used in data files? Well, that gets a whole lot more complicated.


Gathering Data File Space

In order to gather information on space used in data files you actually have to go to each and every database and run a query using sys.dm_db_file_space_usage. 

As an example (GatheringDataFileSpace.sql):

Gathering the space available in data files
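Again the original script isn't inline, but a sketch of the pattern looks like this; I'm using the undocumented sp_MSforeachdb for brevity, and a cursor over sys.databases would do the same job:

-- Visit every database and capture its data file usage.
-- Page counts are 8 KB pages, so divide by 128.0 for MB.
CREATE TABLE #DataFileSpace
(
    DatabaseName  sysname,
    FileId        int,
    TotalMB       decimal(18, 2),
    AllocatedMB   decimal(18, 2),
    UnallocatedMB decimal(18, 2)
);

EXEC sp_MSforeachdb N'
USE [?];
INSERT INTO #DataFileSpace
SELECT  DB_NAME(),
        file_id,
        total_page_count / 128.0,
        allocated_extent_page_count / 128.0,
        unallocated_extent_page_count / 128.0
FROM    sys.dm_db_file_space_usage;';

SELECT * FROM #DataFileSpace
ORDER BY DatabaseName, FileId;

DROP TABLE #DataFileSpace;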

Strangely, the sys.dm_db_file_space_usage DMV only contains data file information; the log file data can only be obtained using the first process. This means there is no single place you can go to get file usage information for a whole database.


What about PowerShell?

PowerShell could give you the data, but it is still a cursor-type operation: you have to work through each database, drilling down to the log information, then to each filegroup and the files inside it.

What we really need is a nice set-based query that we could run in T-SQL to return all of the required information at once.


What I Would Like To See

Right now you could go and grab the size for each and every file for every database inside of SQL Server by querying sys.master_files:

Basic information from sys.master_files
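Something along these lines; size is stored in 8 KB pages, so divide by 128 to get MB:

SELECT  DB_NAME(database_id) AS DatabaseName,
        file_id,
        name AS LogicalName,
        type_desc,
        size / 128.0 AS SizeMB
FROM    sys.master_files
ORDER BY DatabaseName, file_id;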

There's a whole lot more information in sys.master_files, including growth data, so you can know exactly how much each file will grow the next time it does.
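For example, the growth and is_percent_growth columns let you calculate the next autogrowth increment, as a sketch:

-- growth is a page count when is_percent_growth = 0,
-- otherwise a percentage of the current file size.
SELECT  DB_NAME(database_id) AS DatabaseName,
        name AS LogicalName,
        size / 128.0 AS SizeMB,
        CASE WHEN is_percent_growth = 1
             THEN (size / 128.0) * (growth / 100.0)
             ELSE growth / 128.0
        END AS NextGrowthMB
FROM    sys.master_files
WHERE   growth <> 0;   -- files with autogrowth disabled are excluded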

Even though there's all this great data, there is nothing that gives us space used. It would probably add too much overhead to keep that information up to date, but there's no reason we shouldn't be able to easily get at it when we want to.

Ideally we would see the existing DMV changed into a function, which would accept a database_id and file_id, and return the same data that it does now.

If that were the case it would be simple to run one query that would give us the size of each file, used space, free space, and how much the file would grow the next time an autogrowth hit.

This could quickly and easily give us all the data we want:

Mock up of easily obtained file space information
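To be concrete about the ask, this is the kind of query that would become possible. It does not work today; it's purely a mock-up of the DMV reimagined as a function:

-- HYPOTHETICAL: assumes sys.dm_db_file_space_usage accepted
-- (database_id, file_id) parameters, which it currently does not.
SELECT  DB_NAME(mf.database_id) AS DatabaseName,
        mf.name AS LogicalName,
        mf.size / 128.0 AS SizeMB,
        fsu.allocated_extent_page_count / 128.0 AS UsedMB,
        fsu.unallocated_extent_page_count / 128.0 AS FreeMB,
        CASE WHEN mf.is_percent_growth = 1
             THEN (mf.size / 128.0) * (mf.growth / 100.0)
             ELSE mf.growth / 128.0
        END AS NextGrowthMB
FROM    sys.master_files AS mf
CROSS APPLY sys.dm_db_file_space_usage(mf.database_id, mf.file_id) AS fsu;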

In the hope of seeing this happen I've created a Microsoft Connect item - sys.dm_db_file_space_usage Should Be A Function

If you manage databases, and it's something you'd really like to have (who wouldn't?), then please go and give it an upvote. There's a chance it might make it into the product at some point, and that would make a DBA's life a little bit easier.


Traffic Flow With Read-Intent Routing

One of the big advantages of using SQL Server Availability Groups is the ability to automatically push read traffic over to a secondary server. This is particularly useful for larger queries that take a few seconds to run and consume large amounts of resources. It's not recommended for short, fast queries, simply because the additional latency of connecting to the secondary could slow down the overall response time for the query.

The Microsoft documentation on setting up Read-Only Routing in SQL AGs is pretty solid and explains how to get this up and running.
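For reference, the configuration boils down to statements like these; the AG name, server names, and URL below are placeholders matching the two-node example later in this post:

-- Allow read-intent connections on the secondary and give it a
-- routing URL, then define the routing list on the primary.
-- The primary (SQL1) goes last in the list as a fallback.
ALTER AVAILABILITY GROUP [MyAG]
MODIFY REPLICA ON N'SQL2' WITH
    (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));

ALTER AVAILABILITY GROUP [MyAG]
MODIFY REPLICA ON N'SQL2' WITH
    (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SQL2.mydomain.com:1433'));

ALTER AVAILABILITY GROUP [MyAG]
MODIFY REPLICA ON N'SQL1' WITH
    (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST = (N'SQL2', N'SQL1')));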


Firewall and traffic routing

In secure environments there is usually a firewall between the front-end web, application, or mid-tier servers and the back-end database server. This firewall blocks all traffic to the back end except for specific ports to specific IP addresses. This is one of the defense-in-depth items that helps to keep your databases secure.

When using a firewall in conjunction with SQL Server Availability Groups (AGs) it is common to open the firewall only to the AG Listener. That way there is a single IP open for all the database servers in the AG, and any machine that is not acting as the AG primary is not reachable through the firewall (reducing attack vectors again, a nice side effect).

Given this, you might well expect that traffic routed to a readable secondary in the AG would follow a particular flow.

The client (either directly or through a web, app, or mid-tier server) performs an action that issues a read query against the AG Listener. The expected traffic flow would be as follows (speaking in terms of the IP addresses we would see; the AG Listener address actually lands on the primary, in this case SQL1):

Client - AG Listener - Readable Secondary - AG Listener - Client
or
Client - SQLAGL01 - SQL2 - SQLAGL01 - Client

This way the primary server (in this case SQL1) would arbitrate all the traffic for the query. In fact, read routing does not function this way.

To actually deliver the reduced load on the primary, the primary instead tells the client to redirect to the secondary server, so the process goes:

The correct communication is:

Client - AG Listener - Secondary - AG Listener - Client - Secondary - Client
or
Client - SQLAGL01 - SQL2 - SQLAGL01 - Client - SQL2 - Client

When the client request comes in, SQL Server has to check that the readable secondary is available to accept the query (otherwise it moves to the next server in the routing list, which is why you should always have the primary as the last server in the routing list, just in case every other server is out of service).

This means the query will take a little longer to start, as the arbitration and network changes add a few extra milliseconds, which is why this is not ideal for small, fast selects.


Where does the firewall come in?

Using a firewall and only opening up the IP of the Listener is the best way to handle security, but if you want to use readable secondary servers and read-intent routing, that's not going to work. Due to the way the traffic is routed, you need to open the firewall to each individual server and port that could act as a secondary.

So in our example above the firewall would need to be opened to SQLAGL01, SQL1 & SQL2 in order to support client requests. If those rules aren't in place then your client traffic will be blocked and you'll get the dreaded "Named Pipes Provider: Error 40" error, which isn't much of a help.


Testing your read-intent connections

A really useful way of testing your read-intent connections is to run a quick PowerShell script from your front-end server (if it's running Windows) prior to putting it into rotation. Download Check-ReadRouting.PS1 and enter the AG Listener name or IP address, and the name of a database in the AG. If things are working correctly it will return the name of the primary and the first server in your read-only routing list.

If you get a timeout then you have either not set the read-intent URL correctly for your secondary, or you are having firewall issues connecting, and so should investigate further.
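If you don't have the script to hand, sqlcmd offers a similar quick check, since its -K switch requests a read-intent connection (the listener and database names here are placeholders):

sqlcmd -S SQLAGL01 -d MyAGDatabase -K ReadOnly -Q "SELECT @@SERVERNAME;"

If routing is working, that returns the name of the readable secondary (SQL2 in the earlier example) rather than the primary.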


Read-routing can be really powerful and useful; you just have to be careful of the gotchas in getting it working correctly.


Querying Change Tracking Tables Against a Secondary AG Replica

If you aren't familiar with Change Tracking I would recommend heading out and reading Kendra Little's Change Tracking master post, which contains a wealth of information.

I've been using CT for a while now and it does what it says on the box. The performance can be painful when querying tables that have changed a lot (the CHANGETABLE function performs a huge amount of aggregation and seems to be optimized for 1000 rows of data). One of the things I've always wanted to do is perform loads into an ODS from a DR site.

I use Availability Groups to ensure that a near real-time copy of the data is kept in another data center in another part of the country. I've tried a couple of times to query the change information from one of the secondary replicas, but sadly it's not supported, and so I would get the error:

Msg 22117, Level 16, State 1, Line 1
For databases that are members of a secondary availability replica, change tracking is not supported. Run change tracking queries on the databases in the primary availability replica.


Yesterday I was messing around with database snapshots and was really happy to discover that it is possible to use the CHANGETABLE function against a snapshot without receiving any errors. This only works against readable secondary replicas (as the database needs to be online in order to take the snapshot).
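A rough sketch of the pattern, run against the readable secondary; the database name, logical file name, snapshot path, and table are all placeholders:

-- Create a snapshot of the secondary copy of the database.
-- NAME must match the logical data file name of the source.
CREATE DATABASE MyDB_CT_Snap
ON (NAME = MyDB_Data, FILENAME = N'S:\Snapshots\MyDB_CT_Snap.ss')
AS SNAPSHOT OF MyDB;
GO

-- Query change tracking from the snapshot instead of the replica.
USE MyDB_CT_Snap;
GO
DECLARE @last_sync_version bigint = 0;  -- the version your ODS last loaded
SELECT  ct.SYS_CHANGE_VERSION,
        ct.SYS_CHANGE_OPERATION,
        ct.MyKeyColumn
FROM    CHANGETABLE(CHANGES dbo.MyTable, @last_sync_version) AS ct;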

This is also the case with log-shipped copies of databases: if the database is in standby then you can query CHANGETABLE directly, or do so off a snapshot.


It may not seem like a big deal, but if you load data into an ODS or data warehouse server that isn't located in the same place as your AG primary, this is huge: you can asynchronously write data over the WAN and then do all your data loads local to the ODS. This is far more efficient (and your network admin will like you a lot more) than pulling the data over in one large chunk nightly and saturating part of the network.

Just another way that you can make your DR system work for you.