T-SQL April 18, 2022April 19, 2022

Using STRING_AGG to Return Schema Information

A few weeks ago, someone came to me wanting a dump of tables in a database and all the columns as well. They were working on some data governance and the first step was to know roughly what existed in the database.

After a chat, we decided what they needed was a list of columns and all the tables where those columns were used and another list of tables with a list of the columns used in the table. To help import the data into their tooling they needed a comma separated list of the tables for each column and the columns for each table.

After spending many years using various forms of FOR XML to do this sort of thing, I decided it was time to make my life easier and use STRING_AGG instead.

STRING_AGG can take a set of data and use the provided delimiter to create a list. As a bonus, you can also use the WITHIN GROUP to order the list of items within the STRING_AGG statement.

Here’s a quick example of I used STRING_AGG to return the table and column data requested by the governance folks.

	/* Return a list columns in the database and for each column a comma delimited list of tables in which that column appears */
	SELECT
	c.name AS ColumnName
	, STRING_AGG(CONCAT( QUOTENNAME(s.name), '.', QUOTENAME(t.name)), ', ')
	WITHIN GROUP (ORDER BY CONCAT(QUOTENAME(s.name), '.', QUOTENAME(t.name))) AS TableListForColumn
	, COUNT(*) AS TableUsageCountForColumn
	FROM sys.tables AS t
	JOIN sys.columns AS c ON c.object_id = t.object_id
	JOIN sys.schemas AS s ON s.schema_id = t.schema_id
	GROUP BY c.name
	ORDER BY TableUsageCountForColumn DESC
	, ColumnName ASC;

	/* Return a list of tables in a database and for each table a comma delimited list of columns in that table */
	SELECT
	CONCAT(QUOTENAME(s.name), '.', QUOTENAME(t.name)) AS TableName
	, STRING_AGG(c.name, ', ' )
	WITHIN GROUP (ORDER BY c.name) AS ColumnList
	, COUNT(*) AS CountColumnsInTable
	FROM sys.tables AS t
	JOIN sys.columns AS c ON c.object_id = t.object_id
	JOIN sys.schemas AS s ON s.schema_id = t.schema_id
	GROUP BY concat(quotename(s.name), '.', quotename(t.name))
	ORDER BY CountColumnsInTable DESC
	, TableName ASC

view raw GetSqlServerColumnUse.sql hosted with ❤ by GitHub

PASS, Random August 19, 2020

The Changing Community Landscape?

Just dropping in the thoughts I posted in a Twitter thread earlier today –

Random thoughts on the changing landscape around the SQL Server community and training events (more a train of thought than well constructed reasoning with data, so don’t expect some kind of wonderful insight here). Should the demise of PASS happen it will leave a significant gap in the “independent” training space (I say independent here despite times when folks on the board are working for vendors – I genuinely feel that they have overall integrity when it comes to fairness and note that I am excluding C&C from this).

Continue reading →

AlwaysOn, Availability Groups, Backups May 19, 2020

The Deception of the Log Reuse Wait Type

One day you’re happily working away and get interrupted by an alert that says your transaction logs are getting full or maybe growing. Ever the vigilant person you logon, crank open SSMS or ADS and take a look at sys.databases to check out the log_reuse_wait_desc.

AVAILABILITY_REPLICA the DMV proudly advises. Availability Groups – the lifesaver of your organization and the bane of your nightmares.

You crack open a SQL script and run it against the primary to see which replica is running behind. Strangely all the replicas appear to be up to date. The send queue is hovering around 60. The redo queue is not greater than a thousand on any of the asynchronous replicas.

Unsure why this is happening you use the tried and trusted method of restarting the hadr endpoint to see if that kicks things into motion. Nothing is resolved. The log is geting fuller and fuller and you really don’t want that thing to grow.

What do you do next?

You may go around to the secondary replicas and remove them from the AG, one at a time, to see if they fix the problem. Alternatively, you go ahead and check the running sessions on each of the replicas to see if there’s something going on you haven’t considered.

Your query discovers that the AG replica in your DR site is still taking a backup from overnight. It should have completed hours ago, but for some reason it is still running. Wondering if it is related you kill the backup.

With the backup dead, suddenly the transaction log on your primary clears and you can get back to your interrupted ~~nap~~ hard work.

This is an example of a choice in messaging within SQL Server that makes it more difficult to find a solution.

In this case the problem of the transaction log not truncating would have ideally been marked as ACTIVE_BACKUP_OR_RESTORE, after all, it was a backup that was blocking the log truncation. But the folks that wrote the product felt AVAILABILITY_REPLICA was more appropriate because the backup was running on a different replica than the primary and that was the root location for the truncation issue.

So, be aware, messaging may not always accurately reflect problems that are occurring.

Backups April 2, 2019

The Curious Case of the Restore Slowdown

For quite some time I had been seeing a situation where database restores were slowing down after 50-60% of the restore was completed (you can output restore progress using WITH STATS = <some value 1-100> or use DMVs or sp_whoisactive). This was not really a big deal with small databases, it would just take an extra couple of minutes to complete. But when I had the need to restore a 16 TB database it became a real problem.

The restore would kick off at a decent rate giving me around 2 GB/sec (restoring over a WAN from multiple files to a server with 144 cores, 512 GB of RAM, and multiple data files on NVME storage). This was acceptable. Then, a little over halfway through the restore performance dropped to 200 MB/sec and then 100 MB/sec (this being viewable through perfmon)

Restore Performance Drop

This issue I could reproduce time and again. Something was up with the product.

In investigating I kicked off another restore and watched the process. Things started swimmingly reading from the 4 backup files on the NAS and processing fast. Then the restore seemed to complete reading 3 of those files and slow down dramatically reading the fourth one. At the same time I could see a single core on the database server being pegged at 100% usage.

There wasn’t anything I could do to resolve this issue (short of using a ridiculous number of backup files in an attempt to make the low perf period as short as possible).

Given that this had the potential for a real RTO/RPO problem it was time to call in the big guns over at Microsoft support.

We ran through a few more scenarios including taking stack dumps during the period when the restore was low (DBCC STACKDUMP <- NEVER EVER DO THIS ON A PROD SYSTEM).

Working through the dumps the MS support engineer found bug from last year whereby disks formatted with a 4K sector size could experience slow restore performance.

I checked up on my servers to confirm that we were using 4K using the built in FSUTIL command. In this case FSUTIL FSINFO NTFSINFO <drive>. This provides a wealth of useful information about the disk, and showed that it was indeed using a 4K sector size.

4K sector size

Fortunately this issue was fixed in October last year with

SQL Server 2014 Service Pack 3
SQL Server 2014 Service Pack 2 – Cumulative Update 12
SQL Server 2016 Service Pack 2 – Cumulative Update 1
SQL Server 2016 Service Pack 1 – Cumulative Update 9
SQL Server 2017 – Cumulative Update 7

The sad thing for me was that the particular instance I was restoring to was SQL Server 2012, for which no patch was released (as that version is out of mainstream support). But I was able to confirm that the slowdown did not happen when attempting to restore against an appropriately patched version of a fully supported version of SQL Server (time to put on the harder press for convincing management to allow us to upgrade).

So if you are seeing problems with restores getting slow and are on SQL Server 2014/2016/2017 get yourself patched up to date. If you are stuck on an older version either redo the low level formatting on your storage (not recommended), take uncompressed backups (mildly recommended), or work to get yourself on a fully supported version of the product.

PASS, Professional Development, Rant April 1, 2018

The Argument Against Attending The PASS Summit

Back in 2016 I wrote a post called The Missing Piece of SQL Conferences which provided a perspective of a few DBAs, at various points in their careers, and what the Summit could provide in helping them to move forward. Upshot is that I believed that there was a serious gap for senior level DBAs (this could be true of other folks working in the SQL space, but I focused on what I know best). The post came to mind this week when I was talking with my boss.

“Are you planning on attending Summit this year?” he asked. Instinctively I went to respond in the affirmative, but stopped myself and thought about it for a few seconds. I contemplated the things that I had learned over the last few years of attendance, the things that I could potentially learn this year. This provided me with a few questions that I needed to address as someone with > 15 years as a DBA working with SQL Server.

Continue reading “The Argument Against Attending The PASS Summit” →

Rant January 25, 2018

High Availability–It’s Not All About The Hardware/Software

As I write this I’m sitting my my house working with the last hour of juice left in my laptop. We had a major Snow + Seattle = Bad Newssnowstorm here in the Seattle area which was followed by a big ice storm and some heavy winds, all of which have coincided to knock out power to over 200,000 homes. I’ve been without power for over 75 hours now and it’s given me a lot of time to think about multiple facets of High Availability (HA) and Disaster Recovery (DR).

HA/DR are critical items for all companies. HA in various degrees, but most certainly DR and the ability to provide some kind of business continuity model. Different companies, large and small, have different requirements around HA/DR. Some could easily remain down for a couple of days with no lingering effects, others cannot sustain downtime on their most critical systems of more than a couple of minutes. Based upon those requirements (and any budget constraints) a company could implement HA/DR using Failover Clustering, Mirroring, Log Shipping, SAN Replication, Peer-to-Peer Replication, VMotion or some other technology.

Where HA/DR often falls down

This is all good and well but what about the people that you need in order to get things up and running, the folks that you need in order to provide you with ongoing support in the event that a situation arises where you have to go to your DR site, or provide support for that critical system where a mirror has failed?

In the last 80 hours I have been called by the NOC at work stating that they have an issue. I don’t have power at home so my internet access is off. I don’t have sufficient power in my phone to tether it (don’t tell the wireless company I can do that) and it would take 4 hours sitting in the car to get sufficient power to allow me to do that. I do have a mobile broadband USB device, however I’m not getting coverage in my house right now from that carrier and I have been unable to get out of my road. In other words in the whole HA/DR side of things I’m a single point of failure. Luckily I have a colleague with power who I was able to point the NOC to in order to provide resolution to the issue.

My colleague and I live less than 5 miles apart, we are just luckily on different circuits and substations for the power company. If neither of us had been able to dial in then the problem would not have been resolved.

If you truly want to implement HA/DR you need to go further than just looking at your hardware infrastructure or the way that your software is setup. It means more than your DNS configuration and your Active Directory setup. There’s more to it than setting a Failover Partner in your application connection string. It’s about the people that you need to support these things!

Artificial limitations

I’ve been a DBA for about 12 years now. I’ve not been hands on with a server since doing datacenter implementation work about 5 years ago. Other than in very small shops it’s rare that a DBA touches a server, most don’t even have access to the datacenter in which their database servers reside. There has been a huge move towards setup in Colocation Datacenters (CoLo’s) over the last few years, meaning that more often than not the datacenter where you servers reside are probably not even in the same physical building that you are. There is also the current move to the Cloud initiative that’s gaining some traction. Given all of this information I ask you this:

Why are DBAs so often tied to a physical location for work?

This is a serious question. There are so few telecommuting options for DBAs out there. You can look on the job boards and see virtually none. There’s a lot of relocation opportunities if you want to move to another state, but it doesn’t make a lot of sense. If you are going to geographically disperse your hardware due to HA/DR needs why would you not do that same for the teams that have to support it?

So many companies point blank refuse to even entertain the idea of telecommuters from another state (heck the vast majority won’t even contemplate allowing IT support staff to work from home a couple of days a week).

The same companies that happily outsource all development processes to an unknown entity in India or China refuse to entertain the idea of allowing Windows/Unix Admins or DBAs to telecommute.

I feel that it is somewhat ironic that the same people that expect you to come to the office and be tied to a physical location also have you be on call so that you can provide remote support when something goes wrong.

Moving beyond the office

Forcing IT support staff to come to a physical location 5 days a week makes little to no sense to me. I can understand the desire to have folks in the same physical location but as your organization grows it really needs to find a way to get past that mentality. There are various options such as LiveMeeting or GoToMeeting that can provide a telepresence in meetings for people that work remotely. Instant Messaging can be used to constantly keep people in touch. Services such as Yammer can be used to quickly and easily provide collaborative thought outside of meetings.

Looking at my current situation, if my colleague and I had both been without power we would have had no way to provide support. Our servers sit in a CoLo that we don’t have access to so even if we could have gotten there, we would have had no hands on way to work with the servers.

Now imagine we had a third person on the team that lived and worked in another state (or that the second person was living in another state and telecommuted in). Even with power down for me (or the entire state) we would still have someone available to provide support (unless by extremely unusual circumstances they were also down). With a remote DR site they could have effected whatever steps were necessary to bring that online and check that things were working ok.

Let’s look at this from another angle. I’m in the Pacific Time Zone and get to work usually 7:00-7:30am and leave 4:30-5:00pm. If we had someone working in the Eastern Time Zone and they worked the same hours we would have coverage (without emergencies handled by being on call) from 4am PST – 4:30pm PST. That’s an extra 3 hours of standard DBA coverage. Imagine how much more work could get done. How overnight problems could be identified and resolved earlier?

We live in a global society and the company that I work for does business internationally. Would it not make a lot more sense to provide a wider margin of coverage? Think about it, it would also improve the quality of life for anyone that had to be on call. For me the worst time to get called out is about 4am as I’ll never get back to sleep. If there was someone already in the office on the east coast I wouldn’t need to get called out. The same for them with evening plans. They could go out to dinner or a movie knowing that they wouldn’t have to worry about missing a call.

TL;DR

It’s the 21st century, there are many options to prevent the need for anyone to be physically tied to an office location (certainly amongst IT staff). In order to provide higher levels of support to companies they really need to embrace telecommuting options beyond providing emergency support.

Late Note: After 80 hours PSE were able to restore power to my neighborhood. Throughout the outage PSE did a fantastic job of handling things. Ultimately over 375,000 were without power at one time or another. PSE provided constant information wherever possible via their Twitter feed and website. They brought in crews from around the country to assist in the restoration efforts. All in all I think they handled things really well. I have no complaints with how they handled any of this. (I just wish I’d been closer to the top of the restore list)

Career August 25, 2017

How To Not Be a Commodity DBA – Part 1

A couple of months ago I wrote about the concept of the DBA as a commodity (DBAaaC), and how it can be so easy for management to consider any DBA as a basic, and replaceable resource within their organization. This got me to thinking (and it would seem that my thinking takes a long time) that maybe a little guidance would be in order to help prevent you from becoming a DBAaaC.

Not everyone is in the same situation, so I’ve broken this down in to a couple of parts, one for those of you that have the time, and inclination to get things done, and those of you that can barely find a half hour to themselves.

First up…when time for you isn’t a problem…

Continue reading “How To Not Be a Commodity DBA – Part 1” →

Random August 16, 2017

The Nature of Absurdity

The other night I was adjusting some quorum settings for the WSFC that was underlying a couple of AGs at work. Quorum needed fixing up as we were going to perform a site failover while some network switches were updated on the primary site, and I didn’t want to risk the potential of an outage should a WAN link go down between the primary DC and the DR DC. Fortunately PS makes this pretty simple, I just ensured that quorum would be maintained by the votes in the DR DC by removing them from a couple of machines in the primary DC:

(Get-ClusterNode -Name MySQLNode1).NodeWeight=0;

(Get-ClusterNode -Name MySQLNode2).NodeWeight=0;

And then checking the votes on the WSFC to ensure that the DR site would stay up no matter what:

Get-ClusterNode | ft Name, NodeWeight -auto

Then I failed over the AG, I could have used T-SQL to do this in SSMS or SQLCMD:

ALTER AVAILABILITY GROUP MyAG ON REPLICA = 'MyDrSQLNode3' WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);

ALTER AVAILABILITY GROUP MyAG ON REPLICA = 'MyDrSQLNode3' WITH (FAILOVER_MODE = AUTOMATIC);

ALTER AVAILABILITY GROUP MyAG FAILOVER;

But I’ve written a GUI interface to do this sort of thing in C#. This provides our NOC the ability to perform failovers themselves so that they can apply Windows CUs without having to call someone up to failover. The C# originally used T-SQL under the covers, but I’ve changed it recently so that it uses SMO (although have had to work around an issue using T-SQL):

AvailabilityGroup.Failover();

The once the FEX and switch work was completed I added the votes back to the primary DC and failed everything back again.

It was while I was doing all this that I got to thinking about just how absurd all of this stuff is.

20 years ago I would have laughed at the thought of all this random jargon and what it might mean
15 years ago I would have laughed at the though of Microsoft creating a proper shell language, with them preferring dim heavy VBScript
10 years ago I’d told you that you needed a disk quorum for your MSCS because otherwise you were going to have FCI problems with a machine was restarted
5 years ago I’d have laughed at the idea of using AGs for production (that laughter has turned to tears many times)
3 years ago the closest I would have gotten to C# would have been a tart orange
2 years ago the concept of being a major contributor to a large open source project would have been laughable
1 year ago I’d have given you an evil look at the merest mention of SMO

This stuff all moves so fast, it’s just ridiculous, there’s just no way to keep up with it all. Think of all the other changes over the last dozen years of SQL Server and Windows…things like DMVs, Azure, AWS, compression, TVPs, temp variables, CTEs, output clauses, lag and lead, partition by, partition tables, etc…

Spend a moment to think about all the stuff you’ve been through, for however long you’ve been doing this nonsense, and realize the absurd nature of it all.

By the way, this was deliberately acronym heavy, so here are some definitions:

AG – Availability Group
AWS – Amazon Web Services
CTE – Common Table Expression
DC – Data Center
DMV – Dynamic Management View
DR – Disaster Recovery
FEX – Cisco Fabric Extender (network stuffs)
FCI – Failover Cluster Instances
GUI – Graphical User Interface
MSCS – Microsoft Cluster Service (precursor to WSFC)
NOC – Network Operations Center (those folks who are the first in line when something goes bad, and also the ones that have the lousy job of waking you up in the middle of the night because something went wrong)
PS – PowerShell
- ft – Format-Table (in Get-ClusterNode command)
- -auto – automatically size data columns returned to the client
Quorum – a majority of votes in a WSFC that is required to ensure that the clustering services stay online to ensure that there is not a risk of a split-brain scenario which could cause massive data problems (oversimplification)
SMO – (SQL) Server Management Objects
SQL – Structured Query Language
SQLCMD – SQL Server Command Line Interface
SSMS – SQL Server Management Studio
T-SQL – Transact-SQL
TVP – Table-Valued Parameters
VBScript – Microsoft Visual Basic Scripting Edition
WAN – Wide Area Network
WSFC – Windows Server Failover Cluster

I’m pretty sure just going through all the SQL Server word salad is enough to drive a person to drink.

Availability Groups, SMO August 3, 2017

Bug – Getting Read-Routing Data Using SMO

I was futzing around with SMO recently, and wanted to grab information around the read-routing list in an Availability Group. Things went fine for 2012 and 2014, but when it came to SQL Server 2016 I ran into a problem.

The read-routing property in SMO is a string collection (reference https://msdn.microsoft.com/en-us/library/microsoft.sqlserver.management.smo.availabilityreplica.readonlyroutinglist.aspx), which is fine for 2012/4, but SQL Server 2016 introduced the idea of load-balanced read-routing. When used this provide you with the option of having multiple read-only replicas which can handle traffic, and are load-balanced (it only uses a basic round-robin algorithm, but that’s a lot better than the single replica option that exists in the earlier versions). In order to correctly know how read-routing is configured you need to know what replicas are in a load-balanced group. SMO does not provide you with the ability to get at this information (nor does it give you the chance to configure read-routing in a load-balanced scenario), as it is a basic collection, and has no further property information around it.

I’ve created a Connect item (Gathering Read Routing information using SMO is inaccurate in SQL 2016) to try and get MS to look into this, and maybe provide a fix, please upvote if you can.

If you want to try this for yourself you can use the C# below, try it against the lower versions, and then on a load-balanced 2016 configuration. While it’s still possible to get this data from TSQL I find it annoying that the server management objects that are specifically written to deal with this stuff don’t do the job.


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.SqlServer.Management.Smo;
using Microsoft.SqlServer.Management.Common;

namespace SmoTesting
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Enter the servername");
string connectServer = Console.ReadLine();
Console.WriteLine("Enter the AG name");
string agName = Console.ReadLine();


Server srv = new Server();
try
{
srv = new Server(connectServer);
srv.ConnectionContext.StatementTimeout = 60; //timeout after 60 seconds running the query

foreach (AvailabilityGroup ag in srv.AvailabilityGroups)
{
if (ag.Name == agName)
{
ag.PrimaryReplicaServerName.ToString());
foreach (AvailabilityReplica ar in ag.AvailabilityReplicas)
{
if (ar.Name.ToString() == "connectServer")
{
foreach (Object obj in ar.ReadonlyRoutingList)
{
Console.WriteLine(" {0}", obj);
}
}
}
}
}
}
catch (Exception ex)
{
Console.WriteLine(ex.InnerException.ToString());
}
finally
{
srv.ConnectionContext.Disconnect();
}
Console.WriteLine("press a key");
Console.Read();
}
}
}

Rant May 15, 2017

DBA as a Commodity

If you’ve been working around the IT industry for a few years then you have probably heard mention of commodity servers. These are machines that run a standardized hardware spec, and are easily interchangeable with others. The price point around the machines is that it is typically cheaper to replace than it is to repair them, so once the warranty runs out you just throw the old one away and get a new one.

I was thinking about this the other day, and how it really goes beyond servers, which is where I thought about Database Administrators as a Commodity (or DBAaaC).

When first considering DBAaaC as an idea it really does not seem to make much sense. After all DBAs have to do all kinds of different work, and many have very different sets of skills that are applicable to a wide variety of situations.

If you are thinking that then the odds are that you are a DBA, or have worked closely with one over the years. This is not something that the vast majority of people, in particular IT management type folks seem to think.

What makes me believe this to be the case?

DBA Job Requirements

I’ve been in the DBA business for about 17 years now (why yes, that does make me feel old, thank you for asking). In this time I have had a few jobs, interviewed for many, and been an interviewer for even more. I have seen more generic DBA job desscriptions than I care to remember.

Generic Job Description — Typical job description

There are hundreds of jobs descriptions for SQL Server DBA jobs that look almost exactly like the one above, do a search yourself and see what commonalities seem to exist between roles. Think back to the last job description that you read, was it something similar to this?

Would it surprise you to learn that the above job description was actually copied from the job description of a company that was looking for an Oracle DBA?

Certifications

How many jobs have you seen looking for someone holding an MCDBA? That’s a certification that existed through SQL Server 2000 and was replaced by the MCITP with 2005, and then with the MCSE Data Platform in the last couple of years. It would seem that with management team that these things are easily interchangeable, after all there really is not difference between running SQL Server 2000 running in a cluster, and SQL Server 2016 with a multi-subnet Availability Group running on a cluster that is not joined to a domain, using DMVs to help you identify performance bottlenecks.

What about jobs that are looking for an MVP? That’s an award given by Microsoft for community contributions. It is not a barometer of technical knowledge, and yet I have seen several DBA jobs looking for people who are MVPs.

Over the years I have seen many jobs that also will only consider people that have bachelor degrees, because apparently having a bachelor degree means that you will be good at your job. I do not have a degree (of any kind), and know several excellent DBAs that do not either, but every one of us would be excluded from these jobs just because we did not take four years of college (and we’ll ignore for now how that plays into systemic racism in the US).

Experience

Here are the general guidelines for DBAs that I’ve seen over the years:

Junior DBA – < 3 years of experience
DBA – 3-6 years of experience
Senior DBA – > 6 years of experience

These are usually baked right into the job descriptions, and it is how companies seem to manage peoples careers and job progressions (not to mention candidates looking for job seem to think that their skills put them in line for a job thanks to years working with SQL Server, not with actual knowledge).

This completely ignores whether a person looking for a DBA role has 5 years or varied experience, or 5 years doing the same thing day in and day out (checking if jobs ran, and answering business questions).

All the Technologies

Your generic job description will frequently be looking for someone with experience (if not expertise) with

SSIS
SSRS
SSAS/OLAP
DTS (yes really)
AlwaysOn (even though it’s not a thing)
MSCS (even though it was replaced by WSFC almost a decade ago)
VLDB (many places believe > 50GB = VLDB)
VMWare
Azure
TCP/IP
Windows 2003/2008/2008R2/2012/2012R2
TSQL
Maintenance Plans
Replication
and many more…

There tend to be long lists, like the above, for job roles. It is really just a list of buzzwords that the hiring manager (or the recruiting team) have put together so that they can plug them into their software to automatically weed out candidates that don’t have those words in their resumes.

The above examples are all ways to quickly recognize if a company is likely to treat anyone coming in as a DBAaaC.

What would be better?

A More Reasonable Job Description

Given the very generic information that is listed for almost all DBA positions what would make a role stand out? Usually it is something that is a little more crafted, and provides and understanding of the world in which the DBA works. For example a job that listed the following things:

Required 3+ years with Availability Groups
Understanding of DMVs and how to use them to troubleshoot server performance
Can tune queries to improve performance and lower server impact
Ability to manage SQL Server backups and restores to meet companies documented RPO/RTO
Experience managing maintenance for tables larger than 100GB in size
Knowledge of how Windows and Active Directory settings can impact SQL Server performance
Can work with development teams on best practice guidelines for new code, and provide performance enhancements when problems occur
Can troubleshoot SSIS packages
Preferred current MCSE certification

This by no means would be a perfect list, however it does provide some good information, and would let a potential candidate know that management here at least has a clue as to what is required to perform in the role, and provides a basis for a conversation.

Working as a DBA

Once you are in a company how can you recognize whether or not you are being treated as a DBAaaC?

Think about potential new hires that the company are looking to make. Are they using generic job descriptions?

Do promotions seem to happen thanks to longevity in the role, rather than be based upon some more hard based criteria around the work that you do? Is there a planned career path for you to move to the next position?

Do you perform the same work day in, and day out? Are you solely focused on backups, or indexing, or checking on failed jobs?

Is there a way that you can receive ongoing training to help you get additional knowledge? (even if this means just paying for some online videos, or giving you a section of time every month to spend on skills development)

Does your manager treat you like a truly valued member of the team, or are you just there ensuring that things are set to run smoothly?

If any of the above ring true to you then the chances are that you are seen as a commodity, and you can be thrown away at any point, and replaced by someone else (at least management believes that to be the case).

A Word of Advice to Management

There is every chance that you believe that the people that work for you are interchangeable, no matter the role (I’ve seen this over the years with developers, database, Windows, and Unix admins). People really aren’t, except at the lowest skill levels (and frequently even not then).

People are not machines, do not think that you can just throw the old one away and pick a new one up. Consider the ongoing learning that is required for your people to have skills that remain current, and help you move ahead. Certainly, at times you will have people leave, or you may have to let people go, but don’t treat your entire team as if they could be replaced at a moments notice.

If you think that all DBAs are the same, spend some time looking on Stack Exchange at some of the questions asked, and consider what questions on there your DBAs could answer (and what questions they may post). You’ll quickly see that DBAs are all different, they have different areas of focus, and each can give you different things. They are far from being a commodity.