Tag: Rant

High Availability–It’s Not All About The Hardware/Software

As I write this I’m sitting my my house working with the last hour of juice left in my laptop. We had a major Snow + Seattle = Bad Newssnowstorm here in the Seattle area which was followed by a big ice storm and some heavy winds, all of which have coincided to knock out power to over 200,000 homes. I’ve been without power for over 75 hours now and it’s given me a lot of time to think about multiple facets of High Availability (HA) and Disaster Recovery (DR).

HA/DR are critical items for all companies. HA in various degrees, but most certainly DR and the ability to provide some kind of business continuity model. Different companies, large and small, have different requirements around HA/DR. Some could easily remain down for a couple of days with no lingering effects, others cannot sustain downtime on their most critical systems of more than a couple of minutes. Based upon those requirements (and any budget constraints) a company could implement HA/DR using Failover Clustering, Mirroring, Log Shipping, SAN Replication, Peer-to-Peer Replication, VMotion or some other technology.

 

 

Where HA/DR often falls down

This is all good and well but what about the people that you need in order to get things up and running, the folks that you need in order to provide you with ongoing support in the event that a situation arises where you have to go to your DR site, or provide support for that critical system where a mirror has failed?

In the last 80 hours I have been called by the NOC at work stating that they have an issue. I don’t have power at home so my internet access is off. I don’t have sufficient power in my phone to tether it (don’t tell the wireless company I can do that) and it would take 4 hours sitting in the car to get sufficient power to allow me to do that. I do have a mobile broadband USB device, however I’m not getting coverage in my house right now from that carrier and I have been unable to get out of my road. In other words in the whole HA/DR side of things I’m a single point of failure. Luckily I have a colleague with power who I was able to point the NOC to in order to provide resolution to the issue.

My colleague and I live less than 5 miles apart, we are just luckily on different circuits and substations for the power company. If neither of us had been able to dial in then the problem would not have been resolved.

 

If you truly want to implement HA/DR you need to go further than just looking at your hardware infrastructure or the way that your software is setup. It means more than your DNS configuration and your Active Directory setup. There’s more to it than setting a Failover Partner in your application connection string. It’s about the people that you need to support these things!

 

 

Artificial limitations

I’ve been a DBA for about 12 years now. I’ve not been hands on with a server since doing datacenter implementation work about 5 years ago. Other than in very small shops it’s rare that a DBA touches a server, most don’t even have access to the datacenter in which their database servers reside. There has been a huge move towards setup in Colocation Datacenters (CoLo’s) over the last few years, meaning that more often than not the datacenter where you servers reside are probably not even in the same physical building that you are. There is also the current move to the Cloud initiative that’s gaining some traction. Given all of this information I ask you this:

Why are DBAs so often tied to a physical location for work?

 

This is a serious question. There are so few telecommuting options for DBAs out there. You can look on the job boards and see virtually none. There’s a lot of relocation opportunities if you want to move to another state, but it doesn’t make a lot of sense. If you are going to geographically disperse your hardware due to HA/DR needs why would you not do that same for the teams that have to support it?

So many companies point blank refuse to even entertain the idea of telecommuters from another state (heck the vast majority won’t even contemplate allowing IT support staff to work from home a couple of days a week).

The same companies that happily outsource all development processes to an unknown entity in India or China refuse to entertain the idea of allowing Windows/Unix Admins or DBAs to telecommute.

 

I feel that it is somewhat ironic that the same people that expect you to come to the office and be tied to a physical location also have you be on call so that you can provide remote support when something goes wrong.

 

 

Moving beyond the office

Forcing IT support staff to come to a physical location 5 days a week makes little to no sense to me. I can understand the desire to have folks in the same physical location but as your organization grows it really needs to find a way to get past that mentality. There are various options such as LiveMeeting or GoToMeeting that can provide a telepresence in meetings for people that work remotely. Instant Messaging can be used to constantly keep people in touch. Services such as Yammer can be used to quickly and easily provide collaborative thought outside of meetings.

Looking at my current situation, if my colleague and I had both been without power we would have had no way to provide support. Our servers sit in a CoLo that we don’t have access to so even if we could have gotten there, we would have had no hands on way to work with the servers.

Now imagine we had a third person on the team that lived and worked in another state (or that the second person was living in another state and telecommuted in). Even with power down for me (or the entire state) we would still have someone available to provide support (unless by extremely unusual circumstances they were also down). With a remote DR site they could have effected whatever steps were necessary to bring that online and check that things were working ok.

Let’s look at this from another angle. I’m in the Pacific Time Zone and get to work usually 7:00-7:30am and leave 4:30-5:00pm. If we had someone working in the Eastern Time Zone and they worked the same hours we would have coverage (without emergencies handled by being on call) from 4am PST – 4:30pm PST. That’s an extra 3 hours of standard DBA coverage. Imagine how much more work could get done. How overnight problems could be identified and resolved earlier?

We live in a global society and the company that I work for does business internationally. Would it not make a lot more sense to provide a wider margin of coverage? Think about it, it would also improve the quality of life for anyone that had to be on call. For me the worst time to get called out is about 4am as I’ll never get back to sleep. If there was someone already in the office on the east coast I wouldn’t need to get called out. The same for them with evening plans. They could go out to dinner or a movie knowing that they wouldn’t have to worry about missing a call.

 

 

TL;DR

It’s the 21st century, there are many options to prevent the need for anyone to be physically tied to an office location (certainly amongst IT staff). In order to provide higher levels of support to companies they really need to embrace telecommuting options beyond providing emergency support.

 

 

Late Note: After 80 hours PSE were able to restore power to my neighborhood. Throughout the outage PSE did a fantastic job of handling things. Ultimately over 375,000 were without power at one time or another. PSE provided constant information wherever possible via their Twitter feed and website. They brought in crews from around the country to assist in the restoration efforts. All in all I think they handled things really well. I have no complaints with how they handled any of this. (I just wish I’d been closer to the top of the restore list)

Who Needs a DBA?

I was perusing my RSS Feeds this morning, and came across an article by Steve Jones (b|t) about Changing Times. In this post Steve talks about Microsoft not having any DBAs in their cloud team, and that it is all managed by DevOps folks. This message came from a talk that was given at SQLBits and summarized by Victoria Holt.

MS devops facts

Wow, this looks amazing, and really seems to ring the death knell for the DBA…or does it?

Let’s consider what we are looking at here, and for the purposes of this, I am going to focus on Azure SQL Database (or whatever we’re calling it this week).

There are 1.7 million production databases that do not have DBAs. Well, that’s flat out wrong, there might not be DBAs at Microsoft, but you can be pretty sure that there are DBAs out there managing a portion of those. I guess we should really focus on the Microsoft side of things.

So there are 1.7 million databases, what is Microsoft’s role in these databases? Other than keeping up the infrastructure that the databases are based upon not much really. They have no posted performance metrics to which they must aspire. They are not going to be in the business of ensuring that your backups are taken, your HA requirements are based upon you setting up the services and regions correctly, and they don’t care how fast or slow your query is. There is an SLA for uptime, and don’t worry, if it gets below 99% you get a 25% discount.

Given that there are no requirements to perform any work that a typical DBA would be required to do why would they need a DBA at all? They wouldn’t. It is this kind of information that makes people who work as DBAs question their existence and future, and frankly, there is no need to do that.

As a DBA you will find that the cloud will become a significant part of your future (if it is not already), and your role will shift, but there will always be a place for the on-premises products (for as long as it is made), and that means DBAs to manage it. Heck if you want to see what the world without DBAs looks like just skim through dba.stackexchange sometime.

I feel that a lot of messaging coming out of Microsoft puts a not insignificant level of fear into some people, as they attempt to set an example where the only people that really matter are developers, and developers sitting in a DevOps role. Is Microsoft going to change their tune around any of this? Nope, I don’t see that happening at any point. One thing you could question is, given that there are all these Azure databases that don’t have DBAs why is it that Microsoft has any DBAs working for them at all? Surely DevOps people would be all that they need.

In-Memory OLTP – Shiny, But Useful?

SQL 2014 introduces In-Memory OLTP to the product (although I prefer the name Hekaton).  It is a significant leap forward so far as the technology goes, and can lead to significant performance improvements when utilized, but is it going to be a technology for the masses? Maybe eventually, but right now I see limited use cases for it.

 

Limitations on compiled tables

First up there are limitations around what you can do with the In-Memory tables. There’s no option to perform a truncate or merge into one. You also cannot perform a cross-database join with one, or access it from a CLR module using context connections (see Accessing Memory-Optimized Tables Using Interpreted Transact-SQL).

Interested in using contained databases for portability? Forget using an In-Memory table then, just not going to happen. You can’t put one on a particular filegroup or partition either, or use computed columns. Like to use IDENTITY? Not any more. DRI? Nope, not supported, nor are clustered primary keys.

That being said you can do pretty much anything else with these tables, just so long as you don’t want to use a LOB data type, datetimeoffset, or potentially have a row overflow (see Supported Data Types). Then again, were that the case you shouldn’t be doing those kinds of things in your OLTP system. We’ve never seen anyone do that, right?

You could also have problems with transactions in these tables, and so you have to be very careful with setting retry logic

You might want to be careful when creating your tables too, because once they are there you can’t alter them, you actually have to drop and recreate it.

 

Limitations on indexes

Unique indexes are a no go, so are filtered ones. Want to drop or alter an index? Well that’s a table drop and recreate right there. Those null columns you have? Well you can’t have those in an index. Maybe it’s time to start listening to Joe Celko and getting rid of those pesky nullable columns.

 

Limitations on compiled procedures

On the bright side your developers will no longer be able to use cursors, so it’s not all doom and gloom. Set based items are a little more challenging though, you cannot do a multi-row insert using INSERT…VALUES, but that’s pretty minor, and just requires more typing.

You won’t be getting too recursive, CTEs aren’t supported, nor are subqueries. 

DELETE and UPDATE from aren’t allowed, nor are CASE statements. Forget about using UDFs, or the OUTPUT clause to get information back.

Want to get data from more than one table? UNION won’t get you there, nor will INTERSECT or EXCEPT. OUTER JOINS and APPLY aren’t supported either.

Filtering your results? Don’t expect to use ( OR IN (..)) as a construct or NOT, and you better be sure what you are looking for because LIKE isn’t allowed either. 

By no means is this a complete list, there are lots more limitations. 

 

So many limitations, such little time

In-Memory is fast, that’s been demonstrated, but there are so many restrictions around it’s usage that for most companies it is not going to be something that is even worth looking at. The use cases for it are probably too extreme.  Yes, it will get used, and used to great effect by some, but I think that we are looking at less than 1% of the install base could actually see true use cases and be able to adjust their development methods and code base to be able to support it (at least in this version).

So while this is an exciting breakthrough in performance you’ll have to excuse me while I keep working on tuning my code and indexes for performance, and setting up my AGs for DR.

 

 

 

For a full list of Unsupported Transact-SQL Constructs for In-Memory OLTP visit Technet

 

So Long MCM

The MCM cert is dead. So is the replacement MCSM and the MCA. That’s according to an email that was distributed tonight by Shelby Grieve, the Director of Certification Product Management. 

The end of the program comes in a short email that clearly talks of the end of the certs

Microsoft will no longer offer Masters and Architect level training rotations and will be retiring the Masters level certification exams as of October 1, 2013.

That’s pretty clear information right there.  

This from an email that came out at 10pm, the Friday of a long holiday weekend in the US. 

Microsoft appears to be attempting to bury this when people aren’t around, hoping that nobody will notice and that the furor will have calmed down some by the time people start checking back in on Tuesday.

 

Attaining an MCM is no easy feat. There is a reason that there are only a couple of hundred people with that cert. Taking this away really pains everyone who has worked so hard to get it. Those who have given up many hours, and for some many thousands of dollars. 

I guess it just means being a part of an exclusive club now, one that will not get any larger.  

That is a really sad thing, because there are people out there that should have this cert, that are actively working towards it, and now that is going to be taken away.

 

Follow Up 8/31/13 08:23

Jen Stirrup (blog | @jenstirrup) has created a Connect item “Please don’t get rid of the MCM and MCA programs“. If these are certs that you have any interest in at all I highly recommend you go and vote it up. Being honest, it won’t mean much, but it might at least let folks understand the deep level of dissatisfaction its killing has created. 

 

 

  

Developing To Specifications

I’m a DBA. As a class of people you will find that DBAs have a tendency to rant a little about developers. I would certainly be someone that you would find in that category. The trouble is that most of the time I don’t think that it is the developers fault, it is just a case of shooting the messenger.

As an example let’s look at a new database release that was being asked.

The biggest issue was that the code was provided so late in the cycle that the issues found (and there were a great many of them) could not be fixed. Why could they not be fixed? Because an arbitrary release data had been provided to the customer and this data could not slip for any reason whatsoever. Artificial deadlines, one of the worst things that devs and DBAs have to deal with.

The developers agreed to get the code fixes added to their backlog and to get it into a sprint for fixing in the next month. So after much discussion and a firm commitment we decided to move ahead with the release.

My next question to the dev team was “how are you going to get the data out? You have a few procs here for loading data into tables, but nothing for being able to consume that data afterwards.”

The response was a little stunning:

The only requirement to get done by is to have the data written to a database. After this deploy, we are going to create a way to get the data out

Outstanding. Way to develop to requirements.

In this instance I cannot really place blame on the dev team, they are just doing what is asked of them. I think the bigger problem is sitting with the folks who are gathering the business requirements and translating those, along with timelines, up to the business.

I think that it might be time for DBAs to stop pointing fingers at devs and start holding accountable those who are causing these problems, namely the Business Analysts and Project Managers.

Who’s with me on this?

How To Log Off of Server 2012

I’m finding little things that are petty annoyances when working with Windows Server 2012.

NoLogoffOption

The most recent one of these is the lack of a logoff option when connected to a server via a remote desktop session. When using the charm you can get the option to Disconnect, Shut down or Restart the server, but not log off.

It turns out that the solution is pretty simple, but not intuitive.

Just open up PowerShell and type logoff.