Tag: Rant

SCCM & SQL Server – A DBA’s Worst Nightmare

Microsoft put out some great products, no really, they do. There are any number of applications and tools available for you to be able to do pretty much anything. One thing gaining popularity recently is System Center Configuration Manager (SCCM) which can be used to provide patch management, software distribution, inventory management, server provisioning and more.

Freddy & SCCM – a nightmare double feature

SCCM is great for businesses that are growing and need to maintain control over the devices used and maintain compliance across the enterprise.

This is all great. Businesses use it, businesses need it. SCCM has been designed to provide a relatively straightforward deployment that does not require any strong level of expertise. This is where SCCM falls down for me, as a DBA.

What is the problem?

SCCM does its own database management. It is a set-it-and-forget-it kind of thing. This is done so that an enterprise without SQL Server DBAs can go ahead and perform the deployment and management without any specialist knowledge.

This is all well and good, except when you do have a SQL Server DBA on staff, you have multiple deployments of SQL, and you like to consolidate servers wherever possible.

SCCM does some things which go completely against my wishes as a production DBA:

  • Requires sysadmin on SQL Server to both install and run the application
  • Requires Windows admin rights on the SQL Server
  • Installs software on Windows to perform backups of SQL Server
  • Adjusts SQL Server configuration settings (CLR & max text repl size)
  • Enables the TRUSTWORTHY option for the SCCM database
  • Sets the database recovery model to SIMPLE
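
If you want to see whether an instance has already picked up the changes in the list above, a couple of catalog queries will show it. This is only a sketch: CM_ABC is a made-up placeholder for the SCCM site database name, so substitute your own.

```sql
-- Instance-level settings mentioned in the list above
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name IN (N'clr enabled', N'max text repl size (B)');

-- Database-level settings; CM_ABC is a hypothetical site database name
SELECT name,
       is_trustworthy_on,
       recovery_model_desc,
       SUSER_SNAME(owner_sid) AS owner_login
FROM sys.databases
WHERE name = N'CM_ABC';
```

Running this before and after an install on a test box is an easy way to see what was touched.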

Fortunately I found a lot of this information up front and decided that there was no way I was going to try and consolidate this database with any other in my environment. The security model is lacking in the worst fashion, and there is not much worse than taking all control away from a DBA.

I was glad that I made this choice, as the SCCM installer decided to restart SQL as a part of the installation process. That would have caused a production outage if I had attempted to co-locate it with other lightly used databases.

Short recommendation

Being brief: if your sysadmins are looking to deploy SCCM in your environment, ask for a dedicated VM for SQL Server. Any attempt to consolidate this database will leave you open to massive security holes and production outages.

Fun With Recruiters

I love it when I get those special kinds of emails from recruitment agencies who claim they have the perfect position. I got one of those emails last week, and I thought I would share it (as well as my response).

 

Title: Front End Web Development Lead
Position Type: Direct Placement
Location: Bothell, WA, United States
Description:

Duration: 0-6 month(s)
Job Description:
Front-End Web Development Lead – Bothell, WA
Every day over 19,000 Amdocs employees, serving customers in more than 60 countries, collaborate to help our customers realize their vision. We have a 30-year track record of ensuring service providers’ success by embracing their most complex, mission-critical challenges. 100% of Fortune’s Global 500 quad-play providers rely on Amdocs to help them run their businesses better.
Amdocs is a “can do” company that leads the industry, is fully accountable and most importantly, always delivers. This is our DNA. Our success has been sparked and sustained by hiring exceptional people. If this sounds like you – if you have the drive, focus and passion to succeed in a fast-paced, delivery-focused, global environment – then Amdocs would like to talk with you. Amdocs: Embrace Challenge, Experience Success.
– Please Note: All applicants must be currently authorized to work in the United States without employer sponsorship now or in the future.
Role Overview:
We are looking for a Front-End Web Development Lead to be a team lead directing a multi-shore group of developers tasked with providing issue resolution support for a very large-scale web retail store. Some of the responsibilities and duties include, but are not limited to:
Interface with defect assurance team to accept inbound production issues for resolution
Direct and coordinate work of offshore development team to ensure accurate and timely resolution of front-end production issues
Interface with customer development, business, and other teams as needed to provide good service, promote team visibility and positive perception
As team grows, evaluate potential additional team candidates and support Amdocs executive management by providing expert advice as required to grow our presence with the customer and provide continuous improvement
Provide analytical support to identify, develop, and drive strategic improvement initiatives involving functionality improvements, innovation solutions, and development and implementation methodologies
Serve as trusted advisor to management and client
Work day-to-day with key client management, development fulfillment partner, QA testing organization, providing expert support to each as needed and appropriate
Support development of improved governance of production defect management, including definitions of severity, criteria for prioritization, and defect management lifecycle processes.
Requirements:
5+ years front-end web development experience
5+ years hands on experience with the following key technologies: JSP Integration, HTML / HTML 5, AJAX, CSS, JavaScript, JSON, XML, JQuery
Strong leadership skills
Preferences:
Large scale /enterprise web retail experience
Integration with ATG Commerce
Integration with Adobe CQ
Experience with other industry standard integration technologies (e.g. WebLogic)
Technical leadership experiences in relevant technologies
Telecom experience
All Amdocs roles require strong verbal and written communications skills, position-appropriate mentoring/leadership abilities, ability to quickly master new systems and/or processes, capacity to stay organized while managing competing priorities, and a deep customer service orientation, both internally and externally.

 

I’m a database guy, I’ve never been a developer let alone a dev lead, and so I replied…

 

As a solutions provider I would expect you to have some great analytics. This leads me to ask what part of my skillset or background leads you, or anyone at your company, to believe that I would be a good fit for, or would even consider, the opportunity that you list below.

 

If I ever get a response I’ll be sure to post it.

Do You Trust Your Application Admins?

I was sitting at my desk, happily minding my own business when an alert came through that a database backup had failed. Ok, backups fail, I just figured one of the transaction log backups hiccupped (we’ve been having some problems the last few days).

When I looked at the failure it was a backup trying to write to the C drive on the server.

I NEVER back up to C. It can easily fill the drive and take down the system.

A bigger indicator that something was up was that all of our backups are done across a 10Gb network to a centralized location for ease of tape backup. This indicated that someone, not a DBA, had the access to run a SQL backup.

I trawled through the permissions on the server and nobody had that level of access, so I couldn’t figure out who had done this or how.
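
For what it’s worth, this kind of permissions trawl can be sketched with a few catalog queries. This is a rough starting point rather than a full audit: it only looks at role membership and database ownership, and it won’t show explicitly granted BACKUP DATABASE permissions or access inherited through Windows groups.

```sql
-- Who is a sysadmin on the instance?
SELECT m.name AS MemberName, m.type_desc
FROM sys.server_role_members AS rm
    INNER JOIN sys.server_principals AS r ON r.principal_id = rm.role_principal_id
    INNER JOIN sys.server_principals AS m ON m.principal_id = rm.member_principal_id
WHERE r.name = N'sysadmin';

-- In the current database, who is in a role that is allowed to run backups?
SELECT r.name AS RoleName, m.name AS MemberName, m.type_desc
FROM sys.database_role_members AS rm
    INNER JOIN sys.database_principals AS r ON r.principal_id = rm.role_principal_id
    INNER JOIN sys.database_principals AS m ON m.principal_id = rm.member_principal_id
WHERE r.name IN (N'db_owner', N'db_backupoperator');

-- Which login owns each database? The owner maps to dbo inside that database.
SELECT name AS DatabaseName, SUSER_SNAME(owner_sid) AS OwnerLogin
FROM sys.databases;
```

The last query is the one that matters for this story, since a service account that owns a database is dbo in it no matter what the role lists say.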

 

So What Happened?

Looking through the SQL logs I saw multiple attempts by a contractor to log in to SQL, all of which failed; then, about 5 minutes later, the backup error came through. Interesting stuff, so I walked over to the contractor and asked what was going on.

After he was unable to login he went to the application admin who helped him out with access…using the application service account.

One of the third-party applications from Microsoft, sorry, some unnamed vendor has a database on that server. Due to the nature of the well-designed code, the database owner has to be the same as the service account of the application. The application admin knows this password (not my doing).

After logging this contractor in as the application service account the app admin walked away and left him to his own devices. As a result this contractor was dbo on a database which manages security for the entire company. We should just consider ourselves lucky all this guy did was attempt to perform a backup.

 

Preventative Actions

In order to try and prevent this kind of thing in the future I am looking at implementing a logon trigger for the service account which checks the host and application that are connecting and denies access to anything not on a specifically approved list. There is also a conversation going on to possibly disable interactive logons for the service account using a group policy at the domain level.
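
A minimal sketch of what such a trigger might look like is below. It is not the trigger I deployed: the login name CORP\svc_app, the host APPSERVER01 and the application name VendorApp are all made-up placeholders. Bear in mind that HOST_NAME() and APP_NAME() are supplied by the client, so this stops casual misuse rather than a determined attacker.

```sql
-- All names below are hypothetical placeholders; adjust for your environment.
CREATE TRIGGER trg_restrict_service_account
ON ALL SERVER
FOR LOGON
AS
BEGIN
    -- Only police the application service account; everyone else passes through.
    IF ORIGINAL_LOGIN() = N'CORP\svc_app'
       AND (   ISNULL(HOST_NAME(), N'') NOT IN (N'APPSERVER01')
            OR ISNULL(APP_NAME(), N'')  NOT IN (N'VendorApp'))
    BEGIN
        -- Rolling back inside a logon trigger refuses the connection.
        ROLLBACK;
    END
END;
```

Test this somewhere safe first. A logon trigger with a bug in it can lock everybody out, at which point you are connecting over the dedicated admin connection to drop it.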

 

It is a Matter of Trust

While the application admin is obviously at serious fault here, it leads to a question: how much do you trust your admin team?

Domain admins will be able to access your SQL Servers (get over it, there is no way you can keep them out, if they really want in there are numerous ways for them to do so).

Anyone with a password could share that with someone else and allow them to access your servers.

Ultimately you have to trust those that you work with to do the right thing. It’s always sad when those people let you down.

Vendor Support–The Good And Bad

When you go out and buy yourself new hardware or software you have the option of purchasing maintenance agreements at the same time. For software this generally provides the ability to constantly upgrade to the latest and greatest product. For hardware this tends to provide onsite support for when things go wrong and an SLA around that support arriving and the hardware being fixed.

I’ve been dealing with hardware and software vendors for years, so I thought I’d share a couple of stories that really depict excellent service and the stuff that you never want to deal with as a customer.

 

The bad

When I started at one of my previous positions I walked into a Dell shop. If you aren’t familiar with that term it means that all hardware purchased was through Dell and that it gave us steeper discounts on the hardware that we would purchase.

I was put in charge of the Windows team pretty early at this company and we started to go through a hardware refresh. I sat down with the team and started asking questions about how things were with Dell. To a person they liked the hardware and what it delivered; however, they hated the service. There were common problems with SLAs not being met, the wrong replacement parts being delivered and phone support being unable to provide decent assistance.

I brought these issues to the Dell account rep and explained that we were looking at a fairly significant budget spend the next year on hardware (>$1m) and that I needed to see better results from the support team over there if I was going to spend any of that money with them.

Over the next 6 months I thoroughly documented every engagement with their support staff. These support engagements included:

  • Server down – customer impact
  • Hardware problem, replacement part needed – non-customer impacting
  • General troubleshooting assistance required – non-customer impacting

I’m sad to say that Dell was only able to meet the 4 x 7 x 365 agreement we had for hardware support in 10% of the cases that we opened. Techs would show up late (or not at all), and parts would be incorrect even when the tech was onsite in time (techs did not bring the parts; they would be delivered separately). We would also have trouble getting anything above a level 2 tech on the phone, whose troubleshooting ability seemed to be limited to “have you tried turning it off and back on again”.

This was several years ago and Dell might have significantly improved their support since then, however when I left the company we did not have a single Dell server in any of the three datacenters I had built out.

 

The good

Software has bugs. We all know that and have experienced problems with vendor applications, but what happens when you run into a significant issue and how does the vendor respond?

Recently SQLSentry released a new version of their Performance Advisor for SQL Server tool, which is for monitoring and tuning SQL Server. I performed an upgrade to the new version and resumed monitoring without running into any issues or problems.

A couple of days later I got a call from our Windows folks stating they had an alarm on high memory utilization on the monitoring server. I logged in to take a look and was shocked to see the SQLSentry monitoring service had consumed over 5GB of memory. I bounced the service and it reset itself. Over the next couple of days memory usage increased again, causing me to restart the service.

At this point I engaged the support folks, in particular Jason Hall (blog|twitter). We started triaging the issue.

We started up perfmon and captured a few counters to file to try and localize the memory leak. This allowed us to discover the leak was in unmanaged code, making the trouble a lot tougher to track down.

The next step was to install the Windows Debugging Tools from Microsoft. With these deployed and a set of symbols downloaded we used UMDH to capture before and after logs of heap allocations for the monitoring service. One comparison log later and we were able to track the issue down to a leak in Microsoft’s managed wrapper for the VDS (Virtual Disk Service) subsystem, which is used by SQLSentry to monitor mount points.

I’m running several multi-node, multi-instance SQL Server Failover Clustered Instances and make extensive use of mount points (current count is 136 monitored mount points).

To test and confirm that VDS was the actual issue, one of the SQLSentry development team threw together a very small 50-line application in which I could hit a couple of buttons and watch memory usage. It took a bit of a tired mouse finger, but I was able to verify quickly that VDS was indeed the problem.

Now fully understanding the problem in hand the SQLSentry team quickly built their own COM wrapper to handle mount point monitoring and provided me with a new build of the product. I went through a standard deployment and started the services back up again. A week later and the service is still running at around 500MB.

 

Throughout the process of problem triage, issue identification and resolution it was a very engaged support process with an appropriate level of urgency for each of the steps. Everything was handled to completion and I have been very happy with the support I received. That’s why my maintenance for this product will be renewed next year. I know that the money spent is worth it.

 

TL;DR

In the past I have spent a lot of money on very high levels of support from Dell and received nothing but poor service. As a result they lost several million dollars of business.

On the flip side I spent a small amount of money on maintenance with SQLSentry and received excellent support and levels of engagement which will help retain me as a long term customer.

 

I’d be interested to hear about your experiences with these vendors, or any other.

The Importance Of Good Documentation

Believe it or not I’m not actually talking about server documentation here (for an excellent post on that go read Colleen Morrow’s The Importance of a SQL Server Inventory).

I have spent the last 12 days dealing with a single production release. It is being considered a significant release, but to be honest it really isn’t. The biggest challenge has been to do with the way that the release documentation has been provided and the fashion in which the scripts have been built.

 

What I got

Here’s a brief example of a change request I’ve seen:

  • Change Request:
    • Update database – products (this links to a Sharepoint page)
    • Use code from this location (links to a file share)
  • Sharepoint page
    • Go to this location (but replace the middle part of the link with the link from the change request page)
    • Copy this subfolder to your machine
    • Follow the process on Sharepoint page 2 to deploy the code
    • Once Sharepoint page 2 is complete run script X
  • Sharepoint page 2
    • run script 1
    • run script 2
    • run script 3

 

Pretty painful right? Now multiply that by 8 for each of the database code deployments that needed to be completed. No fun, no fun at all.

 

What do I want?

It’s going to be a work in progress but we’ll be working with this particular dev team to put together a unified document to simplify the release structure.

Here’s what I want to see:

  • Change Request:
    • Update database – products – deployment instructions attached
  • Attachment
    • Deploy script 1 (link to script)
    • Deploy script 2 (link to script)
    • Deploy script 3 (link to script)
    • Deploy script X (link to script)
    • Rollback script (link to script)

 

The difference?

Instead of having to reference several different Sharepoint locations in addition to a change control document I now have a single document, attached to the change, which clearly defines the process for the release, the order for scripts to be executed, a link to each of those scripts and the relevant rollback information.

It’s not something that I think is too out of line to provide, but I’ve found the folks who have been providing releases in this method are extremely resistant to change. I can understand that, but to be fair, they aren’t the ones under the gun trying to put something in to a production environment in a consistent and stable manner.

I’ve lots of fun meetings coming up to talk about this.

 

What about you?

How do you get your change control documentation? Is it something plainly written and easy to follow? Or do you have to have a degree in cryptography to get code in to production?

Stop Bad Database Design

Every year that goes by I sit in hope that I won’t see bad database design.

Every year I am disappointed.

 

As an example here is a table create statement that I saw the other day (table and column names have been changed to protect the innocent)

CREATE TABLE [dbo].BestestTableEVAR(
 Col1 [int] IDENTITY(1,1) NOT NULL,
 Col2 [uniqueidentifier] NULL,
 Col3 [uniqueidentifier] NOT NULL,
 Col4 [smallint] NULL,
 Col5 [smallint] NOT NULL,
 Col6 [bit] NOT NULL,
 Col7 [xml] NULL,
 Col8 [xml] NULL,
 ColA [xml] NULL,
 ColB [xml] NULL,
 ColC [datetime2](2) NULL,
 ColD [datetime2](2) NULL,
 COlE [datetime2](2) NULL,
 ColF [datetime2](2) NULL,
 CONSTRAINT [PK_BestestTableEVAR] PRIMARY KEY CLUSTERED
(
 Col3 ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]

 

So what’s wrong with this?

The clustered primary key on this table is a GUID. Is that bad? That’s an unequivocal YES! Read all that Kimberly Tripp (blog|twitter) has to say about GUIDs in database design.

What makes this all the more crazy is that the table has an identity column. That’s a natural clustering key ready and waiting to be used and yet for some reason it’s not.

This table is going to fragment like crazy, it won’t scale and performance will be hideous. Additionally, thanks to the XML columns, the clustered index can’t even be rebuilt online, so there’s no way to fix the fragmentation or performance without taking the table offline, during which it can’t handle any transactions. This is a problem for a table on an OLTP system.

 

I would go back and change some things. Let’s say you wanted to keep the table structure the same, that’s fine, but let’s be smart about the keys and indexes.

It would make sense to make the identity column the clustered index (I would also make it the primary key) and then, to ensure uniqueness on Col3, which is the current primary key, add a unique index.

CREATE TABLE [dbo].BestestTableEVAR(
 Col1 [int] IDENTITY(1,1) NOT NULL,
 Col2 [uniqueidentifier] NULL,
 Col3 [uniqueidentifier] NOT NULL,
 Col4 [smallint] NULL,
 Col5 [smallint] NOT NULL,
 Col6 [bit] NOT NULL,
 Col7 [xml] NULL,
 Col8 [xml] NULL,
 ColA [xml] NULL,
 ColB [xml] NULL,
 ColC [datetime2](2) NULL,
 ColD [datetime2](2) NULL,
 COlE [datetime2](2) NULL,
 ColF [datetime2](2) NULL,
 -- Cluster on the ever-increasing identity column instead of the GUID
 CONSTRAINT [PK_BestestTableEVAR] PRIMARY KEY CLUSTERED
(
 Col1 ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY];

-- Keep the GUID that used to be the primary key unique
CREATE UNIQUE NONCLUSTERED INDEX UI_BestestTableEVAR_Col3 ON [dbo].BestestTableEVAR (Col3);

Sure, we still won’t be able to rebuild the indexes online, but we won’t have the same crazy levels of fragmentation that we would have had before.
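
If you want numbers to back up the fragmentation argument when you talk to your devs, something along these lines will show how each index on the table is doing. It is a sketch that assumes the table name from the example above and uses the cheapest scan mode; run it in the database that holds the table.

```sql
-- Fragmentation per index for the example table in the current database
SELECT i.name AS IndexName,
       ips.index_type_desc,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.BestestTableEVAR'), NULL, NULL, 'LIMITED') AS ips
    INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;
```

Capturing this for both versions of the table after the same insert workload makes for a far more convincing conversation than “GUIDs are bad”.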

 

I know I’ll be seeing a lot of bad design this year and I know that I’ll be forced to push that bad design into production. Doesn’t stop me trying to change things however. Help your devs out, let them know when their design is a problem. Who knows, maybe you’ll change things for the better.

Please Don’t Use Deprecated Data Types

I know that a lot of vendors like to write for the lowest common denominator (i.e. SQL 2000), but really folks, it’s gone too far. I’m sick of cracking open vendor code that’s certified for SQL 2008 and seeing things like IMAGE and TEXT data types. Microsoft deprecated these things back when they released SQL 2005 (see http://msdn.microsoft.com/en-US/library/ms143729(v=SQL.90).aspx under Textpointers). Why are you persisting with these things six years later?

I bring this up because I’ve come across further egregious usage of these data types in vendor code yet again. The vendor in question? Microsoft.

Yes, that’s right, the folks that deprecated the data types six years ago are still using them to a large extent within the ReportServer and ReportServerTempDB databases that support SQL Server Reporting Services. Seriously Microsoft? Can you please get with the plan and fix this nonsense?

The following query, run against the ReportServer database will show 14 different tables (31 columns) using a variety of NTEXT and IMAGE data types.

-- List every table column in the current database that uses a deprecated LOB type
select st.name as TableName, t.name as DataType, sc.name as ColumnName
    from sys.types t
        inner join sys.columns sc
            on t.system_type_id = sc.system_type_id
        inner join sys.tables st
            on sc.object_id = st.object_id
where
    t.name in ('image', 'text', 'ntext')
order by
    st.name, t.name
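
For code you do own, the documented replacements are varchar(max) for TEXT, nvarchar(max) for NTEXT and varbinary(max) for IMAGE. The sketch below shows the conversion on a made-up table (dbo.LegacyDocs and its columns are purely illustrative). I am not suggesting you run anything like this against ReportServer; changing a vendor’s schema is a good way to end up unsupported.

```sql
-- Hypothetical table, used only to illustrate the conversion
CREATE TABLE dbo.LegacyDocs
(
    DocId   int   NOT NULL PRIMARY KEY,
    Body    ntext NULL,
    Payload image NULL
);

-- Swap the deprecated types for their (max) equivalents
ALTER TABLE dbo.LegacyDocs ALTER COLUMN Body    nvarchar(max)  NULL;
ALTER TABLE dbo.LegacyDocs ALTER COLUMN Payload varbinary(max) NULL;
```

Any application code that still uses READTEXT, WRITETEXT or UPDATETEXT against those columns would need to change as well, so test before you touch production.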

 

I have filed a Connect item asking Microsoft to fix this. Please go vote for it at https://connect.microsoft.com/SQLServer/feedback/details/714117/ssrs-using-deprecated-data-types-in-its-databases and help us rid the world of this old stuff.