
Changing SQL Agent Job Subsystem Queue Lengths

If you are running a lot of jobs on your SQL Server, or you happen to have a great number of jobs that kick off close to each other and use the same subsystem (e.g. PowerShell), then you might receive a warning in the SQL Agent error log stating that the job step “is being queued for the PowerShell subsystem”.
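If you want to check how many of these warnings you are getting, one quick option is to search the Agent error log directly. Here's a minimal sketch using the undocumented xp_readerrorlog procedure, where the second parameter switches it from the SQL Server error log to the SQL Agent error log (being undocumented, this behavior could change between versions):

--0 = current log, 2 = SQL Agent error log, N'queued' = string to search for
EXEC master.dbo.xp_readerrorlog 0, 2, N'queued';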

This queuing is deliberate. Microsoft put safeguards around the various subsystems that the Agent calls in order to help prevent any particular subsystem from taking over all the worker threads (after all, you wouldn’t want your 25 PowerShell job steps preventing your queries from running because they’d taken all the worker threads, would you?).

This is one of the reasons that I like to use dedicated servers just to handle all jobs/ETL processes. I like to keep my local jobs to just maintenance processes (backup/reindex/checkdb) wherever possible. In doing this I need to be sure that I am able to increase the max worker threads associated with a subsystem so that jobs don’t get delayed. As such I keep an eye on the SQL Agent error log and server resources to check that I don’t hit thread starvation issues (and on a dedicated job server that’s pretty unusual).

Today I saw a bunch of warnings in the log to do with PowerShell job steps being queued. This wasn’t a surprise, as I had added a whole set of new jobs and had not adjusted the subsystem settings from their default values. First I went through and grabbed the default information for each of the subsystems so that I could easily revert if the need arose:

SELECT [subsystem_id]
      ,[subsystem]
      ,[max_worker_threads]
  FROM [msdb].[dbo].[syssubsystems]

[Screenshot: query results showing the default max_worker_threads for each subsystem]

As you can see, there are only 2 worker threads set for PowerShell. I needed to bump this up to 20 so that it could handle the concurrent job load. I have plenty of memory available on the server, so I’m not concerned with the overhead associated with loading PowerShell, but that’s always something to keep in mind (oh, and you have to be a sysadmin to be able to update this table).

UPDATE msdb.dbo.syssubsystems
SET max_worker_threads = 20
WHERE subsystem = N'PowerShell'

 

Checking the value again yields the updated information:

SELECT [subsystem_id]
      ,[subsystem]
      ,[max_worker_threads]
  FROM [msdb].[dbo].[syssubsystems]

[Screenshot: query results showing max_worker_threads for the PowerShell subsystem now set to 20]

Now it’s a simple case of restarting the SQL Server Agent service for the change to take effect.
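On a default instance that's just a couple of commands from an elevated command prompt (the service name differs for named instances, e.g. SQLAgent$YourInstance):

net stop SQLSERVERAGENT
net start SQLSERVERAGENT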

Note: Applying Service Packs or Cumulative Updates could potentially reset this value, so make sure you document the change and confirm it after any upgrade (for example, see http://support.microsoft.com/kb/972759).

Generic Database Restores

The other day I found myself in the position of needing to restore multiple databases to a server. In this case all the log files would live on one drive and all the data files on another. Rather than sit there and write restore scripts for ~20 databases, I figured it would be quicker to create a quick wrapper that I could just pass the file name into and then let nature take its course.

This is very basic (there are no restore performance tuning enhancements, for example), however it does exactly the job that I need it to.

If I was smart I would have made this a stored procedure and then iterated through the list of backups in the directory using PowerShell, executing the proc for each one. Maybe next time.

Anyhow, view the script below or download Database Restores Generic.sql

 

SET NOCOUNT ON

DECLARE @File NVARCHAR(1000) = N'MyDatabase.bak'
DECLARE @Path NVARCHAR(1000) = N'D:\SQLBackups\'
DECLARE @MDFPath NVARCHAR(1000) = N'E:\MSSQL\Data\'
DECLARE @LDFPath NVARCHAR(1000) = N'L:\MSSQL\Logs\'

DECLARE @FullLoc NVARCHAR(2000) = @Path + @File

DECLARE @DatabaseName NVARCHAR(128)
DECLARE @RestoreMDF NVARCHAR(2000)
DECLARE @RestoreLDF NVARCHAR(2000)
DECLARE @RestoreCommandFull NVARCHAR(4000)
DECLARE @sqlexec NVARCHAR(4000)

DECLARE @MDFID INT
DECLARE @LDFID INT

--Holds the output of RESTORE FILELISTONLY (column list as of SQL Server 2008 R2)
DECLARE @RestoreFiles TABLE
    (
      ID INT IDENTITY(1, 1) ,
      LogicalName NVARCHAR(128) , PhysicalName NVARCHAR(260) , [Type] CHAR(1) , FileGroupName NVARCHAR(128) , [size] NUMERIC(20, 0) ,
      MaxSize NUMERIC(20, 0) , FileID BIGINT , CreateLSN NUMERIC(25, 0) , DropLSN NUMERIC(25, 0) , UniqueID UNIQUEIDENTIFIER ,
      ReadOnlyLSN NUMERIC(25, 0) , ReadWriteLSN NUMERIC(25, 0) , BackupSizeInBytes BIGINT , SourceBlockSize INT , FileGroupID INT ,
      LogGroupGUID UNIQUEIDENTIFIER , DifferentialBaseLSN NUMERIC(25, 0) , DifferentialBaseGUID UNIQUEIDENTIFIER ,
      IsReadOnly BIT , IsPresent BIT , TDEThumbprint VARBINARY(32)
    )

--Holds the output of RESTORE HEADERONLY
DECLARE @RestoreHeader TABLE
    (
        BackupName NVARCHAR(128), BackupDescription NVARCHAR(255), BackupType SMALLINT, ExpirationDate DATETIME,
        Compressed CHAR(1), POSITION SMALLINT, DeviceType TINYINT, UserName NVARCHAR(128), ServerName NVARCHAR(128),
        DatabaseName NVARCHAR(128), DatabaseVersion INT, DatabaseCreationDate DATETIME, BackupSize NUMERIC(20,0),
        FirstLSN NUMERIC(25,0), LastLSN NUMERIC(25,0), CheckpointLSN NUMERIC(25,0), DatabaseBackupLSN NUMERIC(25,0),
        BackupStartDate DATETIME, BackupFinishDate DATETIME, SortOrder SMALLINT, [CodePage] SMALLINT, UnicodeLocaleId INT,
        UnicodeComparisonStyle INT, CompatibilityLevel TINYINT, SoftwareVendorId INT, SoftwareVersionMajor INT,
        SoftwareVersionMinor INT, SoftwareVersionBuild INT, MachineName NVARCHAR(128), Flags INT, BindingID UNIQUEIDENTIFIER,
        RecoveryForkID UNIQUEIDENTIFIER, [Collation] NVARCHAR(128), FamilyGUID UNIQUEIDENTIFIER, HasBulkLoggedData BIT,
        IsSnapshot BIT, IsReadOnly BIT, IsSingleUser BIT, HasBackupChecksums BIT, IsDamaged BIT, BeginsLogChain BIT,
        HasIncompleteMetaData BIT, IsForceOffline BIT, IsCopyOnly BIT, FirstRecoveryForkID UNIQUEIDENTIFIER,
        ForkPointLSN NUMERIC(25,0), RecoveryModel NVARCHAR(60), DifferentialBaseLSN NUMERIC(25,0),
        DifferentialBaseGUID UNIQUEIDENTIFIER, BackupTypeDescription NVARCHAR(60), BackupSetGUID UNIQUEIDENTIFIER,
        CompressedBackupSize BIGINT
    )

--Read the file list and header information out of the backup
SELECT @sqlexec = 'RESTORE FILELISTONLY FROM DISK = ''' + @FullLoc + ''''
INSERT INTO @RestoreFiles EXEC (@sqlexec)

SELECT @sqlexec = 'RESTORE HEADERONLY FROM DISK = ''' + @FullLoc + ''''
INSERT INTO @RestoreHeader EXEC (@sqlexec)

SELECT @DatabaseName = DatabaseName FROM @RestoreHeader

--Build the MOVE clauses for the data files (everything that isn't a log file)
SELECT @MDFID = MIN(ID) FROM @RestoreFiles WHERE [Type] != 'L'
WHILE @MDFID IS NOT NULL
BEGIN
    IF @MDFID = 1
        BEGIN
            SELECT @RestoreMDF = 'WITH MOVE ' + CHAR(39) + LogicalName + CHAR(39) + ' TO ' + CHAR(39) + @MDFPath
                + REVERSE(LEFT(REVERSE(PhysicalName), CHARINDEX('\', REVERSE(PhysicalName)) - 1)) + CHAR(39) + CHAR(13)
            FROM @RestoreFiles WHERE ID = @MDFID
        END
    ELSE
        BEGIN
            SELECT @RestoreMDF = @RestoreMDF + ', MOVE ' + CHAR(39) + LogicalName + CHAR(39) + ' TO ' + CHAR(39) + @MDFPath
                + REVERSE(LEFT(REVERSE(PhysicalName), CHARINDEX('\', REVERSE(PhysicalName)) - 1)) + CHAR(39)
            FROM @RestoreFiles WHERE ID = @MDFID
        END

    SELECT @MDFID = MIN(ID) FROM @RestoreFiles WHERE ID > @MDFID AND [Type] != 'L'
END

--Build the MOVE clauses for the log files
SELECT @LDFID = MIN(ID) FROM @RestoreFiles WHERE [Type] = 'L'
WHILE @LDFID IS NOT NULL
BEGIN
    SELECT @RestoreLDF = ISNULL(@RestoreLDF, '') + ', MOVE ' + CHAR(39) + LogicalName + CHAR(39) + ' TO ' + CHAR(39) + @LDFPath
        + REVERSE(LEFT(REVERSE(PhysicalName), CHARINDEX('\', REVERSE(PhysicalName)) - 1)) + CHAR(39)
    FROM @RestoreFiles WHERE ID = @LDFID

    SELECT @LDFID = MIN(ID) FROM @RestoreFiles WHERE ID > @LDFID AND [Type] = 'L'
END

--Put the full restore command together and run it
--Note: this restores WITH NORECOVERY; follow up with RESTORE DATABASE ... WITH RECOVERY when done
SELECT @RestoreCommandFull = 'RESTORE DATABASE ' + QUOTENAME(@DatabaseName) + ' ' + CHAR(13) + 'FROM DISK = ''' + @FullLoc + '''' + CHAR(13)
SELECT @RestoreCommandFull = @RestoreCommandFull + @RestoreMDF + @RestoreLDF + CHAR(13) + ', NORECOVERY, STATS=20;'
--PRINT @RestoreCommandFull

EXEC (@RestoreCommandFull)

 

Vendor Support – The Good And Bad

When you go out and buy yourself new hardware or software you have the option of purchasing maintenance agreements at the same time. For software this generally provides the ability to constantly upgrade to the latest and greatest product. For hardware this tends to provide onsite support for when things go wrong and an SLA around that support arriving and the hardware being fixed.

I’ve been dealing with hardware and software vendors for years, and I thought I’d share a couple of stories that really depict excellent service and the stuff that you never want to deal with as a customer.

 

The bad

When I started at one of my previous positions I walked into a Dell shop. If you aren’t familiar with that term, it means that all hardware was purchased through Dell, which gave us steeper discounts on the hardware we bought.

I was put in charge of the Windows team pretty early at this company and we started to go through a hardware refresh. I sat down with the team and started asking questions about how things were with Dell. To a person they liked the hardware and what it delivered however they hated the service. There were common problems with SLAs not being met, the wrong replacement parts being delivered and phone support being unable to provide decent assistance.

I brought these issues to the Dell account rep and explained that we were looking at a fairly significant budget spend the next year on hardware (>$1m) and that I needed to see better results from the support team over there if I was going to spend any of that money with them.

Over the next 6 months I thoroughly documented every engagement with their support staff. These support engagements included:

  • Server down – customer impact
  • Hardware problem, replacement part needed – non-customer impacting
  • General troubleshooting assistance required – non-customer impacting

I’m sad to say that Dell was only able to meet the 4-hour, 7 x 365 agreement we had for hardware support in 10% of the cases that we opened. Techs would show up late (or not at all), parts would be incorrect even when the tech was onsite in time (techs did not bring the parts, they would be delivered separately) and we would have trouble getting anything above a level 2 tech on the phone, whose troubleshooting ability seemed to be limited to “have you tried turning it off and back on again”.

This was several years ago and Dell might have significantly improved their support since then, however when I left the company we did not have a single Dell server in any of the three datacenters I had built out.

 

The good

Software has bugs. We all know that and have experienced problems with vendor applications, but what happens when you run into a significant issue and how does the vendor respond?

Recently SQLSentry released a new version of their Performance Advisor tool for monitoring and tuning SQL Server. I performed an upgrade to the new version, resumed monitoring, and didn’t run into any issues or problems.

A couple of days later I got a call from our Windows folks stating they had an alarm on high memory utilization on the monitoring server. I logged in to take a look and was shocked to see the SQLSentry monitoring service had consumed over 5GB of memory. I bounced the service and it reset itself. Over the next couple of days memory usage increased again, causing me to restart the service.

At this point I engaged the support folks, in particular Jason Hall (blog|twitter). We started triaging the issue.

We started up perfmon and captured a few counters to file to try and localize the memory leak. This allowed us to discover the leak was in unmanaged code, making the problem a lot tougher to track down.

The next step was to install the Windows Debugging Tools from Microsoft. With these deployed and a set of symbols downloaded, we used UMDH to capture before and after logs of the heap allocations for the monitoring service. One comparison log later we were able to track the issue down to a leak in Microsoft’s managed wrapper for the VDS (Virtual Disk Service) subsystem, which SQLSentry uses to monitor mount points.
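For anyone who hasn't worked with UMDH before, the capture and compare steps look roughly like this. This is a sketch from memory; the image name, PID and log file names are placeholders for whatever process you're actually chasing:

:: Enable user-mode stack traces for the process image (requires a service restart)
gflags /i SQLSentryServer.exe +ust
:: Take a heap snapshot of the running process, then another once memory has grown
umdh -p:1234 -f:before.log
umdh -p:1234 -f:after.log
:: Compare the two snapshots to find the growing allocation stacks
umdh before.log after.log > compare.log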

I’m running several multi-node, multi-instance SQL Server Failover Clustered Instances and make extensive use of mount points (current count is 136 monitored mount points).

To test and confirm that VDS was the actual issue, one of the SQLSentry development team threw together a very small 50 line application that I could hit a couple of buttons on and watch memory usage. It took a bit of a tired mouse finger, but I was able to verify quickly that VDS was indeed the problem.

Now, fully understanding the problem at hand, the SQLSentry team quickly built their own COM wrapper to handle mount point monitoring and provided me with a new build of the product. I went through a standard deployment and started the services back up again. A week later the service is still running at around 500MB.

 

Throughout the triage, identification and resolution of the problem it was a very engaged support process, with an appropriate level of urgency at each step. Everything was handled through to completion and I have been very happy with the support I received. That’s why my maintenance for this product will be renewed next year. I know that the money spent is worth it.

 

TL;DR

In the past I have spent a lot of money on very high levels of support from Dell and received nothing but poor service. As a result they lost several million dollars of business.

On the flip side I spent a small amount of money on maintenance with SQLSentry and received excellent support and levels of engagement which will help retain me as a long term customer.

 

I’d be interested to hear about your experiences with these vendors, or any others.

Who Owns Your Databases And Jobs?

Quick question, who owns your databases? And how about your jobs? Don’t know? Go find out…now find out without using the GUI.

Querying sys.databases will give you the owner_sid. The trouble is that you have to relate that owner_sid back to a user. The same goes for sysjobs; that sid needs to relate to someone.

You could attempt joining against sys.server_principals, however if the owning login no longer exists, or the owner has access via a domain group, there will be no matching row and you would just get a NULL returned. Try the following example:

SELECT  d.NAME ,
        sp.NAME
FROM    sys.databases d
        LEFT JOIN sys.server_principals sp ON d.owner_sid = sp.sid;

SELECT  s.NAME ,
        sp.NAME
FROM    msdb.dbo.sysjobs s
        LEFT OUTER JOIN sys.server_principals sp ON s.owner_sid = sp.sid;

 

Now what? How can you quickly and easily get the owner? A little-known system function comes to the rescue.

SUSER_SNAME(server_user_sid) will return the login name associated with a SID. Using this you can very quickly grab the information about who owns what, without having to play with joins or get incomplete results:

SELECT  name ,
        SUSER_SNAME(owner_sid)
FROM    sys.databases;

SELECT  name ,
        SUSER_SNAME(owner_sid)
FROM    msdb.dbo.sysjobs;

 

There are some great little hidden gems like this lying around in SQL Server. You just need to find them.

The Importance Of Good Documentation

Believe it or not I’m not actually talking about server documentation here (for an excellent post on that go read Colleen Morrow’s The Importance of a SQL Server Inventory).

I have spent the last 12 days dealing with a single production release. It is being considered a significant release, but to be honest it really isn’t. The biggest challenge has been the way that the release documentation was provided and the fashion in which the scripts were built.

 

What I got

Here’s a brief example of a change request I’ve seen:

  • Change Request:
    • Update database – products (this links to a Sharepoint page)
    • Use code from this location (links to a file share)
  • Sharepoint page
    • Go to this location (but replace the middle part of the link with the link from the change request page)
    • Copy this subfolder to your machine
    • Follow the process on Sharepoint page 2 to deploy the code
    • Once Sharepoint page 2 is complete run script X
  • Sharepoint page 2
    • run script 1
    • run script 2
    • run script 3

 

Pretty painful, right? Now multiply that by 8 for each of the database code deployments that needed to be completed. No fun, no fun at all.

 

What do I want?

It’s going to be a work in progress but we’ll be working with this particular dev team to put together a unified document to simplify the release structure.

Here’s what I want to see:

  • Change Request:
    • Update database – products – deployment instructions attached
  • Attachment
    • Deploy script 1 (link to script)
    • Deploy script 2 (link to script)
    • Deploy script 3 (link to script)
    • Deploy script X (link to script)
    • Rollback script (link to script)

 

The difference?

Instead of having to reference several different Sharepoint locations in addition to a change control document I now have a single document, attached to the change, which clearly defines the process for the release, the order for scripts to be executed, a link to each of those scripts and the relevant rollback information.

It’s not something that I think is too out of line to provide, but I’ve found the folks who have been providing releases in this method are extremely resistant to change. I can understand that, but to be fair, they aren’t the ones under the gun trying to put something into a production environment in a consistent and stable manner.

I’ve got lots of fun meetings coming up to talk about this.

 

What about you?

How do you get your change control documentation? Is it something plainly written and easy to follow? Or do you have to have a degree in cryptography to get code into production?

Refreshing Development Databases

It’s been a busy week at work, and I’ve had to perform multiple dumps of production databases back to a development server. This is something that would normally be pretty quick and easy, however a couple of things impacted that and led to having to make some choices about how to perform the task.

Here are the relevant details that I need to use to decide how to move the backup files:

  1. The servers are in two different datacenters and the WAN link between them is not very fast
  2. The backups are stored on a CIFS share that is non-routable for anything other than machines on that VLAN (for security reasons)
  3. There is only a single machine in the datacenter that has the ability to reach through the firewall to the development server
  4. I had to get the data across the WAN as fast as possible, however I did not want to have to babysit what would be a several hour process (it should restart itself in the case of any issues)
  5. All the hosts involved are running Windows 2008R2 SP1
  6. I did not want to install any extra software on any of the servers
  7. This is not an ongoing process and so I wanted something quick. No need to try and over-engineer the situation

 

How to get at the backup

My first thought was to use the improved TCP stack in Windows 2008R2, log on to the development server and pull the data from the CIFS share. Unfortunately this was out, as the development machine does not have access to the subnet in order to pull the data.

Next was to perform a backup directly across the WAN to the development machine. This would not work due to firewall restrictions preventing communication between the machines (and it would also not be network optimized).

The only option left was to use the single machine in the datacenter (let’s call this the jump server), on the relevant VLAN, which had access through the firewall to push the backup to the development box.

 

How to optimize the copy and have it restartable

Opening a terminal server session to the jump server, opening an explorer window to the source and destination, then dragging the file over would copy the file, however there have been reported incidents where the drag and drop method has caused memory problems (here’s an example from a forum thread: http://social.technet.microsoft.com/Forums/en-US/winservergen/thread/3f8a80fd-914b-4fe7-8c93-b06787b03662/ ). This also does not address the concern with automatic restarts for the copy operation.

Opening a terminal server session to the jump server and running copy/xcopy would do the trick, and there are no memory issues that I’m aware of with this method. We do, however, have the problem whereby the copy process would not restart in the event of a failure. This method also does not do the best job of managing copies over a network. So this was out.

Opening a terminal server session to the jump server and using robocopy to copy the file made the most sense. Using the /R and /W switches you can have the copy process retry as many times as needed, with a defined interval between those retries. Robocopy also has a level of network optimization built in. As a bonus it provides ongoing information on the copy completion percentage and detailed statistics when complete. Robocopy is built in to Windows 2008R2 and so also meets the requirement of not wanting to install any more software.
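To give you an idea, the final command looked something like the following (the paths and file name are placeholders; /Z makes the copy restartable, while /R and /W set the number of retries and the wait in seconds between them):

robocopy \\backupserver\SQLBackups E:\Restores MyDatabase.bak /Z /R:120 /W:30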

 

Final copy solution

In the end I found myself opening a terminal server remote desktop session to the jump server, which had both a route to the backup file and access through the firewall to the development server. From there I used robocopy to perform the file copy over the WAN, providing the best performance as well as automatic restarts after any failures.

 

I’m curious, would you have handled this differently?

Replication – Beware Reuse of View Names When Filtering Articles

I came across a very strange issue with replication this weekend.

As a part of a release a new table was being created with extra columns, and a view was created with the old table name that referenced the old column set (so as not to break legacy applications). This table was replicated, and the dev team provided us with scripts to set up replication. During the attempt to perform the snapshot I ran into some problems and so dropped the publication.

All of a sudden users started to complain that they could no longer see the view and couldn’t query the data. I looked, and sure enough the view was not there. Very strange. I told the users that I’d get it created again once replication was done. I redid the publication and the users called me saying not to worry, the view was back.

What the heck?

After digging through I found the cause of the issue. The dev had created the replication scripts and included a filter to eliminate certain columns. A filtered article gets wrapped in an article synchronization view, and in this case the dev had manually set the @view_name in sp_articleview to be the same as the view in the database. This caused SQL Server to drop the existing view and create its own. When the publication was dropped, so was the view.
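An easy way to protect yourself is to check for a name collision before settling on a @view_name. Here's a simple defensive sketch, using the object name from the repro below:

IF OBJECT_ID(N'dbo.Bananas', N'V') IS NOT NULL
    RAISERROR (N'An object named Bananas already exists - pick a different @view_name for sp_articleview.', 16, 1);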

 

Here are the steps to repro this behavior:

Firstly, I’m creating a view against the local table SQLMonitor.dbo.SQLBuilds:

CREATE VIEW Bananas
AS
    SELECT  Build ,
            [KB Article] ,
            URL ,
            Description ,
            MajorBuild ,
            MinorBuild
    FROM    SQLMonitor.dbo.SQLBuilds

A quick select shows data:

[Screenshot: SELECT from the Bananas view returning all columns]

Now let’s create a publication for replication with column filters against the SQLBuilds table. We’ll specify the @view_name to be Bananas. I’m also including an @filter_clause to eliminate rows (I’ll explain that in a bit).

-- Enabling the replication database
USE master
EXEC sp_replicationdboption @dbname = N'SQLMonitor', @optname = N'publish',
    @value = N'true'
GO

EXEC [SQLMonitor].sys.sp_addlogreader_agent @job_login = NULL,
    @job_password = NULL, @publisher_security_mode = 1
GO
EXEC [SQLMonitor].sys.sp_addqreader_agent @job_login = NULL,
    @job_password = NULL, @frompublisher = 1
GO

-- Adding the transactional publication
USE [SQLMonitor]
EXEC sp_addpublication @publication = N'Bananas Test',
    @description = N'Transactional publication of database ''SQLMonitor'' from Publisher ''local''.',
    @sync_method = N'concurrent', @retention = 0, @allow_push = N'true',
    @allow_pull = N'true', @allow_anonymous = N'false',
    @enabled_for_internet = N'false', @snapshot_in_defaultfolder = N'true',
    @compress_snapshot = N'false', @ftp_port = 21, @ftp_login = N'anonymous',
    @allow_subscription_copy = N'false', @add_to_active_directory = N'false',
    @repl_freq = N'continuous', @status = N'active',
    @independent_agent = N'true', @immediate_sync = N'false',
    @allow_sync_tran = N'false', @autogen_sync_procs = N'false',
    @allow_queued_tran = N'false', @allow_dts = N'false', @replicate_ddl = 1,
    @allow_initialize_from_backup = N'false', @enabled_for_p2p = N'false',
    @enabled_for_het_sub = N'false'
GO

EXEC sp_addpublication_snapshot @publication = N'Bananas Test',
    @frequency_type = 1, @frequency_interval = 0,
    @frequency_relative_interval = 0, @frequency_recurrence_factor = 0,
    @frequency_subday = 0, @frequency_subday_interval = 0,
    @active_start_time_of_day = 0, @active_end_time_of_day = 235959,
    @active_start_date = 0, @active_end_date = 0, @job_login = NULL,
    @job_password = NULL, @publisher_security_mode = 1
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'sa'
GO
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'NT AUTHORITY\SYSTEM'
GO
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'NT SERVICE\SQLSERVERAGENT'
GO
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'NT SERVICE\MSSQLSERVER'
GO
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'distributor_admin'
GO

-- Adding the transactional articles
USE [SQLMonitor]
EXEC sp_addarticle @publication = N'Bananas Test', @article = N'SQLBuilds',
    @source_owner = N'dbo', @source_object = N'SQLBuilds', @type = N'logbased',
    @description = N'', @creation_script = N'', @pre_creation_cmd = N'drop',
    @schema_option = 0x000000000803509F,
    @identityrangemanagementoption = N'none',
    @destination_table = N'SQLBuilds', @destination_owner = N'dbo',
    @status = 24, @vertical_partition = N'true',
    @ins_cmd = N'CALL [sp_MSins_dboSQLBuilds]',
    @del_cmd = N'CALL [sp_MSdel_dboSQLBuilds]',
    @upd_cmd = N'SCALL [sp_MSupd_dboSQLBuilds]'

-- Adding the article's partition column(s)
EXEC sp_articlecolumn @publication = N'Bananas Test', @article = N'SQLBuilds',
    @column = N'Build', @operation = N'add', @force_invalidate_snapshot = 1,
    @force_reinit_subscription = 1
EXEC sp_articlecolumn @publication = N'Bananas Test', @article = N'SQLBuilds',
    @column = N'KB Article', @operation = N'add',
    @force_invalidate_snapshot = 1, @force_reinit_subscription = 1
EXEC sp_articlecolumn @publication = N'Bananas Test', @article = N'SQLBuilds',
    @column = N'MajorBuild', @operation = N'add',
    @force_invalidate_snapshot = 1, @force_reinit_subscription = 1
EXEC sp_articlecolumn @publication = N'Bananas Test', @article = N'SQLBuilds',
    @column = N'MinorBuild', @operation = N'add',
    @force_invalidate_snapshot = 1, @force_reinit_subscription = 1

-- Adding the article synchronization object
EXEC sp_articleview @publication = N'Bananas Test', @article = N'SQLBuilds',
    @view_name = N'Bananas', @filter_clause = N'[KB Article] > 0',
    @force_invalidate_snapshot = 1, @force_reinit_subscription = 1
GO

 

This does its job and creates the publication with the article and everything. Now let’s query the view again…

[Screenshot: the Bananas view now returns only the replicated columns]

Several columns are now missing and the view matches what’s specified in the replication script. The old view has been dropped and a new system view has been created:

[Screenshot: Object Explorer showing Bananas recreated as a replication system view]

Now, dropping the publication also drops the view:

-- Dropping the transactional articles
USE [SQLMonitor]
EXEC sp_dropsubscription @publication = N'Bananas Test',
    @article = N'SQLBuilds', @subscriber = N'all', @destination_db = N'all'
GO
USE [SQLMonitor]
EXEC sp_droparticle @publication = N'Bananas Test', @article = N'SQLBuilds',
    @force_invalidate_snapshot = 1
GO

-- Dropping the transactional publication
USE [SQLMonitor]
EXEC sp_droppublication @publication = N'Bananas Test'
GO

 

[Screenshot: querying the Bananas view now fails; it was dropped along with the publication]

This appears to be a bug and not the behavior I would expect.

If I recreate the view and then attempt to create the publication again, this time without the @filter_clause, I get an error stating that the view already exists:

USE [SQLMonitor]
EXEC sp_addpublication @publication = N'Bananas Test',
    @description = N'Transactional publication of database ''SQLMonitor'' from Publisher ''local''.',
    @sync_method = N'concurrent', @retention = 0, @allow_push = N'true',
    @allow_pull = N'true', @allow_anonymous = N'false',
    @enabled_for_internet = N'false', @snapshot_in_defaultfolder = N'true',
    @compress_snapshot = N'false', @ftp_port = 21, @ftp_login = N'anonymous',
    @allow_subscription_copy = N'false', @add_to_active_directory = N'false',
    @repl_freq = N'continuous', @status = N'active',
    @independent_agent = N'true', @immediate_sync = N'false',
    @allow_sync_tran = N'false', @autogen_sync_procs = N'false',
    @allow_queued_tran = N'false', @allow_dts = N'false', @replicate_ddl = 1,
    @allow_initialize_from_backup = N'false', @enabled_for_p2p = N'false',
    @enabled_for_het_sub = N'false'
GO

EXEC sp_addpublication_snapshot @publication = N'Bananas Test',
    @frequency_type = 1, @frequency_interval = 0,
    @frequency_relative_interval = 0, @frequency_recurrence_factor = 0,
    @frequency_subday = 0, @frequency_subday_interval = 0,
    @active_start_time_of_day = 0, @active_end_time_of_day = 235959,
    @active_start_date = 0, @active_end_date = 0, @job_login = NULL,
    @job_password = NULL, @publisher_security_mode = 1
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'sa'
GO
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'NT AUTHORITY\SYSTEM'
GO
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'NT SERVICE\SQLSERVERAGENT'
GO
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'NT SERVICE\MSSQLSERVER'
GO
EXEC sp_grant_publication_access @publication = N'Bananas Test',
    @login = N'distributor_admin'
GO

-- Adding the transactional articles
USE [SQLMonitor]
EXEC sp_addarticle @publication = N'Bananas Test', @article = N'SQLBuilds',
    @source_owner = N'dbo', @source_object = N'SQLBuilds', @type = N'logbased',
    @description = N'', @creation_script = N'', @pre_creation_cmd = N'drop',
    @schema_option = 0x000000000803509F,
    @identityrangemanagementoption = N'none',
    @destination_table = N'SQLBuilds', @destination_owner = N'dbo',
    @status = 24, @vertical_partition = N'true',
    @ins_cmd = N'CALL [sp_MSins_dboSQLBuilds]',
    @del_cmd = N'CALL [sp_MSdel_dboSQLBuilds]',
    @upd_cmd = N'SCALL [sp_MSupd_dboSQLBuilds]'

-- Adding the article's partition column(s)
EXEC sp_articlecolumn @publication = N'Bananas Test', @article = N'SQLBuilds',
    @column = N'Build', @operation = N'add', @force_invalidate_snapshot = 1,
    @force_reinit_subscription = 1
EXEC sp_articlecolumn @publication = N'Bananas Test', @article = N'SQLBuilds',
    @column = N'KB Article', @operation = N'add',
    @force_invalidate_snapshot = 1, @force_reinit_subscription = 1
EXEC sp_articlecolumn @publication = N'Bananas Test', @article = N'SQLBuilds',
    @column = N'MajorBuild', @operation = N'add',
    @force_invalidate_snapshot = 1, @force_reinit_subscription = 1
EXEC sp_articlecolumn @publication = N'Bananas Test', @article = N'SQLBuilds',
    @column = N'MinorBuild', @operation = N'add',
    @force_invalidate_snapshot = 1, @force_reinit_subscription = 1

-- Adding the article synchronization object
EXEC sp_articleview @publication = N'Bananas Test', @article = N'SQLBuilds',
    @view_name = N'Bananas', @filter_clause = N'',
    @force_invalidate_snapshot = 1, @force_reinit_subscription = 1
GO

 

(1 row(s) affected)
(1 row(s) affected)
(1 row(s) affected)
Msg 2714, Level 16, State 3: There is already an object named ‘Bananas’ in the database.
Msg 21745, Level 16, State 1, Procedure sp_MSrepl_articleview, Line 272
Cannot generate a filter view or procedure. Verify that the value specified for the @filter_clause parameter of sp_addarticle can be added to the WHERE clause of a SELECT statement to produce a valid query.
Msg 3931, Level 16, State 1, Procedure sp_MSrepl_articleview, Line 401
The current transaction cannot be committed and cannot be rolled back to a savepoint. Roll back the entire transaction.

 

Something to be aware of if you are going to be playing with replication.

I’ve opened a Connect bug with Microsoft to try and get this fixed; please go vote on it. And be careful with your replication.

The Loneliness of the On Call DBA

There are lots of great things about being a SQL Server DBA. You get to play with lots of cool software; get the chance to spend someone else’s money on hardware; constantly be in a position to learn new things; and be a member of an awesome community.

With the good always has to come a little bit of bad. I’ve been a production DBA for over 10 years now, which means that for over 10 years I’ve been on call at various times. Way back when, it used to be a case of carrying a pager everywhere, and that pager got swapped between team members. Now the NOC has my cell number and I can be called at any point. I also get emails on my phone day and night. Work loves it when you’re in constant contact (no really, they do).

The toughest thing about being a DBA has to be those long late nights when you are dealing with a production issue. The feeling of loneliness is pretty intense when the house is quiet and you are sitting in front of a monitor, its bright light searing into your tired, bleary eyes. The only noises are the faint sounds of sleeping from another part of the house, the clicking of your keyboard as you type in the database restore commands, and the low humming from the refrigerator.

This weekend I had to deal with a long, brutal outage. I ended up working through the night and most of the next day. It is what I get paid for, however, so while I didn’t exactly have a smile on my face I did walk away at the end with the satisfaction of knowing that I had done a good job. I was able to get everything back online as it should be, and the business was functioning at 100% again.

When I first started working with SQL Server as a DBA I thought it was just a job, I didn’t know it would be a way of life. Truly, being a DBA is a job that you live. If you’re a DBA, live it well my friends.

Mass Updating Job Step Retry Attempts

Sometimes I find that the SQL Server team makes curious choices when it comes to defaults. When setting up replication, by default the log reader and distribution agent job steps will retry 2,147,483,647 times before failing. Doing the math on the retry interval of 1 minute, that means the job wouldn’t actually hit a failed status for over 4,000 years (2,147,483,647 minutes works out to roughly 4,085 years). While you can set up alerts around replication, a lot of folks don’t, and you could easily end up in a situation where replication has been down for so long that you have to reinitialize the subscription and perform a snapshot to get things running again.

Having a great many distribution jobs running on my server, I wanted to perform a mass update. I could have gone through and updated each job step by hand, but that would have taken me a couple of days; I needed a better way. Digging through the SQL Server Agent stored procedures I found that sp_update_jobstep would do the work for me. I just needed to pass in the relevant parameters and then restart the jobs so that the change took effect.
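Before touching anything it's worth previewing which job steps would be affected. Something along these lines does the trick (this assumes, as on my servers, that the distribution agent job steps run under the Distribution subsystem):

SELECT  sj.name ,
        sjs.step_id ,
        sjs.retry_attempts ,
        sjs.retry_interval
FROM    msdb.dbo.sysjobsteps sjs
        INNER JOIN msdb.dbo.sysjobs sj ON sjs.job_id = sj.job_id
WHERE   sjs.subsystem = N'Distribution';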

As such I ended up writing the script below. Just adjust the @retries variable to the value that you want to work with (in this instance it allows an hour’s worth of retries before the job fails). The script is limited to the Distribution job step, but could easily be modified to manage the log reader as well.

Running this script took two minutes and it updated 60 distribution agent jobs. I’m getting ready to kick it off on the rest of my servers.

SET NOCOUNT ON

DECLARE @retries INT = 60 --Set to the value that you want for retry attempts

DECLARE @JobIdUpd UNIQUEIDENTIFIER ,
    @JobStepUpd INT ,
    @JobStatus INT ,
    @JobName SYSNAME ,
    @Message NVARCHAR(256)

--Create a temp table to hold the results from xp_sqlagent_enum_jobs
--Note: xp_sqlagent_enum_jobs is an undocumented stored procedure; it may not be around forever
CREATE TABLE #enumjobs
    (
      job_id UNIQUEIDENTIFIER NOT NULL ,
      last_run_date INT NOT NULL ,
      last_run_time INT NOT NULL ,
      next_run_date INT NOT NULL ,
      next_run_time INT NOT NULL ,
      next_run_schedule_id INT NOT NULL ,
      requested_to_run INT NOT NULL ,
      request_source INT NOT NULL ,
      request_source_id SYSNAME NULL ,
      running INT NOT NULL ,
      current_step INT NOT NULL ,
      current_retry_attempt INT NOT NULL ,
      job_state INT NOT NULL
    )

--Find every Distribution job step with more retry attempts than the target value
DECLARE UpdateJobsCrsr CURSOR
FOR
SELECT  sjs.job_id ,
        sjs.step_id ,
        sj.name
FROM    msdb.dbo.sysjobsteps sjs
        INNER JOIN msdb.dbo.sysjobs sj ON sjs.job_id = sj.job_id
WHERE   sjs.subsystem = 'Distribution'
        AND sjs.retry_attempts > @retries

OPEN UpdateJobsCrsr ;

FETCH NEXT FROM UpdateJobsCrsr
INTO @JobIdUpd, @JobStepUpd, @JobName
WHILE @@FETCH_STATUS = 0
BEGIN
    --Update the job step so that the number of retry attempts is set to the @retries variable
    EXEC msdb.dbo.sp_update_jobstep @job_id = @JobIdUpd, @step_id = @JobStepUpd, @retry_attempts = @retries
    SELECT @Message = N'Job ' + @JobName + ' updated.'
    RAISERROR (@Message, 0, 1) WITH NOWAIT

    --The job has to be restarted for this change to take effect
    --Check to see if the job is running; if so stop and restart it, otherwise leave it alone
    --Note: you'll need to be a sysadmin for this to work (or change up some stuff)
    TRUNCATE TABLE #enumjobs;
    INSERT INTO #enumjobs
        EXEC master.dbo.xp_sqlagent_enum_jobs 1, 'sa', @JobIdUpd;

    SELECT @JobStatus = job_state FROM #enumjobs

    IF @JobStatus <> 4 --4 is the job idle state
        BEGIN
            --Stop the job
            EXEC msdb.dbo.sp_stop_job @job_id = @JobIdUpd
            --Wait a couple of seconds for the job to stop
            WAITFOR DELAY '00:00:03'
            --Start the job up again
            EXEC msdb.dbo.sp_start_job @job_id = @JobIdUpd
        END
    ELSE
        BEGIN
            SELECT @Message = N'Job ' + @JobName + ' was not running, please start manually if required.'
            RAISERROR (@Message, 0, 1) WITH NOWAIT
        END

    FETCH NEXT FROM UpdateJobsCrsr
    INTO @JobIdUpd, @JobStepUpd, @JobName
END

CLOSE UpdateJobsCrsr ;
DEALLOCATE UpdateJobsCrsr ;

--Cleanup
DROP TABLE #enumjobs

xp_logininfo – Your View Into Active Directory

It’s always good to know which users have access to your SQL Server instances, but when it comes to Active Directory users and groups it can quickly become complicated, and you might not know who has access at any given time.

Requests can come in for you to provide access to a user, but how can you identify whether that user already has some level of access?

That’s where xp_logininfo comes to the rescue!

This extended stored procedure will check whether your user actually has access and give you the relevant associated information. Just call the proc with the name of the user account:

EXEC xp_logininfo 'awesomesauce\nic'

[Screenshot: xp_logininfo output showing the account type, privilege level and permission path]

As you can see, it told me that the account I passed in is a user, that it has admin level privileges on the server, and that the access for the account is provided via the awesomesauce\DBAs group.

Wouldn’t it be great to see what other accounts also have access via that group? Well, xp_logininfo can do that for you too. By passing in the group name along with the extra parameter 'members' you can get a list of the group’s members:

EXEC xp_logininfo 'awesomesauce\DBAs', 'members'

[Screenshot: xp_logininfo listing the members of the group]

This is a nice, quick and easy way to see who has access to your systems via group membership.
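One last tip: xp_logininfo also accepts 'all' as the second parameter, which lists every permission path by which an account has been granted access (handy when an account is a member of more than one group):

EXEC xp_logininfo 'awesomesauce\nic', 'all'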