The other day a few SQL jobs were failing when attempting to execute SSIS packages. As usual a cryptic error was surfaced up to the job log “Execution Status: 5”
Great, really helpful. So I went to look at the package logs. Those actually didn’t show any failures, rather they just showed “Pending Execution”. At this point I knew it was going to be a long day.
A Group Policy that had been applied to the server over a month before took effect after a server restart. This policy restricted accounts in a couple of security settings which prevented SSIS from being able to handle authentication correctly.
What Went Wrong
This is one of those unforeseen circumstances things. A Group Policy object (GPO) had previously been deployed on test systems and no issues were encountered. In fact the policy had been in place for a while in production. There are, I discovered later, pieces of the GPO that would not take effect until after a server restart (this was discovered in trying to uncover the root of the problem). It would seem that in testing we applied the GPO, however no restart of the server took place and so we did not run into the problem.
A patch/restart of the SQL Server caused the GPO to be applied and new security settings to be enforced.
Down In The Weeds
Getting a little further down in the technical details here. SSIS passed up the error to its (and SQLs) logs as “Pending Execution”, yet this still caused the job to fail. What is a Pending Execution state? I have no idea, and I can’t find anything to indicate what it might truly mean. So far as I can tell it means that SSIS is messed up and nothing more.
I went down a few dead ends trying to figure it out, and then decided to validate one of the packages on the system. Rather than use the GUI (which loves to swallow errors) I used a TSQL command against the one of the packages.
Declare @validation_id bigint EXEC [SSISDB].[catalog].[validate_project] @project_name=N'< Package Name >', @validation_id=@validation_id OUTPUT, @folder_name=N'< Folder >', @validate_type=F, @use32bitruntime=False, @environment_scope=A, @reference_id=Null Select @validation_id
Yes, that’s an ugly looking command. But what it did do was surface an error that had some meaning:
Msg 6522, Level 16, State 1, Procedure validate_project_internal, Line 0
A .NET Framework error occurred during execution of user-defined routine or aggregate “validate_project_internal”:
System.ComponentModel.Win32Exception: A required privilege is not held by the client
at Microsoft.SqlServer.IntegrationServices.Server.ISServerProcess.StartProcess(Boolean bSuspendThread)
at Microsoft.SqlServer.IntegrationServices.Server.ServerApi.ValidateProjectInternal(SqlInt64 projectId, SqlInt64 versionId, SqlInt64 validationId, SqlString targetEnvironment, SqlInt16 use32BitRuntime)
This provided a starting point for trying to track down the issue. The key piece of the error message is “lack of a privilege not being held by the client”. This is generally something that ends up being caused by local security policy on the server. So I started digging some to try and find what policies should be in place for SQL and what ones were missing or locked down.
Eventually I came up with a couple of things that were not being applied for the SQL Service accounts, that are needed in order for SSIS to function.
- Impersonate a client after authentication
- Replace a process level token
- Adjust memory quotas for a process
I looked to try and update the local security policy for those items, but was unable to because of GPO settings. I checked to see what GPO were being applied for these using the “Resultant Set Of Policy” tool (rsop.msc). Yes, I know, don’t use GUIs, but this was faster for me as I don’t get to touch GPO often. Drilling down to Computer Configuration -> Windows Settings -> Security Settings -> Local Policies -> User Rights Assignments. This showed that a certain policy was being applied to these items.
I called one of the Domain Admins, who came online and removed the server from having this policy applied. Then went into each of the three listed and added the SQL Service account. From there a reboot was performed.
After the server came back up I ran the validation proc again, and it came back clean. From there I was able to run the SSIS packages.
It was certainly not a fun experience (troubleshooting SSIS rarely is). Going forward we will be ensuring that all changes to GPO include reboots in our test environments to ensure that we potentially capture all problems ahead of time.