Tag: SQL

Don’t Shrink TempDB!

As a general practice, don’t shrink any databases. Every once in a while, though, there is a need to do so. I had one of those needs today.

 

My disk is full

Thanks to the cumulative effect of a misconfigured SQL install, the incorrect location of TempDB and the server pagefile, and a user running a huge query joining 20+ tables with no WHERE clause, we had a drive run out of disk space. Unfortunately this broke another process, which needed to grow the transaction log on another database.

I contacted the business owner immediately, explaining that we needed to move TempDB to another drive and restart SQL (in fact I’d already run the ALTER DATABASE statements to move TempDB off to its own LUN, where it should have been all along). The business owner stated that we could not restart SQL until after the evening, so we were left in a situation where certain things were failing.
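For reference, the statements involved look roughly like this. This is a minimal sketch assuming the default logical file names (tempdev and templog) and a hypothetical T: drive; remember that the new location only takes effect once SQL Server is restarted:

USE master
GO
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\TempDB\tempdb.mdf')  -- data file
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\TempDB\templog.ldf')  -- log file
GO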

 

Why not shrink TempDB?

That was my feeling. TempDB was 18GB larger than it had ever needed to be so I figured I could just shrink the data file (there was just one) and reclaim 50% of the difference to keep things chugging along until the restart. I attempted to perform the shrink on the file and it did nothing. I checked TempDB and nothing was using it, but still the shrink failed.

Interesting problem. Argenis Fernandez (blog|twitter) had just joined my team and he came over to try and help. He suggested running DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS, since the user impact would be minimal given the server’s usage patterns.

After running the DBCC commands I was able to perform the shrink, clean up the space and keep everything running.
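For the record, the sequence that finally worked looked something like this (a sketch only; tempdev is the default logical file name, the target size is illustrative, and, as the rest of this post explains, this is not something to do lightly):

USE tempdb
GO
DBCC FREEPROCCACHE        -- clear the plan cache
DBCC DROPCLEANBUFFERS     -- drop clean pages from the buffer pool
GO
DBCC SHRINKFILE (tempdev, 10240)   -- shrink the data file to an illustrative 10240 MB target
GO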

 

Corruption – that’s why

Curious as to why running those DBCC commands allowed me to shrink the file, I posted the question to #sqlhelp on Twitter.

Pretty much immediately I got a reply from Paul Randal (blog|twitter) that made my stomach churn:

Dont Shrink Tempdb

 

Paul and Aaron Bertrand (blog|twitter) sent me links to a Microsoft KB article that clearly explains this.

From Paul I bring you the KB in a nutshell: unless you quiesce SQL Server, a TempDB data file shrink can cause corruption.

 

This post brought to you by yet another reason not to perform a shrink.

Presenting For The PASS PowerShell Virtual Chapter

A quick note, on Wednesday at 1PM EST I’ll be giving an online presentation on one of the two topics I submitted for the PASS Summit this year.

 The presentation will be heavy on the demo side of things and I’ll cover the basics of querying SQL Server using PowerShell, running scripts against multiple machines quickly and easily as well as bulk loading data into SQL.

Please check out http://www.powershell.sqlpass.org/ for a link to the LiveMeeting.

 

Central auditing of SQL permissions with PowerShell & TSQL

Description: As a DBA it can be a challenge to know who has permissions to which SQL instances and which objects. The more instances you have, the more complex that task becomes. In this presentation I’ll share a method using PowerShell and TSQL that can be used to capture permissions from all of your SQL instances and load them into a centralized location. We’ll even take it a step further by auditing those permissions so that we can quickly and easily identify any that might have changed.

 

SQL Clustering–Network Binding Order Warnings

In setting up my Windows 2008 R2/SQL 2008 R2 cluster this week I came across a warning in the Setup Support Rules stating that “The domain network is not the first bound network.”

 Network Binding Order Error

 

This didn’t make any sense to me as I had been very careful in setting the binding order in the network connections properties:

Binding Order Set

 

Deciding to look for further information on this I opened up the Detail.txt file from the Setup Bootstrap/Log directory and searched for network binding order:

Binding Order Setup Log

 

Strangely, the first bound network was Local Area Connection* 9. I had no idea what this network was, as it was not listed in the network adapter list.

I remembered a discussion on Twitter a couple of weeks ago between Allan Hirt (blog|twitter) and Denny Cherry (blog|twitter) where they talked about this exact problem and the fact that it was a hidden cluster adapter which needed to be moved in the binding order, and that had to be done in the registry.

I opened up regedit and drilled down to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Linkage and opened up the Bind value:

Registry Showing Incorrect Binding Order

 

I’ve highlighted the first item in the bind order. The trouble is that this GUID value really means nothing to us…time to crack open PowerShell

gwmi Win32_NetworkAdapterConfiguration | where-object {$_.IPEnabled -eq "True"} | ft Description, SettingID -auto

List of active adapters

 

As you can see the Microsoft Failover Cluster Virtual Adapter GUID matches the first bound value. Back to the registry to move that adapter down the list with a quick cut and paste.

*Warning, playing in the registry can cause system instability, BSOD problems, singularities and quantum fluctuations*

Registry with correct binding

 

Starting up the cluster install again, the network binding order warning was gone.

 

Hopefully the next releases of Windows and SQL won’t have this problem. In the meantime this is good to know. Thanks Allan and Denny for engaging on Twitter and leading me to the answer to the problem.

T-SQL Tuesday #18–CTE A Simpler Form Of Recursion

It’s T-SQL Tuesday time folks. Bob Pusateri (blog|twitter) is the host and has chosen Common Table Expressions (CTEs) as the topic of the month.

Several ideas came to mind for writing about CTEs, one of the best uses I’ve seen for one recently was to grab the name of the most recent backup file for a database. You’ll have to ask Aaron Nelson (blog|twitter) to hook you up with that one though.

I thought I’d write about an interesting problem that was posed to me in an interview a couple of months ago.

 

Here’s a table

I was given a table with two columns: ManagerID and EmployeeID.

This table was populated with a few values thusly:

USE TempDB

GO

create table #ManagersEmployees (ManagerID int, EmployeeID int)

insert into #ManagersEmployees 

values(1,2), (2,3), (2,4), (2,5), (3,6), (3,7), (3,8)

    , (4,10),(5,11),(5,12), (12,13), (12,14)

GO

I was asked to write a recursive procedure to pull out the manager, employee tree for a given ManagerID.

 

CTEs to the rescue

Having done a little work with CTEs, and knowing that I could easily write a recursive query using them, I was able to put together a script to pull the information needed fairly quickly. By throwing it into a procedure it could be executed quickly and easily.

CREATE PROCEDURE ManagerRecursion_CTE @ManagerID INT

AS

SET NOCOUNT ON

;WITH Managers_CTE (ManagerID, EmployeeID )

AS ( SELECT ManagerID, EmployeeID FROM #ManagersEmployees  

        WHERE ManagerID = @ManagerID

UNION ALL

    SELECT e.ManagerID, e.EmployeeID 

        FROM #ManagersEmployees e 

            INNER JOIN Managers_CTE c on e.ManagerID = c.EmployeeID)

SELECT * FROM Managers_CTE ORDER BY ManagerID, EmployeeID

GO

 

I tested this and it worked nicely; it was a simple solution and provided the requested results.
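For example, a quick test against the sample table above might look like this:

-- Returns the tree of employees under ManagerID 2 in the sample data
EXEC ManagerRecursion_CTE @ManagerID = 2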

 

That’s not recursion

The trouble is that, while the results were correct, I was advised that this was not recursive and did not meet the criteria. Back to the drawing board then.

After a lot more work I came up with the following:

CREATE PROCEDURE ManagerRecursion_NonCTE @ManagerID INT

AS

SET NOCOUNT ON

DECLARE @rowcnt INT, @lastrow INT

DECLARE @Tbl_Results TABLE (rowid INT IDENTITY(1,1), ManagerID INT, EmployeeID INT)

 

INSERT INTO @Tbl_Results (ManagerID, EmployeeID)

SELECT ManagerID, EmployeeID

FROM #ManagersEmployees

WHERE ManagerID = @ManagerID

 

SET @rowcnt = @@ROWCOUNT

SET @lastrow = 0

WHILE @rowcnt > 0

BEGIN

INSERT INTO @Tbl_Results (ManagerID, EmployeeID)

SELECT m.ManagerID, m.EmployeeID

FROM #ManagersEmployees m

    INNER JOIN @Tbl_Results t

        ON m.ManagerID = t.EmployeeID

WHERE rowid > @lastrow

 

SELECT @rowcnt = @@ROWCOUNT

 

SELECT @lastrow = @@IDENTITY - @rowcnt

END

SELECT ManagerID, EmployeeID FROM @Tbl_Results ORDER BY ManagerID, EmployeeID

GO

 

I tested this and got back the same results as with the first procedure with all the values I passed in. Deep breath on this one as it was pushing the limits of what I could produce on the spot in an interview.

 

That’s still not recursion

Again, while the results were correct, this was not recursive. It was back to the drawing board once more. This time I had to admit defeat; however, I did tell the interviewer that I would work on a solution at home and email it in. He gave me his contact information, we completed the rest of the interview, and I went home determined to get the right data in the manner the interviewer wanted.

After a whole bunch of reading and a lot of work I finally came up with correct results in a recursive procedure which I emailed in to get feedback.

CREATE PROC ManagerRecursion @EmpID INT, @InnerLoop INT = 0

AS

BEGIN

    IF @InnerLoop = 0

        BEGIN

        CREATE TABLE #Tbl_Results (ManagerID INT, EmployeeID INT)

        END

            INSERT INTO #Tbl_Results (ManagerID, EmployeeID)

            SELECT ManagerID, EmployeeID FROM #ManagersEmployees WHERE ManagerID = @EmpID

 

            SET @EmpID = (SELECT MIN(EmployeeID) FROM #ManagersEmployees WHERE ManagerID = @EmpID)

 

            WHILE @EmpID IS NOT NULL

            BEGIN

                EXEC ManagerRecursion @EmpID, 1

                SET @EmpID = (SELECT MIN(EmployeeID) FROM #ManagersEmployees WHERE EmployeeID > @EmpID 

                    AND EmployeeID NOT IN (SELECT ManagerID FROM #Tbl_Results)

                    AND EmployeeID IN (SELECT EmployeeID FROM #Tbl_Results))

            END

    IF @InnerLoop = 0   

        BEGIN

        SELECT * FROM #Tbl_Results order by ManagerID, EmployeeID

        DROP TABLE #Tbl_Results

        END      

END

 

GO

 

Unfortunately no feedback was forthcoming. I felt good about providing this solution despite that. I enjoy a challenge and this was certainly one of those.

 

So what was the right way?

That depends on who you ask. For me the first way was the right way. It performed well, the code was clean and easy, and it required a minimal amount of development. I feel that this solution is exactly the reason that CTEs were created in the first place.

The second solution was a lot more work, the query got more complex and it does not perform as well as the first.

The final procedure was true recursion in that the procedure calls itself over and over again until all of the results are returned. It’s a loop that makes cursors look like they perform well. It was easily the worst performing of the three.

 

It all goes to show there’s more than one way to get the results you need. It’s also interesting how an example like this shows just how much work the SQL development team have done to help reduce complexity and improve performance.

Upcoming Clustering Fun

I’ve a fun week coming up: I get to do a bunch of installs of SQL 2008 R2 on some Windows 2008 R2 clusters.

 

I’ve had a lot of experience in building and working with Windows 2003 clusters and deploying SQL 2008, but this is a new era for me and means new toys to play with and new things to learn. I’m excited to really get to grips with Windows 2008 clustering; there are some significant differences between it and 2003, which is going to provide some challenges. I’m aiming to perform quite a few build-ups and tear-downs of one cluster over the next couple of weeks to build up my comfort level with the newer technology, and to work towards getting some nice scripts together to do so.

 

As a consequence of Windows 2008 R2 being new to me for clustering I decided to pick up Pro SQL Server 2008 Failover Clustering by Allan Hirt (blog|twitter), a Clustering MVP and guru (who has a new whitepaper out about Applying Updates to a Clustered Instance of SQL Server 2008 or SQL Server 2008 R2). Allan was on the MidnightDBA web show a couple of weeks ago (go watch it) where there was lots of conversation on several aspects of High Availability.

 

I’m still only a quarter of the way through the book but have already found out some fantastic information on things such as the network priority and installing server roles from the command line.

 

As the build goes along I plan on putting up a couple of posts to track the process and gotchas as well as a final review of Allan’s book, so check back soon.

Checking Database Space With PowerShell

Have you ever been asked the question “quick, I need to know how much free space there is in the Nodrap database on server Cromulent”?

Ever wished that you could quickly grab that information?

You might well have a TSQL script that will grab that information for you. I’ve been working on pulling that information centrally, and grabbing that data for multiple servers becomes a much simpler task when using PowerShell.

 

 

Why PowerShell and not TSQL?

For me the big bonus of PowerShell for grabbing this data is that I can pull it quickly and easily, I can just as simply run it against multiple machines, and the same script works for SQL 2000 and up.

 

 

How big is my database?

Open up your favorite PowerShell ISE (you can even use the one that comes with Windows). If they aren’t already loaded you’ll need to add the SQL snapins:

Add-PSSnapin SqlServer* 


 

 

To get a list of databases, their sizes and the space available in each on the local SQL instance:

dir SQLSERVER:\SQL\localhost\default\databases | Select Name, Size, SpaceAvailable | ft -auto


 

CheckDBWPS SizeSpace

 

Very quick and easy. The information here is a little misleading though.

Size is the size of the database and includes the size of the transaction log(s). SpaceAvailable, however, only represents the space available in the data files, unless we’re looking at SQL 2000, in which case it’s the free space in the data and log files. To make things even more confusing, Size is reported in MB and SpaceAvailable in KB.

To get more details we need to look at the transaction log information.

 

 

How big is my transaction log?

To grab this information we need to go deeper. In this case we will focus on the AdventureWorks database.

dir SQLSERVER:\SQL\localhost\default\databases\AdventureWorks\logfiles | 

    Select Name, Size, UsedSpace | ft -auto


 

CheckDBWPS LogSpace

 

This information is all in KB, which makes it a little easier to make sense of.

 

Doing the math

Now that we have the size of the database, the size of the log and the free space in each, some quick math gives us the sizing information.

  • Size = Database size in MB
  • Size / 1024 (from log) = Log size in MB
  • UsedSpace / 1024 = Log used in MB
  • (Size - UsedSpace) / 1024 (from log) = Log free in MB
  • Size - (Size / 1024) (from log) = Data files size in MB
  • SpaceAvailable / 1024 = Space free in data files
  • Size - (SpaceAvailable / 1024) - (Size / 1024) (from log) = Space used in data files
  • Size - (SpaceAvailable / 1024) - (UsedSpace / 1024) (from log) = Space used in data files (SQL 2000)

 

Outputting nice results

Running all this and grabbing the data from different areas can lead to messy results. This is where the DataTable comes to the rescue.

A DataTable is a .NET object, much like a SQL table, that can hold data for you.

First you create the object, define and add columns and then add rows before finally returning the data.

Here’s a quick example:

$dataTable = New-Object system.Data.DataTable "SomeStuff"

#Specify the column names for the data table

$col1 = New-Object system.Data.DataColumn FunkyText,([string])

$col2 = New-Object system.Data.DataColumn JustANumber,([int])

$col3 = New-Object system.Data.DataColumn TodaysDate,([datetime])

 

#Add the columns to the data table

$dataTable.Columns.Add($col1)

$dataTable.Columns.Add($col2)

$dataTable.Columns.Add($col3)

#Create a new row

$row = $dataTable.NewRow()

#Add data for each column in the row

$row.FunkyText = "Something really funky"

$row.JustANumber = 1395

$row.TodaysDate = ((Get-Date ).ToString("yyyy-MM-dd HH:mm:ss"))

#Add the new row to the datatable

$dataTable.Rows.Add($row)

 

#Output the datatable

$dataTable | out-gridview


 

CheckDBWPS Datatable

 

 

Putting everything together

Running the following script will pull the data and log information for all the databases on the SQL instance specified in the top parameter. It also handles named instances without modification, and is factored to return the correct information for SQL 2000 as well as higher versions.

Note: Run DBCC UPDATEUSAGE for SQL 2000 instances to ensure that the data is accurate

Param ($Servername= 'localhost')

  
Function Get-DBSizes ($Servername)

{

$Servername

#Check to see if it's a named instance; if it is, the location will be different

if ($Servername -ilike "*\*") { $BaseLocation = "SQLSERVER:\SQL\$Servername\databases" }

    else { $BaseLocation = "SQLSERVER:\SQL\$Servername\default\databases" }

$dblist = dir $BaseLocation -Force | select Name

foreach ($db in $dblist)

{

    $location = $db.Name

    $locationFixed = $location -replace "[.]", "%2e"

    #Grab the database information

    $DBInfo = dir $BaseLocation -Force | Where-Object {$_.Name -eq $location; $_.Refresh()} | 

        select Name, 

                size,

                SpaceAvailable,

                CompatibilityLevel

 

    #Pull the log information

    #Use measure-object to sum up sizes in the event that we might have more than one log file

    $logsize = dir $BaseLocation\$locationFixed\logfiles | Where-Object {$_.Name; $_.Refresh()} | Measure-Object -Property Size -Sum 

    $logused = dir $BaseLocation\$locationFixed\logfiles | Where-Object {$_.Name; $_.Refresh()} | Measure-Object -Property UsedSpace -Sum 

    $sp = $DBInfo.SpaceAvailable

    $TotalDBSizeMB = $DBInfo.size

    $LogSizeMB = ($logsize.sum / 1024)

    $LogUsedMB = ($logused.Sum / 1024)

    $LogFreeMB = ($LogSizeMB - $LogUsedMB)

    $DataFilesSizeMB = ($TotalDBSizeMB - $LogSizeMB)

    $SpaceAvail = ($dbinfo.SpaceAvailable / 1024)

    
    #Because SQL 2000 SpaceAvailable includes log space we have to do different calculations depending on version

    #Run DBCC UPDATEUSAGE on your 2000 databases nightly to help ensure this data is accurate

    $Compat = $DBInfo.CompatibilityLevel

    if ($Compat -eq 'Version80') { $DataFileFreeMB = ($SpaceAvail - $LogFreeMB)    }

        else { $DataFileFreeMB = $SpaceAvail }

    

    

    $DataFileUsedMB = ($DataFilesSizeMB - $DataFileFreeMB)

    $DataFilePercentUsed = ($DataFileUsedMB / $DataFilesSizeMB) * 100

    $DataFilePercentFree = 100 - $DataFilePercentUsed 

    $LogPercentUsed = ($LogUsedMB / $LogSizeMB) * 100

    $LogPercentFree = 100 - $LogPercentUsed

    $date = (Get-Date ).ToString("yyyy-MM-dd HH:mm:ss")

    

    #Write the results into the data table

    $row = $dataTable.NewRow()

    $row.ServerName = $Servername

    $row.DatabaseName = $location

    $row.TotalDBSizeMB = $TotalDBSizeMB

    $row.DataFilesSizeMB = $DataFilesSizeMB

    $row.DataFilesUsedMB = $DataFileUsedMB

    $row.DataFilesFreeMB = $DataFileFreeMB

    $row.DataPercentUsed = $DataFilePercentUsed

    $row.DataPercentFree = $DataFilePercentFree

    $row.LogFilesSizeMB = $LogSizeMB

    $row.LogFilesUsedMB = $LogUsedMB

    $row.LogFilesFreeMB = $LogFreeMB

    $row.LogPercentUsed = $LogPercentUsed

    $row.LogPercentFree = $LogPercentFree

    $row.Date = $date

    $dataTable.Rows.Add($row)

    
    #And we are done

}

}

 
 

#Create data table to hold the results

$dataTable = New-Object system.Data.DataTable "Results"

#Specify the column names for the data table

$col1 = New-Object system.Data.DataColumn ServerName,([string])

$col2 = New-Object system.Data.DataColumn DatabaseName,([string])

$col3 = New-Object system.Data.DataColumn TotalDBSizeMB,([int])

$col4 = New-Object system.Data.DataColumn DataFilesSizeMB,([int])

$col5 = New-Object system.Data.DataColumn DataFilesUsedMB,([int])

$col6 = New-Object system.Data.DataColumn DataFilesFreeMB,([int])

$col7 = New-Object system.Data.DataColumn DataPercentUsed,([decimal])

$col8 = New-Object system.Data.DataColumn DataPercentFree,([decimal])

$col9 = New-Object system.Data.DataColumn LogFilesSizeMB,([int])

$col10 = New-Object system.Data.DataColumn LogFilesUsedMB,([int])

$col11 = New-Object system.Data.DataColumn LogFilesFreeMB,([int])

$col12 = New-Object system.Data.DataColumn LogPercentUsed,([decimal])

$col13 = New-Object system.Data.DataColumn LogPercentFree,([decimal])

$col14 = New-Object system.Data.DataColumn Date,([datetime])

#Add the columns to the data table

$dataTable.Columns.Add($col1)

$dataTable.Columns.Add($col2)

$dataTable.Columns.Add($col3)

$dataTable.Columns.Add($col4)

$dataTable.Columns.Add($col5)

$dataTable.Columns.Add($col6)

$dataTable.Columns.Add($col7)

$dataTable.Columns.Add($col8)

$dataTable.Columns.Add($col9)

$dataTable.Columns.Add($col10)

$dataTable.Columns.Add($col11)

$dataTable.Columns.Add($col12)

$dataTable.Columns.Add($col13)

$dataTable.Columns.Add($col14)

 
 

 
#Provide the name of the SQL server that we want to check

 
$WarningPreference = "silentlycontinue"

#Call the function to populate the results

#get-content c:\Temp\serverlist.txt | % {Get-DBSizes $_ } 

Get-DBSizes $Servername

 

#Output the results

$dataTable | Out-GridView

 

 

Click to embiggen

 

Final note

Download the PS1 file and save it on your machine. From there you can call the script and just pass in the SQL instance to get the results, e.g.:

./Get-DatabaseSize.ps1 localhost
./Get-DatabaseSize.ps1 cromulentawesomesauce

In a follow up post I’ll show how we apply a minor tweak to the script and have the data loaded into a SQL database.

Stop Logging Successful Backups

It’s great that SQL writes to the Event log and SQL log every time a backup completes. You get to see lots of data saying that database master was backed up and database model was backed up, then that msdb was backed up etc…

 

Is it really that useful?

Well, at least it can be useful to have in there. The thing is there are better ways to find out when your databases were last backed up (such as using PowerShell or querying MSDB).
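If you’d rather query msdb, something along these lines (a minimal sketch) returns the most recent backup of each type for every database:

-- Most recent backup finish time per database and backup type (D = full, I = differential, L = log)
SELECT database_name, type, MAX(backup_finish_date) AS last_backup_finish
FROM msdb.dbo.backupset
GROUP BY database_name, type
ORDER BY database_name, type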

The downside is that it quickly fills your logs with information that you don’t really need to know. It gets even worse if you perform frequent transaction log backups. If you have three databases that are having their logs dumped to disk every 20 minutes, all of a sudden your SQL log is next to useless. All you are seeing is backup success messages and not much else. It makes it all too easy to miss the important things that you really need to pay attention to.

 

There is hope

Yes, it’s true! You can stop SQL from dumping that information into the log.

By enabling trace flag 3226 you kill these messages dead. Worry not: any failure will still be written to the logs, but all those pointless notifications will vanish. All of a sudden your logs will be cleaner and meaner, important things will stand out and your scroll wheel can take a break.

 

How to enable T3226

Open up SQL Server Configuration Manager and bring up the properties of the SQL Server service. Under the Advanced tab you’ll see the startup parameters. Just tack ;-T3226 on to the end, apply the changes and restart SQL.
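If you can’t wait for a restart, the flag can also be switched on for the running instance (a quick sketch; this applies globally but won’t survive a restart, which is why the startup parameter is the better long-term option):

DBCC TRACEON (3226, -1)   -- enable trace flag 3226 globally until the next restart
DBCC TRACESTATUS (3226)   -- confirm that the flag is enabled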

 

The steps are further outlined at http://msdn.microsoft.com/en-us/library/ms345416.aspx and http://msdn.microsoft.com/en-us/library/ms190737.aspx.

 

Go ahead, clean up those log files…you know you want to.

T-SQL Tuesday #10 – Applying Indexes To Replicated Tables

It’s T-SQL Tuesday time again! This month Michael J. Swart (blog|twitter) is hosting and has given us the topic of indexes.

I’ve previously written about how you can find indexes that are not being used and get their sizes. Not being allowed to reuse that meant that I had to come up with something else.

 In my previous post on calculating replication schema options I documented a way of being able to change article options in transactional replication so that you could apply all the non-clustered indexes on your subscribing table that exist on the publishing one. But what if you are using replication to offload reporting and you need to create indexes that don’t exist on the publishing table?

You could always manually apply these indexes any time that you perform a snapshot but this requires manual work and can easily be forgotten.

Fortunately there is another option, you can have replication automatically apply a post-snapshot SQL script which will create those indexes for you. Here’s how…

 

Create a basic publication (snapshot or transactional) and add a table article. Perform the snapshot and compare indexes between the tables (in this example I am using the person.address table from AdventureWorks2008).

After the snapshot here’s how the indexes stand

 

For this example we’re going to create a new index on Admindb.person.address. To do this first we need to generate the index creation script

 

USE AdminDB

GO

IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[person].[address]') AND name = N'NCIX_Zip')
CREATE NONCLUSTERED INDEX [NCIX_Zip] ON [person].[address]
(
    [PostalCode] ASC
)
WITH (ONLINE = ON, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

 

Save this script to C:\ReplicationScripts\CreateReportingIndex.sql

Now we change the existing publication so that whenever a new snapshot is applied to the subscriber this script is run afterwards. This change can be done in a couple of ways.

The quick and easy way is to use sp_changepublication and just pass in the publication name, property change and location of the script.

 

USE AdventureWorks2008
GO

EXEC sp_changepublication
    @publication = 'Post Snapshot Test'
    , @property = N'post_snapshot_script'
    , @value = 'C:\ReplicationScripts\CreateReportingIndex.sql'
    , @force_invalidate_snapshot = 1 --must invalidate existing snapshots for this to take effect
GO

 

This can also be done using the GUI (right click on the publication and select Properties)

 

 

When you next perform a snapshot the script will be applied. We can check that this actually worked by looking at Replication Monitor, which will tell you that the script was applied.

 

And then you can check the table itself and confirm that the index exists

 

There are a couple of final things to note.

Usually you will specify a UNC path for the SQL script so that you just have to maintain a single copy. You just need to ensure that the subscriber is able to access the post-snapshot script location.

Post-snapshot scripts are not limited to creating indexes; in fact you can do pretty much anything, so long as the Distribution Agent has the access:

  • Send yourself an email to let you know that the snapshot has been applied on the subscriber
  • Apply table permissions
  • Create views

If you use replication and find yourself manually reapplying indexes it’s worth evaluating whether using post-snapshot scripts might save you some work.

Calculating Replication Schema Options

I recently evaluated using page compression on some of my production data (see my post on estimating data compression). I’m using transactional replication to send this data out to a few other servers for reporting purposes, and I wanted to adjust replication so that the data compression settings on the tables would also be replicated.

Making life a little easier was an application change that was going to require dropping and recreating replication anyway, so I figured it would be a simple case of changing the replication options when recreating the publications.

Using the GUI I scripted out all of the existing publications and then dropped them.

Now I wanted to adjust the script so that the schema options I wanted replicated were applied. Here’s the sp_addarticle for one of these tables (SQL 2008)

-- Adding the transactional articles

 use [TestDB]

exec sp_addarticle
    @publication = N'TestDB_TestTable'
    , @article = N'TestTable'
    , @source_owner = N'dbo'
    , @source_object = N'TestTable'
    , @type = N'logbased'
    , @description = N''
    , @creation_script = N''
    , @pre_creation_cmd = N'drop'
    , @schema_option = 0x000000000803509F
    , @identityrangemanagementoption = N'none'
    , @destination_table = N'TestTable'
    , @destination_owner = N'dbo'
    , @status = 24
    , @vertical_partition = N'false'
    , @ins_cmd = N'CALL [dbo].[sp_MSins_dboTestTable]'
    , @del_cmd = N'CALL [dbo].[sp_MSdel_dboTestTable]'
    , @upd_cmd = N'SCALL [dbo].[sp_MSupd_dboTestTable]'
GO

 

Now this doesn’t tell you a lot about what schema options are in place. The binary value @schema_option has all the information you need, however it’s not exactly user friendly. Books Online has the information needed to figure this out, it just takes a little calculation. Since that’s somewhat of a pain, I’ve created a script to tell me what options are enabled.

All you need to do is pass in the @schema_option value and it will tell you what options are enabled (in this case  0x000000000803509F)

 

/* PROVIDES THE REPLICATION OPTIONS ENABLED FOR A GIVEN @SCHEMA_OPTION IN SYSARTICLES */

declare @schema_option varbinary(8) = 0x000000000803509F  --< PUT YOUR SCHEMA_OPTION HERE

set nocount on
declare @OptionTable table ( HexValue varbinary(8), IntValue as cast(HexValue as bigint), OptionDescription varchar(255))
insert into @OptionTable (HexValue, OptionDescription)
select 0x01 ,'Generates object creation script'
union all  select 0x02 ,'Generates procs that propagate changes for the article'
union all  select 0x04 ,'Identity columns are scripted using the IDENTITY property'
union all  select 0x08 ,'Replicate timestamp columns (if not set timestamps are replicated as binary)'
union all  select 0x10 ,'Generates corresponding clustered index'
union all  select 0x20 ,'Converts UDT to base data types'
union all  select 0x40 ,'Create corresponding nonclustered indexes'
union all  select 0x80 ,'Replicate pk constraints'
union all  select 0x100 ,'Replicates user triggers'
union all  select 0x200 ,'Replicates foreign key constraints'
union all  select 0x400 ,'Replicates check constraints'
union all  select 0x800  ,'Replicates defaults'
union all  select 0x1000 ,'Replicates column-level collation'
union all  select 0x2000 ,'Replicates extended properties'
union all  select 0x4000 ,'Replicates UNIQUE constraints'
union all  select 0x8000 ,'Not valid'
union all  select 0x10000 ,'Replicates CHECK constraints as NOT FOR REPLICATION so are not enforced during sync'
union all  select 0x20000 ,'Replicates FOREIGN KEY constraints as NOT FOR REPLICATION so are not enforced during sync'
union all  select 0x40000 ,'Replicates filegroups'
union all  select 0x80000 ,'Replicates partition scheme for partitioned table'
union all  select 0x100000 ,'Replicates partition scheme for partitioned index'
union all  select 0x200000 ,'Replicates table statistics'
union all  select 0x400000 ,'Default bindings'
union all  select 0x800000 ,'Rule bindings'
union all  select 0x1000000 ,'Full text index'
union all  select 0x2000000 ,'XML schema collections bound to xml columns not replicated'
union all  select 0x4000000 ,'Replicates indexes on xml columns'
union all  select 0x8000000 ,'Creates schemas not present on subscriber'
union all  select 0x10000000 ,'Converts xml columns to ntext'
union all  select 0x20000000 ,'Converts (max) data types to text/image'
union all  select 0x40000000 ,'Replicates permissions'
union all  select 0x80000000 ,'Drop dependencies to objects not part of publication'
union all  select 0x100000000 ,'Replicate FILESTREAM attribute (2008 only)'
union all  select 0x200000000 ,'Converts date & time data types to earlier versions'
union all  select 0x400000000 ,'Replicates compression option for data & indexes'
union all  select 0x800000000  ,'Store FILESTREAM data on its own filegroup at subscriber'
union all  select 0x1000000000 ,'Converts CLR UDTs larger than 8000 bytes to varbinary(max)'
union all  select 0x2000000000 ,'Converts hierarchyid to varbinary(max)'
union all  select 0x4000000000 ,'Replicates filtered indexes'
union all  select 0x8000000000 ,'Converts geography, geometry to varbinary(max)'
union all  select 0x10000000000 ,'Replicates geography, geometry indexes'
union all  select 0x20000000000 ,'Replicates SPARSE attribute'

select HexValue, OptionDescription as 'Schema Options Enabled'
From @OptionTable where (cast(@schema_option as bigint) & cast(HexValue as bigint)) <> 0

 

This is really great and useful. In fact you can use this script to check out the schema options for any article; just plug in the schema_option value from sysarticles to get the data.
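To grab that value for every article, something like this (a quick sketch, run in the publication database) will do:

-- Run in the publication database: list each article and its schema_option bitmask
SELECT name, schema_option
FROM dbo.sysarticles
ORDER BY name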

Now to take it a step further I wanted to know what the new value would need to be for @schema_option in order to replicate the data compression settings. For this I wrote another query. This time it’s just a case of uncommenting the lines for the schema options that I want and running the select.

select cast(

  cast(0x01 AS BIGINT) --DEFAULT Generates object creation script
| cast(0x02 AS BIGINT) --DEFAULT Generates procs that propagate changes for the article
| cast(0x04 AS BIGINT) --Identity columns are scripted using the IDENTITY property
| cast(0x08 AS BIGINT) --DEFAULT Replicate timestamp columns (if not set timestamps are replicated as binary)
| cast(0x10 AS BIGINT) --DEFAULT Generates corresponding clustered index
--| cast(0x20 AS BIGINT) --Converts UDT to base data types
--| cast(0x40 AS BIGINT) --Create corresponding nonclustered indexes
| cast(0x80 AS BIGINT) --DEFAULT Replicate pk constraints
--| cast(0x100 AS BIGINT) --Replicates user triggers
--| cast(0x200 AS BIGINT) --Replicates foreign key constraints
--| cast(0x400 AS BIGINT) --Replicates check constraints
--| cast(0x800 AS BIGINT)  --Replicates defaults
| cast(0x1000 AS BIGINT) --DEFAULT Replicates column-level collation
--| cast(0x2000 AS BIGINT) --Replicates extended properties
| cast(0x4000 AS BIGINT) --DEFAULT Replicates UNIQUE constraints
--| cast(0x8000 AS BIGINT) --Not valid
| cast(0x10000 AS BIGINT) --DEFAULT Replicates CHECK constraints as NOT FOR REPLICATION so are not enforced during sync
| cast(0x20000 AS BIGINT) --DEFAULT Replicates FOREIGN KEY constraints as NOT FOR REPLICATION so are not enforced during sync
--| cast(0x40000 AS BIGINT) --Replicates filegroups (filegroups must already exist on subscriber)
--| cast(0x80000 AS BIGINT) --Replicates partition scheme for partitioned table
--| cast(0x100000 AS BIGINT) --Replicates partition scheme for partitioned index
--| cast(0x200000 AS BIGINT) --Replicates table statistics
--| cast(0x400000 AS BIGINT) --Default bindings
--| cast(0x800000 AS BIGINT) --Rule bindings
--| cast(0x1000000 AS BIGINT) --Full text index
--| cast(0x2000000 AS BIGINT) --XML schema collections bound to xml columns not replicated
--| cast(0x4000000 AS BIGINT) --Replicates indexes on xml columns
| cast(0x8000000 AS BIGINT) --DEFAULT Creates schemas not present on subscriber
--| cast(0x10000000 AS BIGINT) --Converts xml columns to ntext
--| cast(0x20000000 AS BIGINT) --Converts (max) data types to text/image
--| cast(0x40000000 AS BIGINT) --Replicates permissions
--| cast(0x80000000 AS BIGINT) --Drop dependencies to objects not part of publication
--| cast(0x100000000 AS BIGINT) --Replicate FILESTREAM attribute (2008 only)
--| cast(0x200000000 AS BIGINT) --Converts date & time data types to earlier versions
| cast(0x400000000 AS BIGINT) --Replicates compression option for data & indexes
--| cast(0x800000000 AS BIGINT)  --Store FILESTREAM data on its own filegroup at subscriber
--| cast(0x1000000000 AS BIGINT) --Converts CLR UDTs larger than 8000 bytes to varbinary(max)
--| cast(0x2000000000 AS BIGINT) --Converts hierarchyid to varbinary(max)
--| cast(0x4000000000 AS BIGINT) --Replicates filtered indexes
--| cast(0x8000000000 AS BIGINT) --Converts geography, geometry to varbinary(max)
--| cast(0x10000000000 AS BIGINT) --Replicates geography, geometry indexes
--| cast(0x20000000000 AS BIGINT) --Replicates SPARSE attribute
AS BINARY(8)) as Schema_Option
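(For the options left uncommented above, the select should return 0x000000040803509F, which is the original 0x000000000803509F with just the 0x400000000 compression bit added.)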

 

Plugging in that new value and executing the sp_addarticle script created the new article with all the old settings, now including the replication of compression options.

Download the script Replication Schema Options.sql, give them a try yourself, and let me know if they help.