I have a bunch of monitoring routines running against my SQL Servers. One of these checks available disk space on drives. Today I was surprised to see that one of my SQL system drives was looking pretty full. This is unusual to say the least because I dedicate a drive to the system (master/model/msdb) databases. Upon connecting to the server I discovered that MSDB had blown up to over 12GB in size.
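A quick way to confirm which files are involved is to check the catalog views (a minimal sketch; nothing here is specific to my setup):

-- File sizes for msdb; size is reported in 8KB pages, so convert to MB
SELECT name, physical_name, size * 8 / 1024 AS size_mb
FROM msdb.sys.database_files;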
Checking For Large Tables
I used the following query to look for large tables in MSDB (the query is based entirely on running Profiler against SQL Server while the Disk Usage by Table report runs):
SELECT TOP 10
    a3.name + '.' + a2.name AS [tablename],
    a1.rows AS row_count,
    (a1.reserved + ISNULL(a4.reserved, 0)) * 8 AS reserved_kb
FROM
    (SELECT
        ps.object_id,
        SUM(CASE WHEN (ps.index_id < 2) THEN row_count ELSE 0 END) AS [rows],
        SUM(ps.reserved_page_count) AS reserved
    FROM sys.dm_db_partition_stats ps
    GROUP BY ps.object_id) AS a1
LEFT OUTER JOIN
    (SELECT
        it.parent_id,
        SUM(ps.reserved_page_count) AS reserved,
        SUM(ps.used_page_count) AS used
    FROM sys.dm_db_partition_stats ps
    INNER JOIN sys.internal_tables it ON (it.object_id = ps.object_id)
    WHERE it.internal_type IN (202, 204)
    GROUP BY it.parent_id) AS a4
    ON (a4.parent_id = a1.object_id)
INNER JOIN sys.all_objects a2 ON (a1.object_id = a2.object_id)
INNER JOIN sys.schemas a3 ON (a2.schema_id = a3.schema_id)
WHERE a2.type <> N'S' AND a2.type <> N'IT'
ORDER BY reserved_kb DESC, [tablename]
The results were shocking, to say the least:
- sysmail_mailitems contained 717,878 rows and was over 9GB in size
- backupfile/backupmediafamily/backupmediaset/backupset combined held over 8.6 million rows and totalled over 2.5GB in their space usage
Forgetting To Clean Up
MSDB is something frequently forgotten. Cleaning up job history is nice and easy: it's just a setting on the SQL Agent properties. Cleaning up sent emails and backup records is not as straightforward. Why Microsoft hasn't included an equivalent setting for those I don't know, especially given that they have provided stored procedures to perform the cleanup of this data.
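(If you would rather script the job history cleanup than rely on the Agent setting, there is a procedure for that too; the 30-day cutoff below is just my illustrative choice:)

-- Purge Agent job history older than 30 days (illustrative retention)
DECLARE @cutoff DATETIME = DATEADD(d, -30, GETDATE());
EXEC msdb.dbo.sp_purge_jobhistory @oldest_date = @cutoff;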
My server has been in service since March 2009. That’s a lot of backups and a lot of emails sent. I needed to get a handle on cleaning this up and then implementing a process to ensure that these tables didn’t get out of control again.
Deleting Old Mail Items
Microsoft have provided the procedure sysmail_delete_mailitems_sp to clean up the sysmail_mailitems table. It takes two parameters:
- @sent_before – datetime – used to delete mail items sent prior to a certain date
- @sent_status – varchar(8) – used to force deletion of only a certain sent status
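The @sent_status parameter is handy if, say, you only want to purge failed items. A quick sketch (the date is purely illustrative):

-- Delete only failed mail items sent before a given (illustrative) date
EXEC msdb.dbo.sysmail_delete_mailitems_sp
    @sent_before = '2011-01-01',
    @sent_status = 'failed';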
In this instance I wasn't concerned with which items were going to be deleted; I just wanted to get rid of everything older than 14 days.
Here’s a bad way to do this:
DECLARE @DATE DATETIME = dateadd(d, -14, getdate())
EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @DATE
Actually, let me clarify: this isn't a bad way to do it if you don't have two years and 9GB of data to delete. In fact this is exactly the script I later implemented in a job to keep the table size down. Running it against such a large amount of data in one go, however, just blew up the transaction log, filled the drive and then forced a rollback.
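If you want to watch what a big delete is doing to the log while it runs, there's a quick standard check (nothing specific to msdb here):

-- Report log size and percent used for every database on the instance
DBCC SQLPERF(LOGSPACE);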
Going a slightly smarter route, I performed the deletion in small batches (5 days' worth of email at a time) using the following:
-- Walk forward from near the oldest data, deleting 5 days' worth of mail
-- per pass, until we reach the 14-day retention point
DECLARE @maildate DATE = '2009-06-15'
WHILE @maildate < DATEADD(d, -14, GETDATE())
BEGIN
    EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @maildate
    SELECT @maildate = DATEADD(d, 5, @maildate)
END
This took a little while however the transaction log stayed small and it completed without incident.
Deleting Backup History
Again Microsoft do us a kindness to help clean up old data. This time it's the procedure sp_delete_backuphistory, to which we pass:
- @oldest_date – datetime – the oldest backup history record we want to keep
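For reference, a single direct call looks like this (there is also a per-database variant, sp_delete_database_backuphistory, which takes a database name and removes all history for that database):

-- Delete all backup history older than 90 days in a single call;
-- fine once the table is small, risky against years of accumulated rows
DECLARE @oldest DATETIME = DATEADD(d, -90, GETDATE());
EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @oldest;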
Having learned my lesson from deleting the old mail items, I once again performed the delete in batches using:
-- Walk forward in 15-day chunks until we reach the 90-day retention point
DECLARE @backupdate DATE = '2009-06-15'
WHILE @backupdate < DATEADD(d, -90, GETDATE())
BEGIN
    EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @backupdate
    SELECT @backupdate = DATEADD(d, 15, @backupdate)
END
I figured this would take a while. Half an hour later it was still churning away, and wait stats showed a crazy amount of I/O going on. This didn't seem right, so I stopped the script and took a look at the execution plan.
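(For the curious, here's a quick way to see what a running session is waiting on; session 55 is a hypothetical id, substitute whatever your delete is running under:)

-- Show the current wait type and accumulated reads for an active request
SELECT session_id, status, command, wait_type, wait_time, logical_reads, reads
FROM sys.dm_exec_requests
WHERE session_id = 55;  -- hypothetical session id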
The plan was pretty large, which made it difficult to look through. Luckily, a few weeks earlier I had downloaded the completely awesome SQL Sentry Plan Explorer from SQLSentry.net. If you have not installed this FREE tool yet then I recommend you go get it right now (well, after finishing this post). It makes execution plans not just readable but sortable and workable, and using Plan Explorer just once can save you hours. I should add that I am not associated with SQL Sentry in any manner, nor am I getting any kickback; the tool is just that good.
Pasting the estimated execution plan XML into Plan Explorer showed multiple clustered index scans against the backupset table. It seems that a couple of critical indexes are missing from this table.
I was quickly able to ascertain that a WHERE clause was filtering on backup_finish_date, which is indexed, but the query needed to return media_set_id, which is not included in that index. Later in the plan I could see a join against media_set_id, which isn't indexed at all.
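(To double-check what indexes already exist on the table, you can query the catalog views; a minimal sketch:)

-- List the indexes on msdb.dbo.backupset and the columns they cover
SELECT i.name AS index_name,
       c.name AS column_name,
       ic.is_included_column
FROM msdb.sys.indexes i
INNER JOIN msdb.sys.index_columns ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
INNER JOIN msdb.sys.columns c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('msdb.dbo.backupset')
ORDER BY i.name, ic.key_ordinal;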
These missing indexes were causing the massive I/O numbers I was seeing. To resolve the problem I quickly added a couple of indexes:
USE msdb
GO

IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID('[dbo].[backupset]') AND name = 'IDX_BackupSet_FinDate_MediaSet')
    CREATE NONCLUSTERED INDEX IDX_BackupSet_FinDate_MediaSet
        ON backupset (backup_finish_date) INCLUDE (media_set_id)
GO

IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID('[dbo].[backupset]') AND name = 'IDX_BackupSet_MediaSet')
    CREATE NONCLUSTERED INDEX IDX_BackupSet_MediaSet
        ON backupset (media_set_id)
GO
That done, I took another look at the estimated plan and things looked a great deal cleaner.
I kicked off the script again and this time it completed in under 5 minutes. Several million rows were deleted and 2GB of space reclaimed.
Keeping Things Clean
Not wanting to ever run into this situation again I added a job with a couple of steps to delete mail items older than 14 days and backup history older than 90 days.
This is done with a quick execution of the cleanup procedures. With the data now current and the job running every night, I won't have to worry about extended run times or MSDB getting out of control.
/* Cleanup old backup history */
DECLARE @backupdate DATE = DATEADD(d, -90, GETDATE())
EXEC msdb.dbo.sp_delete_backuphistory @backupdate
/* Cleanup old mail items */
DECLARE @maildate DATE = DATEADD(d, -14, GETDATE())
EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @maildate
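If you would rather script the job than click through the Agent GUI, here is a minimal sketch using the msdb job procedures. The job name, schedule name, and 2 AM run time are all my own illustrative choices, and I've condensed the cleanup into a single step for brevity:

USE msdb
GO

-- Create the nightly cleanup job (names and schedule are illustrative)
EXEC dbo.sp_add_job @job_name = N'MSDB Cleanup';

EXEC dbo.sp_add_jobstep
    @job_name = N'MSDB Cleanup',
    @step_name = N'Purge old backup history and mail items',
    @subsystem = N'TSQL',
    @database_name = N'msdb',
    @command = N'DECLARE @backupdate DATE = DATEADD(d, -90, GETDATE());
EXEC msdb.dbo.sp_delete_backuphistory @backupdate;
DECLARE @maildate DATE = DATEADD(d, -14, GETDATE());
EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_before = @maildate;';

-- Run every night at 2 AM
EXEC dbo.sp_add_jobschedule
    @job_name = N'MSDB Cleanup',
    @name = N'Nightly 2AM',
    @freq_type = 4,          -- daily
    @freq_interval = 1,      -- every day
    @active_start_time = 020000;

-- Target the local server
EXEC dbo.sp_add_jobserver @job_name = N'MSDB Cleanup', @server_name = N'(local)';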
Final Thought
I hadn’t taken sufficient care in ensuring that old data was kept clean. In fact I hadn’t even realized that backup history and old mail items could be a concern. Since I found this today I’ve started going through and performing this cleanup action on each of my SQL instances. Once that’s done I’ll be using PowerShell to deploy the nightly cleanup job on every instance.
I highly recommend checking your own servers to see if you might be encountering the same problems.