Fixing SSISDB Performance Troubles (KB 2829948)

Disclaimer: THIS IS NOT SUPPORTED BY MICROSOFT, YOU SHOULD BACK UP ANY ORIGINALS BEFORE ALTERING.
YOU MIGHT LOOSE SUPPORT IF ADVANCING WITH THIS MODIFICATIONS.

Though these modifications are a part of KB 2829948 :)

I have to say that I really love the SSISDB concept introduced in SQL Server 2012. As long as it performs well. :)

I have recently worked with a server which had a SSISDB in a situation where it had good 50 GB of space occupied – 30 GB were spent on data, plus good 20 GB were spent on the transaction log. And of course the server itself was not overloaded with the gigantic amount of jobs, no! It was just a simple situation of the 2 of the most important maintenance jobs running amok and not being able to finish.

Which jobs am I talking about ?
Theese 2 stored procedures which are executed in the maintenance job SSIS Server Maintenance Job:
[internal].[cleanup_server_retention_window]
[internal].[cleanup_server_project_version]

If you don’t know them yet – their purpose lies in cleaning up the history of the SSISDB. I guess everyone is looking to limit the storage of the history, such as packages execution statistics.
The problem with the performance of those 2 stored procedures is that they are executing deletes on a central table – [internal].[operations] which is in its turn will do cascading deletes over almost complete schema of SSISDB. This why internally in stored procedure [internal].[cleanup_server_retention_window] the deletion is executed by 10 rows at a time.

In my concrete situation I have had 36.790 rows in my [internal].[operations], but my tables [internal].[event_messages] & [internal].[operation_messages] had 29 Million rows each and occupying 13 GB & 4 GB of space respectively and of course [internal].[event_message_context] had its 52 Million Rows and another 13 GB spent.

How could it be? Quite simple – take a look at the connections between the tables inside SSISDB.
Screen Shot 2014-01-26 at 16.44.41
This is why a direct attempt of deleting just 1 row from [internal].[operations] table would take around 6 minutes and just one loop of deleting the rows would provoke a task of 40+ minutes. Having 36.000+ rows simply meant that it would not finish before 2015.

So I decided to take a better look at the execution plans and surprise! A lot of missing indexes started showing up immediately. I simply stated creating some of those indexes on SSISDB in order to see if I can get some reasonable performance improvements.

I will cut down through the chase that there is a more simple way for those who can do upgrades by installing the Cummulative Update that Microsoft released for both SQL Server RTM as well as for SQL Server SP1http://support.microsoft.com/kb/2829948

Since a lot of organization are still reluctant to install the latest and greatest Cummulative Updates on their servers and installing just the Service Packs, I decided that I shall provide here some of the indexes that I verified to be existing in the update – you can add those indexes now, but removing before advancing with Cummulative Updates.

/*
 * This script is created on basis of extraction of some of the changes from Microsofts' KB 2829948 (http://support.microsoft.com/kb/2829948)
 */
CREATE NONCLUSTERED INDEX [IX_EventMessageContext_Operation_id] ON [internal].[event_message_context]
(
	[operation_id] ASC,
	[event_message_id] ASC
)
INCLUDE ( 	[context_id])
with( ONLINE = ON, SORT_IN_TEMPDB =ON );

CREATE NONCLUSTERED INDEX [IX_EventMessages_Operation_id] ON [internal].[event_messages]
(
	[operation_id] ASC
)
with( ONLINE = ON, SORT_IN_TEMPDB =ON );

CREATE NONCLUSTERED INDEX [IX_ExecutableStatistics_Execution_id] ON [internal].[executable_statistics]
(
	[execution_id] ASC
)
INCLUDE ( 	[statistics_id] )
with( ONLINE = ON, SORT_IN_TEMPDB =ON );

-- This index exists in RTM & SP1 but with just using [sequence_id] column only
CREATE NONCLUSTERED INDEX [Unique_sequence_id] ON [internal].[execution_component_phases]
(
	[execution_id] ASC,
	[sequence_id] ASC
)
with( ONLINE = ON, SORT_IN_TEMPDB =ON, DROP_EXISTING = ON );

CREATE NONCLUSTERED INDEX [IX_OperationMessages_Operation_id] ON [internal].[operation_messages]
(
	[operation_id] ASC
)
INCLUDE ( 	[operation_message_id])
with( ONLINE = ON, SORT_IN_TEMPDB =ON );


Anyway – USE THIS AT OUR OWN RISK ONLY and AFTER COMPLETE TESTING ON THE SAFE DEVELOPMENT ENVIRONMENT.

After installing those indexes, my execution times went right up to 6.5 seconds per each of the 10 rows deletion from the [internal].[operations] table, and right now this database is enjoying a healthy life with much less space and rows to support.

There are a lot of other tricks one could do, but I won’t even mention them here, since it simply does not make any sense in the terms of supportability.
But yes, I did option(recompile) and wait(..) in order to improve performance and availability of the SSISDB while deleting those chunks of data :)

Leave a Reply

Your email address will not be published. Required fields are marked *