Columnstore Indexes – part 71 (“Change Data Capture, Change Tracking & Temporal”)

Continuation from the previous 70 parts, the whole series can be found at http://www.nikoport.com/columnstore/.

In the SQL Server world of data modification tracking, there are 2 very important features, which are called Change Data Capture (CDC) and Change Tracking (CT). Both of them allow you to understand the changes done to the monitored table. Notice that they are implemented in the different way, and have different goals.

This blogpost is dedicated to their integration and collaboration with the Columnstore Indexes in SQL Server 2016.

First of all, let’s consider the reasons for not enabling this functionality in SQL Server 2012 & SQL Server 2014:

In SQL Server 2012 we have had only Nonclustered Columnstore Indexes, which were non-updatable, implementing support for CDC & CT have simply made no sense whatsoever.

In SQL Server 2014 the Nonclustered Columnstore Indexes were kept being unupdatable, and so it was still nonsensical to make them support CDC or CT. As for the Clustered Columnstore Indexes in SQL Server 2014, they were implemented as updatable, but unfortunately both functionalities stayed out of the scope for that release.

In SQL Server 2016, we have received support for both of the scenarios, but lets take it slow and 1 by 1 diving into some of the details and tests:

Change Data Capture

The Change Data Capture is the feature that was originally implemented for the good old SQL Server 2008, with the objective of capturing data modification statements, executed against SQL Server tables. There were some important enhancement since the original version, like the addition of SSIS components in SQL Server 2012, but largely the feature has been set in the original release.

In SQL Server 2016 Clustered Columnstore Indexes have received the support of the secondary Nonclustered B-Tree Indexes, including the Unique ones, you can find more information and example on this topic at the blogpost dedicated to the Clustered Columnstore Indexes Architectural Improvements for SQL Server 2016.

From the logical perspective, in order to have support for CDC, we shall need to build a table with Clustered Columnstore Index, and we shall have to add a Unique Nonclustered Index, which we shall indicate CDC to use for collection of data modifications, if we are looking to get the net changes, while for the default implementation it is not required. But how should that work ? Given that Clustered Columnstore is the primary Index, we shall have to read its original data from the Delta-Store or from the compressed Row Group, where in the second case the performance will never be something that can be used in the real world scenario…
My thoughts are that the CDC support for Clustered Columnstore Index requires a unuseful overload for enabling it on a real Fact table, because in Data Warehousing environments doing ETL would allow you to save the modifications directly within ETL process.

For the start, let’s restore a fresh copy of ContosoRetailDW database and apply the usual modifications, such as setting Compatibility Level to 130, setting SA to be the Database Owner and increasing the default sizes for the Data File & Transaction Log:

Now we can drop Clustered B-Tree Index and create a Clustered Columnstore Index on our dbo.FactOnlineSales table:

To enable Change Data Capture (CDC), we shall need to execute the following script, which will enable it on the database level:

Now, let’s try to enable CDC on our test table (notice that I am using most of the default parameters while explicitly setting NET CHANGES to 1 (track all net changes), because this is what I expect from my CDC-enabled table:

Look at the error message that I have received:

Yeah, I am missing a unique index in my table, because I have removed my Clustered B-Tree Index previously.
Let’s add a Unique Nonclustered B-tree Index on the SalesKey column and try to execute the sys.sp_cdc_enable_table again:

This time I have received the following message:

This means that Clustered Columnstore Indexes are not supporting CDC and that’s perfectly fine from my personal point of view.
Notice that trying to enable the CDC even without net changes support, will result in the very same error, meaning that this scenario is not being supported in the moment.

What about the Nonclustered Columnstore Indexes ?
For the test, let’s use the different table dbo.FactSales – beginning with the creation of the updatable Nonclustered Columnstore Index:

Now, let’s enable the CDC scenario on it, using the same parameters as we have had try on the dbo.FactOnlineSales table:

This time it has worked without any error message!
To confirm that it was succesfull, let’s execute the sys.sp_cdc_help_change_data_capture system stored procedure, which will show all tables that are configured for the Change Data Capture:

sys.sp_cdc_help_change_data_capture
Magnificent, so that means that we have a Rowstore B-Tree table with an updatable Nonclustered Columnstore Index that supports CDC. This makes perfectly sense, because all we actually need to make it work with the traditional Rowstore structure, simply ignoring the secondary Columnstore Index.

But let’s try to insert, update & delete some data from our table, to make it sure, that everything works fine:

The deletion operation was a success, and we can confirm it with the execution of the following statement that will count the rows from the cdc.dbo_FactSales_CT table, the default system table that stores modifications from our original dbo.FactSales table:

In the result, I have received exactly what I have expected – 10 rows.
Now let’s update some data:

Let’s see if we get 50 rows from the cdc.dbo_FactSales_CT table meaning that we have received 40 modifications, because 20 modifications are stored as 20 deletes and 20 inserts:

Success! 🙂

Change Tracking

Change Tracking is a solution that provides opportunity with a lightweight tracking of changes made to a table.

To set it up for your database is very easy – its enough to execute the following command:

Notice that you can also set up all of these properties by using the properties dialog of the database that you are configuring.

Since Change Tracking requires a primary key on the tracked table, we shall need to set up our Clustered Columnstore Table with a Primary Key:

Now, let’s enable Change Tracking on our Clustered Columnstore table dbo.FactOnlineSales

Well, again for Clustered Columnstore Index we have received an error message:

This means that the improvements in tracking for Clustered Columnstore Indexes are not included in SQL Server 2016. As I have written above for the Change Data Capture functionality tests – it does not shock me at all, if Microsoft will never implement it.

Let’s get back to our dbo.FactSales, the one that has a Nonclustered Columnstore Index and check if there is a support for CT (Change Tracking):

As expected, everything functioned perfectly in this case – there were an “Command(s) completed successfully.” message.
To verify if our table is being set up for tracking successfully, let’s read data from the sys.change_tracking_tables DMV:

sys.change_tracking_tables
Looks great!

Let’s delete some data:

We can verify that the data is being tracked correctly by issuing the following command:

ChangeTable Columnstore
I am very glad to see both of the functionalities getting support for the Nonclustered Columnstore Indexes in SQL Server 2016, this makes Operational Analytics a kind of solution that many people will be able to use without a lot of additional effort.

Temporal

Temporal is the new tracking mechanism that is implemented in SQL Server 2016, and I am considering to be one of the finest addition to the engine of my favourite database. If you are interested in it – considering visiting amazing articles on Temporal written by Itzik Ben-Gan for the SQL Server PRO Magazine.

Let’s consider a new table dbo.FactOnlineSales and enable Temporal on it by issuing the following commands that will drop existing Primary Key, Create a new Clustered Columnstore Index, add the necessary 2 columns for the Temporal:

Now, we can activate the Temporal System Versioning on our table:

Well, I have received the following error message, saying that the primary key is missing in this table:

Let’s add a new primary key (nonclusrtered) to our FactInventory table and try again:

Perfect! This time it has worked like a charm! 🙂

Let’s take it for a small ride by deleting a couple of rows:

As before, to verify I will simply check the pre-configured table dbo.FactInventoryHistory for the 10 entries:

This has worked as expected! This looks great!

Let’s check if the Temporal supports Nonclustered Columnstore Indexes as well – we shall need to drop the Clustered Columnstore Index manually and create the Nonclustered Columnstore Index brand new since the conversion from one type to another does not funciton:

This script executed without any error, thus confirming that Nonclustered Columnstore Index is supporting Temporal as well.

Conclusion

Its amazing that we have finally received support for Nonclustered Columnstore Indexes in combinations with the Change Data Capture and Change Tracking. I think that given finite resources the possible decision of not implementing support for the Clustered Columnstore Indexes is very well thought.

The support of the Temporal feature for both Clustered as well as the Nonclustered Columnstore Indexes gives a great alternative, if for some specific reason one needs to use Clustered Columnstore Indexes coupled with Data Tracking features.

to be continued …

Leave a Reply

Your email address will not be published. Required fields are marked *