Clustered Columnstore Indexes – part 28 (“Update vs Delete + Insert”)

Continuation from the previous 27 parts, starting from http://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/

This blog post is all about the architectural specifics & performance optimizations for the Clustered Columnstore Indexes.
It is known that Inserts are mostly faster than pure Update operations, since we could get less page splits (when inserting ordered data), but what about the latest & the greatest Clustered Columnstore Indexes in SQL Server?
We know from the 1st blog in this Clustered Columnstore series, that an Update in Clustered Columnstore Indexes is being executed as a update of the Deleted Bitmap structure for the rows that we are modifying and the newest values are inserted into a Delta Store, so we can say that Update operation is actually being executed as an Delete + Insert.
I wanted to see how fast/slow the Update operation looks like when compared with a direct Delete & Insert operations.

So lets restore a fresh copy of the ContosoRetailDW database (download it for free).
Note: I maintain my Backup at C:\Install and my default test data location is c:\Data

Let us upgrade it a little bit for not battling the limitations while testing:

Next step would be to drop all the primary & foreign keys from our main 5 tables:

Now we need to create our Clustered Columnstore Index on the Test table – FactOnlineSales:

For the sake of the test, lets create a copy of our table:

Columnstore Update Tests

Now we are ready to run a couple of tests and to compare the performance and the impact of the Update Statement with the Direct Delete + Insert operations.
In order to minimize the impact I shall try to update only 1 Million rows out of our 12.5 Million rows table.

After running multiple times the setup and the actual update statement I came the following performance numbers on my virtual machine:
– CPU time = 23563 ms
– Elapsed time = 25053 ms
– Reads ~= 26500 Pages
The key performance indicator here is the actual Columnstore Index Update operator at the execution plan – it runs in the Row Mode.
Screen Shot 2014-03-17 at 00.05.37

Also, we can clearly see that this operation has modified 3 different Row Groups where updated rows were located, plus we have received a new Delta Store which is storing our 1 Million rows.

Here is a screenshot of the modified Row Groups:
Screen Shot 2014-03-16 at 23.21.35

Columnstore Delete + Insert Tests

Now lets get to the Delete + Insert operations.

Setup is very simple – truncate the actual test table, insert all the data from the table with a copy and rebuild it in order to get better Row Groups:

Let us delete 1 Million Rows:

My results were
– CPU time = 3656 ms
– Elapsed time = 4025 ms
which are quite nice in my opinion.

but we need to insert data to arrive to any conclusion, and for that lets create a copy of our 1 Million Rows into a separate table:

Now we are ready to execute our insert operation:

Here are the average results that I have got:
– CPU time = 5563 ms
– Elapsed time = 5716 ms
– Reads ~= 14700 Pages

Now to the execution plan – it was actually running in the Row Mode:
Screen Shot 2014-03-17 at 00.53.36

And if we check our Row Groups we can actually see that we have got a new compressed Segment – the best possible situation in this case:
Screen Shot 2014-03-17 at 00.53.09

Final Thoughts

We can actually speed the whole insertion process up by using Bulk Load API or even running our loads in Batch Mode, so it is a very clear indicator of a better way to work with new data when updating a table.

Not only it looks over 2.5 times faster to delete & insert data instead of executing a direct update, but such aspects as locking & blocking can become quite an issue as it was shown in the 8th part of the series Clustered Columnstore (Locking).
Screen Shot 2014-03-17 at 01.01.58

Whenever you start working with SQL Server 2014, think twice if you really want to update your data or you just want to delete it and insert/load new information directly into the table.

to be continued with Clustered Columnstore Indexes – part 29 (“Data Loading for Better Segment Elimination”)

2 thoughts on “Clustered Columnstore Indexes – part 28 (“Update vs Delete + Insert”)

Leave a Reply

Your email address will not be published. Required fields are marked *