Columnstore Indexes – part 102 (“CCI with Secondary Rowstore Indexes on SQL 2014”)

Continuation from the previous 101 parts, the whole series can be found at http://www.nikoport.com/columnstore/.

In SQL Server 2014 we have received a possibility to update the Columnstore Index directly with an addition of the Clustered Columnstore Index. This addition had some important limitations, between which were the impossibility of adding secondary Rowstore Indexes (making the Clustered Columnstore Index the only index on the table) and the lack of the unique constraints (

Meet the good old and absolutely underused friend – the Indexed Views, which in fact (and for some cost) will provide you with the above mentioned features.
If you absolutely need those features, follow this article to find this absolutely trivial way.

Using the good old free database ContosoRetilDW, download the backup and restore it from the C:\Install\:

Let’s convert the FactOnlineSales table to the Clustered Columnstore one, by dropping the primary key and all the foreign keys:

Here is query that I would like to optimise today, which is a rather basic SalesAmount aggregation on the daily basis for some certain orders, where we try to sort the data based on the Total Sales Amount per Day:

This query takes almost 330 ms on the average to run on my virtual machine, at the same burning 1281 ms of the CPU time, doing over incredible 36.000 LOB logical reads!

Here is the execution plan for this query:

If you run the query and look at the actual execution plan, you will think about adding a nonclustered index:

this operation will naturally fail, as expected:

We can solve and work around this problem if we build an indexed view, and we shall use it: 🙂

We shall need a unique clustered index to turn this view an indexed one:

And let us add an additional nonclustered rowstore index, to make this view performing great:

Let us re-execute the query, this time against our indexed view to see if the query is performing well or not:

This time the results are returned with a different speed – it took just 23ms of the total execution time, while spending just 31ms of the CPUT time! From the read accesses we are talking here about less than 170 logical reads. Compare these results with the original ones with 330 ms of the total execution (10 times improvement), and over 36000 LOB logical reads (over 200 times improvement).

Here is the actual execution plan that was produced for this query:

Here we have a beautiful and a simple execution plan, which delivers what we need – great performance with a relative low overall cost. And in this way we enjoy the possibility to get the best out of the 2 worlds – Columnstore & Rowstore.
Should we need to run a similar query but agains the whole dataset, it will be redirected to our Columnstore Index which will deliver great performance:

Uniqueness

Another frequent requirements is the uniqueness for the Clustered Columnstore Indexes on SQL Server 2014, and this one can be achieved by simply adding unique indexes to our indexed view (and notice that there could be more than 1 default required unique index on it).
The operation that we have already executed to make an indexed view out of the simple view is the one which shall guarantee the uniqueness for the column OnlineSalesKey:

Should we try to insert a duplicate into our table FactOnlineSales, here by copying one of the rows into our test table FactOnlineSales:

The error message is very clear that it the uniqueness of the data is enforced and will be preserved by the unique index in our view:

Now, if only we could build primary keys/foreign keys to the views … Ok, Ok – I know, this is a different topic …

Final Thoughts

The only thing to add here is that one should evaluate costs very carefully, but if you are dealing with reasonably (not huge) tables in the low million rows, this might be a great solution for your unique constraints and secondary indexes.

to be continued with Columnstore Indexes – part 103 (“Partitioning 2016 vs Partitioning 2014”)

5 thoughts on “Columnstore Indexes – part 102 (“CCI with Secondary Rowstore Indexes on SQL 2014”)

  1. Avinash Jaiswal

    Hi Niko,

    How does the secondary rowstore index work physically? Is the data completely duplicated (at leaf of B-Tree) once we create any B-Tree index on a table with CCI or does it have references within the CCI somehow?
    In case the data is duplicated completely then shouldn’t we prefer Non Clustered Columnstore Index if we intend to have more than one B-Tree non clustered index since the data will be duplicated for each NCI.

    Thanks.

    1. Niko Neugebauer Post author

      Hi Avinash,

      The secondary rowstore index duplicates the data of the clustered index and it does not matter which types are of indexes are involved.
      Secondary Rowstore Indexes over Clustered Columnstore Index serve to speed up and optimise resource spending for the Short-Range (couple of thousands ordered) & Point Lookup (couple of rows) queries.
      Data Duplication on the Index-Level is ok, if you are spending less system resources overall in the end.

      Best regards,
      Niko

      1. Avinash

        I agree about the performance gains. I use that in my production too.
        I am more curious to know your opinion in the case when we need “more than one” NCIs along with the CCI. Will it be better to then have a traditional clustered index (say on the primary key) and have a non clustered column store index(for performance) and make other indexes traditional NCIs? This strategy can save us storage space because the NCIs don’t need to duplicate the data. Of course the column store will duplicate the data but in case of CCI all NCIs would have their own copies of the data.

        1. Niko Neugebauer Post author

          Hi Avinash,

          so if I am understanding correctly, you are wondering wether
          CCI + Rowstore NCCI
          vs
          Rowstore CCI + NCCI + Rowstore NCCI
          are good fit for your scenario.
          I suggest you take a look the following blog post: http://www.nikoport.com/2017/12/29/columnstore-indexes-part-117-clustered-vs-nonclustered/
          it might have some answers for you.
          As for NCCI, it will not solve your Short-Range Scans or Point Lookups and will introduce a greater resource spending.

          Best regards,
          Niko

Leave a Reply

Your email address will not be published. Required fields are marked *