Columnstore Indexes – part 65 (“Clustered Columnstore Improvements in SQL Server 2016”)

Continuation from the previous 64 parts, the whole series can be found at https://www.nikoport.com/columnstore/.

This article is dedicated to the improvements to Clustered Columnstore Index in SQL Server 2016. The Clustered Columnstore Index is designed for DataWarehouse & Decision Support systems, like in previous versions of SQL Server, but now, in SQL Server 2016 we have more diversified options (Updatable Nonclustered Columnstore Index on Rowstore tables & Updatable Nonclustered Columnstore Index on InMemory OLTP), that are targeting different scenarios – mainly Operational Stores with Reporting requirements.

In the first 2 versions with Columnstore Index, there were a number of limitations on the different types of data types as well as some key features of the SQL Server, that were forbidden to use in combination with Columnstore Index.
For Clustered Columnstore Index in SQL Server 2014, some of those incompatible key features were Primary & Foreign Keys, Unique Constraints & Secondary Indexes. If you are looking for the complete list of the features that included Triggers, CDC, Change Data Capture, LOB Columns, Computed Columns, etc – go over to my CISL library and check out “suggested_tables.sql” script, that will list them all.

In SQL Server 2016, for Clustered Columnstore Index, we have received a very significant improvement, with possibility to create and use secondary Nonclustered Indexes, as well as the Unique Constraints and Primary & Foreign Keys, between others.

The importance of those 4 features is very high, even though from the first sight or from the top of the mountain of a happy SQL Server 2014 Clustered Columnstore user those features might not be the best addition to the Columnstore family, for the following reasons:
– ETL process should guarantee the quality of the data and so maintenance of the Uniqueness (Unique Constraint, Primary Key) and the Relationships between the tables (Foreign Keys) and
– Creating smaller Row Groups (typically through applying artificial memory pressure with the help of the Resource Governor), will allow the Lookups & Range Lookups to have smaller impact on the system, and thus the necessity of the secondary indexes is limited.

Secondary Indexes can’t be substituted for 100% by a technique that implies slower scans (smaller row groups means that the Read-Ahead can’t use its potential to the maximum), plus scanning around 100.000 rows versus descending a couple of b-tree pages is not really a competition, unless you enjoy watching a Tractor competing with a racing car (Lamborgini produced both, so maybe this competition can be arranged overall) :)

This blog post will be diving into secondary Nonclustered B-tree Indexes over Clustered Columnstore Index. I will cover the Primary Keys, Foreign Keys & Unique Constraints in the next blog post.

Point Lookups & Range Lookups

In a DataWarehouse setup, most of the time, analytical queries are scanning the large part of the table, if not completely. Whenever the data is aggregated and presented to the user, the next step is all about drilling done, to investigate something that produced a result that might provoke some extra interest or mistrust from the Analyst side. Drilling down the report means one thing – reading and processing smaller amounts of information, sometimes going deep down to a single line, like for example when investigating a suspect transaction in a large bank.
The operations that invoke a small scan of ordered information is often called Ranged Lookup, while reading a couple of lines of information is described as a Point Lookup.

No matter what kind of analytical system you are using, it will almost always will come down to Range & Point Lookups, and this is where you will definitely need a couple of Nonclustered Indexes.
Before launching SQL Server 2014, Microsoft had published some information about Bookmark Lookups that they intended to introduce for Clustered Columnstore Indexes, and I have blogged about it in Columnstore Indexes – part 3 (“More Internals”), but the real implementation took more time then everyone has hoped, and that might be very good news – better tested software is what we all want & need.

What do the secondary nonclustered b-tree indexes represent SQL Server ? They represent a subset of the table information, with main purpose of speeding up lookup operations. The nonclustered indexes in current versions of SQL Server can go up in the number until 999, which is definitely a number you will want to avoid at all costs :)
The nonclustered b-tree indexes always need a connection to the clustered index and traditionally for Rowstore it is implemented through the insertion of the clustered index keys into the nonclustered index. Note that if our clustered index is not unique, then a uniquifier column is added internally for identifying the indexes.
This is done for establishing a connection between primary and secondary index structures, for when we are scanning a nonclustered index and we need some of the columns that are not included in it’s structure, we need to fall back on our clustered index and the best as well as the fastest way to do is through this connection.
In the case we have a HEAP table (no clustered index), then this connection is established through generated Row Bookmark (which itself represents a joining of file id, data page number and slot number within the data page).

Secondary Nonclustered Rowstore Indexes

A similar implementation (with a reference system) is implemented for supporting secondary Nonclustered Rowstore Indexes for Clustered Columnstore Index.

This reference system is implemented with the help of 2 parts:
– Row Locator
– Mapping Index

The Row Locator within B-tree indexes has a simple reference to the id of the Row Group (Delta-Store, Compressed Row Group or Tombstone) plus a positional reference (pointing to the position of the row within the respected Row Group).

The Mapping Index is a support structure that manages the connections between Clustered Columnstore Index and Nonclustered B-tree Indexes. Its presence is essential for allowing to support the dynamic movement of rows, which is implemented through Rebuild & Merge processes. Any rows location in Columnstore Index is not static, since a Rebuild, a Reorganize or a Merge process may force row to change the Row Group and updating all involved rows in all indexes might be a total nightmare.
We shall dive into the details a little bit later in this blog post.

Let’s simply experiment with creation of the table with a Clustered Columnstore Index that can contain secondary b-tree nonclustered indexes. Before SQL Server 2016, this was absolutely impossible scenario:

create table dbo.CCI_Indexes_2016(
	c1 bigint,
	c2 numeric (36,3),
	c3 bit,
	c4 smallint,
	Index PK_CCI_Indexes_2016 Clustered Columnstore ); 

Ok, the table is ready, let’s add 2 nonclustered indexes, that will have 2 different leading columns:

create nonclustered index IX_CCI_Indexes_2016_1
	on dbo.CCI_Indexes_2016 (c1,c2);

create nonclustered index IX_CCI_Indexes_2016_2
	on dbo.CCI_Indexes_2016 (c2,c4) include(c3);

Fantastic! This has worked like Microsoft said it would!
Let’s try to insert some data into this table:

insert into dbo.CCI_Indexes_2016
	(c1,c2,c3,c4)
	values
	(1,16.,1,19),
	(2,7.21,1,256),
	(3,22.55,0,128),
	(4,3.94,1,31);

This has worked without any problems as expected, but what kind of structures do we have at the moment, and for this investigation I will query one of the newest DMVs, that Microsoft has added to SQL Server 2016 – sys.internal_partitions (its description can be found in Columnstore Indexes – part 56 (“New DMV’s in SQL Server 2016”)):

select object_name(part.object_id) as TableName, 
	part.object_id, part.partition_id,
	ind.name as IndexName, part.index_id, 
	part.hobt_id,
	part.internal_object_type, part.internal_object_type_desc,
	part.row_group_id, part.rows, part.data_compression, part.data_compression_desc
	from sys.internal_partitions part
		left outer join sys.indexes ind
			on part.object_id = ind.object_id and part.index_id = ind.index_id
	where part.object_id = object_id('dbo.CCI_Indexes_2016');	

Sys.Internal_Partitions for NC Indexes over CCI, part 1You can clearly see on the image above that we have 2 structures – a Delta Store where we have stored our data and a Deleted Bitmap which contains information about deleted rows from the compressed Row Groups. This is all expected and if this DMV would exist in SQL Server 2014, this is what I would expect to see there.
Notice that we have a little delighter in this DMV – the information about the type of compression that is applied to each of the Columnstore internal structures. :)
Let’s compress the Clustered Columnstore Index and then delete a row from it:

alter index PK_CCI_Indexes_2016
	on dbo.CCI_Indexes_2016
	Rebuild;

Running our query against the sys.internal_partitions will bring us the following result:
Sys.Internal_Partitions for NC Indexes over CCI, part 2
This time we have no Delta-Stores and this view shows us only the internal partitions different to compressed segments and dictionaries, and so this result is very expected.
I see that there is no indication of Mapping Index even though we have created 2 nonclustered b-tree indexes on our table.
This happens because Mapping Index is simply not created yet, because there are no different Row Groups to manage – we have just 1 Row Group, and I assume that there is a optimisation for this type of scenario.

An important scenario for Clustered Columnstore Indexes + Nonclustered Rowstore Indexes combination is the removal of the data. To test it I will delete 1 row from our compressed Row Group, maybe it will force out Mapping Index to appear.

delete from dbo.CCI_Indexes_2016
	where c1 = 2;

After querying sys.internal_partitions again, I see no changes in the list of the internal structures. Mapping Index is still not visible, and I guess for the good. Unless we have moved data from 1 Row Group to another, there is no need for Mapping Index and its maintenance will only slow down our queries (reading and writing) so its creation should be delayed until the moment when it is really necessary.

Let’s add another full compressed Segment of data and see if we can observe any difference with 2 compressed Row Groups, notice that I am adding 1 row over the maximum compressed segment size, I do that in order to force the loaded Row Group closure:

set nocount on

declare @i as int;
declare @max as int;
select @max = isnull(max(C1),0) from dbo.CCI_Indexes_2016;
set @i = 1;

begin tran
while @i <= 1048577
begin
	insert into dbo.CCI_Indexes_2016
		default values

	set @i = @i + 1;
end;
commit;

checkpoint;

-- Rebuild our table
alter index PK_CCI_Indexes_2016
   on dbo.CCI_Indexes_2016
   reorganize;

Let's check on the internal partitions of our Columnstore Table if we can see anything different:

select object_name(part.object_id) as TableName, 
	part.object_id, part.partition_id,
	ind.name as IndexName, part.index_id, 
	part.hobt_id,
	part.internal_object_type, part.internal_object_type_desc,
	part.row_group_id, part.rows, part.data_compression, part.data_compression_desc
	from sys.internal_partitions part
		left outer join sys.indexes ind
			on part.object_id = ind.object_id and part.index_id = ind.index_id
	where part.object_id = object_id('dbo.CCI_Indexes_2016');	

Sys.Internal_Partitions for NC Indexes over CCI, part 3 - Mapping IndexWonderfull - we are finally observing the Mapping Index! Its here in the Columnstore internal partitions, supporting the existence and potential moves of our data between Row Groups, connecting with Nonclustered B-Tree Indexes.

Now let's go back and discover more internals from the Row Locators & Mapping Index:

Row Locator

As I have written above, Row Locators inside B-tree Nonclustered Indexes represent a connection to the Clustered Columnstore Index. They contain a reference to the Id of the Row Group, where the row is located, plus the positional reference for exact position inside the Row Group.

For example, for the row with c2 = 2, that value I have inserted in the very first insert statement before forcing a rebuild (2,7.21,1,256) - the Row Locator should look like this:

---------------------------
| Row Group Id | Position |
---------------------------
|      0       |     2    |
---------------------------

The Row Group Id of the very first compressed is equals to 0, and let's imagine that the compression did not changed the order of the rows, an so the position is kept at the number 2, like we have inserted data into the Delta-Store.
To verify Row Group Id, I have used the following query that I have issued agains our good old friend sys.column_store_row_groups:

select *
	from sys.column_store_row_groups rg
	where rg.object_id = object_id('dbo.CCI_Indexes_2016');

Identifying Row Group IdYou can clearly notice the ID of the Row Group that is compressed and has just 4 rows (where 1 of those rows is actually deleted).

The Row Locator is there serving more then one purpose at the same time: to go from the Nonclustered B-tree Index to fetch the missing columns (NCI->CCI direction), and to delete the B-Tree Index value, should we issue a Delete Statement against Columnstore Index (CCI->NCI direction).

One would wonder if we could possibly need any other structures, because the connection is quite clear and direct. The other structure, Mapping Index exists because of the performance reasons, and serves as a great optimiser.
Imagine we are issuing an ALTER INDEX ... REBUILD statement and all of our Row Group's rows are changing the Row Group. This would mean an absolutely massive update process over all involved Nonclustered B-tree indexes and it could take some epic minutes or even hours. The solution for this problem is an intermediate structure, which is storing a fixed connection to the B-tree Nonclustered Indexes and only update the locator for the Columnstore Indexes. The name of this structure is Mapping Index.

Mapping Index

Mapping Index is a B-tree mapping structure, that connects Nonclustered B-tree Indexes with the Clustered Columnstore Index. Its purpose is on tracking the movement of the rows inside Clustered Columnstore Index. It contains a key of the original locator (where it was originally inserted) and the current location (Row Group Id & Position) of the row.

The reason for storing the Original Row Locator is that this is the way of how the row is being uniquely identified and tracked by Nonclustered B-tree Indexes, so this is actually the key that is being referenced everywhere.
The Original Row Locator is also automatically added to the Columnstore Index structure once we create a Nonclustered B-Tree Index, in order to serve as a key for the lookups (a more concrete example is the Delete operation) between at Mapping Index, connecting Columnstore & Nonclustered B-tree indexes. When initialised, this column with the Original Row Locator is set to NULL, until the Row has moved. Once it has moved, the original location is updated with the value of the original Row Groups and the row's position in it.

When we insert rows into Columnstore Index, there are couple of different possible landing locations:
- Delta-Store (in this case original locator is mapped to the ID of the Delta-Store plus a unique number identifying the row within)
- Compressed Row Group (in this case we put Row Group Id plus a position of the row inside the compressed Row Group)

When a row is moved inside the Clustered Columnstore Index to a different Row Group, no B-tree Indexes are updated - because we simply update our Mapping Index structure that each of our Nonclustered B-tree Indexes are referencing, thus avoiding great confusion and possible performance problems, after all we just scan a single rather small B-tree structure.

The key thing about the Mapping Index is that it only tracks rows that have moved, which is implemented for the performance reasons. Another important issue of moving rows between different Row Groups is that they are typically moved in groups, sometimes including all active rows in a Row Group, if we have a lot of deleted rows marked or if our Row Group is far from being full to it's maximum size.

Moving groups of rows allows an optimisation that a Row Locators are stored for a range of rows. This should not happen during a Rebuild process, since most rows are moved to different Row Groups without any possibility of control (this is a Vertipaq compression prerogative), but for Merging processes or for the works of the Tuple Mover this is quite an easy item to implement.

Of course the introduction of the Mapping Index makes the performance of the Range Lookups and Point Lookups a little bit slower, since we need to scan and collate and additional structure, but our reading queries against the Columnstore Index do not get affected in any way - we are still processing compressed Segments, Delta-Stores and Deleted Bitmaps, there is no change in this process.

Examples of Range & Point Lookups

To test the improvements for the Range & Point Lookups in SQL Server 2016, I will use (surprise, surprise!) the free ContosoReatilDW database, by restoring a fresh copy of it, dropping all primary & foreign keys (the blogpost on their support in SQL Server 2016 is coming soon) and creating a new Clustered Columnstore Index on it:

USE [master]
 
alter database ContosoRetailDW
    set SINGLE_USER WITH ROLLBACK IMMEDIATE;
 
RESTORE DATABASE [ContosoRetailDW] 
    FROM  DISK = N'C:\Install\ContosoRetailDW.bak' WITH  FILE = 1,  
        MOVE N'ContosoRetailDW2.0' TO N'C:\Data\ContosoRetailDW.mdf', 
        MOVE N'ContosoRetailDW2.0_log' TO N'C:\Data\ContosoRetailDW.ldf',  
        NOUNLOAD,  STATS = 5;
 
alter database ContosoRetailDW
    set MULTI_USER;
GO

GO
ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 130
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2000000KB , FILEGROWTH = 128000KB )
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 400000KB , FILEGROWTH = 256000KB )
GO

use ContosoRetailDW;

ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimCurrency];
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimCustomer];
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimDate];
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimProduct];
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimPromotion];
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimStore];

ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [PK_FactOnlineSales_SalesKey];

create clustered columnstore Index PK_FactOnlineSales
	on dbo.FactOnlineSales;

We have done good so far, let's run those 2 queries to test if the improvement is something that is truly measurable. The first query is a Point Lookup since it is reading a single row (imagine someone is verifying the data from a single suspected bank transaction record, while the second one is getting a range of rows for analysing top 50 sales done on some particular moment - this is a Range Lookup:

-- Point Lookup
select TotalCost, DiscountAmount
	from dbo.FactOnlineSales
	where SalesOrderNumber = N'20070405725440';

-- Range Lookup
select top 50 SalesAmount
	from dbo.FactOnlineSales
	where DateKey = '2007-04-05 00:00:00.000'
	order by SalesAmount desc;

Here are the execution plans that I have received:
Columnstore Point LookupThe Point Lookup is quite a tough cookie for the Columnstore Index, because strings are difficult, but in SQL Server Microsoft has already made some very significant improvements, especially in the terms of CPU (to find more, follow Columnstore Indexes – part 58 ("String Predicate Pushdown")). This query takes some significant time to run and does a lot of reads - 36683 logical reads to be precise:

Table 'FactOnlineSales'. Scan count 1, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 36683, lob physical reads 0, lob read-ahead reads 0.
Table 'FactOnlineSales'. Segment reads 13, segment skipped 0.
 SQL Server Execution Times:
   CPU time = 718 ms,  elapsed time = 722 ms.

Columnstore Range Lookup
The Range Lookup is more nice in the terms of this sample queries, it takes almost 20 times less to execute, even though it gets back 50 records, instead of just 1 as in the case of the Point Lookup.
The number of logical reads goes down to 1570 and there is almost no work to be done by CPU:

Table 'FactOnlineSales'. Scan count 1, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 1570, lob physical reads 0, lob read-ahead reads 0.
Table 'FactOnlineSales'. Segment reads 6, segment skipped 7.
   CPU time = 0 ms,  elapsed time = 44 ms.

In order to compare with a true #SQLServer 2016 I will create 2 Nonclustered Indexes (I won't even try to optimise them with making them filtered, I will create a default ones):

create nonclustered index IX_FactOnlineSales_SalesOrderNumber
	on dbo.FactOnlineSales (SalesOrderNumber) 
		include (TotalCost, DiscountAmount);

create nonclustered index IX_FactOnlineSales_DateKey
	on dbo.FactOnlineSales (DateKey) 
		include (SalesAmount);

And now let's re-run our test queries:

-- Point Lookup
select TotalCost, DiscountAmount
	from dbo.FactOnlineSales
	where SalesOrderNumber = N'20070405725440';

-- Range Lookup
select top 50 SalesAmount
	from dbo.FactOnlineSales
	where DateKey = '2007-04-05 00:00:00.000'
	order by SalesAmount desc;

Let's take a look at the execution plans:
B-Tree Point Lookup
I guess that anyone loves such execution plans whenever getting just 1 row back - a single B-tree Index Seek. I guess there is no doubt that it will be significantly faster than the scan of 6 Columnstore Segments :)

Table 'FactOnlineSales'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 3 ms.

I think it would be unfair to put any comment here - 3ms vs 722ms, 4 logical reads vs 36683 logical reads. :)

B-Tree Range Lookup
In the case of a Range Lookup the situation is not that positive in the terms of the spent resources since we are getting all possible information from the Index, in order to return a small part of it, that is sorted correctly.

Table 'FactOnlineSales'. Scan count 1, logical reads 57, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 69 ms.

The amount of time spent on the query is where Columnstore Indexes are actually delivering a significant better performance: 69 ms vs 44 ms (Columnstore), but the amount of data to be read even from the memory is something that can't be easily improved: 57 logical reads vs 1570 logical reads. The 30 times bigger number of reads is something that will do a very serious impact on the production system and so can't be underestimated.

I know that my examples are not covering the vast area of all possible cases to be found in usage of Columnstore Indexes with Nonclustered B-tree Indexes, but they might show you the general idea of the direction.

Data Loading

One area where you should definitely expect an impact is data loading, but hey we all know that people load their data before creating dozens of B-Tree Indexes, right ? :)
The data loading process has been very significantly improved, as I have already written in Columnstore Indexes – part 63 ("Parallel Data Insertion"), and do not forget all those wonderful SIMD optimisations that are making their way into the Columnstore Batch mode with the upcoming SQL Server 2016 version.
Will it be slower to load millions of row into a structure with Clustered Columnstore & Nonclustered B-Tree Indexes ? No matter how you will put it in (CCI or Heap and then add CCI B-tree Indexes, or build meta-structure first), in the most cases I believe that it will be slower than in a pure CCI table in SQL Server 2014.
But once you have loaded data into Clustered Columnstore Index, you can already start querying it, while building all those Nonclustered B-tree Indexes online, and so yes, in some cases it might be slower overall, but the improved performance should be worth it.

I suggest that we monitor our Mapping Index with sys.internal_partitions DMV, but for that we shall need some kind of its size exposure mechanism ... and maybe a bit more of the it's internal structure ... :)

to be continued with Columnstore Indexes – part 66 ("More Clustered Columnstore Improvements in SQL Server 2016")

One thought on “Columnstore Indexes – part 65 (“Clustered Columnstore Improvements in SQL Server 2016”)

  1. Mate Farkas

    Dear Niko,
    Have you already studied the leaf level of nonclustered rowstore index over clustered columnstore index? I try to find something similar to the Clustered Key/RID in case of Clustered Rowstore Index/Heap but I haven’t found anything so far.
    I expected to find the Row Locator on the leaf pages of the Nonclustered Rowstore Index but there is nothing, only a UNIQUIFIER. This would allow the “Key Lookup” from Nonclustered index to Clustered Columnstore Index what you mentioned in this article but why don’t we find it?

    BTW your article and the whole blog about CCI is amazing and I try to follow all of your work here.
    Thanks.

Leave a Reply

Your email address will not be published. Required fields are marked *