Continuation from the previous 15 parts, starting from https://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/
This post will be dedicated to the process of building Clustered Columnstore Indexes. I will be keep on playing with the Contoso BI database.
I have already mentioned in one of the first posts of this series, that the build process of the Clustered Columnstore Indexes includes a new improved mechanism which automatically determines the amount of memory available to the process, and thus effectively decides how many cores of the CPU will be used for the process – hence the quantity of memory.
We shall be working with FactOnlineSales table which has around 12 million rows, and our usual setup includes dropping all the related keys:
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [PK_FactOnlineSales_SalesKey] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimCurrency] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimCustomer] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimDate] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimProduct] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimPromotion] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimStore]
Before taking a detailed look into the Clustered Columnstore Index creation plan, let me explain the setup – in my case the virtual machine has 6 cores as well as the 8 GB of RAM, so I can effectively play. As you should already know, having a lot of fast CPU’s as well as big amount of memory is a must if you want to work effectively with Clustered Columnstore Indexes.
Lets create a Clustered Columnstore Index on our FactOnlineSales table:
Create Clustered Columnstore Index PK_FactOnlineSales on dbo.FactOnlineSales;
It takes around 21 seconds to execute this statement on my Virtual Machine, and here is the execution plan that I did receive:
The most interesting thing for us at the moment are the properties of the “insert” box, lets check them out:
You can notice, that there is a new property in the SQL Server 2014 CTP1 – Effective Degree of Parallelism, which is described as a “max degree of parallelism during columnstore index build“. In our case the value corresponds the expected one, which is a 6, since we have not changed our default settings.
Lets change the amount of memory available to the SQL Server by setting it to 4 GB by executing the following script:
sp_configure 'max server memory (MB)', '4096' Go Reconfigure
Now let’s re-create our Clustered Columnstore Index and see what is different:
Create Clustered Columnstore Index PK_FactOnlineSales on dbo.FactOnlineSales with (DROP_EXISTING = ON);
This time we have a process which took around 28 seconds on my Virtual Machine and here are the properties for the execution plan:
This time we have a 4 as a value for the Effective Degree of Parallelism property.
This is already different from the SQL Server 2012, in which for the Nonclustered Columnstore Indexes their creation or rebuild process would fail if there would not be enough memory. In SQL Serve 2014 we will be getting both Clustered as well as Nonclustered Columnstore Indexes using this mechanism and practically adopting memory usage to the currently available amounts. Should we lower the amounts of memory then SQL Server will progressively lower the MAXDOP until 1 effectively adopting the non-parallel process.
But what about the MAXDOP setting, can we actually set it to a lower number and see any effect? Lets try it out, with our current 4 GB of memory available, lets execute the following statement:
Create Clustered Columnstore Index PK_FactOnlineSales on dbo.FactOnlineSales with (DROP_EXISTING = ON, MAXDOP = 3);
Create Clustered Columnstore Index PK_FactOnlineSales on dbo.FactOnlineSales with (DROP_EXISTING = ON, MAXDOP = 5);
In this case we are still getting MAXDOP 4, as it was before, because the resources are not allowing us to get more MAXDOP.
One more interesting thing that I have noticed during my experiments with Clustered Columnstore Indexes is that the actual build process is kind of separated in 2 phases: on the first one we are just using 1 core of any available – and logically this is a phase where we create our dictionaries and then the 2nd phase when are using all our available/specified CPU cores to create the actual Clustered Columnstore Indexes.
Take a look this picture, you can clearly see both of the phases:
to be continued with Clustered Columnstore Indexes – part 17 (“Resources 2012 vs 2014”)