Continuation from the previous 19 parts, starting from http://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/
This post is dedicated to the situations when the memory available to the process (execution plan) is not enough to satisfy the execution context, and one or more operators have to ‘spill’ data to disk. I am trying to investigate whether there are any serious improvements in SQL Server 2014 CTP1.
As in a lot of previous posts, I will keep on working with the Contoso BI database, having created a Clustered Columnstore Index on the FactOnlineSales table on SQL Server 2014 CTP1 and having a traditional clustered index + Nonclustered Columnstore Index on SQL Server 2012 SP1.
This type of problem is fairly common in SQL Server, and it happens for a good number of reasons. The most common one is a wrong statistics estimate, so the memory grant request simply asks for the wrong amount. In this situation, in order for the query to keep executing, the data is transferred (spilled) to disk. The current data is moved to TempDB to make space for the new data, and whenever the spilled data is needed again it is read back from TempDB. This whole process is extremely slow because we are no longer working in memory, and this is the reason why some queries execute extremely slowly: they write to and read from disk without any clear indication that this is happening.
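To get a feeling for how much memory a query asked for and how much it actually received, one option is the sys.dm_exec_query_memory_grants DMV. The following is a minimal sketch rather than part of the tests in this post: the choice of columns is only a suggestion, and it assumes you run it from a second session while the query under investigation is still executing.

```sql
-- Run from a second session while the suspect query is still executing.
-- requested/granted/ideal are all derived from the optimizer's estimates,
-- so with wrong statistics even a fully granted request can be too small.
SELECT  session_id,
        dop,
        requested_memory_kb,   -- what the query asked for
        granted_memory_kb,     -- what it actually received
        ideal_memory_kb,       -- what it would need to keep everything in memory, per the estimates
        used_memory_kb,        -- memory in use right now
        max_used_memory_kb     -- peak usage so far
FROM    sys.dm_exec_query_memory_grants
WHERE   session_id <> @@SPID;  -- skip the session running this check
```

If max_used_memory_kb climbs up to granted_memory_kb while the query keeps running, that is a hint (not a proof) that the operators have consumed their grant and may be spilling to TempDB.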
Before SQL Server 2012 the query execution plans had no information about this process at all, and in order to catch those situations one had to use Profiler or Extended Events (2008+). In SQL Server 2012, the query plans received the warnings needed to alert query developers about the spill process.
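For those who prefer to catch spills outside of the execution plan, here is a hedged sketch of an Extended Events session. It assumes SQL Server 2012 or later, where (to my knowledge) the hash_warning and sort_warning events are exposed in the sqlserver package; the session name TempDBSpills is purely hypothetical. In Profiler, the corresponding event classes are Hash Warning and Sort Warnings.

```sql
-- Hypothetical session name; assumes the hash_warning and sort_warning
-- events are available (SQL Server 2012 or later).
CREATE EVENT SESSION [TempDBSpills] ON SERVER
ADD EVENT sqlserver.hash_warning
    (ACTION (sqlserver.session_id, sqlserver.sql_text)),
ADD EVENT sqlserver.sort_warning
    (ACTION (sqlserver.session_id, sqlserver.sql_text))
ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE = OFF);
GO

ALTER EVENT SESSION [TempDBSpills] ON SERVER STATE = START;
GO

-- After running the test query, read whatever the ring buffer captured.
SELECT  CAST(t.target_data AS xml) AS captured_events
FROM    sys.dm_xe_sessions AS s
        INNER JOIN sys.dm_xe_session_targets AS t
            ON t.event_session_address = s.address
WHERE   s.name = N'TempDBSpills'
        AND t.target_name = N'ring_buffer';
```

The ring buffer keeps the example short; for anything longer than a quick test a file target would probably be a better choice.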
I have heard from Igor Stanko (Program Manager for Parallel Data Warehouse at Microsoft) at TechEd 2013 Europe that the new SQL Server 2014 will have some significant improvements in that area, and so I decided to advance with a series of tests that I would like to present in this blog post.
One of the common situations in Data Warehousing is joining 2 very big tables and then waiting for our Hash Match operator to complete the join, so I have created the following query, which joins the 12.6 million row table FactOnlineSales with itself while using a Columnstore Index (Nonclustered for SQL Server 2012 and Clustered for SQL Server 2014 CTP1). Since I have given 4 GB of memory to both of the Virtual Machines, there should be some interesting results to investigate.
Note: I have made my VMs suffer with the objective of seeing how different SQL Server versions handle different situations. The code itself does not make much sense, and please don’t run any such experiments on your production environments.
Note 2: I had to force the usage of the Columnstore Index on SQL Server 2012 SP1, since the traditional clustered index was being used for that query without additional hints.
set rowcount 100

-- SQL Server 2012 version
select sales.OnlineSalesKey, COUNT_BIG(*)
	from dbo.FactOnlineSales sales with (index(NC_PK_FactOnlineSales))
		inner join dbo.FactOnlineSales sales2 with (index(NC_PK_FactOnlineSales))
			on sales.OnlineSalesKey = sales2.OnlineSalesKey
	group by sales.OnlineSalesKey
	order by sales.OnlineSalesKey

-- SQL Server 2014 CTP 1 version
select sales.OnlineSalesKey, COUNT_BIG(*)
	from dbo.FactOnlineSales sales
		inner join dbo.FactOnlineSales sales2
			on sales.OnlineSalesKey = sales2.OnlineSalesKey
	group by sales.OnlineSalesKey
	order by sales.OnlineSalesKey
Here comes the execution plan for the SQL Server 2012:
If you look closely at the plan, you will notice 2 yellow triangles at the bottom of the Hash Match and Sort operators. This means that we are facing 2 problems in 1 execution plan: we have a TempDB spill on both of those operations.
This query takes 34 seconds and 21124 reads to execute on my SQL Server 2012 SP1 Virtual Machine, while on SQL Server 2014 CTP1 it takes around 5 seconds with 139154 reads. Getting a result almost 7 times faster than before is one major improvement.
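The post does not say how the timings and reads were collected, so the following is only one common way to obtain comparable numbers, not necessarily the one used here: wrap the test query in SET STATISTICS IO and SET STATISTICS TIME and read the Messages tab. The query shown is the SQL Server 2014 CTP1 version from above.

```sql
-- One common way to collect elapsed time and reads (an assumption,
-- not necessarily how the numbers in this post were measured).
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SET ROWCOUNT 100;

-- SQL Server 2014 CTP1 version of the test query
SELECT  sales.OnlineSalesKey, COUNT_BIG(*)
FROM    dbo.FactOnlineSales sales
        INNER JOIN dbo.FactOnlineSales sales2
            ON sales.OnlineSalesKey = sales2.OnlineSalesKey
GROUP BY sales.OnlineSalesKey
ORDER BY sales.OnlineSalesKey;

-- Restore the session defaults afterwards.
SET ROWCOUNT 0;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```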
Both plans look quite similar, with 1 extra Stream Aggregate for the SQL Server 2014 CTP1 version, plus the typical differences between SQL Server 2012 and SQL Server 2014 relating to the batch mode. But wait, let us examine the exact difference in the execution plan.
Here are the details for both operators on SQL Server 2012:
One of the first things to notice is that both operations are running in Row Mode, but remember that we are actually reading data only from the Columnstore Index, which means that theoretically both of those operations could run in Batch Mode. Life is not a fairy tale, and for the first version of the Columnstore Indexes no reasonable person should have expected everything to function right, so let’s check whether there are any improvements in SQL Server 2014 CTP1:
Now there are some quite visible changes with the Hash Match operator: it functions completely in Batch Mode even though we are doing a TempDB spill, and according to Microsoft the spill itself functions completely in Batch Mode, which is a very nice improvement if you ask me.
This is really fine, but what about the Sort operation? It works completely in Row Mode, and especially since we have a Parallelism operator in our plan, I don’t even have to check its mode.
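Instead of opening the properties of each operator in Management Studio, the execution mode can also be read from the plan XML. The sketch below is an illustration under a few assumptions: the plan is still in the plan cache, the RelOp elements carry the EstimatedExecutionMode attribute (which, as far as I know, is the case from SQL Server 2012 onwards), and the text filters are only a convenient way to find the test query. Note that cached plans show the estimated mode only; the actual mode is visible in the actual execution plan.

```sql
-- Lists the estimated execution mode (Row or Batch) of the Hash Match
-- and Sort operators for cached plans that touch FactOnlineSales.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT  st.text                                              AS query_text,
        op.value('@PhysicalOp', 'nvarchar(60)')              AS physical_operator,
        op.value('@EstimatedExecutionMode', 'nvarchar(10)')  AS estimated_mode
FROM    sys.dm_exec_cached_plans AS cp
        CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
        CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
        CROSS APPLY qp.query_plan.nodes('//RelOp')         AS ops(op)
WHERE   st.text LIKE N'%FactOnlineSales%'
        AND st.text NOT LIKE N'%dm_exec_cached_plans%'      -- skip this diagnostic query itself
        AND op.value('@PhysicalOp', 'nvarchar(60)') IN (N'Hash Match', N'Sort');
```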
I am extremely glad to have Batch Mode implemented for the Hash Match operator, but I am definitely looking forward to seeing the Sort operator functioning in Batch Mode. There is no doubt in my mind that this can be done, especially in the situations when we are dealing only with data from the Columnstore Indexes. I mean that ideally we would do almost everything in Batch Mode (besides some small operations when we have just a few rows to work with), but I see the progress and I really expect a couple more operators to have Batch Mode implemented in the final release of SQL Server 2014.
I really applaud Microsoft for advancing further with Columnstore Indexes and especially with the Batch Mode, but I am pretty hungry for more.
to be continued with Clustered Columnstore Indexes – part 21 (“DBCC CSIndex”)