Statistics: Incremental (2014) – Part 3: Updates Performance

This post is a direct continuation from the previous post on the new Incremental Statistics feature in SQL Server 2014.

This time we shall dive into the performance of the Statistics Update statement.

To reset the game, I will update tables statistics on by executing the sp_updatestats statement

update statistics dbo.FactOnlineSales with FULLSCAN;

After that let’s make sure that all our table statistics are actual:

SELECT indid, o.name AS Table_Name, i.name AS Index_Name, 
       STATS_DATE(o.id,i.indid) AS Date_Updated, rowmodctr--, i.*
	   , st.is_incremental
	FROM sysobjects o 
		JOIN sysindexes i 
			ON i.id = o.id
		JOIN sys.stats st
			ON st.object_id = o.id and st.stats_id = i.indid
	WHERE xtype = 'U' AND i.name IS NOT NULL and o.id = object_id('dbo.FactOnlineSales')
	ORDER BY Date_Updated asc;

Here is what I can actually see:
Screen Shot 2014-02-04 at 16.16.43

So the next step would be executing the statistics update for both traditional as well as incremental statistics to be able to compare their execution times. To make sure the results are more consistent I am pushing DBCC FREEPROCCACHE before each of the UPDATE STATISTICS command:

USE [ContosoRetailDW]
GO
dbcc freeproccache;

set statistics time on
set statistics io on

-- Incremental Statistics update
update statistics dbo.FactOnlineSales ([PtdFactOnlineSalesIncremental]) with FULLSCAN

set statistics time off
set statistics io off

GO
dbcc freeproccache;

set statistics time on
set statistics io on

-- Traditional Statistics Update
update statistics dbo.FactOnlineSales ([PtdFactOnlineSales]) with FULLSCAN

set statistics time off
set statistics io off

I have executed the previous script 3 times to get the average times and here are my results:
Screen Shot 2014-02-04 at 16.47.36

There are a couple of things quite clear on the image above:
1. The fullscan update of Incremental Statistics spends more resources. In my case a good 30% more CPU Time were spent, which shows that the internal structure of Incremental Statistics which includes an object for each of the partitions plus intermediate aggregates (representing a tree) needs more resources to get updated. This make a lot of sense.
2. The total elapsed time of a statistics update operation is a very different game. This is where Incremental statistics easily outperforms the Traditional Statistics even while spending much more CPU time in total
SPECULATION:
I guess it has to do with the fact that Incremental Statistics are being updated in parallel, where Traditional Statistics are being updated in a Single-Thread.

Partition Statistics Update

This is great, but what about the direct update of statistics on a single partition ?
Lets update some data on the 3rd partition of our table, by executing the following statement:

update dbo.FactOnlineSales
	set DateKey = dateAdd( ss, 1, DateKey )
	where DateKey > '2009-12-01';

The above statement will update 3.087.847 rows on the 3rd partition and it might take some time on your computer. Just saying. :) Now its time to play and to see what is going on with our statistics, by updating the Incremental Statistics update on a partition basis:

USE [ContosoRetailDW]
GO
dbcc freeproccache;

set statistics time on
set statistics io on

update statistics dbo.FactOnlineSales ([PtdFactOnlineSalesIncremental]) with RESAMPLE ON PARTITIONS(3)

set statistics time off
set statistics io off

Lets do the same for our Traditional Statistics object:

dbcc freeproccache;

set statistics time on
set statistics io on

update statistics dbo.FactOnlineSales ([PtdFactOnlineSales]) with RESAMPLE ON PARTITIONS(3)

set statistics time off
set statistics io off

Oops, now I get this:
Msg 9111, Level 16, State 1, Line 29
UPDATE STATISTICS ON PARTITIONS syntax is not supported for non-incremental statistics.

Eureka! Of course We can’t expect that the traditional statistics will allow us to update the information on a partition level, so that means that we are stuck with the old command without PARTITION.
So the execution statement would actually be update statistics dbo.FactOnlineSales ([PtdFactOnlineSales]) with RESAMPLE;.

Here are the results & as we all know a picture is thousand words worth :) :
Screen Shot 2014-02-04 at 17.57.01

There is one more thing I want to point on:
Should we update the Incremental Statistics, than the rowmodctr is actually not changing, as you can see from the execution of the following statement:

SELECT indid, o.name AS Table_Name, i.name AS Index_Name, 
       datediff(ss,STATS_DATE(o.id,i.indid), CURRENT_TIMESTAMP) AS SecondsSinceLastUpdate, 
	   rowmodctr
	   , st.is_incremental
	FROM sysobjects o 
		JOIN sysindexes i 
			ON i.id = o.id
		JOIN sys.stats st
			ON st.object_id = o.id and st.stats_id = i.indid
	WHERE xtype = 'U' AND i.name IS NOT NULL and o.id = object_id('dbo.FactOnlineSales')
	ORDER BY SecondsSinceLastUpdate asc
GO

Screen Shot 2014-02-04 at 18.01.12

This actually makes sense, since there were no updates of all partitions of our object PtdFactOnlineSalesIncremental, but if we update all of the partitions manually with the following script:

update statistics dbo.FactOnlineSales ([PtdFactOnlineSalesIncremental]) with RESAMPLE ON PARTITIONS(1);
update statistics dbo.FactOnlineSales ([PtdFactOnlineSalesIncremental]) with RESAMPLE ON PARTITIONS(2);
update statistics dbo.FactOnlineSales ([PtdFactOnlineSalesIncremental]) with RESAMPLE ON PARTITIONS(4);

than everything will be fine, since rowmodctr will be reset. ;)
Screen Shot 2014-02-04 at 18.14.34

to be continued …

Leave a Reply

Your email address will not be published. Required fields are marked *