Continuation from the previous 21 parts, starting from http://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/
Last week on an amazing 2013 edition of PASSCamp in Germany one of the attendees has called my attention to the fact that during hands-on labs he has managed to get an invisible Row Group. I was in the middle of demos and did not get the message during the first call – I thought that it was an unexpected Row Group that has managed to see, but as a matter of a fact – it was something different.
Indeed it was something, that is called Invisible Row Group. I can only say that I was extremely excited about it since I have never managed to see one in my tests and demos. I am referring to the very special moment of life of any Delta Store, when it is right to become a compressed Segment. If you are running tests during the period when Tuple Mover kicks in, you have great chances to find it at CTP2 for sure.
When I had a first couple of minutes free I have managed to imitate the conditions, which I will describe below:
Lets kick off by creating an empty table with a Clustered Columnstore Index:
create table dbo.MaxDataTable( c1 bigint, c2 numeric (36,3), c3 bit, c4 smallint, c5 decimal (18,3), c6 smallmoney, c7 int, c8 tinyint, c9 money, c10 float(24), c11 real, c12 date, c13 datetimeoffset, c14 datetime2 (7), c15 smalldatetime, c16 datetime, c17 time (7), c18 char(100), c19 varchar(100), c20 nchar(100), c21 nvarchar(100), c22 binary(8), c23 varbinary(8), c24 uniqueidentifier, ); -- Lets try out to create a Clustered Columnstore Index: Create Clustered Columnstore Index CC_MaxDataTable on dbo.MaxDataTable;
I will insert 2 new rows and compress this Row Group:
insert into dbo.MaxDataTable default values; insert into dbo.MaxDataTable default values; alter table dbo.MaxDataTable rebuild;
Let us insert 10.000 rows and update them immediately:
-- Insert 10.000 rows insert into dbo.MaxDataTable default values; GO 10000 -- ***************************************************************** -- Lets Update all rows with some data with updTable as ( select * , row_number() over(partition by C1 order by C1) as rnk from dbo.MaxDataTable ) update updTable set C1 = rnk, C2 = ABS(CHECKSUM(NewId())) % 142359.0, C3 = 1, C4 = cast(ABS(CHECKSUM(NewId())) % 10000 as smallint), C5 = ABS(CHECKSUM(NewId())) % 242359.0, C6 = rnk / ABS(CHECKSUM(NewId())) % 242359.0, C7 = ABS(CHECKSUM(NewId())) % 2000000, C8 = ABS(CHECKSUM(NewId())) % 255, C9 = rnk / ABS(CHECKSUM(NewId())) % 242359.0, C10 = rnk / ABS(CHECKSUM(NewId())) % 242359.0, C11 = rnk / ABS(CHECKSUM(NewId())) % 242359.0, C12 = getDate(), C14 = DATEADD(day, (ABS(CHECKSUM(NEWID())) % 65530), 0), C15 = DATEADD(day, (ABS(CHECKSUM(NEWID())) % 65530), 0), C16 = DATEADD(day, (ABS(CHECKSUM(NEWID())) % 65530), 0), C17 = getDate(), C18 = cast( ABS(CHECKSUM(NewId())) % 242359.0 as char(25)), C19 = cast( ABS(CHECKSUM(NewId())) % 232659.0 as varchar(25)), C20 = cast( ABS(CHECKSUM(NewId())) % 242359.0 as nchar(25)), C21 = cast( ABS(CHECKSUM(NewId())) % 232659.0 as nvarchar(25));
Before we advance to the next step, please use the following script to monitorize the state of the Row Groups in our table:
SELECT rg.total_rows, cast(100.0*(total_rows)/1048576 as Decimal(6,3)) as PercentFull, cast(100-100.0*(total_rows - ISNULL(deleted_rows,0))/iif(total_rows = 0, 1, total_rows) as Decimal(6,3)) AS PercentDeleted, i.object_id, object_name(i.object_id) AS TableName, i.name AS IndexName, i.index_id, i.type_desc, rg.* FROM sys.indexes AS i INNEr JOIN sys.column_store_row_groups AS rg ON i.object_id = rg.object_id AND i.index_id = rg.index_id WHERE object_name(i.object_id) = 'MaxDataTable' --and state = 0 ORDER BY object_name(i.object_id), i.name, row_group_id;
Its time to insert an extra Row Group – 1.045.678 rows:
-- Insert 1.000.000 rows in order to get the Delta-Store full ) declare @i as int; declare @max as int; select @max = isnull(max(C1),0) from dbo.MaxDataTable; set @i = 1; begin tran while @i <= 1048576 begin insert into dbo.MaxDataTable default values set @i = @i + 1; end; commit;
You should immediately kick in Tuple Mover manually, while checking the status of the Row Groups using the script above:
alter index [CC_MaxDataTable] on dbo.MaxDataTable reorganize;
There are 4 Row Groups visible, with 1 of them having status equal to 0 (Zero), and status_description of "Invisible". This is clearly a moment when we are compressing our closed Delta-Store which is identified with Row_Group_id = 1 (state = "closed").
A very clear show of how things are functioning in Clustered Columnstore :)
This raises another important issue for all those future scripts consulting sizes of Clustered Columnstore Indexes - a predicate with state > 0 should be included in order to avoid double counting the number of rows inside the Clustered Columnstore Index.
to be continued with Clustered Columnstore Indexes – part 23 (“Data Loading”)