Clustered Columnstore Indexes – part 4 (“Basic T-SQL”)

Continuation from the previous 3 parts, starting from http://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/

Lets take a look at the concrete T-SQL syntax of the Clustered Columnstore Indexes:

Create Clustered Columnstore Index [name] on [table];

The only difference to the Nonclustered Indexes is basically the word Clustered plus the absence of the list of the columns (which is a normal condition for creating a Clustered Index, because it includes each and every column of the respective table).

Supported Data Types:
At the moment of the writing of this post we have a CTP1 of the SQL Server 2014, and so I decided to start a path of discovery which datatypes are actually supported in this release.

I have started with creating a small script including officially supported datatypes:

create table dbo.MaxDataTable(
	c1 bigint,
	c2 numeric (18,3),
	c3 bit,
	c4 smallint,
	c5 decimal (18,3),
	c6 smallmoney,
	c7 int,
	c8 tinyint,
	c9 money,
	c10 float(24),
	c11 real,
	c12 date,
	c13 datetimeoffset,
	c14 datetime2 (7),
	c15 smalldatetime,
	c16 datetime,
	c17 time (7),
	c18 char(100),
	c19 varchar(100),
	c20 nchar(100),
	c21 nvarchar(100),
	c22 binary(8),
	c23 varbinary(8),
	c24 uniqueidentifier,
);

-- Lets try out to create a new Clustered Columnstore Index:
Create Clustered Columnstore Index CC_MaxDataTable on dbo.MaxDataTable;

-- Insert some Null's by default into this table:
insert into dbo.MaxDataTable
	default values;

-- Select content of our table
select c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15, c16, c17, c18, c19, c20, c21, c22, c23, c24
	from dbo.MaxDataTable; 

This code works like a charm, without any problems creating the respective table plus the Clustered Columnstore Index and successfully inserts NULL values into it. 🙂 I have also tried out Numeric datatype with 38 precision and things are looking quite smooth.

But what about the other datatypes ? Their support is not present in the CTP1 of SQL Server 2014:

create table dbo.UnsupportedDatatypesTable(
	c1 text, 
	c2 timestamp,
	c3 hierarchyid,
	c4 sql_variant,
	c5 xml,
	c6 varchar(max),
	c7 nvarchar(max),
	c8 geography,
	c9 geometry
);	
GO

-- Lets create a Clustered Columnstore Index
Create Clustered Columnstore Index CC_UnsupportedDatatypesTable on dbo.UnsupportedDatatypesTable;
GO

-- Drop this test table
drop table dbo.UnsupportedDatatypesTable

This code fails right at the statement of the Clustered Columnstore Index creation. I have received the following message:
Msg 35343, Level 16, State 1, Line 14
CREATE INDEX statement failed. Column ‘c1’ has a data type that cannot participate in a columnstore index. Omit column ‘c1’.

Any of those listed datatypes in the UnsupportedDatatypesTable table will not allow the creation of the Clustered Columnstore Index, and let me say that for the first release of this functionality it seems like a very reasonable tradeoff.

In the future version I would definitely would love to see supprot for HierarchyId and Varchar(MAX) as well as nVarchar(MAX).
Note: The same restrictions applies to the CLR datatypes, but it is logical since hierarchyId, geography and geometry are CLR types.

Sparse:
This is all fine, but what about Sparse datatype ? SQL Server 2008 has introduced the support for Sparse columns and a good number of DatawareHouses are taking use of it in different situations. Lets try them out:

create table dbo.SparseTable(
	c1 int sparse NULL, 
	c2 varchar(10) sparse NULL,
	c3 char(4) sparse NULL
);
GO

Create Clustered Columnstore Index CC_SparseTable on dbo.SparseTable;
GO

drop table dbo.SparseTable

But now we get an error message:

Msg 35309, Level 16, State 1, Line 9
CREATE INDEX statement failed because a columnstore index cannot be created on a sparse column. Consider creating a nonclustered columnstore index on a subset of columns that does not include any sparse columns.
Thats a bummer, and though I can see that Sparse is all about space improvement and with the compression of the Columnstore it looses focus, what I can’t understand at the moment is the reason why Nonclustered Columnstore Indexes are supporting it, when Columnstore do not. 🙁

Alter Table:
Lets play now with schema changes for our MaxDataTable table, lets see if it already works by adding a new column of type integer (note that we have our Clustered Columnstore Index already created):

-- Let us add one more column
alter table dbo.MaxDataTable
	add c25 int NULL;
GO

update dbo.MaxDataTable
	set c25 = 23;

-- Can we actually modify it ?
alter table dbo.MaxDataTable
	alter column c25 int NOT NULL;

-- Drop it
alter table dbo.MaxDataTable
	drop column c25;
Worked out fine! This is really nice, and what about Constraints, I have heard from multiple people that they are not supported:
alter table dbo.MaxDataTable
	add Constraint CK_MaxDataTable
		Check (C1 > 0) with CHECK;

update dbo.MaxDataTable
	set c1 = 23;

-- Since data is still in the Delta-store let's try to Rebuild the table to put everything into a Segment:
alter table dbo.MaxDataTable 
        rebuild;
GO

Wow, it looks like Constraints are actually are functioning quite well. Any updates with the values outside of the constraint are rejected, so I really assuming that this functionality is very present already in the current release.

Traditional Indexes:
It was mentioned a good number of times that the Clustered Columnstore Index can be the only index in the table, so I wanted to confirm that:

-- Nonclustered Index
CREATE INDEX IX_MaxDataTable
    ON [dbo].[MaxDataTable] (c1);
GO

-- Playing stupid
CREATE CLUSTERED INDEX IX_MaxDataTable
    ON [dbo].[MaxDataTable] (c1); 

Confirmed:
CREATE INDEX statement failed because a nonclustered index cannot be created on a table that has a clustered columnstore index. Consider replacing the clustered columnstore index with a nonclustered columnstore index.

While I see enough value in traditional NonClustered Indexes the fact that they will not be available for SQL Server 2014 is not something that I see preventing from using Clustered Columnstore in DatawareHouse environments. Of course I believe that Microsoft should consider implementing them (Nonclustered Indexes) in the following version after SQL Server 2014.

to be continued with Clustered Columnstore Indexes – part 5 (“New Meta-Information and System Stored Procedure”)

5 thoughts on “Clustered Columnstore Indexes – part 4 (“Basic T-SQL”)

  1. Pingback: Clustered Columnstore Indexes – part 1 (“Intro”) | Nikoport

  2. Frank

    You write “what I can’t understand at the moment is the reason why Nonclustered Columnstore Indexes are supporting it, when Columnstore do not.”. But the error message clearly suggests “a nonclustered columnstore index on a subset of columns that does not include any sparse columns”. This explains why it is not possible, as the clustered index needs to include ALL columns, and it just is impossibly to include any sparse columns in a columnstore index at all.

  3. Yasub Mannan

    Hi Niko,

    Thank you so much for this blog series. I am trying to work with ColumnStore Indexes and you blog series is a savior!

    One Input about the Supported Datatypes, Latest edition of SQL2014, supports these as well 🙂 :

    c6 varchar(max),
    c7 nvarchar(max),
    c8 geography,
    c9 geometry

Leave a Reply

Your email address will not be published. Required fields are marked *