Continuation from the previous 3 parts, starting from http://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/
Lets take a look at the concrete T-SQL syntax of the Clustered Columnstore Indexes:
Create Clustered Columnstore Index [name] on [table];
The only difference to the Nonclustered Indexes is basically the word Clustered plus the absence of the list of the columns (which is a normal condition for creating a Clustered Index, because it includes each and every column of the respective table).
Supported Data Types:
At the moment of the writing of this post we have a CTP1 of the SQL Server 2014, and so I decided to start a path of discovery which datatypes are actually supported in this release.
I have started with creating a small script including officially supported datatypes:
create table dbo.MaxDataTable( c1 bigint, c2 numeric (18,3), c3 bit, c4 smallint, c5 decimal (18,3), c6 smallmoney, c7 int, c8 tinyint, c9 money, c10 float(24), c11 real, c12 date, c13 datetimeoffset, c14 datetime2 (7), c15 smalldatetime, c16 datetime, c17 time (7), c18 char(100), c19 varchar(100), c20 nchar(100), c21 nvarchar(100), c22 binary(8), c23 varbinary(8), c24 uniqueidentifier, ); -- Lets try out to create a new Clustered Columnstore Index: Create Clustered Columnstore Index CC_MaxDataTable on dbo.MaxDataTable; -- Insert some Null's by default into this table: insert into dbo.MaxDataTable default values; -- Select content of our table select c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15, c16, c17, c18, c19, c20, c21, c22, c23, c24 from dbo.MaxDataTable;
This code works like a charm, without any problems creating the respective table plus the Clustered Columnstore Index and successfully inserts NULL values into it. 🙂 I have also tried out Numeric datatype with 38 precision and things are looking quite smooth.
But what about the other datatypes ? Their support is not present in the CTP1 of SQL Server 2014:
create table dbo.UnsupportedDatatypesTable( c1 text, c2 timestamp, c3 hierarchyid, c4 sql_variant, c5 xml, c6 varchar(max), c7 nvarchar(max), c8 geography, c9 geometry ); GO -- Lets create a Clustered Columnstore Index Create Clustered Columnstore Index CC_UnsupportedDatatypesTable on dbo.UnsupportedDatatypesTable; GO -- Drop this test table drop table dbo.UnsupportedDatatypesTable
This code fails right at the statement of the Clustered Columnstore Index creation. I have received the following message:
Msg 35343, Level 16, State 1, Line 14
CREATE INDEX statement failed. Column ‘c1’ has a data type that cannot participate in a columnstore index. Omit column ‘c1’.
Any of those listed datatypes in the UnsupportedDatatypesTable table will not allow the creation of the Clustered Columnstore Index, and let me say that for the first release of this functionality it seems like a very reasonable tradeoff.
In the future version I would definitely would love to see supprot for HierarchyId and Varchar(MAX) as well as nVarchar(MAX).
Note: The same restrictions applies to the CLR datatypes, but it is logical since hierarchyId, geography and geometry are CLR types.
This is all fine, but what about Sparse datatype ? SQL Server 2008 has introduced the support for Sparse columns and a good number of DatawareHouses are taking use of it in different situations. Lets try them out:
create table dbo.SparseTable( c1 int sparse NULL, c2 varchar(10) sparse NULL, c3 char(4) sparse NULL ); GO Create Clustered Columnstore Index CC_SparseTable on dbo.SparseTable; GO drop table dbo.SparseTable
But now we get an error message:
Msg 35309, Level 16, State 1, Line 9
CREATE INDEX statement failed because a columnstore index cannot be created on a sparse column. Consider creating a nonclustered columnstore index on a subset of columns that does not include any sparse columns.
Thats a bummer, and though I can see that Sparse is all about space improvement and with the compression of the Columnstore it looses focus, what I can’t understand at the moment is the reason why Nonclustered Columnstore Indexes are supporting it, when Columnstore do not. 🙁
Lets play now with schema changes for our MaxDataTable table, lets see if it already works by adding a new column of type integer (note that we have our Clustered Columnstore Index already created):
-- Let us add one more column alter table dbo.MaxDataTable add c25 int NULL; GO update dbo.MaxDataTable set c25 = 23; -- Can we actually modify it ? alter table dbo.MaxDataTable alter column c25 int NOT NULL; -- Drop it alter table dbo.MaxDataTable drop column c25; Worked out fine! This is really nice, and what about Constraints, I have heard from multiple people that they are not supported:
alter table dbo.MaxDataTable add Constraint CK_MaxDataTable Check (C1 > 0) with CHECK; update dbo.MaxDataTable set c1 = 23; -- Since data is still in the Delta-store let's try to Rebuild the table to put everything into a Segment: alter table dbo.MaxDataTable rebuild; GO
Wow, it looks like Constraints are actually are functioning quite well. Any updates with the values outside of the constraint are rejected, so I really assuming that this functionality is very present already in the current release.
It was mentioned a good number of times that the Clustered Columnstore Index can be the only index in the table, so I wanted to confirm that:
-- Nonclustered Index CREATE INDEX IX_MaxDataTable ON [dbo].[MaxDataTable] (c1); GO -- Playing stupid CREATE CLUSTERED INDEX IX_MaxDataTable ON [dbo].[MaxDataTable] (c1);
CREATE INDEX statement failed because a nonclustered index cannot be created on a table that has a clustered columnstore index. Consider replacing the clustered columnstore index with a nonclustered columnstore index.
While I see enough value in traditional NonClustered Indexes the fact that they will not be available for SQL Server 2014 is not something that I see preventing from using Clustered Columnstore in DatawareHouse environments. Of course I believe that Microsoft should consider implementing them (Nonclustered Indexes) in the following version after SQL Server 2014.