Columnstore Indexes – part 75 (“Stretch DB & Columnstore”)

Continuation from the previous 74 parts, the whole series can be found at https://www.nikoport.com/columnstore/.

This blog post focuses on one of the newest additions to SQL Server 2016, the feature that most probably has gained a universal appraisal since the announcement – the Stretch DB and its integration with Columnstore Indexes.

IC835282.jpegStretch DB or alternatively Stretch Database is a way of spreading your table between SQL Server (on-premises, VM in Azure) and a Azure SQLDatabase. This means that the dat of the table will shared between the SQL Server and the Azure SQLDatabase giving the opportunity to lower the total cost of the local storage, since Azure SQLDatabase is cheap relatively expensive storage typically used on the local SQL Server installations.
This mean that the table data will be separated into Hot Data & Cold Data, where Hot Data is the type of data that is frequently accessed and it extremely important (this is typically some OLTP data) and the Cold Data (this is typically rarely or almost never accessed archival or log data).
For the final user the experience should be the same as before – should he ask for some data that is not on the SQL Server, then it will be read from Azure SQLDatabase by the invocation of remote query, joined with the local results (if any) and then presented to the user.

Currently with SQL Server 2016 CTP 3.2 with GUI Wizard we can only migrate whole table to Azure SQLDatabase but with the help of T-SQL we can already control the predicate filter.

Since the StretchDB is focusing on helping to manage big amounts of data, one of the key elements in order to lower the cost is the usage of Columnstore Indexes and here I will focus on testing StretchDB compatibility with them.

The Test

I will use the free database ContosoRetailDW that comes from Microsoft that I use the backup copied into C:\Install and the data files are placed into C:\Data:

alter database ContosoRetailDW
    set SINGLE_USER WITH ROLLBACK IMMEDIATE;
 
RESTORE DATABASE [ContosoRetailDW] 
    FROM  DISK = N'C:\Install\ContosoRetailDW.bak' WITH  FILE = 1,  
        MOVE N'ContosoRetailDW2.0' TO N'C:\Data\ContosoRetailDW.mdf', 
        MOVE N'ContosoRetailDW2.0_log' TO N'C:\Data\ContosoRetailDW.ldf',  
        NOUNLOAD,  STATS = 5;
 
alter database ContosoRetailDW
    set MULTI_USER;
GO

GO
ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 130
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2000000KB , FILEGROWTH = 128000KB )
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 400000KB , FILEGROWTH = 256000KB )
GO

Currently there are 2 ways of enabling StretchDB on the database:
– using T-SQL
– using Graphical User Interface (GUI) inside SQL Server Management Studio

For T-SQL we shall need to set up the remote data archive option that is enabling Stretch Database feature on the server, by executing the following command:

exec sp_configure 'remote data archive' , '1';
go
Reconfigure;
go

After that we shall need to create a master key for the database. This master key will be used for securing the credentials that are used to communicate with Azure SQLDatabase:

CREATE MASTER KEY ENCRYPTION BY PASSWORD='

For communication with remote part (Azure SQLDatabase) we shall need to create database scooped credentials:

CREATE DATABASE SCOPED CREDENTIAL stretchContosoCredentials
    WITH IDENTITY = 'niko' , SECRET = '=#$"qwerty $&/:bnuwE3';
GO

We will need to create a Azure SQLServer & SQLDatabase that will have those credentials created in the previous step, or we might simply reuse the existing one.
In my case I simply went to Azure Portal and created a new server with database for this test.
Notice that you will need to give enough permissions on Azure SQLDatabase for your user to be able to create a new database.

After preparing login and user, you can simply execute the following statement, which will create a new database for StretchDB:

ALTER DATABASE ContosoRetailDW
    SET REMOTE_DATA_ARCHIVE = ON
        (
            SERVER = 'columnstore.database.windows.net', -- My Azure SQL Server
            CREDENTIAL = stretchContosoCredentials -- Database Scope Credentials
        ) ;

Refreshing the list of your databases in SSMS, you will notice that the symbol of the database has changed becoming a "kind of a cloud db", meaning that this database is being stretched to Azure:
Stretched Database ContosoRetailDW

Let's stretch a table, and this time for the experiment I will pick dbo.FactITMachine table.
Since StretchDB at the moment does not support primary & foreign keys (make sense because how can you guarantee consistency with data that is not explicitly included in the operations), we will drop them all, plus we will need to remove the identity from the leading ITMachinekey column:

ALTER TABLE [dbo].[FactITMachine] DROP CONSTRAINT [FK_FactITMachine_DimDate];
GO
ALTER TABLE [dbo].[FactITMachine] DROP CONSTRAINT [FK_FactITMachine_DimMachine];
GO
ALTER TABLE [dbo].[FactITMachine] DROP CONSTRAINT [PK_FactITMachine];
GO
alter table dbo.FactITMachine
	alter column [ITMachinekey] int;

Let's set up a filtering predicate function that will determine which rows should be migrated to Azure SQLDatabase and which should stay in SQL Server. For this test I will define 1st of January of 2009 as the cutting point for the data to be migrated:

CREATE FUNCTION dbo.fn_stretchpredicateFactITMachine(@column1 datetime)
	RETURNS TABLE
		WITH SCHEMABINDING 
AS 
RETURN	SELECT 1 AS is_eligible
		WHERE @column1 < CONVERT(datetime, '1/1/2009', 101);

Before activating stretching on a table with a filter predicate we shall need to enable a global trace flag in SQL Server 2016 CTP 3.2:

dbcc traceon(10422,-1); -- Trace Flag Filtering Predicate

Now let's activate stretching on our dbo.FactITMachine table using the filter predicate defined in the previous step:

ALTER TABLE dbo.FactITMachine
    SET ( REMOTE_DATA_ARCHIVE = ON (
        FILTER_PREDICATE = dbo.fn_stretchpredicateFactITMachine(DateKey),
        MIGRATION_STATE = OUTBOUND ) );

If we open monitoring screen for Stretched Databases (right click on our database -> tasks-> stretch -> monitor) :
Open StretchDB Monitor with GUI
you will see that our table is successfully migrated according to the filtering predicate we have defined:
Table FactITMachine is stretched with a filter
To prove that those numbers reflect the actuality, feel free to execute the following T-SQL script which will return exactly 7086 rows corresponding our filtering predicate:

select count(*)
	from dbo.FactITMachine
	where Datekey >= '2009-01-01'

If you want to get the rows back into your SQL Server table, you will need to execute the following script that will enable the global trace flag allowing downward migration from Azure SQLDatabase down to your SQL Server, and then altering the state of the table, setting the migration to be InBound:

dbcc traceon(10424,-1); -- UnMigrate Trace flag

ALTER TABLE dbo.FactITMachine
    SET ( REMOTE_DATA_ARCHIVE = ON  (MIGRATION_STATE = INBOUND) ) ;

Let's concentrate on the test of the functionality that matters to these blog series - Columnstore Indexes - let's create 2 tables with Columnstore Indexes: 1 with Clustered Columnstore Index and the other one with Nonclustered Columnstore Index:

create table dbo.CCI_Stretched(
	c1 int,
	c2 varchar(25),
	index CCI_Stretched_Index clustered columnstore );

create table dbo.NCCI_Stretched(
	c1 int,
	c2 varchar(25),
	index CCI_Stretched_Index nonclustered columnstore (c1,c2) );

Let's insert a couple of rows into each of the tables:

insert into dbo.CCI_Stretched
	(c1,c2)
	select top 100 id, substring(name,0,25) 
		from sys.sysobjects;

insert into dbo.NCCI_Stretched
	(c1,c2)
	select top 100 id, substring(name,0,25) 
		from sys.sysobjects;

Now let's go to the GUI in SQL Server Management Studio and enable stretching for our table with the Clustered Columnstore Index:
Enable Stretching CCI on Clustered Columnstore
Clicking through the welcome screen:
Enable Stretching CCI Welcome
Confirming that the table selected is correct:
Enable Stretching CCI - Table List
We are coming to the validation screen:
Enable Stretching CCI - Validation
where you can see that we have received a pretty concrete message that the type of index used in the table (Clustered Columnstore is the only index there) is not supported for Stretch Databases.
If clicking on the error details you will be presented with a more definitive error message confirming that Clustered Columnstore Indexes are not supported.
Enable Stretching CCI - Error Message GUI

Ok, this is a message from the GUI and we all know that GUI is the last thing to get updated. Even though the message appears to be very concrete, let's try to enable stretching with the help of T-SQL:

ALTER TABLE dbo.CCI_Stretched
    SET ( REMOTE_DATA_ARCHIVE = ON (
        MIGRATION_STATE = OUTBOUND ) );

The script was executed without any error message - did it work ?
Let's open the Stretch Database Monitor - but the table is not listed there !
Opening Stretch Database Health Events in Extended Events gave me the following message:
Enable Stretching CCI - Health Events
This message is clearly showing the the remote table could not be created. Verifying that the table is the one I just tried to configure, I execute the following T-SQL:

select object_name(318624178);

The result is the dbo.CCI_Stretched table and so the extended event is correct - we can't have our table stretched to Azure SQLDatabase.

Well ... Thinking about the error message I have decided to try to ... correct it. :)
Given that the database created for the StretchDB is a Standard S3 one, I went to Azure Portal and manually upgraded it to P1 (Premium).
Database Edition Upgrade
Seconds after the upgrade has finished, and after restarting SSMS (this is an old bug with StretchDB), I have had my table listed as stretched:
Enable Stretching CCI - Configured Database
To make a further confirmation, I have executed the following query while enabling the showing of the actual execution plan:

select *
	from dbo.CCI_Stretched;

Having Remote Query and concatenation with the local data is exactly how Stretch DB is working:
Enable Stretching CCI - Remote Query Execution Plan
Let us hope then that there will be a possibility for choosing an edition while stretching databases. This will allow to avoid all possible problems.

Alternatively Microsoft can always enable Columnstore Indexes for the Standard Edition ... :)

Lets not forget about the Operational Analytics and let's see if we can stretch a table with updatable Nonclustered Columnstore Indexes:
Enable Stretching NCCI - Error Message GUI
This is the same error we have seen for the Clustered Columnstore Indexes, so let's use the power of T-SQL to stretch this table as well:

ALTER TABLE dbo.NCCI_Stretched
    SET ( REMOTE_DATA_ARCHIVE = ON (
        MIGRATION_STATE = OUTBOUND ) );

The result is visible in the Stretch DB Monitor:
Enable Stretching NCCI - Configured Database
Let's verify if the query is really being run remotely:

select *
	from dbo.NCCI_Stretched;

The execution plan speaks for itself:
Enable Stretching NCCI - Remote Query Execution Plan

There is one more scenario that one should consider when thinking about Columnstore Indexes: the newest addition for the In-Memory Tables: InMemory Operational Analytics.
Heading to the Surface area limitations and blocking issues for Stretch Database will help you to discover the limitats and the Memory-optimized tables overall are not supported for the current version.

Final Thoughts

Thinking about the impossibility of Columnstore Index stretching i realised that the default Azure SQLDatabase that is created for StretchDB is the Standard one (S3), where Columnstore Indexes are not supported at the moment.
Otherwise the technology should work without any problems.
Time will tell if maybe StretchDB Migration Advisor will recommend & allow Premium DB selection when configuring the feature ...

to be continued ...

4 thoughts on “Columnstore Indexes – part 75 (“Stretch DB & Columnstore”)

  1. Yoshihiro Kawabata

    The last T-SQL query sample seems to need to changing from “dbo.CCI_Stretched” to “dbo.NCCI_Stretched”.
    Original:
    “select *
    from dbo.CCI_Stretched;”

    Change:
    “select *
    from dbo.NCCI_Stretched;”

    and Thank you, nice blogs.
    Regards.

    1. Niko Neugebauer Post author

      Hi Suhas,

      I suggest you post this issue on StackOverflow or eventually open a support case.
      Stretch is not a popular feature and since the initial release and with departure the people driving its success from Microsoft – I did not see any improvements in all these years.

      Best regards,
      Niko

Leave a Reply to Niko Neugebauer Cancel reply

Your email address will not be published. Required fields are marked *