Clustered Columnstore Indexes – part 26 (“Backup & Restore”)

Continuation from the previous 25 parts, starting from https://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/

This time I decided to take a look at how having Columnstore Indexes in a database is affecting database backup & restore procedures. I decided to take a simple backup from the Contoso Retail DW database to just 1 file on a local drive in order to have some kind of a baseline. Naturally in a real production environment we would be able to go for multiple drives as well as multiple backup files which would accelerate the overall performance, but for a simple test I will use my local VM with 2 cores of I7 as well as a SSD drive.

For the first step in this exercise, I would be enabling checksums and compression by default:

exec sp_configure 'backup checksum default', '1';
RECONFIGURE;
exec sp_configure 'backup compression default', '1';
RECONFIGURE;

To be fair, I have rebuilt Page-compressed tables at our database, by running the following script, which includes some basic update for the database work space:

 USE [master]
GO
ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 120
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2560000KB , FILEGROWTH = 512000KB )
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 2000000KB , FILEGROWTH = 512000KB )
GO
USE [ContosoRetailDW]
GO
EXEC dbo.sp_changedbowner @loginame = N'sa', @map = false
GO

ALTER TABLE [dbo].[FactOnlineSales] REBUILD PARTITION = ALL
WITH ( DATA_COMPRESSION = PAGE );

ALTER TABLE [dbo].[FactStrategyPlan] REBUILD PARTITION = ALL
WITH ( DATA_COMPRESSION = PAGE );

ALTER TABLE [dbo].[FactSales] REBUILD PARTITION = ALL
WITH ( DATA_COMPRESSION = PAGE );

ALTER TABLE [dbo].[FactInventory] REBUILD PARTITION = ALL
WITH ( DATA_COMPRESSION = PAGE );

ALTER TABLE [dbo].[FactSalesQuota] REBUILD PARTITION = ALL
WITH ( DATA_COMPRESSION = PAGE );

Now to the Backup command itself, which will be a classic one, generated by the SSMS:

use [master];

DBCC Freeproccache;

BACKUP DATABASE [ContosoRetailDW] TO  DISK = N'C:\Backup\ContosoBkTest' WITH  COPY_ONLY, NOFORMAT, NOINIT,  NAME = N'ContosoRetailDW-Full Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 10, CHECKSUM
GO
declare @backupSetId as int
select @backupSetId = position from msdb..backupset where database_name=N'ContosoRetailDW' and backup_set_id=(select max(backup_set_id) from msdb..backupset where database_name=N'ContosoRetailDW' )
if @backupSetId is null begin raiserror(N'Verify failed. Backup information for database ''ContosoRetailDW'' not found.', 16, 1) end
RESTORE VERIFYONLY FROM  DISK = N'C:\Backup\ContosoBkTest' WITH  FILE = @backupSetId,  NOUNLOAD,  NOREWIND
GO

and this is the Restore command that I have been using in order to measure performance:

use [master];

DBCC Freeproccache;

RESTORE DATABASE [ContosoRetailDW] 
	FROM  DISK = N'C:\Backup\ContosoBkTest' WITH  FILE = 1,  
		MOVE N'ContosoRetailDW2.0' TO N'C:\Data\MSSQL\DATA\ContosoRetailDW.mdf', 
		MOVE N'ContosoRetailDW2.0_log' TO N'C:\Data\MSSQL\DATA\ContosoRetailDW.ldf',  
		NOUNLOAD,  STATS = 5;

I have executed both backup and restore commands a number of times to measure the average performance and I have to say that the results were pretty consistent.

Before creating Clustered Columnstore Indexes on those 5 tables, I will execute the following script in order to make sure that I am running a fair test:

 USE [master]
GO
ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 120
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2560000KB , FILEGROWTH = 512000KB )
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 2000000KB , FILEGROWTH = 512000KB )
GO
USE [ContosoRetailDW]
GO
EXEC dbo.sp_changedbowner @loginame = N'sa', @map = false
GO

Remove all primary and foreign keys, since they are not compatible with Clustered Columnstore Indexes:

ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [PK_FactOnlineSales_SalesKey]
ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [PK_FactStrategyPlan_StrategyPlanKey]
ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [PK_FactSales_SalesKey]
ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [PK_FactInventory_InventoryKey]
ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [PK_FactSalesQuota_SalesQuotaKey]

ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimCurrency]
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimCustomer]
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimDate]
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimProduct]
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimPromotion]
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimStore]
ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimAccount]
ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimCurrency]
ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimDate]
ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimEntity]
ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimProductCategory]
ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimScenario]
ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimChannel]
ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimCurrency]
ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimDate]
ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimProduct]
ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimPromotion]
ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimStore]
ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [FK_FactInventory_DimCurrency]
ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [FK_FactInventory_DimDate]
ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [FK_FactInventory_DimProduct]
ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [FK_FactInventory_DimStore]
ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimChannel]
ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimCurrency]
ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimDate]
ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimProduct]
ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimScenario]
ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimStore]

Lets create our Clustered Columnstore Indexes:

Create Clustered Columnstore Index PK_FactOnlineSales
	on dbo.FactOnlineSales;

checkpoint;

Create Clustered Columnstore Index PK_FactOnlineSales
	on dbo.[FactStrategyPlan];

checkpoint;

Create Clustered Columnstore Index PK_FactOnlineSales
	on dbo.[FactSales];

checkpoint;

Create Clustered Columnstore Index PK_FactOnlineSales
	on dbo.[FactInventory];

checkpoint;

Create Clustered Columnstore Index PK_FactOnlineSales
	on dbo.[FactSalesQuota];

Now here are the results of my tests and just take a look at those pictures:
Screen Shot 2014-02-09 at 11.31.37

The backup size is absolutely a no-brainer: Columnstore compression simply wipes out any competition. Even though the storage is getting cheaper, it is nice to have backups that are occupying less space – and this way you might simply have more backup history stored at no additional cost. :)

Screen Shot 2014-02-09 at 11.30.37
The results of the backup are pretty impressive, with Columnstore showing that it simply wins by a double margin with Page compression being the closest competition. Judging by what I see – backing up a database with Columnstore Indexes might make some good improvement on your backup times.

Screen Shot 2014-02-09 at 11.31.32
Restore – I was actually expecting Columnstore backup performing worse than a Page-based one, since a non-compressed backup restore outperforms greatly a compressed one (9,1 seconds on my setup).
As a matter of a fact PAGE Compression backup restore perform here a little bit worse than Columnstore Indexes backup and I have had to make a good number of tests to be able to reconfirm it.
This might have to do with a speed of the drive (SSD), but if we look at it logically – a slower drive on a server would make even more win for the Columnstore indexes, as long as you have a couple of free CPU cores, which should be the case on a good server.

The slower your backup/restore disks are – the more improvements you will get from using Columnstore Indexes, there is absolutely no doubt about it.

updated on 10th of February 2014:

I have executed some tests backing up data to the USB drive, to be able to compare with the backups to the fast drive. :)
Here are the results and as expected the difference gets bigger and bigger according to the performance of the drive.

Screen Shot 2014-02-10 at 12.16.42

Screen Shot 2014-02-10 at 12.16.21

to be continued …

Leave a Reply

Your email address will not be published. Required fields are marked *