Continuation from the previous 25 parts, starting from https://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/
This time I decided to take a look at how having Columnstore Indexes in a database is affecting database backup & restore procedures. I decided to take a simple backup from the Contoso Retail DW database to just 1 file on a local drive in order to have some kind of a baseline. Naturally in a real production environment we would be able to go for multiple drives as well as multiple backup files which would accelerate the overall performance, but for a simple test I will use my local VM with 2 cores of I7 as well as a SSD drive.
For the first step in this exercise, I would be enabling checksums and compression by default:
exec sp_configure 'backup checksum default', '1'; RECONFIGURE; exec sp_configure 'backup compression default', '1'; RECONFIGURE;
To be fair, I have rebuilt Page-compressed tables at our database, by running the following script, which includes some basic update for the database work space:
USE [master] GO ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 120 GO ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2560000KB , FILEGROWTH = 512000KB ) GO ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 2000000KB , FILEGROWTH = 512000KB ) GO USE [ContosoRetailDW] GO EXEC dbo.sp_changedbowner @loginame = N'sa', @map = false GO ALTER TABLE [dbo].[FactOnlineSales] REBUILD PARTITION = ALL WITH ( DATA_COMPRESSION = PAGE ); ALTER TABLE [dbo].[FactStrategyPlan] REBUILD PARTITION = ALL WITH ( DATA_COMPRESSION = PAGE ); ALTER TABLE [dbo].[FactSales] REBUILD PARTITION = ALL WITH ( DATA_COMPRESSION = PAGE ); ALTER TABLE [dbo].[FactInventory] REBUILD PARTITION = ALL WITH ( DATA_COMPRESSION = PAGE ); ALTER TABLE [dbo].[FactSalesQuota] REBUILD PARTITION = ALL WITH ( DATA_COMPRESSION = PAGE );
Now to the Backup command itself, which will be a classic one, generated by the SSMS:
use [master]; DBCC Freeproccache; BACKUP DATABASE [ContosoRetailDW] TO DISK = N'C:\Backup\ContosoBkTest' WITH COPY_ONLY, NOFORMAT, NOINIT, NAME = N'ContosoRetailDW-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10, CHECKSUM GO declare @backupSetId as int select @backupSetId = position from msdb..backupset where database_name=N'ContosoRetailDW' and backup_set_id=(select max(backup_set_id) from msdb..backupset where database_name=N'ContosoRetailDW' ) if @backupSetId is null begin raiserror(N'Verify failed. Backup information for database ''ContosoRetailDW'' not found.', 16, 1) end RESTORE VERIFYONLY FROM DISK = N'C:\Backup\ContosoBkTest' WITH FILE = @backupSetId, NOUNLOAD, NOREWIND GO
and this is the Restore command that I have been using in order to measure performance:
use [master]; DBCC Freeproccache; RESTORE DATABASE [ContosoRetailDW] FROM DISK = N'C:\Backup\ContosoBkTest' WITH FILE = 1, MOVE N'ContosoRetailDW2.0' TO N'C:\Data\MSSQL\DATA\ContosoRetailDW.mdf', MOVE N'ContosoRetailDW2.0_log' TO N'C:\Data\MSSQL\DATA\ContosoRetailDW.ldf', NOUNLOAD, STATS = 5;
I have executed both backup and restore commands a number of times to measure the average performance and I have to say that the results were pretty consistent.
Before creating Clustered Columnstore Indexes on those 5 tables, I will execute the following script in order to make sure that I am running a fair test:
USE [master] GO ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 120 GO ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2560000KB , FILEGROWTH = 512000KB ) GO ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 2000000KB , FILEGROWTH = 512000KB ) GO USE [ContosoRetailDW] GO EXEC dbo.sp_changedbowner @loginame = N'sa', @map = false GO
Remove all primary and foreign keys, since they are not compatible with Clustered Columnstore Indexes:
ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [PK_FactOnlineSales_SalesKey] ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [PK_FactStrategyPlan_StrategyPlanKey] ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [PK_FactSales_SalesKey] ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [PK_FactInventory_InventoryKey] ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [PK_FactSalesQuota_SalesQuotaKey] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimCurrency] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimCustomer] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimDate] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimProduct] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimPromotion] ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [FK_FactOnlineSales_DimStore] ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimAccount] ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimCurrency] ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimDate] ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimEntity] ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimProductCategory] ALTER TABLE dbo.[FactStrategyPlan] DROP CONSTRAINT [FK_FactStrategyPlan_DimScenario] ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimChannel] ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimCurrency] ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimDate] ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimProduct] ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimPromotion] ALTER TABLE dbo.[FactSales] DROP CONSTRAINT [FK_FactSales_DimStore] ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [FK_FactInventory_DimCurrency] ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [FK_FactInventory_DimDate] ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [FK_FactInventory_DimProduct] ALTER TABLE dbo.[FactInventory] DROP CONSTRAINT [FK_FactInventory_DimStore] ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimChannel] ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimCurrency] ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimDate] ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimProduct] ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimScenario] ALTER TABLE dbo.[FactSalesQuota] DROP CONSTRAINT [FK_FactSalesQuota_DimStore]
Lets create our Clustered Columnstore Indexes:
Create Clustered Columnstore Index PK_FactOnlineSales on dbo.FactOnlineSales; checkpoint; Create Clustered Columnstore Index PK_FactOnlineSales on dbo.[FactStrategyPlan]; checkpoint; Create Clustered Columnstore Index PK_FactOnlineSales on dbo.[FactSales]; checkpoint; Create Clustered Columnstore Index PK_FactOnlineSales on dbo.[FactInventory]; checkpoint; Create Clustered Columnstore Index PK_FactOnlineSales on dbo.[FactSalesQuota];
Now here are the results of my tests and just take a look at those pictures:
The backup size is absolutely a no-brainer: Columnstore compression simply wipes out any competition. Even though the storage is getting cheaper, it is nice to have backups that are occupying less space – and this way you might simply have more backup history stored at no additional cost. :)
The results of the backup are pretty impressive, with Columnstore showing that it simply wins by a double margin with Page compression being the closest competition. Judging by what I see – backing up a database with Columnstore Indexes might make some good improvement on your backup times.
Restore – I was actually expecting Columnstore backup performing worse than a Page-based one, since a non-compressed backup restore outperforms greatly a compressed one (9,1 seconds on my setup).
As a matter of a fact PAGE Compression backup restore perform here a little bit worse than Columnstore Indexes backup and I have had to make a good number of tests to be able to reconfirm it.
This might have to do with a speed of the drive (SSD), but if we look at it logically – a slower drive on a server would make even more win for the Columnstore indexes, as long as you have a couple of free CPU cores, which should be the case on a good server.
The slower your backup/restore disks are – the more improvements you will get from using Columnstore Indexes, there is absolutely no doubt about it.
updated on 10th of February 2014:
I have executed some tests backing up data to the USB drive, to be able to compare with the backups to the fast drive. :)
Here are the results and as expected the difference gets bigger and bigger according to the performance of the drive.
to be continued …