Columnstore Indexes – part 119 (“In-Memory Columnstore Location”)

Continuation from the previous 118 parts, the whole series can be found at http://www.nikoport.com/columnstore/.

This is a rather simple blog post that is dedicated to the theme of the In-Memory Columnstore Indexes location. This has been a constant topic of discussion over a long period of time, even during the public events – and there is a need to clear out this topic.

I have assumed that the In-Memory Columnstore structures (Segments, Dictionaries, …) are located in the In-Memory, but there have been voices that I greatly respect, pointing that actually the Columnstore Object Pool is the exact location of any Columnstore structures, and there is nothing better than to take this feature for a ride and see what the SQL Server engine is actually doing.

For the tests, I took a simple SQL Server 2017 EXPRESS Edtion instance in its RTM version, because of its memory limitations that define that the Memory: Maximum memory-optimized data per database and the Memory: Maximum Columnstore segment cache per instance are both capped at the 352 MB, making the demos on the subject very easy to produce and observe. 🙂

The good old free database ContosoRetailDW will be used for the tests because we do want to make sure that our tables fit into the memory of the test instance – the maximum size for the buffer pool is still 1GB, but the SQL Server Express is free – yes, really free!

The standard script for restoring the database from the C:\Install location and storing this time in C:\DATA\XE is presented below:

USE [master]
 
alter database ContosoRetailDW
    set SINGLE_USER WITH ROLLBACK IMMEDIATE;
 
RESTORE DATABASE [ContosoRetailDW] 
    FROM  DISK = N'C:\Install\ContosoRetailDW.bak' WITH  FILE = 1,  
        MOVE N'ContosoRetailDW2.0' TO N'C:\Data\XE\ContosoRetailDW.mdf', 
        MOVE N'ContosoRetailDW2.0_log' TO N'C:\Data\XE\ContosoRetailDW.ldf',  
        NOUNLOAD,  STATS = 5;
 
alter database ContosoRetailDW
    set MULTI_USER;
GO

GO
ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 140
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0', SIZE = 2000000KB , FILEGROWTH = 128000KB )
GO
ALTER DATABASE [ContosoRetailDW] MODIFY FILE ( NAME = N'ContosoRetailDW2.0_log', SIZE = 400000KB , FILEGROWTH = 256000KB )
GO

Let us add the In-Memory capability to our test database, using the C:\Data\XE\Contosoxtp folder for the In-Memory storage:


 USE master;

ALTER DATABASE [ContosoRetailDW] 
	ADD FILEGROUP [ContosoRetailDW_Hekaton] 
	CONTAINS MEMORY_OPTIMIZED_DATA
GO

ALTER DATABASE [ContosoRetailDW]
	ADD FILE(NAME = ContosoRetailDW_HekatonDir, 
	FILENAME = 'C:\Data\XE\Contosoxtp') 
	TO FILEGROUP [ContosoRetailDW_Hekaton];

GO

We need to drop the primary key on the FactOnlineSales table and create a Clustered Columnstore Index on this table:

use ContosoRetailDW;
GO

ALTER TABLE dbo.[FactOnlineSales] DROP CONSTRAINT [PK_FactOnlineSales_SalesKey];
 
create clustered columnstore Index PK_FactOnlineSales
	on dbo.FactOnlineSales;

Now, let us create a simple copy of our Columnstore FactOnlineSales table, that will be storing In-Memory data, and which shall be called FactOnlineSales_Hekaton:

use ContosoRetailDW;

DROP TABLE IF EXISTS [dbo].[FactOnlineSales_Hekaton];

CREATE TABLE [dbo].[FactOnlineSales_Hekaton](
	[SomeId] INT IDENTITY(1,1) NOT NULL,
	[OnlineSalesKey] [int] NOT NULL,
	[DateKey] [datetime] NOT NULL,
	[StoreKey] [int] NOT NULL,
	[ProductKey] [int] NOT NULL,
	[PromotionKey] [int] NOT NULL,
	[CurrencyKey] [int] NOT NULL,
	[CustomerKey] [int] NOT NULL,
	[SalesOrderNumber] [nvarchar](20) NOT NULL,
	[SalesOrderLineNumber] [int] NULL,
	[SalesQuantity] [int] NOT NULL,
	[SalesAmount] [money] NOT NULL,
	[ReturnQuantity] [int] NOT NULL,
	[ReturnAmount] [money] NULL,
	[DiscountQuantity] [int] NULL,
	[DiscountAmount] [money] NULL,
	[TotalCost] [money] NOT NULL,
	[UnitCost] [money] NULL,
	[UnitPrice] [money] NULL,
	[ETLLoadID] [int] NULL,
	[LoadDate] [datetime] NULL,
	[UpdateDate] [datetime] NULL,
	Constraint PK_FactOnlineSales_Hekaton PRIMARY KEY NONCLUSTERED ([SomeId]),
	INDEX NCCI_FactOnlineSales_Hekaton 
		CLUSTERED COLUMNSTORE 
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

For the simplest test, let’s load 1 Million Rows into our In-Memory table, making sure that no data is staying in the Delta-Store, but compressed in the Row Group:

-- DELETE ALL Records
delete from dbo.FactOnlineSales_Hekaton

-- Insert 1.500.000 Rows 
insert into dbo.FactOnlineSales_Hekaton
	select top 1500000 *
		from dbo.FactOnlineSales

-- Select Object_ID for the InMemory OLTP Table for using in the following statement for Tail Row Group compression:
select object_id('dbo.FactOnlineSales_Hekaton');

-- Substitute the object_id with your own from the previous statement
exec sys.sp_memory_optimized_cs_migration @object_id = 926626344

With the help of my CISL free library, you should have very similar results for the output of the dbo.cstore_GetRowGroups and the dbo.cstore_GetRowGroupsDetails functions, giving an overview and the details on the available Row Groups – 1 Million Rows that all stored inthe compressed Row Group, while the Tail Row Group (Delta-Store) is currently empty.

EXEC dbo.cstore_GetRowGroups @tableName = 'FactOnlineSales_Hekaton';
EXEC dbo.cstore_GetRowGroupsDetails @tableName = 'FactOnlineSales_Hekaton';

Since we have data successfully loaded into the InMemory Columnstore Index, let us discover where it is stored – and for that purpose, we can use the sys.dm_clumn_store_object_pool DMV, that was introduced in the SQL Server 2016 – and I will simply ask to return the names of the tables, with the data in the Columnstore Object Pool:

select DISTINCT object_name(object_id) as TableName
	from sys.dm_column_store_object_pool
	WHERE object_name(object_id) LIKE 'FactOnlineSales%'

The results are quite clear – this DMV reports that only On-Disk Columnstore Index from the table FactOnlineSales are represented in this DMV. The FactOnlineSales_Hekaton table has nothing stored within it – giving the very first impression that the In-Memory Columnstore Index is using In-Memory structures, but to confirm it we shalle need to make a couple more steps.

Running CISL cstore_GetMemory function with the parameter @tableName = ‘FactOnlineSales_Hekaton’ produces no results, meaning that the DMV sys.dm_os_memory_cache_entries contains no information on the Columnstore structures for this table.

Let’s dive into the sys.dm_db_xtp_memory_consumers and the sys.memory_optimized_tables_internal_attributes DMVs to discover if the In-Memory structures are holding the information on the In-Memory Columnstore Indexes structures:

SELECT
  QUOTENAME(SCHEMA_NAME(o.schema_id)) + N'.' + QUOTENAME(OBJECT_NAME(moa.object_id)) AS 'table',
  i.name AS 'columnstore index',

  cast(SUM(mc.allocated_bytes) / 1024. / 1024 as Decimal(9,3)) as [allocated_Mb],
  cast(SUM(mc.used_bytes) / 1024. / 1024 as Decimal(9,3))  as [used_Mb]
FROM sys.memory_optimized_tables_internal_attributes moa
   JOIN sys.indexes i ON moa.object_id = i.object_id AND i.type in (5,6)
   JOIN sys.dm_db_xtp_memory_consumers mc ON moa.xtp_object_id=mc.xtp_object_id
   JOIN sys.objects o on moa.object_id=o.object_id
WHERE moa.type IN (0, 1, 2, 3, 4)
GROUP BY o.schema_id, moa.object_id, i.name;

SELECT
  QUOTENAME(SCHEMA_NAME(o.schema_id)) + N'.' + QUOTENAME(OBJECT_NAME(moa.object_id)) AS 'table',
  i.name AS 'columnstore index',
  moa.type_desc AS 'internal table',
  mc.index_id AS 'index',
  mc.memory_consumer_desc,
  cast(mc.allocated_bytes / 1024. / 1024 as Decimal(9,3)) as [allocated_Mb],
  cast(mc.used_bytes / 1024. / 1024 as Decimal(9,3))  as [used_Mb]
FROM sys.memory_optimized_tables_internal_attributes moa
   JOIN sys.indexes i ON moa.object_id = i.object_id AND i.type in (5,6)
   JOIN sys.dm_db_xtp_memory_consumers mc ON moa.xtp_object_id=mc.xtp_object_id
   JOIN sys.objects o on moa.object_id=o.object_id
WHERE moa.type IN (0, 1, 2, 3, 4)

To make sure I was seeing everything correctly, I went looking into the memory_clerks:

SELECT type  
     , name 
     , DB_NAME(TRY_CONVERT(int, substring(name,7,2))) as DbName 
     , memory_node_id  
     , pages_kb/1024 AS pages_MB   
   FROM sys.dm_os_memory_clerks WHERE type LIKE '%xtp%';

At this point I thought the following – let’s waste it totally, by adding the 10 million rows and it will prove that the In-Memory … But then I have looked at the number above and realised that we were already above the announced limits of 352 MB…

But I thought – who cares and so here we go:

insert into dbo.FactOnlineSales_Hekaton
	select top 10000000 *
		from dbo.FactOnlineSales;
SELECT
  QUOTENAME(SCHEMA_NAME(o.schema_id)) + N'.' + QUOTENAME(OBJECT_NAME(moa.object_id)) AS 'table',
  i.name AS 'columnstore index',

  cast(SUM(mc.allocated_bytes) / 1024. / 1024 as Decimal(9,3)) as [allocated_Mb],
  cast(SUM(mc.used_bytes) / 1024. / 1024 as Decimal(9,3))  as [used_Mb]
FROM sys.memory_optimized_tables_internal_attributes moa
   JOIN sys.indexes i ON moa.object_id = i.object_id AND i.type in (5,6)
   JOIN sys.dm_db_xtp_memory_consumers mc ON moa.xtp_object_id=mc.xtp_object_id
   JOIN sys.objects o on moa.object_id=o.object_id
WHERE moa.type IN (0, 1, 2, 3, 4)
GROUP BY o.schema_id, moa.object_id, i.name;

SELECT
  QUOTENAME(SCHEMA_NAME(o.schema_id)) + N'.' + QUOTENAME(OBJECT_NAME(moa.object_id)) AS 'table',
  i.name AS 'columnstore index',
  moa.type_desc AS 'internal table',
  mc.index_id AS 'index',
  mc.memory_consumer_desc,
  cast(mc.allocated_bytes / 1024. / 1024 as Decimal(9,3)) as [allocated_Mb],
  cast(mc.used_bytes / 1024. / 1024 as Decimal(9,3))  as [used_Mb]
FROM sys.memory_optimized_tables_internal_attributes moa
   JOIN sys.indexes i ON moa.object_id = i.object_id AND i.type in (5,6)
   JOIN sys.dm_db_xtp_memory_consumers mc ON moa.xtp_object_id=mc.xtp_object_id
   JOIN sys.objects o on moa.object_id=o.object_id
WHERE moa.type IN (0, 1, 2, 3, 4)

Let’s check on our memory clerks, this time including the Columnstore Object Pool as well:

SELECT type  
     , name 
	 , DB_NAME(TRY_CONVERT(int, substring(name,7,2))) as DbName 
     , memory_node_id  
     , pages_kb/1024 AS pages_MB   
   FROM sys.dm_os_memory_clerks
    WHERE type LIKE '%xtp%'  OR type LIKE '%columnstore%'
   ORDER BY type


5 GB for the In-Memory on SQL Server 2017 Express Edition, right …

Running queries against the In-Memory Columnstore table will not push the data into the Columnstore Object Pool and this is my number one argument for confirming that the In-Memory Columnstore is not using the Columnstore Object Pool.

The Final Thoughts

Either this is a bug of the DMVs and the Columnstore Object Pool is the real user of the In-Memory Columnstore Indexes (makes no sense, seriously dangerous, because 10 databases would waste 352 MB limit in a second and then the In-Memory Columnstore would not be an In-Memory technology),
or
(most probably) Houston, there is a huge bug in the Express Edition of SQL Server 2017 right now and the In-Memory Columnstore is stored in the In-Memory.

to be continued with Columnstore Indexes – part 120 (“Merge Replication 2016-2017”)

Leave a Reply

Your email address will not be published. Required fields are marked *