Clustered Columnstore Indexes – part 39 (“Memory in Action”)

Continuation from the previous 38 parts, starting from http://www.nikoport.com/2013/07/05/clustered-columnstore-indexes-part-1-intro/

In this post I shall try to discover more interesting information about Columnstore Object Pool behaviour and it’s relations to the total size of the available memory as well as the relationship to the Buffer Pool.

For the start, I will need a freshly restored copy of ContosoRetailDW with every major blocker (PKs, FKs) dropped.

Now let us create Clustered Columnstore Indexes on our 5 Fact Tables:

Columnstore Object Pool & Max Server Memory:

Since SQL Server 2012 we have a better control over the total amount of memory available to SQL Server, because ‘max server memory (KB)’ option is including all memory structures and not just the Buffer Pool as for SQL Server 2008 R2 and previous versions.

For the start, let’s try to see the maximum size for the Columnstore Object Pool depending on the amount of memory available to SQL Server.
We can set the max server memory to 512, 768 and 1024 MB and see how it will function: we also need to drop all the occupied Buffers as well as the Proc Cache.

I will read export the data from my 5 Fact tables into the same file C:\Install\ExportData.csv, but in the process I will make SQL Server read the whole content of my tables with Clustered Columnstore Index:

To monitor the results I will query sys.dm_os_memory_cache_counters DMV:

Even though the total amount of space, occupied by the Columnstore Segments, Dictionaries & whatever else is there that we do not know yet :), is around 400 MB, we can’t reach that number and for the 512 MB of RAM configured the maximum number of space occupied that I have managed to achieve was 147528 KB.
This corresponds around 30% of the total amount of memory, and it seems to be quite logical that SQL Server is reserving enough space for the Buffer Pool, CLR, Plan Cache, etc – we can’t spend everything on Columnstore Indexes even though we might like it 🙂

What if I increase the max server memory to 768 MB ?
This time my result was 279760 KB – seems that we have more than doubled the available space for the Columnstore Object Pool just by growing the total amount of memory in 50%.

When I increase the max server memory to 1GB running the scripts above gets me 433472 KB, which corresponds to to over 40% of the total and actually increases the maximum available space even further than the first increase from 512MB to 768MB did.

I was wondering if this was the real limit, and so I decided to copy FactOnlineSales data onto itself (12.6 Million Rows * 2), in order to reach the limit of the Columnstore Object Pool for 1GB of configured max server memory:

I shall finish the load process with reorganize statement that shall force to close any open Delta-Stores:

I ran the exporting script again, hoping to get even bigger numbers, above 50%, but to my surprise independently from the number of attempts I was never able to get that far:

Even more interesting was the fact that when I monitor sys.dm_os_memory_broker_clerks DMV I can clearly see the Buffer Pool growing up and then decreasing while reading information from Columnstore Indexes.

In the end I have managed to reach 477184 KB, which is almost about at my target, but it was not on the first attempt and they key here was absolute silence with not even a single other query touching Buffer Pool. The final result after the end of the script execution was ~470 MB for Columnstore Object Pool and 94 MB for the Buffer Pool.

Columnstore Object Pool Max Size 1024I have joined the results from 3 configuration in order to get a better visualisation and analysis.

These tests are fine for a blog post, but in the real life none runs a single query against a database (especially based on a single table), and the tables most probably will represent a mixture from Columnstore and Rowstore storage types, but they show how far Columnstore Object Pool can potentially develop – way over some 20-30% but of course it should never reach over 80% of the total memory.

Memory Pressure (Columnstore Object vs Buffer Pool)

I restored the original ContosoRetailDW database before proceeding, setting the total memory to 8GB before the operation and resetting it back to 512MB after

Previous test was very synthetic, because it was touching Columnstore Object Pool almost exclusively, without any fight for the resources between them.
Now it is the time to show a little bit more complex scenario –
I will start by issuing a complex query that while supposed to be using Columnstore Object Pool heavily, in practice works much more with a Buffer Pool, because of all the Table Spool operations and consecutive Row Execution Modes on the Hash Match (Aggregate) operators.

It takes more than 20 seconds on my test VM and it’s complexity goes to some 11739271 logical reads from the Worktable. (Notice that I have restored the ContosoRetailDW)

Let us check out the undocumented DMV sys.dm_os_memory_broker_clerks, it will give us an update on the current status of Buffer Pool and Columnstore Object Pool:

dm_os_memory_broker_clerks_query_1Well, well, well – we have so much interesting information here:
– first of all we see just 6.7 MB of space occupied by Columnstore Object Pool and a very fat part with around 150MB being occupied by a Buffer Pool.
– secondly, internal_freed_kb columns shows us that 1.574.368 KB were processed and freed from the Columnstore Object Pool.

Ok, and what about running a simple query that should increase Columnstore Object Pool greatly to help it reach the same size as Buffer Pool potentially (FactOnlineSales occupies more than 150MB):

dm_os_memory_broker_clerks_query_2 Surprise! We have increased our Columnstore Object Pool only by 10 MB apparently, while Buffer Pool has lost just 3 MB. This is not what I expected. There are more questions here than any possible answers:
– We have definitely read more than 10 MB, why SQL Server did not fill out Columnstore Object Pool with them ? This looks like a priority issues between Buffer Pool and Columnstore Object Pool, where Columnstore Object Pool is clearly the loosing side.

Ok, maybe my previous query was something that SQL Server did not like, let’s export the whole table into a CSV:

dm_os_memory_broker_clerks_query_3This time we have jumped to 44MB with Columnstore Object Pool while loosing 2MB from Buffer Pool – the only idea that I get from these observations is that Buffer Pool has a higher priority than Columnstore Object Pool and lives in its own space, which is quite apart from the Columnstore Object Pool.

With this all said, let’s try to expand Columnstore Object Pool, by executing the data reading from Columnstore table in a cycle:

dm_os_memory_broker_clerks_query_4This does not seem to help a lot – Buffer Pool does not give it’s memory back easily, even though there are no new queries requesting it (besides system DMVs request and all the stuff that SSMS issues). Columnstore Object Pool has increased to 52 MB and Buffer Pool went down to 100MB.

What if I issue a single query that should increase Buffer Pool size:

dm_os_memory_broker_clerks_query_5We have increased our Buffer Pool by ~3MB and nothing else happened.

Let’s force some more information into Buffer Pool 🙂

dm_os_memory_broker_clerks_query_6In my opinion there is no doubt that Buffer Pool consumes Columnstore Object Pool space with much ease, what is not happening in other direction.

4096 MB:
Running the same workloads with 4096 MB RAM produces different results, we definitely have more space to expand and here until some certain point Buffer Pool gives space to Columnstore Object Pool without a lot of hustle.
Around 400 MB things start to get more complex and Buffer Pool becomes one greedy creature and starts to slow down memory

I wonder, is there some internal threshold set on ~10% that Columnstore Object Pool can easily occupy before it starts some serious fight with Buffer Pool for every consecutive MB ?
In the case of 512 MB RAM configuration my troubles were starting at 50-60 MB, which corresponds this threshold.

Thoughts

It makes a lot of sense to implement higher priority for Buffer Pool, but I have some concern on the minimum space for Columnstore Object Pool and how in practice it will function – nobody wants to read everything from the disk, especially if the technology is being labelled as a “In-Memory” one.

While we are able to reach almost 50% of the space with Columnstore Object Pool it seems that the there are some serious limitations once we have some data in the Buffer Pool (ha! we always have data there) and in practice I believe that a given SQL Server instance with Columnstore Indexes should have between 10% and 20% of the total memory occupied.

Having a lot of memory in these conditions might be really essential for a good performance and as Geoff Patterson suggested in the previous article comment having better control over it might be a very good idea.

to be continued with Clustered Columnstore Indexes – part 40 (“Compression Algorithms”)

2 thoughts on “Clustered Columnstore Indexes – part 39 (“Memory in Action”)

Leave a Reply

Your email address will not be published. Required fields are marked *