There is a world-famous “optimize for ad hoc workloads” Server Configuration Option in SQL Server since the version 2008 (meaning it is well over 10 years old), which is a great help for the to clean the single-use plan cache bloat as Kimberley Tripp so eloquently wrote in the article Plan cache, adhoc workloads and clearing the single-use plan cache bloat.
I confess to have never seen a server where I could not activate this option so far – I use it everywhere (oh and I wish it would be activated by default …)
This option is so ever popular and needed that Microsoft is bringing it into the Database Scoped Configurations for Azure SQL Database and SQL Server 2019, as I blogged in Database Scoped Configurations in 2018, yahoo!
This rather short blogpost is about a feature I would love to have in Azure SQL Database and SQL Server – “Optimize for Analytics”. I would like to have this feature just like the “Optimize for ad hoc workloads” configuration option – on the server as well on the database level.
Which functionalities do I imagine being hidden under this configuration? Here are some of the basic suggestions:
– -E configuration, that will help extent allocation for the File Groups, as written in the good old Data Loading Performance Guide from the year 2008.
– Query Optimiser huge preference for the Hash Joins, whenever we are working with larger sets of data (think Batch Execution Mode for Rowstore) and/or when we are working with Columnstore Indexes. What do I mean by a huge preference? Whenever Query Optimiser is doing a cost-based selection of the one of the possible execution plans, it should assign a significantly higher value for the Hash joins, over the Merge and Inner Loop joins – thus helping us to nail down the problem of the sudden Merge & Inner Loop joins, which are killing the performance of the Batch Execution Mode.
– Bigger preference for the Batch Execution Mode for the Rowstore tables. As the data amounts grow bigger I see the eventual marginalisation of the Row Execution Mode and I would love to be able to influence the choice as described in Batch Mode – part 3 (“Basic Heuristics and Analysis”)
– Bigger preferences for the Scans and Prefetching for the Storage Engine. Every time I scan the whole table for the operation I wish to indicate the Storage Engine that this it truly can throw even more resources at the operation, to dive more recklessly (as in opposite of over carefully) into the prefetching, when I am running a Data Warehousing environment.
– Bigger Columnstore Object Pool allocation for the server. Yes, I am admitting to an option to force its bigger size on the server size.
– Elevate Online Resumable Columnstore Indexes creation and rebuild. Indeed, the feature itself is still missing (and I am critically upset by this fact, since it is not the Rowstore Indexes that hold bigger amount of data, but the Columnstore ones), but once it will be implemented I really would love the default Columnstore Index build and rebuild operations to be resumable. This is what non-technical people expect from a modern database system.
– Improve the actual plans. I have shared this old idea of mine with the Query Optimiser team last November that I would love to be able to allocate some of the processing capacity of the server at the configurable moments (a lot of times server has a clearly identifiable workloads and stale times are quite easily to identify for maintenance and ETL tasks) for finding better execution plans. As we know the execution plans are analysed in an extremely fast fashion, most of the time being totally invisible for the final user who is firing the query. What if we could say to the server – use your capacity for the analytics and find me some better plans that are in cache, so the next time/day they are used, they can get an advantage of the more complete analysis of let’s say 2 minutes instead of 50 milliseconds. This is totally something that needs a mind that is inside the matter and is absolutely brilliant to find the wholes in my idea (such as Gjorgji Gjeorgjievski‘s or Conor Cunningham‘s).
There must be others that I do not have in my mind right away, but I am sure that more intelligent and experienced people will contribute with valuable suggestions and ideas and for that purpose I have opened a Connect (oops!) Azure Feedback item here – “https://feedback.azure.com/forums/908035-sql-server/suggestions/36454111-optimise-for-analytics-configuration-option“.
This configuration would also be a place holder for any other future enhancements, giving an opportunity to “subscribe” for the Data Warehousing enhancements for the regular users and DBA’s, without too much of the knowledge and/or research, since the Trace Flags are still a no-go for the most people, even if they are well documented and recommended.