Contact me

If you wish to contact me, the best way to do it is to write here. :)

108 thoughts on “Contact me”

  1. Priyadarshi Alok

    I have Visual Studio 2008 (version 9) and .NET Framework 4 on a 64-bit Windows 7 operating system.

    As per the Google Analytics installation steps:

    I have installed the .dll files in the folder mentioned below using Gacutil.exe.
    All the DLLs are installed correctly in the following location:

    C:\Windows\Microsoft.NET\assembly\GAC_MSIL

    DotNetOpenAuth.dll
    Google.Apis.Analytics.v3.dll
    Google.Apis.Authentication.OAuth2.dll
    Google.Apis.dll
    Newtonsoft.Json.dll
    SSISComponents.Dts.Pipeline.GoogleAnalyticsSource.dll
    Zlib.Portable.dll

    And then I copied all of these DLLs into the following folder:

    C:\Program Files (x86)\Microsoft SQL Server\100\DTS\PipelineComponents

    and refreshed the SSIS toolbox using the command:

    C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE>devenv /ResetSettings

    But the GoogleAnalytics Source is still not visible in my SSIS Toolbox pane.

    Please help me find which installation step is missing.

  2. Tom

    Hi Nico,
    I'm confused about when to use the MAXDOP option when creating or dropping a CCI.

    CREATE CLUSTERED COLUMNSTORE INDEX [IndexName] ON [TableName]
    WITH (DROP_EXISTING = OFF , MAXDOP=1 )

    DROP INDEX [IndexName] ON [TableName] WITH ( ONLINE = OFF, MAXDOP = 1)

    Question 1) Do I get the same results in the end if I use MAXDOP=1 or MAXDOP=8, or don't use MAXDOP at all?
    I heard somewhere that using MAXDOP=1 when creating a CCI gives better clustering.

    Question 2) Do you want better Clustering to get better query performance?
    Question 3) Performance Issues with MAXDOP=1.

    If I use the MAXDOP=1 option when creating or dropping a CCI, it takes a long time to process:
    6 hours to create the CCI with MAXDOP=1, and 2 hours to drop the CCI with MAXDOP=1.

    If I use the MAXDOP=8 option when creating or dropping a CCI, the process is much faster than with MAXDOP=1:
    1 hour to create the CCI with MAXDOP=8, and 25 minutes to drop the CCI with MAXDOP=8.

    If my SQL Server can handle the CCI create and drop with MAXDOP=8 or more, should I be using a higher MAXDOP?

    Thanks for your help

    1. Niko Neugebauer Post author

      Hi Tom,

      First of all, you will need to use DROP_EXISTING = ON while re-creating a Columnstore Index over the clustered rowstore one.
      Do not drop the Columnstore Index, it makes no sense. Re-create a rowstore index with DROP_EXISTING = ON, ordering the data on the most frequently used column, and then re-create the Columnstore Index with DROP_EXISTING = ON.
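      A rough sketch of that sequence (the table, column and index names are placeholders; MAXDOP = 1 helps preserve the ordering):

      CREATE CLUSTERED INDEX CI_FactSales ON dbo.FactSales (OrderDateKey)
          WITH (DROP_EXISTING = ON, MAXDOP = 1);
      -- the rowstore index above is only a vehicle for sorting the data;
      -- now replace it with the Columnstore Index, keeping the same index name
      CREATE CLUSTERED COLUMNSTORE INDEX CI_FactSales ON dbo.FactSales
          WITH (DROP_EXISTING = ON, MAXDOP = 1);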

      1. You will find the detailed explanations here on the Segment Clustering: http://www.nikoport.com/2014/04/16/clustered-columnstore-indexes-part-29-data-loading-for-better-segment-elimination/
      2. Yes, it will help to get better segment elimination and thus will allow SQL Server to read & process less data while getting the same results.
      3. MAXDOP = 1 is slow.
      Using MAXDOP = 8 makes your server use 8 cores, so it should be much faster … :)

      Please read this article http://www.nikoport.com/2014/04/16/clustered-columnstore-indexes-part-29-data-loading-for-better-segment-elimination/
      and let me know if something is still not clear.

      Best regards,
      Niko Neugebauer

  3. Mauricio Orellana

    Hello Niko
    a question: the SSRS Report Generator 1.8 task apparently does not support exporting to the Excel .xlsx format.

    Any advice to achieve this goal?

    Stay tuned.
    thanks.-

  4. Ramya

    Hi Niko,

    I am facing a problem with a simple select query which involves three tables with clustered columnstore indexes on them. Every 15 minutes an ETL load runs, and one of the three tables gets truncated each time before new rows get inserted. This leaves a row group open and the query becomes too slow.

    It runs fine if I rebuild the index and update the statistics on the tables.
    The problem is that we are unable to run an index rebuild and update statistics every 15 minutes. I wanted to know your thoughts on this, and whether there is any way this problem can be avoided.
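    For reference, what runs fine today but is too heavy to repeat every 15 minutes is roughly the following (the table and index names are placeholders for our real ones):

    ALTER INDEX CCI_FactTable ON dbo.FactTable REBUILD;
    UPDATE STATISTICS dbo.FactTable;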

    Thanks!

  5. Dale Wilson

    Hi Niko,

    I attended your Columnstore Indexes – from basics to optimised analytics at SQLBits.

    Please could you send me your slides from the training day?

    Many thanks,

    Dale

  6. Nadir

    Hi Niko,
    I also attended your precon session at SQLBits, could you share the slides and the scripts please!
    Your help is much appreciated!
    Thanks

  7. Nadir

    Hi Niko,
    I also attended your precon session at SQLBits 2016, could you share the slides and the scripts please!
    Your help is much appreciated!
    Thanks

  8. anil kumar

    Hello Niko,

    Hope you are doing great !

    I couldn't submit my comment on that blog post because of some issues. Can you please help me with the questions below?

    As you mentioned in part-38 “I understand that at the moment when we are reading Columnstore data from the disk, it is being read directly into Columnstore Object Pool without decompression, and then all the respective pages from the Segments are decompressed into Buffer Pool for returning the result.” – 1) Does it mean Buffer Pool Extension can store decompressed Columnstore data in SQL Server 2016 Standard edition ?

    2) How to restrict the memory allocation to columnstore objects (Delta-Stores + Deleted Bitmaps + Decompressed columnstore data) in buffer pool ?

    3) The Columnstore Object Pool can be up to 32 GB in SQL Server 2016 Standard Edition; what is going to be the impact of that, given that most of the stuff is stored in the buffer pool?

    4) Are CCI and NCCI rebuild and reorganize operations performed in the buffer pool rather than in the Columnstore Object Pool?

    Looking forward to your response. Thank you.

    1. Niko Neugebauer Post author

      Hi Anil,

      answering your questions:
      1) yes. But avoid Buffer Pool Extension for Columnstore Indexes right now, since it is focused on the OLTP scenarios. All operations are done on the page (8KB) level.
      2) There is no way to control that.
      3) It will depend on the actual data. The cap is the maximum value, it does not mean that your workload will strive to achieve that.
      4) yes, to my understanding.

      Best regards,
      Niko

  9. Joao Caxias

    Hey Niko,
    First, congratulations on all the work and knowledge sharing. Truly appreciated.
    In a scenario like this:
    SQL Server 2016 on-premises, Git (Visual Studio) with the full SQL structure (…create tables…) – how would I change a normal rowstore CREATE TABLE to a columnstore one and have it correctly deployed every time?
    Thank you

    1. Niko Neugebauer Post author

      Hi João,

      Sorry for the delay – you caught me on vacation.
      In a Test or DEV environment, create a Clustered Columnstore Index on the table (you might consider using CISL for that) and start experimenting and observing.
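      A minimal sketch of that first step (the table and index names are placeholders; if the table already has a rowstore clustered index, re-create it by name WITH (DROP_EXISTING = ON) instead):

      CREATE CLUSTERED COLUMNSTORE INDEX CCI_MyTable ON dbo.MyTable;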

      Best regards,
      Niko

  10. Marcus Nugroho

    Hi Niko,

    I read your blogs regarding operational analytics using memory-optimized tables + clustered columnstore (CCI). The workload is as follows:
    – 50 million rows on the table initially (size is about 50 GB)
    Run simultaneously:
    – 300 user insertion process –> 3000 TPS
    – 500 user query (simple, medium, complex)

    We use an Azure VM (32 cores, 448 GB RAM).
    The resource pool is set to 90% max_memory and 90% max CPU; the memory grant is 25% for each user.

    Issue: we get an error for some queries, while some complete successfully.
    There is insufficient system memory in resource pool ‘default’ to run this query.

    When we run with 10 insert users and 30 query users, there is no error.

    Do you have any advice regarding this error? What settings should I try before testing again?

    Thank you

    1. Niko Neugebauer Post author

      Hi Marcus,

      check your memory requirements for the queries – you might need to increase the overall memory, or use Resource Governor to restrict the amount of workspace memory given to a single query.
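      For example, a sketch of the Resource Governor approach (the 10% value is just an illustration, not a recommendation):

      ALTER WORKLOAD GROUP [default] WITH (REQUEST_MAX_MEMORY_GRANT_PERCENT = 10);
      ALTER RESOURCE GOVERNOR RECONFIGURE;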

      Best regards,
      Niko

  11. Ramiro

    Hi nico

    I was able to see your presentation today about columnstore killing the OLAP star. You were talking about the need to model your data for reporting and analysis as a star schema.

    What is your opinion about that: with columnstore now, do you still see the need to store data as a star schema, or more as a natural relational database?

    Thanks

  12. Adam Wenger

    Good afternoon Niko,

    Great information here! I was wondering whether you know if future versions of SQL Server will support Clustered Columnstore Index MIN/MAX segment elimination or predicate pushdown for the DatetimeOffset data type?

    We're currently using Datetimeoffset(4), but it looks like we'd need to go to Datetime2(4) to get the segment elimination we want for our date queries.
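    For context, the change we are weighing is essentially the following (the table and column names are placeholders, and it would have to happen before the columnstore index is created, or with the index dropped and re-created around it):

    ALTER TABLE dbo.FactSales
        ALTER COLUMN EventDate datetime2(4) NOT NULL;   -- restate the column's existing nullability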

    Thanks again!

    1. Niko Neugebauer Post author

      Hi Adam,

      I have no such information …
      I see it as a bigger decision, which is probably not going to be taken any time soon.
      DatetimeOffset(4) occupies 9 bytes and this would imply feature support for types larger than 8 bytes, which I believe is not on the highest priority list right now …

      Best regards,
      Niko

  13. Gonzalo Bissio

    Hi Niko!
    I'm testing your maintenance solution… but I got one issue. On our environment we have both partitioned and non-partitioned clustered columnstore indexes. The thing is, if I use @usePartitionLevel=1, the script shows an error saying that I cannot use the partition clause on a non-partitioned index. I'm executing this in a UAT environment (because we have real data on that env). The problem is that if I use @usePartitionLevel=0, I'll be running the maintenance on the whole partitioned table… So, is this expected? What would your recommendation be here? By the way, excellent script!!
    Regards!

  14. Doug Valdez

    Performance benefits of a Clustered Columnstore Index vs an In-Memory Columnstore Index?

    If the server has enough memory for the whole DB to reside in memory, will an In-Memory CCI offer much benefit over a normal CCI? (The server has 2 TB of memory and the DB size is 1 TB.)

    Thanks

    1. Niko Neugebauer Post author

      Hi Doug,

      Your DB size probably already includes compression, so I would be careful about thinking that In-Memory will take care of it all. In-Memory is not a solution for pure memory access; it was designed to tackle locking and latching problems for concurrent access.
      Take a very detailed look at the In-Memory limitations before going that way.

      Regarding the CCI location: no, I do not think there are any compelling arguments for choosing an In-Memory CCI over a CCI just on the basis of memory access. Limitations such as no cross-database access for In-Memory are a major pain, an In-Memory CCI requires you to have a nonclustered PK anyway and there is no compression, plus you won't be able to use the Archival compression, etc …
      Nope.
      Choose an In-Memory CCI when you are already working with In-Memory and need to run aggregations on it in near real-time.

      Best regards,
      Niko

  15. René Schell

    Hi Niko,
    at yesterday's PASS meeting in Coblenz you mentioned that you could provide us with a list regarding "in which cases which index should be used".
    We would appreciate it if you'd send us this "checklist".

    Kind regards and have fun in the USA
    René

  16. Karim T

    Hi Niko,

    We don't have a DW but an OLTP database with some tables of over 300M records. In order to use the benefits of the SQL 2016 batch mode Window Aggregate operator, I tried to add a fake columnstore index to the bigger tables,
    e.g. CREATE NONCLUSTERED COLUMNSTORE INDEX idx_cs_dummy ON dbo.TRADE_BATCH(iBatchNo) WHERE iBatchNo = -1 AND iBatchNo = -2
    Suddenly the application became extremely slow, and we had issues with tempdb. We noticed that whenever any user/process attempted to add records to a temp table, it blocked other processes/users.
    Once I dropped all the columnstore indexes, everything started running fast again, with no tempdb blocking.

    I also read that you use another tip: a LEFT JOIN on a dummy table that has a columnstore index,
    LEFT JOIN dbo.dummyTable ON 1 = 2 — but this requires code changes to every query.
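    For reference, the dummy-table variant of that trick looks roughly like this (the object and column names other than TRADE_BATCH/iBatchNo are placeholders):

    CREATE TABLE dbo.dummyTable (dummy_id int NOT NULL);
    CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_dummyTable ON dbo.dummyTable (dummy_id);

    SELECT t.iBatchNo, COUNT(*) AS cnt
    FROM dbo.TRADE_BATCH AS t
    LEFT JOIN dbo.dummyTable AS d ON 1 = 2   -- never matches, only there to unlock batch mode
    GROUP BY t.iBatchNo;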

    I'm not sure where the problem lies. Maybe you have run into a similar situation.

    Any help will be greatly appreciated.

    Thanks,
    Karim

    1. Niko Neugebauer Post author

      Hi Karim,

      this would require looking at the execution plans.
      There are many elements that can potentially go wrong – skewing of the parallelism (often), sorting (because of the memory grants – frequently on Hybrid Systems), etc

      Best regards,
      Niko

  17. Carina

    Hi Niko,
    I attended your session on Worst Practices & Less Known Limitations for Columnstore Indexes at SQLSaturday Vienna in January 2018.
    There you mentioned that in order to get rid of the deleted_rows in the segments it is necessary to rebuild the index, and that a reorganize is not enough.
    Now I found some Microsoft statements (https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-defragmentation?view=sql-server-2017) that starting from SQL Server 2016 also reorganize removes deleted rows.
    Am I missing something, or could I now also reorganize my indexes on a daily basis in order to lower fragmentation?

    Thanks,
    Carina

    1. Niko Neugebauer Post author

      Hi Carina,

      there are quite a number of things that won't allow the total removal of the deleted rows from the Deleted Bitmap:
      – imagine every Row Group has around 550,000 rows – you won't be able to merge them
      – imagine you have memory pressure or dictionary pressure – read here for more: Columnstore Indexes – part 113 ("Row Groups Merging Limitations") – http://www.nikoport.com/2017/09/24/columnstore-indexes-part-113-row-groups-merging-limitations/
      etc.

      Bear in mind that reorganisation of the columnstore without TF 6404 can lead to disastrous memory grants on bigger machines (http://www.nikoport.com/2017/05/30/columnstore-indexes-part-106-memory-requirements-for-rebuild-reorganize/).

      The only guarantee of a total cleanup of the deleted structures (bitmaps, buffers, row tables) is the rebuild.
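      As a quick reference, the two maintenance commands look like this (the index and table names are placeholders):

      -- reorganize: online, and since SQL Server 2016 it also merges/compresses Row Groups,
      -- but it cannot always remove all the deleted rows
      ALTER INDEX CCI_FactSales ON dbo.FactSales REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);

      -- rebuild: the only operation that guarantees the full cleanup of the deleted structures
      ALTER INDEX CCI_FactSales ON dbo.FactSales REBUILD;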

      Best regards,
      Niko

  18. Henning Prigge

    Hello Mr. Neugebauer,

    I attended the SQL PASS conference in February, where I visited the Columnstore PreCon Day.
    Would it be possible to get the slides? I could not find them on the PASS website.

    Best regards,
    H. Prigge

  19. Beth

    Niko, whenever I try to open post Columnstore #87, I am redirected to a scam website. One of us has been hacked. Is it you or is it me?

  20. Vernon

    Hi Niko

    Do you know where I can get the files or installer for the SSIS Analytics Source for SQL 2012?
    I've searched everywhere for it; I don't see it on https://archive.codeplex.com/?p=analyticssource nor on http://ssis-components.net/

    We have some legacy SSIS packages that use this component so we can analyze Google Analytics data and integrate it with other data, but we have moved them to a new computer and don't have the files to set it up again.

    Your help would be very much appreciated!

    Regards
    Vernon

  21. Angel Martinez

    Hi Niko, we recently attended your workshop on columnstore indexes in Madrid on 22 October 2018. Could you please share the slides and T-SQL scripts?

    Thanks in advance!!

    Best regards

  22. Dave

    Niko, I noticed you're giving back in the new year with 1-hour sessions. That's a very generous thing for you to do! I was curious whether you'd be able to help us with some tips and tricks using columnstore indexes. We've been migrating off Sybase IQ for some time now and each database has its challenges, mostly because they shouldn't have been put on IQ to begin with! We've got a 140 TB IQ database in one data space in IQ and are looking to migrate it to SQL Server 2017. We thought we'd follow the usual approach of 1 database per year, partitioned by day, and make it columnar to compress the best we could. We've also got analytics databases that are 40+ TB in size and are looking for help!

    Thanks in advance,
    Dave

  23. Brad Lynch

    Hello Niko,
    I work at Mercy Corps; our 5,000+ staff in over 40 countries work to alleviate suffering, poverty and oppression by helping people build secure, productive and just communities. https://www.mercycorps.org/about-us

    As a resource-constrained non-profit that raises funds for beneficiaries and must be prudent yet innovative with how we spend those dollars, we have to be thoughtful about what we choose to invest in.

    On our fundraising team, we have a very traditional, Ralph Kimball-based Data Warehouse. We primarily use SSIS, SSRS & SSAS to service and deliver these data assets.

    What is the primary use case of ColumnStore over traditional row store?

    How could Columnstore help us?

    For example, we have fundraising data that we are hesitant to allow to scale because we don’t fully comprehend how to handle those data stores efficiently.

    If we have a row store table in the ~110 million row range and we would like to blow it up to a billion rows, could this be a good candidate use case for ColumnStore? Is this even the right question for this technology?

    Would we look to ColumnStore to help handle this large table, speed up ETL into and out of it (and updating of it) as well as general queries against it?

    We would gather at least 3 people from across our tech team for your 1-hour call. We feel this would greatly help us frame this technology in our current environment and how we could place in the context of future opportunities that we want to go after.
    Thank you very much for your consideration and for giving back to the community!

    Brad Lynch
    Data Warehouse Developer, Mercy Corps
    Portland, OR

    1. Niko Neugebauer Post author

      Hi Brad,

      thank you for the contact – I am truly honoured to be of help.
      Even though the decision will officially only be made in early January 2019, in your case you can definitely start picking the month in which you would prefer to have my hour.

      Best regards,
      Niko

  24. Christo Kutrovsky

    Hi Niko, I want to first thank you for taking up column store as your topic for a deep dive. I’ve read all your blogs on the subject. I am building a mixed use data warehouse that relies heavily on column store for performance.
    Your blog has been key to giving me confidence in the technology. As long as there are people experimenting and blogging about it – one has a chance of understanding what’s going on when you start hitting the edge cases.
    I come from the world of Oracle databases, including Oracle Exadata, hybrid columnar storage, parallel query performance, compression, etc. I've presented many times on memory management, performance and other topics.

    Recently I ran into a problem where a query that typically completes in 45 ms was taking 8,000+ ms to complete, with the primary culprit being ROWGROUP_VERSION waits due to a full scan on a clustered columnstore index. I do use snapshot isolation. I couldn't figure out what was going on and why we have this wait one day and not another. I understand it has to do with transactions, but I don't understand why, nor the exact conditions under which it manifests itself.

    I saw you have a presentation on the subject – https://sqlbits.com/Sessions/Event18/Columnstore_Indexes_What_are_we_waiting_for_ . Do you have a recording, or maybe at least the slides? I am afraid this ROWGROUP_VERSION wait will show up elsewhere… and I want to understand when it happens, so I can perhaps design around it.

    It looks like you are offering 1 hour calls – I would love to be able to connect with you and explore this further. I can provide more details on what I am working on at that time.

    1. Niko Neugebauer Post author

      Hi Christo,

      parallel access (read & write, while using TRICKLE inserts) to the same underlying Row Groups will cause ROWGROUP_VERSION waits.
      Let's say one day you are loading 1 million rows … you might escape seeing the waits …
      The next day you load 100,000 rows into a SINGLE Row Group (parallel insertion splits the rows between the available MAXDOP, and do not forget the actual partitioning, which will cause more splits) – and you will see the ROWGROUP_VERSION waits as a result.
      Check your Row Group trimming reasons; it would be a great place to start investigating.
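      A sketch of that check (the table name is a placeholder):

      SELECT rg.partition_number, rg.row_group_id, rg.state_desc,
             rg.total_rows, rg.deleted_rows, rg.trim_reason_desc
      FROM sys.dm_db_column_store_row_group_physical_stats AS rg
      WHERE rg.object_id = OBJECT_ID('dbo.FactTable')
      ORDER BY rg.partition_number, rg.row_group_id;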

      Regarding the link to the presentation – that is a submission for the 2019 SQLBits edition, so it might be chosen or not.
      I will be delivering this session this Friday for the GroupBy online conference, and a recording of it will be available on YouTube in the following days, if you can't attend it live.

      Regarding the 1-hour call – I will let you know by the beginning of 2019, once I have reviewed all the submissions.

      Best regards,
      Niko

  25. Eugene Karpovich

    Hi, Niko.
    First of all, thank you very much for your blog. It is an invaluable resource for information on columnstore indexes. Please, keep up the great work.

    I have a question related to creating a columnstore index on a temp table (regardless local or global).

    Question is – are you aware of any specifics of creating a columnstore index on a temp table with regards to waiting for a memory grant?
    Specifically, have you ever seen that this operation does not honor the “request_memory_grant_timeout_sec” option of the relevant Resource Governor workload group?

    In my case, even though I have increased this option to 3600 seconds (from the default "0", which afaik is equivalent to 25 times the query cost), I am still getting a memory timeout exception whenever my "create columnstore index" operation happens to wait for a memory grant (RESOURCE_SEMAPHORE wait type) for over 25 seconds.
    I am running SQL Server 2016 SP1 no CUs.

    I could not find the answer either on the internet or on your blog, although yours is so vast that I might have simply missed it.

    Thank you
    Eugene Karpovich

    1. Niko Neugebauer Post author

      Hi Eugene,

      can you share more details, such as:
      – how many rows do you have in your tables ?
      – what are the other RG settings set to? (Trying to understand the memory grant throttling here)
      – what SQL Server version & edition are you running?

      maybe even there is a simple repro you can share …

      Best regards,
      Niko Neugebauer

      1. Eugene Karpovich

        Niko, thank you for your time.
        Below are my answers:

        – how many rows do you have in your tables ?
        the number of rows in the temp table varies with each run, but whenever the create index operation fails due to the memory timeout, the table has a few million rows;
        however, although the table itself has that many rows, the index I am creating is actually a dummy one – I am using a filter condition “WHERE [the_date] = ‘19000101’ AND [the_date] = ‘19000102’”;
        essentially, I am creating the columnstore index here just to leverage the Batch execution mode;
        on the latest occasion of the memory timeout, the create index operation had a query cost of 146.6 and a requested memory grant of 736 MB

        – what are the other RG settings are set to ? (Trying to understand the memory grant throttling here)
        I only changed the following settings:
        ALTER WORKLOAD GROUP [default] WITH( request_memory_grant_timeout_sec = 3600);
        ALTER WORKLOAD GROUP [default] WITH( request_max_memory_grant_percent = 8);
        all other settings are left at their defaults;
        I have 20 CPUs on the box, organized as 4 NUMA nodes with 5 CPUs each;
        I set the "max degree of parallelism" server-level setting to 10;

        – what is the SQL Server version & edition you are running ?
        I am running Enterprise Edition (64-bit) of SQL Server 2016 SP1 with no CUs (13.00.4210 build)

        with regards to a repro – unfortunately, I don’t have one right at this point..

        Thank you
        Eugene

        1. Niko Neugebauer Post author

          Hi Eugene,

          the only thing that comes to my mind (besides a possible bug) is that your server is under an extremely heavy workload and there is no memory available at all …
          I assume that statistics are in place …
          I would try to play with the Resource Governor to see if you can guarantee some more resources for your process.
          Otherwise you would need to work on a reproduction of the situation and/or contact support.

          Best regards,
          Niko

          1. Eugene Karpovich

            Niko
            Actually you are exactly right – whenever a memory timeout happens, the server indeed is under extremely heavy workload and there is no memory available at all. And there are other concurrent processes (not related to columnstore indexes) that are waiting for their memory grants along with the “create columnstore index” process.
            The only problem is that these other processes continue to wait for their memory grants for well over 25 seconds, honoring the “request_memory_grant_timeout_sec” RG setting (which is 3600 seconds, as I set it up).

            Yes, I will try to play with the RG settings to maybe further decrease the memory available for a single query.

            Thank you very much for getting back to me on this.
            Regards
            Eugene

  26. Eugene Karpovich

    Niko, a slight correction – just this morning this memory timeout happened again.
    In this case, the global temp table had just 543 records; the "filtered-dummy" create columnstore index operation had a query cost of just 0.2, belonged to the "default" RG workload group (which has the 3600-second memory grant timeout setting), and the requested memory grant was 376 MB.
    At the point of the memory timeout, the query had been waiting for the memory grant (RESOURCE_SEMAPHORE) for almost exactly 25 seconds.

    Regards
    Eugene

  27. Grzegorz

    Hello,
    I was at your workshop at SQL Day 2018 in Wroclaw, Poland. The workshop was about columnstore indexes (clustered, nonclustered, etc.).
    I have a link to your OneDrive, but unfortunately the link has expired. Is it possible to get your presentation (only the PDF/PPT, without the data and scripts)?

    Thank you for your reply.

    1. Niko Neugebauer Post author

      Hi Grzegorz,

      I am sorry to hear that it took you so long to get to the materials – but nobody should expect a link to the workshop materials to stay active for over 10 months.
      Can you please contact the SQLDay event organizers and ask them to share the materials?

      Best regards,
      Niko Neugebauer

      1. Grzegorz

        Hello Niko,
        I used this material after SQL Day, but unfortunately only online.
        The event organizers don't have this material.

        Best regards,
        Grzegorz

  28. Matt Thrower

    Hi Niko,

    I attended your course at SQL Bits “Optimising ETL on Microsoft Data Platform (SQL & Azure)”. Are you able to forward the slides for the day please? The day was fantastic and I want to make sure I missed nothing.

    Thanks,

  29. Matthew Hannah

    Hi Niko,

    I also attended your precon Optimising ETL at SQL Bits 2019, and have been struggling to get hold of the materials from SQL Bits organisers.
    Please could you email me the materials? I’d be most grateful.

    Many thanks,
    Regards,
    Matt

  30. Matt

    Hi Niko,

    I also attended your Optimising ETL workshop at SQL Bits 2019, but SQL Bits haven’t sent through the materials for this. Is it possible to send me these please?

    Many thanks.

    Regards,
    Matt

    1. Niko Neugebauer Post author

      Hi Georgi,

      thank you for the kind words.
      There will be an update soon – there are some very nice engine improvements and I will be blogging and recording videos about them.

      Best regards,
      Niko Neugebauer

  31. Greg

    Hi Niko,

    I'm trying to contribute a useful comment on http://www.nikoport.com/2019/03/02/set-statistics-io-update-in-sql-server-2019/ but it's not going through :(

    I made a free tool related to running SET STATISTICS IO ON, which outputs the stats into a pretty report right inside SSMS:
    https://analyticsbar.com/blog/statistics-reporter-ssms-extension/

    Could you please consider allowing the posted comment to go through? I'm sure the community would appreciate it :)

    Thanks so much!
    Greg

  32. Kay-Christian Wessel

    I'm using SQL Server 2017 and I have a clustered columnstore index/table with 184 million rows which uses 2 GB of disk storage.

    I've tried to put this inside an In-Memory clustered columnstore index/table, but it does not seem to compress the data. I inserted 60 million rows and the size of the table was 32 GB.

    Any idea what can cause this?

    My SQL Server has 64 GB of memory, of which 58 GB is for SQL Server.

    Br
    Kay

    1. Niko Neugebauer Post author

      Hi Kay-Christian,

      most probably your data ended up landing in the Tail Row Group (aka the In-Memory Delta-Store).
      Try to force its compression with the sys.sp_memory_optimized_cs_migration procedure, while monitoring the status of all the Row Groups with the help of CISL or sys.column_store_row_groups.
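      A sketch of both steps (the table name is a placeholder, and the @object_id parameter name of the migration procedure is my assumption, so please verify it on your build):

      EXECUTE sys.sp_memory_optimized_cs_migration @object_id = OBJECT_ID('dbo.MyInMemoryTable');

      SELECT row_group_id, state_description, total_rows, size_in_bytes
      FROM sys.column_store_row_groups
      WHERE object_id = OBJECT_ID('dbo.MyInMemoryTable');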

      Best regards,
      Niko

  33. Paul Holmes

    We have a true puzzler: a non-deterministic failure on an update to a Clustered Columnstore Index.

    (Firstly I apologise for not constructing a repro… when we try to synthesise the error we cannot ever get the failure.)

    We have a very (very) wide CCI. With one particular set of source data, an update statement covering 100+ columns fails with an “exceeds 8060 bytes” error. I paraphrase – I don't have the exact error text to hand.

    So then, playing elimination, we only update half the original 100 columns. It succeeds.

    Then we run the original update again, with all the columns. It succeeds.

    So, non deterministic.

    Fails with all the columns.
    Succeeds with few columns.
    Succeeds with all the columns.

    Ever heard of 8060 byte failures in a CCI update?

    (I know a lot more detail may be needed here, and I will try to get a repro…. just chucking this out there for starters)

      1. PAUL ADRIAN HOLMES

        select @@VERSION

        Microsoft SQL Server 2014 (SP3) (KB4022619) – 12.0.6024.0 (X64) Sep 7 2018 01:37:51 Copyright (c) Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1) (Hypervisor)

        (DB Compatibility Level: 110)

        1. Niko Neugebauer Post author

          Hi Paul,

          this sounds like a problem with SQL Server 2014.
          Delta-Stores in SQL 2014 were page-compressed, and that is what causes the 8060-byte errors.
          In SQL Server 2016, Delta-Stores were changed to NO compression specifically to allow wide tables to function correctly.

          Do a simple test on 2014 – create a table with 700 integer columns and Clustered Columnstore.
          Try to insert a row. :)
          Yeap – won’t work.
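          A minimal repro sketch of that test (dynamic SQL is used only to avoid typing 700 column names; the table name is a placeholder):

          DECLARE @cols nvarchar(max) = N'', @i int = 1, @sql nvarchar(max);
          WHILE @i <= 700
          BEGIN
              SET @cols += N'c' + CAST(@i AS nvarchar(10)) + N' int' + CASE WHEN @i < 700 THEN N', ' ELSE N'' END;
              SET @i += 1;
          END;
          SET @sql = N'CREATE TABLE dbo.WideTable (' + @cols + N');';
          EXEC sys.sp_executesql @sql;
          SET @sql = N'CREATE CLUSTERED COLUMNSTORE INDEX CCI_WideTable ON dbo.WideTable;';
          EXEC sys.sp_executesql @sql;
          -- on SQL Server 2014 this insert is expected to fail with the row-size error,
          -- because the Delta-Store there is page-compressed
          SET @sql = N'INSERT INTO dbo.WideTable (c1) VALUES (1);';
          EXEC sys.sp_executesql @sql;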

          Best regards,
          Niko Neugebauer

  34. Mari Flor

    Good evening Niko,

    I need your help. It is very urgent and important, and it is about Windows.
    But I can't explain it here.
    Would it be possible to contact you by email (mine is registered here on the blog)?
    I will be eternally grateful.
    Mari

  35. Alexander Bartosh

    Good morning Niko!
    You have written a lot of useful articles on SQL Server.
    I am reaching out to you with a very specific question on SQL Server, columnstore indexes and bulk insert. I know you know the answer :)
    Is there a way to disable the BulkLoad trim reason on an NC Columnstore index?
    You can read the question and a detailed problem description here: https://social.msdn.microsoft.com/Forums/en-US/5608bd85-f693-4955-9163-215aac77f91d/how-can-i-bulk-insert-into-deltastore-how-to-use-sqlbulkcopy-on-a-table-with-nonclostered?forum=sqldatabaseengine

    Thank you!
    Alexander

    1. Niko Neugebauer Post author

      Hi Alexander,

      well – tricking the Cardinality Estimation into estimating below 102,400 rows and enabling Trace Flag 9348 sounds like a plan …
      https://support.microsoft.com/en-gb/help/2998301/fix-error-8654-when-you-run-insert-into-select-on-a-table-with-cluster

      Another way – lowering the memory amount for the insertion queries through Resource Governor.

      Another way – forcing a higher DOP on the parallel inserts, thus splitting the rows between threads and potentially getting below 102,400 rows per thread.
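      For the first approach, the trace flag from that KB is enabled the usual way (a sketch; whether it helps depends on your build having the linked fix applied):

      DBCC TRACEON (9348, -1);   -- -1 enables the trace flag globally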

      Best regards,
      Niko Neugebauer

  36. Doran Douglas

    Hello Niko, when deploying DDL changes for a columnstore table with a DACPAC onto an existing database using sqlpackage.exe or Visual Studio (it is really just VS calling sqlpackage.exe for you), the deployment process seems odd.

    Watching the deployment output you can see it go through these steps:
    1. Drop the columnstore index. This results in the database size increasing a lot as the table is converted to a rowstore.
    2. Alter the table.
    3. Apply the columnstore index again. The database size then shrinks as the table is converted back to a columnstore.

    This yo-yoing of the database size, coupled with the long deployment times as the database goes through all that IO, is annoying. Do you know why it wants to go through this process?

    I just finished deploying a change to one fact table with ~30 million rows and it took an hour with 8 vCores. Microsoft claims we should have 4,000 IOPS screaming away during the deployment. Any thoughts, not that we can change much about this process?

    1. Niko Neugebauer Post author

      Hi Doran,

      I do not have much to say about the DacPac deployments …
      The solution they found is the easy one, not the correct one.
      DacPacs ARE oriented/thought out/implemented for smaller DBs, not for the real production heavyweights.
      *cough* Besides that, sometimes the implemented solution is to drop and recreate the table … *cough*
      Manual optimisations are the key …
      "DevOps" optimisations are beautiful and wonderful if you have a tool that will do the needed job. I have seen it work in small and non-critical installations.
      Yeah, "DevOps" is not DevOps.

      Best regards,
      Niko Neugebauer

      1. Doran Douglas

        Well that is unfortunate. One of the reasons I love DacPac deployments is how easy they are. You just tell sqlpackage to make the database look like the DacPac and you don’t care how it does it.

        I guess I will make our Devops engineers earn their keep, or just suffer through the longer deployments.

        Cheers,
        Doran

  37. Kannan Chandrasekaran

    Dear Niko,

    Our stage table holds 130+ columns and 1 billion 430 million rows, partitioned by service date. We created a nonclustered columnstore index on it, which took 36 hours to complete. After creating the index, SELECT COUNT_BIG(col_name) FROM the table shows 40 billion as the row count; we cross-checked with sp_spaceused, sys.indexes and the table properties dialog, and all of them show 1 billion 430 million rows. Before the index creation we also counted the table, and after 45 minutes of execution it returned 1 billion 430 million. The estimated number of rows on the nonclustered columnstore index also shows 1 billion 430 million rows, but COUNT shows a very strange result. Niko, do you have any thoughts on this?
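    For reference, a catalog-based row count, independent of any COUNT query, can be obtained like this (the table name is a placeholder for our real stage table):

    SELECT SUM(p.rows) AS rows_from_catalog
    FROM sys.partitions AS p
    WHERE p.object_id = OBJECT_ID('dbo.StageTable')
      AND p.index_id IN (0, 1);   -- heap or clustered index only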

  38. Kannan Chandrasekaran

    Dear Niko,

    Thanks for the response. The nonclustered columnstore index scan comes with parallel streams. Due to policy restrictions I am sharing the query plans in words instead of images.

    Column store index scan (Non clustered)
    scan columnstore index, entirely or only a range.

    Physical operation : columnstore index scan
    logical operation : index scan
    Estimated execution mode : Batch
    Storage : ColumnStore
    Estimated I/O Cost:0.05625
    Estimated Operator Cost : 8.74159(41%)
    Estimated SubtreeCost : 8.74159
    Estimated CPU Cost:8.68534
    Estimated Number Of Executions:1
    Estimated Number Of Rows: 1262930000
    Estimated Row Size: 27B
    Partitioned : True
    Ordered: False
    Node ID : 5

    In the GUI it is displayed like this

    SELECT cost 0% <— Parallelism (Gather Streams) cost 0% <— Compute Scalar cost 0% <— Hash Match (Aggregate) cost 59% <— Columnstore Index Scan cost 55%.

    The actual rows in the table match the plan's estimated number of rows, but after executing the count query the result comes back as 40,145,305,259. I also ran the query with a GROUP BY clause; below are the split-ups with and without the index.

    With Non Clustered columnstore Index

    Error_ID Count
    0 32,869,106,587
    99 503,056,854
    102 384
    888 92,512
    1503 6,194,370,669
    1506 578,678,253
    Total : 40,145,305,259

    Without Index

    Error_ID Count
    0 1,033,556,048
    99 15,769,535
    102 12
    888 2,891
    1503 195,519,462
    1506 18,086,191
    Total : 1,262,934,139

    Without the index it took 1 hour 13 minutes; with the index it took 5 minutes. The table is a stage table with 135+ columns, partitioned by date. We tested with sp_spaceused and it gives the without-index row count; the table properties also gave the without-index number, and queries against sys.indexes gave the same number. Kindly share your thoughts. I also created a clustered index on top of the nonclustered columnstore for a memberid column; after 1 hour we faced a disk space issue with the following error:

    Could not allocate space for object 'dbo.SORT temporary run storage: 140746840342528' in database 'XXX' because the 'XXXX_Sx_TBL_xx_5' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup. [SQLSTATE 42000] (Error 1105) The statement has been terminated. [SQLSTATE 01000] (Error 3621). The step failed.

    850 GB of disk space gone within 1 hour with the above error.

  39. Jed j

    Hi

    I need the SSIS GoogleAnalyticsSource for one of the projects in our office. I tried to download it, but I don't see any setup files. Can you please help me get it?

    I need the 2012 setup files please.

    Kind Regards

    Jed

  40. Hannes Windisch

    Hello Niko,
    we recently met at the SQL Saturday in Vienna (column based indexes), thanks again for this interesting day!

    There is one question I forgot to ask:
    We have a rowstore fact table which we want to convert to columnstore.
    Unfortunately this table was recently expanded with a lot of string columns (order details, etc.) :(

    Although we don't need these additional columns for most of our queries (and performance is not relevant for those columns either), I have the feeling that it is a bad idea to have those columns in a columnstore table in the first place.

    Am I right? Should I remove them (move them to a 1:1 joined table) before converting the table to columnstore?
    Or is it no big deal and has no negative impact on the other (more relevant) columns?

    Thanks in advance & best regards,
    Hannes.

  41. Frank L.

    Hi Niko,

    I have very strange behaviour with a delete statement and partitioned clustered columnstore indexes.
    We have a stored proc with a statement similar to

    with cte as (
        select business_key,
               row_number() over (partition by business_key order by import_ts desc) as row_id
        from myTable
        where date_key = @myDate
    )
    delete top(50000) from cte where row_id > 1

    which is executed in batches due to the sheer amount of rows and the transaction log growth.

    These statements result in a parallel execution plan. Sometimes we experience the error 8630 “Internal Query Processor Error. The query processor encountered an unexpected error during execution.”

    We are on SQL 2016 SP2 CU12 (13.0.5698.0, X64). The strange thing is that if we append a MAXDOP 1 hint, we do not experience these errors. We also rebuilt all our columnstore indexes, but that did not help.
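    Concretely, the workaround is just the same statement with the hint appended:

    delete top(50000) from cte where row_id > 1
    option (maxdop 1);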

    Have you had similar problems? Could this be related to deleting from a CTE?

    Best regards,
    Frank

    1. Niko Neugebauer Post author

      Hi Frank,

      depending on the size of your table and a number of other factors, unfortunately the situation is "normal".
      There are regular fixes for parallelism errors in the CUs … and I would consider testing build .5850 or later (if this message is read later).

      Best regards,
      Niko
