CosmosDB: NoSQL, NoDBA & NoProblem?

Welcome to my series on CosmosDB; you might enjoy it, or you might not. :) This represents my current opinion and view of this platform at this very moment, and it might change in the future (given that I might have made mistakes) or with eventual corrections and feature improvements in the product.

CosmosDB is a Microsoft Data Platform PaaS (Platform as a Service) solution, available in Azure and previously known under the name of DocumentDB. It aims to support multiple models and APIs. It is being extremely heavily promoted on social media, through Microsoft TSPs, DX, PASS and the “Data Platform Family”.

CosmosDB has clear usage scenarios and patterns (well, at least they seem to be quite clear to me and, given the current technology limitations, I believe that they should be clear to those who work with databases professionally), but it seems that the general market is being made to believe something very different – that this database is universal and its application patterns are universal as well (OLTP, OLAP, Hybrid, you name it).
From my point of view, new applications that use existing popular APIs, require globally distributed online transaction processing (OLTP) and do not fit relational databases are the real pattern that people should be applying CosmosDB to, and so in the very first post of the series I shall take a look at this offering as a database, focusing on the technical items that are understandable for DBAs.

Before we go any further into the technical details, I need to make clear that I did mention the items below to the CosmosDB people. Many moons ago. It’s not like they had all the time in the world to correct them (there is no magic anywhere, just a lot of hard work), but because I know the documentation issues were pointed out to them more than enough times by multiple people, and because there is a need for a true technical understanding of how things work, I decided to go ahead with this blog post. After all, this is mostly a technical blog right now. :)

In this very first blog post of the series, my agenda is rather basic: I will take a quick look at the following items, which are mostly taken for granted by DBAs & developers working with modern databases:

– The Backups
– The Rollback
– The Transaction Log
– The Point In Time Restores
– The Database Corruptions

Backups

Let’s take a look at the DocumentDB CosmosDB documentation – the page is called Full Automatic Online Backups (https://docs.microsoft.com/en-us/azure/cosmos-db/online-backup-and-restore#full-automatic-online-backups):

Here is one of the finest excerpts on the backups:
Azure Cosmos DB takes snapshots of your data every four hours at the partition level. At any given time, only the last two snapshots are retained. However, if the collection/database is deleted, we retain the existing snapshots for all of the deleted partitions within the given collection/database for 30 days.

Take a deep breath.
So the backups are taken EVERY … 4 … HOURS … at the partition level (think about consistency between partitions – breathe slowly).
Are those backups asynchronous? If so, we might have a broken transaction after a restore …
We have to assume that we can lose up to 4 hours of information with CosmosDB. This should be better advertised and promoted, if you ask me … People using it deserve to know what they risk.

What are the business reasons for keeping information just for 8 hours? If something goes wrong at 18:01, 1 minute after a DBA leaves the office, what will she/he be able to find at 9 AM, 15 hours later? What can be saved?
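To put numbers on that scenario, here is a small Python sketch of the retention window. The 4-hour interval and the two retained snapshots come from the documentation quoted above; the assumption that snapshots fire on a fixed schedule starting at midnight, the dates and the function names are mine and purely illustrative.

```python
from datetime import datetime, timedelta

SNAPSHOT_INTERVAL = timedelta(hours=4)  # from the quoted documentation
SNAPSHOTS_RETAINED = 2                  # from the quoted documentation


def retained_snapshots(now, first_snapshot):
    """Return the snapshot times still retained at `now`, oldest first.

    Assumes (hypothetically) a fixed schedule that starts at `first_snapshot`.
    """
    if now < first_snapshot:
        return []
    taken = (now - first_snapshot) // SNAPSHOT_INTERVAL + 1
    latest = first_snapshot + (taken - 1) * SNAPSHOT_INTERVAL
    count = min(taken, SNAPSHOTS_RETAINED)
    return [latest - i * SNAPSHOT_INTERVAL for i in reversed(range(count))]


def clean_snapshot_exists(incident, detected, first_snapshot):
    """True if a snapshot retained at `detected` was taken before `incident`."""
    return any(s < incident for s in retained_snapshots(detected, first_snapshot))


# Illustrative dates: schedule starts at midnight, incident at 18:01.
first = datetime(2017, 11, 20, 0, 0)
incident = datetime(2017, 11, 20, 18, 1)

# Noticed one hour later: the 12:00 and 16:00 snapshots are still clean.
print(clean_snapshot_exists(incident, datetime(2017, 11, 20, 19, 0), first))  # True

# Noticed at 9 AM the next day: only 04:00 and 08:00 remain, both after the incident.
print(clean_snapshot_exists(incident, datetime(2017, 11, 21, 9, 0), first))   # False
```

Under these assumptions the last clean snapshot is gone somewhere between 4 and 8 hours after the incident, depending on where the incident falls in the schedule.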

Since there is NO RESTORE option, how long does it take to get the information back after opening a support ticket (a paid one, I assume), and what is the RTO (Recovery Time Objective) under the current SLA?

What if a malicious attack was executed against my DB and someone deleted 99% of the information within each collection, effectively leaving a collection that is still available but holds almost no information, or the wrong information?

I won’t even start talking about the 30 days backups story when the Database is promising unlimited storage… :(

That is by far not the most exciting part, of course, but any enterprise should still think twice about what they risk by going ahead and implementing such a solution.

Rollbacks

To my understanding there are none. This means that if you have executed a wrong command (imagine that sometimes there are developer queries going wrong … I know, I know – it never happened to you), you are on your own.
Good luck, Niko!
It looks like you are going to need it. :)

Transaction Log

There is nothing exposed to serve for such purpose and I can almost ask if there is one.
I suspect that there is – how else would it be possible to replicate the information between the geo-distributed replicas?
But can we back it up or restore it ? Or at least take a look at it ?

Point In Time Restores

There are none right now.
They are badly needed. Like yesterday!
How could someone go GA (General Availability) without them ?
Well, maybe those pesky Transaction Logs and Full Backups are needed after all, right ?

Corruptions

Here is one more documentation pearl/gem:
Azure Cosmos DB retains the last two backups of every partition in the database account. This model works well when a container (collection of documents, graph, table) or a database is accidentally deleted since one of the last versions can be restored. However, in the case when users may introduce a data corruption issue, Azure Cosmos DB may be unaware of the data corruption, and it is possible that the corruption may have overwritten the existing backups. As soon as corruption is detected, the user should delete the corrupted container (collection/graph/table) so that backups are protected from being overwritten with corrupted data.

Read it again.
Read it slowly.
Interpret it.
Fall down and cry.
I will translate it for you:
if you have a data corruption, then delete the whole container (graph/table/database) so the automated backups are safe.
:CRY:
Please, do not do this. You can run an extraction of the CosmosDB schema with the Import/Export tool, so schedule it from some server/vm and “back up/extract” your data on your terms. Yes, it won’t work for everyone, but some people might get through the avoidable troubles.
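As a rough illustration of such a scheduled extraction, here is a minimal Python sketch. It is not the Import/Export tool and it deliberately avoids any real CosmosDB client call: `documents` is assumed to be whatever iterable of dictionaries your client of choice returns, and the `export_documents` name, the file naming scheme and the `keep` parameter are hypothetical. Keep in mind that this is a data extraction and not a transactionally consistent backup.

```python
import json
import os
from datetime import datetime, timezone


def export_documents(documents, target_dir, keep=7, now=None):
    """Write `documents` to a timestamped JSON Lines file and prune old exports.

    `documents` is any iterable of JSON-serializable dicts (hypothetical input,
    for example the result of reading a whole collection with your client).
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    now = now or datetime.now(timezone.utc)
    os.makedirs(target_dir, exist_ok=True)

    name = now.strftime("export-%Y%m%dT%H%M%SZ.jsonl")
    path = os.path.join(target_dir, name)
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        for document in documents:
            handle.write(json.dumps(document, sort_keys=True) + "\n")
    os.replace(tmp_path, path)  # only complete exports get the final name

    # The timestamp format sorts lexicographically, so the oldest files come first.
    exports = sorted(
        f for f in os.listdir(target_dir)
        if f.startswith("export-") and f.endswith(".jsonl")
    )
    for old in exports[:-keep]:
        os.remove(os.path.join(target_dir, old))
    return path
```

Run from cron or a scheduled task on some server/VM, this keeps the last `keep` extractions on your own terms, independent of the two automated snapshots.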

Blogpost Thoughts
There is a reality distortion field around CosmosDB.
Being run and developed by very smart people, it looks to me like a very strange database story.
It seems that someone went and brainwashed a significant part of the Microsoft Data Platform Community: I remember seeing people writing on social media after the PASS Summit Keynote that CosmosDB is their favourite database (even though they had not touched it at all). I see people submitting presentations on CosmosDB without trying it. I see people presenting on CosmosDB without putting any workload on it (like putting 5 rows into a regular relational database table and claiming it is scalable across 5 continents).

Please, don’t tell me that CosmosDB has solved/solves all the data problems of the world.
Don’t tell me that it is universal, it is like universal military planes – people laugh about them. Jack Of All Trades is MASTER OF NONE. A bomber is not a fighter. Period.
Do NOT tell me that it is just a marketing problem, because the technical people are pushing the agenda of incredibility, without any reasonable critical judgement or at least challenging the technical concepts (Talking about people outside of Microsoft here).
Do not think about using such words as Serverless/Limitless in technical conversations – you are making the wrong points, unless of course, you put 1.000.000.000.000 Petabytes into it or scale it to at least 1.000.000 servers. :) Oh yeah, I am talking about a single application, just in case you are wondering. And still I might challenge you on the details :)
Do not dare to tell me that it is hard to write a consistent Database, just look around and observe hundreds/thousands of the smartest people in the world fighting to solve those problems for years and their problem is not the lack of brains or the previous technical debt. It is just damn hard to do it right, hard like in mission impossible.
Look at those items, think about your favourite relational database, and realize that it guarantees those things; your PaaS is giving them by default.
Think about it.

to be continued …

9 thoughts on “CosmosDB: NoSQL, NoDBA & NoProblem?”

  1. tobi

    For some reason Azure likes to cripple really good products. For example, SQL Server databases were limited to 500GB of data on the highest tier for a long time. These CosmosDB drawbacks are devastating as well. Seems like an unforced error by the team. Simply allow the user to configure backups and add a restore option… They invest millions of dollars to build something like CosmosDB and they stop 1% short of making it into something great.

    Cloud providers sometimes seem a bit arrogant to me. They like to tell us how we should build our apps instead of doing the grunt work necessary to allow us to build apps the way WE want them done. They tell us we should partition everything and have loose consistency and so on while real world apps simply want one powerful database and full consistency. Etc…

    1. Niko Neugebauer Post author

      Hi tobi,

      I guess that “crippling” part is complicated and will depend on the product.
      I am not a fan of marketing or wrong positioning where something that is
      I love Cosmos DB when it is properly used and I have friends with real projects successfully using Cosmos DB, like for example for the Key-Value pair cache usage, but seeing and hearing people saying to put Data Warehouse on CosmosDB makes me incredibly upset.
      The lack of the clear positioning of this platform by Microsoft is disturbing me (or maybe this is just me – but heck, so many people agree on that).

      Best regards,
      Niko

  2. GW

    “Fall down and cry” – indeed
    it’s also too expensive
    my on-prem MSDN Standard SQL license runs 24/7 on great hardware for a year
    for what a SQL Data Warehouse costs for a month of 24/7?

    1. Niko Neugebauer Post author

      Hi GW,

      Everything should fit its purpose and correspond to the business goals.
      There will be cases where a certain product is useful, there will be cases where it is not useful/performs bad/too expensive/etc.
      Everyone should decide according to their business needs & goals.

      Best regards,
      Niko

  3. Luis Marques

    Hi Niko,
    Do you have more RDBMS features that you would like to “copy” to CosmosDB, so it behaves, scales and backs up exactly like an RDBMS, so we are comfortable with it? Foreign Keys?, Btree indexes? or RDBMS ACID with commit/rollback mechanisms?
    You missed the point of what these databases are for! If you keep putting it in the same use case as your SQLServer or any RDBMS, you are a little off.
    You pay a high performance penalty for RDBMS ACID-mode (rollback, transaction, tlogs, PIT, etc) and if you don’t believe it, do the following test case (be fair on the hardware side for both platforms):
    – Generate 10k messages per second with 5k each and if you like put some publish-consumer platform like EventHub or Kafka to handle that (optional).
    – Insert directly to SQLServer every second
    – Insert directly to CosmosDB, Cassandra, HBase (whatever you want for this kind of stuff).

    Measure the performance – the number of events per hour that any of these platforms can sustain without “thrashing”.
    Saying that CosmosDB fits all scenarios is stupid and the problem is that multi-modal marketing is something that can be dangerous for less-alerted minds.

    1. Niko Neugebauer Post author

      Hi Luís,

      I do not think I have missed any point.
      You are mentioning basic stuff that is absolutely logical, and I am indeed playing/working with other solutions, so I guess I am pretty aware of what is good for which purpose.

      The point is how CosmosDB is being packaged and sold.
      Just as you wrote: “Saying that CosmosDB fits all scenarios is stupid and the problem is that multi-modal marketing is something that can be dangerous for less-alerted minds.”
      THIS is the problem and it seems that there are ZERO VOICES saying and specifying the intended purposes.

      Less-alerted minds ? Well, I see Microsoft MVPs and Microsoft Employees spreading information that you should run DWH on CosmosDB or do the relational stuff on it. Just today I saw tweets about CosmosDB backups – where in fact it is an inconsistent data extraction process.
      This is why I am writing these blogposts – to counter the wrong information saying it fits all purposes.

      Best regards,
      Niko

    2. Laurent

      Hi Luis,

      It seems to me that you have feedback on this use case.
      Can you give us some information on this point, please?

      Thanks

    1. Niko Neugebauer Post author

      Cool stuff, thank you for sharing Jason!
      I am wondering whether the rollback would be available and how it would affect the chosen consistency model and the global availability … :)

      Best regards,
      Niko
