Welcome to my series on CosmosDB; you might enjoy the posts, or you might not. 🙂 This represents my current opinion of this platform at this very moment, and it might change in the future, either because I have made mistakes or because of eventual corrections and feature improvements in the product.
CosmosDB is a Microsoft Data Platform PaaS (Platform as a Service) solution, available in Azure and previously known under the name DocumentDB. It aims to support multiple models and APIs. It is being extremely heavily promoted on social media, through Microsoft TSPs, DX, PASS and the “Data Platform Family”.
CosmosDB has clear usage scenarios and patterns (well, at least they seem quite clear to me and, given the current technology limitations, I believe that they should be clear to those who work with databases professionally), but it seems that the general market is being made to believe something very different – that this database is universal and that its application patterns are universal as well (OLTP, OLAP, Hybrid, you name it).
From my point of view, new applications that use existing popular APIs, require globally distributed online transaction processing (OLTP) and do not fit relational databases are the real pattern that people should be applying CosmosDB to. So in the very first post of the series I shall take a look at this offering as a database, focusing on the technical items that DBAs will understand.
Before we go any further into the technical details, I need to make clear that I did mention the items below to the CosmosDB people. Many moons ago. It’s not like they had all the time in the world to correct them (there is no magic anywhere, just a lot of hard work), but because I know that the documentation issues were pointed out to them more than enough times by multiple people, and because there is a need for a true technical understanding of how things work, I decided to go ahead with this blog post. After all, this is mostly a technical blog right now. 🙂
In the very first blog post of the series my agenda is rather basic: I will take a fast look at the following items, which DBAs & developers mostly take for granted in modern databases:
– The Backups
– The Rollback
– The Transaction Log
– The Point In Time Restores
– The Database Corruptions
Backups
Let’s take a look at the CosmosDB documentation. The page is called Full Automatic Online Backups (https://docs.microsoft.com/en-us/azure/cosmos-db/online-backup-and-restore#full-automatic-online-backups):
Here is one of the finest excerpts on the backups:
Azure Cosmos DB takes snapshots of your data every four hours at the partition level. At any given time, only the last two snapshots are retained. However, if the collection/database is deleted, we retain the existing snapshots for all of the deleted partitions within the given collection/database for 30 days.
Take a deep breath.
So the backups are taken EVERY … 4 … HOURS … at the partition level (think about consistency between partitions – breathe slowly).
Are those backups asynchronous ? If so, we might have a broken transaction after restore …
We have to assume that we can lose up to 4 hours of information with CosmosDB. This should be better advertised and promoted, if you ask me … People using it deserve to know what they risk.
What are the business reasons for keeping information just for 8 hours? If something goes wrong at 18:01, 1 minute after a DBA leaves the office, what will she/he be able to find at 9 AM, 15 hours later? What can be saved?
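To make the 18:01 scenario concrete, here is a small back-of-the-envelope sketch in Python. It only encodes the two numbers from the documentation excerpt (a snapshot every 4 hours, the last two retained). Everything else is my assumption for illustration: that snapshots fire on a fixed grid starting at midnight, the arbitrary dates, and the function names. The real schedule is not documented at this level of detail.

```python
from datetime import datetime, timedelta

SNAPSHOT_INTERVAL = timedelta(hours=4)  # from the documentation excerpt
SNAPSHOTS_RETAINED = 2                  # "only the last two snapshots are retained"


def retained_snapshots(now: datetime) -> list[datetime]:
    """Return the snapshot times still retained at `now`, newest first.

    Assumption (not documented): snapshots fire at 00:00, 04:00, 08:00, ...
    """
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    intervals_today = (now - midnight) // SNAPSHOT_INTERVAL
    latest = midnight + intervals_today * SNAPSHOT_INTERVAL
    return [latest - i * SNAPSHOT_INTERVAL for i in range(SNAPSHOTS_RETAINED)]


def clean_snapshots(incident: datetime, discovered: datetime) -> list[datetime]:
    """Snapshots retained at discovery time that were taken before the incident."""
    return [s for s in retained_snapshots(discovered) if s < incident]


# Arbitrary example dates: something breaks at 18:01, the DBA is back at 09:00.
incident = datetime(2018, 1, 15, 18, 1)
discovered = datetime(2018, 1, 16, 9, 0)
print(retained_snapshots(discovered))         # the 08:00 and 04:00 snapshots
print(clean_snapshots(incident, discovered))  # -> []
```

Under these assumptions, both snapshots that still exist at 09:00 were taken after the incident, so neither of them contains the good data. In this example the problem would have to be noticed before midnight, less than 6 hours after it happened, for a clean snapshot to still be around.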
Since there is NO RESTORE option, how long does it take to get the information back after opening a support ticket (a paid one, I assume)? What is the RTO (Recovery Time Objective) under the current SLA ?
What if a malicious attack was executed against my DB and someone deleted 99% of the information within each collection, leaving every collection still available but almost empty or filled with wrong information ?
I won’t even start talking about the 30-day backup story when the database is promising unlimited storage… 🙁
That is by far not the most exciting part, of course, but still any enterprise should think twice about what they risk by implementing such a solution.
Rollback
To my understanding there is none. This means that if you have executed a wrong command (imagine that sometimes developer queries go wrong … I know, I know – it never happened to you), you are on your own.
Good luck, Niko!
It looks like you are going to need it. 🙂
Transaction Log
There is nothing exposed to serve such a purpose, and I could almost ask whether one exists at all.
I suspect that it does; how else would it be possible to replicate the information between the geo-distributed replicas ?
But can we back it up or restore it ? Or at least take a look at it ?
Point In Time Restores
There are none right now.
They are badly needed. Like yesterday!
How could someone go GA (General Availability) without them ?
Well, maybe those pesky Transaction Logs and Full Backups are needed after all, right ?
Database Corruptions
Here is one more documentation pearl/gem:
Azure Cosmos DB retains the last two backups of every partition in the database account. This model works well when a container (collection of documents, graph, table) or a database is accidentally deleted since one of the last versions can be restored. However, in the case when users may introduce a data corruption issue, Azure Cosmos DB may be unaware of the data corruption, and it is possible that the corruption may have overwritten the existing backups. As soon as corruption is detected, the user should delete the corrupted container (collection/graph/table) so that backups are protected from being overwritten with corrupted data.
Read it again.
Read it slowly.
Fall down and cry.
I will translate it for you:
If you have data corruption, then delete the whole container (graph/table/database) so that the automated backups stay safe.
Please, do not do this. You can run an extraction from CosmosDB with the Import/Export tool, so schedule it from some server/VM and “back up/extract” your data on your own terms. Yes, it won’t work for everyone, but some people might get through otherwise avoidable troubles.
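If you would rather script the extraction yourself than rely on a tool, here is a minimal sketch of the idea in Python: dump the documents to a timestamped JSON Lines file and keep your own number of generations instead of just two. The function name, the file naming scheme and the `keep` default are all my assumptions for illustration; the sketch deliberately takes any iterable of documents and does not talk to CosmosDB itself.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable, Mapping, Optional


def export_documents(
    docs: Iterable[Mapping],
    out_dir: str,
    keep: int = 42,  # hypothetical default: 7 days of 4-hourly exports
    now: Optional[datetime] = None,
) -> Path:
    """Write `docs` to a timestamped .jsonl file and prune old exports."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    directory = Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)
    now = now or datetime.now(timezone.utc)

    target = directory / f"export-{now:%Y%m%dT%H%M%SZ}.jsonl"
    tmp = target.with_suffix(".tmp")
    # Write to a temporary file first, so that a crashed export
    # never looks like a complete one.
    with tmp.open("w", encoding="utf-8") as fh:
        for doc in docs:
            fh.write(json.dumps(doc, ensure_ascii=False) + "\n")
    tmp.replace(target)

    # The timestamp format sorts lexicographically in chronological order,
    # so everything except the last `keep` entries is the oldest data.
    exports = sorted(directory.glob("export-*.jsonl"))
    for old in exports[:-keep]:
        old.unlink()
    return target
```

Where the documents come from is up to you. With the azure-cosmos Python SDK, an iterator such as `container.read_all_items()` could probably be passed in as `docs`, but check that against the SDK version you actually use. Keep in mind that a full read like this consumes request units, and that it is no more transactionally consistent across partitions than the built-in snapshots are.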
There is a reality distortion field around CosmosDB.
Given that it is run and developed by very smart people, it looks to me like a very strange database story.
It seems that someone went and brainwashed a significant part of the Microsoft Data Platform Community: I remember seeing people writing on social media after the PASS Summit Keynote that CosmosDB is their favourite database (even though they had not touched it at all). I see people submitting presentations on CosmosDB without trying it. I see people presenting on CosmosDB without putting any workload on it (like putting 5 rows into a regular relational database table and claiming it is scalable across 5 continents).
Please, don’t tell me that CosmosDB has solved/solves all the data problems of the world.
Don’t tell me that it is universal. It is like universal military planes – people laugh about them. Jack Of All Trades is MASTER OF NONE. A bomber is not a fighter. Period.
Do NOT tell me that it is just a marketing problem, because the technical people are pushing the agenda of incredibility, without any reasonable critical judgement or at least challenging the technical concepts (I am talking about people outside of Microsoft here).
Do not think about using such words as Serverless/Limitless in technical conversations – you are making the wrong points, unless, of course, you put 1.000.000.000.000 Petabytes into it or scale it to at least 1.000.000 servers. 🙂 Oh yeah, I am talking about a single application, just in case you are wondering. And still I might challenge you on the details 🙂
Do not dare to tell me that it is hard to write a consistent database. Just look around and observe hundreds/thousands of the smartest people in the world fighting to solve those problems for years; their problem is not a lack of brains or previous technical debt. It is just damn hard to do it right, hard as in mission impossible.
Look at those items, think about your favourite relational database and realize that it guarantees those things; a relational PaaS gives them to you by default.
Think about it.
to be continued …