In this article, I am going to explain the different consistency levels supported by Cosmos DB. I am also going to throw light on how Cosmos DB helps serve data in the way that best fits a customer's needs. Before reading this article, please read up on the basics of Cosmos DB and how global distribution works, for better understanding.
Azure Cosmos DB's tunable consistency levels help you choose the one that is closest to a customer's requirement.
What is data consistency:
Consistency in general means getting exactly the same result each time an action is run. Let me explain with a scenario: in an online store, the user's address may be part of their signup information, but it also needs to appear in the delivery information. If there is a mismatch between the signup information and the delivery information, it is called data inconsistency.
Inconsistent data leads to duplication or loss of data, and people lose faith in anything predicted from it. A report based on inconsistent data makes things worse.
Consistency is scoped to a single user request. A write request corresponds to an insert, replace or delete transaction. Reads, like writes, are limited to the scope of a single user request. A user may need to page through a large result set that spans multiple partitions, but each read request is confined to a single page and served from a single partition.
Cosmos DB consistency models:
There is always a trade-off between availability and consistency in a distributed system. We can expect consistency in an RDBMS, but a distributed system often chooses availability over consistency. To let us tune this trade-off, Cosmos DB provides five different consistency models. The model needs to be chosen based on business requirements. Let me explain them one by one:
1. Strong consistency focuses most on data consistency; it is like a SQL commit statement. As a result, latency is high and performance is lower than at the other levels.
2. Strong consistency is scoped to only one region, as explained in the image below.
3. A read is always acknowledged by the majority read quorum, so a client never sees an uncommitted or partial write and is always guaranteed to read the latest acknowledged write.
4. The cost of a read operation is higher with strong consistency than with session or eventual consistency. Here, cost is measured in RUs (Request Units).
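The quorum guarantee in point 3 can be illustrated with a small simulation. This is only a minimal sketch, not Cosmos DB's actual replication protocol: the replica count `N`, the quorum sizes `W` and `R`, and the function names are all assumptions I chose for illustration. It tries every possible write quorum against every possible read quorum and checks that, because the two always overlap (`R + W > N`), a read always sees the latest acknowledged write.

```python
from itertools import combinations

N = 4  # hypothetical number of replicas
W = 3  # replicas that must acknowledge a write (a majority of 4)
R = 3  # replicas consulted on a read (a majority of 4)


def write(replicas, acked_by, version):
    """Return a copy of the replica versions after a write is acknowledged."""
    updated = list(replicas)
    for i in acked_by:
        updated[i] = version
    return updated


def quorum_read(replicas, read_from):
    """Return the highest version seen among the replicas consulted."""
    return max(replicas[i] for i in read_from)


def always_reads_latest():
    """Try every write quorum against every read quorum."""
    start = [1] * N  # every replica starts at version 1
    for acked_by in combinations(range(N), W):
        replicas = write(start, acked_by, version=2)
        for read_from in combinations(range(N), R):
            if quorum_read(replicas, read_from) != 2:
                return False
    return True


print(always_reads_latest())
```

Reading from a single replica instead of a quorum would not give this guarantee, because that one replica might be the one that has not yet received the write.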
1. With the bounded-staleness consistency model, we can set how far reads from a replica may lag behind writes. So, there is a bounded lag between writing data and reading it.
2. We can configure bounded staleness in two ways: by number of item versions or by time interval.
3. If we set the time interval to zero, it acts like strong consistency.
4. We can configure bounded staleness across more than one region.
5. We can choose bounded staleness when consistency and availability have the same priority.
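The two staleness limits can be sketched as a simple check. This is an illustrative model only, not how Cosmos DB enforces the bound internally; the `StalenessBound` class, the `within_bound` function and the sample numbers are hypothetical. A read is treated as acceptable only while it lags by no more than the configured number of versions and no more than the configured time.

```python
from dataclasses import dataclass


@dataclass
class StalenessBound:
    max_versions: int   # how many versions a read may lag behind
    max_seconds: float  # how long a read may lag behind


def within_bound(bound, lag_versions, lag_seconds):
    """A read is acceptable only while it is inside both limits."""
    return (lag_versions <= bound.max_versions
            and lag_seconds <= bound.max_seconds)


relaxed = StalenessBound(max_versions=100, max_seconds=5.0)
zero = StalenessBound(max_versions=0, max_seconds=0.0)

print(within_bound(relaxed, lag_versions=20, lag_seconds=1.5))
print(within_bound(relaxed, lag_versions=150, lag_seconds=1.5))
print(within_bound(zero, lag_versions=1, lag_seconds=0.0))
```

With both limits at zero, any lag at all is rejected, which is why a zero bound behaves like strong consistency in this model.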
1. Session consistency is the default consistency level; it is also popular because it balances consistency with better throughput.
2. Within a session, a client always reads its own latest writes; clients outside that session, for example users in a read region, may see data that lags behind.
3. A client can override the default consistency on a per-request basis.
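The read-your-own-writes behaviour of a session can be sketched with a session token. This is a simplified model under my own assumptions, not the Cosmos DB SDK: the `Replica` and `SessionClient` classes are hypothetical, the token is just a version number, and replication to the second replica is deliberately left out so that it stays behind.

```python
class Replica:
    """One copy of the data, with the version of the last write it has seen."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        self.version = version
        self.value = value


class SessionClient:
    """Tracks the version of its own last write as a session token."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.session_token = 0

    def write(self, value):
        # The write lands on the first replica only; the others lag
        # until replication catches up (not modelled here).
        primary = self.replicas[0]
        primary.apply(primary.version + 1, value)
        self.session_token = primary.version

    def read(self):
        # Try the lagging replicas first, but skip any replica that has
        # not yet seen this session's writes.
        for replica in reversed(self.replicas):
            if replica.version >= self.session_token:
                return replica.value
        raise RuntimeError("no replica has caught up with this session")


replicas = [Replica(), Replica()]
writer = SessionClient(replicas)
writer.write("new address")
print(writer.read())

other = SessionClient(replicas)  # a different session, with no token yet
print(other.read())
```

The writer skips the stale replica and gets its own write back, while the other session is happy to read from the replica that has not caught up yet.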
1. With consistent prefix, writes are replicated to the replicas in sequence, so the order of writes is preserved.
2. If writes were performed in the order A, B, C, then a client sees either A; A,B; or A,B,C, but never an out-of-order result like A,C or B,A,C.
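The A, B, C rule above can be written as a small check. This is just a sketch to make the rule concrete; the function name `is_consistent_prefix` is my own and is not part of any Cosmos DB API.

```python
def is_consistent_prefix(writes, observed):
    """True when `observed` is the first len(observed) writes, in order."""
    return list(observed) == list(writes[:len(observed)])


writes = ["A", "B", "C"]
for observed in (["A"], ["A", "B"], ["A", "B", "C"], ["A", "C"], ["B", "A", "C"]):
    print(observed, is_consistent_prefix(writes, observed))
```

The first three views are accepted because each one is a prefix of the write order; the last two are rejected because they skip or reorder writes.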
1. Eventual consistency performs well because reads do not wait for data to be committed across replicas.
2. It is the opposite of strong consistency. The drawback is that reads can return stale or out-of-order data until the replicas converge.
3. Eventual consistency is the weakest consistency level, but it has the lowest latency.
4. The cost of a read operation with eventual consistency is low.
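The read-cost remarks for strong and eventual consistency can be turned into a rough estimate. The multipliers below are an assumption on my part, based on the Azure documentation's statement that reads at the two strongest levels cost roughly twice the RUs of the weaker levels; they are not figures from this article, and the function name and the sample base cost are hypothetical.

```python
# Assumed multipliers: reads at the two strongest levels consult more than
# one replica, so they cost roughly twice the RUs of the weaker levels.
READ_COST_FACTOR = {
    "Strong": 2.0,
    "BoundedStaleness": 2.0,
    "Session": 1.0,
    "ConsistentPrefix": 1.0,
    "Eventual": 1.0,
}


def estimate_read_ru(base_ru, level):
    """Rough RU estimate for one read at the given consistency level."""
    return base_ru * READ_COST_FACTOR[level]


for level in READ_COST_FACTOR:
    print(level, estimate_read_ru(1.0, level))
```

Treat the result as a planning figure only; the actual charge for a request is reported by the service.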
How to configure default consistency:
As I explained, you can set any consistency level as the default based on your needs. Here is how to set the default consistency in the portal.
1. Open the Azure portal at portal.azure.com and search for Cosmos DB.
2. Select Azure Cosmos DB, then click your existing Cosmos DB account.
3. In the feature menu, click Default consistency.
4. It shows the five consistency levels. Choose one and save.
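The portal steps set the account-wide default. A client can also ask for a level when it connects. Below is a minimal sketch that assumes the `azure-cosmos` Python package is installed; the `make_client` helper and the `CONSISTENCY_LEVELS` tuple are my own names, while `CosmosClient` and its `consistency_level` argument come from that package. My understanding is that a client can relax the account default this way but cannot ask for something stronger.

```python
CONSISTENCY_LEVELS = (
    "Strong",
    "BoundedStaleness",
    "Session",
    "ConsistentPrefix",
    "Eventual",
)


def make_client(url, key, level="Session"):
    """Create a Cosmos DB client that requests the given consistency level."""
    if level not in CONSISTENCY_LEVELS:
        raise ValueError(f"unknown consistency level: {level!r}")
    # Imported here so the validation above works even when the
    # azure-cosmos package is not installed.
    from azure.cosmos import CosmosClient
    return CosmosClient(url, credential=key, consistency_level=level)
```

A call would look like `make_client(account_url, account_key, level="Eventual")`, where the URL and key are the ones shown for your own account in the portal.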
How consistency levels apply to queries:
A select query behaves according to the consistency level we have chosen. The consistency level for queries is the same as the consistency level for reads. If we select strong consistency, query response time is higher, because strong consistency puts the consistency of the data first and guarantees it. If we select eventual consistency, query performance is high, but the result depends on which data is available on the replica at that moment.
Basically, query performance depends on indexing. Cosmos DB indexes data automatically; we can also configure certain collections to update their index lazily. Lazy indexing boosts write performance where we need bulk inserts but reads from the primary data center are heavy.