C# Cosmos DB simple "lock" functionality by implementing Optimistic Concurrency Control
Problem description
Recently I built and delivered a Queue Number service that's composed of a number of different applications. The service was for a shop that wanted to get rid of their old queue number machine that printed out a queue number on a piece of paper. The idea was to fully digitalize it and get rid of paper waste.
Below you'll see an excerpt of the architectural overview that is relevant to this blog post. Basically, a front-end client performs an HTTP request to an Azure Function, which in turn makes a PATCH request to update a property on a specific item in an Azure Cosmos DB container.
While testing the happy paths, everything seemed fine. But as soon as I started simulating multiple requests and, more importantly, concurrent requests, I could see an issue where two different users could get the exact same queue number. What a disaster 😮
Solution proposal
By using some form of a naive "lock", I was able to ensure that different concurrent users would not get the same number back.
Basically this solution consists of:
ETags: Cosmos DB uses ETags to handle optimistic concurrency. Every item you retrieve includes an ETag, and the ETag changes each time the item is updated. When updating the item you can pass the ETag you read as an If-Match condition, so that the update succeeds only if the ETag still matches the current state of the item. If it does not match, Cosmos DB rejects the write with HTTP 412 (Precondition Failed). In the older .NET SDK (v2) this condition is expressed with the "AccessCondition" class; in the v3 SDK used in the samples below it is the IfMatchEtag property on the request options. See more information about the AccessCondition class here.
Retry logic in my Azure Function: whenever I detect a collision I retry up to N times before I give up and let the user make another attempt to get a queue number.
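To make the two parts above concrete without needing a Cosmos DB account, here is a minimal in-memory sketch of the same idea. Everything in it is hypothetical and only for illustration: VersionedCounter stands in for the single queuenumber item, its version number plays the role of the ETag, and the lock simulates the database doing the "compare ETag, then write" step atomically. It is not how Cosmos DB is implemented, just the same pattern in miniature.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for the single "queuenumber" item.
public sealed class VersionedCounter
{
    private readonly object _gate = new();
    private int _value;
    private long _version; // plays the role of the ETag

    // Returns the current value together with its version tag.
    public (int Value, long Version) Read()
    {
        lock (_gate) { return (_value, _version); }
    }

    // Increments only if nobody has written since expectedVersion was read.
    // Returning false corresponds to a 412 Precondition Failed response.
    public bool TryIncrement(long expectedVersion, out int newValue)
    {
        lock (_gate)
        {
            if (_version != expectedVersion)
            {
                newValue = default;
                return false;
            }
            _value++;
            _version++;
            newValue = _value;
            return true;
        }
    }
}

public static class QueueNumbers
{
    // Read, attempt a conditional write, and retry on collision up to maxAttempts times.
    public static int? TakeNumber(VersionedCounter counter, int maxAttempts)
    {
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            var (_, version) = counter.Read();
            if (counter.TryIncrement(version, out var number)) return number;
        }
        return null; // gave up; the caller may ask again
    }
}

public static class Program
{
    public static void Main()
    {
        var counter = new VersionedCounter();
        var results = new int?[50];

        // 50 simulated users asking for a number concurrently
        Parallel.For(0, results.Length, i =>
            results[i] = QueueNumbers.TakeNumber(counter, maxAttempts: 1000));

        var issued = results.OfType<int>().ToList(); // drops the nulls (give-ups)
        Console.WriteLine($"issued={issued.Count}, distinct={issued.Distinct().Count()}");
    }
}
```

However many callers give up, the two printed counts should always be equal: a number is only handed out when the conditional write succeeded, so no two callers can receive the same one.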
Limitations
This solution obviously has some limitations, and you should evaluate your own context to decide whether this optimistic approach works. In my case, the number of visitors to the shop, and therefore the potential number of concurrent users and requests, is likely very small. If I had a situation with much more traffic and a higher likelihood of collisions, I'd probably opt for a completely different solution. In that case I'd likely redesign the data model to not have a single item that is updated by everyone... But this works fine for simpler cases with lower-traffic applications.
Implementation
Code samples here are based on an Azure Function running on .NET 8. They are just samples and not from the production code, so make sure to test them properly. They are only here to showcase a concept.
This blog post and code sample presumes that we have the following set up:
Cosmos DB with a container that has an item called
queuenumber.
The item should at least have a property namedlatestCreatedQueueNumber
that is an integer.Azure Function (http trigger)
Sample method to get new queue number
using System.Net;
using Microsoft.AspNetCore.Http;
using Microsoft.Azure.Cosmos;

// Returns the new queue number, or null if a concurrent request changed the item first
private static async Task<QueueNumberDto?> GetMyQueueNumberPleaseAsync(HttpRequest req, CosmosClient client)
{
    try
    {
        var queueNumberContainer = client.GetContainer("myDatabase", "myContainer");
        var itemId = "queuenumber";

        // Read the item to get its current ETag (in this sample the item's id is also its partition key value)
        var item = await queueNumberContainer.ReadItemAsync<QueueNumberDbEntity>(id: itemId, partitionKey: new PartitionKey(itemId));

        List<PatchOperation> operations = new()
        {
            PatchOperation.Increment("/latestCreatedQueueNumber", 1)
        };

        // The patch is applied only if the item's ETag still matches the one we just read
        var response = await queueNumberContainer.PatchItemAsync<QueueNumberDbEntity>
        (
            id: itemId,
            partitionKey: new PartitionKey(itemId),
            patchOperations: operations,
            requestOptions: new PatchItemRequestOptions
            {
                IfMatchEtag = item.ETag,
            }
        );

        return new QueueNumberDto(response.Resource.LatestCreatedQueueNumber);
    }
    catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.PreconditionFailed)
    {
        // A conflict occurred during the patch request: the item's ETag was changed by a concurrent call after we read it.
        // Any other CosmosException is not a collision, so it is left to propagate to the caller.
        // Do some logging etc
        return null;
    }
}
Sample code to showcase the Function and retry logic
[Function("QueueNumber")]
public async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Function, "get", Route = "queuenumber")] HttpRequest req)
{
    const int maxAttempts = 4; // set some max number of attempts, 4 here

    try
    {
        // Hard-coded to keep the sample short; read it from configuration in real code
        var connectionString = "TheConnectionStringGoesHere";
        // Created per call to keep the sample short; Microsoft recommends reusing a single CosmosClient for the lifetime of the app
        using CosmosClient client = new(connectionString);

        QueueNumberDto? queueNumberDto = null;

        // A null dto means we likely had a collision where requests were made at the same time, so try again
        for (var attempt = 1; attempt <= maxAttempts && queueNumberDto == null; attempt++)
        {
            queueNumberDto = await GetMyQueueNumberPleaseAsync(req, client);
        }

        if (queueNumberDto == null)
        {
            // Every attempt collided; give up and let the user try again
            return new NotFoundResult();
        }

        return new OkObjectResult(queueNumberDto);
    }
    catch (Exception ex)
    {
        _log.LogError(ex, "Something went wrong - general exception caught");
        return new StatusCodeResult(500);
    }
}
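The sample above retries immediately. When several requests collide at once, retrying straight away can make them collide again, so one possible refinement is to wait a short, randomized time between attempts. The sketch below is only an illustration of that idea and not part of the original solution: the Retry class, WithBackoffAsync, and the delay values are all made-up names and numbers, and the "operation" in Main is a fake that fails twice before succeeding.

```csharp
using System;
using System.Threading.Tasks;

public static class Retry
{
    // Hypothetical helper: calls attempt() until it returns non-null or maxAttempts is reached,
    // waiting longer (plus random jitter) after each failed attempt.
    public static async Task<T?> WithBackoffAsync<T>(
        Func<Task<T?>> attempt, int maxAttempts, TimeSpan baseDelay) where T : class
    {
        for (var i = 1; i <= maxAttempts; i++)
        {
            var result = await attempt();
            if (result != null) return result;
            if (i == maxAttempts) break; // no point waiting after the last attempt

            // Exponential backoff: baseDelay, 2x, 4x, ... plus up to 100% random jitter
            var backoffMs = baseDelay.TotalMilliseconds * Math.Pow(2, i - 1);
            var jitterMs = Random.Shared.NextDouble() * backoffMs;
            await Task.Delay(TimeSpan.FromMilliseconds(backoffMs + jitterMs));
        }
        return null;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var calls = 0;

        // Simulated operation that "collides" (returns null) twice before succeeding
        var result = await Retry.WithBackoffAsync<string>(
            () => Task.FromResult<string?>(++calls < 3 ? null : "got a queue number"),
            maxAttempts: 4,
            baseDelay: TimeSpan.FromMilliseconds(10));

        Console.WriteLine($"{result} after {calls} calls");
    }
}
```

In the Function, the retry loop could then be replaced by something like Retry.WithBackoffAsync(() => GetMyQueueNumberPleaseAsync(req, client), 4, TimeSpan.FromMilliseconds(20)), assuming QueueNumberDto is a class or record. For a low-traffic shop the immediate retry is probably fine; the delay mainly helps when collisions come in bursts.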