Using the If-Match Header to Avoid Collisions

by Christoph Schiessl on Python and FastAPI

If you are building an HTTP API and have to support users with concurrent access to the same updatable resource, you must safeguard against collisions. There are many ways to do this, but today, I want to discuss one approach that uses the ETag response header and the If-Match request header.

What are Collisions?

Imagine you have your web application running on localhost:8000 with the following two endpoints:

  • GET /documents/{id} — returns the current content of the document, identified by the given id.
  • PUT /documents/{id} — accepts a request body with some content and creates a new document under the given id (or replaces the document's content if it already exists).

Now, consider the following timeline of two users, Alice and Bob, who are concurrently interacting with your web application:

  1. Alice loads the current content of document number 1 with GET /documents/1.
  2. Bob also loads the current content of document number 1 with GET /documents/1. At this point, Alice and Bob have the same content.
  3. Both users modify their local copy of the content somehow.
  4. Alice is happy with her modifications and submits her finished work to the server with PUT /documents/1.
  5. Shortly after, Bob is also done with his modifications and submits his updated local content to the server with PUT /documents/1.

The last step causes the collision! Bob is unaware that Alice has already updated the document, so he doesn't know that his local copy is outdated. When he saves his work, he overwrites the changes that Alice saved before him. This situation is known as a lost update, and there are many ways to deal with it. For instance, we could introduce pessimistic locking, which gives one user exclusive access to a document and blocks updates from other users while that user modifies the content.

That said, pessimistic locking is often overkill because, in many cases, it's good enough to detect conflicts when they occur so that later updates are blocked and earlier updates aren't overwritten.
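To make the problem concrete, the timeline above can be replayed in a few lines of plain Python. This is only a sketch: the `documents` dictionary and the two helper functions are hypothetical stand-ins for the server and its two endpoints, not part of any real API.

```python
# Hypothetical in-memory stand-in for the server's document store.
documents: dict[str, str] = {"1": "original"}


def get_document(doc_id: str) -> str:
    """Mimics GET /documents/{id}: return the current content."""
    return documents[doc_id]


def put_document(doc_id: str, content: str) -> None:
    """Mimics an unconditional PUT /documents/{id}: always overwrite."""
    documents[doc_id] = content


# Steps 1 and 2: Alice and Bob load the same content.
alice_copy = get_document("1")
bob_copy = get_document("1")

# Step 3: both modify their local copies independently.
alice_copy += " + Alice's edit"
bob_copy += " + Bob's edit"

# Steps 4 and 5: Alice saves first, then Bob saves his outdated copy.
put_document("1", alice_copy)
put_document("1", bob_copy)

final_content = get_document("1")
print(final_content)  # Alice's edit is no longer part of the document
```

Nothing in this flow tells Bob that his copy was stale, which is exactly what makes lost updates hard to notice.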

How to Prevent Lost Updates?

We must extend our API endpoints to support the ETag response header and the If-Match request header to detect conflicts and block users from accidentally overwriting each other's updates. Essentially, we have to implement the following logic:

  • GET /documents/{id}: returns the document's current content and an ETag response header that represents the document's current content (e.g., using some hash function). Note that If-Match is specified to use the strong comparison function (RFC 9110), so this should be a strong ETag; a weak ETag never satisfies an If-Match precondition.
  • PUT /documents/{id}: accepts a request body with some content and creates a new document identified by the given id (or replaces the document's content if it already exists). It also accepts an optional If-Match header. If the header is present and its value matches the document's current ETag, the endpoint performs the update as before. If the two values don't match, the request must fail with the status 412 Precondition Failed and leave the document's content unchanged.
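The decision logic of the PUT endpoint can be sketched as a pure function, independent of any web framework. The names `compute_etag` and `put_status` are hypothetical, and MD5 is just one possible choice of hash function.

```python
from hashlib import md5
from typing import Optional


def compute_etag(content: str) -> str:
    """Derive an ETag value from the content (here: an MD5 hex digest)."""
    return md5(content.encode()).hexdigest()


def put_status(current_content: Optional[str], if_match: Optional[str]) -> int:
    """Return the HTTP status a PUT should produce.

    current_content is None if the document doesn't exist yet;
    if_match is None if the client didn't send an If-Match header.
    """
    if current_content is None:
        return 201  # Created: there is nothing that could be overwritten
    if if_match and if_match != compute_etag(current_content):
        return 412  # Precondition Failed: the client's copy is outdated
    return 204  # No Content: the update may proceed


etag = compute_etag("some content")
print(put_status(None, None))                 # new document
print(put_status("some content", etag))       # matching If-Match
print(put_status("changed content", etag))    # stale If-Match
print(put_status("changed content", None))    # no If-Match at all
```

Keeping the decision separate from the storage makes it easy to see that the check only ever compares two strings: the ETag the client remembers and the ETag of what the server currently holds.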

Given this extended API, we can run the timeline from before again and observe the different behavior:

  1. Alice loads the current content of document number 1 with GET /documents/1.
  2. Bob also loads the current content of document number 1 with GET /documents/1. At this point, Alice and Bob have the same content and the same ETag value representing the content.
  3. Both users modify their local copy of the content somehow.
  4. Alice is happy with her modifications and submits her finished work to the server with PUT /documents/1. She also adds an If-Match header with the ETag value she received when loading the document.
  5. Shortly after, Bob is also done with his modifications and submits his updated local copy to the server with PUT /documents/1. He, too, adds an If-Match header with the ETag value he received when loading the document.

Now, we get a different behavior for the last step. The If-Match header Bob submitted no longer matches the ETag value the server computes from the document's current content, because Alice's update changed the content and, with it, the ETag value. Thanks to the new logic in the extended API, the server rejects Bob's PUT request with the status 412 Precondition Failed. Previously, the same request would have silently overwritten Alice's update.
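The same timeline can be replayed against a small in-memory model of the extended API. The `DocumentStore` class below is a hypothetical sketch that returns status codes as plain integers; it is meant to illustrate the flow, not to replace a real endpoint.

```python
from hashlib import md5
from typing import Optional


class DocumentStore:
    """Hypothetical in-memory model of the two extended endpoints."""

    def __init__(self) -> None:
        self._documents: dict[str, str] = {}

    @staticmethod
    def _etag(content: str) -> str:
        return md5(content.encode()).hexdigest()

    def get(self, doc_id: str) -> tuple[str, str]:
        """Mimics GET: return the content together with its ETag."""
        content = self._documents[doc_id]
        return content, self._etag(content)

    def put(self, doc_id: str, content: str, if_match: Optional[str] = None) -> int:
        """Mimics PUT: return the status code the server would send."""
        current = self._documents.get(doc_id)
        if current is None:
            self._documents[doc_id] = content
            return 201
        if if_match and if_match != self._etag(current):
            return 412  # stale ETag: leave the stored content untouched
        self._documents[doc_id] = content
        return 204


store = DocumentStore()
store.put("1", "original")

# Steps 1 and 2: both users load the same content and the same ETag.
alice_content, alice_etag = store.get("1")
bob_content, bob_etag = store.get("1")

# Steps 4 and 5: both users submit their work with If-Match.
alice_status = store.put("1", alice_content + " + Alice's edit", if_match=alice_etag)
bob_status = store.put("1", bob_content + " + Bob's edit", if_match=bob_etag)

print(alice_status, bob_status)
print(store.get("1")[0])  # Alice's update survives
```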

Blocking Conflicting Updates with FastAPI

All of this isn't too hard to implement in FastAPI. The GET endpoint below attempts to return the content of the document with the given id. If this document doesn't exist, it simply fails with the status 404 Not Found. On the other hand, if it does exist, there are two possible responses.

Firstly, if the request has an If-None-Match header and its value matches the calculated ETag value, then the request is short-circuited with 304 Not Modified (i.e., HTTP caching behavior). Secondly, if the request doesn't include an If-None-Match header or its value doesn't match the computed ETag value, the server responds with a normal 200 OK status and returns the document's content in the response body.

For the PUT endpoint, we start by determining whether the given document id is already known. If not, we create a new document no matter what. However, if this request is about updating an existing document, we have two options again.

Firstly, if the request has an If-Match header and its value doesn't match the calculated ETag value, then the request fails with the status 412 Precondition Failed, and the document's content is not updated. Secondly, if the request doesn't include an If-Match header or its value matches the computed ETag value, the server responds with a 204 No Content status, and it updates the document's content according to the text in the request body.

from hashlib import md5

import uvicorn
from fastapi import FastAPI, Request, status
from fastapi.responses import PlainTextResponse

app = FastAPI()

documents_database: dict[str, str] = {}


@app.get("/documents/{id}")
async def get_document(request: Request, id: str) -> PlainTextResponse:
    if (current_content := documents_database.get(id)) is not None:
        # Derive the ETag from the current content. Note: the HTTP specification
        # wraps ETag values in double quotes; this demo omits them for brevity.
        etag = md5(current_content.encode()).hexdigest()
        if request.headers.get("If-None-Match") == etag:
            # The client's cached copy is still current, so skip the body.
            return PlainTextResponse(status_code=status.HTTP_304_NOT_MODIFIED, headers={"ETag": etag})
        return PlainTextResponse(status_code=status.HTTP_200_OK, headers={"ETag": etag}, content=current_content)
    else:
        return PlainTextResponse(status_code=status.HTTP_404_NOT_FOUND)


@app.put("/documents/{id}")
async def put_document(request: Request, id: str) -> PlainTextResponse:
    if (current_content := documents_database.get(id)) is not None:
        etag = md5(current_content.encode()).hexdigest()
        # Only enforce the precondition if the client sent an If-Match header.
        if (if_match := request.headers.get("If-Match")) and if_match != etag:
            # The document changed since the client loaded it: reject the update.
            return PlainTextResponse(status_code=status.HTTP_412_PRECONDITION_FAILED)
        documents_database[id] = (await request.body()).decode()
        return PlainTextResponse(status_code=status.HTTP_204_NO_CONTENT)
    else:
        # Unknown id: create the document regardless of any If-Match header.
        documents_database[id] = (await request.body()).decode()
        return PlainTextResponse(status_code=status.HTTP_201_CREATED)


if __name__ == "__main__":
    uvicorn.run(app=app)

If we run this program with python app.py, we can test it using curl.

$ curl -i -s -X GET localhost:8000/documents/1
HTTP/1.1 404 Not Found
date: Fri, 14 Jun 2024 07:09:41 GMT
server: uvicorn
content-length: 0
content-type: text/plain; charset=utf-8

$ curl -i -s -X PUT --header "Content-Type: text/plain" --data "This is the content." localhost:8000/documents/1
HTTP/1.1 201 Created
date: Fri, 14 Jun 2024 07:09:49 GMT
server: uvicorn
content-length: 0
content-type: text/plain; charset=utf-8

$ curl -i -s -X GET localhost:8000/documents/1
HTTP/1.1 200 OK
date: Fri, 14 Jun 2024 07:10:00 GMT
server: uvicorn
etag: 3cf463e834022556dbf5f5dc571cbb0f
content-length: 20
content-type: text/plain; charset=utf-8

This is the content.

Document number 1 doesn't exist initially. After creating it, we can fetch its content in subsequent GET requests. The responses to these GET requests include an ETag header whose value we can use to see the HTTP caching behavior in action ...

$ curl -i -s -X GET --header "If-None-Match: 3cf463e834022556dbf5f5dc571cbb0f" localhost:8000/documents/1
HTTP/1.1 304 Not Modified
date: Fri, 14 Jun 2024 07:10:40 GMT
server: uvicorn
etag: 3cf463e834022556dbf5f5dc571cbb0f
content-type: text/plain; charset=utf-8

We can then proceed to update the document's content, using the same ETag value for the If-Match header. The PUT succeeds and responds with the status 204 No Content because the value of the If-Match header matches the calculated ETag value. The successful update changes the document's ETag value, so a conditional GET that still sends the original ETag value in If-None-Match no longer gets 304 Not Modified; it receives 200 OK with the new content instead.

$ curl -i -s -X PUT --header "Content-Type: text/plain" --header "If-Match: 3cf463e834022556dbf5f5dc571cbb0f" --data "This is *updated* content." localhost:8000/documents/1
HTTP/1.1 204 No Content
date: Fri, 14 Jun 2024 07:10:49 GMT
server: uvicorn
content-type: text/plain; charset=utf-8

$ curl -i -s -X GET --header "If-None-Match: 3cf463e834022556dbf5f5dc571cbb0f" localhost:8000/documents/1
HTTP/1.1 200 OK
date: Fri, 14 Jun 2024 07:11:11 GMT
server: uvicorn
etag: c7557d8a6989d574813b034fa42c86f6
content-length: 26
content-type: text/plain; charset=utf-8

This is *updated* content.
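The change in the etag header follows directly from how the server derives it: the value is simply the MD5 hex digest of the stored content. The following sketch recomputes it outside of FastAPI; the helper name `compute_etag` is hypothetical, and the output should correspond to the etag values in the transcripts only if the request bodies were stored exactly as typed.

```python
from hashlib import md5


def compute_etag(content: str) -> str:
    """Same derivation as in the endpoints: MD5 hex digest of the content."""
    return md5(content.encode()).hexdigest()


original_etag = compute_etag("This is the content.")
updated_etag = compute_etag("This is *updated* content.")

print(original_etag)
print(updated_etag)
print(original_etag == updated_etag)  # different content, different ETag
```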

Next, we try to PUT the document again, still using the original ETag value for the If-Match request header. This time, the request fails with the status 412 Precondition Failed and does not update the document's content. This is the whole point of this article: it is precisely the behavior that prevents lost updates in the real world.

$ curl -i -s -X PUT --header "Content-Type: text/plain" --header "If-Match: 3cf463e834022556dbf5f5dc571cbb0f" --data "This is *differently updated* content." localhost:8000/documents/1
HTTP/1.1 412 Precondition Failed
date: Fri, 14 Jun 2024 07:11:20 GMT
server: uvicorn
content-length: 0
content-type: text/plain; charset=utf-8

$ curl -i -s -X GET localhost:8000/documents/1
HTTP/1.1 200 OK
date: Fri, 14 Jun 2024 07:11:29 GMT
server: uvicorn
etag: c7557d8a6989d574813b034fa42c86f6
content-length: 26
content-type: text/plain; charset=utf-8

This is *updated* content.
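What a client does after receiving 412 Precondition Failed is up to the application. One common pattern is to reload the document, reapply the change, and try again. The sketch below assumes the change can be expressed as a function of the current content; `update_with_retry` and the fake in-memory server are hypothetical, and the transport is passed in as plain callables so that the loop doesn't depend on any particular HTTP library.

```python
from hashlib import md5
from typing import Callable


def update_with_retry(
    fetch: Callable[[], tuple[str, str]],
    save: Callable[[str, str], int],
    modify: Callable[[str], str],
    max_attempts: int = 3,
) -> bool:
    """Read-modify-write loop that re-fetches whenever the server answers 412."""
    for _ in range(max_attempts):
        content, etag = fetch()               # GET: content plus its ETag
        status = save(modify(content), etag)  # PUT with If-Match set to that ETag
        if status != 412:
            return 200 <= status < 300        # finished, successfully or not
    return False                              # gave up after repeated conflicts


# Hypothetical in-memory server, used only to exercise the loop.
server = {"content": "v1", "reads": 0}


def etag_of(content: str) -> str:
    return md5(content.encode()).hexdigest()


def fetch() -> tuple[str, str]:
    server["reads"] += 1
    snapshot = (server["content"], etag_of(server["content"]))
    if server["reads"] == 1:
        # Simulate Alice saving right after Bob's first read.
        server["content"] = "v2 by Alice"
    return snapshot


def save(content: str, if_match: str) -> int:
    if if_match != etag_of(server["content"]):
        return 412
    server["content"] = content
    return 204


succeeded = update_with_retry(fetch, save, lambda text: text + " + Bob's edit")
print(succeeded, server["reads"], server["content"])
```

Retrying automatically only makes sense when the modification can be safely reapplied to newer content. For edits made by a human, it is usually better to show the conflict to the user and let them decide how to merge.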

Finally, if we PUT the document again but don't include an If-Match request header, then all bets are off. The behavior that gave us 412 Precondition Failed is not active, meaning there is no protection against lost updates.

$ curl -i -s -X PUT --header "Content-Type: text/plain" --data "This is *differently updated* content." localhost:8000/documents/1
HTTP/1.1 204 No Content
date: Fri, 14 Jun 2024 07:11:37 GMT
server: uvicorn
content-type: text/plain; charset=utf-8

$ curl -i -s -X GET localhost:8000/documents/1
HTTP/1.1 200 OK
date: Fri, 14 Jun 2024 07:11:45 GMT
server: uvicorn
etag: d8ed2dc3564dbab400f8b4ec488b0f6e
content-length: 38
content-type: text/plain; charset=utf-8

This is *differently updated* content.
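The implementation in this article deliberately keeps If-Match optional. If that gap is a concern, one possible variation is to reject unconditional updates of existing documents with 428 Precondition Required, a status code defined in RFC 6585 for this purpose. The sketch below is an assumption about how such a rule could look, not part of the code above, and the function name `put_status_strict` is hypothetical.

```python
from http import HTTPStatus
from typing import Optional


def put_status_strict(current_etag: Optional[str], if_match: Optional[str]) -> HTTPStatus:
    """Like the PUT logic in this article, but If-Match is mandatory for updates.

    current_etag is None if the document doesn't exist yet;
    if_match is None if the client didn't send an If-Match header.
    """
    if current_etag is None:
        return HTTPStatus.CREATED                # nothing to overwrite
    if not if_match:
        return HTTPStatus.PRECONDITION_REQUIRED  # 428: refuse blind overwrites
    if if_match != current_etag:
        return HTTPStatus.PRECONDITION_FAILED    # 412: the client's copy is stale
    return HTTPStatus.NO_CONTENT                 # 204: safe to update


print(put_status_strict(None, None))
print(put_status_strict("abc", None))
print(put_status_strict("abc", "xyz"))
print(put_status_strict("abc", "abc"))
```

The trade-off is that every client must now load a document before updating it, which may break existing clients that were written against the more lenient behavior.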

Conclusion

I hope this article made you consider competing updates and inspired you to write safer code for your next project. Thank you very much for reading, and see you soon!

