ETag and If-None-Match Headers
by Christoph Schiessl on Python and FastAPI
When you build web applications, you generally want to limit their resource consumption as much as possible. Usually, you want to keep file sizes for transfer to the client small or, even better, avoid transfers altogether. Modern browsers have a variety of built-in mechanisms that make caching of previously requested resources seamless, thereby avoiding repeated transfers of the same data in many cases. One of these mechanisms, and the topic of this article, is the ETag header.
ETag response header
The idea behind the ETag header (short for entity tag) is easy to explain: when the HTTP server delivers a resource (e.g., a file), it adds an ETag response header containing a value that identifies the current version of that resource. For this purpose, it's common to use a hash of the response body (e.g., computed with the SHA-1 algorithm). Alternatives, such as the currently deployed Git revision of the requested resource, are also feasible. Whatever you use, you must ensure that the value changes whenever the underlying resource changes.
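To make the body-hash variant concrete, here is a minimal sketch. The helper name body_etag is hypothetical and not part of FastAPI; it only illustrates the idea of deriving the header value from the response body.

```python
import hashlib


def body_etag(body: bytes) -> str:
    """Return a quoted ETag value derived from the SHA-1 hash of the body.

    Hypothetical helper for illustration; not a FastAPI or Starlette API.
    """
    return '"' + hashlib.sha1(body).hexdigest() + '"'


page = b"<!doctype html><meta charset=utf-8><title>/index.html</title>"

# The same body always yields the same ETag ...
same = body_etag(page) == body_etag(page)
# ... and any change to the body yields a different one.
changed = body_etag(page) != body_etag(page + b"\n")

print(body_etag(page), same, changed)
```

The trade-off of this approach is that the server has to read the whole body to compute the value, which is one reason why cheaper inputs are sometimes used instead.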
Imagine having a simple FastAPI application and a _site directory containing a single index.html file.
import uvicorn
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
app = FastAPI()
app.mount("/", StaticFiles(directory="_site", html=True), name="_site")
uvicorn.run(app=app, port=3000)
$ tree --noreport .
.
├── app.py
└── _site
    └── index.html
If you start your FastAPI app with python app.py and request the index.html file, you'll see the ETag header:
$ http GET http://localhost:3000/index.html
HTTP/1.1 200 OK
content-length: 62
content-type: text/html; charset=utf-8
date: Tue, 02 Apr 2024 18:29:56 GMT
etag: "ad14c587836bb86ca86326dd61c9bf23"
last-modified: Tue, 02 Apr 2024 18:29:50 GMT
server: uvicorn
<!doctype html><meta charset=utf-8><title>/index.html</title>
The HTTP client doesn't need to know, and usually doesn't care, how the HTTP server calculates its ETag headers, but in this case, you can look it up in the source code (FastAPI's StaticFiles comes from Starlette). As it turns out, StaticFiles uses an MD5 hash of the file's modification timestamp and its size in bytes, joined by a hyphen. Knowing how that works, we can replicate the calculation on the command line:
$ python -c "import os; s = os.stat('_site/index.html'); print(f'{s.st_mtime}-{s.st_size}')" | xargs echo -n | md5sum
ad14c587836bb86ca86326dd61c9bf23 -
Here, I use the os.stat() function to get meta-information about index.html, and then I concatenate and print its modification time and size. Next, I use echo -n to remove the trailing line break that Python prints out, and finally, I pipe the result into md5sum to calculate the hash. And sure enough, the hash value from the command line matches the one we got in the HTTP response header.
We can also reverse this and use the command line to predict the hash values we will get from the HTTP response. For example, if we somehow modify index.html so that its modification time and/or file size change, we can use the shell script from above again to recalculate the hash value ...
$ touch _site/index.html
$ python -c "import os; s = os.stat('_site/index.html'); print(f'{s.st_mtime}-{s.st_size}')" | xargs echo -n | md5sum
eb2785e4b5012179e9ffffca80d32eb7 -
... then we get the same hash value that we will get from a subsequent HTTP response.
$ http GET http://localhost:3000/index.html
HTTP/1.1 200 OK
content-length: 62
content-type: text/html; charset=utf-8
date: Tue, 02 Apr 2024 18:31:10 GMT
etag: "eb2785e4b5012179e9ffffca80d32eb7"
last-modified: Tue, 02 Apr 2024 18:30:56 GMT
server: uvicorn
<!doctype html><meta charset=utf-8><title>/index.html</title>
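The same calculation, and the effect of changing the modification time, can also be sketched in pure Python. This is an approximation based on the description in this article, not a copy of the library code; the function name stat_etag is hypothetical, and the demo uses a temporary directory instead of _site so that it runs anywhere.

```python
import hashlib
import os
import tempfile


def stat_etag(path: str) -> str:
    """MD5 over "<mtime>-<size>", wrapped in double quotes like the header value.

    Hypothetical helper mirroring the shell pipeline from this article.
    """
    s = os.stat(path)
    base = f"{s.st_mtime}-{s.st_size}"
    return '"' + hashlib.md5(base.encode()).hexdigest() + '"'


with tempfile.TemporaryDirectory() as site:
    path = os.path.join(site, "index.html")
    with open(path, "w") as f:
        f.write("<!doctype html><meta charset=utf-8><title>/index.html</title>\n")

    before = stat_etag(path)

    # Move the modification time forward by ten seconds, similar to `touch`.
    s = os.stat(path)
    os.utime(path, (s.st_atime, s.st_mtime + 10))

    after = stat_etag(path)

# The content is unchanged, but the ETag differs because the mtime changed.
print(before, after, before != after)
```

Note that this kind of ETag changes even when the file content stays the same, which costs an unnecessary transfer but never serves stale content.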
This was the most important information about the ETag header and its implementation in FastAPI. However, this is only one side of the coin, and it doesn't mean anything without the other side, namely client-side support with the If-None-Match header.
If-None-Match request header
If the HTTP client supports ETag caching and receives a response that includes an ETag header, then it will copy the value of this header (including double quotes) and include it in subsequent requests for the same resource. This is done with the If-None-Match request header, which is interpreted by the HTTP server as follows:
If the value of the ETag header that the server would send in its response and the value of the If-None-Match header in the request are the same, it responds with the status 304 Not Modified (without a response body).
Otherwise, it responds with the status 200 OK (with a response body).
The bottom line is that the server doesn't resend a response body that the client already has, which saves the resources that the repeated transfer would have wasted.
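The server-side decision described above can be sketched as a small pure function. Everything here is an assumption for illustration: the name conditional_response, the lowercase header keys, and the plain (status, headers, body) tuple are not FastAPI APIs. The sketch also only handles a single, exactly matching value, whereas a real If-None-Match header may carry a list of values or the wildcard *.

```python
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def conditional_response(
    request_headers: Dict[str, str], current_etag: str, body: bytes
) -> Response:
    """Decide between 304 Not Modified and 200 OK (hypothetical helper).

    Assumes lowercase header names and a single ETag in If-None-Match.
    """
    if request_headers.get("if-none-match") == current_etag:
        # The client already has this version: send no body.
        return 304, {"etag": current_etag}, b""
    # No header, or a stale value: send the full body again.
    headers = {"etag": current_etag, "content-length": str(len(body))}
    return 200, headers, body


etag = '"eb2785e4b5012179e9ffffca80d32eb7"'
page = b"<!doctype html><meta charset=utf-8><title>/index.html</title>\n"

first = conditional_response({}, etag, page)
repeat = conditional_response({"if-none-match": etag}, etag, page)
print(first[0], repeat[0])
```

Because the comparison includes the double quotes, the client has to send the value back exactly as it received it.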
Fortunately, httpie makes it straightforward to include If-None-Match headers in requests, so we can simulate this behavior on the command line.
$ http GET http://localhost:3000/index.html
HTTP/1.1 200 OK
content-length: 62
content-type: text/html; charset=utf-8
date: Tue, 02 Apr 2024 18:33:23 GMT
etag: "eb2785e4b5012179e9ffffca80d32eb7"
last-modified: Tue, 02 Apr 2024 18:30:56 GMT
server: uvicorn
<!doctype html><meta charset=utf-8><title>/index.html</title>
$ http GET http://localhost:3000/index.html 'If-None-Match: "eb2785e4b5012179e9ffffca80d32eb7"'
HTTP/1.1 304 Not Modified
date: Tue, 02 Apr 2024 18:33:48 GMT
etag: "eb2785e4b5012179e9ffffca80d32eb7"
server: uvicorn
$ # touch the file so that its ETag changes
$ touch _site/index.html
$ http GET http://localhost:3000/index.html 'If-None-Match: "eb2785e4b5012179e9ffffca80d32eb7"'
HTTP/1.1 200 OK
content-length: 62
content-type: text/html; charset=utf-8
date: Tue, 02 Apr 2024 18:33:57 GMT
etag: "8aeaf4ec41fe782adf2f7b86d884754a"
last-modified: Tue, 02 Apr 2024 18:33:55 GMT
server: uvicorn
<!doctype html><meta charset=utf-8><title>/index.html</title>
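On the client side, the exchange above could be automated roughly as follows. This is only a sketch of what a caching client does: the ETagCache class, the transport callable, and the fake_server function are all hypothetical names, and the fake server stands in for a real HTTP request so that the example runs without a network.

```python
from typing import Callable, Dict, Tuple

# (status, headers, body), as returned by a hypothetical transport function.
Response = Tuple[int, Dict[str, str], bytes]
Transport = Callable[[str, Dict[str, str]], Response]


class ETagCache:
    """Remembers one (ETag, body) pair per URL and revalidates on each request."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def get(self, url: str) -> bytes:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            # Send the stored value back unchanged, double quotes included.
            headers["If-None-Match"] = cached[0]
        status, response_headers, body = self._transport(url, headers)
        if status == 304 and cached is not None:
            return cached[1]  # Not modified: reuse the stored body.
        etag = response_headers.get("etag")
        if etag is not None:
            self._entries[url] = (etag, body)
        return body


requests_seen = []


def fake_server(url: str, headers: Dict[str, str]) -> Response:
    """Stand-in for a real HTTP request; always serves the same version."""
    requests_seen.append(dict(headers))
    if headers.get("If-None-Match") == '"v1"':
        return 304, {"etag": '"v1"'}, b""
    return 200, {"etag": '"v1"'}, b"<!doctype html>"


cache = ETagCache(fake_server)
first_body = cache.get("/index.html")   # 200, body transferred
second_body = cache.get("/index.html")  # 304, body taken from the cache
print(first_body == second_body, requests_seen)
```

Browsers do this bookkeeping automatically, which is why you usually only notice it in the network tab of the developer tools.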
That's everything for today. You now know the basics of ETag-based HTTP caching and have seen it work in a FastAPI application. Thank you for reading, and see you soon!