Achieving S3 Read-After-Update Consistency

Originally posted on the Hatch Blog

The team at Hatch spun up the Labour Exchange in a few days, repurposing our tech to help stood-down workers find employment during the COVID-19 crisis.

To get the system up and running in such a short time frame, we decided to use S3 as a flat-file data store to maintain our serverless batch job states and caches. After a very cursory search to satisfy ourselves that S3 would guarantee read-after-write consistency, we flew on.

From the documentation:

Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all Regions with one caveat. The caveat is that if you make a HEAD or GET request to a key name before the object is created, then create the object shortly after that, a subsequent GET might not return the object due to eventual consistency.

Unfortunately, we missed the bolded section and the part further down the page that explicitly states our use case:

A process replaces an existing object and immediately tries to read it. Until the change is fully propagated, Amazon S3 might return the previous data.

As these processes run on a schedule (not as part of a user-facing API), we could afford to spend some extra calls to S3 to roll our own read-after-update consistency. Knowing that S3 guarantees read-after-write consistency for the first write of a new key (read-after-firstWrite), we can write a new file for every change, read the latest file, and make sure we clean up the older ones.

So every time we write a file we:

  • Append a timestamp to the filename
  • Remove older files

When we read a file we:

  • List all files with the key prefix (S3 guarantees that listings are returned in ascending UTF-8 binary order, so the newest timestamp comes last)
  • Get the newest file in the list
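
To make the two procedures above concrete before looking at the real S3 code, here is a minimal in-memory sketch. The `bucket` map and the `listKeys`, `writeState`, and `readState` names are hypothetical stand-ins rather than the helpers from our codebase, and the sketch assumes ASCII keys, for which JavaScript string comparison matches UTF-8 binary order.

```typescript
// Hypothetical in-memory stand-in for a bucket: object key -> JSON body.
const bucket = new Map<string, string>();

// Mimics an S3 listing: keys with the given prefix in ascending order.
// Assumption: keys are ASCII, so plain string comparison matches
// UTF-8 binary order.
const listKeys = (prefix: string): string[] =>
  [...bucket.keys()]
    .filter((k) => k.startsWith(prefix))
    .sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));

// Write: append a timestamp to the key, then remove all older versions.
// `now` is passed in explicitly so the example is deterministic.
const writeState = (key: string, state: unknown, now: number): string => {
  const versionedKey = `${key}.${now}`;
  bucket.set(versionedKey, JSON.stringify(state));
  const keys = listKeys(`${key}.`);
  keys.slice(0, keys.length - 1).forEach((k) => bucket.delete(k));
  return versionedKey;
};

// Read: list by prefix and take the last (newest) key, or fall back
// to the default when nothing has been written yet.
const readState = <T>(key: string, defaultVal: T): T => {
  const keys = listKeys(`${key}.`);
  const newest = keys[keys.length - 1];
  const body = newest === undefined ? undefined : bucket.get(newest);
  if (body === undefined) return defaultVal;
  return JSON.parse(body) as T;
};

writeState("jobs/import", { cursor: 1 }, 1588000000000);
writeState("jobs/import", { cursor: 2 }, 1588000060000);
console.log(readState("jobs/import", { cursor: 0 })); // { cursor: 2 }
console.log(listKeys("jobs/import.")); // [ 'jobs/import.1588000060000' ]
```

Every write lands on a key that has never existed before, so each read only ever fetches an object that was written exactly once, which is the case S3 promises to be consistent.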

import { writeS3Obj, getS3FileAsObj, listObjects, deleteS3Files } from "./S3";

/**
 * S3 does not provide read-after-update consistency.
 * It does provide read-after-firstWrite consistency (as long as no GET has been requested)
 * We write a new file every time it changes, and we read the latest file.
 * S3 guarantees list of files are sorted in ascending UTF-8 Binary Order
 */

// Delete every file under the key prefix except the newest (last-listed) one.
const cleanUp = async (key: string) => {
  const response = await listObjects({
    MaxKeys: 1000,
    Bucket: process.env.BUCKET_NAME!,
    Prefix: key,
  });
  const keys = response.Contents?.map((c) => c.Key!) || [];
  await deleteS3Files(keys.slice(0, keys.length - 1));
};

// Write the state to a new, timestamped key, then remove older versions.
export const writeServiceState = async (key: string, state: any) => {
  await writeS3Obj(`${key}.${Date.now()}`, state);
  await cleanUp(key);
};

// Read the newest version of the state, or return defaultVal if none exists.
export const getServiceState = async <T>(key: string, defaultVal: T): Promise<T> => {
  const response = await listObjects({
    MaxKeys: 1000,
    Bucket: process.env.BUCKET_NAME!,
    Prefix: key,
  });
  if (!response.Contents || response.Contents.length === 0) {
    console.log("No state file for key " + key);
    return defaultVal;
  }
  // The listing is in ascending order, so the last entry has the latest timestamp.
  return getS3FileAsObj(response.Contents[response.Contents.length - 1].Key!);
};
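
One thing to watch with prefix listing is that a prefix such as `state` also matches unrelated keys like `state-old.1599999999999`. The sketch below shows one possible way to pick the newest version defensively from a list of returned keys. `latestVersionedKey` is a hypothetical helper, not part of the gist above, and it assumes the `<key>.<millisecond timestamp>` naming used in this post.

```typescript
// Hypothetical helper: given the keys returned by a prefix listing,
// return the one of the form `<key>.<digits>` with the largest timestamp.
const latestVersionedKey = (key: string, listed: string[]): string | undefined => {
  const prefix = `${key}.`;
  let best: { key: string; ts: number } | undefined;
  for (const candidate of listed) {
    // Skip keys that merely share the leading characters.
    if (!candidate.startsWith(prefix)) continue;
    const suffix = candidate.slice(prefix.length);
    // Skip anything whose suffix is not a plain numeric timestamp.
    if (!/^\d+$/.test(suffix)) continue;
    const ts = Number(suffix);
    if (best === undefined || ts > best.ts) best = { key: candidate, ts };
  }
  return best?.key;
};

const listed = [
  "state-old.1599999999999",
  "state.1588000000000",
  "state.1588000060000",
  "state.tmp",
];
console.log(latestVersionedKey("state", listed)); // state.1588000060000
```

Comparing the timestamps numerically also avoids relying on every suffix having the same number of digits, although `Date.now()` values will stay at 13 digits for a long time yet.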