async function* StateMachine() {}

Using an async generator to create a state machine interpreter

If you've read about generator functions before, you probably didn't come away thinking "Awesome, I know exactly what I'd use this for". At least, I know that's not how it was for me. The popular example with the Fibonacci sequence was interesting but didn't seem immediately useful. I've never needed the Fibonacci sequence for any problem I was trying to solve. It finally became clearer when I started to learn about XState and state machines.

When I set out to build the github-contents-cache library, I knew I wanted to use a state machine for the caching logic. I love XState, but for this project I wanted to keep the dependencies as small as I could, so I decided it would be fun to write a small state machine interpreter for the library, and an async generator fit the bill.

Here it is:

type ControlFlowStepsEntry = {
  nextEvent?: string;
  [key: string]: any;
};

type ControlFlowSteps<T> = {
  entry?: (arg: T) => Promise<ControlFlowStepsEntry>;
  final?: boolean;
  [key: string]: any;
};

type ControlFlowArgs<T> = {
  initialStep: string;
  steps: {
    [key: string]: ControlFlowSteps<T>;
  };
  stepContext: T;
};

type ControlFlowReturn = {
  step: string;
  data?: any;
  event?: string;
};

async function* createControlFlow<T>({
  initialStep,
  steps,
  stepContext,
}: ControlFlowArgs<T>): AsyncGenerator<ControlFlowReturn> {
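  let currentStep = initialStep;
  let currentConfig = steps[currentStep];
  while (true) {
    // A transition that points at an undefined step would otherwise crash below
    if (!currentConfig) {
      throw new Error(`Unknown step "${currentStep}"`);
    }
    // Final steps have nothing to run, so the machine stops here
    if (currentConfig.final) {
      return;
    }
    if (!currentConfig.entry) {
      throw new Error(`Step "${currentStep}" has no entry function`);
    }
    // Run the step: nextEvent names the transition, the rest is the step's data
    let data = await currentConfig.entry(stepContext);
    let nextEvent = data.nextEvent;
    delete data.nextEvent;
    // Look up the step this event transitions to and move there before yielding
    let next = { step: currentConfig[nextEvent], data, event: nextEvent };
    currentStep = next.step;
    currentConfig = steps[currentStep];
    yield next;
  }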
}

export default async function controlFlow<T>({
  initialStep,
  steps,
  stepContext,
}: ControlFlowArgs<T>): Promise<ControlFlowReturn> {
  const controlFlowInstance = createControlFlow<T>({
    initialStep,
    steps,
    stepContext,
  });
  let result;
  for await (const next of controlFlowInstance) {
    result = next;
  }
  return result;
}
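
To get a feel for how the interpreter behaves, here is a minimal sketch of a toy retry machine. Everything in it is made up for illustration: `RetryContext`, `tryOnce`, `retrySteps`, and `runRetry` are hypothetical names, and the interpreter is restated in compact form (with positional arguments instead of the options object above) only so the snippet runs on its own.

```typescript
// Compact restatement of the interpreter so this sketch is self-contained.
type StepResult = { nextEvent?: string; [key: string]: any };
type StepConfig<T> = {
  entry?: (ctx: T) => Promise<StepResult>;
  final?: boolean;
  [key: string]: any;
};
type Transition = { step: string; data?: any; event?: string };

async function* createControlFlow<T>(
  initialStep: string,
  steps: { [key: string]: StepConfig<T> },
  stepContext: T
): AsyncGenerator<Transition> {
  let config = steps[initialStep];
  // Stop on a final step or on a step name that was never defined.
  while (config && !config.final) {
    const entry = config.entry;
    if (!entry) return;
    // nextEvent names the transition; everything else is the step's data.
    const { nextEvent, ...data } = await entry(stepContext);
    const step: string = config[nextEvent ?? ""];
    config = steps[step];
    yield { step, data, event: nextEvent };
  }
}

// Hypothetical context: succeed on a chosen attempt, give up after maxAttempts.
type RetryContext = { attempts: number; succeedOn: number; maxAttempts: number };

async function tryOnce(ctx: RetryContext): Promise<StepResult> {
  ctx.attempts += 1;
  if (ctx.attempts >= ctx.succeedOn) {
    return { nextEvent: "onSuccess", attempts: ctx.attempts };
  }
  if (ctx.attempts >= ctx.maxAttempts) {
    return { nextEvent: "onGiveUp", attempts: ctx.attempts };
  }
  return { nextEvent: "onRetry", attempts: ctx.attempts };
}

const retrySteps: { [key: string]: StepConfig<RetryContext> } = {
  tryOnce: {
    entry: tryOnce,
    onSuccess: "done",
    onRetry: "tryOnce", // a step may transition back to itself
    onGiveUp: "gaveUp",
  },
  done: { final: true },
  gaveUp: { final: true },
};

// Drain the generator and keep only the last transition, like controlFlow does.
async function runRetry(succeedOn: number): Promise<Transition | undefined> {
  const ctx: RetryContext = { attempts: 0, succeedOn, maxAttempts: 3 };
  let last: Transition | undefined;
  for await (const transition of createControlFlow<RetryContext>("tryOnce", retrySteps, ctx)) {
    last = transition;
  }
  return last;
}

runRetry(2).then((result) => console.log(result?.step, result?.data));
// prints: done { attempts: 2 }
```

Each pass through the loop yields one transition, so the `for await` in `runRetry` sees every step the machine visits and keeps only the last one. Calling `runRetry(5)` instead ends in `gaveUp` after three attempts.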

And here's how I'm using it in the github-contents-cache library:

...
let result = await controlFlow<GetGithubContentStepContext>({
  initialStep: ignoreCache ? "clearCacheEntry" : "lookInCache",
  stepContext: {
    cache,
    token,
    owner,
    repo,
    path,
    userAgent,
    serialize,
    maxAgeInMilliseconds,
    max404AgeInMilliseconds,
  },
  steps: {
    clearCacheEntry: {
      entry: clearCacheEntry,
      onCachedCleared: "lookInGithub", // Was able to clear the cache, let's get the latest from github
      onError: "error", // Something went wrong clearing the cache - either from a corrupt cache or a manual call to clear
    },
    lookInCache: {
      entry: lookInCache,
      onFound: "found", // Found in the cache and the maxAgeInMilliseconds had not yet expired
      onFoundInCache: "lookInGithub", // Ask github if what we have in cache is stale (does not count against our api limit)
      onNotInCache: "lookInGithub", // Ask for it from github (Does count against our api limit)
      on404CacheExpired: "clearCacheEntry", // We found a cache 404 but it has expired
      on404InCache: "notFound", // We asked github earlier and they said they didn't have it
      onError: "error", // Unknown error we couldn't recover from
    },
    lookInGithub: {
      entry: lookInGithub,
      onFound: "found", // Either came from the cache or from github and then we cached it
      on404FromGithub: "notFound", // Github said they didn't have it
      onRateLimitExceeded: "rateLimitExceeded", // Github said we are hitting their api too much
      onError: "error", // Having trouble getting data from the github api?
    },
    found: { final: true }, // Got it!
    notFound: { final: true }, // Don't have it
    rateLimitExceeded: { final: true }, // Oops
    error: { final: true }, // Hopefully this never happens, but we are taking care of it if it does :thumbsup:
  },
});

if (result.step === "found") {
  return {
    status: "found",
    content: result.data.content,
    etag: result.data.etag,
    cacheHit: result.data.cacheHit,
  };
}
if (result.step === "notFound") {
  return { status: "notFound", content: "", cacheHit: result.data.cacheHit };
}
if (result.step === "rateLimitExceeded") {
  return {
    status: "rateLimitExceeded",
    limit: result.data.limit,
    remaining: result.data.remaining,
    timestampTillNextResetInSeconds:
      result.data.timestampTillNextResetInSeconds,
    content: result.data.content,
    etag: result.data.etag,
    cacheHit: result.data.cacheHit,
  };
}
if (result.step === "error") {
  return {
    status: "error",
    message: result.data.message,
    error: result.data.error,
  };
}
...

State machines are amazing for complex control flow logic. Just by looking at the steps in the code above, I think you would have a pretty decent idea of what this application is trying to do.
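
Because the `steps` config is plain data, that readability can also be checked mechanically. The following is a rough sketch, not part of the library: `describeTransitions` is a hypothetical helper, `exampleSteps` is a trimmed copy of the config above with stub entry functions, and it assumes every transition key starts with `on`, which is just the naming convention used in this post.

```typescript
type StepConfig = { entry?: unknown; final?: boolean; [key: string]: unknown };

// Hypothetical helper: turn a steps config into one readable line per transition.
function describeTransitions(steps: { [name: string]: StepConfig }): string[] {
  const lines: string[] = [];
  for (const [name, config] of Object.entries(steps)) {
    if (config.final) {
      lines.push(`${name} (final)`);
      continue;
    }
    for (const [key, target] of Object.entries(config)) {
      // Assumption: transition keys follow the "onSomething" convention.
      if (key.startsWith("on") && typeof target === "string") {
        lines.push(`${name} --${key}--> ${target}`);
      }
    }
  }
  return lines;
}

// Trimmed copy of the config above; the entry functions are stubs.
const exampleSteps: { [name: string]: StepConfig } = {
  lookInCache: {
    entry: async () => ({}),
    onFound: "found",
    onNotInCache: "lookInGithub",
  },
  lookInGithub: {
    entry: async () => ({}),
    onFound: "found",
    onError: "error",
  },
  found: { final: true },
  error: { final: true },
};

console.log(describeTransitions(exampleSteps).join("\n"));
// lookInCache --onFound--> found
// lookInCache --onNotInCache--> lookInGithub
// lookInGithub --onFound--> found
// lookInGithub --onError--> error
// found (final)
// error (final)
```

The output reads much like the comments in the real config, which is a decent sign that the config itself is carrying the design.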

Anyway, if you're interested and want to dive deeper into the code, you can check it out on GitHub.