Competitive Racing

How does Restate help?

The benefits of using Restate for competitive racing patterns are:

Durable coordination: Restate turns Promises/Futures into durable, distributed constructs that persist across failures and process restarts. Race multiple approaches and return the first successful result.

Cancel slow tasks: Failed or slower approaches can be cancelled, preventing resource waste

Serverless scaling: Deploy racing strategies on serverless infrastructure for automatic scaling while the main process remains suspended

Works with any LLM SDK (Vercel AI, LangChain, LiteLLM, etc.) and any programming language supported by Restate (TypeScript, Python, Go, etc.).

Example

When you need a quick response and have access to multiple AI models, race them against each other to get the fastest result:

async function run(
  ctx: Context,
  { message }: { message: string },
): Promise<string> {
  // Start both service calls concurrently
  const slowCall = ctx.serviceClient(racingAgent).thinkLonger({ message });
  const slowResponse = slowCall.map((res) => ({ tag: "slow", res }));

  const fastCall = ctx.serviceClient(racingAgent).respondQuickly({ message });
  const fastResponse = fastCall.map((res) => ({ tag: "fast", res }));

  const pending = [slowResponse, fastResponse];

  // Wait for the first one to complete
  const { tag, res } = await RestatePromise.any(pending);

  if (tag === "fast") {
    console.log("Quick response won the race!");
    const slowInvocationId = await slowCall.invocationId;
    ctx.cancel(slowInvocationId);
  } else {
    console.log("Deep analysis won the race!");
    const quickInvocationId = await fastCall.invocationId;
    ctx.cancel(quickInvocationId);
  }

  return res ?? "LLM gave no response";
}

View on GitHub: TS / Python

In the Restate UI, you can see how multiple approaches are started simultaneously, with the first successful result being returned while other tasks are automatically cancelled:

Run the example

Requirements

AI SDK of your choice (e.g., OpenAI, LangChain, Pydantic AI, LiteLLM, etc.) to make LLM calls.
API key for your model provider.

Download the example

git clone https://github.com/restatedev/ai-examples.git &&
cd typescript-patterns &&
npm install

Start the Restate Server

restate-server

Start the Service

Export the API key of your model provider as an environment variable and then start the agent. For example, for OpenAI:

export OPENAI_API_KEY=your_openai_api_key
npm run dev

restate deployments register localhost:9080

Send a request

In the UI (http://localhost:9070), click on the run handler of the RacingAgent service to open the playground and send a prompt to race multiple models:

Check the Restate UI

In the UI, you can see how multiple models are queried simultaneously, with the fastest response winning the race:

LLM & Agent SDK Integrations

Orchestration

Humans & Chat

Resiliency & Consistency

Competitive Racing

How does Restate help?

Example

LLM & Agent SDK Integrations

Orchestration

Humans & Chat

Resiliency & Consistency

Documentation Index

​How does Restate help?

​Example

How does Restate help?

Example