Keep Your Agent Running When Models Disappear
Kenny Rogers and Alex Atallah ·
On this page
- Hard-coded slug vs preset reference
- Give this to your agent
- Capture a working request as a preset
- Reference the preset from your code
- Add fallback models so requests keep flowing
- Set your data policy in the preset
- Roll out and roll back across your team
- Wire it all together
- Start with one preset
- FAQ
Providers retire and restrict models routinely. More than 70 models have been pulled or deprecated by providers in the last few years. Anthropic’s Fable being pulled recently is perhaps the most high-profile and impactful example of this we’ve seen, but the pattern isn’t new and isn’t going away.
OpenRouter already handles one layer of this for you. When a model runs on several providers and one of them fails or rate-limits, the marketplace reroutes to another provider automatically, with no configuration. We cover how that failover works in a separate post.
That keeps a single model reachable through provider trouble, but it can’t help once the model itself is gone. For that you want model failover, where requests move to a different model when your first choice disappears. Presets are how you set that up.
Hard-coding a model slug pins your choice inside every service that uses it. When that model goes away, the only fix is to edit the code and redeploy each service, and requests keep failing until you do.
A preset takes that choice out of your code. It’s a named, server-side configuration (model, fallback models, provider rules, parameters, and a system prompt) that you reference by slug. The model lives in the preset instead of the code, so you change it in one place and every service that calls the preset picks it up with no redeploy.
Here’s a simple preset definition. Copy it and adjust the models:
{
"models": [
"anthropic/claude-fable-5",
"anthropic/claude-opus-4.8",
"openai/gpt-5.5"
],
"provider": { "allow_fallbacks": true }
}
The models array is your fallback chain, in priority order. If the first model is unavailable, OpenRouter tries the next one.
Hard-coded slug vs preset reference
| Question | Hard-coded model string | Preset reference |
|---|---|---|
| A provider restricts the model | Every service breaks until you edit and redeploy | Edit the preset once; callers keep running |
| Who ships the fix | Whoever owns each codebase | Whoever owns the preset |
| Blast radius of a change | One edit per repo, per service | One edit, applied everywhere |
| Data policy (ZDR, retention) | Re-stated in every request | Set once on the preset |
| Rollback | Revert a commit and redeploy | Re-designate a previous version |
Give this to your agent
Want your coding agent to wire this up? Copy the prompt below:
I want to stop hard-coding model slugs so one provider change can't take down my app. Set up an OpenRouter preset and route my calls through it.
1. Create a preset named "customer-support" with a fallback chain: a primary model plus 2 backups in priority order, using the models array.
2. Set provider rules on the preset: allow_fallbacks true, and zdr true if my data policy requires Zero Data Retention.
3. Capture it by POSTing a known-good chat/completions body to https://openrouter.ai/api/v1/presets/customer-support/chat/completions with my OpenRouter API key.
4. Replace the model field in my inference calls with "@preset/customer-support".
5. Keep my OpenRouter API key in an environment variable. Never hard-code it.
Use these references for current shapes:
- Presets: https://openrouter.ai/docs/guides/features/presets
- Provider routing and fallbacks: https://openrouter.ai/docs/guides/routing/provider-selection
Capture a working request as a preset
You can build a preset in the dashboard, or capture one from a request body you already trust.
Send a known-good chat/completions body to the preset capture endpoint. OpenRouter persists the fields that overlap with the preset config (models, provider, temperature, and so on) and ignores transient fields like messages:
curl https://openrouter.ai/api/v1/presets/customer-support/chat/completions \
-H "Authorization: Bearer $OPENROUTER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"models": [
"anthropic/claude-fable-5",
"anthropic/claude-opus-4.8",
"openai/gpt-5.5"
],
"provider": { "allow_fallbacks": true },
"messages": [
{ "role": "system", "content": "You are a concise support assistant." },
{ "role": "user", "content": "Summarize this ticket in one sentence." }
]
}'
If a preset with that slug already exists, this creates a new version and designates it active. If it doesn’t exist, it creates the preset. Pick a slug that isn’t already in use, since capturing onto an existing slug overwrites its live config with a new active version. The response returns the preset with its designated version (trimmed here):
{
"data": {
"name": "customer-support",
"slug": "customer-support",
"status": "active",
"designated_version": {
"version": 1,
"system_prompt": "You are a concise support assistant.",
"config": {
"models": [
"anthropic/claude-fable-5",
"anthropic/claude-opus-4.8",
"openai/gpt-5.5"
],
"provider": { "allow_fallbacks": true }
}
}
}
}
Reference the preset from your code
Now point your inference calls at @preset/customer-support. The model choice lives in the preset, so this line stays the same when the underlying model changes.
Install an SDK first:
pip install openrouter # Python
npm install @openrouter/sdk # TypeScript
from openrouter import OpenRouter
import os
client = OpenRouter(api_key=os.getenv("OPENROUTER_API_KEY"))
response = client.chat.send(
model="@preset/customer-support",
messages=[
{"role": "user", "content": "Summarize this ticket in one sentence."}
],
)
print(response.choices[0].message.content)
import { OpenRouter } from '@openrouter/sdk';
const client = new OpenRouter({ apiKey: process.env.OPENROUTER_API_KEY });
const response = await client.chat.send({
chatRequest: {
model: '@preset/customer-support',
messages: [
{ role: 'user', content: 'Summarize this ticket in one sentence.' },
],
},
});
console.log(response.choices[0]?.message.content);
Default to the @preset/slug form shown above. Reach for the combined model@preset/slug form when you want to name a base model and layer a preset’s config on top of it, or use a separate preset field alongside model if you’d rather keep them as distinct request fields. The presets docs cover all 3.
Add fallback models so requests keep flowing
The models array is the part that survives a deprecation. Pass models in priority order, and OpenRouter walks the list when one is unavailable.
Put the model you trust most as the last entry, so your final fallback is a floor you’re comfortable shipping. For a coding workload that leaned on Fable 5, a chain like anthropic/claude-fable-5, then anthropic/claude-opus-4.8, then openai/gpt-5.5 keeps strong models in reserve.
Here’s that fallback firing. With Fable 5 restricted, a request naming the chain above succeeds anyway, and the response’s model and provider fields name what actually served it:
{
"model": "anthropic/claude-4.8-opus-20260528",
"provider": "Anthropic",
"choices": [
{
"message": {
"role": "assistant",
"content": "Customer cannot log in because password reset emails are not being received, despite checking spam and confirming the correct email address."
}
}
]
}
OpenRouter skipped the restricted primary and served the next model in the array. Your code didn’t change.
The model field reports the concrete version that actually served the request, so it reads differently from the slug you sent. Here anthropic/claude-opus-4.8 resolved to the dated build anthropic/claude-4.8-opus-20260528 on Anthropic.
2 layers of recovery stack here. Provider-layer failover is automatic: for one model served by several providers, OpenRouter retries the next provider on a 5xx or rate-limit. Model-layer fallbacks are the models array, which moves to a different model when the whole primary is gone. For the mechanics of each, see reliability and automatic failover and model routing.
Set your data policy in the preset
Provider rules ride along in the same preset, so a routing policy applies to every caller without a code change.
{
"models": [
"anthropic/claude-fable-5",
"anthropic/claude-opus-4.8",
"openai/gpt-5.5"
],
"provider": {
"zdr": true,
"data_collection": "deny",
"allow_fallbacks": true
}
}
zdr: true keeps requests on endpoints that honor Zero Data Retention. data_collection: "deny" blocks providers that train on or store prompts. You can also pin or exclude specific providers with only, ignore, and order. See provider routing for the full list.
This is where the Fable 5 situation gets concrete. Its model page notes that Anthropic’s policy “does not allow zero data retention.” With zdr: true set on the preset, routing skips Fable 5 because it can’t satisfy the rule, and falls through to the next model in your array that can. One switch, enforced server-side, for every request that names the preset.
Roll out and roll back across your team
On an organization account, every member can use organization presets, so a routing decision made once is shared instead of copied into each repo.
Every capture or edit creates a new version and marks it active. Version history is kept, so a bad change is one re-designation away from a rollback. Through the API, the latest designated version is always the one that runs. You re-designate versions and delete presets from the dashboard; the API captures and reads presets but has no delete endpoint.
Parameters you pass in a request override the preset’s values, shallow-merged. Request fields win, and preset fields you don’t send are preserved. That lets a single call bump temperature without forking the preset.
Wire it all together
The data-policy config above already holds all 3 layers: the model chain, the provider policy, and (once you add one) the system prompt. Capture it once with the curl call, reference @preset/customer-support everywhere, and the next time a provider restricts a model you edit one config instead of chasing slugs through every service.
If you’d rather not pin a primary at all, point the chain at a self-updating alias like ~anthropic/claude-opus-latest, which always resolves to the newest model in that family.
Start with one preset
Pick your highest-traffic call, create a preset for it at openrouter.ai/settings/presets, give it a fallback chain, and swap the model string for @preset/your-slug. That one move turns a forced migration into a config edit.
Presets also pair well with governance work: the same control point that survives a deprecation is where you enforce data-handling rules, as covered in human oversight for AI agents.
Note: This post covers engineering patterns, not legal advice. For export-control, data-residency, or retention obligations, consult counsel about your specific use case and jurisdiction.
FAQ
What happens to my app when a model is deprecated or restricted?
If your code hard-codes the model slug, requests to that model start failing and every service that used it breaks until you edit the code and redeploy. Route through a preset and you edit one config; callers pick up the change with no redeploy. A models fallback array can keep requests succeeding on a backup while you decide what to do.
How is a preset different from passing a models fallback array in code?
A models array sets the fallback order for one request. A preset stores that array, plus provider rules, parameters, and a system prompt, on the server under a slug. You reference it with @preset/slug, so the configuration lives in one place across every service and changes without a code edit.
Do request parameters override preset values?
Yes. Request parameters take priority over the preset’s values, shallow-merged. Request-level fields override matching preset fields, and preset fields you don’t send are preserved.
Can I enforce Zero Data Retention with a preset?
Yes. Set provider.zdr to true. OpenRouter routes only to endpoints that honor Zero Data Retention, skips models or providers that can’t, and falls through your models array to the next qualifying option.
How do I roll back a preset change?
Every capture or edit creates a new version and designates it active. Version history is kept, so you can re-designate a previous version. Through the API, the latest designated version is always used.