murmur.red AI Operations · Field Notes
The Portfolio Approach · For small & medium companies

Building AI
Resilience

Six moves that keep your operation running when the market consolidates to three vendors. And one of them blinks.

The technology is solved. The structure is fragile. Most companies building on AI today have a single point of failure that isn't technical. It's a vendor, a regulator, or a country away from being switched off. You can't own a frontier model. You can own everything around it. That's the work.

01

Multimodel Architecture

Don't bet the company on one vendor's API. Build your critical paths to run across multiple models: OpenAI for some tasks, Anthropic for others, open source where it fits. When one goes down or gets restricted, you degrade gracefully instead of stopping.

Fails without itOne API key away from zero.
02

In-House Capability for Critical Paths

Identify what actually matters to the business and fine-tune smaller models on your own data for those functions. You don't need frontier capability everywhere. You need ownership where it counts.

Fails without itRenting your own core competency.
03

Fallback Infrastructure

Keep open-source models (Llama, Mistral, others) running on your own servers for essential operations. They're not as good as frontier models. They're good enough when the alternative is nothing.

Fails without itNothing is worse than good enough.
04

Data Moat

Build proprietary datasets and domain-specific fine-tuning. Your advantage isn't renting someone else's model. It's knowing your domain better than the generic LLM does.

Fails without itThe generic model knows your business as well as anyone's. Which is to say, not at all.
05

Vendor Diversification

Spread the dependency. Multiple cloud providers. Multiple API vendors. Multiple model architectures. Single points of failure kill companies faster than bad technology does.

Fails without itOne outage you didn't cause, charged to your name.
06

Regulatory Hedging

Know where the rules are moving. Build infrastructure that stays compliant across jurisdictions (US, EU, others), so a policy shift in one region doesn't freeze your entire operation.

Fails without itA policy you didn't write, freezing a business you did.
The Toolmap

Six moves, mapped to what you actually use

Principles are cheap. Here's the working stack: the tools and utilities that turn each move into something running in production. Pick one per layer and you have resilience by Friday.

01 · Multimodel Architecture

Routing & Gateways

OpenRouter LiteLLM Portkey Vercel AI SDK AWS Bedrock Cloudflare AI Gateway
02 · In-House Capability

Fine-Tuning

Unsloth Axolotl Hugging Face PEFT Predibase Together AI Modal
03 · Fallback Infrastructure

Self-Hosted Serving

Ollama vLLM llama.cpp TGI LM Studio SkyPilot
04 · Data Moat

Data & Retrieval

LlamaIndex pgvector Qdrant Weaviate Unstructured Label Studio
05 · Vendor Diversification

Portable Infrastructure

Azure OpenAI Google Vertex AWS Bedrock Terraform Kubernetes Docker
06 · Regulatory Hedging

Compliance & Guardrails

Mistral (EU) Scaleway OVHcloud Guardrails AI Lakera Vanta

Examples, not endorsements: one viable pick per layer. The point is coverage, not a specific logo.

The actual point
The real risk isn't AI. It's an AI-dependent economy built without resilience.
Build it properly