Why Cloud APIs Are Already Obsolete in Healthcare
Three years ago, I started using Manus AI as a research agent. Like many, I was captivated by its capability. But as I integrated it deeper into my daily workflows, two things slipped entirely out of my control: cost and ownership.
When the platform recently hiked its pricing, forcing users who were paying for a €200 model to suddenly switch to a €1,000 model, the trap snapped shut. Because power users had embedded the API deep into their processes, the switching costs were painfully high. The lock-in was real.
As a builder of an AI startup focused on what we call proximity AI, the logical move was clear. I switched to a local AI agent. Today, it delivers the exact same performance, but I have regained my sovereignty, and my costs are entirely predictable.
This personal experience is a microcosm of a much larger industry shift. For the past three years, the tech industry has been obsessed with scale, chasing parameter counts into the trillions behind opaque cloud APIs. But if you look closely at the economics and physics of the clinical frontline, a different reality is emerging. We are witnessing a classic Clayton Christensen disruption—and the incumbent cloud APIs are standing right in the crosshairs.
The Performance Overshoot
Christensen’s theory of disruptive innovation teaches us that established companies continually improve their products to capture the highest-paying tier of the market. In doing so, they inevitably "overshoot" what mainstream customers actually need.
Today, frontier models are massively overserving the healthcare market. The daily tasks of a clinician (writing a SOAP note, summarizing a patient history, triaging an inbox) do not require a trillion-parameter model. Research confirms that Small Language Models (SLMs), fine-tuned on high-quality medical data, can achieve over 98% validity in structured tasks.
Because the frontier APIs have exceeded what clinicians can actually use, the basis of competition is shifting. It is no longer about raw reasoning power. It is about data sovereignty, sub-second latency, offline availability, and cost predictability. This is the exact moment a simpler, more accessible technology, on-device AI, moves upmarket to disrupt the giants.

Figure 1: Christensen disruption trajectory showing local AI crossing the "good enough" threshold for clinical utility.
The Illusion of Cheap APIs
Token-based cloud systems might sound easy and cheap in the beginning. But as we move from simple chatbots to true agentic workflows, token consumption explodes. A single task executed by an AI agent (planning, retrieving context from a local knowledge base, drafting, and critiquing) can consume thousands of tokens.
We recently modeled the Total Cost of Ownership (TCO) for a standard 10-clinician practice running just five active AI agents per day. (For the full deep-dive on this model, see our Internal TCO Research Report).
The results are stark. Using a premium frontier API like Anthropic's Claude 4 Sonnet costs roughly €22,000 in the first year. But because you are renting intelligence by the word, those costs scale linearly with your success. When you factor in a conservative 30% price escalation over three years, that 10-clinician practice will spend over €75,000 on API calls. If they use a credit-based orchestration platform, that number approaches €183,000.
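To make the escalation arithmetic concrete, here is a minimal Python sketch of how a year-1 bill turns into a three-year total. The year-1 figure comes from our model. The yearly schedule (prices 10% above year 1 in year 2, 30% above in year 3) is an assumption chosen for illustration, and the function name is hypothetical. Under that assumed schedule the total lands within a euro of the Claude 4 Sonnet figure in the table further down.

```python
def cumulative_cost(year1_cost, yearly_multipliers):
    """Sum yearly spend, where each multiplier scales the year-1 cost."""
    return sum(year1_cost * m for m in yearly_multipliers)


YEAR1_CLAUDE_EUR = 22_257  # year-1 API cost from the TCO model

# Flat pricing: the same bill three years running.
flat = cumulative_cost(YEAR1_CLAUDE_EUR, [1.0, 1.0, 1.0])

# ASSUMPTION (illustrative schedule, not taken from the report):
# +10% versus year 1 in year 2, +30% versus year 1 in year 3.
escalated = cumulative_cost(YEAR1_CLAUDE_EUR, [1.0, 1.1, 1.3])

print(f"flat pricing:    EUR {flat:,.0f}")
print(f"with escalation: EUR {escalated:,.0f}")
print(f"escalation adds: EUR {escalated - flat:,.0f}")
```

The point of the sketch is that the gap between the two totals is pure price risk: usage is held constant, and the bill still grows.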
The Economics of Proximity AI
Compare this to the Isaree approach: local, on-device AI.
By running specialized agents directly on the clinician's existing hardware, the marginal cost of an additional AI query drops to the cost of electricity. In our model, a local AI deployment running on a clinician's existing MacBook Pro M5 Max, drawing just ~35W during active inference and relying on community-driven support, costs roughly €9,300 over three years (driven almost entirely by the platform license).
That is a fraction of the cost of a premium API, and radically cheaper than cloud-based orchestration platforms. More importantly, it is a fixed cost. You are no longer penalized for using your AI more often.
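A rough back-of-the-envelope check, sketched in Python, shows why electricity barely registers in that total. Only the ~35W draw comes from our model. The hours per day, working days per year, and electricity tariff below are placeholder assumptions, and the function name is hypothetical.

```python
def annual_energy_cost_eur(watts, hours_per_day, days_per_year, eur_per_kwh):
    """Yearly electricity cost of a device drawing `watts` while active."""
    kwh_per_year = watts / 1000 * hours_per_day * days_per_year
    return kwh_per_year * eur_per_kwh


per_device = annual_energy_cost_eur(
    watts=35,           # active-inference draw quoted above
    hours_per_day=8,    # ASSUMPTION: inference runs the entire working day
    days_per_year=230,  # ASSUMPTION: working days per year
    eur_per_kwh=0.30,   # ASSUMPTION: placeholder electricity tariff
)
practice = per_device * 10  # 10 clinicians, one device each

print(f"per device: EUR {per_device:.2f} per year")
print(f"practice:   EUR {practice:.2f} per year")
```

Even with inference pinned at full draw for the whole working day, which overstates real usage, the energy bill under these assumptions stays in the tens of euros per device per year.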
3-Year Total Cost of Ownership (TCO) Comparison
Scenario: 10 Clinicians, 5 Agents/Day, Heavy RAG Context (207,000 total LLM calls/year).
Deployment Model | Year 1 Cost | Year 3 Cumulative TCO* | Cost per Clinical Encounter (Yr 1)
Local AI (Isaree on MacBook M5 Max) | €3,105 | €9,315 | €0.07 |
OpenAI GPT-4.1 mini | €7,257 | €21,771 | €0.16 |
OpenAI GPT-4.1 | €15,884 | €50,830 | €0.35 |
Anthropic Claude 4 Sonnet | €22,257 | €75,673 | €0.48 |
Manus AI Extended | €47,610 | €183,298 | €1.04 |
Even when compared to the cheapest "lightweight" API baseline (GPT-4.1 mini at €21,771 over three years), running an equivalent local Small Language Model (SLM) on existing Apple Silicon yields a ~57% cost reduction. And unlike GPT-4.1 mini, the local SLM runs offline, keeps patient data strictly on-device, and provides a pathway to MDR certification. These features are non-negotiable for clinical deployment.
*Note: The token consumption, escalation rates, and hardware utilization figures are estimates based on standard clinical workflows and industry pricing trends. Year 3 TCO includes vendor price escalation estimates (10-30% for frontier APIs, 60% for Manus credits). For a detailed breakdown of the methodology and assumptions, see our Internal TCO Research Report.
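For readers who want to check the table themselves, the following Python sketch derives three numbers from it: the implied cost per LLM call (year-1 cost divided by the 207,000 calls in the scenario), the ratio of three-year total to year-1 cost, and the saving of the local option against each alternative. All inputs are copied from the table above. The helper name and dictionary layout are hypothetical.

```python
CALLS_PER_YEAR = 207_000  # total LLM calls/year from the scenario line

# (year-1 cost, 3-year cumulative TCO) in EUR, copied from the table
TCO = {
    "Local AI (Isaree)": (3_105, 9_315),
    "OpenAI GPT-4.1 mini": (7_257, 21_771),
    "OpenAI GPT-4.1": (15_884, 50_830),
    "Anthropic Claude 4 Sonnet": (22_257, 75_673),
    "Manus AI Extended": (47_610, 183_298),
}


def summarize(tco, baseline="Local AI (Isaree)"):
    """Derive per-call cost, 3-year multiple, and saving versus the baseline."""
    base_total = tco[baseline][1]
    rows = {}
    for name, (year1, total) in tco.items():
        rows[name] = {
            "eur_per_call": year1 / CALLS_PER_YEAR,
            "three_year_multiple": total / year1,
            "local_saving_pct": 100 * (total - base_total) / total,
        }
    return rows


summary = summarize(TCO)
for name, row in summary.items():
    print(f"{name:26s} {row['eur_per_call']:.3f} EUR/call  "
          f"x{row['three_year_multiple']:.2f} over 3 years  "
          f"local saves {row['local_saving_pct']:.0f}%")
```

Note that the per-call figure for the local option is an amortized license cost, not a marginal one: running more calls lowers it, whereas the API figures stay roughly constant per call.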

Figure 2: Cumulative TCO over 3 years. Local AI flattens out into a fixed operational cost, while API costs compound exponentially with token bloat and price hikes.
You cannot build a sustainable, cost-effective healthcare system on someone else's subsidized cloud tokens.
The Regulatory Fortress
In healthcare, the cost of the model is secondary to the cost of compliance. The EU AI Act classifies AI-enabled medical devices as high-risk by default. For frontier API-based clinical AI, certification is structurally close to impossible. The models are moving targets with frequent, silent updates, and the data is processed in black-box servers across jurisdictions.
Local, on-device SLMs are frozen, auditable, and deployed on-premise. Sensitive patient data never leaves the hospital's network. This makes certification materially simpler and eliminates the need for complex, zero-retention data processing agreements. This regulatory gradient is itself a Christensen-style enabler, carrying local AI rapidly upmarket.
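One way to picture what "frozen and auditable" can mean in practice is a digest check on the model weights. The Python sketch below is an illustration using only the standard library. It is an assumption about how such a check could look, not a description of Isaree's implementation or of what any regulation requires. The function names and the stand-in weights file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large weight files need little memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_frozen_model(path, pinned_digest):
    """True only if the on-disk weights match the digest recorded at audit time."""
    return sha256_of_file(path) == pinned_digest.lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        weights = Path(tmp) / "model.bin"  # stand-in for a real weights file
        weights.write_bytes(b"frozen weights v1")
        pinned = sha256_of_file(weights)   # recorded once, at audit time

        print(is_frozen_model(weights, pinned))  # weights unchanged

        weights.write_bytes(b"silently updated weights")
        print(is_frozen_model(weights, pinned))  # weights changed after audit
```

A remote API offers no equivalent: the caller never holds the weights, so there is nothing local to hash.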
A Vendor-Neutral Agent Harness
The era of relying on centralized, monolithic cloud APIs for clinical automation is ending. The future is small, fast, local, and open.
Isaree is building the infrastructure for this future. We are an operating system and vendor-neutral agent harness with an agent builder platform that allows you to personalize agents (compliant with MDR Article 5) and download them directly into your client for local use. Although we are at the beginning of this journey, our analysis shows that this approach already gives clinicians the power to solve their own n=1 problems.
By providing a privacy-first, edge-based assistant, we empower physicians to build, share, and utilize certified AI agents that run locally, offline, and at a fraction of the cost of cloud APIs.
We are currently piloting Isaree with a closed group of clinicians across 14 countries, spanning from Canada to Cambodia.
The Agentic Web is already here. It’s time to build agentic workflows that you can own and control.