Claude Slots Into Apple's Foundation Models Framework

A new Swift package lets the same on-device session API call Claude in the cloud, swapping models by one argument.

Mariana Souza

Senior Editor · Jun 15, 2026 · 5 min read

When Apple shipped its Foundation Models framework, the pitch was local-first: a compact model running on the device, fast and private, driven through a clean Swift API with guided generation and tool calling baked in. The OS 27 betas quietly add the piece that turns that local API into a routing layer — a server-side language model provider interface — and Anthropic has already plugged into it.

The result is Claude for Foundation Models, a Swift package that conforms Claude to the framework's LanguageModel protocol. The payoff is architectural: your app keeps talking to the same LanguageModelSession, and you decide per session whether the work runs on Apple's on-device model or goes out to Claude in the cloud. No second SDK, no separate request-building code path.

One session API, two very different models

The design goal here is that nothing about the calling code changes. respond(to:), streaming, guided generation, and tool calling all behave the same whether the session is backed by the on-device model or Claude. You build a ClaudeLanguageModel, hand it to a session, and use it exactly as you would with any Foundation Models provider:

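import Foundation
import FoundationModels
import ClaudeForFoundationModels

// Development-only auth: the key is read from the environment at launch.
// The empty-string fallback only keeps the snippet short; the API rejects an empty key.
let model = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "")
)
// From here on, the calls are the same ones used with the on-device model.
let session = LanguageModelSession(model: model)
let response = try await session.respond(to: "Plan a 4-day trip to Buenos Aires.")
print(response.content)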

Switching backends is a one-line change: pass a different value for the session's model: argument. That makes the interesting decision a routing decision, not a rewrite. Apple's on-device model is fast, private, and works offline, but it's sized for lightweight tasks. The package frames Claude as the escalation path when you need larger context windows, frontier reasoning, or server-side tools like web search and code execution: the kinds of jobs an on-device model of roughly 3 billion parameters was never meant to carry.
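The routing decision can be sketched without touching either SDK. What follows is a minimal illustration, not the package's API: Backend, TaskProfile, chooseBackend, and the 4,000-token budget are hypothetical names and numbers chosen for the example, and a real app would tune the criteria to its own workload.

```swift
// Hypothetical stand-ins for illustration; none of these types come from
// FoundationModels or the Claude package.
enum Backend: Equatable {
    case onDevice   // local model: fast, private, works offline
    case claude     // cloud escalation path
}

struct TaskProfile {
    var estimatedPromptTokens: Int
    var needsServerTools: Bool        // e.g. web search or code execution
    var needsFrontierReasoning: Bool
    var isOnline: Bool
}

// The default token budget is an assumed figure, not a documented limit.
func chooseBackend(for task: TaskProfile, onDeviceTokenBudget: Int = 4_000) -> Backend {
    // Without a network connection, the local model is the only option.
    guard task.isOnline else { return .onDevice }
    let needsEscalation = task.needsServerTools
        || task.needsFrontierReasoning
        || task.estimatedPromptTokens > onDeviceTokenBudget
    return needsEscalation ? .claude : .onDevice
}

let quickSummary = TaskProfile(estimatedPromptTokens: 300, needsServerTools: false,
                               needsFrontierReasoning: false, isOnline: true)
let liveResearch = TaskProfile(estimatedPromptTokens: 300, needsServerTools: true,
                               needsFrontierReasoning: false, isOnline: true)

print(chooseBackend(for: quickSummary))   // onDevice
print(chooseBackend(for: liveResearch))   // claude
```

In an app, the returned case would decide which model value is handed to LanguageModelSession(model:), so the rest of the calling code stays identical.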

One detail worth underlining for the privacy-conscious: requests go directly from the app to the Claude API. Apple is not in the request path and never sees prompts or responses. Usage is billed straight to your Anthropic account at standard API pricing.

Capabilities are typed, so mismatches fail loud

Each ClaudeModel carries a declaration of what it accepts — sampling parameters, effort levels, adaptive thinking, structured output, image input. The package uses that metadata to decide which request fields to send, because handing a model a field it rejects is a hard error rather than something silently ignored. Compiled-in constants (.sonnet4_6, .opus4_8, which maps to claude-opus-4-8) ship with the correct capabilities attached. If you need a model ID that isn't compiled in yet, you spell out its capabilities by hand — there's deliberately no shorthand that guesses:

let model = ClaudeModel(
    id: "claude-experimental-x",
    capabilities: .init(samplingParams: false, effortLevels: [.low, .high])
)
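To make the idea of capability-gated requests concrete, here is a small self-contained sketch of how declared capabilities could decide which fields go on the wire. It is an assumption-laden mirror of the concept, not the package's implementation: Effort, Capabilities, RequestOptions, requestFields, and the "temperature" and "effort" field names are all hypothetical, and the choice to omit an unsupported field (rather than throw) is a guess at one reasonable behavior.

```swift
// Hypothetical types that mirror the idea; the package's real types may differ.
enum Effort: String {
    case low, medium, high, xhigh, max
}

struct Capabilities {
    var samplingParams: Bool
    var effortLevels: Set<Effort>
}

struct RequestOptions {
    var temperature: Double?
    var effort: Effort?
}

// Emits only the fields the model has declared it accepts, so the request
// never carries a parameter the model would reject.
func requestFields(for options: RequestOptions,
                   capabilities: Capabilities) -> [String: String] {
    var fields: [String: String] = [:]
    if capabilities.samplingParams, let temperature = options.temperature {
        fields["temperature"] = String(temperature)
    }
    if let effort = options.effort, capabilities.effortLevels.contains(effort) {
        fields["effort"] = effort.rawValue
    }
    return fields
}

// Same capability declaration as the hand-written model above.
let experimental = Capabilities(samplingParams: false, effortLevels: [.low, .high])
let options = RequestOptions(temperature: 0.7, effort: .high)

// temperature is dropped because samplingParams is false.
print(requestFields(for: options, capabilities: experimental))   // ["effort": "high"]
```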

Reasoning effort is where the two worlds don't quite line up, and the package is honest about it. Apple's framework exposes reasoning hints that top out at high. Claude understands five levels — low, medium, high, xhigh, max — so the package adds fixedEffort: to pin a level for every request. It takes precedence over the framework's per-request hints, and it's the only way to reach xhigh or max:

ClaudeLanguageModel(name: .opus4_8, auth: auth, fixedEffort: .xhigh)

The API defaults to high when no effort is sent, and not every model accepts effort at all — another constraint the capability metadata enforces for you.
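The precedence rules in this section can be written down as one small function. This is a sketch under stated assumptions rather than the package's code: Effort and resolveEffort are hypothetical names, and dropping a level the model does not declare is an assumed policy. Returning nil here means "send no effort field", in which case, per the behavior described above, the API falls back to high.

```swift
// Hypothetical names for illustration only.
enum Effort: String {
    case low, medium, high, xhigh, max
}

// Returns the effort level to send, or nil to send nothing.
func resolveEffort(fixed: Effort?, hint: Effort?, supported: Set<Effort>) -> Effort? {
    // A pinned level takes precedence over the per-request hint.
    guard let candidate = fixed ?? hint else { return nil }
    // Assumed policy: omit a level the model does not declare.
    return supported.contains(candidate) ? candidate : nil
}

let allLevels: Set<Effort> = [.low, .medium, .high, .xhigh, .max]

let pinned = resolveEffort(fixed: .xhigh, hint: .low, supported: allLevels)
let hinted = resolveEffort(fixed: nil, hint: .medium, supported: allLevels)
let noEffortModel = resolveEffort(fixed: .max, hint: nil, supported: [])

print(pinned?.rawValue ?? "nothing sent")         // xhigh
print(hinted?.rawValue ?? "nothing sent")         // medium
print(noEffortModel?.rawValue ?? "nothing sent")  // nothing sent
```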

Keys in dev, proxies in production

Authentication is set through the auth: parameter, and the docs are blunt about the trap. The .apiKey mode is for development only. A key baked into a shipping app is extractable from the binary, and anyone who pulls it can run requests billed to your account.

For production, the package points you at .proxied: route requests through your own back end, which injects the Claude credential server-side so the app ships no key at all. You supply headers that ride along on every request, giving your proxy a way to authorize the actual caller. It's the same pattern any sane mobile-to-LLM deployment already uses, but it's good to see it called out as the default expectation rather than a footnote.
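What "the app ships no key" looks like at the HTTP level can be shown with plain Foundation. The sketch below does not use the package's .proxied API, whose exact signature isn't shown here; makeProxiedRequest, the proxy URL, and the X-App-Session header are hypothetical. The point is only that the client request carries caller-identifying headers and no Anthropic credential, which the back end adds before forwarding.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

// Hypothetical helper: builds a request aimed at your own back end.
// No API key appears anywhere on the client side.
func makeProxiedRequest(proxyURL: URL,
                        callerHeaders: [String: String],
                        body: Data) -> URLRequest {
    var request = URLRequest(url: proxyURL)
    request.httpMethod = "POST"
    request.httpBody = body
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    // These headers let the proxy authorize the actual caller.
    for (name, value) in callerHeaders {
        request.setValue(value, forHTTPHeaderField: name)
    }
    return request
}

// Placeholder endpoint and session token, for illustration only.
let proxyURL = URL(string: "https://llm-proxy.example.com/v1/messages")!
let request = makeProxiedRequest(
    proxyURL: proxyURL,
    callerHeaders: ["X-App-Session": "user-session-token"],
    body: Data("{}".utf8)
)

print(request.value(forHTTPHeaderField: "X-App-Session") ?? "missing")
```

The proxy is then the single place where the Claude credential lives, and also a natural place for per-user rate limits or quotas.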

What it signals, and the caveats

The honest framing isn't "Apple opened its models to developers" — it's the inverse. Apple opened a provider slot in Foundation Models, and a frontier vendor filled it. That's arguably more useful: the on-device model and a cloud model now share one ergonomic Swift surface, so hybrid apps can keep cheap, private inference local and reach for heavier reasoning only when a task demands it.

The caveats are real, though. This is explicitly beta, targeting an API introduced in the OS 27 betas, and the surface may shift before general availability. It requires iOS, macOS, visionOS, or watchOS 27 plus Xcode 27 — all beta — and the runnable example needs a macOS 27 host. The package is also intentionally narrow: it's a Foundation Models provider conformance and its configuration types (ClaudeLanguageModel, ClaudeModel, AuthMode, ClaudeServerTool), not a general-purpose Messages API client. For direct API access you still reach for the standard SDKs.

Still, the shape of the thing is the story. Write your app against one session API, declare what each model can do, and let routing — not rewrites — decide where the tokens go.

Sources & further reading

Apple Foundation Models — platform.claude.com

Discussion 3

Iris Lund @designer_iris · 18 hours ago

i love how seamlessly claude integrates with the foundation models framework, but i'm curious to see how the typography and overall ux will be handled when switching between local and cloud models - will it be a jarring experience for users?

Dmitri Sokolov @ai_doomer_dmitri · 20 hours ago

i'm intrigued by the potential of claude for foundation models, but i do wonder about the privacy implications of routing local api calls to a server-side language model - how will apple and anthropic ensure user data isn't inadvertently exposed in the process?

Marc Pope @marcpope · 8 hours ago

@ai_doomer_dmitri well according to the new Fable model (if they re-open it back up), they keep your data for 30 days no matter what. even if you have it set to private.
