FoundationModels: Quick Reference Cheatsheet

The scannable companion to the full iOS 26 FoundationModels reference — key types, patterns, and things not to do

1 March 2026 · 8 min read
AI-Ometer: Heavily assisted (75%)

This is the fast version. For full detail on any section, follow the deep-dive links to the complete reference.


The 15-Part Index

| Part | Covers | One thing to know |
| --- | --- | --- |
| 1 | Availability & Setup | .modelNotReady is transient — model is downloading, not missing |
| 2 | Sessions & Basic Prompting | respond() returns Response<T> — always unwrap .content |
| 3 | Prompt Engineering | Short beats long: under 200 words. Explicit rules beat prose descriptions |
| 4 | @Generable | Macro generates a structured output schema; @Guide adds constraints |
| 5 | Streaming | streamResponse() returns an AsyncSequence; use .collect() to finalise |
| 6 | Generation Options | temperature: nil or 0.0–0.2 for correction tasks; higher for creative |
| 7 | Tool Calling | Pre-fetch if always needed; define as a Tool only for conditional data |
| 8 | Token Budget | All inputs + output share one fixed window (~4,096 tokens) |
| 9 | The Transcript | New session per call for stateless tasks — don't accumulate history |
| 10 | Failure Modes | normalise() should never throw — return raw input on any failure |
| 11 | Testing | @Generable types are unit-testable without the model (memberwise init) |
| 12 | Use Cases | 10 concrete patterns: BJJ, recipes, journaling, commits, triage... |
| 13 | Quick Reference | Full type table + anti-patterns (see below) |
| 14 | Context Engineering | 4,096 tokens ≈ 3,000 words, shared. Select, don't dump |
| 15 | Advanced Patterns | call() runs @concurrent — hop to @MainActor for state access |

Key Types

| Type | Purpose |
| --- | --- |
| SystemLanguageModel | Singleton entry point — .default, .availability, .isAvailable |
| SystemLanguageModel.Availability | .available / .unavailable(reason) — always handle @unknown default |
| LanguageModelSession | One conversation thread. Stateful — holds the Transcript |
| Instructions | System prompt — set once at session creation, not per-turn |
| Prompt | Per-turn user input to the model |
| Response<Content> | Wrapper around typed output — always access .content, never the wrapper itself |
| ResponseStream<Content> | AsyncSequence of Snapshot<Content> for streaming |
| GenerationOptions | temperature, maximumResponseTokens, SamplingMode |
| @Generable | Macro — synthesises a guided generation schema for a struct or enum |
| @Guide | Companion macro on @Generable fields — description + constraints |
| GenerationGuide<T> | Constraint type: .range(), .count(), .pattern() |
| Transcript | Linear history: .instructions, .prompt, .response, .toolCalls, .toolOutput |
| Tool | Protocol — Arguments (Generable), Output (PromptRepresentable), call() |
| SystemLanguageModel.TokenUsage | .tokenCount — measure cost before injection |
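A minimal sketch tying the key types together in one call, assuming the API shapes listed in the table above (the function name is illustrative):

```swift
import FoundationModels

@available(iOS 26, *)
func correctedSummary(of text: String) async -> String {
    // SystemLanguageModel: singleton entry point for the availability check
    guard SystemLanguageModel.default.isAvailable else { return text }

    // Instructions are fixed at session creation; the prompt is per-turn
    let session = LanguageModelSession {
        "Summarise the user's text in one sentence."
    }

    // respond() returns Response<String> — unwrap .content, fall back on failure
    return (try? await session.respond(to: text).content) ?? text
}
```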

Session Init Variants

// Minimal — no tools, inline instructions
LanguageModelSession { "Correct BJJ terminology. kimora→Kimura, half card→Half Guard." }

// Explicit model
LanguageModelSession(model: SystemLanguageModel.default) { "..." }

// With tools
LanguageModelSession(tools: [PositionLookupTool()]) { "..." }

// Resume from saved transcript
LanguageModelSession(model: .default, tools: [], transcript: savedTranscript)

respond() vs streamResponse()

| | respond() | streamResponse() |
| --- | --- | --- |
| Returns | Response<Content> | ResponseStream<Content> |
| Best for | Background processing, pipelines | Live UI with typing effect |
| Partial results | No | Yes — snapshot.content returns PartiallyGenerated |
| Finalise stream | N/A | .collect() → Response<Content> |

Rule: if the output is going directly into a pipeline or SwiftData model, use respond(). If the user sees it appear on screen as it generates, use streamResponse().
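A hedged sketch of the streaming path, assuming snapshots expose .content as described above; `render` is an illustrative stand-in for whatever updates the UI:

```swift
import FoundationModels

@available(iOS 26, *)
func streamSummary(
    of text: String,
    render: @escaping @MainActor (String) -> Void
) async throws {
    let session = LanguageModelSession { "Summarise the user's text." }
    let stream = session.streamResponse(to: text)

    // Each snapshot carries the partial content generated so far
    for try await snapshot in stream {
        await render(snapshot.content)
    }

    // If nothing renders partials, skip iteration and finalise in one step:
    // let full = try await session.streamResponse(to: text).collect().content
}
```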


Token Budget Formula

Total window ≈ 4,096 tokens ≈ 3,000 words

instructions + tool definitions + transcript history + prompt + response

All five compete for the same pool. Response tokens are consumed from the same window as input tokens — a 500-token response leaves 3,596 tokens for everything else.

Measure before injecting:

let cost = try await model.tokenUsage(for: instructions).tokenCount
let window = await model.contextSize  // back-deployed via @backDeployed

@Generable vs Raw String

Use @Generable when:

  • You need multiple structured fields
  • Output must be parsed or processed programmatically
  • You want compile-time guarantees on shape
  • You need constraints (@Guide) on values

Use raw String when:

  • Output is prose for display to the user
  • You're summarising or generating a paragraph
  • Streaming a typing effect
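A sketch of the structured side, assuming the @Generable / @Guide shapes from the type table; the RecipeCard type and its fields are illustrative, not from the original:

```swift
import FoundationModels

// Typed output: the macro synthesises the schema, @Guide constrains values
@available(iOS 26, *)
@Generable
struct RecipeCard {
    @Guide(description: "Dish name, title case")
    var title: String

    @Guide(description: "Total minutes, realistic for a home cook", .range(5...240))
    var minutes: Int

    @Guide(description: "Short ingredient names")
    var ingredients: [String]
}

// Request the typed shape instead of parsing prose:
// let card = try await session.respond(to: prompt, generating: RecipeCard.self).content
```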

The AnyObject? Pattern (Availability Without @available Spread)

The problem: adding a @State private var session: LanguageModelSession? forces @available(iOS 26, *) onto the whole view.

The fix: use AnyObject? as the declared type and cast inside #available guards.

// In your view — no @available annotation needed on the view itself
@State private var normalisationService: AnyObject?

// In .onAppear or .task
if #available(iOS 26, *) {
    normalisationService = TranscriptNormalisationService()
}

// At call site
if #available(iOS 26, *),
   let service = normalisationService as? TranscriptNormalisationService {
    let result = await service.normalise(rawText)
}

Context Engineering — 4 Patterns

When app data is too large to inject directly:

| Pattern | When to use | How |
| --- | --- | --- |
| Select, Don't Dump | Data is queryable | SwiftData predicate — fetch only relevant rows |
| Layered Injection | Hierarchical data | Inject summaries; load detail on demand via tools |
| Two-Step Compression | Large corpus, summary needed | Call 1 summarises → Call 2 reasons with the summary |
| Pre-Summarise at Write Time | Rich entities with stable detail | Generate + store an AI summary when the entity is saved; reuse forever |
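A minimal sketch of "Select, Don't Dump" with SwiftData; TrainingNote and its fields are hypothetical — the point is fetching only relevant rows and capping how much gets injected:

```swift
import Foundation
import SwiftData

@Model
final class TrainingNote {
    var text: String
    var date: Date
    init(text: String, date: Date) { self.text = text; self.date = date }
}

func notesContext(matching term: String, in context: ModelContext) throws -> String {
    var descriptor = FetchDescriptor<TrainingNote>(
        predicate: #Predicate { $0.text.contains(term) },
        sortBy: [SortDescriptor(\.date, order: .reverse)]
    )
    descriptor.fetchLimit = 5   // cap injection — the whole window is ~4,096 tokens

    // Only the matching rows reach the prompt, not the whole table
    let rows = try context.fetch(descriptor)
    return rows.map(\.text).joined(separator: "\n")
}
```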

The 10 Anti-Patterns

1. Accessing response instead of response.content
respond() returns Response<T>, not T. Always unwrap .content.

2. Storing LanguageModelSession persistently when you don't need history
For stateless tasks (normalisation, extraction, classification), create a new session per call. History accumulates and eventually overflows the context window.

3. Too many tools
Each tool definition consumes ~50–100 tokens whether called or not. Keep to 3–5 per session. Split into multiple focused sessions if you have more.

4. Calling isAvailable / checkAvailability() in the hot path
Availability doesn't change mid-session. Check once at service init; cache the result.

5. High temperature for structured / correction tasks
@Generable correction types need temperature: nil or 0.0–0.2. High temperature produces creatively varied — and wrong — output.

6. Long, elaborate instructions modelled on frontier model prompts
The on-device model is ~3B parameters. Instructions over ~200 words dilute signal. Short, explicit rules outperform discursive prose every time.

7. Not testing the fallback path
On most devices today, Apple Intelligence is unavailable. Your non-AI path is the primary experience for most users. Test it as thoroughly as the AI path.

8. Using FoundationModels for regex-solvable tasks
If the task is a known, fixed pattern (extract a UUID, validate an email, format a date), use a deterministic function. LLM overhead — latency, availability, complexity — is waste.

9. Propagating @available(iOS 26, *) to SwiftUI views
Adding @available to a @State property forces the whole view to require iOS 26. Use the AnyObject? pattern instead.

10. Treating .modelNotReady as permanent
.modelNotReady means the model is downloading. It is transient. Show "not available right now" and retry on next app launch. Do not display a permanent "unsupported" message.
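Anti-patterns 2, 4, and 7 combine into one shape. A hedged sketch (SummaryService and aiEnabled are illustrative names): check availability once at init, create a fresh session per stateless call, and always return something on failure:

```swift
import FoundationModels

@available(iOS 26, *)
final class SummaryService {
    // Availability doesn't change mid-session — check once, cache the result
    private let aiEnabled = SystemLanguageModel.default.isAvailable

    func summarise(_ text: String) async -> String {
        // The non-AI path is the primary experience on most devices
        guard aiEnabled else { return text }

        // New session per call: no history to accumulate or overflow
        let session = LanguageModelSession { "Summarise in one sentence." }
        return (try? await session.respond(to: text).content) ?? text
    }
}
```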


Minimum Viable Service Pattern

The production-safe wrapper — never throws, falls back silently:

@available(iOS 26, *)
@MainActor final class TranscriptNormalisationService {

    private func makeSession() -> LanguageModelSession {
        LanguageModelSession {
            """
            You are a BJJ transcript corrector. Fix misrecognised terms only.
            Common corrections: kimora→Kimura, half card→Half Guard, arm bar→armbar.
            Vocabulary: Kimura, Triangle, Armbar, Half Guard, Full Guard, Mount, Back Control.
            Return the corrected transcript and the BJJ terms found.
            """
        }
    }

    /// Never throws. Returns raw input unchanged on any failure.
    func normalise(_ rawTranscript: String) async -> NormalisedTranscript {
        guard !rawTranscript.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty else {
            return NormalisedTranscript(normalisedText: rawTranscript, extractedTerms: [])
        }
        guard SystemLanguageModel.default.isAvailable else {
            return NormalisedTranscript(normalisedText: rawTranscript, extractedTerms: [])
        }
        do {
            let session = makeSession()
            let result = try await session.respond(
                to: Prompt { rawTranscript },
                generating: NormalisedTranscript.self
            )
            return result.content
        } catch {
            return NormalisedTranscript(normalisedText: rawTranscript, extractedTerms: [])
        }
    }
}
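The service above assumes a NormalisedTranscript output type. A sketch of what that @Generable struct might look like, inferred from the initialiser the fallback paths use — the @Guide descriptions are illustrative:

```swift
import FoundationModels

@available(iOS 26, *)
@Generable
struct NormalisedTranscript {
    @Guide(description: "The transcript with BJJ terms corrected, otherwise unchanged")
    var normalisedText: String

    @Guide(description: "Canonical BJJ terms found in the transcript")
    var extractedTerms: [String]
}

// The memberwise init is what makes the service's fallbacks — and unit
// tests — possible without ever invoking the model:
// NormalisedTranscript(normalisedText: raw, extractedTerms: [])
```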

Availability Cases at a Glance

| Case | Meaning | What to do |
| --- | --- | --- |
| .available | Ready to use | Create session, proceed |
| .unavailable(.deviceNotEligible) | Hardware doesn't support Apple Intelligence | Show permanent alternative UI; remove the AI option |
| .unavailable(.appleIntelligenceNotEnabled) | User hasn't enabled it in Settings | Optionally prompt the user; respect their choice |
| .unavailable(.modelNotReady) | Model weights downloading | Show "not available right now"; retry on next launch |
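The table maps onto a single switch. A hedged sketch, assuming the case names above; returning nil means proceed, a string means show that message instead of the AI option:

```swift
import FoundationModels

@available(iOS 26, *)
func availabilityMessage() -> String? {
    switch SystemLanguageModel.default.availability {
    case .available:
        return nil                                   // proceed — create a session
    case .unavailable(.deviceNotEligible):
        return "AI features aren't supported on this device."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Enable Apple Intelligence in Settings to use AI features."
    case .unavailable(.modelNotReady):
        return "AI features aren't available right now."   // transient — retry later
    case .unavailable(_):
        return "AI features are unavailable."        // future unavailable reasons
    @unknown default:
        return "AI features are unavailable."        // always handle @unknown default
    }
}
```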

Full 15-part reference: iOS 26 FoundationModels: Comprehensive Swift/SwiftUI Reference