Flue’s most important architectural decision is not a route, a tool, or a config file. It is the seam where Flue stops.
At the pinned source, Session imports Agent from @mariozechner/pi-agent-core and model/message types from @mariozechner/pi-ai. That is not an implementation detail. It is the boundary behind the hub thesis: Flue owns the harness layer and rents the lower-level model loop.
The source pin for this chapter is withastro/flue@dbaa9effa305561c627c6836559f8a0cbce67875.
Provider state crosses the seam for one call, then the harness restores the previous runtime scope.
Domain Word
In Flue, the provider seam is the boundary where Flue configures model execution before delegating to pi-agent-core and pi-ai. The model loop is the provider/tool-call loop that those packages run.
The invariant is: provider settings do not become harness memory, and model transport does not own Flue’s session, run, tool, or deployment semantics.
The twelve-factor pressure here is explicit dependencies. Flue should declare its provider and model packages as dependencies instead of hiding a forked provider loop inside its runtime.
Imports Tell The Truth
The pinned packages/runtime/src/session.ts starts with the dependency split:
import type { AgentMessage, AgentTool, AgentToolResult } from '@mariozechner/pi-agent-core';
import { Agent } from '@mariozechner/pi-agent-core';
import type {
  AssistantMessage,
  ImageContent,
  Model,
  ToolResultMessage,
  UserMessage,
} from '@mariozechner/pi-ai';
Those package names matter. Older prose that names the retired pi package scope is stale for this pin. The runtime package at this commit is @flue/runtime 0.5.3.
The Constructor Boundary
Session constructs the rented loop once it has prepared Flue-owned state:
this.harness = new Agent({
  initialState: {
    systemPrompt,
    model: this.config.model,
    tools,
    messages: previousMessages,
    thinkingLevel: this.config.thinkingLevel ?? 'medium',
  },
  getApiKey: (provider) => this.getProviderApiKey(provider),
  onPayload: (payload, model) => this.applyProviderPayloadOverrides(payload, model),
  toolExecution: 'parallel',
  sessionId: options.affinityKey,
});
Flue supplies the system prompt, current model, tool list, rebuilt context messages, API key lookup, payload hook, tool execution mode, and affinity ID. pi-agent-core then runs the loop.
Flue Session
├─ builds system prompt
├─ builds active-path messages
├─ builds built-in + custom tool list
├─ resolves role/model/thinking overrides
├─ resolves provider API keys
└─ constructs Agent(...)
        │
        ▼
pi-agent-core
        │ provider/tool-call loop
        ▼
pi-ai providers

After the loop: Flue syncs messages, emits events, saves history, aggregates usage, and checks compaction.
That is the seam. Flue configures the rented loop, but it does not become the provider SDK.
API Key Lookup
getProviderApiKey(provider) checks explicit provider configuration first and registered provider API keys second. If neither exists, it returns undefined and lets pi-ai fall through to its own environment lookup.
That order is the right shape for a framework:
| Source | Why it exists |
|---|---|
| configureProvider(...) override | Runtime app wants explicit provider config. |
| registered provider template | Generated or user code registered a prefix with credentials. |
| pi-ai fallback | Provider SDK may already know env-var conventions. |
The agent file should not scatter provider key lookup. The session owns the seam.
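The lookup order in the table can be sketched in a few lines. This is a minimal illustration, not Flue's implementation: KeySources, lookupProviderApiKey, and the two maps are hypothetical stand-ins for whatever the runtime actually stores, and the provider names and keys are made up.

```typescript
// Hypothetical stand-ins for illustration; the real registry lives in
// Flue's runtime and is not reproduced here.
interface KeySources {
  // Explicit per-provider configuration, for example set via configureProvider(...).
  configured: Map<string, { apiKey?: string }>;
  // Keys attached when a provider prefix was registered.
  registered: Map<string, string>;
}

// Returns the first key found. Returning undefined signals "no opinion",
// which lets the provider layer fall back to its own environment lookup.
function lookupProviderApiKey(sources: KeySources, provider: string): string | undefined {
  const explicit = sources.configured.get(provider)?.apiKey;
  if (explicit !== undefined) return explicit;
  return sources.registered.get(provider);
}

const keySources: KeySources = {
  configured: new Map([["openai", { apiKey: "explicit-key" }]]),
  registered: new Map([
    ["openai", "registered-key"],
    ["anthropic", "registered-only"],
  ]),
};

const keyFromConfig = lookupProviderApiKey(keySources, "openai");
const keyFromRegistry = lookupProviderApiKey(keySources, "anthropic");
const keyFallThrough = lookupProviderApiKey(keySources, "mistral");
```

The point of the shape is the last case: when neither source has a key, the function stays silent rather than guessing at environment variables.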
Provider Payload Override
applyProviderPayloadOverrides(...) is intentionally narrow. At the pinned source it only returns a modified payload for openai-responses and azure-openai-responses, and only when provider configuration has storeResponses === true.
The result is:
return { ...(payload as Record<string, unknown>), store: true };
This is a transport setting. It tells the OpenAI Responses API to store provider-side response state. It is not Flue session history, not replay state, and not run inspection. Flue’s durable harness record remains SessionHistory plus the configured session store.
If a reader collapses these ideas into one word, “memory”, they will misread the system. Provider retention is hosted transport behavior. Session history is harness-owned execution state.
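To make the narrowness concrete, here is a rough sketch of a payload hook with the same conditions. The ModelLike and ProviderSettings shapes, the api field used to match the two API names, and the choice to return undefined for "leave the payload alone" are all assumptions of this sketch; only the returned spread expression mirrors the line quoted from the source.

```typescript
// Hypothetical shapes for illustration; the real model and configuration
// types come from pi-ai and Flue.
interface ModelLike {
  api: string;
  provider: string;
}

interface ProviderSettings {
  storeResponses?: boolean;
}

const RESPONSES_APIS = new Set(["openai-responses", "azure-openai-responses"]);

// Returns a modified payload only for Responses-style APIs with
// storeResponses enabled. Returning undefined for every other case is an
// assumption of this sketch.
function applyStoreOverride(
  payload: unknown,
  model: ModelLike,
  settings: ProviderSettings | undefined,
): unknown {
  if (!RESPONSES_APIS.has(model.api)) return undefined;
  if (settings?.storeResponses !== true) return undefined;
  return { ...(payload as Record<string, unknown>), store: true };
}

const storedPayload = applyStoreOverride(
  { input: "hello" },
  { api: "openai-responses", provider: "openai" },
  { storeResponses: true },
);
const otherApiPayload = applyStoreOverride(
  { input: "hello" },
  { api: "some-other-api", provider: "example" },
  { storeResponses: true },
);
const disabledPayload = applyStoreOverride(
  { input: "hello" },
  { api: "openai-responses", provider: "openai" },
  { storeResponses: false },
);
```

Nothing in this hook reads or writes session history, which is the distinction the section draws between provider retention and harness state.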
Role And Call Scoping
Flue also owns model and thinking-level precedence before the loop runs.
For models, the pinned source uses:
- agent default model from config
- role model, if the effective role has one
- call-level model override
- requireModel(...) to fail clearly if no model exists
For thinking level, the precedence is:
- call-level thinking level
- role thinking level
- agent default thinking level
- 'medium'
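Both lists reduce to a chain of fallbacks. The sketch below assumes the model list is layered, with the call-level override winning over the role and the role winning over the agent default. The Scope type and both function names are hypothetical, and models are plain strings here rather than the model objects the source uses.

```typescript
// Hypothetical scope shape; each layer may or may not set a value.
interface Scope {
  model?: string;
  thinkingLevel?: string;
}

// Assumed precedence: call, then role, then agent default. Fails clearly
// when no layer supplies a model, in the spirit of requireModel(...).
function resolveModel(agent: Scope, role: Scope | undefined, call: Scope): string {
  const model = call.model ?? role?.model ?? agent.model;
  if (model === undefined) {
    throw new Error("No model configured for this call");
  }
  return model;
}

// Precedence as listed in the text, ending in the 'medium' fallback.
function resolveThinkingLevel(agent: Scope, role: Scope | undefined, call: Scope): string {
  return call.thinkingLevel ?? role?.thinkingLevel ?? agent.thinkingLevel ?? "medium";
}

const agentDefaults: Scope = { model: "agent-model" };
const reviewerRole: Scope = { model: "role-model", thinkingLevel: "high" };

const modelWithRole = resolveModel(agentDefaults, reviewerRole, {});
const modelWithCall = resolveModel(agentDefaults, reviewerRole, { model: "call-model" });
const levelWithRole = resolveThinkingLevel(agentDefaults, reviewerRole, {});
const levelDefault = resolveThinkingLevel(agentDefaults, undefined, {});
```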
withScopedRuntime(...) applies those scoped values to the underlying agent state for one call, then restores the previous tools, model, system prompt, and thinking level in a finally block.
That finally is load-bearing. Without it, one prompt(...) call could leak a role, tool set, model, or thinking level into the next call on the same session.
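The restore-in-finally shape is small enough to show in full. This is a simplified sketch, not the body of withScopedRuntime(...): RuntimeState and withScopedState are hypothetical names, values are plain strings, and the function is synchronous for brevity, where an async version would await the call inside the try block.

```typescript
// Hypothetical mutable state standing in for the underlying agent's state.
interface RuntimeState {
  tools: string[];
  model: string;
  systemPrompt: string;
  thinkingLevel: string;
}

// Applies call-scoped overrides, runs the call, and restores the previous
// values whether the call returns or throws.
function withScopedState<T>(
  state: RuntimeState,
  overrides: Partial<RuntimeState>,
  run: () => T,
): T {
  const previous: RuntimeState = { ...state };
  Object.assign(state, overrides);
  try {
    return run();
  } finally {
    Object.assign(state, previous);
  }
}

const runtimeState: RuntimeState = {
  tools: ["read"],
  model: "default-model",
  systemPrompt: "base prompt",
  thinkingLevel: "medium",
};

// The override is visible inside the call.
const modelSeenInside = withScopedState(
  runtimeState,
  { model: "role-model", thinkingLevel: "high" },
  () => runtimeState.model,
);

// A failing call still restores the previous scope.
let scopedCallThrew = false;
try {
  withScopedState(runtimeState, { model: "broken-model" }, () => {
    throw new Error("provider failed");
  });
} catch {
  scopedCallThrew = true;
}
```

After both calls, runtimeState holds its original values again, including after the call that threw.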
Model Resolution Belongs At Runtime Boundaries
packages/runtime/src/runtime/providers.ts holds the provider registry that backs model resolution. It lets runtime app code register provider prefixes, configure provider settings, attach platform-specific model bindings, and resolve a model string such as name/modelId against registered providers or known model catalogs.
This is why provider registration belongs near runtime app composition, not buried in every agent file. Build target and runtime environment decide which bindings and secrets exist. The session only needs a resolved model when it is about to run the call.
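As a rough picture of prefix-based resolution, the sketch below registers a provider prefix and resolves a model string against it. ProviderRegistrySketch, its methods, the baseUrl setting, and the URL are all hypothetical and do not reflect the API of providers.ts. The sketch covers only registered providers, not known model catalogs or platform bindings.

```typescript
// Hypothetical result shape for a resolved model string.
interface ResolvedModel {
  provider: string;
  modelId: string;
  baseUrl?: string;
}

// Illustrative registry keyed by provider prefix.
class ProviderRegistrySketch {
  private providers = new Map<string, { baseUrl?: string }>();

  register(prefix: string, settings: { baseUrl?: string } = {}): void {
    this.providers.set(prefix, settings);
  }

  // Splits "prefix/modelId" on the first slash so that model IDs may
  // themselves contain slashes, then looks up the prefix.
  resolve(modelString: string): ResolvedModel | undefined {
    const slash = modelString.indexOf("/");
    if (slash <= 0 || slash === modelString.length - 1) return undefined;
    const prefix = modelString.slice(0, slash);
    const settings = this.providers.get(prefix);
    if (settings === undefined) return undefined;
    return { provider: prefix, modelId: modelString.slice(slash + 1), ...settings };
  }
}

// Registration happens once, near app composition.
const sketchRegistry = new ProviderRegistrySketch();
sketchRegistry.register("local", { baseUrl: "http://localhost:8080/v1" });

const resolvedLocal = sketchRegistry.resolve("local/team/my-model");
const resolvedUnknown = sketchRegistry.resolve("unregistered/my-model");
const resolvedMalformed = sketchRegistry.resolve("no-slash");
```

Agent code then only ever names a model string; which prefixes resolve is decided where the app is composed.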
Why Not Reimplement The Loop?
Flue could have tried to own provider streaming, model catalogs, tool-call execution semantics, and provider-specific payload construction. That would make the runtime larger but not necessarily more valuable.
Instead, Flue spends its ownership budget on the harness surfaces that need to remain stable across model churn:
| Stable Flue surface | Rented lower-level surface |
|---|---|
| Sessions and active paths | Provider message formats |
| Tool contracts and sandbox adapters | Provider tool-call lifecycle |
| Compaction entries and retry recovery | Context overflow detection details |
| Run identity and inspection APIs | Streaming event protocol details |
| Build targets and app composition | Model catalog internals |
That is a conservative framework bet. Models and provider APIs will change faster than the headless runtime concepts.
What Breaks If This Boundary Drifts
| Drift | Failure |
|---|---|
| Flue reimplements provider streaming | Every provider API change becomes framework churn. |
| Provider hosted state is treated as replay source | Session recovery depends on data outside Flue’s store. |
| Role overrides are applied without restoration | One call silently contaminates the next. |
| API key lookup moves into user code | Agent files become environment adapters instead of work declarations. |
| storeResponses is explained as Flue memory | Operators misconfigure retention and expect replay guarantees they do not have. |
What To Copy
The copyable pattern is the seam shape: prepare your harness-owned state, pass a narrow set of hooks to the rented loop, and restore all call-scoped mutations in a finally block.
That pattern lets a framework adopt provider improvements without surrendering the semantics that make the framework useful.
Verify In Source
- packages/runtime/package.json has @mariozechner/pi-ai and @mariozechner/pi-agent-core.
- packages/runtime/src/session.ts imports Agent from @mariozechner/pi-agent-core.
- Session constructs new Agent(...) with initialState, getApiKey, onPayload, toolExecution, and sessionId.
- applyProviderPayloadOverrides(...) only sets store: true for OpenAI Responses APIs with storeResponses.
- withScopedRuntime(...) restores tools, model, system prompt, and thinking level in finally.
- packages/runtime/src/runtime/providers.ts owns provider registration, configuration, binding attachment, and registered model resolution.