Companion SDK & API reference
Embed the Pouchy companion into any web app, game, or device. All six capabilities — memory, reasoning, skills, agent-to-agent social, wallet, and Instant UI — are reachable over one isomorphic TypeScript SDK on an open REST/SSE + WebRTC protocol. Each capability rides a platform-neutral contract, so the same companion runs on web, native iOS/Android, or a CLI.
@pouchy_ai/companion-sdk · v0.11 · public npm · proprietary license
Overview
The Companion SDK gives any product its own Agent. The same companion — its memory, personality, voice and avatar — plugs into your surface and can act through tools, reason over your app's live state, and hold a continuous relationship with each user across sessions and devices.
- Standards-first. REST + SSE for messaging, WebRTC for voice, OpenAPI 3.1 for discovery.
- Isomorphic. Runs in the browser and in Node 18+; tree-shakeable, zero required runtime deps.
- Scoped & safe. Every capability — memory, skills, wallet, social, Instant UI — is gated by token scopes; sensitive ops are confirmation-gated, with an optional biometric step-up.
- Beyond the sandbox. The wire contracts are platform-neutral data, so Instant UI renders on web, native iOS/Android or a CLI, and skills can run host-local / device capabilities — not just HTTP.
Installation
# the SDK (isomorphic: browser + Node 18+)
npm i @pouchy_ai/companion-sdk
# optional — only for ElevenLabs Convai voice; OpenAI Realtime needs nothing extra
npm i @elevenlabs/client
No bundler? Load it from the CDN with an import map:
<script type="importmap">
{ "imports": { "@pouchy_ai/companion-sdk": "https://esm.sh/@pouchy_ai/companion-sdk" } }
</script>
<script type="module">
import { createCompanion } from '@pouchy_ai/companion-sdk';
const c = createCompanion({ baseUrl: 'https://www.pouchy.ai', token: PAT });
await c.connect();
</script>
Quickstart
Connect, subscribe to replies, open the stream, and talk — four calls:
import { createCompanion } from '@pouchy_ai/companion-sdk';
const companion = createCompanion({
baseUrl: 'https://www.pouchy.ai',
token: POUCHY_PAT // a server-minted Personal Access Token (pchy_…)
});
await companion.connect(); // handshake → HelloAck
companion.onMessage((text) => render(text)); // stream the reply
companion.start(); // open the reply channel (SSE)
await companion.sendText('hey — what should I do next?');
Authentication & scopes
Authenticate with a Personal Access Token (PAT), prefixed pchy_, sent as Authorization: Bearer <token>. Mint PATs server-side and hand the client a short-lived token — never embed a long-lived PAT in client code. OAuth 2.1 + PKCE is also supported for user-authorized integrations.
// Mint a PAT server-side and hand the client a short-lived token.
// Never ship a long-lived PAT in client code.
const companion = createCompanion({
baseUrl: 'https://www.pouchy.ai',
token: PAT, // Authorization: Bearer <token> on every call
modalities: ['text', 'voice'], // intersected with the token's grant
});
Sensitive capabilities are opt-in per token scope, for example wallet.spend, skills.execute, social.message, memory.*, and the representative scopes below. Instant UI rendering rides the non-sensitive ui.render scope. The handshake returns the effective grantedScopes.
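The scope and modality rules above can be mirrored on the client to decide which features to show. The following is a minimal sketch, not SDK code: intersectModalities and hasScope are hypothetical helpers, and treating a grant such as memory.* as a prefix wildcard is an assumption about how scope strings are matched.

```typescript
// Hypothetical helpers; neither function is part of the SDK.

/** Keep only the requested modalities that the token also grants. */
function intersectModalities(requested: string[], granted: string[]): string[] {
  const allowed = new Set(granted);
  return requested.filter((m) => allowed.has(m));
}

/**
 * True when `needed` is covered by the granted scopes.
 * Assumption: a grant ending in ".*" covers every scope under that prefix.
 */
function hasScope(grantedScopes: string[], needed: string): boolean {
  return grantedScopes.some(
    (g) => g === needed || (g.endsWith('.*') && needed.startsWith(g.slice(0, -1)))
  );
}
```

A host might call hasScope(ack.grantedScopes, 'ui.render') after connect() and only mount a renderer when it returns true. The server remains the authority on what a token may do; this check only avoids offering UI that would fail.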
Platform: sessions & Admin API
For developers and enterprises building on the dashboard: a project holds agent templates; every end user gets their own instance of an agent (separate memory, relationship progress, wallet). Your end users never handle keys — your backend holds a project Secret Key (pchy_sk_…, or pchy_sk_test_… for the unmetered test environment) and exchanges it per end user for a short-lived session token. First-seen external_user_ids are auto-provisioned.
# Your BACKEND exchanges the project Secret Key for a per-user session.
# First-seen external_user_ids are auto-provisioned — zero user setup.
POST /v1/sessions
Authorization: Bearer pchy_sk_…
{ "agent": "<agentId>", "external_user_id": "user_4211" }
→ { "session_token": "pchy_…", "expires_in": 3600,
"instance": { "id": "…", "external_user_id": "user_4211", "created": true } }
// Client: connect the SDK with the session token — same API as a PAT.
const companion = createCompanion({ baseUrl: 'https://www.pouchy.ai',
token: sessionToken, modalities: ['text'] });
The returned session_token is an ordinary bearer token for the client SDK: pass it as token to createCompanion(). Sessions carry the non-sensitive default scopes and can only be narrowed; agent-level capabilities on the template widen them:
- genui: true adds ui.render (Instant UI) to every session the template mints.
- A skills list of built-in catalog slugs pre-installs those skills on every instance with skills.execute granted (v1: credential-free public read APIs such as dictionary, wikipedia, exchange-rates, …; they auto-run without a confirm round-trip).
- A social policy ({ pair, crossProject }) opts instances into agent-to-agent pairing and text messaging via the SDK's pairVisitor flow. It is enforced server-side at pair creation; wallet interactions stay excluded.
- A wallet flag gives instances a read-only wallet (balance and own deposit address via wallet.read). End users fund their own instance and paired friends can send to it, but instances can never spend.
Templates also carry scripted scenes (story lines with triggers and beats, compiled into a pinned prompt section every turn), per-language voice defaults (voices; an ElevenLabs id is preferred on calls, an OpenAI preset for TTS/fallback), and an initialStage relationship seed applied to new instances only.
Projects additionally carry a shared knowledge corpus (POST /v1/projects/{id}/knowledge): upload product docs, FAQ, or lore once, and every instance recalls the relevant chunks semantically alongside the user's personal memory, cited by document name.
Plans cap monthly active users (free tier: 10). A new live user beyond the cap gets 402 limit reached, existing users keep working, and test keys are quota-exempt.
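The backend exchange at the start of this section could be wrapped in two small helpers. This is a sketch under stated assumptions: buildSessionRequest and parseSessionResponse are hypothetical names, /v1/sessions is assumed to live on the same origin as baseUrl, and the Content-Type header is assumed because the body is JSON. The request and response fields are the ones shown in the example above.

```typescript
// Shapes copied from the /v1/sessions exchange shown in this section.
interface SessionResponse {
  session_token: string;
  expires_in: number;
  instance: { id: string; external_user_id: string; created: boolean };
}

interface HttpRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build the request your backend sends.
// Assumption: /v1/sessions is served from the same origin as baseUrl.
function buildSessionRequest(
  baseUrl: string,
  secretKey: string, // project Secret Key; keep it on the backend only
  agentId: string,
  externalUserId: string
): HttpRequest {
  return {
    url: baseUrl.replace(/\/+$/, '') + '/v1/sessions',
    init: {
      method: 'POST',
      headers: {
        Authorization: 'Bearer ' + secretKey,
        'Content-Type': 'application/json', // assumption: JSON request body
      },
      body: JSON.stringify({ agent: agentId, external_user_id: externalUserId }),
    },
  };
}

// Hypothetical helper: turn the HTTP status and parsed body into a session.
function parseSessionResponse(status: number, body: unknown): SessionResponse {
  if (status === 402) {
    // The plan's monthly active user cap was hit for a new live user.
    throw new Error('monthly active user cap reached (HTTP 402)');
  }
  if (status < 200 || status >= 300) {
    throw new Error('session mint failed with HTTP ' + status);
  }
  const session = body as Partial<SessionResponse> | null;
  if (!session || typeof session.session_token !== 'string') {
    throw new Error('response is missing session_token');
  }
  return session as SessionResponse;
}
```

On Node 18+ the pair would typically be used as fetch(req.url, req.init), followed by parseSessionResponse(res.status, await res.json()). Only the resulting session_token should reach the client.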
Everything the dashboard does is also available programmatically via the Admin API with a project admin key (pchy_admin_…): list/create agents on /v1/admin/agents, read/update a template on /v1/admin/agents/{agentId} (updates bump templateRev; live instances re-apply the persona on their next session). Key types are strictly separated: secret keys can't manage, admin keys can't mint sessions.
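Because the key types are strictly separated, a backend can fail fast when the wrong kind of key is wired into a code path. The sketch below classifies a credential by the prefixes named in this section. classifyKey, canMintSessions, and canManageProject are hypothetical helpers, and prefix matching is only a local sanity check: the server decides what a key may actually do.

```typescript
// Hypothetical helper types; the prefixes come from this section.
type KeyKind = 'secret-test' | 'secret-live' | 'admin' | 'pat-or-session' | 'unknown';

function classifyKey(token: string): KeyKind {
  // Order matters: the more specific prefixes are checked first.
  if (token.startsWith('pchy_sk_test_')) return 'secret-test';
  if (token.startsWith('pchy_sk_')) return 'secret-live';
  if (token.startsWith('pchy_admin_')) return 'admin';
  if (token.startsWith('pchy_')) return 'pat-or-session';
  return 'unknown';
}

// Secret keys mint sessions; they cannot manage the project.
function canMintSessions(token: string): boolean {
  const kind = classifyKey(token);
  return kind === 'secret-test' || kind === 'secret-live';
}

// Admin keys manage the project; they cannot mint sessions.
function canManageProject(token: string): boolean {
  return classifyKey(token) === 'admin';
}
```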
# Manage the project from your backend — the dashboard is optional.
# Authorization: Bearer pchy_admin_… (project implied by the key)
GET/POST/PATCH/DELETE /v1/admin/agents[/{id}] # templates incl. draft/publish
GET/POST /v1/admin/keys · DELETE /{keyId} # secret keys (plaintext once)
GET /v1/admin/users · PATCH/DELETE /{iid} # suspend / GDPR-erase end users
GET/POST /v1/admin/webhooks · DELETE /{whid} # event push endpoints
GET /v1/admin/logs · /usage · /billing # audit, month usage, plan (RO)
GET/PATCH /v1/admin/project # rename / archive
Industry templates & quickstarts. Creating a project with { "template": "romance" | "game" | "hardware" | "ecommerce" | "life" } seeds 1–2 polished agent presets so the first session works out of the box. Two end-to-end samples walk the full loop (project → secret key → backend session mint → client SDK): romance companion (the flagship: memory + relationship arc) and commerce shopping guide (catalog grounding via tool calls + world-state). Agents also carry a publish lifecycle. With status: "draft", live-key mints are refused while test keys keep working, so you can tune a persona safely before going live.
Core concepts
- Session & surface
- Each surface (e.g. "game", "support-widget") is one resumable session. Reconnecting resumes from a cursor, so replies are never lost.
- World-state
- A stream of CloudEvents-shaped context (retained state + transient events) that grounds the companion in what is happening right now.
- Modalities
- Text and voice. Requested modalities are intersected with what the token grants.
- Tools
- Actions your app declares and performs on the companion's request — the bridge from "talk" to "do".
Capabilities
The companion exposes six capabilities over the SDK. Each is gated by a token scope, and each produces (or consumes) a platform-neutral payload — so the web app is just one host: a native iOS/Android app, a game engine, or a CLI can use the same capability by implementing the matching renderer/executor once.
| Capability | SDK surface | Scope |
|---|---|---|
| Memory | remember tool · recall() | memory.read/write:app · :core |
| Reasoning | the server agent loop (every turn) | chat |
| Skills | get_skills · run_skill · read_skill_resource · get_skill_prompt · host-declared tools | skills.execute |
| Social (A2A) | get_friends · send/message_friends · read_friend_messages · onSocialMessage | social.message |
| Wallet | get_wallet_balance · get_deposit_address · pay_friend · pay_address · onConfirmRequest | wallet.read (read-only) / wallet.spend |
| Instant UI | render_interface + update_interface · onRender / onInterfaceUpdate | ui.render |
createCompanion(options)
Returns a CompanionClient. Options:
| Option | Type | Description |
|---|---|---|
| baseUrl | string | Origin of the Pouchy deployment, e.g. "https://www.pouchy.ai". |
| token | string | A Pouchy Personal Access Token (PAT, pchy_…). Sent as a Bearer token. |
| surface | string? | Logical surface: one resumable session per surface (default "default"). |
| modalities | string[]? | Requested I/O modalities (e.g. ["text","voice"]); intersected with the token grant. |
| tools | CompanionToolDecl[]? | Tools the companion may ask this surface to perform. |
| handles | string[]? | Action types this surface can perform, declared at handshake. |
| contextKinds | string[]? | World-state kinds this surface emits, declared at handshake. |
| appContext | { name?, description? }? | Static description of your app/game so the companion is grounded in where it lives. |
| visitor | { id, displayName? }? | Open a representative (on-behalf-of) session for this visitor. Requires the represent scope. |
| stream | "sse" \| "websocket"? | Reply transport. SSE (default) works everywhere. "websocket" is a forward-compatible opt-in: the server endpoint is not live yet, so it automatically falls back to SSE today. |
Client methods
- connect(): Promise<HelloAck>
- Opens the session and performs the handshake. Returns the granted scopes, negotiated modalities, a resume cursor, and (for representative sessions) the visitor-pairing state.
- start(): void / stop(): void
- Open or close the inbound reply channel (SSE by default; stream: "websocket" is accepted but the server WS endpoint is not live yet, so it currently always falls back to SSE). Call start() once after connect().
- onMessage(handler: (text, envelope) => void): () => void
- Subscribe to streamed companion replies. Returns an unsubscribe function.
- sendText(text: string): Promise<void>
- Send a user message. The reply arrives on the stream as a companion.message; subscribe via onMessage().
- sendWorldState(input: WorldStateInput): Promise<void>
- Push live context: { type, data, retained? }. Retained values persist for the session; transient ones are one-off signals.
- connectCall(opts?): Promise<CompanionCall>
- Start a live, low-latency voice session over WebRTC. Returns a call handle with .end(); declared tools + host control actions stay available in-call.
- onToolCall(handler): () => void / sendToolResult(callId, result)
- Receive companion.tool_call for the tools you declared; args arrives as a JSON string. Reply with sendToolResult(callId, { ok, result }); the turn resumes once every call is reported.
- onRender(handler): () => void
- Instant UI: receive companion.ui_action. payload.interface is the platform-neutral genui schema; draw it with your own renderer (web / native iOS+Android / CLI). Requires the ui.render scope.
- onInterfaceUpdate(handler): () => void
- Receive companion.ui_update, a live { panelId?, updates:[{key,value}] } write into an already-rendered panel. Apply it to the panel state; no rebuild.
- onSocialMessage(handler): () => void
- Receive companion.social_message, an inbound A2A message from a paired friend, delivered cross-app to any embed holding social.message.
- onConfirmRequest(handler): () => void
- Subscribe to companion.confirm_request, a sensitive op awaiting approval ({ confirmId, scope, summary, stepUp? }). Platform session tokens resolve it with confirmAction; first-party user tokens are observe-only (approval is authed as the Pouchy user, where the biometric/passkey gate lives).
- confirmAction(confirmId, approve): Promise<{ status }>
- Resolve a pending confirmation (platform session tokens only). Show the event's summary, collect an explicit tap, pass the decision; on approve the recorded action runs server-side and its result returns as `outcome` in the response (render it where the user tapped) and as a normal companion.message on the stream. Re-resolving a settled confirmId fails with 409.
- pendingConfirms(): Promise<PendingConfirm[]>
- The session's still-pending confirmations (display-safe: confirmId, scope, summary, createdAt). Use it to rebuild your confirm card after a reload, since confirm_request events are not replayed. Session tokens only.
- onAudio / onExpression / onUsage(handler): () => void
- Typed subscriptions for companion.audio (TTS clips), companion.expression (avatar cues), and control.usage (per-token metering).
- recall(opts?: { limit?: number }): Promise<RecalledMemory[]>
- Read back the memories relevant to this session: content, kind, importance and namespace.
- getAvatar(): Promise<CompanionAvatar> / brandIconUrl(size?)
- Fetch the live avatar (3D VRM URL + 2D portrait, archetype, display name) to render your own front-end, and the Pouchy brand icon URL.
- pairVisitor(visitorToken: string): Promise<{ pairId }>
- Representative mode only: pair a visitor who is also a Pouchy user to unlock agent-to-agent (A2A) context. Requires the represent:pair scope.
- endSession(opts?: { transcript? }): Promise<void>
- Cleanly end the session and optionally fold a transcript into long-term memory.
- onError(handler): () => void
- Subscribe to transport / protocol errors ({ code, message }).
Events
Everything the companion does flows back as typed events on the stream you opened with start(). Subscribe with on(type, fn) (or '*'), or use the typed convenience helpers below. Each returns an unsubscribe function. Unknown event types are ignored, so the protocol is forward-compatible.
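The subscription semantics described above (per-type handlers, a '*' wildcard, an unsubscribe function, and silently ignored unknown types) can be pictured with a tiny dispatcher. This is an illustration of the behavior only, not the SDK's implementation: Envelope and EventHub are hypothetical names, and the envelope shape is an assumption.

```typescript
// Illustrative only; Envelope and EventHub are hypothetical.
interface Envelope {
  type: string;
  payload?: unknown;
}
type Handler = (event: Envelope) => void;

class EventHub {
  private handlers = new Map<string, Set<Handler>>();

  // Subscribe to one event type, or to '*' for every event.
  // Returns an unsubscribe function.
  on(type: string, fn: Handler): () => void {
    const bucket = this.handlers.get(type) ?? new Set<Handler>();
    this.handlers.set(type, bucket);
    bucket.add(fn);
    return () => {
      bucket.delete(fn);
    };
  }

  // Deliver to exact-type subscribers, then to wildcard subscribers.
  // Types with no subscribers fall through silently.
  dispatch(event: Envelope): void {
    for (const key of [event.type, '*']) {
      const bucket = this.handlers.get(key);
      if (bucket) Array.from(bucket).forEach((fn) => fn(event));
    }
  }
}
```

Ignoring types that have no subscriber is what keeps an older host working when the server starts emitting a new event type.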
companion.onMessage((text) => render(text)); // streamed reply
companion.onRender(({ interface: ui }) => myRenderer.draw(ui)); // Instant UI panel
companion.onInterfaceUpdate(({ update }) => myRenderer.apply(update));
companion.onSocialMessage(({ fromName, content }) => notify(fromName, content));
companion.onConfirmRequest((req) => showApproval(req)); // req.stepUp ⇒ gate with Face ID
companion.onAudio(({ url }) => play(url)); // TTS clip (non-call)
companion.onToolCall(async ({ id, name, args }) => { /* … */ });
companion.start(); // open the event stream
| Helper | Event | Purpose |
|---|---|---|
| onMessage | companion.message | Streamed assistant text. Pouchy strips its own internal state/memory objects server-side, so the text never carries a leaked state blob; JSON you explicitly ask the companion to produce is preserved. |
| onToolCall / sendToolResult | companion.tool_call | The companion asks your app to run a declared tool; you reply with the result. |
| onRender | companion.ui_action | Instant UI: draw payload.interface with your own renderer. Needs ui.render. |
| onInterfaceUpdate | companion.ui_update | Live {key,value} update to an already-rendered panel (no rebuild). |
| onSocialMessage | companion.social_message | Inbound A2A friend message, delivered cross-app. Needs social.message. |
| onConfirmRequest | companion.confirm_request | A sensitive-op approval request. Session tokens resolve it with confirmAction; first-party user tokens observe only (stepUp ⇒ biometric gate on the first-party surface). |
| onAudio | companion.audio | TTS clip reference (non-call modality). |
| onExpression | companion.expression | Avatar viseme / expression / gesture cue (for VRM embeds). |
| onUsage | control.usage | Per-token metering echo for a usage/billing view. |
| onError | control.error | Agent / transport errors ({ code, message }). |
World-state
Stream live context with sendWorldState({ type, data, retained? }). Retained values represent current state; transient ones are one-off signals. This is what lets the companion say "boss incoming — heal up" at the right moment.
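The difference between retained and transient inputs can be modeled locally, for example to debug what a host has told the companion so far. The sketch below is a hypothetical mirror, not how the server stores context: WorldStateView is an invented name, and keying retained values by type (latest value wins) is an assumption drawn from the description above.

```typescript
// Input shape from sendWorldState({ type, data, retained? }).
interface WorldStateInput {
  type: string;
  data: unknown;
  retained?: boolean;
}

// Hypothetical local mirror of what the host has sent.
class WorldStateView {
  private retained = new Map<string, unknown>();
  private transient: WorldStateInput[] = [];

  push(input: WorldStateInput): void {
    if (input.retained) {
      // Assumption: the latest retained value per type replaces the previous one.
      this.retained.set(input.type, input.data);
    } else {
      // Transient events are one-off signals; they are queued, not kept.
      this.transient.push(input);
    }
  }

  // Current retained state plus the transient events since the last snapshot.
  snapshot(): { state: Record<string, unknown>; events: WorldStateInput[] } {
    const state: Record<string, unknown> = {};
    this.retained.forEach((value, key) => {
      state[key] = value;
    });
    const events = this.transient;
    this.transient = [];
    return { state, events };
  }
}
```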
// Retained state — the latest value persists for the session:
companion.sendWorldState({ type: 'game.player', data: { hp: 12, level: 7 }, retained: true });
// Transient event — a one-off signal:
companion.sendWorldState({ type: 'game.event', data: 'boss_appeared' });
// The companion reasons over this live context (and, in voice, reacts in real time).
Tools & actions
Declare the actions your surface can perform; the companion calls them and you return a result. The same handler serves both text and voice sessions.
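Since a turn only resumes once every tool call has been reported, it helps if the host always produces a result, including for unknown tools or malformed arguments. One possible shape for that is a small registry and dispatcher, sketched below. runToolCall, ToolHandler, and the error field on the result are hypothetical; the source only documents { ok, result } and the fact that args arrives as a JSON string.

```typescript
// Call shape as delivered to onToolCall: args is a JSON string.
interface ToolCall {
  id: string;
  name: string;
  args?: string;
}
// `error` is an assumed extra field for failures.
interface ToolResult {
  ok: boolean;
  result?: unknown;
  error?: string;
}
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Hypothetical dispatcher: always returns a result so the turn can resume.
function runToolCall(registry: Record<string, ToolHandler>, call: ToolCall): ToolResult {
  const handler = Object.prototype.hasOwnProperty.call(registry, call.name)
    ? registry[call.name]
    : undefined;
  if (!handler) return { ok: false, error: 'unknown tool: ' + call.name };

  let args: Record<string, unknown>;
  try {
    const parsed: unknown = call.args ? JSON.parse(call.args) : {};
    if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
      return { ok: false, error: 'args must be a JSON object' };
    }
    args = parsed as Record<string, unknown>;
  } catch {
    return { ok: false, error: 'args is not valid JSON' };
  }

  try {
    return { ok: true, result: handler(args) };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
}
```

Inside an onToolCall handler the result would be passed straight to sendToolResult(call.id, runToolCall(registry, call)). Handlers here are synchronous for brevity; an async variant would await the handler before reporting.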
const companion = createCompanion({
baseUrl: 'https://www.pouchy.ai',
token: PAT,
tools: [
{
name: 'highlight_product',
description: 'Visually highlight a product card in the host UI',
parameters: { type: 'object', properties: { id: { type: 'string' } }, required: ['id'] }
}
]
});
// The companion CALLS your tool; you perform it locally and report the result.
// args arrives as a JSON string; reply with sendToolResult(callId, …).
companion.onToolCall(async ({ id, name, args }) => {
const a = args ? JSON.parse(args) : {};
if (name === 'highlight_product') {
highlight(a.id);
await companion.sendToolResult(id, { ok: true });
}
});
Instant UI
The companion can render an interactive panel on your surface, not just speak text — a form to collect a few values, a summary card with a progress bar, a chart, a set of choices. It arrives as companion.ui_action; you draw payload.interface with your own renderer.
The payload is platform-neutral JSON — a tree of ~20 typed atoms (Text, Slider, Select, Chart, Table, Group…) plus a state bag — so the same panel renders on web, native iOS (SwiftUI), Android (Jetpack Compose), or a terminal. The renderer is the only platform-specific part; the contract never changes. Input atoms write to state; set reportChanges and the user's edits stream back as a turn (the live-form loop), and companion.ui_update applies live value changes without a rebuild.
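Applying a companion.ui_update without a rebuild amounts to writing key/value pairs into the panel's state bag. A minimal sketch of that step follows. The update shape { panelId?, updates: [{ key, value }] } comes from the method reference; PanelState and applyInterfaceUpdate are hypothetical, as is the rule that an update without a panelId targets the current panel.

```typescript
// Update shape from companion.ui_update.
interface InterfaceUpdate {
  panelId?: string;
  updates: { key: string; value: unknown }[];
}
// Hypothetical host-side panel record: an id plus the state bag.
interface PanelState {
  panelId: string;
  state: Record<string, unknown>;
}

// Returns a new PanelState with the updates written into the state bag.
// Updates addressed to a different panel leave the panel unchanged.
function applyInterfaceUpdate(panel: PanelState, update: InterfaceUpdate): PanelState {
  // Assumption: a missing panelId means "the panel currently shown".
  if (update.panelId !== undefined && update.panelId !== panel.panelId) {
    return panel;
  }
  const state: Record<string, unknown> = { ...panel.state };
  for (const { key, value } of update.updates) {
    state[key] = value;
  }
  return { ...panel, state };
}
```

Returning a new object instead of mutating in place makes it easy for a renderer to detect the change and redraw only the affected atoms.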
// The companion renders an interactive panel; YOU draw the platform-neutral
// schema with your own renderer (web component, SwiftUI, Jetpack Compose, CLI).
let panel;
companion.onRender(({ interface: ui }) => {
panel = renderInterface(ui, mountEl, {
// If the panel set reportChanges, stream the user's edits back as a turn:
onReportChanges: (state) => companion.sendText('I adjusted: ' + JSON.stringify(state))
});
});
// Live, in-place updates — the companion changes a value without a rebuild:
companion.onInterfaceUpdate(({ update }) => panel.applyUpdate(update));
Reference renderers (vanilla web, SwiftUI, Jetpack Compose) and the full atom contract live in the Instant UI renderer guide. Gated by the ui.render scope.
Confirmations & biometric step-up
Sensitive actions — a payment, running a skill, messaging a friend — are never executed inline. The companion emits companion.confirm_request and waits. Who approves depends on the token: a platform session token (an end-user instance minted via /v1/sessions) resolves it directly with confirmAction — your end user is that account's only human, and this is how confirm-gated custom skills (POST / credentialed) run. A first-party user token is observe-only: approval happens on the user's own Pouchy surface (their app or the hosted confirm page, authed with their own login), never in a third-party DOM.
The request carries an advisory stepUp flag (set for irreversible money ops). When true, the first-party approval surface requires a stronger gesture — a passkey / Face ID / Touch ID — before approving, verified server-side via WebAuthn. Users without a passkey, or with step-up disabled, fall back to a tap; the custodial flow is unchanged.
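Two details of the confirmation flow are easy to get wrong in a host: confirm_request events are not replayed after a reload, and re-resolving a settled confirmId fails with 409. A host can guard against both with a small tracker that merges live events with the pendingConfirms() list and lets each confirmId be claimed only once. ConfirmTracker is a hypothetical helper, and the createdAt type is an assumption.

```typescript
// Display-safe fields named in the method reference; createdAt type is assumed.
interface PendingConfirm {
  confirmId: string;
  scope: string;
  summary: string;
  createdAt?: string;
  stepUp?: boolean;
}

// Hypothetical host-side bookkeeping for confirm cards.
class ConfirmTracker {
  private pending = new Map<string, PendingConfirm>();
  private settled = new Set<string>();

  // Add a live confirm_request or an entry from pendingConfirms().
  // Returns false for duplicates and for already-settled ids.
  add(req: PendingConfirm): boolean {
    if (this.settled.has(req.confirmId) || this.pending.has(req.confirmId)) {
      return false;
    }
    this.pending.set(req.confirmId, req);
    return true;
  }

  // Claim an id before calling confirmAction; a second claim returns false,
  // which avoids sending a duplicate resolution.
  claim(confirmId: string): boolean {
    if (!this.pending.delete(confirmId)) return false;
    this.settled.add(confirmId);
    return true;
  }

  // Cards that still need a decision.
  list(): PendingConfirm[] {
    return Array.from(this.pending.values());
  }
}
```

A host would call add() from both onConfirmRequest and the pendingConfirms() result after a reload, and call confirmAction only when claim() returns true.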
// Sensitive ops (pay, run a skill, message a friend) are NOT executed inline:
// the companion emits companion.confirm_request and waits.
companion.onConfirmRequest(async (req) => {
// req = { confirmId, scope, summary, stepUp? }
// PLATFORM SESSION TOKENS (your end users, minted via /v1/sessions):
// show your own confirm card and resolve it — this is how confirm-gated
// custom skills (POST / credentialed) run.
const approved = await showConfirmCard(req.summary); // your UI
await companion.confirmAction(req.confirmId, approved);
// FIRST-PARTY USER TOKENS (Login with Pouchy): observe-only — approval is
// authed as the Pouchy user (their app / hosted confirm page), where the
// stepUp === true money ops get a passkey / Face ID gate.
});
// After a reload, rebuild the card from the still-pending list:
const pending = await companion.pendingConfirms();
Voice
connectCall() opens a live, low-latency voice session over WebRTC. Transcripts fold back into the companion's memory on end(). Depending on the voice route the platform selects, the optional @elevenlabs/client peer dependency may be required (that is the npm package name); the default route needs nothing extra.
const call = await companion.connectCall({
voice: 'default',
locale: 'en',
onTranscript: (line) => console.log(line.role, line.text)
});
// world-state still flows during the call — the companion reacts live:
companion.sendWorldState({ type: 'game.event', data: 'low_health' });
await call.end(); // ends the voice session; transcript folds into memory
Representative mode (on-behalf-of)
Pass a visitor and the session flips from owner-facing to representative: the companion answers that visitor on the owner's behalf (customer-service style) using only screened owner context — never the owner's private memory, system prompt, or PII.
const c = createCompanion({
baseUrl: 'https://www.pouchy.ai',
token: OWNER_PAT, // must hold the `represent` scope
surface: 'support-widget',
appContext: { name: 'AcmeShop', description: 'Order support' },
visitor: { id: stableVisitorId, displayName: 'Sam' } // stable per end-user
});
const ack = await c.connect(); // → { representative: true, … }
await c.sendText('do you ship to Canada?');
| Scope | Grants |
|---|---|
| represent | Required: open a visitor-facing session. |
| expose:knowledge | Answer from the owner's knowledge base. |
| expose:facts | Share a wider set of screened facts. |
| represent:remember | Durable per-visitor notes across visits (isolated store). |
| represent:pair | pairVisitor(): pair a visitor who is also a Pouchy user to unlock A2A. |
REST & OpenAPI
The SDK is a thin wrapper over a documented HTTP API. The full, machine-readable contract is published as OpenAPI 3.1 at /api/companion/openapi.json — generate clients in any language.
# Every endpoint is Bearer-authenticated with a PAT.
curl https://www.pouchy.ai/api/companion/openapi.json # machine-readable spec (OpenAPI 3.1)
curl -X POST https://www.pouchy.ai/api/companion/message \
-H "Authorization: Bearer $POUCHY_PAT" \
-H "Content-Type: application/json" \
-d '{ "surface": "default", "text": "hello" }'
Errors
Failed calls throw a CompanionError with a stable code and message; transport/protocol errors also surface via onError(). Common codes: 401 (invalid token), 403 (missing scope — e.g. supplying a visitor without represent), 429 (rate limited).
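A host usually wants to react differently to each of these codes. The sketch below maps them to an action and computes a capped exponential backoff for the rate-limited case. classifyError and backoffDelayMs are hypothetical helpers, the assumption is that code carries the HTTP-style status as a number or numeric string, and the base and cap values are illustrative defaults rather than documented limits.

```typescript
type ErrorAction = 'reauthenticate' | 'check-scopes' | 'retry-later' | 'fail';

// Assumption: `code` holds the HTTP-style status (401, 403, 429),
// either as a number or as a numeric string.
function classifyError(err: { code: number | string }): ErrorAction {
  switch (Number(err.code)) {
    case 401:
      return 'reauthenticate'; // invalid token: mint a fresh one
    case 403:
      return 'check-scopes'; // missing scope: retrying will not help
    case 429:
      return 'retry-later'; // rate limited: back off, then retry
    default:
      return 'fail';
  }
}

// Capped exponential backoff; attempt 0 is the first retry.
// baseMs and capMs are illustrative defaults, not documented limits.
function backoffDelayMs(attempt: number, baseMs: number = 500, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```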
Support
Questions or an integration in progress? Request access and the team will help you plan capabilities, personas and wiring — or email support@pouchy.ai.

Social (agent-to-agent)
The companion can message the user's paired friends on their behalf (send_friend_message / message_friends, confirmation-gated), read a friend thread (read_friend_messages), and, as the cross-app half, surface an inbound friend message to your embed in real time via companion.social_message. So a companion in App A can message a friend whose companion runs in App B, and App B receives it. Requires the social.message scope.