Couples Therapy Bot
A project Grace Tempest is building: a structured, voice-and-video couples therapy bot
that runs an Emotionally Focused Therapy (EFT) session with a couple sitting on the
other side of a camera. This page is the working plan for v1, the therapeutic framework
we are encoding, the ethics posture, and the architecture for the web app.
Framing: this is an adjunct to therapy, not a replacement for a licensed clinician.
The product is consistent, low-cost, and available between sessions. It is not a covered
medical service.
Open the app in its own page
The working v0.1 app is best used as its own full-screen page so it feels like a real
product instead of an embedded block inside the wiki.
State persists in your browser, so your saved session is still there when you come back.
Live Prototype
A working v0.1 prototype of the platform is live as a full-screen app page. It runs
the full onboarding flow (consent, partner setup, personality picker, voice picker,
intake) and a mock EFT session that walks through Stages 1, 2, and 3 with the chosen
personality. State persists in the browser. No backend yet, no live LLM yet.
- Open the app: Couples (v0.1)
- Best on iOS: open in Safari and Add to Home Screen for the full PWA feel.
- What works: consent flow, partner names, personality selection (Warm Anchor / Gentle Guide / Direct Coach / Playful Companion), voice selection (8 OpenAI voices grouped by perceived gender), 4-question intake, mock chat session that escalates through EFT stages, session recap with homework, settings, and clear-all-data.
- What is mocked: the LLM responses are template-driven, the voice mic button is a stub, and there is no server yet. v0.2 wires in OpenAI Realtime over WebRTC.
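The browser-side persistence and the clear-all-data setting described above could be sketched roughly as follows. This is a minimal sketch, not the prototype's actual code: the page does not say which browser storage mechanism is used, so the `localStorage`-compatible `KeyValueStore` interface, the `SavedAppState` shape, and the `STATE_KEY` name are all assumptions made for illustration.

```typescript
// Hypothetical shape of the saved onboarding state; field names are assumptions.
interface SavedAppState {
  consented: boolean;
  partnerNames: [string, string];
  personality: string;
  voice: string;
}

// Minimal subset of the Web Storage API, so window.localStorage fits
// and tests can pass an in-memory stand-in.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const STATE_KEY = "couples.v0_1.state"; // assumed key name

function saveState(store: KeyValueStore, state: SavedAppState): void {
  store.setItem(STATE_KEY, JSON.stringify(state));
}

function loadState(store: KeyValueStore): SavedAppState | null {
  const raw = store.getItem(STATE_KEY);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as SavedAppState;
  } catch {
    // Corrupt JSON: treat as "no saved session" rather than crashing onboarding.
    return null;
  }
}

// Backs the clear-all-data setting.
function clearAllData(store: KeyValueStore): void {
  store.removeItem(STATE_KEY);
}

// In-memory store for tests or non-browser environments.
function createMemoryStore(): KeyValueStore {
  const data: Record<string, string> = {};
  return {
    getItem: (key) => {
      const value = data[key];
      return value === undefined ? null : value;
    },
    setItem: (key, value) => { data[key] = value; },
    removeItem: (key) => { delete data[key]; },
  };
}
```

In the browser, `window.localStorage` could be passed wherever a `KeyValueStore` is expected. Keeping the store injectable means the same functions can later point at a server-backed store.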
Therapeutic Framework
The bot follows Sue Johnson's 9-step EFT model as its long-arc structure, with the
EFT Tango as the per-turn macro intervention and RISSSC as the delivery style guide.
9-Step EFT Model
- Stage 1 - Cycle De-escalation:
- Step 1: Identify the relational conflict and key issues.
- Step 2: Identify the negative interaction cycle that keeps the issues stuck.
- Step 3: Access the unacknowledged emotions underneath each partner's position.
- Step 4: Reframe the problem in terms of the cycle, the underlying emotions, and unmet attachment needs.
- Stage 2 - Restructuring Interactional Positions:
- Step 5: Help each partner own disowned attachment emotions, needs, and aspects of self.
- Step 6: Promote acceptance of the other partner's experience.
- Step 7: Facilitate expression of needs and wants. This is where withdrawer re-engagement and pursuer softening happen.
- Stage 3 - Consolidation and Integration:
- Step 8: Help the couple build new solutions to old problems from the new emotional position.
- Step 9: Consolidate the new positions and the new cycle of attachment behavior.
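One way to make the long-arc structure machine-readable is to encode the nine steps as data and derive the stage from the step. The sketch below is illustrative only; the names `EFT_STEPS`, `stageForStep`, and `nextStep` are hypothetical, and the rule that progress advances one step at a time is an assumption rather than something this page specifies.

```typescript
type EftStage = 1 | 2 | 3;

interface EftStep {
  step: number;   // 1-9
  stage: EftStage;
  goal: string;
}

// Step goals are condensed from the list above.
const EFT_STEPS: readonly EftStep[] = [
  { step: 1, stage: 1, goal: "Identify the relational conflict and key issues" },
  { step: 2, stage: 1, goal: "Identify the negative interaction cycle" },
  { step: 3, stage: 1, goal: "Access unacknowledged emotions under each position" },
  { step: 4, stage: 1, goal: "Reframe the problem in terms of cycle, emotions, and needs" },
  { step: 5, stage: 2, goal: "Own disowned attachment emotions, needs, and aspects of self" },
  { step: 6, stage: 2, goal: "Promote acceptance of the other partner's experience" },
  { step: 7, stage: 2, goal: "Facilitate expression of needs and wants" },
  { step: 8, stage: 3, goal: "Build new solutions to old problems" },
  { step: 9, stage: 3, goal: "Consolidate new positions and the new cycle" },
];

// Look up which stage a step belongs to; rejects anything outside 1-9.
function stageForStep(step: number): EftStage {
  const match = EFT_STEPS.filter((s) => s.step === step)[0];
  if (!match) throw new RangeError(`EFT step must be 1-9, got ${step}`);
  return match.stage;
}

// Assumed progression rule: advance one step at a time; Step 9 is terminal.
function nextStep(step: number): number {
  stageForStep(step); // validates the input
  return Math.min(step + 1, 9);
}
```

Storing only the step number per couple and deriving the stage keeps the two from drifting out of sync.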
Macro Intervention: The EFT Tango
- Move 1: Mirror present process - reflect what is happening between the partners right now.
- Move 2: Affect assembly and deepening - slow down and name the deeper emotion under the surface reaction.
- Move 3: Choreograph an engaged encounter - invite one partner to share the deeper emotion directly with the other.
- Move 4: Process the encounter - help the receiving partner take it in and respond.
- Move 5: Integrate and validate - name what just happened and why it matters.
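Because each turn is meant to run the Tango as a structured plan, the five moves can be expressed as an ordered list of instructions filled in for the current speaker and listener. This is a sketch under assumptions: `TANGO_MOVES`, `buildTurnPlan`, and the `{speaker}` / `{listener}` placeholder syntax are hypothetical, and the instruction wording simply restates the list above.

```typescript
interface TangoMove {
  move: number;
  label: string;
  instruction: string; // may contain {speaker} and {listener} placeholders
}

const TANGO_MOVES: readonly TangoMove[] = [
  { move: 1, label: "Mirror present process",
    instruction: "Reflect what is happening between {speaker} and {listener} right now." },
  { move: 2, label: "Affect assembly and deepening",
    instruction: "Slow down and name the deeper emotion under {speaker}'s surface reaction." },
  { move: 3, label: "Choreograph an engaged encounter",
    instruction: "Invite {speaker} to share that emotion directly with {listener}." },
  { move: 4, label: "Process the encounter",
    instruction: "Help {listener} take it in and respond." },
  { move: 5, label: "Integrate and validate",
    instruction: "Name what just happened and why it matters." },
];

// Produce the ordered five-move plan for one turn.
function buildTurnPlan(speaker: string, listener: string): string[] {
  return TANGO_MOVES.map((m) => {
    const text = m.instruction
      .split("{speaker}").join(speaker)
      .split("{listener}").join(listener);
    return `Move ${m.move} (${m.label}): ${text}`;
  });
}
```

The resulting strings could be appended to the per-turn prompt so the model works through the moves in order instead of improvising.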
RISSSC Delivery Rules
- Repeat: repeat key phrases to anchor the moment.
- Images: use simple, vivid images.
- Simple: short sentences, plain language.
- Slow: slow the pace, especially around emotion.
- Soft: soft tone, low volume.
- Client's words: use the couple's own language back to them.
RISSSC is encoded as a style block in the model's system prompt and reinforced by
the TTS voice and pacing settings.
Additional EFT Principles to Encode
- Attachment-theory framing: partners as each other's secure base. Conflict is read as protest behavior, not pathology.
- Track the negative cycle: name and externalize the couple's "demon dialogues" - Find the Bad Guy, the Protest Polka, Freeze and Flee.
- Reflect and validate before reframing: nothing moves until each partner feels heard.
- Hold Me Tight conversations: bonding events as the target output of Stage 2 sessions.
- Slow escalation, de-pathologize: the bot never sides with a partner; it sides with the cycle being the enemy.
Ethics, Safety, and Confidentiality
- Hard guardrails: active domestic violence, suicidal ideation, child safety concerns, and acute substance crisis short-circuit the EFT loop and route to human-staffed crisis resources (988, local DV hotline). The session pauses; the bot does not attempt therapy during a crisis.
- Informed consent: a consent screen with a recorded acknowledgement is required before the first session. Both partners must consent.
- Clear disclaimer: not therapy, not a covered medical service. Recommend a licensed therapist for clinical issues.
- Data handling: at-rest encryption, per-couple isolation, granular delete, no training on user content, configurable retention windows (30, 90, 365 days).
- HIPAA posture: aim for BAA-able infrastructure (Vercel + a HIPAA-eligible model provider, or self-hosted). Launch v1 explicitly framed as non-clinical so PHI rules do not gate the first release.
- No manipulation of one partner against the other: the bot's job is the relational system, never alliance with one side.
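The configurable retention windows in the data-handling bullet could be enforced by a periodic sweep like the one sketched here. The `StoredTranscript` shape, the function names, and the fallback to the shortest window when a couple has no setting are assumptions for illustration, not decisions recorded on this page.

```typescript
type RetentionDays = 30 | 90 | 365;

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when an item is older than the couple's retention window.
function isPastRetention(createdAt: Date, retentionDays: RetentionDays, now: Date): boolean {
  return now.getTime() - createdAt.getTime() > retentionDays * MS_PER_DAY;
}

// Hypothetical record shape; only the fields the sweep needs.
interface StoredTranscript {
  id: string;
  coupleId: string;
  createdAt: Date;
}

// Return the ids a sweep should delete.
// Assumption: couples with no explicit setting get the shortest window (30 days).
function idsToDelete(
  items: StoredTranscript[],
  retentionByCouple: Record<string, RetentionDays>,
  now: Date
): string[] {
  return items
    .filter((t) => {
      const days = retentionByCouple[t.coupleId];
      return isPastRetention(t.createdAt, days === undefined ? 30 : days, now);
    })
    .map((t) => t.id);
}
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the sweep deterministic and easy to test.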
User Experience (v1)
- iOS-first responsive web app, installable as a PWA. Camera and mic permissions handled at session start.
- Couple sits together, taps Start Session. Camera and mic turn on.
- Bot listens, transcribes locally and on the server, and runs the EFT Tango on each turn.
- Session ends with a short recap, one bonding prompt, and a homework card the couple can revisit before the next session.
- Default session length is 30 minutes; couple can end early at any time.
Bot Personality and Voice
Each couple picks a bot personality and a voice during onboarding. They can change
either at any time from settings. All personalities are EFT-faithful: they share
the 9-step framework, the Tango, RISSSC, and the safety pipeline. Personality only
changes tone, pacing, and the kinds of phrasing the model favors.
Personality Options (v1)
- Warm Anchor (default): calm, grounding, slightly reflective. Leads with validation. Good for high-conflict couples in early sessions.
- Gentle Guide: extra soft, attachment-forward, lots of mirroring and gentle invitations. Good for partners who shut down easily.
- Direct Coach: still warm, but more structured. Names the cycle quickly, asks for clear next steps, holds the couple to their homework.
- Playful Companion: lighter touch, occasional warmth and humor, useful for couples in Stage 3 consolidation work.
Each personality is implemented as a versioned system-prompt block layered on top of
the shared EFT identity, so therapeutic behavior stays consistent and only style shifts.
Voice Options
Users pick a voice independent of personality. v1 ships OpenAI's built-in voices,
grouped by perceived gender so couples can choose what feels safest in the room:
- Feminine: Shimmer, Coral, Sage, Nova.
- Masculine: Echo, Onyx, Ash, Ballad.
- Neutral / non-binary: Alloy, Verse.
Default voice is paired to the chosen personality (for example Warm Anchor defaults
to Sage), but the user can override at any time. Voice settings (pace, warmth) are
tuned per personality and pushed through OpenAI's voice models.
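The default-plus-override rule for voices might look like the following sketch. Only the Warm Anchor to Sage pairing comes from this page; the other three defaults are placeholders chosen for illustration, and the lowercase voice identifiers are an assumption about how the names would be passed to the provider.

```typescript
type Personality = "Warm Anchor" | "Gentle Guide" | "Direct Coach" | "Playful Companion";

type VoiceId =
  | "shimmer" | "coral" | "sage" | "nova"
  | "echo" | "onyx" | "ash" | "ballad"
  | "alloy" | "verse";

// Groupings as listed above, for rendering the voice picker.
const VOICE_GROUPS: Record<"feminine" | "masculine" | "neutral", readonly VoiceId[]> = {
  feminine: ["shimmer", "coral", "sage", "nova"],
  masculine: ["echo", "onyx", "ash", "ballad"],
  neutral: ["alloy", "verse"],
};

const DEFAULT_VOICE: Record<Personality, VoiceId> = {
  "Warm Anchor": "sage",        // stated on this page
  "Gentle Guide": "shimmer",    // assumption, not from the page
  "Direct Coach": "ash",        // assumption, not from the page
  "Playful Companion": "coral", // assumption, not from the page
};

// The user's explicit choice always wins over the personality default.
function resolveVoice(personality: Personality, override?: VoiceId): VoiceId {
  return override !== undefined ? override : DEFAULT_VOICE[personality];
}
```

For example, `resolveVoice("Warm Anchor")` yields `"sage"`, while `resolveVoice("Warm Anchor", "onyx")` yields `"onyx"`.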
How Voice Is Implemented
- Primary path: OpenAI Realtime API (gpt-realtime) over WebRTC for true voice-to-voice with low latency. The browser negotiates a peer connection directly with OpenAI; the server only mints short-lived ephemeral session tokens. This keeps latency low enough to feel like a real conversation.
- Fallback path: Whisper for speech-to-text + GPT for the EFT turn + OpenAI TTS (tts-1-hd) streamed back. Used when Realtime is unavailable or for cheaper async sessions.
- Personality at the model layer: the active personality's system prompt is injected into the Realtime session config, along with the current 9-step stage and the couple memory snippet.
- Safety pipeline still runs: Realtime audio is mirrored to a transcript stream, and each user turn passes through the DV / suicidality / minor-at-risk classifier before the model is allowed to keep going.
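The per-turn safety gate in the last bullet can be sketched as a function that sits between the transcript stream and the model. Everything here is hypothetical: the classifier is injected as a plain function because this page does not specify one, a real classifier would almost certainly be asynchronous, and the resource label for the minor-at-risk category is a placeholder since the page names only 988 and a local DV hotline.

```typescript
type RiskCategory = "domestic_violence" | "suicidality" | "minor_at_risk";

interface RiskResult {
  flagged: RiskCategory[];
}

// Injected classifier; synchronous here only to keep the sketch small.
type TurnClassifier = (transcript: string) => RiskResult;

type GateDecision =
  | { action: "continue" }
  | { action: "pause_and_route"; categories: RiskCategory[]; resources: string[] };

const CRISIS_RESOURCES: Record<RiskCategory, string> = {
  suicidality: "988",
  domestic_violence: "local DV hotline",
  // Placeholder: this page does not name a resource for this category.
  minor_at_risk: "local child-safety resource (placeholder)",
};

// Run before every model turn: any flag pauses the session and routes to humans.
function gateTurn(transcript: string, classify: TurnClassifier): GateDecision {
  const flagged = classify(transcript).flagged;
  if (flagged.length === 0) return { action: "continue" };
  return {
    action: "pause_and_route",
    categories: flagged,
    resources: flagged.map((c) => CRISIS_RESOURCES[c]),
  };
}
```

The caller would forward the turn to the EFT loop only when the decision is `continue`, so a positive hit never reaches the model.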
Intake and Onboarding
Onboarding gathers context the model needs to be useful from the first session, and
screens out cases that should go directly to a human clinician.
- Relationship history: how long together, married or not, kids, prior therapy.
- Conflict frequency and topics.
- Intimacy frequency and satisfaction.
- Individual mental health screens: PHQ-9 (depression), GAD-7 (anxiety). Both are short and well validated.
- Relationship satisfaction: Couples Satisfaction Index (CSI-16) or DAS-7.
- Attachment style: ECR-S short form for each partner.
- Safety screen: short DV and abuse screener. A positive result routes to resources and blocks the couples session path.
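Scoring for the two individual screens and the safety-screen gate could be sketched as below. The severity bands are the standard published cut points for PHQ-9 (total 0-27) and GAD-7 (total 0-21). The function names and the shape of `intakeRoute` are assumptions, and what the product does with a given severity is left open because this page does not say.

```typescript
type Severity = "minimal" | "mild" | "moderate" | "moderately severe" | "severe";

// Sum Likert answers (each 0-3) after validating count and range.
function sumItems(answers: number[], expectedCount: number): number {
  if (answers.length !== expectedCount) {
    throw new RangeError(`expected ${expectedCount} answers, got ${answers.length}`);
  }
  let total = 0;
  answers.forEach((a) => {
    if (a !== 0 && a !== 1 && a !== 2 && a !== 3) {
      throw new RangeError(`each answer must be 0, 1, 2, or 3; got ${a}`);
    }
    total += a;
  });
  return total;
}

// PHQ-9: 9 items, cut points at 5, 10, 15, and 20.
function phq9Severity(answers: number[]): { total: number; severity: Severity } {
  const total = sumItems(answers, 9);
  const severity: Severity =
    total >= 20 ? "severe" :
    total >= 15 ? "moderately severe" :
    total >= 10 ? "moderate" :
    total >= 5 ? "mild" : "minimal";
  return { total, severity };
}

// GAD-7: 7 items, cut points at 5, 10, and 15.
function gad7Severity(answers: number[]): { total: number; severity: Severity } {
  const total = sumItems(answers, 7);
  const severity: Severity =
    total >= 15 ? "severe" :
    total >= 10 ? "moderate" :
    total >= 5 ? "mild" : "minimal";
  return { total, severity };
}

// A positive safety screen routes to resources and blocks the couples path.
function intakeRoute(safetyScreenPositive: boolean): "resources" | "couples_session" {
  return safetyScreenPositive ? "resources" : "couples_session";
}
```

Validating answer count and range up front avoids silently scoring a partially completed questionnaire.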
Internal Tracking and Follow-up
- Per-couple memory: recurring themes, the named negative cycle, each partner's primary emotions, agreed homework, and unresolved threads.
- Session log: timestamp, current 9-step stage focus, which Tango moves were used, any risk flags raised.
- Auto-seeded follow-ups: next session opens with prompts pulled from the prior session's open threads. Example: "Last time Sam named feeling unseen when work calls run late. Want to revisit that?"
- Admin dashboard for Grace: cohort view, risk-flag queue, session counts, churn signals. This is internal-only.
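The session log and auto-seeded follow-ups might be modeled as shown in this sketch. The field names, the `resolved` flag, and the default cap of two prompts per session are assumptions; the prompt wording follows the example given above.

```typescript
interface OpenThread {
  partner: string;   // who raised it
  summary: string;   // phrased to follow "named", e.g. "feeling unseen when ..."
  resolved: boolean;
}

// Hypothetical log entry covering the fields listed above.
interface SessionLogEntry {
  timestamp: string;        // ISO 8601
  stepFocus: number;        // current 9-step focus, 1-9
  tangoMovesUsed: number[]; // Tango move numbers, 1-5
  riskFlags: string[];
  openThreads: OpenThread[];
}

// Build opening prompts for the next session from unresolved threads.
function seedFollowUps(prior: SessionLogEntry, limit: number = 2): string[] {
  return prior.openThreads
    .filter((t) => !t.resolved)
    .slice(0, limit)
    .map((t) => `Last time ${t.partner} named ${t.summary}. Want to revisit that?`);
}
```

Capping the number of prompts keeps the session opening short; the remaining threads stay in memory for later sessions.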
Product Aesthetic - Matted Purple
The product brand is calm, low-contrast, matted purple. Not neon, not glossy. Soft
surfaces, generous spacing, humanist sans typography, 16px rounded corners. No emojis
in product chrome.
- Background #1a1626
- Surface #251f33
- Divider #2e2640
- Primary #8b7ab8
- Accent #a89bc9
- Text #e8e3f2
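The palette above could be kept in one place as design tokens and emitted as CSS custom properties. This is only a sketch: the `--color-*` and `--radius` variable names are assumptions, while the hex values and the 16px radius come from this section.

```typescript
const MATTED_PURPLE = {
  background: "#1a1626",
  surface: "#251f33",
  divider: "#2e2640",
  primary: "#8b7ab8",
  accent: "#a89bc9",
  text: "#e8e3f2",
} as const;

const RADIUS_PX = 16;

// Emit a :root block with one custom property per token.
function toCssVariables(): string {
  const keys = Object.keys(MATTED_PURPLE) as Array<keyof typeof MATTED_PURPLE>;
  const lines = keys.map((k) => `  --color-${k}: ${MATTED_PURPLE[k]};`);
  lines.push(`  --radius: ${RADIUS_PX}px;`);
  return `:root {\n${lines.join("\n")}\n}`;
}
```

Generating the stylesheet from a single token object keeps the app and any marketing page on the same colors.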
V1 Architecture
- Frontend: Next.js App Router, deployed on Vercel, installable as an iOS PWA. Camera and mic via WebRTC.
- Voice loop: OpenAI Realtime API (gpt-realtime) over WebRTC as the primary path - true voice-to-voice with low latency, browser negotiates directly with OpenAI using a server-minted ephemeral token. Fallback path: Whisper for STT + GPT for the EFT turn + OpenAI TTS (tts-1-hd) streamed back.
- LLM prompting: layered system prompt - EFT identity + active personality block (Warm Anchor / Gentle Guide / Direct Coach / Playful Companion) + RISSSC style + current 9-step stage + couple memory snippet + safety guardrails. Each turn explicitly runs the Tango as a structured plan, not freeform chat.
- Non-voice LLM calls: session summarization, follow-up generation, and the safety classifier go through Vercel AI Gateway with a provider/model string so providers stay swappable.
- Auth: Clerk via the Vercel Marketplace. One couple is one shared workspace with two named partners.
- Storage: Supabase Postgres with RLS for couple profiles, intake responses, session summaries, and the follow-up queue. Encrypted blob storage for raw transcripts on Vercel Blob (private).
- Background work: Vercel Functions for session summarization and follow-up generation. Vercel Cron to enqueue weekly check-ins.
- Safety pipeline: every user turn runs through a fast classifier (DV, suicidality, minor-at-risk) before reaching the EFT loop. Positive hits short-circuit to crisis routing.
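The layered system prompt from the LLM prompting bullet could be assembled along these lines. This is a sketch under assumptions: the `PromptLayers` field names, the `## ` section headers, and the wording of `RISSSC_STYLE` are hypothetical. The layer order follows the order listed in that bullet.

```typescript
interface PromptLayers {
  eftIdentity: string;       // shared across all personalities
  personalityBlock: string;  // versioned, style only
  risssc: string;
  currentStep: number;       // 1-9
  coupleMemory: string;
  safetyGuardrails: string;
}

// Illustrative wording, condensed from the RISSSC delivery rules.
const RISSSC_STYLE =
  "Repeat key phrases. Use simple, vivid images. Short sentences, plain language. " +
  "Slow the pace around emotion. Soft tone. Use the couple's own words.";

// Join the layers in a fixed order so every turn sees the same structure.
function buildSystemPrompt(layers: PromptLayers): string {
  const sections: Array<[string, string]> = [
    ["EFT identity", layers.eftIdentity],
    ["Personality", layers.personalityBlock],
    ["RISSSC style", layers.risssc],
    ["Current stage", `Step ${layers.currentStep} of 9`],
    ["Couple memory", layers.coupleMemory],
    ["Safety guardrails", layers.safetyGuardrails],
  ];
  return sections
    .map(([title, body]) => `## ${title}\n${body.trim()}`)
    .join("\n\n");
}
```

Because the personality block is just one section, swapping personalities changes a single layer and leaves the EFT identity and guardrails untouched.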
Roadmap
- v0.1 (1-2 weeks): static marketing page, intake forms, consent flow, personality + voice picker, single text-only session running the EFT Tango with couple memory write-back.
- v0.2: voice in and voice out via OpenAI Realtime, iOS PWA polish, safety classifier in the request path.
- v0.3: video on (presence only, no analysis yet), follow-up engine, Grace's admin view.
- v0.4: clinician-in-the-loop pilot with Grace reviewing flagged sessions.
Open Questions
- Single therapist persona, or a named "Grace-style" persona that mirrors how Grace would actually run a session?
- Should Grace approve each personality block before it ships, or do we treat them as design surface that can iterate freely?
- Do we let couples switch voice mid-session, or lock the voice for the duration of a session to preserve the therapeutic relationship?
- HIPAA-grade infrastructure at launch, or non-clinical framing first and harden later?
- Pricing model and whether sessions are time-boxed or unlimited per month.
- Do we record video, or audio only? Video adds presence but raises the consent and storage bar.