Premonition

A prepared mind behind the voice.

An experimental backend for conversational AI that predicts likely next moves, prepares safe draft responses in shadow mode, waits for reality to confirm a branch, and then measures what actually improved.


21.7%

Base readiness

First branch immediately usable at baseline.

76.5%

Guarded readiness

Usable prepared draft after branch confirmation.

25/25

Stress folds

Recovery rule cleared ESConv shuffled gates.

560 ms

Latency saved

Median estimate when preparation is selected.

Anticipate - prepare - protect - learn

Measured preparedness

Not prophecy. A better rehearsal loop.

Premonition is built around a simple distinction: the speaking model should stay present, while a hidden backend rehearses likely next turns and keeps every speculative draft behind a confirmation gate.

Observe the live moment

The system starts with the current conversation or replay turn. Reality remains the root, not the model's imagination.

Prepare possible next moves

The backend predicts likely response modes, builds bounded drafts, and keeps them hidden until the observed branch matches.

Grade what happened

Every hit, miss, quality gate, latency estimate, and segment regression feeds back into the benchmark loop.

Current evidence

The backend is getting measurably more ready.

In held-out tests, the important question is not whether the system can blindly speak early. It is whether, once the next conversational direction becomes clear, a useful speech-ready draft is already waiting.

21.7% to 76.5%

Usable prepared response availability after branch confirmation, up from the original first-speech baseline.

Base state

0.217

Probability Pack

0.546

Guarded swarm

0.765

+0.219

Quality-ready lift

Recovery policies increased prepared coverage while preserving the TTS-readiness quality floor.
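
As a quick check on the ladder above, the stage-to-stage gains can be recomputed from the three reported values. The +0.219 figure matches the gap between Probability Pack and guarded swarm, though this page does not state that it was computed that way; treat the snippet as arithmetic on the published numbers, with `step_gains` being a hypothetical helper.

```python
# Readiness values as reported on this page.
ladder = [
    ("Base state", 0.217),
    ("Probability Pack", 0.546),
    ("Guarded swarm", 0.765),
]


def step_gains(stages):
    """Gain of each stage over the previous one, rounded to three decimals."""
    return [
        (f"{prev_name} -> {name}", round(value - prev_value, 3))
        for (prev_name, prev_value), (name, value) in zip(stages, stages[1:])
    ]


gains = step_gains(ladder)
total_gain = round(ladder[-1][1] - ladder[0][1], 3)
for label, gain in gains:
    print(f"{label}: {gain:+.3f}")
print(f"Base state -> Guarded swarm: {total_gain:+.3f}")
```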

+0.102

Mean stress gain

Average quality-ready improvement across the 5-seed x 5-fold ESConv stress test.

10/15

Second-corpus transfer

DailyDialog promoted safely on ten of fifteen folds, while the five weak folds stayed at baseline instead of forcing regressions.
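
The fallback behavior described here, promoting a fold only when it is safe and otherwise keeping the baseline, might look something like the sketch below. It is an illustration, not the project's recovery policy: `FoldResult`, `decide`, `min_gain`, the protected-regression count, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FoldResult:
    name: str
    baseline_ready: float       # quality-ready rate without the candidate policy
    candidate_ready: float      # quality-ready rate with the candidate policy
    protected_regressions: int  # regressions on protected slices (assumed metric)


def decide(fold: FoldResult, min_gain: float = 0.0) -> str:
    """Return "promote" only for a regression-free gain; otherwise "baseline"."""
    gain = fold.candidate_ready - fold.baseline_ready
    if gain > min_gain and fold.protected_regressions == 0:
        return "promote"
    return "baseline"


# Made-up folds for illustration only; these are not Premonition results.
folds = [
    FoldResult("fold-a", 0.50, 0.62, 0),
    FoldResult("fold-b", 0.55, 0.53, 0),  # candidate is worse: stay at baseline
    FoldResult("fold-c", 0.48, 0.60, 2),  # gain, but a protected slice regressed
]
decisions = {fold.name: decide(fold) for fold in folds}
promoted = sum(choice == "promote" for choice in decisions.values())
print(f"{promoted}/{len(folds)} promoted", decisions)
```

The point of the design is that a weak fold costs nothing: it simply keeps the baseline behavior rather than shipping a regression.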

149

Verification checks

Replay, Probability Pack, live-shadow, CLI, and dashboard tests covered the current implementation checkpoint.

Parallel cognition

One lane speaks. One lane rehearses. Reality teaches both.

This is the core experiment: keep the front-facing agent fast and present, while the hidden backend prepares branches in parallel and waits for confirmation before anything reaches the user.

Conscious conversation

User speaks

"I am not sure what to do next."

Frontend LLM

Stays present, understands the current turn, and avoids carrying the whole future in working memory.

Agent replies

Uses confirmed context only. Speculation stays hidden.

Hidden Premonition backend

Reality and learning loop

Real next turn

Validate

Confirmed branch

The observed next move matched a prepared validation branch. The draft can be used quickly after confirmation.

Grade

An exact match feeds the benchmark report.

Refinement

Hits, misses, weak response modes, and protected-slice regressions update the next run.

Live Shadow Lab

The experiment can run beside a real conversation.

The lab observes transcript turns, generates hidden Premonition drafts, accepts the actual next move, grades draft readiness, estimates saved latency, and exports replay rows for future benchmarks.
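
One way to picture that loop is as a grading function plus a JSON Lines export. The sketch below is hypothetical throughout: the row fields, `grade_turn`, `export_jsonl`, the branch labels, and the latency numbers are assumptions made for illustration, and the real lab's replay format is not described on this page.

```python
import json
from statistics import median


def grade_turn(turn_id, prepared, observed_branch, draft_latency_ms):
    """Grade one shadow turn; `prepared` maps branch label -> hidden draft."""
    hit = observed_branch in prepared
    return {
        "turn_id": turn_id,
        "observed_branch": observed_branch,
        "prepared_branches": sorted(prepared),
        "hit": hit,
        # Assumption: a hit saves the time the draft would have taken to generate.
        "saved_ms": draft_latency_ms if hit else 0,
    }


def export_jsonl(rows):
    """Serialize graded turns as JSON Lines, one replay row per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


# Made-up turns for illustration only.
rows = [
    grade_turn(1, {"asks_for_options": "...", "expresses_doubt": "..."},
               "expresses_doubt", 500),
    grade_turn(2, {"asks_for_options": "..."}, "changes_topic", 450),
    grade_turn(3, {"says_thanks": "..."}, "says_thanks", 700),
]
hit_savings = [row["saved_ms"] for row in rows if row["hit"]]
median_saved_ms = median(hit_savings) if hit_savings else 0
replay_export = export_jsonl(rows)
print(f"median saved on hits: {median_saved_ms} ms")
print(replay_export)
```

Taking the median only over turns where a prepared draft was actually used mirrors the "when preparation is selected" framing of the latency figure, though the exact estimator used by the project is not specified here.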

[Figure: Premonition Live Shadow Lab interface with conversation, prepared drafts, grading metrics, and timeline lanes]

Transcript in. Drafts prepared. Reality graded. Shadow-mode testing before voice runtime integration.

Architecture layers

The conscious model does not carry the whole future.

Premonition separates the visible conversation from the background rehearsal layer, then scores whether rehearsal actually helped.

Branch generator

Predicts likely next events or response modes from the current context.

Artifact builder

Prepares bounded drafts, policy checks, and tool plans for useful branches.

Safety filter

Keeps speculation hidden until reality confirms a matching branch.

Replay evaluator

Compares prediction, preparedness, latency, cost, and safety across runs.

Progression

From offline replay to a voice-ready shadow layer.

The current project is still a research harness, not a production voice agent. The honest milestone is that the backend can now be measured, visualized, and tested in shadow mode.

Baseline

Offline replay harness

Measure branch prediction, prepared artifacts, latency estimates, and unsafe leak rate across repeatable turns.

Loop

Probability Pack and recovery policies

Train guarded preparation behavior and accept only improvements that survive held-out checks.

Now

Live Shadow Lab

Run beside manual or live transcript turns, inspect prepared drafts, and export new benchmark rows.

Voice runtime integration

Connect the shadow layer to a TTS/voice stack so confirmed drafts can reduce perceived response delay.

[Figure: Premonition swarm outcome dashboard showing readiness reach, quality-ready lift, stress promotion, and policy ladder]

Outcome dashboard: Base state vs. Probability Pack vs. guarded swarm.

Can AI become more present by preparing before it speaks?

Premonition is a build-in-public attempt to answer that question with benchmarks, shadow-mode testing, and clear safety boundaries. The magic is not prediction. The magic is measured readiness.
