05 · feedback · mixed

What happens after you act on a recommendation

A rec is written. You read it, act on it (or don't), and mark its disposition. From that point on, two things happen: the markdown file gets patched to reflect your action, and the system's overall action-rate updates. If action-rate falls below 30%, the system declares itself broken.

The full feedback loop

  A. rec is created: markdown + DB row, status = open
     (operator picks act, snooze (≤7d), or dismiss)
  B. surface writes the disposition: UI button (Lovable / FastAPI) or markdown edit (learning)
  C. rec store updated: status, acted_disposition, fit_1_5, snoozed_until
  D. next /rx-* invocation: step 0.5 auto-revive + step 0.7 Phase W reconcile
  E. markdown patched: status, acted_at, phase_w_synced_at
  F. action-rate updates (per-domain rolling fraction) and feeds back into A

Snoozed recs auto-revive back to open — lazily, at the next invocation.
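
The lazy auto-revive at step 0.5 can be sketched as below. This is a minimal illustration, not the system's code: the function name auto_revive, the dict-shaped rec rows, and the choice to treat a snooze as expired once snoozed_until is at or before the current time are all assumptions.

```python
from datetime import datetime, timezone


def auto_revive(recs, now):
    """Return copies of the rec rows with expired snoozes flipped back to open.

    Hypothetical helper: each rec is a plain dict with "status" and
    "snoozed_until" keys (assumed shape, not the real schema).
    """
    revived = []
    for rec in recs:
        rec = dict(rec)  # copy so the caller's rows are not mutated
        until = rec.get("snoozed_until")
        # Assumption: a snooze counts as expired when snoozed_until <= now.
        if rec.get("status") == "snoozed" and until is not None and until <= now:
            rec["status"] = "open"
        revived.append(rec)
    return revived
```

Because nothing runs between invocations, a snooze that expired three days ago is only flipped when the next /rx-* command calls a step like this one.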

The reconcile step is lazy: it fires the next time you run an /rx-* command.

Phase W reconciler in plain language

Phase W = the reconciler. After you act on a rec in the UI, the database row is updated. But the markdown file on disk still says status: open. The reconciler is the step that fixes that mismatch. It runs at step 0.7 of the next /rx-<door> invocation: queries recent rec rows, compares each to its on-disk markdown, and patches the frontmatter where the DB has newer state. If everything is already in sync, it does nothing.

Two assumptions worth naming. The reconciler leans on rec frontmatter staying machine-parseable YAML, and on nothing writing these files concurrently — a safe bet today because the system is single-operator and single-laptop, and the reconcile runs inline in the command rather than as a daemon. There is no lock protocol because there is no second writer. The cost: a hand-edit that breaks the YAML silently stops the patch for that file until it's fixed.
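
A minimal sketch of the compare-and-patch step, working on already-parsed frontmatter rather than files. The function name reconcile and the dict shapes are hypothetical; the field names come from the frontmatter fields this section lists. As a simplification, any difference between the DB row and the frontmatter is treated as "the DB has newer state", which the real reconciler may decide differently.

```python
from datetime import datetime, timezone

# Fields copied from the DB row into frontmatter (names from this section).
PATCHED_FIELDS = (
    "status",
    "acted_disposition",
    "acted_at",
    "subjective_fit_1_5",
    "outcome_note",
    "snoozed_until",
    "snooze_count",
)


def reconcile(frontmatter, db_row, now):
    """Return (patched_frontmatter, changed) without mutating the inputs.

    Assumption: a differing value in db_row always wins over the markdown.
    """
    patched = dict(frontmatter)
    changed = False
    for field in PATCHED_FIELDS:
        if field in db_row and patched.get(field) != db_row[field]:
            patched[field] = db_row[field]
            changed = True
    if changed:
        # Stamp the sync marker only when the file is actually touched.
        patched["phase_w_synced_at"] = now.isoformat()
    return patched, changed
```

Running it twice with the same row is a no-op the second time, which is the "if everything is already in sync, it does nothing" behaviour described above.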

What gets patched into frontmatter

Field | Patched when
--- | ---
status | DB row moved from open → snoozed / acted / dismissed
acted_disposition | operator picked acted_as_prescribed / acted_modified / skipped / dismissed
acted_at | any non-open transition stamps this
subjective_fit_1_5 | operator gave a 1–5 rating in the UI
outcome_note | operator typed a free-text reflection
snoozed_until · snooze_count | operator snoozed the rec
phase_w_synced_at | stamped every time the reconciler touches the file (idempotency marker)

The disposition options

acted_as_prescribed

Did the exact thing the rec said. Strongest positive signal.

acted_modified

Took the direction but altered the specifics. Still counts as acted for action-rate.

skipped

Made an explicit choice not to act but agreed with the framing. The one non-acting disposition that stays in the action-rate denominator: it pulls the rate down without counting as a win. It is recorded as its own acted_disposition value, separate from dismissed. See the formula below.

dismissed

Rejected the rec entirely. Excluded from the action-rate — a rejection is treated as a rec-quality signal (tracked via acted_disposition and the fit score), not a follow-through miss, so it neither helps nor hurts the ratio.

The subjective fit score (1–5)

Required whenever the operator acts on a rec (a Pydantic validator rejects an acted_* disposition without it); optional on the non-acting dispositions. It is the operator's rating of how well the rec fit the actual moment. Useful for /rx-analyze later — recs from sources or signals that consistently get fit ≤2 are candidates for weight reduction.
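
The rule can be illustrated with a standard-library stand-in for the Pydantic validator. The class name Disposition and the error messages are hypothetical; only the rule itself (fit required on acted_* dispositions, optional otherwise, range 1 to 5) comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

ACTED = {"acted_as_prescribed", "acted_modified"}
NON_ACTING = {"skipped", "dismissed"}


@dataclass
class Disposition:
    """Stand-in for the real Pydantic model (assumed shape)."""

    acted_disposition: str
    subjective_fit_1_5: Optional[int] = None

    def __post_init__(self):
        if self.acted_disposition not in ACTED | NON_ACTING:
            raise ValueError(f"unknown disposition: {self.acted_disposition}")
        fit = self.subjective_fit_1_5
        if fit is not None and not 1 <= fit <= 5:
            raise ValueError("subjective_fit_1_5 must be between 1 and 5")
        # The fit score is mandatory only when the operator acted.
        if self.acted_disposition in ACTED and fit is None:
            raise ValueError("an acted_* disposition requires subjective_fit_1_5")
```

Constructing Disposition("skipped") succeeds without a score, while Disposition("acted_as_prescribed") raises ValueError.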

Outcome attribution — the finance closed loop

Finance is the only door where the loop closes all the way to P&L. The trades table has a related_rec_id FK (D-046) — when a trade is journalled, the operator can link it back to the rec that triggered it. /rx-analyze later joins trades to recs to compute per-signal hit-rate weighted by realised P&L.

[Diagram: rec (id, drift, signal) → acted → trade (entry + related_rec_id FK) → close → realised P&L (exit_price - entry_price) → attributed → per-signal hit-rate (weights tunable)]
Closed-loop attribution lives in finance only. Fitness equivalent is "did the workout actually happen" — proxied by next-session log existence.

De-biasing the loop — when feedback flatters itself

A closed loop has a failure mode that looks like learning: a rec nudges the operator into a trade, the trade's P&L is then credited back to the rec, and the system congratulates itself for an outcome it caused rather than predicted. Three app-side mechanisms break that flywheel.

Attribution split

Each trade carries rec_influence_kind: preceded_independent (the idea predated or was independent of the rec) vs influenced (the rec drove it). It is operator-set at trade capture, not inferred from timestamps. /rx-analyze's predictive_lift excludes influenced trades: only independent (and unclassified legacy) trades can credit a rec, so the system can't take credit for moves it caused.
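
The exclusion rule amounts to a filter applied before any lift is computed. A minimal sketch, assuming trades are dicts and that an unclassified legacy trade is one whose rec_influence_kind is missing or None; the function name creditable_trades is hypothetical.

```python
def creditable_trades(trades):
    """Keep only the trades allowed to credit a rec.

    Assumption: legacy trades have no rec_influence_kind (missing or None).
    """
    allowed = {"preceded_independent", None}
    return [t for t in trades if t.get("rec_influence_kind") in allowed]
```

Anything the operator marked influenced never reaches the lift calculation, whatever its P&L.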

Explicit hypothesis linkage

Recs now carry linked_hypothesis_ids named at compose time (match_type="explicit"). The old D-046 substring heuristic isn't removed — it's demoted to a fallback suggestion: any already-explicitly-linked hypothesis suppresses its substring match, and the rest are tagged match_type="substring_fallback". The old landmine — a rec mentioning "NVDA" silently matching every nvda-slug hypothesis — is defused.
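
A sketch of how explicit links can suppress the substring fallback. The matching rule used here (the leading token of a hypothesis slug appearing as a word in the rec text) is a guess at the old heuristic, and the function name and data shapes are hypothetical.

```python
def match_hypotheses(rec_text, linked_ids, hypotheses):
    """Return [(hypothesis_id, match_type), ...] for one rec.

    hypotheses maps id -> slug, e.g. "nvda-earnings-beat" (assumed shape).
    """
    # Explicit links always come first and are never downgraded.
    matches = [(hid, "explicit") for hid in linked_ids]
    explicit = set(linked_ids)
    words = set(rec_text.lower().split())
    for hid, slug in hypotheses.items():
        if hid in explicit:
            continue  # already linked explicitly: suppress the substring match
        ticker = slug.split("-")[0]  # assumed: slug starts with the ticker
        if ticker in words:
            matches.append((hid, "substring_fallback"))
    return matches
```

With linked_ids=["h1"], a rec mentioning NVDA still surfaces other nvda-slug hypotheses, but only as tagged suggestions rather than silent matches.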

Value vs engagement

Action-rate says recs get acted on; it says nothing about whether acting paid. P&L-per-rec sums realised P&L over a rec's linked closed trades; a divergence_flag fires when action-rate ≥ 30% AND total value < 0 — green engagement, red money. Surfaced in the /rx-analyze report. The value metric generalises per door (P&L for finance, drift-improvement for fitness/nutrition, goal-progress for learning) — but only finance has hard money today.
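
The two pieces can be sketched as follows. The names pnl_per_rec and the realised_pnl key are hypothetical, open trades are assumed to carry realised_pnl = None, and "total value" is read here as the sum across recs, which is an interpretation rather than something the text pins down.

```python
from collections import defaultdict


def pnl_per_rec(trades):
    """Sum realised P&L over each rec's linked closed trades."""
    totals = defaultdict(float)
    for t in trades:
        rec_id = t.get("related_rec_id")
        pnl = t.get("realised_pnl")
        # Skip unlinked trades and trades that are still open (assumed None).
        if rec_id is None or pnl is None:
            continue
        totals[rec_id] += pnl
    return dict(totals)


def divergence_flag(action_rate, total_value, threshold=0.30):
    """Green engagement, red money: rate at or above 30% with negative value."""
    return action_rate >= threshold and total_value < 0
```

A door with a 58% action-rate and a net loss across its linked trades would raise the flag; the same loss at a 20% action-rate would not, because the engagement side is already signalling trouble.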

Action-rate itself is unchanged — still acted / (acted + skipped) (dismissed + snoozed excluded). De-biasing adds a value axis beside it; it doesn't touch the engagement formula.

Prediction accuracy and model drift (finance)

The finance door sits on top of a price-prediction model, so the loop has a second self-monitoring layer the other doors don't. Two things are measured continuously, and one of them is itself a rec signal.

Offline accuracy — measured without any traffic

With a single operator there is no click-stream, so quality is judged against the market, not against engagement:

Model-drift detector — the drift_alerts signal's source

A regime change can silently degrade a checkpoint: predictions stay plausible but get systematically worse on exactly the tickers being traded. The detector catches that by watching its own error rate:

Mechanism: compare recent_mape (last 30 days) to all_time_mape
Fires when: recent_mape / all_time_mape ≥ 1.5 (recent error is 1.5× the baseline)
Minimum data: ≥10 recent samples AND ≥30 all-time samples
Action on fire: writes a DriftAlert row + Telegram ping. Alert-only: no auto-retrain, no auto-demote, no reweight. Idempotent per (ticker, horizon, model).
Feeds back as: the drift_alerts component of the finance drift composite (weight 0.15)

Why alert-only, and why it was built. A Kronos checkpoint degraded to roughly 2× MAPE after a regime shift; it was noticed only when realised P&L drifted from the predicted move over several weeks. The fix was to surface degradation proactively as a rec signal rather than a separate ops alarm. It stays alert-only because a 30-day / 10-sample window is too thin to justify auto-demoting a long-running rule — regime changes are often transient, and a human acknowledgment is a circuit-breaker that doesn't amplify short-term variance.

Note this is model drift (prediction-quality degradation), distinct from the user-state drift the rx composite measures — and now also distinct from the two further detectors below.

Two more drift detectors (pure math, not yet scheduled)

Two newer detectors were added as pure functions — deterministic, cheap, and deliberately unwired for now: the series-assembly and scheduling (a cron) aren't in place yet, so today they're callable building blocks, not live alarms.

Detector | Mechanism | Fires when
--- | --- | ---
Preference drift | least-squares slope of action-rate / subjective-fit over an evenly-spaced rolling window (e.g. per week) | slope ≤ −0.05 → declining; ≥ +0.05 → improving; else stable (<2 points → insufficient_data)
Embedding-distribution shift | cosine distance between rolling per-domain embedding centroids of incoming chunks | distance > 0.15 → shifted

Preference drift is the discrete answer to "is the operator quietly losing trust?" — it watches the slope, not just whether the absolute number sits below a band. The embedding-shift monitor watches whether the corpus itself is drifting under the index. Both are distinct from the MAPE-ratio model-drift detector above.
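
Both detectors fit in a few lines of pure Python. This is a sketch under stated assumptions: the slope is measured per window step (x = 0, 1, 2, ...), centroids are plain lists of floats, and the function names are hypothetical.

```python
import math


def preference_drift(series, threshold=0.05):
    """Classify the least-squares slope of an evenly spaced series."""
    n = len(series)
    if n < 2:
        return "insufficient_data"
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series))
    den = sum((i - mean_x) ** 2 for i in range(n))
    slope = num / den  # change per window step (assumed unit)
    if slope <= -threshold:
        return "declining"
    if slope >= threshold:
        return "improving"
    return "stable"


def cosine_distance(a, b):
    """1 - cosine similarity; raises on zero-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("centroid has zero norm")
    return 1.0 - dot / (norm_a * norm_b)


def embedding_shifted(prev_centroid, curr_centroid, threshold=0.15):
    """True when the rolling centroid moved more than the threshold."""
    return cosine_distance(prev_centroid, curr_centroid) > threshold
```

For example, a weekly action-rate series of 0.6, 0.5, 0.4 has a slope of about −0.1 per week and classifies as declining even though every value is still above the 30% band.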

Action-rate as system-death signal

D-003 · the < 30% rule

An open-loop recommender that nobody acts on is broken regardless of the quality of its content. The /rx-*-history commands surface this metric per domain. The threshold:

[Gauge, example reading: 58%, healthy.]

The tick at 30% marks the GREEN/YELLOW boundary, which is also the bar the future cross-domain meta-rec must clear to unlock. Below 15% the door is RED (critical). The gauge is coloured by band.

action_rate = acted / (acted + skipped)

The formula filters on acted_disposition, and only three of the four values appear in it. Numerator = acted_as_prescribed + acted_modified. Denominator = those plus skipped. dismissed and snoozed are both excluded entirely — so, perhaps counter-intuitively, a skip (agreed but didn't act) drags the rate down while a dismiss (rejected outright) doesn't touch it. The reasoning: the metric measures follow-through among recs the operator accepted as legitimate; a dismissal is a rec-quality problem handled elsewhere, not a follow-through miss.
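
As a worked sketch of the formula and the bands: the function names are hypothetical, snoozed recs are assumed to have no acted_disposition (None), and returning None when nothing is in the denominator is a choice made for this example rather than documented behaviour.

```python
from collections import Counter


def action_rate(dispositions):
    """acted / (acted + skipped); dismissed and snoozed never enter."""
    counts = Counter(dispositions)
    acted = counts["acted_as_prescribed"] + counts["acted_modified"]
    denominator = acted + counts["skipped"]
    if denominator == 0:
        return None  # assumption: no accepted recs means no rate yet
    return acted / denominator


def band(rate):
    """Map a rate to its colour band (30% and 15% cutoffs from the text)."""
    if rate is None:
        return "NO_DATA"
    if rate >= 0.30:
        return "GREEN"
    if rate >= 0.15:
        return "YELLOW"
    return "RED"
```

With 4 acted_as_prescribed, 3 acted_modified, 5 skipped and 3 dismissed, the rate is 7 / 12 ≈ 58%: the three dismissals change nothing, while each additional skip would pull the figure down.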

Why the band cutoff is global while every other dial is per-door: the 30% threshold measures operator trust in the rec channel itself, not domain dynamics — an unacted rec costs the same attention in every door, regardless of whether that door decides daily (fitness) or slowly (finance). The window is already per-domain; a per-door cutoff is a v1.x candidate once every door has enough dispositioned recs for one to be statistically meaningful.

Proposals only, never auto-applied (D-019). Even when /rx-analyze is allowed to run, it only proposes weight adjustments — it never rewrites them. The operator approves each change. This is the same human-in-the-loop stance as the model-drift detector above: the system surfaces, the operator decides.

Per-door surface comparison

The reconciler is split. Phase W has two implementations: one reads Supabase (handles fitness rows), the other reads TradingV postgres (handles finance rows). They share no code, which is fine because the doors don't share storage anyway.

Door | How operator dispositions | Reconcile direction
--- | --- | ---
Zeus | Lovable verocity UI button → Supabase UPDATE | Supabase → markdown (step 0.7)
Athena | no surface (D-047) | not applicable (no DB writes)
Lakshmi | FastAPI panel → POST /v1/rx/recs/{id}/disposition | TradingV postgres → markdown (step 0.7)
Ganesh | /rx-learning-status <id> directly edits markdown | not applicable: markdown is authoritative