02 · ingestion · non-technical

Where does the data come from?

Nine distinct sources feed the system — eleven feeds in total, since live market data alone carries three. Each lands in a specific table or file, on a specific cadence, in a specific door's lane. Nothing is shared between domains at the source level.

Each of the nine sources lands in one store: TradingV postgres (live market data, webhooks, chart pastes), the knowledge-vault (YouTube, Edgar, smart-money), Supabase (workout logs), or door-owned state files (measurements, learning state). Each door reads only the stores its domain owns: Lakshmi reads TradingV and the vault; Zeus reads Supabase, the vault, and door-state; Athena reads Supabase and the vault; Ganesh reads door-state and the vault.

Sources land in one of four stores; each door reads only the stores its domain owns. Edge labels in the diagram carry each source's cadence: webhooks are real-time, workouts are live, polls run daily or hourly, and imports are manual.
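The door-to-store lanes described above can be written down as a small lookup. This is only an illustrative sketch: the store keys (tradingv, vault, supabase, door-state) and the can_read helper are hypothetical names, not identifiers from the system.

```python
# Hypothetical sketch: store keys and helper name are invented for illustration.
DOOR_STORES = {
    "Lakshmi": {"tradingv", "vault"},
    "Zeus": {"supabase", "vault", "door-state"},
    "Athena": {"supabase", "vault"},
    "Ganesh": {"door-state", "vault"},
}


def can_read(door: str, store: str) -> bool:
    """Return True only if the store is in the door's own lane."""
    return store in DOOR_STORES.get(door, set())


print(can_read("Lakshmi", "tradingv"))  # Lakshmi owns the TradingV lane
print(can_read("Ganesh", "supabase"))   # Ganesh has no Supabase lane
```

An unknown door falls through to an empty set, so it can read nothing by default.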

The nine sources, one by one

1 · Live market data — yfinance + FRED + earnings calendar

Three sub-pipes inside the TradingV app's market-data module, plus a per-ticker upsert (last bullet):

  • OHLCV bars via yfinance — daily refresh for every watchlist ticker (240+ bars). Lands in ohlcv_bars (composite PK on symbol+interval+ts).
  • Macro series via yfinance + FRED API — daily 01:00 UTC, gated by MACRO_ENABLED. Lands in macro_series.
  • Earnings calendar — rolling 150-row universe (NASDAQ + Street Tier-1+2 + 8-K Item 2.02 confirm), 90-day TTL. Lands in earnings_calendar_rows.
  • IV percentile + earnings dates per ticker upserted into ticker_market_data.

Door consumer: Lakshmi only.
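To show why a composite key on symbol+interval+ts makes the daily refresh safe to re-run, here is a minimal sketch. It assumes SQLite as a stand-in for the TradingV postgres store, and the close column and the upsert_bar helper are invented for the example; only the table name and the three key columns come from the text above.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    # Assumption: in-memory SQLite stands in for postgres; "close" is an invented column.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE ohlcv_bars (
            symbol     TEXT NOT NULL,
            "interval" TEXT NOT NULL,
            ts         TEXT NOT NULL,
            close      REAL NOT NULL,
            PRIMARY KEY (symbol, "interval", ts)
        )
        """
    )
    return conn


def upsert_bar(conn: sqlite3.Connection, symbol: str, interval: str, ts: str, close: float) -> None:
    # A bar with the same (symbol, interval, ts) is updated in place, never duplicated.
    conn.execute(
        """
        INSERT INTO ohlcv_bars (symbol, "interval", ts, close)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (symbol, "interval", ts) DO UPDATE SET close = excluded.close
        """,
        (symbol, interval, ts, close),
    )


conn = make_store()
upsert_bar(conn, "AAPL", "1d", "2026-05-08", 100.0)
upsert_bar(conn, "AAPL", "1d", "2026-05-08", 101.5)  # same key: replaces the first value
upsert_bar(conn, "AAPL", "1d", "2026-05-09", 102.0)
rows = conn.execute("SELECT ts, close FROM ohlcv_bars ORDER BY ts").fetchall()
print(rows)
```

Re-running the refresh for a day that is already stored leaves one row per key, which is the property the composite primary key is there to guarantee.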

2 · TradingView webhooks (Pine Script alerts)

The legacy unversioned POST /webhook endpoint receives Pine Script alerts as text. These alerts have notification semantics: fast lane, no dedup on ingest (dedup applies only to the tv_context_items fan-out below).

  • Lands in the alerts table.
  • Fans out to tv_context_items with kind=webhook (7-day TTL, deduped by SHA256 within a rolling window).

Door consumer: Lakshmi only.
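The fan-out rule for tv_context_items (SHA256 dedup inside a rolling window, 7-day TTL) can be sketched as follows. The window length is not stated in this page, so the 5-minute value is a placeholder assumption, as are the ContextItems class, its method names, and the choice to measure the window from the last accepted copy.

```python
import hashlib
from datetime import datetime, timedelta, timezone

DEDUP_WINDOW = timedelta(minutes=5)  # assumption: the real window length is not documented here
ITEM_TTL = timedelta(days=7)         # from the text: 7-day TTL


class ContextItems:
    """Hypothetical in-memory stand-in for tv_context_items rows with kind=webhook."""

    def __init__(self) -> None:
        self._last_accepted = {}  # digest -> time the body was last accepted
        self.items = []           # accepted rows as (digest, body, expires_at)

    def add_webhook(self, body: str, now: datetime) -> bool:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        seen = self._last_accepted.get(digest)
        if seen is not None and now - seen < DEDUP_WINDOW:
            return False  # identical body inside the window: dropped
        self._last_accepted[digest] = now
        self.items.append((digest, body, now + ITEM_TTL))
        return True

    def purge_expired(self, now: datetime) -> None:
        self.items = [row for row in self.items if row[2] > now]


t0 = datetime(2026, 5, 8, 12, 0, tzinfo=timezone.utc)
store = ContextItems()
print(store.add_webhook("AAPL breakout", t0))                          # accepted
print(store.add_webhook("AAPL breakout", t0 + timedelta(minutes=1)))   # inside window
print(store.add_webhook("AAPL breakout", t0 + timedelta(minutes=10)))  # window has passed
```

Note that the alerts table itself is untouched by this logic; only the fan-out copy is deduped.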

3 · Operator chart pastes (separate pipe!)

This is not the webhook pipe. When the operator pastes a chart screenshot into the app's TV Context route (POST /v1/tv-context/screenshot), a vision pipeline kicks in:

  1. Tesseract OCR extracts tickers
  2. Claude vision summarises the chart
  3. Qwen2-VL extracts structured chart references
  4. Sidecar markdown written to knowledge-vault/Sources/.../<ticker>_<HMS>_<id>.md
  5. Unknown tickers flow into ticker_review_queue

Door consumer: Lakshmi only. Webhooks and screenshots are two pipes — documented as such (see the reminder at the bottom of this page).

4 · YouTube videos (channel-polled, vision-extracted)

The vault indexer's YouTube-channel ingester polls every channel listed under Videos/<author>/_channel.yaml. Cadence: hourly, gated by VIDEO_INGEST_ENABLED.

For each new video, a 3-stage pipeline runs:

  1. Frame detection — sample candidate keyframes
  2. Tesseract OCR — pull text overlays
  3. Qwen2-VL (MLX on Apple Silicon) — caption frames + extract structured chart YAML

The output is a markdown file at knowledge-vault/Videos/<author>/<week>-<slug>.md. Unknown tickers seen in vision frames go into ticker_review_queue.

Door consumer: any door whose vault scope includes Videos/<subdomain>.

5 · SEC Edgar filings

The vault indexer's Edgar ingester polls Edgar daily for watchlist tickers, gated by EDGAR_INGEST_ENABLED. Writes per-filing markdown to knowledge-vault/Filings/<ticker>/<accession>.md. Idempotent on accession number.

Door consumer: Lakshmi only (finance vault scope).
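"Idempotent on accession number" can be illustrated with a write-once helper. This is a sketch under assumptions: the write_filing function is hypothetical, the accession string is a made-up placeholder, and skipping an existing file (rather than overwriting it) is one plausible way to get idempotency, not necessarily what the ingester does.

```python
import tempfile
from pathlib import Path


def write_filing(vault_root: Path, ticker: str, accession: str, markdown: str) -> bool:
    """Write Filings/<ticker>/<accession>.md once; return False if it already exists."""
    path = vault_root / "Filings" / ticker / f"{accession}.md"
    if path.exists():
        return False  # same accession seen on an earlier poll: nothing to do
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp) / "knowledge-vault"
    # Placeholder accession number, not a real filing.
    first = write_filing(vault, "AAPL", "0000000000-00-000000", "# 8-K\n")
    second = write_filing(vault, "AAPL", "0000000000-00-000000", "# 8-K (re-polled)\n")
    print(first, second)  # True False
```

Because the accession number is the filename, a daily poll that returns the same filing twice cannot create a second document.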

6 · Smart-money snapshots (The Street/)

An aggregator pipeline that imports tier-1/tier-2 conviction lists, politician buys (House STOCK Act, Senate EFD), insider buys (Form-4), and options flow.

Each snapshot is dated: The Street/snapshots/YYYY-MM-DD/. Files: tier-1-conviction.md, tier-2-conviction.md, politicians-buys.md, insiders-top.md, options-bullish.md, plus _index.md and methodology.md.

Chunking unit: per-ticker H2 section (~600 words by design — see Deep tech on chunk sizing).
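Per-ticker H2 chunking can be sketched as a split on second-level markdown headings. The split_by_h2 helper and the sample text are hypothetical, and the sketch assumes the H2 heading text is the ticker itself; deeper headings stay inside their parent chunk.

```python
import re


def split_by_h2(markdown: str) -> dict[str, str]:
    """Map each H2 heading to the text beneath it, up to the next H2."""
    chunks: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.+?)\s*$", line)  # matches "## X" but not "### X"
        if match:
            current = match.group(1)
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {heading: "\n".join(body).strip() for heading, body in chunks.items()}


sample = """# Tier 1 conviction

## AAPL
Thesis text.

### Notes
More text.

## MSFT
Other thesis.
"""

for ticker, body in split_by_h2(sample).items():
    print(ticker, len(body.split()), "words")
```

Anything above the first H2 (the file title, an intro) is dropped by this sketch; a real indexer might keep it as its own chunk.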

Status: all 6 data sources are currently set to enabled: false in _source.yaml; manual snapshots exist (last: 2026-05-08).

7 · Verocity workout logs (Supabase, live)

Every workout the operator does in the Lovable verocity app writes one row to Supabase workout_logs. Fields include planned vs actual sets, RPE, total_seconds, avg_hr_bpm.

Derived signals are computed by a SQL view v_drift_signals with seven components (pace, block_lag, gap_breach, load_drop, rpe_drift, recovery, conditioning).

Door consumer: Zeus.

8 · Manual measurements YAML (weekly)

Append-only file at Zeus/03_execution/measurements.yaml. Operator writes bodyweight + waist circumference weekly.

A small body-composition helper reads this and emits a body_recomp_sub JSON signal that becomes the 10% second-layer weight on the fitness drift composite (D-022).

Door consumer: Zeus only.
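One way to read "10% second-layer weight" is as a linear blend on top of the first-layer drift score. The sketch below assumes exactly that, and also assumes both inputs share the same scale; the function name and the 0.5 / 1.0 sample values are invented, and the actual rule is whatever D-022 specifies.

```python
BODY_RECOMP_WEIGHT = 0.10  # from the text: 10% second-layer weight


def blend_fitness_drift(first_layer: float, body_recomp_sub: float) -> float:
    """Assumed linear blend: 90% first-layer composite, 10% body-recomp sub-signal."""
    return (1.0 - BODY_RECOMP_WEIGHT) * first_layer + BODY_RECOMP_WEIGHT * body_recomp_sub


composite = blend_fitness_drift(first_layer=0.5, body_recomp_sub=1.0)
print(round(composite, 3))
```

Under this reading, even a maximal body-recomp signal can move the composite by at most a tenth of its range, which keeps the weekly manual measurement from dominating the live workout signals.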

9 · Learning state (markdown only)

Ganesh has no DB. Signals come from heterogeneous markdown via a single helper:

  • Ganesh/02_library/active_sprints.md — current slot deadlines
  • Ganesh/_inbox/ — unprocessed drops (7-day rotation)
  • Ganesh/03_execution/learning_log.md — append-only session log
  • Ganesh/02_library/notes/ — processed notes with applied_in: graduation tracking

A learning-state helper reads the markdown and emits a 5-component drift signal JSON.

Door consumer: Ganesh only.

Cadences at a glance

Eleven rows for nine sources: live market data fans out into three separately-scheduled feeds (OHLCV, macro, earnings).

Source             Cadence           Destination
OHLCV bars         daily             TradingV ohlcv_bars
Macro series       daily 01:00 UTC   TradingV macro_series
Earnings calendar  daily             TradingV earnings_calendar_rows
TV webhooks        real-time push    TradingV alerts + tv_context_items
Chart pastes       operator-driven   TradingV tv_context_items + vault sidecar
YouTube            hourly poll       vault Videos/<author>/
Edgar              daily poll        vault Filings/<ticker>/
Smart-money        operator import   vault The Street/snapshots/
Workouts           per-session live  Supabase workout_logs
Measurements       weekly manual     file Zeus/03_execution/measurements.yaml
Learning state     operator-driven   Ganesh markdown tree

Demo-discipline reminder. Webhooks and chart pastes are listed as two separate sources because they go through different code paths and have different semantics. They are easy to conflate; do not document them as one pipe.