02 · ingestion · non-technical

Where does the data come from?

Nine distinct sources feed the system. Each lands in a specific table or file, on a specific cadence, in a specific door's lane. Nothing is shared between domains at the source level.

Diagram: sources → stores → doors

Sources: Live market data (yfinance · FRED · earnings) · TradingView webhooks (Pine Script alerts → text) · Operator chart pastes (screenshot → vision OCR) · YouTube videos (transcript + Qwen2-VL) · SEC Edgar filings (8-K · 10-Q · 10-K) · Smart-money snapshots (The Street/ aggregator) · Verocity workout logs (live per-session)

Stores:

  • TradingV postgres, 29 tables (laptop): ohlcv_bars · macro_series · alerts · tv_context_items · recommendations (finance)
  • Knowledge-vault, ~/Documents/knowledge-vault: Videos/<author>/ · Filings/<ticker>/ · The Street/snapshots/ · Books/ · Newsletters/ · Topics/<domain>/ · 4 indexers (8001-8004)
  • Supabase cloud, phone-shared: workout_logs · plans · movements · recommendations (fitness, nutrition)

Doors: Lakshmi finance · Zeus fitness · Athena nutrition (scaffold) · Ganesh learning

Sources land in one of three places. Each door reads only from the places its domain owns.
Pulse cadence indicates update frequency: fast (webhooks, live workouts) · medium (daily polls) · slow (manual imports).

The nine sources, one by one

1 · Live market data — yfinance + FRED + earnings calendar

Three sub-pipes, plus a per-ticker upsert, run inside the TradingView app's app/market_data/ module:

  • OHLCV bars via yfinance — daily refresh for every watchlist ticker (240+ bars). Lands in ohlcv_bars (composite PK on symbol+interval+ts).
  • Macro series via yfinance + FRED API — daily 01:00 UTC, gated by MACRO_ENABLED. Lands in macro_series.
  • Earnings calendar — rolling 150-row universe (NASDAQ + Street Tier-1+2 + 8-K Item 2.02 confirm), 90-day TTL. Lands in earnings_calendar_rows.
  • IV percentile + earnings dates per ticker upserted into ticker_market_data.
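
The composite key on ohlcv_bars is what makes a daily refresh safe to re-run. A minimal sketch of that idea follows, using an in-memory SQLite database as a stand-in for the TradingV postgres instance. Only the table name and the (symbol, interval, ts) key come from the text above; the close and volume columns, the function names, and the DEMO ticker are assumptions for illustration.

```python
import sqlite3

# Stand-in schema: only the composite key is taken from the text above;
# the close/volume columns are assumed for illustration.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ohlcv_bars (
    symbol   TEXT NOT NULL,
    interval TEXT NOT NULL,
    ts       TEXT NOT NULL,
    close    REAL,
    volume   INTEGER,
    PRIMARY KEY (symbol, interval, ts)
)
"""

UPSERT = """
INSERT INTO ohlcv_bars (symbol, interval, ts, close, volume)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (symbol, interval, ts)
DO UPDATE SET close = excluded.close, volume = excluded.volume
"""


def upsert_bars(conn, bars):
    """Insert bars, overwriting any row that shares (symbol, interval, ts)."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, bars)


def count_bars(conn):
    """Return the number of rows currently in ohlcv_bars."""
    return conn.execute("SELECT COUNT(*) FROM ohlcv_bars").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# First refresh, then a second refresh that re-delivers the same bar
# with revised values. The key matches, so the row is updated in place.
upsert_bars(conn, [("DEMO", "1d", "2024-01-02", 100.0, 1000)])
upsert_bars(conn, [("DEMO", "1d", "2024-01-02", 101.5, 1200)])
```

Running the refresh twice leaves a single row per key, which is the property a daily poll needs when it overlaps with bars it has already stored.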

Door consumer: Lakshmi only.

2 · TradingView webhooks (Pine Script alerts)

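The legacy unversioned POST /webhook endpoint receives Pine Script alerts as text. These have notification semantics: each alert is stored in the alerts table without dedup and travels in the fast lane. Dedup applies only to the tv_context_items fan-out.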

  • Lands in the alerts table.
  • Fans out to tv_context_items with kind=webhook (7-day TTL, deduped by SHA256 within a rolling window).
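
The rolling-window dedup on the fan-out can be sketched as follows. This is an illustration under stated assumptions, not the app's code: the class name, the in-memory dictionary, and the 5-minute window are hypothetical (the text above does not state the window length), and only the use of SHA256 over the payload comes from the source.

```python
import hashlib
from datetime import datetime, timedelta, timezone

# Assumed window length; the text above does not state it.
DEDUP_WINDOW = timedelta(minutes=5)


class RollingDeduper:
    """Remember when each payload hash was last accepted and drop repeats inside the window."""

    def __init__(self, window=DEDUP_WINDOW):
        self.window = window
        self._last_accepted = {}  # sha256 hex digest -> datetime of last accepted copy

    def accept(self, payload: str, now: datetime) -> bool:
        """Return True if the payload should be stored, False if it is a recent duplicate."""
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        previous = self._last_accepted.get(digest)
        if previous is not None and now - previous < self.window:
            return False  # same text seen inside the window
        self._last_accepted[digest] = now
        return True


deduper = RollingDeduper()
t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
first = deduper.accept("DEMO crossed above level", t0)
repeat = deduper.accept("DEMO crossed above level", t0 + timedelta(minutes=1))
later = deduper.accept("DEMO crossed above level", t0 + timedelta(minutes=10))
```

In this sketch a rejected duplicate does not extend the window, so identical alert text is accepted again once the window has elapsed since the last stored copy. A production version would keep the hashes in the database rather than in process memory.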

Door consumer: Lakshmi only.

3 · Operator chart pastes (separate pipe!)

This is not the webhook pipe. When the operator pastes a chart screenshot into the app's TV Context route (POST /v1/tv-context/screenshot), a vision pipeline kicks in:

  1. Tesseract OCR extracts tickers
  2. Claude vision summarises the chart
  3. Qwen2-VL extracts structured chart references
  4. Sidecar markdown written to ~/Documents/knowledge-vault/Sources/.../<ticker>_<HMS>_<id>.md
  5. Unknown tickers flow into ticker_review_queue

Door consumer: Lakshmi only. (Per CLAUDE.md demo-discipline: "webhooks and screenshots are two pipes — document them as such.")

4 · YouTube videos (channel-polled, vision-extracted)

The vault-indexer's youtube_channel.py ingester polls every channel listed under Videos/<author>/_channel.yaml. Cadence: hourly, gated by VIDEO_INGEST_ENABLED.

For each new video, a 3-stage pipeline runs:

  1. Frame detection — sample candidate keyframes
  2. Tesseract OCR — pull text overlays
  3. Qwen2-VL (MLX on Apple Silicon) — caption frames + extract structured chart YAML

The output is a markdown file at ~/Documents/knowledge-vault/Videos/<author>/<week>-<slug>.md. Unknown tickers seen in vision frames go into ticker_review_queue.

Door consumer: any door whose vault scope includes Videos/<subdomain>.
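
One way to picture a vault scope is as a list of folder prefixes checked component by component. The sketch below is hypothetical throughout: the door names appear on this page, but the DOOR_SCOPES map, the prefixes assigned to each door, and the function name are placeholders chosen for the example.

```python
from pathlib import PurePosixPath

# Hypothetical scope map. Door names appear on this page; the folder
# prefixes assigned to each door are placeholders for the example.
DOOR_SCOPES = {
    "Lakshmi": ("Videos/finance", "Filings", "The Street"),
    "Zeus": ("Videos/fitness",),
}


def in_scope(door: str, vault_path: str) -> bool:
    """Return True if vault_path sits under one of the door's scope prefixes."""
    parts = PurePosixPath(vault_path).parts
    for prefix in DOOR_SCOPES.get(door, ()):
        prefix_parts = PurePosixPath(prefix).parts
        # Compare whole path components so "Videos/fitness-extra"
        # does not match the prefix "Videos/fitness".
        if parts[: len(prefix_parts)] == prefix_parts:
            return True
    return False
```

Comparing path components rather than raw strings avoids false matches on folders that merely share a name prefix.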

5 · SEC Edgar filings

The vault-indexer's ingest_edgar.py polls Edgar daily for watchlist tickers, gated by EDGAR_INGEST_ENABLED. Writes per-filing markdown to ~/Documents/knowledge-vault/Filings/<ticker>/<accession>.md. Idempotent on accession number.
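
Idempotence on the accession number means a second poll that sees the same filing writes nothing. A minimal sketch of that behaviour, assuming the file's existence is the idempotence check: the function name and the sample ticker and accession number are made up for illustration, and only the Filings/<ticker>/<accession>.md layout comes from the text above.

```python
import tempfile
from pathlib import Path


def write_filing(vault_root: Path, ticker: str, accession: str, body: str) -> bool:
    """Write Filings/<ticker>/<accession>.md once; return False if it already exists."""
    target = vault_root / "Filings" / ticker / f"{accession}.md"
    if target.exists():
        return False  # already ingested, leave the existing file untouched
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(body, encoding="utf-8")
    return True


# Demo against a throwaway directory instead of the real vault.
demo_root = Path(tempfile.mkdtemp())
wrote_first = write_filing(demo_root, "DEMO", "0000000000-24-000001", "first poll")
wrote_again = write_filing(demo_root, "DEMO", "0000000000-24-000001", "second poll")
```

Because the accession number is the file name, re-running the daily poll cannot create duplicates or overwrite a filing that was already written.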

Door consumer: Lakshmi only (finance vault scope).

6 · Smart-money snapshots (The Street/)

An aggregator pipeline that imports tier-1/tier-2 conviction lists, politician buys (House STOCK Act, Senate EFD), insider buys (Form-4), and options flow.

Each snapshot is dated: The Street/snapshots/YYYY-MM-DD/. Files: tier-1-conviction.md, tier-2-conviction.md, politicians-buys.md, insiders-top.md, options-bullish.md, plus _index.md and methodology.md.

Chunking unit: per-ticker H2 section (~600 tokens by design).
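
Chunking by per-ticker H2 section can be sketched as a split on lines that start with "## ". The function name and the sample document are assumptions for illustration, and the sketch does not enforce the ~600-token target, which the source describes as a property of how the snapshot files are written.

```python
import re

# Matches an H2 heading line only: "## " followed by the heading text.
# "### ..." lines do not match because the third "#" is not whitespace.
H2_LINE = re.compile(r"^##[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def split_by_h2(markdown: str) -> dict:
    """Map each H2 heading to the text under it, up to the next H2 heading."""
    matches = list(H2_LINE.finditer(markdown))
    chunks = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        chunks[match.group(1)] = markdown[match.end():end].strip()
    return chunks


# Made-up snapshot content with two ticker sections.
sample = "# Tier 1\n\n## AAA\nnote a\n\n## BBB\nnote b\n### detail\nmore\n"
chunks = split_by_h2(sample)
```

Text before the first H2 (here the "# Tier 1" title) is not assigned to any chunk, and deeper headings stay inside their parent ticker section.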

Status: all 6 data sources are currently set to enabled: false in _source.yaml; manual snapshots exist (last: 2026-05-08).

7 · Verocity workout logs (Supabase, live)

Every workout the operator does in the Lovable verocity app writes one row to Supabase workout_logs. Fields include planned vs actual sets, RPE, total_seconds, avg_hr_bpm.

Derived signals are computed by a SQL view v_drift_signals with seven components (pace, block_lag, gap_breach, load_drop, rpe_drift, recovery, conditioning).

Door consumer: Zeus.

8 · Manual measurements YAML (weekly)

Append-only file at Zeus/03_execution/measurements.yaml. Operator writes bodyweight + waist circumference weekly.

A stdlib helper, body_recomp_snapshot.py, reads this file and emits a body_recomp_sub JSON signal, which enters the fitness drift composite as a second-layer component weighted at 10% (D-022).

Door consumer: Zeus only.

9 · Learning state (markdown only)

Ganesh has no DB. Signals come from heterogeneous markdown via a single helper:

  • Ganesh/02_library/active_sprints.md — current slot deadlines
  • Ganesh/_inbox/ — unprocessed drops (7-day rotation)
  • Ganesh/03_execution/learning_log.md — append-only session log
  • Ganesh/02_library/notes/ — processed notes with applied_in: graduation tracking

The helper ganesh_state_snapshot.py reads the markdown and emits a 5-component drift signal JSON.

Door consumer: Ganesh only.

Cadences at a glance

Source              Cadence           Destination
OHLCV bars          daily             TradingV ohlcv_bars
Macro series        daily 01:00 UTC   TradingV macro_series
Earnings calendar   daily             TradingV earnings_calendar_rows
TV webhooks         real-time push    TradingV alerts + tv_context_items
Chart pastes        operator-driven   TradingV tv_context_items + vault sidecar
YouTube             hourly poll       vault Videos/<author>/
Edgar               daily poll        vault Filings/<ticker>/
Smart-money         operator import   vault The Street/snapshots/
Workouts            per-session live  Supabase workout_logs
Measurements        weekly manual     file Zeus/03_execution/measurements.yaml
Learning state      operator-driven   Ganesh markdown tree

Demo-discipline reminder. Webhooks and chart pastes are listed as two separate sources because they go through different code paths and have different semantics. They are easy to conflate; do not document them as one pipe.