跳到主要内容

Pipeline

The pipeline is a Python CLI under services/pipeline.

Stages:

  • discover: parse Tenhou FileIndex content and record source index metadata.
  • fetch: enforce Tenhou download policy before any network implementation is enabled.
  • parse: parse mjlog XML into game facts, player facts, and round summaries.
  • load: idempotently upsert parsed facts into SQLite/D1-compatible tables.
  • aggregate: refresh player/day and player/mode summaries.
  • verify: run local consistency checks and emit status information.
  • backfill: iterate dates in batches for slow historical recovery.
  • reparse: reparse stored XML with a new parser version.

The default CLI behavior is offline/local. Network fetching must preserve the policy guards in paifu_pipeline.tenhou.TenhouFetchPolicy.