Unified Node Pipeline

SoftCopy now runs a single node-based pipeline for daily generation, analytics, and publishing metadata.

Overview

  • Node definitions live in config/nodes.json and are loaded by NodeRegistry.
  • Each node declares:
      • id, kind, family
      • depends_on (graph edges)
      • output_policy (where outputs are written)
      • publish_policy (day/stream visibility and labels)
      • scheduler behavior (parallelizable, rerun_policy, step_order)
  • Scheduler resolves the dependency closure for requested targets, executes nodes in topological order, and runs ready parallelizable nodes concurrently.
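
To make the declared fields concrete, here is a minimal sketch of what one entry might look like and how it could be parsed. Only the field names come from the list above. The values (such as "prompt", "core", "if_missing"), the top-level JSON array layout, the flat placement of the scheduler fields, and the `Node` and `load_nodes` names are all assumptions for illustration, not the actual config/nodes.json schema or the NodeRegistry API.

```python
import json
from dataclasses import dataclass, field

# Hypothetical entry: field names follow the overview, values are invented.
EXAMPLE_NODES_JSON = """
[
  {
    "id": "narrative",
    "kind": "prompt",
    "family": "core",
    "depends_on": ["transcript_clean"],
    "output_policy": {"relative_path": "narrative.md"},
    "publish_policy": {"day_section": "Narrative", "stream": {"enabled": false}},
    "parallelizable": false,
    "rerun_policy": "if_missing",
    "step_order": 30
  }
]
"""


@dataclass
class Node:
    """Illustrative in-memory form of one node entry (not the real class)."""
    id: str
    kind: str
    family: str
    depends_on: list[str] = field(default_factory=list)
    output_policy: dict = field(default_factory=dict)
    publish_policy: dict = field(default_factory=dict)
    parallelizable: bool = False
    rerun_policy: str = "if_missing"  # assumed value, not a documented option
    step_order: int = 0


def load_nodes(text: str) -> dict[str, Node]:
    """Parse a JSON array of node entries into a dict keyed by node id."""
    return {entry["id"]: Node(**entry) for entry in json.loads(text)}


nodes = load_nodes(EXAMPLE_NODES_JSON)
print(nodes["narrative"].depends_on)
```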

Flow

graph TD
  A["transcript"] --> B["transcript_clean"]
  B --> C["narrative"]
  C --> D["sam_pov"]
  C --> E["dave_pov"]
  C --> F["Pulse / Odyssey / Obey"]
  D --> G["signals"]
  E --> G
  F --> G
  G --> H["rollups"]
  H --> I["insights"]
  I --> J["publish (site/day+streams)"]
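
The scheduling behavior described in the overview (dependency closure, topological order, concurrent ready nodes) can be sketched against this graph with Python's standard `graphlib` module. This is an illustration under assumptions, not the scheduler's actual code: the node ids are guessed from the diagram labels, "pulse_odyssey_obey" stands in for what the diagram groups as three nodes, and the sketch treats every ready node as parallelizable rather than consulting each node's flag.

```python
from graphlib import TopologicalSorter

# depends_on edges transcribed from the flow diagram (ids are assumed).
DEPENDS_ON = {
    "transcript": [],
    "transcript_clean": ["transcript"],
    "narrative": ["transcript_clean"],
    "sam_pov": ["narrative"],
    "dave_pov": ["narrative"],
    "pulse_odyssey_obey": ["narrative"],  # placeholder for a grouped box
    "signals": ["sam_pov", "dave_pov", "pulse_odyssey_obey"],
    "rollups": ["signals"],
    "insights": ["rollups"],
    "publish": ["insights"],
}


def dependency_closure(targets, depends_on):
    """Return the targets plus everything they transitively depend on."""
    seen = set()
    stack = list(targets)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(depends_on[node])
    return seen


def execution_waves(targets, depends_on):
    """Group the closure into waves; nodes in one wave have no mutual edges."""
    closure = dependency_closure(targets, depends_on)
    sorter = TopologicalSorter({n: depends_on[n] for n in closure})
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(["signals"], DEPENDS_ON)
for wave in waves:
    print(wave)
```

Requesting "signals" pulls in only its prerequisites, so "rollups", "insights", and "publish" are left out of the run, and the three nodes downstream of "narrative" land in the same wave.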

Dashboard Integration

  • The dashboard reads and edits node metadata via /nodes endpoints.
  • Day and batch runs enqueue scheduler-backed jobs (run_nodes, run_day, run_range).
  • Prompt create/update operations synchronize prompt metadata into node entries so custom prompts are first-class pipeline nodes.

Publish Integration

  • Publisher reads node metadata from the registry.
  • publish_policy.day_section controls inclusion in day pages.
  • publish_policy.stream.enabled and stream metadata (slug, title) control stream pages.
  • output_policy.relative_path and extra_relative_paths define which files are read for each node.
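
As a rough sketch of how these policies could drive publishing, the functions below select day-page nodes, build stream pages, and list the files to read for a node. The sample entries, the function names, the idea that a falsy day_section means "excluded", the placement of extra_relative_paths under output_policy, and the per-day directory layout are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical registry contents; only the key names come from the docs.
NODES = [
    {
        "id": "narrative",
        "output_policy": {"relative_path": "narrative.md"},
        "publish_policy": {"day_section": "Narrative", "stream": {"enabled": False}},
    },
    {
        "id": "sam_pov",
        "output_policy": {
            "relative_path": "pov/sam.md",
            "extra_relative_paths": ["pov/sam_notes.md"],
        },
        "publish_policy": {
            "day_section": None,
            "stream": {"enabled": True, "slug": "sam", "title": "Sam"},
        },
    },
]


def day_page_nodes(nodes):
    """Ids of nodes whose publish_policy.day_section is set (assumed rule)."""
    return [n["id"] for n in nodes if n["publish_policy"].get("day_section")]


def stream_pages(nodes):
    """Map stream slug -> page info for nodes with stream.enabled."""
    pages = {}
    for n in nodes:
        stream = n["publish_policy"].get("stream", {})
        if stream.get("enabled"):
            pages[stream["slug"]] = {"title": stream["title"], "node": n["id"]}
    return pages


def files_to_read(node, day_dir):
    """Primary output first, then any extra paths, under an assumed day dir."""
    policy = node["output_policy"]
    paths = [policy["relative_path"], *policy.get("extra_relative_paths", [])]
    return [str(PurePosixPath(day_dir) / p) for p in paths]


print(day_page_nodes(NODES))
print(stream_pages(NODES))
print(files_to_read(NODES[1], "2024-01-01"))
```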

Adding New Nodes and Prompts

  1. Register node metadata in config/nodes.json (or via dashboard node APIs).
  2. Set depends_on to model prerequisites.
  3. Set output_policy.relative_path to the canonical output location.
  4. Set publish_policy to opt in/out of day and stream pages.
  5. For prompt nodes:
      • Create/update the runtime prompt definition.
      • Ensure prompt/node IDs match.
      • Use dependencies to anchor fan-out and analytics prerequisites.
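
The checklist above lends itself to a small pre-flight check before an entry is registered. The following is a sketch only: `validate_new_node` is a hypothetical helper, "prompt" as a kind value is assumed, and the real registry or dashboard APIs may enforce different rules.

```python
def validate_new_node(node, registry, prompt_ids=()):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if node["id"] in registry:
        problems.append(f"duplicate node id: {node['id']}")
    for dep in node.get("depends_on", []):
        if dep not in registry:
            problems.append(f"unknown dependency: {dep}")
    if not node.get("output_policy", {}).get("relative_path"):
        problems.append("output_policy.relative_path is required")
    # Assumed convention: a prompt node needs a prompt with the same id.
    if node.get("kind") == "prompt" and node["id"] not in prompt_ids:
        problems.append(f"no prompt definition with id: {node['id']}")
    return problems


# Hypothetical registry and candidate entries.
REGISTRY = {"narrative": {}}

GOOD = {
    "id": "mood_check",
    "kind": "prompt",
    "depends_on": ["narrative"],
    "output_policy": {"relative_path": "mood_check.md"},
}
BAD = {
    "id": "mood_check",
    "kind": "prompt",
    "depends_on": ["missing"],
    "output_policy": {},
}

print(validate_new_node(GOOD, REGISTRY, prompt_ids={"mood_check"}))
print(validate_new_node(BAD, REGISTRY))
```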

Rerun Semantics

  • force=True: forces a rerun of the explicitly requested targets.
  • force_prerequisites=True: also forces every node in the targets' dependency closure.
  • rerun_dependents=True: when a node reruns, its downstream dependents are forced in the same run.
  • The prompt cache and each node's rerun_policy together decide whether a node is reported as completed or skipped.
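
One way to read the three flags is as successive expansions of a "forced" set. The sketch below is an interpretation, not the scheduler's implementation: `forced_nodes` is a hypothetical helper, it assumes force_prerequisites covers the targets themselves, it propagates rerun_dependents only from forced nodes (a node that reruns because of its rerun_policy is not modeled), and it treats the graph passed in as the set of nodes in the run.

```python
def _reachable(start, edges):
    """All nodes reachable from `start` by following `edges`, start included."""
    seen, stack = set(), list(start)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, []))
    return seen


def forced_nodes(targets, depends_on, force=False,
                 force_prerequisites=False, rerun_dependents=False):
    """Return the set of node ids that would be forced under these flags."""
    forced = set()
    if force:
        forced |= set(targets)
    if force_prerequisites:
        # Targets plus everything they transitively depend on.
        forced |= _reachable(targets, depends_on)
    if rerun_dependents and forced:
        # Invert depends_on so edges point from a node to its dependents.
        dependents = {}
        for node, deps in depends_on.items():
            for dep in deps:
                dependents.setdefault(dep, []).append(node)
        forced |= _reachable(forced, dependents)
    return forced


# Hypothetical linear graph: a <- b <- c <- d
GRAPH = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}

print(sorted(forced_nodes(["c"], GRAPH, force=True)))
print(sorted(forced_nodes(["c"], GRAPH, force=True, force_prerequisites=True)))
print(sorted(forced_nodes(["b"], GRAPH, force=True, rerun_dependents=True)))
```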