Add a native type: transform step for deterministic data reshape between agents #217

@jrob5756

Summary

Add a step type that runs a deterministic data transformation (Jinja2 / JMESPath / jq-style) over the current context and binds the result as that step's output. No LLM call, no subprocess.

Motivation

In real workflows you constantly need to reshape data between agents:

  • Project a list of objects down to a list of IDs for for_each to iterate over
  • Filter a research agent's output to just the high-confidence items
  • Merge two parallel agents' outputs into a single structured object the downstream agent expects
  • Compute a derived value (counts, totals, slugs) from prior output

Today the options are:

  1. Do it inline in the next agent's prompt — wastes tokens, non-deterministic, error-prone
  2. type: script with a tiny Python/jq one-liner — works but is heavy: subprocess, stdout parsing, separate file for non-trivial logic
  3. Build it into the upstream agent's output: schema — only works for the simplest projections

A native transform step makes data reshaping first-class and deterministic, with no LLM tokens spent and no subprocess to manage.
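To make the reshapes listed above concrete, here is a minimal sketch of three of them (projection, filtering, a derived count) as Jinja2 expressions. It assumes agent outputs are plain dicts and lists, and it uses Jinja2's `NativeEnvironment` so results come back as Python objects rather than strings. The `finder` context and its `kpis` field are made up for illustration, and whether the project would use `NativeEnvironment` at all is an assumption.

```python
from jinja2.nativetypes import NativeEnvironment

# Hypothetical upstream agent output; the real context shape may differ.
context = {
    "finder": {
        "output": {
            "kpis": [
                {"id": "kpi-1", "confidence": 0.9},
                {"id": "kpi-2", "confidence": 0.4},
                {"id": "kpi-3", "confidence": 0.85},
            ]
        }
    }
}

env = NativeEnvironment()


def evaluate(expression: str) -> object:
    """Render a single-expression template and return the native result."""
    return env.from_string(expression).render(**context)


# Project a list of objects down to a list of IDs.
ids = evaluate("{{ finder.output.kpis | map(attribute='id') | list }}")

# Keep only the high-confidence items (threshold chosen for illustration).
high = evaluate("{{ finder.output.kpis | selectattr('confidence', 'ge', 0.8) | list }}")

# Compute a derived value from prior output.
count = evaluate("{{ finder.output.kpis | length }}")
```

Each expression is a single `{{ ... }}` node with no surrounding whitespace, which is what lets `NativeEnvironment` hand back the list or integer directly.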

Proposed shape

agents:
  - name: extract_kpi_ids
    type: transform
    engine: jinja2  # or: jmespath | jq
    expression: "{{ finder.output.kpis | map(attribute='id') | list }}"
    output:
      type: array
      items: { type: string }

  - name: merge_research
    type: transform
    engine: jinja2
    expression: |
      {
        "summary": {{ summarizer.output.summary | tojson }},
        "sources": {{ source_finder.output.sources | tojson }},
        "confidence": {{ (validator.output.confidence + scorer.output.confidence) / 2 }}
      }

The step's output is whatever the expression evaluates to. Routes downstream can match on it normally.
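As a rough sketch of how the `merge_research` expression could be evaluated: render the template with a standard Jinja2 `Environment`, then parse the result as JSON. The `run_transform` helper and the sample context are hypothetical, not the project's actual `TemplateRenderer` API. The sketch passes the summary through `tojson` so that quotes inside the text cannot break the rendered JSON.

```python
import json

from jinja2 import Environment, StrictUndefined

# Hypothetical stand-in for the workflow context.
context = {
    "summarizer": {"output": {"summary": 'Revenue "beat" guidance'}},
    "source_finder": {"output": {"sources": ["10-K", "earnings call"]}},
    "validator": {"output": {"confidence": 0.5}},
    "scorer": {"output": {"confidence": 0.75}},
}

EXPRESSION = """
{
  "summary": {{ summarizer.output.summary | tojson }},
  "sources": {{ source_finder.output.sources | tojson }},
  "confidence": {{ (validator.output.confidence + scorer.output.confidence) / 2 }}
}
"""


def run_transform(expression: str, ctx: dict) -> object:
    """Render the expression against the context and parse it as JSON."""
    # StrictUndefined turns a misspelled agent or field name into an error
    # instead of silently rendering an empty string.
    env = Environment(undefined=StrictUndefined)
    rendered = env.from_string(expression).render(**ctx)
    return json.loads(rendered)


result = run_transform(EXPRESSION, context)
```

Here `result` is a regular dict, so routes and downstream agents could read `result["confidence"]` like any other structured output.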

Engine choices

  • jinja2 (mandatory, already a dependency) — covers 80% of cases with our existing template environment and filters.
  • jmespath (optional, small dep) — much nicer for deep object projections than Jinja2. JSON-native.
  • jq (optional, larger dep or subprocess) — likely defer to a follow-up unless someone asks.

Shipping with just jinja2 first keeps the dependency surface flat.
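One way the engine choice could be wired up is a small dispatch table in which optional engines are imported lazily, so a Jinja2-only install never needs them. This is only a sketch: `ENGINES`, `evaluate`, and the error messages are hypothetical names, and defaulting to `jinja2` is an assumption taken from the open questions.

```python
from typing import Any, Callable, Dict

from jinja2 import Environment


def eval_jinja2(expression: str, context: Dict[str, Any]) -> Any:
    # Renders to a string; type coercion is a separate concern.
    return Environment().from_string(expression).render(**context)


def eval_jmespath(expression: str, context: Dict[str, Any]) -> Any:
    try:
        import jmespath  # optional dependency, imported only when requested
    except ImportError as exc:
        raise RuntimeError(
            "engine 'jmespath' needs the optional 'jmespath' package"
        ) from exc
    return jmespath.search(expression, context)


ENGINES: Dict[str, Callable[[str, Dict[str, Any]], Any]] = {
    "jinja2": eval_jinja2,
    "jmespath": eval_jmespath,
}


def evaluate(expression: str, context: Dict[str, Any], engine: str = "jinja2") -> Any:
    """Dispatch an expression to the named engine."""
    if engine not in ENGINES:
        raise ValueError(f"unknown transform engine: {engine!r}")
    return ENGINES[engine](expression, context)


context = {"finder": {"output": {"kpis": [{"id": "kpi-1"}, {"id": "kpi-2"}]}}}
count = evaluate("{{ finder.output.kpis | length }}", context)
```

With `jmespath` installed, the same projection would be written as `evaluate("finder.output.kpis[].id", context, engine="jmespath")`. A `jq` engine could be added to the table later without touching callers.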

Why now

  • Removes the most common reason people reach for type: script (which then needs working_dir, env vars, exit codes — overkill for "pull field X out of agent Y's output").
  • Composes with everything: routes:, output: schema validation, for_each (a transform step's output is a clean source: for the next for_each), parallel groups.
  • Implementation reuses our existing TemplateRenderer for the Jinja2 case; minimal new code in executor/.

Open questions

  • Should we allow multiple named outputs from one transform step (e.g. outputs: { ids: "...", names: "..." }) or keep one expression → one output?
  • Default engine when not specified: probably jinja2 since it's already a workflow primitive.
  • Output type coercion: Jinja2 renders to a string by default; we'll need to detect "this looks like JSON / a list / a dict" and parse it, or require the user to mark output_type: json|string|number|bool.
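To illustrate the coercion question, here is a minimal sketch that supports both options: an explicit `output_type`, and auto-detection by attempting a JSON parse when no type is given. `coerce_output` is a hypothetical helper, and the exact rules (for example, which spellings count as a bool) are assumptions.

```python
import json
from typing import Any, Optional


def coerce_output(rendered: str, output_type: Optional[str] = None) -> Any:
    """Convert a rendered template string into a typed value."""
    text = rendered.strip()

    if output_type == "string":
        return rendered
    if output_type == "json":
        return json.loads(text)
    if output_type == "number":
        value = json.loads(text)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"not a number: {text!r}")
        return value
    if output_type == "bool":
        # Jinja2 renders Python booleans as "True"/"False", so compare
        # case-insensitively rather than relying on JSON's lowercase form.
        lowered = text.lower()
        if lowered in ("true", "false"):
            return lowered == "true"
        raise ValueError(f"not a bool: {text!r}")
    if output_type is not None:
        raise ValueError(f"unknown output_type: {output_type!r}")

    # Auto-detect: anything that parses as JSON is returned parsed,
    # everything else is kept as the original string.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return rendered
```

Auto-detection has sharp edges: a rendered `"123"` becomes an integer and `"null"` becomes `None` even when a string was intended, and a list rendered without `tojson` (such as `['a', 'b']`) is not valid JSON and stays a string. That is an argument for offering the explicit `output_type:` field alongside detection. Jinja2 also ships `jinja2.nativetypes.NativeEnvironment`, which returns native Python objects and could sidestep string parsing for single-expression templates.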

Acceptance criteria

  • type: transform accepted by the YAML schema
  • Jinja2 engine works out of the box, has access to the full workflow context
  • Result is bound as the step's output, available to routes and downstream agents
  • Output type detection or explicit output_type: field for non-string results
  • Example under examples/ showing extraction + reshape between two agents
  • Tests for: simple projection, list filtering, conditional output, type coercion, error on invalid expression

Labels: area:config (YAML schema, loader, validator), area:executor (Agent and script execution, templates, output), enhancement (New feature or request), idea (Speculative feature proposal, not yet committed)
