Summary
Add a step type that runs a deterministic data transformation (Jinja2 / JMESPath / jq-style) over the current context and binds the result as that step's output. No LLM call, no subprocess.
Motivation
In real workflows you constantly need to reshape data between agents:
- Project a list of objects down to a list of IDs for for_each to iterate over
- Filter a research agent's output to just the high-confidence items
- Merge two parallel agents' outputs into a single structured object the downstream agent expects
- Compute a derived value (counts, totals, slugs) from prior output
Today the options are:
- Do it inline in the next agent's prompt: wastes tokens, non-deterministic, error-prone
- type: script with a tiny Python/jq one-liner: works but is heavy (subprocess, stdout parsing, separate file for non-trivial logic)
- Build it into the upstream agent's output: schema, which only works for the simplest projections
A native transform step makes data reshaping first-class, deterministic, and effectively zero-cost (no LLM call, no subprocess).
Proposed shape
agents:
  - name: extract_kpi_ids
    type: transform
    engine: jinja2  # or: jmespath | jq
    expression: "{{ finder.output.kpis | map(attribute='id') | list }}"
    output:
      type: array
      items: { type: string }
  - name: merge_research
    type: transform
    engine: jinja2
    expression: |
      {
        "summary": {{ summarizer.output.summary | tojson }},
        "sources": {{ source_finder.output.sources | tojson }},
        "confidence": {{ (validator.output.confidence + scorer.output.confidence) / 2 }}
      }
The step's output is whatever the expression evaluates to. Routes downstream can match on it normally.
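As a rough illustration of "the output is whatever the expression evaluates to", here is a minimal sketch of evaluating the extract_kpi_ids expression. The run_transform helper, the sample data, and the context layout (each prior agent's result under its name, then output) are assumptions made for this example, not the project's actual executor code. It uses Jinja2's NativeEnvironment so the result comes back as a Python list instead of a rendered string; whether TemplateRenderer should do the same is a separate design decision.

```python
from jinja2.nativetypes import NativeEnvironment

# Hypothetical context layout: each prior agent's result lives under <name>.output
context = {
    "finder": {
        "output": {
            "kpis": [
                {"id": "kpi-1", "label": "Revenue"},
                {"id": "kpi-2", "label": "Churn"},
            ]
        }
    }
}


def run_transform(expression: str, context: dict):
    """Evaluate a Jinja2 expression and return a native Python value."""
    env = NativeEnvironment()
    return env.from_string(expression).render(**context)


expression = "{{ finder.output.kpis | map(attribute='id') | list }}"
kpi_ids = run_transform(expression, context)

# Bind the result the same way an agent's output would be bound,
# so later steps can refer to extract_kpi_ids.output
context["extract_kpi_ids"] = {"output": kpi_ids}
print(kpi_ids)
```

With the sample data above, kpi_ids is a plain list of the two id strings, which is the kind of value a downstream for_each could iterate over.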
Engine choices
- jinja2 (mandatory, already a dependency): covers 80% of cases with our existing template environment and filters.
- jmespath (optional, small dep): much nicer for deep object projections than Jinja2. JSON-native.
- jq (optional, larger dep or subprocess): likely defer to a follow-up unless someone asks.
Shipping with just jinja2 first keeps the dependency surface flat.
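One way the mandatory/optional split could be handled is sketched below, using only the standard library. All names here (ENGINE_MODULES, DEFAULT_ENGINE, engine_name, is_installed, resolve_engine) are hypothetical, and the jinja2 default is only the tentative answer to the open question further down. jq is left out to match the suggestion to defer it.

```python
import importlib.util

# Assumed mapping from engine name to the Python package that backs it.
ENGINE_MODULES = {"jinja2": "jinja2", "jmespath": "jmespath"}
DEFAULT_ENGINE = "jinja2"  # assumption: default when a step omits engine


def engine_name(step: dict) -> str:
    """Return the engine a transform step asks for, falling back to the default."""
    name = step.get("engine", DEFAULT_ENGINE)
    if name not in ENGINE_MODULES:
        raise ValueError(f"unknown transform engine: {name!r}")
    return name


def is_installed(name: str) -> bool:
    """True if the package backing the engine can be imported."""
    return importlib.util.find_spec(ENGINE_MODULES[name]) is not None


def resolve_engine(step: dict) -> str:
    """Validate the engine and fail early if an optional dependency is missing."""
    name = engine_name(step)
    if not is_installed(name):
        raise RuntimeError(
            f"engine {name!r} requires the {ENGINE_MODULES[name]!r} package"
        )
    return name
```

Checking availability at workflow load time, not at step execution, would surface a missing optional dependency before any agent has run.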
Why now
- Removes the most common reason people reach for type: script (which then needs working_dir, env vars, exit codes; overkill for "pull field X out of agent Y's output").
- Composes with everything: routes:, output: schema validation, for_each (a transform step's output is a clean source: for the next for-each), parallel groups.
- Implementation reuses our existing TemplateRenderer for the Jinja2 case; minimal new code in executor/.
Open questions
- Should we allow multiple named outputs from one transform step (e.g. outputs: { ids: "...", names: "..." }) or keep one expression → one output?
- Default engine when not specified: probably jinja2 since it's already a workflow primitive.
- Output type coercion: Jinja2 renders to a string by default; we'll need to detect "this looks like JSON / a list / a dict" and parse it, or require the user to mark output_type: json|string|number|bool.
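For the coercion question, the explicit option (the user marks output_type) might look roughly like the sketch below. coerce_output is a hypothetical helper, and the exact rules, such as accepting Python-style True/False for bool, are assumptions chosen for illustration. It takes the string Jinja2 rendered and converts it according to the declared type, raising ValueError when the text does not fit.

```python
import json


def coerce_output(rendered: str, output_type: str = "string"):
    """Convert a rendered template string to the declared output_type."""
    if output_type == "string":
        return rendered

    text = rendered.strip()
    if output_type == "json":
        # json.JSONDecodeError is a subclass of ValueError
        return json.loads(text)
    if output_type == "number":
        value = json.loads(text)
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"expected a number, got: {text!r}")
        return value
    if output_type == "bool":
        # Jinja2 renders Python booleans as "True"/"False", so compare case-insensitively
        lowered = text.lower()
        if lowered in ("true", "false"):
            return lowered == "true"
        raise ValueError(f"expected true/false, got: {text!r}")
    raise ValueError(f"unknown output_type: {output_type!r}")
```

An explicit type keeps behavior predictable: a summary that happens to look like JSON stays a string unless the step declares otherwise, which auto-detection could not guarantee.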
Acceptance criteria
- type: transform accepted by the YAML schema
- output_type: field for non-string results
- examples/ showing extraction + reshape between two agents