Read the original free-form source artifact if present, the free-form summary Markdown document, and the freeform-to-Galaxy interface and data-flow briefs. Emit a gxformat2 skeleton with workflow inputs, workflow outputs, placeholder steps, rough connections, and TODO slots for later implementation Molds.
The free-form summary does not have a concrete schema yet; treat it as Markdown. Treat the prior-step index as the working context: source transcript or paper, free-form summary, freeform-to-Galaxy interface and data-flow briefs, and any open questions carried forward.
Topology is this Mold’s job to settle. The output must be concrete gxformat2: workflow inputs with their final collection shapes and formats, workflow outputs, the step set, the producer→consumer edge graph, branches, and when: guards are all decided here. The upstream freeform-to-Galaxy interface and data-flow briefs guide those decisions, but if they hedge or leave a topology choice open, this Mold makes the call from source evidence, IWC exemplars, and pattern pages — never emit a topology TODO. Wrapper resolution, by contrast, is evidence-gated, not source-gated: resolve each tool step to the tier its evidence supports — Resolved (fully concrete, no _plan_*), Identity-pinned (concrete tool_id, parameters and changeset left to the per-step Mold), or Deferred (tool_id: TODO) — as defined in galaxy-workflow-draft-format. Capture whatever you defer in the _plan_* family (_plan_state, _plan_context, _plan_in, _plan_out) so the per-step Mold has the source evidence and constraints it needs.
Source tendency: free-form sources rarely name tools, so steps land in Deferred more often than they do for nf-core or CWL sources. However, a free-form source that does name a specific tool/version with evidence hardens to the matching tier, and a corpus-confirmed utility wrapper (interval/tabular/collection op) is not deferred just because the surrounding prose is informal. When deferring a domain tool, cite the originating paper section, interview answer, figure, or supplementary table in _plan_context, and record vague intent in _plan_state (“default settings”, “stranded reverse if mentioned, else unstranded”) so the per-step Mold knows the evidence ceiling.
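The three wrapper tiers described above can be sketched as a small classifier over a draft tool step. This is only an illustration under stated assumptions, not the definition in galaxy-workflow-draft-format: it assumes a step is a plain mapping, that a missing tool_id counts as Deferred, and that the presence of any _plan_* key is what separates Identity-pinned from Resolved. The function name `wrapper_tier` and the tool id `example_tool` are hypothetical.

```python
from typing import Any, Mapping

PLAN_PREFIX = "_plan_"


def wrapper_tier(step: Mapping[str, Any]) -> str:
    """Classify a draft tool step into one of the three wrapper tiers.

    Assumption: the tier is read off tool_id and the _plan_* keys alone.
    """
    # Assumption: a step with no tool_id at all is treated like tool_id: TODO.
    tool_id = step.get("tool_id", "TODO")

    # Deferred: the wrapper identity itself is still open.
    if tool_id == "TODO":
        return "Deferred"

    # Identity-pinned: tool_id is concrete, but planning fields remain
    # for the per-step Mold to settle.
    if any(str(key).startswith(PLAN_PREFIX) for key in step):
        return "Identity-pinned"

    # Resolved: concrete tool_id and no _plan_* fields left.
    return "Resolved"


# Hypothetical steps, one per tier.
deferred = {"tool_id": "TODO", "_plan_state": "default settings"}
pinned = {"tool_id": "example_tool", "_plan_context": "Methods, section 2.3"}
resolved = {"tool_id": "example_tool", "state": {}}
```

A check like this is useful mainly as a lint: a step that carries _plan_* fields but is meant to be Resolved, or a Deferred step with no _plan_context, is worth a second look before handing off.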
Before handing off, sanity-check that each step is computable from what feeds it. Once the step set is settled, re-read it and ask, for each step, whether the operation its intent implies can actually be produced from the inputs you wired. The connection graph only knows that ports connect — not what each port is supposed to contain — so an output that needs evidence no input carries will validate fine and still be impossible to implement. Where you spot that gap, don’t leave it implicit: wire (or add) the step that supplies the missing input, or record the unmet need plainly in _plan_state so the per-step Mold or a reviewer can act on it rather than discover it late.
Things worth a second look:
- an output column or field that no wired input carries;
- an aggregate or summary whose grouping key isn’t present upstream;
- a filter or threshold whose criterion isn’t produced by any input;
- a join whose key doesn’t exist on both sides;
- a step whose _plan_* promises more than its in: ports can supply;
- a classification step whose classes or enumeration cannot be derived from its inputs alone.
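The checks in this list share one shape: compare what a step needs against what its wired inputs carry. A minimal sketch of that comparison follows, assuming you have jotted down by hand the fields each source carries and the fields each step's intent requires. None of these structures come from gxformat2 itself; the function name, the step and input labels, and the field names are all hypothetical.

```python
from typing import Dict, List, Set


def find_uncomputable(
    carried: Dict[str, Set[str]],
    wiring: Dict[str, List[str]],
    needs: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return, per step, the required fields that no wired input carries.

    carried: fields each source (workflow input or upstream step) provides.
    wiring:  sources connected to each step's in: ports.
    needs:   fields each step's intent requires.
    """
    gaps: Dict[str, Set[str]] = {}
    for step, required in needs.items():
        available: Set[str] = set()
        for source in wiring.get(step, []):
            available |= carried.get(source, set())
        missing = required - available
        if missing:
            gaps[step] = missing
    return gaps


# Hypothetical example: a per-condition summary wired only to a variant
# table that carries no "condition" column.
carried = {
    "variants": {"chrom", "pos", "ref", "alt"},
    "sample_sheet": {"sample_id", "condition"},
}
wiring = {"summarize_by_condition": ["variants"]}
needs = {"summarize_by_condition": {"chrom", "condition"}}

gaps = find_uncomputable(carried, wiring, needs)
```

Here the grouping key `condition` is reported as missing, which points at the fix: wire the sample sheet into the step, or record the unmet need in _plan_state.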
Optionally, once topology is settled, group the step set into titled stage frames via the gxformat2 comments: array (one frame per analysis stage, contains_steps: populated, color decorative) — see galaxy-workflow-comments for the convention.
Before handing off, check that each settled step is computable from what feeds it. The connection graph knows that ports connect, not what they carry, so a declared output that needs evidence no wired input supplies will validate yet cannot be implemented. Where you find that gap, wire (or add) the producer; if you can’t, append a blocking entry to the open-requirements-ledger naming the step, the uncomputable output, and the missing evidence (and record vague intent in _plan_state) so that the per-step loop or repair-galaxy-draft-topology acts on it rather than discovering it late. More generally, carry the ledger: read the entries bearing on your topology decisions and mark as resolved the ones you close.
Output shape is gxformat2 with wrapper-tier relaxations and _plan_state / _plan_context / _plan_in / _plan_out per tool step; see galaxy-workflow-draft-format. Open refinement work for those planning fields is tracked in refinement.md.