Claude skill · cast

freeform-summary-to-galaxy-template

Generates a gxformat2 skeleton with per-step TODOs from a free-form summary and a Galaxy design brief.


Install

/plugin marketplace add galaxyproject/foundry
/plugin install foundry-skills@galaxy-workflow-foundry

Then invoke as:

/foundry-skills:freeform-summary-to-galaxy-template

Skill Bundle

/ packaged cast
- attached files: 12
- upfront: 3
- on demand: 9
- cast rev: n/a
- validated: 0

Produces: 1 artifact.

Consumes: 4 artifacts.

Artifact Contract

/ skill handoff

Produces

galaxy-workflow-draft

gxformat2 draft (see [[galaxy-workflow-draft-format]]): topology fully resolved (workflow inputs, outputs, step set, edges); tool_id / tool_state / tool_shed_repository and wrapper-determined port names may be TODO with free-text _plan_state / _plan_context / _plan_in / _plan_out per step for later implementation Molds.

yaml · galaxy-workflow-draft.gxwf.yml · [[galaxy-workflow-draft]]
Raw artifact contract
{
  "id": "galaxy-workflow-draft",
  "kind": "yaml",
  "default_filename": "galaxy-workflow-draft.gxwf.yml",
  "schema": "[[galaxy-workflow-draft]]",
  "description": "gxformat2 draft (see [[galaxy-workflow-draft-format]]): topology fully resolved (workflow inputs, outputs, step set, edges); tool_id / tool_state / tool_shed_repository and wrapper-determined port names may be TODO with free-text _plan_state / _plan_context / _plan_in / _plan_out per step for later implementation Molds."
}
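
As a rough illustration of the contract above, the sketch below models a draft as a plain Python dictionary and lists the wrapper-tier fields that still hold the `TODO` sentinel. The step name, labels, and nesting are hypothetical; only the `GalaxyWorkflowDraft` class name, the literal `TODO` value for `tool_id` / `tool_version`, and the `_plan_*` field names come from the bundled notes.

```python
# Hypothetical draft, written as a Python dict instead of YAML so the
# example runs without third-party packages. Step and label names are made up.
draft = {
    "class": "GalaxyWorkflowDraft",
    "inputs": {"reads": {"type": "collection", "collection_type": "list:paired"}},
    "steps": {
        "quality_control": {
            "tool_id": "TODO",       # wrapper not picked yet
            "tool_version": "TODO",
            "_plan_state": "Default parameters; paired-end mode.",
            "_plan_context": "Summary says reads are QC-checked before mapping.",
        },
    },
}

SENTINEL = "TODO"


def unresolved_fields(draft):
    """Return (step_name, field) pairs whose value is still the TODO sentinel."""
    pending = []
    for name, step in draft.get("steps", {}).items():
        for field in ("tool_id", "tool_version"):
            if step.get(field) == SENTINEL:
                pending.append((name, field))
    return pending


print(unresolved_fields(draft))
```

A later implementation Mold would be expected to replace each listed field and drop the matching `_plan_*` notes once the step is concrete.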

Consumes

freeform-summary

Free-form source summary emitted by [[summarize-paper]] or [[interview-to-freeform-summary]]; consulted while emitting placeholder steps.

Raw artifact contract
{
  "id": "freeform-summary",
  "description": "Free-form source summary emitted by [[summarize-paper]] or [[interview-to-freeform-summary]]; consulted while emitting placeholder steps.",
  "producers": [
    "interview-to-freeform-summary",
    "summarize-paper"
  ]
}

freeform-galaxy-interface

Galaxy interface brief from [[freeform-summary-to-galaxy-interface]] that pins workflow inputs, outputs, labels.

Raw artifact contract
{
  "id": "freeform-galaxy-interface",
  "description": "Galaxy interface brief from [[freeform-summary-to-galaxy-interface]] that pins workflow inputs, outputs, labels.",
  "producers": [
    "freeform-summary-to-galaxy-interface"
  ]
}

freeform-galaxy-data-flow

Galaxy data-flow brief from [[freeform-summary-to-galaxy-data-flow]] that pins abstract operations and collection choices.

Raw artifact contract
{
  "id": "freeform-galaxy-data-flow",
  "description": "Galaxy data-flow brief from [[freeform-summary-to-galaxy-data-flow]] that pins abstract operations and collection choices.",
  "producers": [
    "freeform-summary-to-galaxy-data-flow"
  ]
}

iwc-comparison-notes

Structural diff guidance from [[compare-against-iwc-exemplar]] (run on the design brief); steers the skeleton toward IWC-aligned structure before per-step authoring.

Raw artifact contract
{
  "id": "iwc-comparison-notes",
  "description": "Structural diff guidance from [[compare-against-iwc-exemplar]] (run on the design brief); steers the skeleton toward IWC-aligned structure before per-step authoring.",
  "producers": [
    "compare-against-iwc-exemplar"
  ]
}

Attached Files

/ runtime references

Load upfront

research

galaxy-data-flow-draft-contract

packaged

Respect the handoff from the freeform-to-Galaxy interface and data-flow briefs to the gxformat2 skeleton.

Trigger: When translating abstract nodes, unresolved tool needs, and placeholder transformations into template TODOs.

upfront runtime verbatim hypothesis deterministic 6.5 KB
bundle
references/notes/galaxy-data-flow-draft-contract.md
source
content/research/galaxy-data-flow-draft-contract.md
Preview md
---
type: research
subtype: design-spec
title: "Galaxy data-flow draft contract"
tags:
  - research/design-spec
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-03
revision: 2
ai_generated: true
related_notes:
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[nextflow-operators-to-galaxy-collection-recipes]]"
  - "[[galaxy-workflow-draft]]"
related_molds:
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[freeform-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[freeform-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
  - "[[advance-galaxy-draft-step]]"
sources:
  - "https://github.com/galaxyproject/foundry/issues/54"
summary: "Defines the proposed boundary between Galaxy data-flow drafts, gxformat2 templates, and concrete step implementation."
---

# Galaxy Data-Flow Draft Contract

This is an architectural contract, not a schema. Evidence is strongest for Mold and Pipeline boundaries. Proposed fields are speculative until exercised by two or three worked translations.

## Boundary

The data-flow draft owns a target-shaped abstract DAG for Galaxy. It should not be valid `gxformat2` and should not resolve exact Tool Shed tools.

Data-flow draft owns:

- Galaxy-facing workflow inputs and outputs.
- Abstract nodes, edges, branches, collection mapping, collection reduction, and placeholder transformations.
- Input/output shape decisions such as `File`, `list`, `paired`, `list:paired`, or `list:list`.
- Conceptual Galaxy idioms: map-over, reduction, Apply Rules, collection cleanup, identifier synchronization, tabular bridge.
- Abstract unresolved tool needs with input and output shapes.
- Confidence and rat
...
research

galaxy-workflow-draft-format

packaged

Emit the gxformat2 draft superset: TODO tool_id, optional tool_state / tool_shed_repository, and per-step _plan_state / _plan_context planning fields.

upfront runtime verbatim hypothesis deterministic 7.2 KB
bundle
references/notes/galaxy-workflow-draft-format.md
source
content/research/galaxy-workflow-draft-format.md
Preview md
---
type: research
subtype: design-spec
title: "Galaxy workflow draft format"
tags:
  - research/design-spec
  - target/galaxy
status: draft
created: 2026-05-06
revised: 2026-05-10
revision: 2
ai_generated: true
related_notes:
  - "[[gxformat2-schema]]"
  - "[[galaxy-data-flow-draft-contract]]"
  - "[[discover-shed-tool]]"
  - "[[galaxy-workflow-draft]]"
related_molds:
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[freeform-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
  - "[[implement-galaxy-tool-step]]"
  - "[[advance-galaxy-draft-step]]"
summary: "gxformat2 draft superset: wrapper-tier TODOs (tool_id, tool_state, port names) plus _plan_state / _plan_context / _plan_in / _plan_out per tool step."
---

# Galaxy workflow draft format

The output artifact `galaxy-workflow-draft` produced by the `*-summary-to-galaxy-template` Molds is **gxformat2 with wrapper-tier relaxations and free-text planning fields**, sized to the gap between data-flow design and tool-resolved implementation.

Topology — workflow inputs and their collection shapes, workflow outputs, the step set, the producer→consumer edge graph, branches, and `when:` guards — is **settled by the template Mold itself**, drawing on the upstream interface and data-flow briefs (see [[galaxy-data-flow-draft-contract]]). The output is concrete gxformat2 with no topology TODOs. Everything deferred to the per-step implementation Mold is wrapper-tier: which Tool Shed wrapper, what parameters, and the wrapper-determined port names that populate `in:` / `out:` / `outputSource`.

## Relaxations vs. gxformat2

For tool steps, when the wrapper has not been picked:

- `tool_id` and `tool_version` MAY be the literal string `TODO`. Resolution belongs to [[discov
...
schema

galaxy-workflow-draft

packaged

Output contract: the emitted gxformat2 draft conforms to [[galaxy-workflow-draft]]. Cast bundles the JSON Schema so the skill carries its output shape alongside the [[draft-validate]] CLI checks.

upfront runtime verbatim cast-validated deterministic 58.6 KB
bundle
references/schemas/galaxy-workflow-draft.schema.json
source
package://@galaxy-tool-util/schema#galaxyWorkflowDraftJsonSchema
Preview json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$defs": {
    "WorkflowStepSchema": {
      "type": "object",
      "required": [],
      "properties": {
        "id": {
          "anyOf": [
            {
              "type": "null"
            },
            {
              "type": "string"
            }
          ]
        },
        "label": {
          "anyOf": [
            {
              "type": "null"
            },
            {
              "type": "string"
            }
          ]
        },
        "doc": {
          "anyOf": [
            {
              "type": "null"
            },
            {
              "type": "string"
            },
            {
              "type": "array",
              "items": {
                "type": "string"
              }
            }
          ]
        },
        "position": {
          "anyOf": [
            {
              "type": "null"
            },
            {
              "type": "object",
              "required": [
                "top",
                "left"
              ],
              "properties": {
                "top": {
                  "type": "number"
                },
                "left": {
                  "type": "number"
                }
              },
              "additionalProperties": false
            }
          ]
        },
        "tool_id": {
          "anyOf": [
            {
              "type": "null"
            },
            {
              "type": "string"
            }
          ]
        },
        "tool_shed_repository": {
          "$ref": "#/$defs/Shared15"
        },
        "tool_version": {
          "anyOf": [
            {
              "type": "null"
            },
            {
              "type": "string"
            }
          ]
        },
        "errors": {
          "anyOf": [
            {
              "type": "null"
            },
            {
              "type": "string"
            }
          ]
        },
        "uuid": {
          "anyOf": [
            {
              "type": "null"
            },
            {
              "type": "string"
            }
          ]
        },
        "in": {
          "$ref": "#/$defs/Shared16"
        },
        "out": {
          "anyOf": [
            {
              "type": "array",
              "items": {
                "$ref": "#/$defs/Shared4"
          
...

Load on demand

cli-command

draft-validate

packaged

Validate the emitted draft against draft-contract rules (sentinel form, topology, _plan_* placement) before handing off.

Trigger: After writing or modifying the draft workflow file.

on-demand runtime sidecar hypothesis deterministic 4.7 KB
bundle
references/cli/draft-validate.json
source
content/cli/gxwf/draft-validate.md
Preview json
{
  "type": "cli-command",
  "tool": "gxwf",
  "command": "draft-validate",
  "summary": "Validate a `class: GalaxyWorkflowDraft` workflow against draft-contract rules; with --concrete, also validate the extracted concrete subset.",
  "source_path": "content/cli/gxwf/draft-validate.md",
  "source_revision": 1,
  "body": "# `gxwf draft-validate`\n\nValidate a draft Galaxy workflow against the **draft contract**: sentinel form, dangling edge references, top-level `_plan_*` placement, `_plan_*` on fully-resolved tool steps, and recursive draft subworkflows. Native (.ga) input is rejected — drafts are format2-only.\n\nDistinct from [[validate]], which validates a fully concrete `class: GalaxyWorkflow` and would reject the draft relaxations outright. Use `draft-validate` during the per-step authoring loop; use `validate` at the terminal pass once `promoteFullyConcreteDrafts` has flipped the class.\n\n## Output\n\nDefault output is human-readable: counted buckets for structure / topology / semantic errors and warnings, plus a one-line survey (TODO sentinel count, paths carrying `_plan_*`). `--json` emits a `SingleDraftValidationReport`; `--report-html` and `--report-markdown` write the same data as a self-contained HTML page or templated Markdown. With `--concrete`, the report carries an optional `concrete: ConcreteValidationReport` whose buckets (`structure_errors`, `strict_structure_errors`, `strict_encoding_errors`, `strict_state_errors`, `tool_state`, `connection_report`) are **absent when the corresponding check did not run** — readers should treat absence as \"not run,\" not as \"passed.\"\n\n## Flags\n\n`--concrete` runs the extract+promote pipeline (`extractConcreteSubset` → `stripPlanFields` → `promoteFullyConcreteDrafts`) and applies the full concrete `gxformat2` validation surface to the result. The following pass-through flags only take effect under `--concrete`; passing them without it prints a stderr warning and no-ops:\n\n- `--cache-dir <dir>` — tool cache for tool-state lookups.\n- `--no-tool-state` — skip tool-state validation on the concrete pass. 
Combined with `--strict-state`, the strict flag warns + no-ops (there's no state to be strict about).\n- `--connections` — run connection validation on the concrete subset.\n- `--strict` — escalate every strict bucket (structure, encoding, state) to error.\n- `--strict-structure` / `--strict-encoding` /
...
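
The `draft-validate` description above says that a bucket missing from the `concrete` report means the check did not run, not that it passed. A minimal sketch of reading a report that way follows; the bucket names are taken from that description, while the report being a plain dictionary keyed by bucket name is an assumption.

```python
# Bucket names as listed in the draft-validate description.
CONCRETE_BUCKETS = (
    "structure_errors",
    "strict_structure_errors",
    "strict_encoding_errors",
    "strict_state_errors",
    "tool_state",
    "connection_report",
)


def bucket_status(concrete_report):
    """Label each known bucket as 'ran' or 'not run' based on key presence only."""
    return {
        bucket: "ran" if bucket in concrete_report else "not run"
        for bucket in CONCRETE_BUCKETS
    }


# Hypothetical report in which only the structure check ran.
example = {"structure_errors": []}
for bucket, status in bucket_status(example).items():
    print(f"{bucket}: {status}")
```

Keeping "not run" distinct from an empty bucket avoids reporting a draft as clean when, for instance, connection validation was never requested.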
pattern

galaxy-collection-patterns

packaged

Use corpus-grounded collection pattern guidance for unresolved skeleton steps.

Trigger: When adding TODO steps for collection cleanup, reshaping, relabeling, identifier synchronization, or collection-tabular bridges.

on-demand runtime verbatim corpus-observed deterministic 4.4 KB
bundle
references/patterns/galaxy-collection-patterns.md
source
content/patterns/galaxy-collection-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: collection patterns"
aliases:
  - "Galaxy collection pattern MOC"
  - "collection transformation patterns"
  - "IWC collection pattern map"
tags:
  - pattern
  - target/galaxy
  - topic/galaxy-transform
  - topic/collection-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy collection transformation patterns."
related_notes:
  - "[[iwc-transformations-survey]]"
  - "[[iwc-conditionals-survey]]"
related_patterns:
  - "[[manifest-to-mapped-collection-lifecycle]]"
  - "[[cleanup-sync-and-publish-nonempty-results]]"
  - "[[reshape-relabel-remap-by-collection-axis]]"
  - "[[fan-in-bundle-consume-and-flatten]]"
  - "[[collection-cleanup-after-mapover-failure]]"
  - "[[sync-collections-by-identifier]]"
  - "[[harmonize-by-sortlist-from-identifiers]]"
  - "[[regex-relabel-via-tabular]]"
  - "[[relabel-via-rules-and-find-replace]]"
  - "[[collection-swap-nesting-with-apply-rules]]"
  - "[[collection-split-identifier-via-rules]]"
  - "[[collection-build-list-paired-with-apply-rules]]"
  - "[[tabular-to-collection-by-row]]"
  - "[[tabular-concatenate-collection-to-table]]"
  - "[[tabular-pivot-collection-to-wide]]"
related_molds:
  - "[[implement-galaxy-tool-step]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[freeform-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
---

# Galaxy: collection patterns

This is the runtime-facing map for Galaxy collection transformation choices. Use it before loading raw survey notes. The survey remains evidence backin
...
pattern

galaxy-conditionals-patterns

packaged

Use corpus-grounded conditional pattern guidance for unresolved skeleton steps.

Trigger: When adding TODO steps for optional steps, gating on non-empty results, routing between alternative outputs, or transform-or-pass-through branches.

on-demand runtime verbatim corpus-observed deterministic 2.6 KB
bundle
references/patterns/galaxy-conditionals-patterns.md
source
content/patterns/galaxy-conditionals-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: conditionals patterns"
aliases:
  - "Galaxy conditional pattern MOC"
  - "Galaxy when patterns"
  - "conditional workflow patterns"
tags:
  - pattern
  - target/galaxy
  - topic/galaxy-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy when and pick_value conditional patterns."
related_notes:
  - "[[iwc-conditionals-survey]]"
related_patterns:
  - "[[conditional-run-optional-step]]"
  - "[[conditional-route-between-alternative-outputs]]"
  - "[[conditional-gate-on-nonempty-result]]"
  - "[[conditional-transform-or-pass-through]]"
  - "[[collection-cleanup-after-mapover-failure]]"
related_molds:
  - "[[implement-galaxy-tool-step]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[freeform-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
---

# Galaxy: conditionals patterns

This is the runtime-facing map for Galaxy conditional workflow choices. Use it before loading raw survey notes. The survey remains evidence backing; the operation and recipe pages are the actionable references.

## Direct Gates

- [[conditional-run-optional-step]] — expose or derive a boolean, connect it as `inputs.when`, and use `when: $(inputs.when)` to skip optional steps.
- [[conditional-gate-on-nonempty-result]] — compute a boolean from empty/non-empty dataset or collection state before gating downstream reporting/export. The MGnify recipe is corpus-backed but clunky pending verified-pattern workflow work.

## Routes and Fallbacks

- [[conditional-route-between-alternati
...
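
The direct-gate recipe in the preview above (connect a boolean as `inputs.when`, then set `when: $(inputs.when)`) can be spot-checked with a small helper. This is only a sketch: the step names are hypothetical, and the `in` / `source` nesting is an assumed dictionary rendering of a gxformat2 step rather than something copied from a real workflow.

```python
# Hypothetical steps: the first one wires the gate correctly, the second forgets to.
steps = {
    "make_report": {
        "tool_id": "TODO",
        "in": {"when": {"source": "run_report"}},
        "when": "$(inputs.when)",
    },
    "export_table": {
        "tool_id": "TODO",
        "in": {"table": {"source": "make_report/report"}},
        "when": "$(inputs.when)",
    },
}


def gated_without_when_input(steps):
    """Names of steps that declare `when:` but never connect a `when` input."""
    return [
        name
        for name, step in steps.items()
        if "when" in step and "when" not in step.get("in", {})
    ]


print(gated_without_when_input(steps))
```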
pattern

galaxy-interval-patterns

packaged

Use corpus-grounded genomic-interval pattern guidance for unresolved skeleton steps.

Trigger: When adding TODO steps for interval overlap, merge, coverage, windowing, masking, or set-algebra on coordinate features.

on-demand runtime verbatim corpus-observed deterministic 5.3 KB
bundle
references/patterns/galaxy-interval-patterns.md
source
content/patterns/galaxy-interval-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: genomic interval patterns"
aliases:
  - "Galaxy interval pattern MOC"
  - "genomic interval transformation patterns"
  - "IWC interval pattern map"
tags:
  - pattern
  - target/galaxy
  - topic/galaxy-transform
  - topic/interval-transform
status: draft
created: 2026-06-10
revised: 2026-06-10
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy genomic interval operations and recipes on coordinate features."
related_notes:
  - "[[iwc-interval-operations-survey]]"
related_patterns:
  - "[[interval-overlap-filter]]"
  - "[[interval-coverage]]"
  - "[[interval-merge-overlapping]]"
  - "[[interval-window-flank]]"
  - "[[interval-consensus-by-multi-intersect]]"
  - "[[interval-mask-by-set-algebra]]"
  - "[[interval-windowed-coverage]]"
  - "[[tabular-synthesize-bed-from-3col]]"
related_molds:
  - "[[implement-galaxy-tool-step]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[paper-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
---

# Galaxy: genomic interval patterns

The runtime-facing map for Galaxy **coordinate-feature** choices — operations that understand `chrom/start/end/strand`, as opposed to opaque-column [[galaxy-tabular-patterns]] or container-shaped [[galaxy-collection-patterns]]. Use it before loading raw survey notes; [[iwc-interval-operations-survey]] is the evidence backing, these pages are the actionable references.

This is the smallest of the three data-shape MOCs by design. Interval algebra is a real but moderate cluster in IWC — concentrated in epigenetics peak-consensus and SARS-CoV-2 mask
...
pattern

galaxy-tabular-patterns

packaged

Use corpus-grounded tabular pattern guidance for unresolved skeleton steps.

Trigger: When adding TODO steps for tabular filtering, projection, joins, aggregation, text-processing recipes, or tabular-collection bridges.

on-demand runtime verbatim corpus-observed deterministic 3.1 KB
bundle
references/patterns/galaxy-tabular-patterns.md
source
content/patterns/galaxy-tabular-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: tabular patterns"
aliases:
  - "Galaxy tabular pattern MOC"
  - "tabular transformation patterns"
  - "IWC tabular pattern map"
tags:
  - pattern
  - target/galaxy
  - topic/galaxy-transform
  - topic/tabular-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy tabular transformation patterns."
related_notes:
  - "[[iwc-tabular-operations-survey]]"
related_patterns:
  - "[[tabular-filter-by-column-value]]"
  - "[[tabular-filter-by-regex]]"
  - "[[tabular-cut-and-reorder-columns]]"
  - "[[tabular-compute-new-column]]"
  - "[[tabular-join-on-key]]"
  - "[[tabular-group-and-aggregate-with-datamash]]"
  - "[[tabular-sql-query]]"
  - "[[tabular-prepend-header]]"
  - "[[tabular-synthesize-bed-from-3col]]"
  - "[[tabular-split-taxonomy-string]]"
  - "[[tabular-relabel-by-row-counter]]"
  - "[[tabular-to-collection-by-row]]"
  - "[[tabular-concatenate-collection-to-table]]"
  - "[[tabular-pivot-collection-to-wide]]"
related_molds:
  - "[[implement-galaxy-tool-step]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[freeform-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
---

# Galaxy: tabular patterns

This is the runtime-facing map for Galaxy tabular transformation choices. Use it before loading raw survey notes. The survey remains evidence backing; the operation pages are the actionable references.

## Row And Column Operations

- [[tabular-filter-by-column-value]] — keep/drop rows by string column value with `Filter1`.
- [[tabular-filter-by-regex]] 
...
research

galaxy-collection-semantics

packaged

Preserve Galaxy collection typing and map-over/reduction semantics in the gxformat2 skeleton.

Trigger: When creating workflow inputs, outputs, and placeholder connections involving collections.

on-demand runtime verbatim corpus-observed deterministic 1.9 KB
bundle
references/notes/galaxy-collection-semantics.md
source
content/research/galaxy-collection-semantics.md
Preview md
---
type: research
subtype: component
title: "Galaxy collection semantics"
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-04-30
revised: 2026-05-05
revision: 3
ai_generated: false
related_notes:
  - "[[galaxy-xsd]]"
  - "[[galaxy-collection-tools]]"
  - "[[galaxy-apply-rules-dsl]]"
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[nextflow-operators-to-galaxy-collection-recipes]]"
  - "[[galaxy-tool-job-failure-reference]]"
  - "[[galaxy-workflow-invocation-failure-reference]]"
  - "[[iwc-transformations-survey]]"
  - "[[galaxy-discover-datasets]]"
sources:
  - "https://github.com/galaxyproject/galaxy/blob/7765fae934fbfdee77e3be5f5b235e43735273ae/lib/galaxy/model/dataset_collections/types/collection_semantics.yml"
companions:
  - "galaxy-collection-semantics.yml"
  - "galaxy-collection-semantics.upstream.myst"
summary: "Vendored formal spec of Galaxy dataset-collection mapping/reduction semantics, with labeled examples and pinned test references."
---

> **Vendored from upstream**, pinned at SHA `7765fae`. Two files live next to this note:
>
> - `galaxy-collection-semantics.yml` — the structured source. **Agents and casting should consume this.** It carries the `tests:` blocks that pin concrete Galaxy test names; the rendered upstream view drops them.
> - `galaxy-collection-semantics.upstream.myst` — Galaxy's auto-generated MyST/LaTeX rendering of the YAML, vendored only so the human view below has something to render. Sync is manual.
>
> **When to consult:** authoring or reasoning about Molds and patterns that touch `data_collection` inputs, map-over / reduction shape changes, sub-collection mapping, `paired_or_unpaired`, or `sample_sheet`.

```vendored-myst
file: galaxy-collection-semantics.upstream.myst
source: https://github.com/g
...
research

galaxy-collection-semantics

packaged

Preserve Galaxy collection typing and map-over/reduction semantics in the gxformat2 skeleton.

Trigger: When creating workflow inputs, outputs, and placeholder connections involving collections.

on-demand runtime verbatim corpus-observed deterministic 33.4 KB
bundle
references/notes/galaxy-collection-semantics.upstream.myst
source
content/research/galaxy-collection-semantics.upstream.myst
Preview myst
# Collection Semantics

This document describes the semantics around working with Galaxy dataset collections.
In particular it describes how they operate within Galaxy tools and workflows.

:::{admonition} You Probably Don't Need to Read This
:class: caution

Any significantly sophisticated workflow language will have ways to collect data
into arrays or vectors or dictionaries and apply operations across this data (mapping)
or reduce the dimensionality of this data (reductions). Typically, this is explicitly
annotated with map functions or for loops. Galaxy however is designed to be a point
and click interface for connecting steps and running tools. It is important that steps
just connect and just do the most natural thing - and this is what Galaxy does.
This document just provides a mathematical formalism to that "what should just
intuitively work" that can be used to document test cases and help with implementation.
This is reference documentation not user documentation, Galaxy should just work.
:::

## Mapping

If a tool consumes a simple dataset parameter and produces a simple dataset parameter,
then any collection type may be "mapped over" the data input to that tool. The result of
that is the tool being applied to each element of the collection and "implicit collections"
being created from the outputs that are produced from those operations. Those implicit
collections have the same element identifiers in the same order as the input collection that is
mapped over. Each element of the implicit collections correspond to their own job and
Galaxy very naturally and intuitively parallelizes jobs without extra work from the user
and without any knowledge of the tool.


(BASIC_MAPPING_PAIRED)=
(BASIC_MAPPING_PAIRED_OR_UNPAIRED_PAIRED)=
(BASIC_MAPPING_PAIRED_OR_UNPAIRED_UN
...
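
The mapping rule quoted above (one job per element, implicit output collection with the same identifiers in the same order) can be pictured with a toy model. The sketch below is not Galaxy code: it treats a collection as an ordered dictionary of identifier to dataset, and `map_over` is a hypothetical helper name.

```python
def map_over(tool, collection):
    """Apply `tool` to every element, keeping identifiers and their order."""
    return {identifier: tool(dataset) for identifier, dataset in collection.items()}


# A toy "list" collection; the string values stand in for dataset contents.
reads = {"sample_b": "ACGT", "sample_a": "GG"}

# A toy "tool" that consumes one dataset and produces one dataset.
lengths = map_over(len, reads)

print(list(lengths))   # identifiers come out in input order
print(lengths)
```

Each call to `tool` corresponds to one job in the description above, which is why the output collection lines up element by element with the input.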
research

galaxy-collection-semantics

packaged

Preserve Galaxy collection typing and map-over/reduction semantics in the gxformat2 skeleton.

Trigger: When creating workflow inputs, outputs, and placeholder connections involving collections.

on-demand runtime verbatim corpus-observed deterministic 43.8 KB
bundle
references/notes/galaxy-collection-semantics.yml
source
content/research/galaxy-collection-semantics.yml
Preview yml
- doc: |
    # Collection Semantics

    This document describes the semantics around working with Galaxy dataset collections.
    In particular it describes how they operate within Galaxy tools and workflows.

    :::{admonition} You Probably Don't Need to Read This
    :class: caution

    Any significantly sophisticated workflow language will have ways to collect data
    into arrays or vectors or dictionaries and apply operations across this data (mapping)
    or reduce the dimensionality of this data (reductions). Typically, this is explicitly
    annotated with map functions or for loops. Galaxy however is designed to be a point
    and click interface for connecting steps and running tools. It is important that steps
    just connect and just do the most natural thing - and this is what Galaxy does.
    This document just provides a mathematical formalism to that "what should just
    intuitively work" that can be used to document test cases and help with implementation.
    This is reference documentation not user documentation, Galaxy should just work.
    :::

    ## Mapping

    If a tool consumes a simple dataset parameter and produces a simple dataset parameter,
    then any collection type may be "mapped over" the data input to that tool. The result of
    that is the tool being applied to each element of the collection and "implicit collections"
    being created from the outputs that are produced from those operations. Those implicit
    collections have the same element identifiers in the same order as the input collection that is
    mapped over. Each element of the implicit collections correspond to their own job and
    Galaxy very naturally and intuitively parallelizes jobs without extra work from the user
    and without any knowledge of the tool.

...
research

galaxy-workflow-testability-design

packaged

Choose stable workflow input/output labels, testable checkpoint outputs, and fixture-compatible workflow interfaces while drafting the skeleton.

Trigger: When the template decides workflow inputs, workflow outputs, promoted checkpoints, or collection output identifiers that future tests will need to address.

on-demand runtime verbatim corpus-observed deterministic 11.4 KB
bundle
references/notes/galaxy-workflow-testability-design.md
source
content/research/galaxy-workflow-testability-design.md
Preview md
---
type: research
subtype: component
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-05-03
revised: 2026-05-06
revision: 2
ai_generated: true
related_notes:
  - "[[iwc-workflow-testability-survey]]"
  - "[[iwc-test-data-conventions]]"
  - "[[planemo-asserts-idioms]]"
  - "[[iwc-shortcuts-anti-patterns]]"
  - "[[planemo-workflow-test-architecture]]"
  - "[[implement-galaxy-workflow-test]]"
  - "[[gxformat2-schema]]"
  - "[[gxformat2-workflow-inputs]]"
  - "[[galaxy-datatypes-conf]]"
summary: "Design guidance for Galaxy workflow inputs, outputs, and checkpoints that make IWC-style workflow tests possible."
---

# Galaxy workflow testability design

Use this note when authoring or translating a Galaxy workflow **before** the `-tests.yml` file exists. It covers workflow structure choices that make later IWC-style tests meaningful: labels, promoted checkpoints, collection identifiers, and fixture-compatible inputs.

This is not a `content/patterns/` page. It is cross-cutting design guidance for Molds that need testable Galaxy workflows. Assertion syntax lives in [[planemo-asserts-idioms]]. Test YAML fixture shapes live in [[iwc-test-data-conventions]]. Accepted shortcut vs smell calls live in [[iwc-shortcuts-anti-patterns]]. Corpus evidence trail lives in [[iwc-workflow-testability-survey]].

## 1. Treat labels as API

Workflow input and output labels are not cosmetic. Planemo and IWC tests address workflow inputs and outputs by label, and the survey found exact label matches for every asserted output across 114 matched workflow/test pairs. A generated workflow should therefore pick stable, descriptive labels before test authoring starts.

Rules:

- Label every output that may need a test assertion.
- Treat input/output renames as breaking changes
...
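
Because tests address workflow outputs by label, as the note above explains, a skeleton can be checked early for labels that a planned test expects but the workflow does not declare. A minimal sketch, assuming both sides are available as plain lists of label strings (the labels shown are hypothetical):

```python
def unmatched_test_labels(workflow_output_labels, asserted_labels):
    """Return asserted labels that have no exact match among workflow outputs."""
    known = set(workflow_output_labels)
    return [label for label in asserted_labels if label not in known]


workflow_labels = ["MultiQC report", "Filtered variants"]
planned_assertions = ["MultiQC report", "filtered variants"]

# Matching is exact, so a case difference counts as a mismatch.
print(unmatched_test_labels(workflow_labels, planned_assertions))
```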

SKILL.md


# freeform-summary-to-galaxy-template

Follow the procedure below and use the artifact/reference sections as the runtime contract.

## When To Use

- Use this skill to build a gxformat2 skeleton with per-step TODOs from a free-form summary and a Galaxy design brief.

## Inputs

- Read artifact `freeform-summary`. Produced by `interview-to-freeform-summary` or `summarize-paper`. Free-form source summary; consulted while emitting placeholder steps.
- Read artifact `freeform-galaxy-interface`. Produced by `freeform-summary-to-galaxy-interface`. Galaxy interface brief that pins workflow inputs, outputs, and labels.
- Read artifact `freeform-galaxy-data-flow`. Produced by `freeform-summary-to-galaxy-data-flow`. Galaxy data-flow brief that pins abstract operations and collection choices.
- Read artifact `iwc-comparison-notes`. Produced by `compare-against-iwc-exemplar`. Structural diff guidance (run on the design brief); steers the skeleton toward IWC-aligned structure before per-step authoring.

## Outputs

- Write artifact `galaxy-workflow-draft` as `galaxy-workflow-draft.gxwf.yml`. Format: `yaml`. Schema: `galaxy-workflow-draft`. A gxformat2 draft (see galaxy-workflow-draft-format) with topology fully resolved (workflow inputs, outputs, step set, edges). `tool_id`, `tool_state`, `tool_shed_repository`, and wrapper-determined port names may be `TODO`, with free-text `_plan_state` / `_plan_context` / `_plan_in` / `_plan_out` per step for later implementation Molds.

## Required Tools

- **`gxwf`**. Install: `npm install -g @galaxy-tool-util/cli`.
  Ephemeral run: `npx --package @galaxy-tool-util/cli gxwf`.
  Check: `gxwf --version`.
  Docs: https://github.com/jmchilton/galaxy-tool-util-ts/tree/main/packages/cli

## Load Upfront

- `references/notes/galaxy-data-flow-draft-contract.md`: Research note copied verbatim into the bundle. Respect the handoff from the freeform-to-Galaxy interface and data-flow briefs to the gxformat2 skeleton. Use when: translating abstract nodes, unresolved tool needs, and placeholder transformations into template TODOs.
- `references/notes/galaxy-workflow-draft-format.md`: Research note copied verbatim into the bundle. Emit the gxformat2 draft superset: TODO tool_id, optional tool_state / tool_shed_repository, and per-step _plan_state / _plan_context planning fields.
- `references/schemas/galaxy-workflow-draft.schema.json`: Schema file copied verbatim into the bundle. Output contract: the emitted gxformat2 draft conforms to galaxy-workflow-draft. Cast bundles the JSON Schema so the skill carries its output shape alongside the draft-validate CLI checks.

## Load On Demand

- `references/cli/draft-validate.json`: CLI command reference packaged as a sidecar. Validate the emitted draft against draft-contract rules (sentinel form, topology, _plan_* placement) before handing off. Use when: after writing or modifying the draft workflow file.
- `references/patterns/galaxy-collection-patterns.md`: Pattern note copied verbatim into the bundle. Use corpus-grounded collection pattern guidance for unresolved skeleton steps. Use when: adding TODO steps for collection cleanup, reshaping, relabeling, identifier synchronization, or collection-tabular bridges.
- `references/patterns/galaxy-conditionals-patterns.md`: Pattern note copied verbatim into the bundle. Use corpus-grounded conditional pattern guidance for unresolved skeleton steps. Use when: adding TODO steps for optional steps, gating on non-empty results, routing between alternative outputs, or transform-or-pass-through branches.
- `references/patterns/galaxy-interval-patterns.md`: Pattern note copied verbatim into the bundle. Use corpus-grounded genomic-interval pattern guidance for unresolved skeleton steps. Use when: adding TODO steps for interval overlap, merge, coverage, windowing, masking, or set-algebra on coordinate features.
- `references/patterns/galaxy-tabular-patterns.md`: Pattern note copied verbatim into the bundle. Use corpus-grounded tabular pattern guidance for unresolved skeleton steps. Use when: adding TODO steps for tabular filtering, projection, joins, aggregation, text-processing recipes, or tabular-collection bridges.
- `references/notes/galaxy-collection-semantics.md`: Research note copied verbatim into the bundle. Preserve Galaxy collection typing and map-over/reduction semantics in the gxformat2 skeleton. Use when: creating workflow inputs, outputs, and placeholder connections involving collections.
- `references/notes/galaxy-collection-semantics.upstream.myst`: Companion file copied verbatim into the bundle. Sibling of `references/notes/galaxy-collection-semantics.md`; read it where that note directs.
- `references/notes/galaxy-collection-semantics.yml`: Companion file copied verbatim into the bundle. Sibling of `references/notes/galaxy-collection-semantics.md`; read it where that note directs.
- `references/notes/galaxy-workflow-testability-design.md`: Research note copied verbatim into the bundle. Choose stable workflow input/output labels, testable checkpoint outputs, and fixture-compatible workflow interfaces while drafting the skeleton. Use when: the template decides workflow inputs, workflow outputs, promoted checkpoints, or collection output identifiers that future tests will need to address.

## Validation

- Validate `galaxy-workflow-draft.gxwf.yml` for artifact `galaxy-workflow-draft` against the galaxy-workflow-draft schema when a validator is available.

## Procedure

Read the original free-form source artifact if present, the free-form summary Markdown document, and the freeform-to-Galaxy interface and data-flow briefs. Emit a gxformat2 skeleton with workflow inputs, workflow outputs, placeholder steps, the connections between them, and TODO slots for later implementation skills.

The free-form summary does not have a concrete schema yet; treat it as Markdown. Treat the prior-step index as the working context: source transcript or paper, free-form summary, freeform-to-Galaxy interface and data-flow briefs, and any open questions carried forward.

Topology is this skill's job to settle. The output must be concrete gxformat2. Workflow inputs with their final collection shapes and formats, workflow outputs, the step set, the producer→consumer edge graph, branches, and `when:` guards are all decided here. The upstream freeform-to-Galaxy interface and data-flow briefs guide those decisions, but if they hedge or leave a topology choice open, this skill makes the call from source evidence, IWC exemplars, and pattern pages. Never emit a topology `TODO`. What is deferred to per-step authoring is strictly wrapper-tier: `tool_id`, `tool_version`, `tool_shed_repository`, `tool_state`, and the wrapper-determined port names that surface in `in:` / `out:` / `outputSource`. Capture deferred intent in the `_plan_*` family (`_plan_state`, `_plan_context`, `_plan_in`, `_plan_out`) so the per-step skill has the source evidence and constraints it needs.
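
The split between settled topology and deferred wrapper detail might look roughly like the fragment below. It is a sketch under assumptions: the labels (`raw_reads`, `trimmed_reads`), the step name `trim_adapters`, the provisional port names, and the exact spelling of the `TODO` sentinel are illustrative. The authoritative forms are the ones in galaxy-workflow-draft-format and the bundled schema.

```yaml
class: GalaxyWorkflow

# Topology tier: decided by this skill, never TODO.
inputs:
  raw_reads:
    type: collection
    collection_type: list:paired
    format: fastqsanger.gz

outputs:
  trimmed_reads:
    # The producing step is fixed; the port name after the slash is
    # provisional until the wrapper is chosen.
    outputSource: trim_adapters/trimmed

steps:
  trim_adapters:
    tool_id: TODO            # wrapper tier: deferred to per-step authoring
    in:
      reads: raw_reads       # edge is settled; port name `reads` is a guess
    out: [trimmed]
    _plan_in: "Paired FASTQ collection from workflow input raw_reads."
    _plan_out: "Adapter-trimmed paired reads, promoted as trimmed_reads."
    _plan_state: "Default settings; adapter sequence not stated in the source."
    _plan_context: "Hypothetical citation: Methods, read preprocessing paragraph."
```

Nothing in the `inputs`, `outputs`, or edge structure is left open, so a later per-step skill only has to resolve the wrapper and rename ports, not rewire the graph.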

Defer thoughtfully. When research surfaces a Foundry pattern page that names the exact recipe — a galaxy-collection-patterns reshape, a conditional-run-optional-step gate, a galaxy-tabular-patterns filter — fill the step in as completely as the pattern allows: concrete `tool_id`, parameters, port names from the pattern's worked example. Pattern pages encode resolved choices; emitting `TODO` over a covered recipe discards real evidence the per-step skill cannot recover. Free-form sources will rarely give you enough to fill a domain tool step concretely — defer those wrappers and parameters, but cite the originating paper section, interview answer, figure, or supplementary table in `_plan_context` and use `_plan_state` to record vague intent ("default settings", "stranded reverse if mentioned, else unstranded") so the per-step skill knows the evidence ceiling.
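
The contrast between a pattern-covered step and a deferred domain step can be sketched as follows. This is an illustrative fragment, not output from the skill: it assumes a step named `align_reads` is defined elsewhere, assumes a bundled pattern page covers filtering failed datasets with Galaxy's built-in `__FILTER_FAILED_DATASETS__` collection operation, and uses a made-up citation and provisional port names for the deferred step.

```yaml
steps:
  # Covered by a pattern page: fill in as completely as the pattern allows.
  drop_failed:
    tool_id: __FILTER_FAILED_DATASETS__
    in:
      input: align_reads/aligned
    out: [output]

  # Domain tool the source describes only loosely: defer the wrapper,
  # but keep the evidence and the vague intent.
  count_features:
    tool_id: TODO
    in:
      alignments: drop_failed/output   # provisional port name
    out: [counts]                      # provisional port name
    _plan_context: "Hypothetical citation: Methods 2.3 and Supplementary Table S2."
    _plan_state: "Default settings; stranded reverse if mentioned, else unstranded."
```

The second step records how little the source actually says, so the per-step skill does not mistake a guess for a documented parameter.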

The output shape is gxformat2 with wrapper-tier relaxations and `_plan_state` / `_plan_context` / `_plan_in` / `_plan_out` on each tool step; see galaxy-workflow-draft-format. Open refinement work for those planning fields is tracked in `refinement.md`.

## Runtime Notes

- Do not read Foundry source files at runtime; use only files packaged in this skill bundle and user-supplied artifacts.
- Preserve declared artifact filenames unless the user or harness supplies explicit paths.
- Carry unresolved assumptions into the output artifact instead of silently inventing missing source evidence.