Agent Skill · cast

debug-galaxy-workflow-output

Triage failing Galaxy run outputs; classify the failure surface and capture evidence before recommending repairs.

Install with Claude Code

/plugin marketplace add galaxyproject/foundry
/plugin install foundry-skills@galaxy-workflow-foundry

Then invoke as:

/foundry-skills:debug-galaxy-workflow-output

Install with Codex

codex plugin marketplace add galaxyproject/foundry
codex plugin add foundry-skills@galaxy-workflow-foundry

Then select with /skills or invoke explicitly as:

$debug-galaxy-workflow-output

Skill Bundle

/ packaged cast

attached files: 9
upfront: 0
on demand: 9
cast rev: n/a
validated: 0

Produces: 1 artifact.

Consumes: 1 artifact.

Artifact Contract

/ skill handoff

Produces

workflow-debug-report

Failure-surface classification with captured job/invocation/collection/assertion evidence and a recommended next step or reference-gap follow-up.

markdown · workflow-debug-report.md

Raw artifact contract

{
  "id": "workflow-debug-report",
  "kind": "markdown",
  "default_filename": "workflow-debug-report.md",
  "description": "Failure-surface classification with captured job/invocation/collection/assertion evidence and a recommended next step or reference-gap follow-up."
}

Consumes

workflow-test-result

Structured run handoff from run-workflow-test: Planemo result, invocation/job/artifact context, and the observed failure modality.

Raw artifact contract

{
  "id": "workflow-test-result",
  "description": "Structured run handoff from run-workflow-test: Planemo result, invocation/job/artifact context, and the observed failure modality.",
  "producers": [
    "run-workflow-test"
  ]
}

Attached Files

/ runtime references

Load on demand

research

galaxy-collection-semantics

packaged

Diagnose collection shape, mapping, reduction, and element-identifier mismatches in failed Galaxy runs.

Trigger: When a failing output is a collection, a mapped output, or an unexpectedly nested/flattened structure.

on-demand runtime verbatim corpus-observed deterministic 1.9 KB

bundle: references/notes/galaxy-collection-semantics.md
source: content/research/galaxy-collection-semantics.md

Preview md

---
type: research
subtype: component
title: "Galaxy collection semantics"
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-04-30
revised: 2026-05-05
revision: 3
ai_generated: false
license: MIT
license_file: LICENSES/galaxy.LICENSE
related_notes:
  - "[[galaxy-xsd]]"
  - "[[galaxy-collection-tools]]"
  - "[[galaxy-apply-rules-dsl]]"
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[nextflow-operators-to-galaxy-collection-recipes]]"
  - "[[galaxy-tool-job-failure-reference]]"
  - "[[galaxy-workflow-invocation-failure-reference]]"
  - "[[iwc-transformations-survey]]"
  - "[[galaxy-discover-datasets]]"
sources:
  - "https://github.com/galaxyproject/galaxy/blob/7765fae934fbfdee77e3be5f5b235e43735273ae/lib/galaxy/model/dataset_collections/types/collection_semantics.yml"
companions:
  - "galaxy-collection-semantics.yml"
  - "galaxy-collection-semantics.upstream.myst"
summary: "Vendored formal spec of Galaxy dataset-collection mapping/reduction semantics, with labeled examples and pinned test references."
---

> **Vendored from upstream**, pinned at SHA `7765fae`. Two files live next to this note:
>
> - `galaxy-collection-semantics.yml` — the structured source. **Agents and casting should consume this.** It carries the `tests:` blocks that pin concrete Galaxy test names; the rendered upstream view drops them.
> - `galaxy-collection-semantics.upstream.myst` — Galaxy's auto-generated MyST/LaTeX rendering of the YAML, vendored only so the human view below has something to render. Sync is manual.
>
> **When to consult:** authoring or reasoning about Molds and patterns that touch `data_collection` inputs, map-over / reduction shape changes, sub-collection mapping, `paired_or_unpaired`, or `sample_sheet`.

```vendored-myst
file: galaxy-collection-s
...

research

galaxy-collection-semantics

packaged

Diagnose collection shape, mapping, reduction, and element-identifier mismatches in failed Galaxy runs.

Trigger: When a failing output is a collection, a mapped output, or an unexpectedly nested/flattened structure.

on-demand runtime verbatim corpus-observed deterministic 33.4 KB

bundle: references/notes/galaxy-collection-semantics.upstream.myst
source: content/research/galaxy-collection-semantics.upstream.myst

Preview myst

# Collection Semantics

This document describes the semantics around working with Galaxy dataset collections.
In particular it describes how they operate within Galaxy tools and workflows.

:::{admonition} You Probably Don't Need to Read This
:class: caution

Any significantly sophisticated workflow language will have ways to collect data
into arrays or vectors or dictionaries and apply operations across this data (mapping)
or reduce the dimensionality of this data (reductions). Typically, this is explicitly
annotated with map functions or for loops. Galaxy however is designed to be a point
and click interface for connecting steps and running tools. It is important that steps
just connect and just do the most natural thing - and this is what Galaxy does.
This document just provides a mathematical formalism to that "what should just
intuitively work" that can be used to document test cases and help with implementation.
This is reference documentation not user documentation, Galaxy should just work.
:::

## Mapping

If a tool consumes a simple dataset parameter and produces a simple dataset parameter,
then any collection type may be "mapped over" the data input to that tool. The result of
that is the tool being applied to each element of the collection and "implicit collections"
being created from the outputs that are produced from those operations. Those implicit
collections have the same element identifiers in the same order as the input collection that is
mapped over. Each element of the implicit collections correspond to their own job and
Galaxy very naturally and intuitively parallelizes jobs without extra work from the user
and without any knowledge of the tool.


(BASIC_MAPPING_PAIRED)=
(BASIC_MAPPING_PAIRED_OR_UNPAIRED_PAIRED)=
(BASIC_MAPPING_PAIRED_OR_UNPAIRED_UN
...

research

galaxy-collection-semantics

packaged

Diagnose collection shape, mapping, reduction, and element-identifier mismatches in failed Galaxy runs.

Trigger: When a failing output is a collection, a mapped output, or an unexpectedly nested/flattened structure.

on-demand runtime verbatim corpus-observed deterministic 43.8 KB

bundle: references/notes/galaxy-collection-semantics.yml
source: content/research/galaxy-collection-semantics.yml

Preview yml

- doc: |
    # Collection Semantics

    This document describes the semantics around working with Galaxy dataset collections.
    In particular it describes how they operate within Galaxy tools and workflows.

    :::{admonition} You Probably Don't Need to Read This
    :class: caution

    Any significantly sophisticated workflow language will have ways to collect data
    into arrays or vectors or dictionaries and apply operations across this data (mapping)
    or reduce the dimensionality of this data (reductions). Typically, this is explicitly
    annotated with map functions or for loops. Galaxy however is designed to be a point
    and click interface for connecting steps and running tools. It is important that steps
    just connect and just do the most natural thing - and this is what Galaxy does.
    This document just provides a mathematical formalism to that "what should just
    intuitively work" that can be used to document test cases and help with implementation.
    This is reference documentation not user documentation, Galaxy should just work.
    :::

    ## Mapping

    If a tool consumes a simple dataset parameter and produces a simple dataset parameter,
    then any collection type may be "mapped over" the data input to that tool. The result of
    that is the tool being applied to each element of the collection and "implicit collections"
    being created from the outputs that are produced from those operations. Those implicit
    collections have the same element identifiers in the same order as the input collection that is
    mapped over. Each element of the implicit collections correspond to their own job and
    Galaxy very naturally and intuitively parallelizes jobs without extra work from the user
    and without any knowledge of the tool.

...

research

galaxy-tool-job-failure-reference

packaged

Interpret Galaxy job-level failure evidence including stdio rules, exit code, job messages, and output dataset state.

Trigger: When a failed workflow test includes errored jobs, tool stderr/stdout, non-zero exit codes, or red output datasets.

on-demand runtime verbatim corpus-observed deterministic 7.3 KB

bundle: references/notes/galaxy-tool-job-failure-reference.md
source: content/research/galaxy-tool-job-failure-reference.md

Preview md

---
type: research
subtype: component
title: "Galaxy tool and job failure reference"
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
related_notes:
  - "[[galaxy-workflow-invocation-failure-reference]]"
  - "[[planemo-workflow-test-architecture]]"
  - "[[galaxy-collection-semantics]]"
related_molds:
  - "[[implement-galaxy-tool-step]]"
  - "[[debug-galaxy-workflow-output]]"
sources:
  - "~/projects/repositories/galaxy/lib/galaxy/tools/__init__.py"
  - "~/projects/repositories/galaxy/lib/galaxy/tool_util/parser/xml.py"
  - "~/projects/repositories/galaxy/lib/galaxy/tool_util/output_checker.py"
  - "~/projects/repositories/galaxy/lib/galaxy/jobs"
  - "~/projects/repositories/galaxy/lib/galaxy/webapps/galaxy/api/jobs.py"
summary: "Reference for Galaxy tool stdio rules, job failure detection, job states, and job API failure surfaces."
---

# Galaxy Tool And Job Failure Reference

This is reference material, not a debug recipe. Use it to understand what Galaxy can know about a failed tool job and which API surfaces preserve that evidence.

## Model

Galaxy tool failure handling is layered:

- The tool wrapper defines expected failure semantics through `detect_errors`, `<stdio>`, exit-code checks, regex checks, and command strictness.
- The job runner executes the command and captures exit code plus tool/job stdout and stderr streams.
- Galaxy evaluates configured failure rules and records structured `job_messages`.
- The job reaches a terminal state, output datasets may become `error`, and dependent jobs may pause or fail later.
- Workflow invocation APIs summarize those jobs, but job APIs preserve the most detailed tool-level evidence.

## Tool Wrapper Failure Controls

Important wrapper con
...

research

galaxy-workflow-invocation-failure-reference

packaged

Interpret Galaxy invocation-level failure evidence including invocation state, structured messages, and step job summaries.

Trigger: When a failed workflow test has invocation failure, missing workflow outputs, cancelled/paused steps, subworkflow failures, or collection population errors.

on-demand runtime verbatim corpus-observed deterministic 7.7 KB

bundle: references/notes/galaxy-workflow-invocation-failure-reference.md
source: content/research/galaxy-workflow-invocation-failure-reference.md

Preview md

---
type: research
subtype: component
title: "Galaxy workflow invocation failure reference"
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
related_notes:
  - "[[galaxy-tool-job-failure-reference]]"
  - "[[planemo-workflow-test-architecture]]"
  - "[[galaxy-collection-semantics]]"
related_molds:
  - "[[run-workflow-test]]"
  - "[[debug-galaxy-workflow-output]]"
  - "[[validate-galaxy-workflow]]"
sources:
  - "~/projects/repositories/galaxy/lib/galaxy/schema/invocation.py"
  - "~/projects/repositories/galaxy/lib/galaxy/workflow/run.py"
  - "~/projects/repositories/galaxy/lib/galaxy/workflow/modules.py"
  - "~/projects/repositories/galaxy/lib/galaxy/webapps/galaxy/api/workflows.py"
summary: "Reference for Galaxy workflow invocation states, messages, failure reasons, and invocation API surfaces."
---

# Galaxy Workflow Invocation Failure Reference

This note describes workflow-level failure surfaces in Galaxy. It is separate from [[galaxy-tool-job-failure-reference]] because invocation state answers whether Galaxy could schedule and drive the workflow, while job state answers whether individual tool jobs succeeded.

## Invocation Versus Job Failure

Important distinction:

- Invocation state says whether Galaxy scheduled, cancelled, failed, or completed the workflow invocation.
- Job state says whether jobs produced by invocation steps succeeded or failed.
- Invocation messages explain scheduler/evaluation/cancellation problems.
- Step states usually describe scheduling progress, not actual job success, unless a legacy serialization mode substitutes job state.

A robust workflow test reference should inspect both invocation APIs and job APIs.

## Invocation States

Galaxy invocation states 
...

research

iwc-shortcuts-anti-patterns

packaged

Decide whether a proposed debug fix aligns with accepted IWC testing shortcuts or masks a real failure.

Trigger: When debugging suggests weakening assertions, widening deltas, switching to existence checks, or changing output labels.

on-demand runtime verbatim corpus-observed deterministic 23.9 KB

bundle: references/notes/iwc-shortcuts-anti-patterns.md
source: content/research/iwc-shortcuts-anti-patterns.md

Preview md

---
type: research
subtype: component
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-04-30
revised: 2026-05-03
revision: 2
ai_generated: true
related_notes:
  - "[[galaxy-workflow-testability-design]]"
  - "[[iwc-test-data-conventions]]"
  - "[[planemo-asserts-idioms]]"
  - "[[implement-galaxy-workflow-test]]"
  - "[[tests-format]]"
  - "[[iwc-conditionals-survey]]"
  - "[[iwc-map-over-lifecycle-survey]]"
  - "[[iwc-tabular-operations-survey]]"
  - "[[iwc-transformations-survey]]"
summary: "What IWC test suites cut corners on (accepted) vs what's a code smell — existence-only probes, sim_size deltas, image dim checks, label coupling."
---

# IWC test-suite shortcuts and anti-patterns

## Purpose

When an agent translates or authors a Galaxy workflow for IWC submission, the test suite it writes will be reviewed against IWC's *de facto* style — not against an idealized assertion ladder. That style routinely tolerates assertions that look weak in isolation. This note distinguishes the corner-cutting that is **normal and accepted** in the corpus from the patterns that an agent should treat as **smells** worth flagging.

This note owns accepted-vs-smell calls. For positive workflow-structure guidance behind label stability, checkpoint promotion, and collection identifier design, use [[galaxy-workflow-testability-design]].

Grounding: 115 `*-tests.yml` files under `workflow-fixtures/iwc-src/workflows/` (mirror of `galaxyproject/iwc`), prior synthesis in `galaxy-brain/vault/projects/workflow_state/skills/COMPONENT_GALAXY_WORKFLOW_TESTING.md`. Path citations below are relative to `iwc-src/workflows/` unless absolute.

## TL;DR rules of thumb

1. **Default to tolerant assertions.** `compare: sim_size` + `delta:`, `has_image_*` + `delta:`, `has_text` s
...

research

nextflow-operators-to-galaxy-collection-recipes

packaged

Trace collection output failures back to possibly lossy operator translations.

Trigger: When debugging wrong nesting, missing elements, branch merges, bad joins, or gather/reduction mismatches.

on-demand runtime verbatim corpus-observed deterministic 6.5 KB

bundle: references/notes/nextflow-operators-to-galaxy-collection-recipes.md
source: content/research/nextflow-operators-to-galaxy-collection-recipes.md

Preview md

---
type: research
subtype: component
title: "Nextflow operators to Galaxy collection recipes"
tags:
  - research/component
  - source/nextflow
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
related_notes:
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[galaxy-collection-semantics]]"
  - "[[galaxy-collection-tools]]"
  - "[[galaxy-apply-rules-dsl]]"
  - "[[iwc-transformations-survey]]"
  - "[[iwc-tabular-operations-survey]]"
  - "[[galaxy-data-flow-draft-contract]]"
  - "[[iwc-map-over-lifecycle-survey]]"
  - "[[nextflow-patterns]]"
related_molds:
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[implement-galaxy-tool-step]]"
  - "[[debug-galaxy-workflow-output]]"
sources:
  - "https://github.com/galaxyproject/foundry/issues/53"
summary: "Classifies common Nextflow operators as Galaxy wiring, collection semantics, explicit steps, or review triggers."
---

# Nextflow Operators To Galaxy Collection Recipes

Most Nextflow operators are not Galaxy tools. Translate them first as source-side data-flow intent, then decide whether the Galaxy representation is simple wiring, collection semantics, an explicit Galaxy step, or a user-review checkpoint.

## Decision Vocabulary

| Label | Meaning |
|---|---|
| `channel-only rewiring` | The operator disappears into Galaxy connections, labels, branch wiring, or output selection. |
| `Galaxy collection semantics` | Translation relies on collection identifiers, collection type, map-over, reduction, or nesting behavior. |
| `explicit Galaxy step` | Add a collection-operation, tabular, text-processing, or domain tool step. |
| `user review` | Translation is likely lossy or semantically ambiguous. |

## Operator Recipes

| Nextflow operator | Galaxy recipe | Class | Confi
...

research

planemo-asserts-idioms

packaged

Classify whether a failure is an assertion-choice problem, tolerance problem, or real workflow-output regression.

Trigger: When Planemo reports output assertion failures or generated tests are too strict/too weak.

on-demand runtime verbatim corpus-observed deterministic 18.2 KB

bundle: references/notes/planemo-asserts-idioms.md
source: content/research/planemo-asserts-idioms.md

Preview md

---
type: research
subtype: component
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-04-30
revised: 2026-05-11
revision: 6
ai_generated: true
related_notes:
  - "[[galaxy-workflow-testability-design]]"
  - "[[iwc-test-data-conventions]]"
  - "[[iwc-shortcuts-anti-patterns]]"
  - "[[implement-galaxy-workflow-test]]"
  - "[[tests-format]]"
  - "[[planemo-workflow-test-architecture]]"
  - "[[validate-tests]]"
  - "[[iwc-tabular-operations-survey]]"
  - "[[galaxy-discover-datasets]]"
summary: "Decision and idiom guide for picking planemo workflow-test assertions: which family per output type, how to size tolerances, when to validate."
---

# Planemo asserts: idiom and decision guide

Companion to [[iwc-test-data-conventions]] (input shapes), [[galaxy-workflow-testability-design]] (workflow structure before test YAML exists), and [[iwc-shortcuts-anti-patterns]] (what's accepted vs smell). This note is forward-looking: when authoring a new `<workflow>-tests.yml`, which assertion family fits which output, and what the recommended tolerances and operators are.

The **vocabulary itself is not restated here** — every assertion's parameter list, types, defaults, required fields, and Python docstring is rendered from the test-format JSON Schema at [[tests-format]]. Assertion names below deep-link into that page (e.g. [[tests-format#has_text_model|has_text]] jumps straight to that `$def`).

## 1. Choose by output type

The single most useful decision table. Pick the row that matches the file format the workflow emits; default to the recommended assertion family.

| Output type | Default assertion family | Why | Fallback |
|---|---|---|---|
| **Plain text reports / logs** (FastQC summary, MultiQC text section) | [[tests-format#has_text_model|has_text]] (su
...

research

planemo-workflow-test-architecture

packaged

Locate which Planemo artifact or Galaxy API surface preserves the failure evidence.

Trigger: When Planemo output is ambiguous, structured test JSON is available, or rerunning can be avoided by inspecting an existing invocation.

on-demand runtime verbatim corpus-observed deterministic 8.2 KB

bundle: references/notes/planemo-workflow-test-architecture.md
source: content/research/planemo-workflow-test-architecture.md

Preview md

---
type: research
subtype: component
title: "Planemo workflow-test architecture"
tags:
  - research/component
  - tool/planemo
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-11
revision: 3
ai_generated: true
related_notes:
  - "[[galaxy-workflow-testability-design]]"
  - "[[galaxy-tool-job-failure-reference]]"
  - "[[galaxy-workflow-invocation-failure-reference]]"
  - "[[planemo-asserts-idioms]]"
related_molds:
  - "[[run-workflow-test]]"
  - "[[debug-galaxy-workflow-output]]"
  - "[[implement-galaxy-workflow-test]]"
sources:
  - "~/projects/repositories/planemo/planemo/commands/cmd_test.py"
  - "~/projects/repositories/planemo/planemo/commands/cmd_run.py"
  - "~/projects/repositories/planemo/planemo/galaxy/activity.py"
  - "~/projects/repositories/planemo/planemo/galaxy/invocations"
  - "~/projects/repositories/planemo/planemo/galaxy/config.py"
summary: "Reference for Planemo workflow test/run architecture, Galaxy modes, API polling, and noisy failure boundaries."
---

# Planemo Workflow-Test Architecture

This note describes Planemo architecture relevant to workflow tests and workflow runs. It is reference material for Molds that need to run tests or interpret Planemo artifacts, not a command-selection recipe.

## Main Commands

| User action | Command | Core behavior |
|---|---|---|
| Full workflow test | `planemo test <workflow>` ([[planemo-test]]) | Finds test definitions, starts or targets Galaxy, stages inputs, invokes workflow, checks assertions, writes reports. |
| Direct run | `planemo run <workflow> <job.yml>` | Runs one workflow/job pair and can download outputs without assertion checks. |
| Recheck assertions | `planemo workflow_test_on_invocation <tests.yml> <invocation_id>` ([[planemo-workflow_test_on_invocation]]) | Runs test asser
...

SKILL.md


# debug-galaxy-workflow-output

Follow the procedure below and use the artifact/reference sections as the runtime contract.

## When To Use

- Triage failing Galaxy run outputs; classify the failure surface and capture evidence before recommending repairs.

## Inputs

- Read artifact `workflow-test-result`, produced by `run-workflow-test`: a structured run handoff containing the Planemo result, invocation/job/artifact context, and the observed failure modality.

## Outputs

- Write artifact `workflow-debug-report` as `workflow-debug-report.md`. Format: `markdown`. Failure-surface classification with captured job/invocation/collection/assertion evidence and a recommended next step or reference-gap follow-up.
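
As an illustration of the output contract above, the following is a minimal sketch of how the report body could be assembled as markdown. The `DebugReport` class, its field names, and the section headings are assumptions made for this example; the skill declares only the artifact id, filename, format, and description, not an internal layout.

```python
from dataclasses import dataclass, field


@dataclass
class DebugReport:
    """Hypothetical in-memory shape for the report; not a Foundry schema."""

    failure_surface: str
    evidence: dict[str, str]
    next_step: str
    unresolved_assumptions: list[str] = field(default_factory=list)


def render_report(report: DebugReport) -> str:
    """Render the report as markdown text (section names are illustrative)."""
    lines = ["# Workflow debug report", ""]
    lines += ["## Failure surface", "", report.failure_surface, ""]
    lines += ["## Evidence", ""]
    lines += [f"- {name}: {value}" for name, value in report.evidence.items()]
    lines += ["", "## Recommended next step", "", report.next_step, ""]
    # Unresolved assumptions are carried into the artifact, not dropped.
    if report.unresolved_assumptions:
        lines += ["## Unresolved assumptions", ""]
        lines += [f"- {item}" for item in report.unresolved_assumptions]
        lines.append("")
    return "\n".join(lines)


# Illustrative values only; real evidence comes from the run handoff.
report = DebugReport(
    failure_surface="tool/job failure",
    evidence={"job state": "error", "exit code": "1"},
    next_step="Check the wrapper's failure semantics against the captured stderr.",
    unresolved_assumptions=["Invocation messages were not available in the handoff."],
)
markdown_text = render_report(report)
```

The rendered text would then be written to `workflow-debug-report.md`, unless the user or harness supplies an explicit path.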

## Required Tools

- None declared. The procedure should not assume that external CLIs are present.

## Load Upfront

- None declared.

## Load On Demand

- `references/notes/galaxy-collection-semantics.md`: Research note copied verbatim into the bundle. Diagnose collection shape, mapping, reduction, and element-identifier mismatches in failed Galaxy runs. Use when: a failing output is a collection, a mapped output, or an unexpectedly nested/flattened structure.
- `references/notes/galaxy-collection-semantics.upstream.myst`: Companion file copied verbatim into the bundle. Sibling of `references/notes/galaxy-collection-semantics.md`; read it where that note directs.
- `references/notes/galaxy-collection-semantics.yml`: Companion file copied verbatim into the bundle. Sibling of `references/notes/galaxy-collection-semantics.md`; read it where that note directs.
- `references/notes/galaxy-tool-job-failure-reference.md`: Research note copied verbatim into the bundle. Interpret Galaxy job-level failure evidence including stdio rules, exit code, job messages, and output dataset state. Use when: a failed workflow test includes errored jobs, tool stderr/stdout, non-zero exit codes, or red output datasets.
- `references/notes/galaxy-workflow-invocation-failure-reference.md`: Research note copied verbatim into the bundle. Interpret Galaxy invocation-level failure evidence including invocation state, structured messages, and step job summaries. Use when: a failed workflow test has invocation failure, missing workflow outputs, cancelled/paused steps, subworkflow failures, or collection population errors.
- `references/notes/iwc-shortcuts-anti-patterns.md`: Research note copied verbatim into the bundle. Decide whether a proposed debug fix aligns with accepted IWC testing shortcuts or masks a real failure. Use when: debugging suggests weakening assertions, widening deltas, switching to existence checks, or changing output labels.
- `references/notes/nextflow-operators-to-galaxy-collection-recipes.md`: Research note copied verbatim into the bundle. Trace collection output failures back to possibly lossy operator translations. Use when: debugging wrong nesting, missing elements, branch merges, bad joins, or gather/reduction mismatches.
- `references/notes/planemo-asserts-idioms.md`: Research note copied verbatim into the bundle. Classify whether a failure is an assertion-choice problem, tolerance problem, or real workflow-output regression. Use when: Planemo reports output assertion failures or generated tests are too strict/too weak.
- `references/notes/planemo-workflow-test-architecture.md`: Research note copied verbatim into the bundle. Locate which Planemo artifact or Galaxy API surface preserves the failure evidence. Use when: Planemo output is ambiguous, structured test JSON is available, or rerunning can be avoided by inspecting an existing invocation.

## Validation

- None declared.

## Procedure

Triage a failing Galaxy workflow test. Take the structured handoff from run-workflow-test, classify the failure surface before proposing any repair, and capture the reference evidence the surface requires. When the failure cannot be classified from existing references, recommend a focused follow-up rather than converting uncertainty into a guessed fix.

Classify before repairing. The same red output can be a tool/job failure, a workflow invocation failure, a collection-output mismatch, a missing workflow output, or an assertion mismatch — and each routes to a different reference surface and a different fix. Locate where the evidence lives first (planemo-workflow-test-architecture).
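
The routing described above can be sketched as a lookup from failure surface to the packaged references worth loading. The surface labels (`tool_job_failure` and so on) and the `references_for` helper are hypothetical names chosen for this example. The file paths are the bundle paths listed under Load On Demand, and the pairing of missing workflow outputs with the invocation reference follows that note's stated trigger.

```python
# Surface labels are illustrative; the paths are the bundle's on-demand references.
EVIDENCE_LOCATOR = "references/notes/planemo-workflow-test-architecture.md"

REFERENCES_BY_SURFACE = {
    "tool_job_failure": [
        "references/notes/galaxy-tool-job-failure-reference.md",
    ],
    "invocation_failure": [
        "references/notes/galaxy-workflow-invocation-failure-reference.md",
    ],
    "collection_output_mismatch": [
        "references/notes/galaxy-collection-semantics.md",
        "references/notes/nextflow-operators-to-galaxy-collection-recipes.md",
    ],
    "missing_workflow_output": [
        "references/notes/galaxy-workflow-invocation-failure-reference.md",
    ],
    "assertion_mismatch": [
        "references/notes/planemo-asserts-idioms.md",
        "references/notes/iwc-shortcuts-anti-patterns.md",
    ],
}


def references_for(surface: str) -> list[str]:
    """Return the evidence locator first, then any surface-specific notes."""
    # An unknown surface still gets the locator, so evidence can be found
    # before anything else is decided.
    return [EVIDENCE_LOCATOR, *REFERENCES_BY_SURFACE.get(surface, [])]


assertion_refs = references_for("assertion_mismatch")
unknown_refs = references_for("something_else")
```

Putting the evidence locator first mirrors the advice to find where the evidence lives before reading surface-specific material.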

### Sequence

1. **Classify the first failure surface.** From the run's structured result, decide whether the first failure is a tool/job failure, a workflow invocation failure, a collection-output mismatch, a missing workflow output, or an assertion mismatch. Classify before proposing repairs.
2. **Capture job-failure evidence.** When a job is in `error`/`failed`/`stopped`, record job id, tool id, exit code, job messages, the stdout/stderr distinction, and output dataset state per galaxy-tool-job-failure-reference; check whether the wrapper's failure semantics already explain it.
3. **Capture invocation-failure evidence.** When the invocation state or messages indicate scheduling, materialization, cancellation, conditional, or output-resolution failure, record invocation state, the structured message reason, the affected step, any subworkflow path, and the jobs summary per galaxy-workflow-invocation-failure-reference; note whether Planemo surfaced or hid the relevant Galaxy API detail.
4. **Trace collection mismatches.** When a failing output is a collection or mapped output, diagnose shape, mapping, reduction, and element-identifier mismatches with galaxy-collection-semantics; for workflows translated from Nextflow, trace wrong nesting / missing elements / bad joins back to possibly-lossy operator translations via nextflow-operators-to-galaxy-collection-recipes.
5. **Read assertion failures honestly.** When the failure is an assertion, use planemo-asserts-idioms to decide whether it is an assertion-choice/tolerance problem or a real output regression. Before weakening an assertion, widening a delta, or switching to an existence check, confirm against iwc-shortcuts-anti-patterns that the relaxation is an accepted IWC shortcut and not masking a real failure.
6. **Discover reference gaps.** When the failure cannot be classified confidently from the references above, recommend a focused follow-up — reference documentation, pattern capture, API verification, or eval coverage — rather than emitting a repair recipe built on a guess.
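
Step 1 and the fallback in step 6 can be sketched as a small classifier over the run handoff. Everything about the input shape here is an assumption: the `workflow-test-result` artifact's field names are not specified in this skill, so keys such as `jobs`, `invocation`, `missing_outputs`, `collection_mismatches`, and `assertion_failures` are hypothetical. The order of the checks is likewise one plausible reading of "first failure surface", not a documented rule.

```python
from typing import Any

# Job states named in step 2 of the sequence.
FAILED_JOB_STATES = {"error", "failed", "stopped"}
# Assumed subset of invocation states that signal failure.
FAILED_INVOCATION_STATES = {"failed", "cancelled"}


def classify_first_failure(result: dict[str, Any]) -> str:
    """Return a surface label, or 'unclassified' when nothing matches."""
    jobs = result.get("jobs") or []
    if any(job.get("state") in FAILED_JOB_STATES for job in jobs):
        return "tool_job_failure"

    invocation = result.get("invocation") or {}
    if invocation.get("state") in FAILED_INVOCATION_STATES or invocation.get("messages"):
        return "invocation_failure"

    if result.get("missing_outputs"):
        return "missing_workflow_output"
    if result.get("collection_mismatches"):
        return "collection_output_mismatch"
    if result.get("assertion_failures"):
        return "assertion_mismatch"

    # Step 6: do not guess; report a reference gap instead.
    return "unclassified"


# Illustrative handoffs; real ones come from run-workflow-test.
job_case = classify_first_failure({"jobs": [{"id": "j1", "state": "error"}]})
assertion_case = classify_first_failure(
    {"jobs": [{"id": "j1", "state": "ok"}], "assertion_failures": ["size delta exceeded"]}
)
gap_case = classify_first_failure({"jobs": [], "invocation": {"state": "scheduled"}})
```

An `unclassified` result should lead to a focused follow-up recommendation in the report, not a repair recipe.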

## Runtime Notes

- Do not read Foundry source files at runtime; use only files packaged in this skill bundle and user-supplied artifacts.
- Preserve declared artifact filenames unless the user or harness supplies explicit paths.
- Carry unresolved assumptions into the output artifact instead of silently inventing missing source evidence.