Agent Skill · cast

run-workflow-test

Execute a workflow's tests via Planemo; emit structured pass/fail and outputs.

Install with Claude Code

/plugin marketplace add galaxyproject/foundry
/plugin install foundry-skills@galaxy-workflow-foundry

Then invoke as:

/foundry-skills:run-workflow-test

Install with Codex

codex plugin marketplace add galaxyproject/foundry
codex plugin add foundry-skills@galaxy-workflow-foundry

Then select with /skills or invoke explicitly as:

$run-workflow-test

Skill Bundle

/ packaged cast

attached files: 6
upfront: 2
on demand: 4
cast rev: 1
validated: 0

Produces: 1 artifact.

Artifact Contract

/ skill handoff

Produces

workflow-test-result

Structured pass/fail plus captured evidence — Planemo result, invocation/history/workflow ids, artifact paths, Galaxy mode, and (on failure) the observed modality and next reference surface — for debug-galaxy-workflow-output.

jsonworkflow-test-result.json

Raw artifact contract

{
  "id": "workflow-test-result",
  "kind": "json",
  "default_filename": "workflow-test-result.json",
  "description": "Structured pass/fail plus captured evidence — Planemo result, invocation/history/workflow ids, artifact paths, Galaxy mode, and (on failure) the observed modality and next reference surface — for debug-galaxy-workflow-output."
}

Attached Files

/ runtime references

Load upfront

cli-tool

planemo

packaged

Runtime that executes the Galaxy or CWL workflow test; install before driving the harness.

upfront runtime verbatim corpus-observed deterministic 1.9 KB

bundle: references/cli/planemo.md
source: content/cli/planemo/index.md

Preview md

---
type: cli-tool
tool: planemo
origin: pypi
package: planemo
package_version: "0.75.45"
invoke: planemo
invoke_fallback: "uvx --from planemo==0.75.45 planemo"
availability_check: "planemo --version"
docs_url: "https://planemo.readthedocs.io/"
tags:
  - cli-tool
  - cli/planemo
status: draft
created: 2026-05-10
revised: 2026-07-20
revision: 5
ai_generated: true
summary: "Galaxy tool/workflow runtime testing CLI; used by run-workflow-test and friends."
---

# planemo

Galaxy's runtime testing and authoring CLI. Foundry Molds invoke `planemo test`, `planemo lint`, and friends for end-to-end workflow validation; the cast skill consumes structured JSON output from `--test_output_json` and validates it against [[planemo-test-report]].

## Pin

`package_version` pins to the released `planemo==0.75.45` from PyPI — base upstream planemo, no fork.

[galaxyproject/planemo#1636](https://github.com/galaxyproject/planemo/pull/1636) merged 2026-05-14 and first shipped in released **0.75.42**, so `planemo cli_metadata` and `planemo output_schema` are available in every release at or above this pin. Every Foundry consumer runs off this base pin: the workflow-test phases ([[run-workflow-test]], [[implement-galaxy-workflow-test]]) which need only `planemo test --test_output_json`, the convergence loop in [[convert-nfcore-module-to-galaxy-tool]], and the vendored-artifact regeneration story (`packages/planemo-cli-meta/`, `packages/planemo-test-report-schema/`). No fork pin is required.

## Install

```sh
uvx --from planemo==0.75.45 planemo --version
```

For a persistent install:

```sh
uv tool install planemo==0.75.45
```

Contributor laptops only need planemo when **regenerating** vendored artifacts (the schema JSON in `packages/planemo-test-report-schema/`, the planemo CLI manual page
...

schema

tests-format

packaged

Validate Galaxy workflow test files before starting a Planemo or Galaxy-backed execution.

upfront runtime verbatim corpus-observed deterministic 201.1 KB

bundle: references/schemas/tests-format.schema.json
source: package://@galaxy-foundry/foundry#testsFormatSchema

Preview json

{
  "$defs": {
    "Collection": {
      "additionalProperties": false,
      "properties": {
        "class": {
          "const": "Collection",
          "title": "Class",
          "type": "string"
        },
        "collection_type": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Collection Type"
        },
        "elements": {
          "anyOf": [
            {
              "items": {
                "oneOf": [
                  {
                    "oneOf": [
                      {
                        "$ref": "#/$defs/LocationFile"
                      },
                      {
                        "$ref": "#/$defs/PathFile"
                      },
                      {
                        "$ref": "#/$defs/ContentsFile"
                      },
                      {
                        "$ref": "#/$defs/CompositeDataFile"
                      }
                    ]
                  },
                  {
                    "$ref": "#/$defs/Collection"
                  }
                ]
              },
              "type": "array"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Elements"
        },
        "identifier": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Identifier"
        },
        "name": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Name"
        },
        "rows": {
          "anyOf": [
            {
              "additionalProperties": {
                "items": {},
                "type": "array"
              },
              "type": "object"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Rows"
        }
      },
      "required": [
        "class"
      ],
      "title": "Collection",
      "type": "object"
    },
    "CollectionAttributes": {
      "additionalProperties": false,
...

Load on demand

cli-command

validate-tests

packaged

Run static schema and workflow-label checks before expensive workflow execution.

Trigger: Before invoking Planemo when a Galaxy workflow test file is present.

on-demand runtime sidecar corpus-observed deterministic 2.6 KB

bundle: references/cli/validate-tests.json
source: content/cli/gxwf/validate-tests.md

Preview json

{
  "type": "cli-command",
  "tool": "gxwf",
  "command": "validate-tests",
  "summary": "Validate Galaxy workflow test files and optionally cross-check labels against their workflow.",
  "source_path": "content/cli/gxwf/validate-tests.md",
  "source_revision": 3,
  "package": "@galaxy-tool-util/cli",
  "description": "Validate a workflow-test file (*-tests.yml, *.gxwf-tests.yml) against the Galaxy Tests schema",
  "synopsis": "gxwf validate-tests [options] <file>",
  "args": [
    {
      "raw": "file",
      "name": "file",
      "required": true,
      "variadic": false,
      "description": "Tests file (*-tests.yml)"
    }
  ],
  "options": [
    {
      "flags": "--json",
      "name": "json",
      "description": "Output structured JSON report",
      "takesArgument": false,
      "optionalArgument": false,
      "negatable": false
    },
    {
      "flags": "--workflow <path>",
      "name": "workflow",
      "description": "Cross-check job inputs + output assertions against a workflow (.ga / .gxwf.yml)",
      "takesArgument": true,
      "argumentPlaceholder": "<path>",
      "optionalArgument": false,
      "negatable": false
    }
  ],
  "body": "# `gxwf validate-tests`\n\nValidate a Galaxy workflow test file (`*-tests.yml` or `*.gxwf-tests.yml`) against the Galaxy tests schema. Optionally cross-check the test file against a workflow so missing input labels, missing output labels, and type mismatches fail before a slow Planemo run.\n\n`<file>` is a workflow test YAML file, usually named `<workflow>-tests.yml` or `<workflow>.gxwf-tests.yml`.\n\n## Output\n\nDefault output is human-readable validation diagnostics.\n\nWith `--json`, the command emits a structured report describing schema errors and, when `--workflow` is supplied, workflow-coherence errors such as missing labels or incompatible input values.\n\n## Examples\n\n```bash\ngxwf validate-tests workflow-tests.yml\ngxwf validate-tests workflow-tests.yml --json\ngxwf validate-tests workflow-tests.yml --workflow workflow.gxwf.yml --json\n```\n\n## Gotchas\n\n- This is the cheap static gate before Planemo. It does not execute the workflow and does not prove assertions pass on real outputs.\n- Use `--workflow` whenever the workflow file is available. Schema-valid tests can still reference stale input/output labels after workflow edits.\n- Run this before `planemo workflow_test_on_invocation` or 
...

research

galaxy-workflow-invocation-failure-reference

packaged

Preserve invocation identifiers and state summaries needed to inspect workflow-level runtime failures after Planemo returns.

Trigger: When Planemo reports a failed, cancelled, missing-output, or ambiguous workflow invocation result.

on-demand runtime verbatim corpus-observed deterministic 7.7 KB

bundle: references/notes/galaxy-workflow-invocation-failure-reference.md
source: content/research/galaxy-workflow-invocation-failure-reference.md

Preview md

---
type: research
subtype: component
title: "Galaxy workflow invocation failure reference"
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
related_notes:
  - "[[galaxy-tool-job-failure-reference]]"
  - "[[planemo-workflow-test-architecture]]"
  - "[[galaxy-collection-semantics]]"
related_molds:
  - "[[run-workflow-test]]"
  - "[[debug-galaxy-workflow-output]]"
  - "[[validate-galaxy-workflow]]"
sources:
  - "~/projects/repositories/galaxy/lib/galaxy/schema/invocation.py"
  - "~/projects/repositories/galaxy/lib/galaxy/workflow/run.py"
  - "~/projects/repositories/galaxy/lib/galaxy/workflow/modules.py"
  - "~/projects/repositories/galaxy/lib/galaxy/webapps/galaxy/api/workflows.py"
summary: "Reference for Galaxy workflow invocation states, messages, failure reasons, and invocation API surfaces."
---

# Galaxy Workflow Invocation Failure Reference

This note describes workflow-level failure surfaces in Galaxy. It is separate from [[galaxy-tool-job-failure-reference]] because invocation state answers whether Galaxy could schedule and drive the workflow, while job state answers whether individual tool jobs succeeded.

## Invocation Versus Job Failure

Important distinction:

- Invocation state says whether Galaxy scheduled, cancelled, failed, or completed the workflow invocation.
- Job state says whether jobs produced by invocation steps succeeded or failed.
- Invocation messages explain scheduler/evaluation/cancellation problems.
- Step states usually describe scheduling progress, not actual job success, unless a legacy serialization mode substitutes job state.

A robust workflow test reference should inspect both invocation APIs and job APIs.

## Invocation States

Galaxy invocation states 
...

research

planemo-asserts-idioms

packaged

Interpret assertion failures and choose the right fast inner-loop command before full reruns.

Trigger: When a workflow test file exists and the task is to run, iterate, or classify its test assertions.

on-demand runtime verbatim corpus-observed deterministic 18.2 KB

bundle: references/notes/planemo-asserts-idioms.md
source: content/research/planemo-asserts-idioms.md

Preview md

---
type: research
subtype: component
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-04-30
revised: 2026-05-11
revision: 6
ai_generated: true
related_notes:
  - "[[galaxy-workflow-testability-design]]"
  - "[[iwc-test-data-conventions]]"
  - "[[iwc-shortcuts-anti-patterns]]"
  - "[[implement-galaxy-workflow-test]]"
  - "[[tests-format]]"
  - "[[planemo-workflow-test-architecture]]"
  - "[[validate-tests]]"
  - "[[iwc-tabular-operations-survey]]"
  - "[[galaxy-discover-datasets]]"
summary: "Decision and idiom guide for picking planemo workflow-test assertions: which family per output type, how to size tolerances, when to validate."
---

# Planemo asserts: idiom and decision guide

Companion to [[iwc-test-data-conventions]] (input shapes), [[galaxy-workflow-testability-design]] (workflow structure before test YAML exists), and [[iwc-shortcuts-anti-patterns]] (what's accepted vs smell). This note is forward-looking: when authoring a new `<workflow>-tests.yml`, which assertion family fits which output, and what the recommended tolerances and operators are.

The **vocabulary itself is not restated here** — every assertion's parameter list, types, defaults, required fields, and Python docstring is rendered from the test-format JSON Schema at [[tests-format]]. Assertion names below deep-link into that page (e.g. [[tests-format#has_text_model|has_text]] jumps straight to that `$def`).

## 1. Choose by output type

The single most useful decision table. Pick the row that matches the file format the workflow emits; default to the recommended assertion family.

| Output type | Default assertion family | Why | Fallback |
|---|---|---|---|
| **Plain text reports / logs** (FastQC summary, MultiQC text section) | [[tests-format#has_text_model|has_text]] (su
...

research

planemo-workflow-test-architecture

packaged

Run workflow tests while preserving Planemo structured artifacts, Galaxy mode, invocation id, history id, and API follow-up context.

Trigger: When choosing between managed Galaxy, external Galaxy, full test runs, existing invocation checks, or direct workflow runs.

on-demand runtime verbatim corpus-observed deterministic 8.2 KB

bundle: references/notes/planemo-workflow-test-architecture.md
source: content/research/planemo-workflow-test-architecture.md

Preview md

---
type: research
subtype: component
title: "Planemo workflow-test architecture"
tags:
  - research/component
  - tool/planemo
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-11
revision: 3
ai_generated: true
related_notes:
  - "[[galaxy-workflow-testability-design]]"
  - "[[galaxy-tool-job-failure-reference]]"
  - "[[galaxy-workflow-invocation-failure-reference]]"
  - "[[planemo-asserts-idioms]]"
related_molds:
  - "[[run-workflow-test]]"
  - "[[debug-galaxy-workflow-output]]"
  - "[[implement-galaxy-workflow-test]]"
sources:
  - "~/projects/repositories/planemo/planemo/commands/cmd_test.py"
  - "~/projects/repositories/planemo/planemo/commands/cmd_run.py"
  - "~/projects/repositories/planemo/planemo/galaxy/activity.py"
  - "~/projects/repositories/planemo/planemo/galaxy/invocations"
  - "~/projects/repositories/planemo/planemo/galaxy/config.py"
summary: "Reference for Planemo workflow test/run architecture, Galaxy modes, API polling, and noisy failure boundaries."
---

# Planemo Workflow-Test Architecture

This note describes Planemo architecture relevant to workflow tests and workflow runs. It is reference material for Molds that need to run tests or interpret Planemo artifacts, not a command-selection recipe.

## Main Commands

| User action | Command | Core behavior |
|---|---|---|
| Full workflow test | `planemo test <workflow>` ([[planemo-test]]) | Finds test definitions, starts or targets Galaxy, stages inputs, invokes workflow, checks assertions, writes reports. |
| Direct run | `planemo run <workflow> <job.yml>` | Runs one workflow/job pair and can download outputs without assertion checks. |
| Recheck assertions | `planemo workflow_test_on_invocation <tests.yml> <invocation_id>` ([[planemo-workflow_test_on_invocation]]) | Runs test asser
...

SKILL.md


# run-workflow-test

Follow the procedure below and use the artifact/reference sections as the runtime contract.

## When To Use

- Execute a workflow's tests via Planemo; emit structured pass/fail and outputs.

## Inputs

- No upstream artifact inputs declared. See the procedure for user-supplied runtime inputs.

## Outputs

- Write artifact `workflow-test-result` as `workflow-test-result.json`. Format: `json`. Structured pass/fail plus captured evidence — Planemo result, invocation/history/workflow ids, artifact paths, Galaxy mode, and (on failure) the observed modality and next reference surface — for debug-galaxy-workflow-output.

## Required Tools

- **`gxwf`** (gxwf). `npm install -g '@galaxy-tool-util/cli@^1.8.1'`.
  Ephemeral run: `npx --yes --package @galaxy-tool-util/cli@1.8.1 gxwf`.
  Check: `gxwf --help | grep -q draft-validate`.
  Docs: https://github.com/jmchilton/galaxy-tool-util-ts/tree/main/packages/cli
- **`planemo`** (planemo). `uv tool install planemo==0.75.45` (or `pip install planemo==0.75.45`).
  Ephemeral run: `uvx --from planemo==0.75.45 planemo`.
  Check: `planemo --version`.
  Docs: https://planemo.readthedocs.io/
  Bundled reference: `references/cli/planemo.md`.

## Load Upfront

- `references/cli/planemo.md`: CLI tool reference copied verbatim into the bundle. Runtime that executes the Galaxy or CWL workflow test; install before driving the harness.
- `references/schemas/tests-format.schema.json`: Schema file copied verbatim into the bundle. Validate Galaxy workflow test files before starting a Planemo or Galaxy-backed execution.

## Load On Demand

- `references/cli/validate-tests.json`: CLI command reference packaged as a sidecar. Run static schema and workflow-label checks before expensive workflow execution. Use when: before invoking Planemo when a Galaxy workflow test file is present.
- `references/notes/galaxy-workflow-invocation-failure-reference.md`: Research note copied verbatim into the bundle. Preserve invocation identifiers and state summaries needed to inspect workflow-level runtime failures after Planemo returns. Use when: planemo reports a failed, cancelled, missing-output, or ambiguous workflow invocation result.
- `references/notes/planemo-asserts-idioms.md`: Research note copied verbatim into the bundle. Interpret assertion failures and choose the right fast inner-loop command before full reruns. Use when: a workflow test file exists and the task is to run, iterate, or classify its test assertions.
- `references/notes/planemo-workflow-test-architecture.md`: Research note copied verbatim into the bundle. Run workflow tests while preserving Planemo structured artifacts, Galaxy mode, invocation id, history id, and API follow-up context. Use when: choosing between managed Galaxy, external Galaxy, full test runs, existing invocation checks, or direct workflow runs.

## Validation

- None declared.

## Procedure

Execute an assembled workflow's test file via planemo and emit a structured pass/fail with the artifacts a debug pass needs. One invocation runs the test once, captures the evidence, and — on failure — classifies the failure modality and names the next reference surface to inspect. It does not repair anything; that is debug-galaxy-workflow-output's job.

### Sequence

1. **Validate before running.** When a test file is present, run validate-tests for the static schema and workflow-label checks first. A run is expensive; do not spend one on a test file that fails static validation.
2. **Pick the Galaxy mode.** Run against a Planemo-managed Galaxy or an existing/external Galaxy. **Planemo-managed** is the default and needs no pre-provisioned server: `planemo test` bootstraps its own Galaxy and installs the workflow's tools from the Tool Shed/conda — so the absence of a running Galaxy is not a reason to skip the run. Use an existing/external Galaxy only when you deliberately want to target one (shared instance, pre-installed heavy tools/reference data, specific credentials). The real cost of the managed path is install/runtime weight (large tools or multi-GB reference databases), which is a deliberate deferral, not an impossibility. Record which mode was used, how tools, workflows, and test data were staged, and the URLs or API credentials a follow-up inspection would need. The choice and its consequences are guided by planemo-workflow-test-architecture.
3. **Do not pin a Galaxy version.** Leave `--galaxy_branch` off the command unless the user or the harness supplied a specific branch, and never guess a `release_*` value — Planemo's default targets the newest Galaxy, which is the only version guaranteed to understand every construct a freshly authored workflow can contain. If a branch was supplied, record it in the output so a later failure can be read against it.
4. **Run and capture.** Drive `planemo test` with structured output enabled. Preserve the invocation id, history id, workflow id, the Planemo structured result, and any test-output artifact paths — these are the inputs the debug skill consumes.
5. **Classify on failure.** When the run exits non-zero or reports failed assertions, failed jobs, a failed invocation, missing outputs, or upload/staging problems, identify the observed failure modality and the single next reference surface to open: Planemo result, Galaxy job API, Galaxy invocation API, history contents, or the test assertion report. Use planemo-asserts-idioms to read assertion failures and galaxy-workflow-invocation-failure-reference to preserve invocation identifiers and state.
6. **Hand off.** Emit the structured summary — green, or red with modality + captured artifacts + the named next surface — for debug-galaxy-workflow-output.

Keep this skill's output a faithful record of what happened, not a diagnosis. Mislabeling a staging failure as an assertion failure here sends the debug pass to the wrong reference surface.

## Runtime Notes

- Do not read Foundry source files at runtime; use only files packaged in this skill bundle and user-supplied artifacts.
- Preserve declared artifact filenames unless the user or harness supplies explicit paths.
- Carry unresolved assumptions into the output artifact instead of silently inventing missing source evidence.