implement-galaxy-workflow-test
Assemble a Galaxy workflow test file (tests-format) from the schema-valid Galaxy test plan (galaxy-workflow-test-plan), the concrete gxformat2 workflow (galaxy-workflow.gxwf.yml), and the resolved test-data refs. One invocation produces the companion test file whose job inputs come from the workflow inputs and whose assertions come from the plan’s assertion intent. The output must validate against tests-format and pass the workflow-label cross-check before any Planemo run.
Name the companion off the workflow’s basename. Planemo discovers the test file by stripping only the workflow’s final extension and appending -tests.yml, so the companion of galaxy-workflow.gxwf.yml is galaxy-workflow.gxwf-tests.yml (keep the .gxwf) — not galaxy-workflow-tests.yml. Derive the name from whatever the workflow file is actually called; a .ga workflow would instead pair with <basename>-tests.yml.
The workflow is the contract: input and output labels in the test file must address real workflow input/output labels. The test plan’s bindings may be assumed or unresolved — especially for synthesized (freeform-sourced) plans whose workflow.label_source is interface-brief — so reconcile each plan binding against the real workflow labels here, and resolve unresolved[] entries and storage: unresolved fixtures against the workflow and the resolved test-data refs. When authoring reveals a missing label, an omitted workflow output, or an unstable collection identifier, treat it as testability pressure on the workflow itself — surface it per galaxy-workflow-testability-design rather than asserting around it.
Sequence
-
Bootstrap. Prefer generating the test skeleton from a real invocation, not from scratch:
-
Author job inputs and stage their data. Wire each workflow input to a test-data-refs entry, and make the data the test references actually exist on disk before any Planemo run:
- Prefer a bare remote
location: (iwc-test-data-conventions, remote-URL-first) whenever the ref is a single fetchable artifact — Galaxy fetches it at upload time, so nothing is staged locally. Record the hash when known.
- Materialize a local
test-data/ layout only when the ref needs prep a URL can’t express — a concatenation (e.g. per-isolate chromosome+plasmid into one FASTA), a subset (one chromosome, selected loci), or a column split into named collection elements. In that case the test file’s path: entries point at files this Mold writes: fetch from the ref’s verified source URLs, run the documented prep, and lay the result out in the test-data/ directory addressed by the *-tests.yml, relative to the workflow. The collection element identifiers in the staged layout must match the test file and the workflow labels exactly.
- Never emit a
path: (or location:) the test references but no source produces — an un-materialized path passes the static cross-check and fails only at upload, far from here. If a ref is resolved: false, surface the gap rather than authoring a path to a file that does not exist.
Inputs must match the workflow’s collection shapes and datatypes.
-
Author assertions. Materialize the plan’s assertion intent into concrete output assertions. Choose assertion families and tolerances per planemo-asserts-idioms; check each shortcut against iwc-shortcuts-anti-patterns so an existence-only or size-only assertion is a deliberate choice, not an evasion. Honor the plan’s omissions[] and treat low-confidence synthesized intent as a starting point to tighten against the real invocation.
-
Validate static. Run validate-tests for the schema gate, then the workflow-label cross-check (checkTestsAgainstWorkflow): zero missing input labels, zero missing output labels, no collection/datatype mismatches. Fix before spending a Planemo run.
-
Run green. Drive planemo test with the staged data. “Managed Galaxy” here means Planemo-managed: planemo test bootstraps its own Galaxy and installs the workflow’s tools from the Tool Shed/conda — it does not require a pre-provisioned external server, so absence of a running Galaxy is not a reason to skip this gate (cost/runtime of heavy tool or reference-DB installs may be, but that is a deliberate deferral, not an impossibility). On green, hand off the test file plus enough invocation/job/assertion context for run-workflow-test and debug-galaxy-workflow-output to use if a later run fails.
Author tests with stable labels and artifacts that Planemo can connect back to Galaxy invocations, jobs, and outputs (planemo-workflow-test-architecture) — that traceability is what makes the downstream debug Mold able to locate failure evidence.