Ground collection-shape choices in curated, corpus-observed operation and recipe patterns.
Trigger: When selecting collection cleanup, reshape, identifier, or collection-tabular bridge patterns.
on-demand runtime verbatim corpus-observed deterministic 4.4 KB
- bundle: references/patterns/galaxy-collection-patterns.md
- source: content/patterns/galaxy-collection-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: collection patterns"
aliases:
- "Galaxy collection pattern MOC"
- "collection transformation patterns"
- "IWC collection pattern map"
tags:
- pattern
- target/galaxy
- topic/galaxy-transform
- topic/collection-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy collection transformation patterns."
related_notes:
- "[[iwc-transformations-survey]]"
- "[[iwc-conditionals-survey]]"
related_patterns:
- "[[manifest-to-mapped-collection-lifecycle]]"
- "[[cleanup-sync-and-publish-nonempty-results]]"
- "[[reshape-relabel-remap-by-collection-axis]]"
- "[[fan-in-bundle-consume-and-flatten]]"
- "[[collection-cleanup-after-mapover-failure]]"
- "[[sync-collections-by-identifier]]"
- "[[harmonize-by-sortlist-from-identifiers]]"
- "[[regex-relabel-via-tabular]]"
- "[[relabel-via-rules-and-find-replace]]"
- "[[collection-swap-nesting-with-apply-rules]]"
- "[[collection-split-identifier-via-rules]]"
- "[[collection-build-list-paired-with-apply-rules]]"
- "[[tabular-to-collection-by-row]]"
- "[[tabular-concatenate-collection-to-table]]"
- "[[tabular-pivot-collection-to-wide]]"
related_molds:
- "[[implement-galaxy-tool-step]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
- "[[cwl-summary-to-galaxy-template]]"
- "[[freeform-summary-to-galaxy-template]]"
- "[[compare-against-iwc-exemplar]]"
---
# Galaxy: collection patterns
This is the runtime-facing map for Galaxy collection transformation choices. Use it before loading raw survey notes. The survey remains evidence backin
...
Ground conditional-branch and optional-step choices in curated, corpus-observed Galaxy when/pick_value patterns.
Trigger: When data-flow translation needs optional steps, gating on non-empty results, routing between alternative outputs, or transform-or-pass-through branches.
on-demand runtime verbatim corpus-observed deterministic 2.6 KB
- bundle: references/patterns/galaxy-conditionals-patterns.md
- source: content/patterns/galaxy-conditionals-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: conditionals patterns"
aliases:
- "Galaxy conditional pattern MOC"
- "Galaxy when patterns"
- "conditional workflow patterns"
tags:
- pattern
- target/galaxy
- topic/galaxy-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy when and pick_value conditional patterns."
related_notes:
- "[[iwc-conditionals-survey]]"
related_patterns:
- "[[conditional-run-optional-step]]"
- "[[conditional-route-between-alternative-outputs]]"
- "[[conditional-gate-on-nonempty-result]]"
- "[[conditional-transform-or-pass-through]]"
- "[[collection-cleanup-after-mapover-failure]]"
related_molds:
- "[[implement-galaxy-tool-step]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
- "[[cwl-summary-to-galaxy-template]]"
- "[[freeform-summary-to-galaxy-template]]"
- "[[compare-against-iwc-exemplar]]"
---
# Galaxy: conditionals patterns
This is the runtime-facing map for Galaxy conditional workflow choices. Use it before loading raw survey notes. The survey remains evidence backing; the operation and recipe pages are the actionable references.
## Direct Gates
- [[conditional-run-optional-step]] — expose or derive a boolean, connect it as `inputs.when`, and use `when: $(inputs.when)` to skip optional steps.
- [[conditional-gate-on-nonempty-result]] — compute a boolean from empty/non-empty dataset or collection state before gating downstream reporting/export. The MGnify recipe is corpus-backed but clunky; treat it as provisional pending verified-pattern workflow work.
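
A minimal sketch of the two gates above, assuming a gxformat2-shaped step written as a plain Python dict. Only `inputs.when` and `when: $(inputs.when)` come from this page; the `tool_id`, `in`, and `source` keys, the step label, the tool id, and both helper names are illustrative assumptions, not a verified recipe.

```python
import json


def gated_step(label, tool_id, when_source):
    """Build a step dict that is skipped unless the connected boolean is true.

    `tool_id` and `when_source` are hypothetical names used for illustration.
    """
    return {
        "label": label,
        "tool_id": tool_id,
        # Connect the boolean as the step input named `when` ...
        "in": {"when": {"source": when_source}},
        # ... and reference it from the step-level `when` expression.
        "when": "$(inputs.when)",
    }


def nonempty_flag(element_identifiers):
    """Derive the gating boolean from empty/non-empty collection state."""
    return len(element_identifiers) > 0


step = gated_step("optional_report", "hypothetical_report_tool", "run_report")
print(json.dumps(step, indent=2))
```

The first helper corresponds to the optional-step gate and the second to the non-empty gate; in a real workflow the boolean would be computed by a Galaxy step rather than in Python.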
## Routes and Fallbacks
- [[conditional-route-between-alternati
...
Ground genomic-interval operation choices in curated, corpus-observed Galaxy interval recipes.
Trigger: When the workflow operates on genomic intervals (BED/GFF/VCF coordinate features) and data-flow translation needs overlap, merge, coverage, windowing, masking, or set-algebra steps.
on-demand runtime verbatim corpus-observed deterministic 5.3 KB
- bundle: references/patterns/galaxy-interval-patterns.md
- source: content/patterns/galaxy-interval-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: genomic interval patterns"
aliases:
- "Galaxy interval pattern MOC"
- "genomic interval transformation patterns"
- "IWC interval pattern map"
tags:
- pattern
- target/galaxy
- topic/galaxy-transform
- topic/interval-transform
status: draft
created: 2026-06-10
revised: 2026-06-10
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy genomic interval operations and recipes on coordinate features."
related_notes:
- "[[iwc-interval-operations-survey]]"
related_patterns:
- "[[interval-overlap-filter]]"
- "[[interval-coverage]]"
- "[[interval-merge-overlapping]]"
- "[[interval-window-flank]]"
- "[[interval-consensus-by-multi-intersect]]"
- "[[interval-mask-by-set-algebra]]"
- "[[interval-windowed-coverage]]"
- "[[tabular-synthesize-bed-from-3col]]"
related_molds:
- "[[implement-galaxy-tool-step]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
- "[[cwl-summary-to-galaxy-template]]"
- "[[paper-summary-to-galaxy-template]]"
- "[[compare-against-iwc-exemplar]]"
---
# Galaxy: genomic interval patterns
The runtime-facing map for Galaxy **coordinate-feature** choices — operations that understand `chrom/start/end/strand`, as opposed to opaque-column [[galaxy-tabular-patterns]] or container-shaped [[galaxy-collection-patterns]]. Use it before loading raw survey notes; [[iwc-interval-operations-survey]] is the evidence backing, these pages are the actionable references.
This is the smallest of the three data-shape MOCs by design. Interval algebra is a real but moderate cluster in IWC — concentrated in epigenetics peak-consensus and SARS-CoV-2 mask
...
Ground tabular bridge and table-operation choices in curated, corpus-observed operation patterns.
Trigger: When data-flow translation needs filtering, joining, aggregation, pivoting, or tabular-collection bridges.
on-demand runtime verbatim corpus-observed deterministic 3.1 KB
- bundle: references/patterns/galaxy-tabular-patterns.md
- source: content/patterns/galaxy-tabular-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: tabular patterns"
aliases:
- "Galaxy tabular pattern MOC"
- "tabular transformation patterns"
- "IWC tabular pattern map"
tags:
- pattern
- target/galaxy
- topic/galaxy-transform
- topic/tabular-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy tabular transformation patterns."
related_notes:
- "[[iwc-tabular-operations-survey]]"
related_patterns:
- "[[tabular-filter-by-column-value]]"
- "[[tabular-filter-by-regex]]"
- "[[tabular-cut-and-reorder-columns]]"
- "[[tabular-compute-new-column]]"
- "[[tabular-join-on-key]]"
- "[[tabular-group-and-aggregate-with-datamash]]"
- "[[tabular-sql-query]]"
- "[[tabular-prepend-header]]"
- "[[tabular-synthesize-bed-from-3col]]"
- "[[tabular-split-taxonomy-string]]"
- "[[tabular-relabel-by-row-counter]]"
- "[[tabular-to-collection-by-row]]"
- "[[tabular-concatenate-collection-to-table]]"
- "[[tabular-pivot-collection-to-wide]]"
related_molds:
- "[[implement-galaxy-tool-step]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
- "[[cwl-summary-to-galaxy-template]]"
- "[[freeform-summary-to-galaxy-template]]"
- "[[compare-against-iwc-exemplar]]"
---
# Galaxy: tabular patterns
This is the runtime-facing map for Galaxy tabular transformation choices. Use it before loading raw survey notes. The survey remains evidence backing; the operation pages are the actionable references.
## Row And Column Operations
- [[tabular-filter-by-column-value]] — keep/drop rows by string column value with `Filter1`.
- [[tabular-filter-by-regex]]
...
Preserve per-row metadata on the data-flow side: keep sample_sheet column_definitions wired through identifier-keyed steps instead of dropping into parallel parameter inputs, and re-attach metadata after map-over steps that lose it.
Trigger: When the upstream interface brief carries a sample_sheet[:paired|:paired_or_unpaired|:record] input, or when the Nextflow summary shows tuple(meta, path...) channel shape originating from samplesheetToList or splitCsv(header: true).
on-demand runtime verbatim corpus-observed deterministic 8.4 KB
- bundle: references/notes/galaxy-sample-sheet-collections.md
- source: content/research/galaxy-sample-sheet-collections.md
Preview md
---
type: research
subtype: component
title: "Galaxy sample_sheet collection types"
tags:
- research/component
- target/galaxy
status: draft
created: 2026-05-05
revised: 2026-05-06
revision: 2
ai_generated: true
related_notes:
- "[[galaxy-collection-semantics]]"
- "[[galaxy-collection-tools]]"
- "[[nextflow-workflow-io-semantics]]"
- "[[nextflow-params-to-galaxy-inputs]]"
- "[[nextflow-path-glob-to-galaxy-datatype]]"
- "[[nextflow-to-galaxy-channel-shape-mapping]]"
- "[[nextflow-to-galaxy-reference-data-mapping]]"
related_molds:
- "[[nextflow-summary-to-galaxy-interface]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
sources:
- "Galaxy PR #19305 (Implement Sample Sheets), merged 2025-07-30"
- "lib/galaxy/model/dataset_collections/types/sample_sheet.py"
- "lib/galaxy/model/dataset_collections/types/sample_sheet_util.py"
- "lib/galaxy/model/dataset_collections/type_description.py"
- "lib/galaxy/schema/schema.py (SampleSheetColumnDefinition, SampleSheetRow)"
- "lib/galaxy/tools/wrappers.py (DatasetCollectionWrapper.sample_sheet_row)"
- "lib/galaxy/tools/sample_sheet_to_tabular.xml"
- "lib/galaxy/webapps/galaxy/api/dataset_collections.py (sample_sheet_workbook endpoints)"
- "lib/galaxy/model/migrations/alembic/versions_gxy/3af58c192752_implement_sample_sheets.py"
summary: "Galaxy's sample_sheet collection family: typed column metadata, four variants, mapping rules, validator allowlist."
---
# Galaxy sample_sheet collection types
Reference for the Galaxy backend shape that targets structured per-row metadata — the natural landing zone for Nextflow `samplesheetToList` parameters and for any source-side idiom that pairs typed metadata columns with dataset references.
## Shape
A `sample_sheet` is a list-shaped collection where each el
...
Decide between subworkflow `when:` and inline tool-step `when:` for each source conditional, and pick the right output fan-in primitive (`pick_value` vs twin-cascade) so the data-flow brief carries a coherent conditional disposition forward.
Trigger: When the Nextflow summary's `workflow.conditionals[]` is non-empty, or when subworkflow boundaries in the source align with parameter-driven branches (step, aligner, wes, tools, skip_*, use_*).
on-demand runtime verbatim corpus-observed deterministic 13.7 KB
- bundle: references/notes/nextflow-conditional-to-galaxy-subworkflow-when.md
- source: content/research/nextflow-conditional-to-galaxy-subworkflow-when.md
Preview md
---
type: research
subtype: component
title: "Nextflow conditional to Galaxy subworkflow / when"
tags:
- research/component
- source/nextflow
- target/galaxy
status: draft
created: 2026-05-08
revised: 2026-05-08
revision: 1
ai_generated: true
related_notes:
- "[[nextflow-to-galaxy-reference-data-mapping]]"
- "[[nextflow-to-galaxy-channel-shape-mapping]]"
- "[[summary-nextflow]]"
- "[[gxformat2-schema]]"
related_molds:
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
sources:
- "https://github.com/galaxyproject/gxformat2"
- "https://github.com/iwc-workflows"
summary: "Stub. Translate Nextflow conditionals into Galaxy `when:` (single-workflow v1). Subworkflow vs inline is an aesthetic call, not a rule."
---
# Nextflow conditional to Galaxy subworkflow / when
Stub. Surfaced from sarek emulation (2026-05-08). Companion to [[nextflow-to-galaxy-reference-data-mapping]] — same v1 posture (one Galaxy workflow per source pipeline; trench-coat shape is acceptable as a draft for human review), different gap (control flow rather than reference data).
## Posture
For v1 of the Nextflow-to-Galaxy translation Molds, the output is a single Galaxy workflow per source pipeline, even when the source has substantial branching. IWC reviewers historically prefer sibling workflows for what looks like one pipeline with toggles, and we agree. For the *translation step*, however, a single artifact keeps the Mold pipeline deterministic, the harness simple, and the reviewer's mental model of "this draft maps 1:1 to the source" intact. Sibling-extraction is a polish pass that a human or follow-up Mold runs *after* translation, not a decision the translation Mold makes.
The question this note addresses is: given that v1 is one Galaxy workflow, *h
...
Preserve datatype confidence while translating path-like data-flow edges, process output patterns, and published outputs.
Trigger: When choosing or reviewing Galaxy datatype extensions for data-flow edges, collection elements, or output datasets.
on-demand runtime verbatim corpus-observed deterministic 12.8 KB
- bundle: references/notes/nextflow-path-glob-to-galaxy-datatype.md
- source: content/research/nextflow-path-glob-to-galaxy-datatype.md
Preview md
---
type: research
subtype: component
title: "Nextflow path/glob to Galaxy datatype mapping"
tags:
- research/component
- source/nextflow
- target/galaxy
status: draft
created: 2026-05-06
revised: 2026-05-06
revision: 1
ai_generated: true
related_notes:
- "[[nextflow-workflow-io-semantics]]"
- "[[gxformat2-workflow-inputs]]"
- "[[galaxy-datatypes-conf]]"
- "[[galaxy-sample-sheet-collections]]"
- "[[nextflow-params-to-galaxy-inputs]]"
- "[[nextflow-to-galaxy-channel-shape-mapping]]"
- "[[summary-nextflow]]"
- "[[nextflow-summary-to-galaxy-interface]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
related_molds:
- "[[summarize-nextflow]]"
- "[[nextflow-summary-to-galaxy-interface]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
sources:
- "content/research/datatypes_conf.xml.sample"
- "https://github.com/galaxyproject/galaxy/blob/7765fae934fbfdee77e3be5f5b235e43735273ae/config/datatypes_conf.xml.sample"
- "https://www.nextflow.io/docs/latest/process.html"
- "https://www.nextflow.io/docs/latest/reference/channel.html"
- "https://nextflow-io.github.io/nf-schema/latest/nextflow_schema/nextflow_schema_specification/"
summary: "Rules for mapping Nextflow path, glob, sample-sheet, and output filename evidence to Galaxy datatype extensions."
---
# Nextflow path/glob to Galaxy datatype mapping
Use this note when a Nextflow-to-Galaxy Mold needs a gxformat2 `format` value for a `data` input, collection element, or workflow output. [[nextflow-params-to-galaxy-inputs]] decides whether something is a dataset or collection; this note only decides datatype extension and confidence.
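
As a rough illustration of choosing a datatype extension together with a confidence level, the sketch below maps a path or glob suffix to a Galaxy extension. The suffix table, the confidence labels, and the `guess_format` helper are hypothetical and are not the rules this note defines; they only show the shape of a lookup that returns both values and falls back to a low-confidence generic type.

```python
# Hypothetical suffix table: (suffix, Galaxy extension, confidence).
# Longer, more specific suffixes are listed before shorter ones.
SUFFIX_TO_FORMAT = [
    (".fastq.gz", "fastqsanger.gz", "medium"),  # quality encoding is assumed
    (".vcf.gz", "vcf_bgzip", "high"),
    (".bam", "bam", "high"),
    (".bed", "bed", "high"),
]


def guess_format(path_or_glob):
    """Return (extension, confidence) for a Nextflow path or glob pattern."""
    for suffix, extension, confidence in SUFFIX_TO_FORMAT:
        if path_or_glob.endswith(suffix):
            return extension, confidence
    # Unknown suffix: fall back to the generic type and flag low confidence.
    return "data", "low"


for pattern in ["*.fastq.gz", "results/*.bam", "*.xyz"]:
    print(pattern, guess_format(pattern))
```

Returning the confidence alongside the extension lets a downstream reviewer see which `format` values were inferred from weak evidence.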
Evidence quality:
- **Corpus-observed** claims cite pinned fixtures under `$NEXTFLOW_FIXTURES`, the shared clone at `/Users/jxc755/projects/repositories/workflow-fixt
...
Cross-check source-side reference-data classifications before deciding how reference assets and optional rebuild branches flow through the Galaxy data-flow draft.
Trigger: When the reference-data or interface brief is silent, low-confidence, or conflicts with source evidence for iGenomes-derived params, coordinated bundles, compute-if-missing branches, multi-DB pick-lists, or cohort-specific assets.
on-demand runtime verbatim corpus-observed deterministic 7.8 KB
- bundle: references/notes/nextflow-reference-data-classification.md
- source: content/research/nextflow-reference-data-classification.md
Preview md
---
type: research
subtype: component
title: "Nextflow reference-data classification"
tags:
- research/component
- source/nextflow
status: draft
created: 2026-05-10
revised: 2026-05-10
revision: 3
ai_generated: true
related_notes:
- "[[summary-nextflow]]"
- "[[nextflow-to-galaxy-reference-data-mapping]]"
- "[[nextflow-summary-to-galaxy-reference-data]]"
- "[[nextflow-summary-to-galaxy-interface]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
related_molds:
- "[[summarize-nextflow]]"
- "[[nextflow-summary-to-galaxy-reference-data]]"
- "[[nextflow-summary-to-galaxy-interface]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
sources:
- "https://nf-co.re/docs/usage/reference_genomes"
- "https://github.com/nf-core/sarek/blob/master/conf/igenomes.config"
- "https://github.com/nf-core/configs"
- "https://github.com/galaxyproject/foundry/issues/221"
summary: "Source-side taxonomy of how Nextflow pipelines use reference data — eight classifications detectable from a summary-nextflow artifact."
---
# Nextflow reference-data classification
Reference-data shape varies along several roughly orthogonal dimensions: whether the pipeline consumes or produces reference data, the cardinality of the assets, whether they're keyed or per-asset, whether rebuild fallback exists, and whether multiple bundles run in parallel. The classifications below are flags an LLM can detect from a `summary-nextflow` artifact; a single pipeline often matches more than one. Grounded in the complexity bridge fixtures from galaxyproject/foundry#221.
For the Galaxy-side translation of these classifications, see [[nextflow-to-galaxy-reference-data-mapping]].
## None
Pipeline consumes no ref
...
Decide how reference assets and their indexes flow through the Galaxy data-flow draft (preserving dbkey through map-overs, deferring index-building to wrappers vs surfacing as workflow steps).
Trigger: When the upstream interface brief carries reference-data inputs (FASTA, fai, dict, indexes, known sites, intervals, PoN) or when the source pipeline's compute-if-missing branches imply rebuild semantics the data flow has to honor.
on-demand runtime verbatim corpus-observed deterministic 12.1 KB
- bundle: references/notes/nextflow-to-galaxy-reference-data-mapping.md
- source: content/research/nextflow-to-galaxy-reference-data-mapping.md
Preview md
---
type: research
subtype: component
title: "Nextflow to Galaxy reference-data mapping"
tags:
- research/component
- source/nextflow
- target/galaxy
status: draft
created: 2026-05-08
revised: 2026-05-10
revision: 5
ai_generated: true
related_notes:
- "[[nextflow-reference-data-classification]]"
- "[[nextflow-params-to-galaxy-inputs]]"
- "[[nextflow-path-glob-to-galaxy-datatype]]"
- "[[summary-nextflow]]"
- "[[nextflow-summary-to-galaxy-reference-data]]"
- "[[nextflow-summary-to-galaxy-template]]"
- "[[galaxy-sample-sheet-collections]]"
- "[[galaxy-datatypes-conf]]"
related_molds:
- "[[summarize-nextflow]]"
- "[[nextflow-summary-to-galaxy-reference-data]]"
- "[[nextflow-summary-to-galaxy-interface]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
sources:
- "https://github.com/galaxyproject/foundry/issues/221"
summary: "Galaxy-side translation of Nextflow reference-data classifications: idioms available, the v1 posture, datatype defaults, and the in-tool rebuild trade-off."
---
# Nextflow to Galaxy reference-data mapping
Mapping research for [[nextflow-summary-to-galaxy-reference-data]]. Once a Nextflow pipeline's reference-data usage is classified per [[nextflow-reference-data-classification]], this note pins the Galaxy-side translation: idioms available, the v1 posture, datatype defaults, the in-tool rebuild trade-off, and known representation gaps the brief should flag.
## Galaxy side
Galaxy has multiple idioms for surfacing reference data. The bullets below are presented as available shapes; the recommendations that follow narrow them to the v1 posture.
- **`dbkey`-keyed cached lookups.** Workflow inputs carry a `dbkey` annotation; tools consume an admin-pre-loaded data table indexed by
...