Home Pattern

Galaxy: genomic interval patterns

Use this MOC to choose corpus-grounded Galaxy genomic interval operations and recipes on coordinate features.

Revised
2026-06-10
Rev
1

Galaxy: genomic interval patterns

The runtime-facing map for Galaxy coordinate-feature choices — operations that understand chrom/start/end/strand, as opposed to opaque-column galaxy-tabular-patterns or container-shaped galaxy-collection-patterns. Use it before loading raw survey notes; iwc-interval-operations-survey is the evidence backing, these pages are the actionable references.

This is the smallest of the three data-shape MOCs by design. Interval algebra is a real but moderate cluster in IWC — concentrated in epigenetics peak-consensus and SARS-CoV-2 masking — and its highest-value units are recipes, not single operations. Reach for the recipes first when your need is a multi-step construction.

Overlap

  • interval-overlap-filter — keep/drop/annotate features by overlap with a second set (bedtools intersect, or VCF-native vcfvcfintersect).

Set operations

Windows & coverage

  • interval-window-flank — extend features into neighborhood windows (bedtools slop).
  • interval-coverage — genome-wide depth (bedgraph) or reads-in-given-regions counts (bedtools genomecov / coverage).

Recipes

Bridges

Gaps (no corpus exemplar, no page)

Per corpus-first, these have zero IWC uptake and get no pattern page; documented here so the absence is explicit, not an oversight. GTN training-corpus counts below are grounded in iwc-interval-operations-survey §GTN cross-reference (non-IWC signal):

  • closest / proximity — the “nearest feature + distance” operation. The interval-algebra form (bedtools closest, fetch-closest, windowbed) is absent from both IWC and GTN. The task is absent from IWC entirely; in GTN it appears only as a domain step (deeptools computeMatrix reference-point mode, TSS-relative signal), which #268 scopes out. This is the operation that motivated the MOC (#268). The natural tool — bedtools_closestbed (-d for distance) — exists in the same iuc/bedtools suite this MOC’s other tools come from; it simply has no corpus exemplar, so per corpus-first there is no recurring pattern to author, and an agent that needs nearest-feature-plus-distance reaches for it directly. (computeMatrix covers only the narrower ChIP/ATAC TSS-proximity case, not a general substitute.)
  • complement — taught in one GTN tutorial (assembly), zero IWC.
  • coordinate-aware sort — taught across three GTN tutorials; IWC sorts intervals with tabular sort1 instead.
  • makewindows, map, window (-w neighborhood join), annotate, jaccard — zero in both corpora.

These are tracked as IWC-input-blocked candidates (GitHub requires-iwc-inputs); a page follows only when an IWC workflow uses the operation.

See also

Incoming References (10)

  • Interval: consensus regions by multi-intersectrelated pattern— Find features reproducible across replicates: multi-intersect per-replicate sets, threshold by replicate count, then intersect back against the merged call.
  • Interval: compute coveragerelated pattern— Two coverage modes: genome-wide depth as a bedgraph (genomecoveragebed) and reads counted in given regions (coveragebed). Same family, different question.
  • Interval: build a mask by set algebrarelated pattern— Compute regions from regions: concatenate candidate intervals, merge into non-overlapping spans, then subtract the set to keep. The gops_* set-algebra recipe.
  • Interval: merge overlapping featuresrelated pattern— Collapse overlapping or book-ended intervals within one set into single spans; bedtools mergebed or the gops_merge Operate-on-Genomic-Intervals tool.
  • Interval: filter or annotate by overlaprelated pattern— Keep, drop, or annotate coordinate features by overlap with a second feature set; bedtools intersect (BED) or vcfvcfintersect (VCF), mapped over a collection.
  • Interval: window or flank featuresrelated pattern— Extend features by a fixed or fractional amount to build neighborhood windows, clamped to chromosome ends; bedtools slopbed with a genome file.
  • Interval: windowed coverage around featuresrelated pattern— Quantify signal in fixed neighborhoods around point features: window the features (slop), collapse overlaps (merge), then count reads in each window (coverage).
  • Sequence: extract or mask by regionrelated pattern— Turn coordinates into sequence: extract FASTA at BED intervals (getfasta), mask regions by BED (maskfasta), or extract transcripts from a GFF (gffread).
  • Iwc Interval Operations Surveyrelated note— IWC corpus survey of coordinate-aware genomic interval operations; sizing and candidate boundaries for a galaxy-interval-patterns MOC, with hold-if-thin gate.
  • Iwc Sequence Operations Surveyrelated note— IWC survey of record-level FASTA manipulation (interconversion, reformat, merge/dedup, subset, extract-at-intervals); sizes a galaxy-sequence-patterns MOC.