Galaxy Workflow Format 2 Description §
The traditional Galaxy workflow description (.ga) is not meant to be concise and is neither readily human readable nor human writable. Format 2 addresses all three of these limitations while also converging (where that makes sense without sacrificing these goals) with the workflow description used by the Common Workflow Language.
This standard is in active development and a moving target in many ways, but we will try to keep what is ingestible by Galaxy backward-compatible going forward.
GalaxyWorkflow §
A Galaxy workflow description. This record corresponds to the description of a workflow that should be executable on a Galaxy server that includes the contained tool definitions.
The workflows API or the user interface of Galaxy instances that are of version 19.09 or newer should be able to import a document defining this record.
A note about the label field. §
This is the name of the workflow in the Galaxy user interface and the primary way
users identify the workflow. For legacy support, the attribute may also be called 'name'; Galaxy will
consume the workflow document and treat this attribute correctly. However, in order to validate against this
workflow definition schema, the attribute should be called label.
Fields
inputs: Defines the input parameters of the process. The process is ready to run when all required input parameters are associated with concrete values. Input parameters include a schema for each parameter which is used to validate the input object. It may also be used to build a user interface for constructing the input object.
When accepting an input object, all input parameters must have a value.
If an input parameter is missing from the input object, it must be
assigned a value of null (or the value of default for that
parameter, if provided) for the purposes of validation and evaluation
of expressions.
outputs: Defines the parameters representing the output of the process. May be used to generate and/or validate the output object.
class: The constant value GalaxyWorkflow.
steps: The individual steps that make up the workflow. Each step is executed when all of its input data links are fulfilled.
doc: A documentation string for this object, or an array of strings which should be concatenated.
comments: Visual annotations for the workflow editor canvas. Comments are non-functional and do not affect workflow execution. May be specified as a list or as a mapping keyed by label.
creator: Workflow creators. Can be schema.org Person (https://schema.org/Person) or Organization (https://schema.org/Organization) entities.
release: If listed, should correspond to the release of the workflow in its source repository.
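As a rough illustration of the record described above, the following Python sketch builds a minimal workflow document as a plain dictionary and applies the legacy name-to-label rename. The helper normalize_label, the workflow name, and the input, output, and step names are hypothetical; only the field names (class, label, inputs, outputs, steps, tool_id, in, outputSource) and the tool ID cat1 come from this description. This is not how Galaxy itself implements the rename.

```python
import copy


def normalize_label(workflow: dict) -> dict:
    """Return a copy of the workflow that uses 'label' instead of the legacy 'name'."""
    normalized = copy.deepcopy(workflow)
    if "name" in normalized:
        legacy = normalized.pop("name")
        # Assumption of this sketch: an explicit label wins if both keys are present.
        normalized.setdefault("label", legacy)
    return normalized


# A hypothetical document that still uses the legacy 'name' key.
legacy_workflow = {
    "class": "GalaxyWorkflow",
    "name": "Concatenate example",
    "inputs": {"the_input": {"type": "data"}},
    "outputs": {"the_output": {"outputSource": "cat/out_file1"}},
    "steps": {"cat": {"tool_id": "cat1", "in": {"input1": "the_input"}}},
}

workflow = normalize_label(legacy_workflow)
```

The original dictionary is left untouched, so the same legacy document can still be handed to Galaxy as-is while the normalized copy is used for schema validation.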
WorkflowDataParameter §
A data input parameter for a Galaxy workflow. Represents one Galaxy dataset.
Normalized gxformat2 output uses type: data. type: File is accepted as
an alias, but should not be confused with workflow test job syntax where
type: File means stage a file as test input data.
Fields
optional: Controls whether Galaxy allows invocation of the workflow without a
user-supplied value for this input. If true, the input may be omitted
at invocation time. optional and default are independent: a
required input (optional: false) may still declare a default,
and an optional input may have no default. default supplies a
value when the invocation input is missing or null; optional
controls whether the missing case is even permitted.
doc: A documentation string for this object, or an array of strings which should be concatenated.
default: The default value to use for this parameter if the parameter is missing
from the input object, or if the value of the parameter in the input
object is null. Default values are applied before evaluating expressions
(e.g. dependent valueFrom fields).
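The interplay of optional and default can be sketched in a few lines of Python. This is one possible reading of the description above, not Galaxy's implementation: the function resolve_input_value, the exception MissingInputError, and the input names are hypothetical, and the sketch assumes that optional is treated as false when omitted and that a default is applied before the required check. The same wording appears on the other parameter types, so the sketch is not specific to data inputs.

```python
from typing import Any, Dict


class MissingInputError(ValueError):
    """Raised when a required input has no value and no default."""


def resolve_input_value(name: str, declaration: Dict[str, Any],
                        invocation_inputs: Dict[str, Any]) -> Any:
    """Pick the value for one workflow input (illustrative only)."""
    # A missing key and an explicit null both end up as None here.
    value = invocation_inputs.get(name)
    if value is None and "default" in declaration:
        value = declaration["default"]
    # Assumption: optional is treated as false when it is omitted.
    if value is None and not declaration.get("optional", False):
        raise MissingInputError(f"input '{name}' is required but has no value")
    return value


# Hypothetical input declarations.
declarations = {
    "reference": {"type": "data", "optional": True},
    "threshold": {"type": "int", "default": 5},
    "reads": {"type": "data"},
}

resolved_threshold = resolve_input_value("threshold", declarations["threshold"], {})
resolved_reference = resolve_input_value("reference", declarations["reference"], {})
```

Calling the function for reads with an empty invocation raises MissingInputError, because that declaration is required and carries no default.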
Any §
The Any type validates for any non-null value.
Symbols
| symbol | description |
|---|---|
| Any | |
StepPosition §
This field specifies the location of the step's node when rendered in the workflow editor.
Fields
top: Relative vertical position of the step's node when rendered in the workflow editor.
WorkflowCollectionParameter §
A collection input parameter for a Galaxy workflow - represents a dataset collection.
Fields
optional: Controls whether Galaxy allows invocation of the workflow without a
user-supplied value for this input. If true, the input may be omitted
at invocation time. optional and default are independent: a
required input (optional: false) may still declare a default,
and an optional input may have no default. default supplies a
value when the invocation input is missing or null; optional
controls whether the missing case is even permitted.
doc: A documentation string for this object, or an array of strings which should be concatenated.
default: The default value to use for this parameter if the parameter is missing
from the input object, or if the value of the parameter in the input
object is null. Default values are applied before evaluating expressions
(e.g. dependent valueFrom fields).
collection_type: Collection type (defaults to list if type is collection). Nested
collection types are separated with colons, e.g. list:list:paired.
column_definitions: Column schema for sample-sheet collection inputs. Only meaningful when
collection_type begins with sample_sheet - cross-field validation is
applied in the pydantic post-validator.
fields: Field schema for record collection inputs. Only meaningful when
collection_type contains record (e.g. record, list:record,
sample_sheet:record).
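The colon-separated collection_type and the two cross-field rules can be illustrated with a short Python sketch. The function names are hypothetical, and reading "contains record" as "one of the colon-separated components is record" is an assumption of this sketch rather than something the schema states.

```python
from typing import List


def split_collection_type(collection_type: str) -> List[str]:
    """Split a nested collection type such as 'list:list:paired' into its levels."""
    return collection_type.split(":")


def applicable_schema_fields(collection_type: str) -> List[str]:
    """Name the optional schema fields that are meaningful for a collection type."""
    parts = split_collection_type(collection_type)
    applicable = []
    # column_definitions only applies when the type begins with sample_sheet.
    if collection_type.startswith("sample_sheet"):
        applicable.append("column_definitions")
    # Assumption: "contains record" means one component equals 'record'.
    if "record" in parts:
        applicable.append("fields")
    return applicable


levels = split_collection_type("list:list:paired")
sheet_of_records = applicable_schema_fields("sample_sheet:record")
```

A check like this could run before submitting a document, to flag a column_definitions or fields entry that the declared collection_type would make meaningless.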
SampleSheetColumnDefinition §
Describes one column of a sample-sheet collection input.
Used in column_definitions on a collection_type: sample_sheet[:<type>]
workflow input.
Fields
type: Value type for this column. One of string, int, float, boolean,
or element_identifier. Mirrors Galaxy's runtime
SampleSheetColumnType.
default_value: Default value used when a row omits this column. Type must be
compatible with type - validated by the pydantic post-validator.
validators: Galaxy-style parameter validators. Modelled as opaque records here - full validator schema lives in galaxy.tool_util_models.
restrictions: Closed set of permitted values for this column. Item type must be
compatible with the column type (post-validated).
RecordFieldDefinition §
Describes one field of a record collection input.
Used in fields on a collection_type containing record (e.g.
record, list:record, sample_sheet:record). Mirrors a subset of
the CWL InputRecordSchema shape that Galaxy persists on
DatasetCollection.fields.
Fields
name: Field name. Must equal the corresponding element identifier in the materialized record collection.
type: Field value type. A subset of the CWL primitive types: File,
null, boolean, int, float, string. May be a list to
express a union (e.g. ["File", "null"] for an optional file).
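To make the union form of type concrete, here is a small Python sketch that checks a record field's type against the primitive names listed above and reports whether the field may be null. The helper names and the example field names are hypothetical.

```python
from typing import List, Union

# The primitive type names listed in the field description above.
ALLOWED_FIELD_TYPES = {"File", "null", "boolean", "int", "float", "string"}


def as_type_list(field_type: Union[str, List[str]]) -> List[str]:
    """Treat a single type name as a one-element union."""
    return field_type if isinstance(field_type, list) else [field_type]


def is_valid_field_type(field_type: Union[str, List[str]]) -> bool:
    """True when every member of the (possibly single-element) union is allowed."""
    return all(t in ALLOWED_FIELD_TYPES for t in as_type_list(field_type))


def allows_null(field_type: Union[str, List[str]]) -> bool:
    """True when the union includes 'null', i.e. the field is optional."""
    return "null" in as_type_list(field_type)


# Hypothetical field definitions for a record collection input.
fields = [
    {"name": "reads", "type": "File"},
    {"name": "adapters", "type": ["File", "null"]},
]
optional_names = [f["name"] for f in fields if allows_null(f["type"])]
```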
WorkflowIntegerParameter §
A scalar integer workflow parameter. Normalized gxformat2 output uses
type: int. type: integer is accepted for compatibility with native
Galaxy parameter state and Galaxy tool XML terminology.
Fields
optional: Controls whether Galaxy allows invocation of the workflow without a
user-supplied value for this input. If true, the input may be omitted
at invocation time. optional and default are independent: a
required input (optional: false) may still declare a default,
and an optional input may have no default. default supplies a
value when the invocation input is missing or null; optional
controls whether the missing case is even permitted.
doc: A documentation string for this object, or an array of strings which should be concatenated.
default: The default value to use for this parameter if the parameter is missing
from the input object, or if the value of the parameter in the input
object is null. Default values are applied before evaluating expressions
(e.g. dependent valueFrom fields).
WorkflowFloatParameter §
A float input parameter for a Galaxy workflow.
Fields
optional: Controls whether Galaxy allows invocation of the workflow without a
user-supplied value for this input. If true, the input may be omitted
at invocation time. optional and default are independent: a
required input (optional: false) may still declare a default,
and an optional input may have no default. default supplies a
value when the invocation input is missing or null; optional
controls whether the missing case is even permitted.
doc: A documentation string for this object, or an array of strings which should be concatenated.
default: The default value to use for this parameter if the parameter is missing
from the input object, or if the value of the parameter in the input
object is null. Default values are applied before evaluating expressions
(e.g. dependent valueFrom fields).
WorkflowTextParameter §
A scalar text workflow parameter. Normalized gxformat2 output uses
type: string. type: text is accepted for compatibility with native
Galaxy parameter state and Galaxy tool XML terminology.
Fields
optional: Controls whether Galaxy allows invocation of the workflow without a
user-supplied value for this input. If true, the input may be omitted
at invocation time. optional and default are independent: a
required input (optional: false) may still declare a default,
and an optional input may have no default. default supplies a
value when the invocation input is missing or null; optional
controls whether the missing case is even permitted.
doc: A documentation string for this object, or an array of strings which should be concatenated.
default: The default value to use for this parameter if the parameter is missing
from the input object, or if the value of the parameter in the input
object is null. Default values are applied before evaluating expressions
(e.g. dependent valueFrom fields).
restrictions: Closed set of permitted values. When present, Galaxy renders the
runtime input as a select. Items may be plain strings or
{value, label} records.
suggestions: Open suggestion list. Galaxy still treats the input as text but offers these as suggestions.
restrictOnConnections: Ask Galaxy to derive valid choices from connected tool or subworkflow select inputs at runtime. Falls back to free text when derivation fails.
WorkflowTextOption §
A {value, label} option used in restrictions or suggestions on a
text workflow parameter. Plain strings are also accepted in those
arrays as shorthand for {value: <str>, label: <str>}.
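The shorthand rule can be sketched as a small Python normalizer that expands plain strings into {value, label} records. The function name and the example option values are hypothetical.

```python
from typing import Dict, List, Union

TextOption = Union[str, Dict[str, str]]


def normalize_text_options(options: List[TextOption]) -> List[Dict[str, str]]:
    """Expand plain-string options into {value, label} records."""
    normalized = []
    for option in options:
        if isinstance(option, str):
            # A plain string is shorthand for a record whose value and label match.
            normalized.append({"value": option, "label": option})
        else:
            normalized.append(dict(option))
    return normalized


# Hypothetical restrictions list mixing both spellings.
restrictions = ["hg38", {"value": "mm10", "label": "Mouse (mm10)"}]
expanded = normalize_text_options(restrictions)
```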
Fields
WorkflowBooleanParameter §
A boolean input parameter for a Galaxy workflow.
Fields
optional: Controls whether Galaxy allows invocation of the workflow without a
user-supplied value for this input. If true, the input may be omitted
at invocation time. optional and default are independent: a
required input (optional: false) may still declare a default,
and an optional input may have no default. default supplies a
value when the invocation input is missing or null; optional
controls whether the missing case is even permitted.
doc: A documentation string for this object, or an array of strings which should be concatenated.
default: The default value to use for this parameter if the parameter is missing
from the input object, or if the value of the parameter in the input
object is null. Default values are applied before evaluating expressions
(e.g. dependent valueFrom fields).
WorkflowInputParameter §
An input parameter to a Galaxy workflow. This is the catch-all type used by the Schema Salad codegen. The pydantic layer uses a discriminated union of the specific parameter types instead.
Fields
optional: Controls whether Galaxy allows invocation of the workflow without a
user-supplied value for this input. If true, the input may be omitted
at invocation time. optional and default are independent: a
required input (optional: false) may still declare a default,
and an optional input may have no default. default supplies a
value when the invocation input is missing or null; optional
controls whether the missing case is even permitted.
type: Specify valid types of data that may be assigned to this parameter.
doc: A documentation string for this object, or an array of strings which should be concatenated.
default: The default value to use for this parameter if the parameter is missing
from the input object, or if the value of the parameter in the input
object is null. Default values are applied before evaluating expressions
(e.g. dependent valueFrom fields).
collection_type: Collection type (defaults to list if type is collection). Nested
collection types are separated with colons, e.g. list:list:paired.
column_definitions: Column schema for sample-sheet collection inputs. Only meaningful when
collection_type begins with sample_sheet.
fields: Field schema for record collection inputs. Only meaningful when
collection_type contains record.
restrictions: Closed set of permitted values for text-typed inputs. See
WorkflowTextParameter.restrictions.
restrictOnConnections: For text-typed inputs - derive runtime choices from connected tool/subworkflow select inputs.
GalaxyType §
Extends primitive types with the native Galaxy concepts such as datasets and collections. Normalized gxformat2 workflow input declaration spellings are data, collection, string, int, float, and boolean. Other spellings are accepted as compatibility aliases on import but normalized gxformat2 output emits the normalized spellings.
Symbols
| symbol | description |
|---|---|
| null | no value |
| boolean | a binary value |
| int | normalized gxformat2 spelling for native Galaxy integer workflow parameters. |
| long | 64-bit signed integer |
| float | single precision (32-bit) IEEE 754 floating-point number |
| double | double precision (64-bit) IEEE 754 floating-point number |
| string | normalized gxformat2 spelling for native Galaxy text workflow parameters. |
| integer | accepted alias for ``int`` because native Galaxy parameter state and Galaxy tool XML terminology use ``integer``. |
| text | accepted alias for ``string`` because native Galaxy parameter state and Galaxy tool XML terminology use ``text``. |
| File | accepted alias for ``data``, but normalized gxformat2 output emits ``data``. Note: workflow **test job** YAML uses ``type: File`` to mean 'stage this file as test input data', which is a separate concept from workflow input declaration. |
| data | one Galaxy dataset input. Native Galaxy ``data_input`` converts to this spelling. |
| collection | one Galaxy dataset collection input. Native Galaxy ``data_collection_input`` converts to this spelling. |
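The alias rows in the table amount to a small lookup. A minimal Python sketch of that normalization follows; the function and constant names are hypothetical, and the mapping contains only the three aliases listed above.

```python
# Accepted aliases and the normalized gxformat2 spelling each one maps to.
TYPE_ALIASES = {
    "integer": "int",
    "text": "string",
    "File": "data",
}


def normalize_input_type(type_name: str) -> str:
    """Map an accepted alias to its normalized spelling; leave other names alone."""
    return TYPE_ALIASES.get(type_name, type_name)


normalized = [normalize_input_type(t) for t in ["integer", "text", "File", "collection"]]
```

Names that are already normalized, such as collection, pass through unchanged.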
WorkflowOutputParameter §
Describe an output parameter of a workflow. The parameter must be connected to one parameter defined in the workflow that will provide the value of the output parameter. It is legal to connect a WorkflowInputParameter to a WorkflowOutputParameter.
Fields
doc: A documentation string for this object, or an array of strings which should be concatenated.
outputSource: Specifies the workflow parameter that supplies the value of the output parameter.
WorkflowStep §
This represents a non-input step of a Galaxy workflow.
A note about state and tool_state fields. §
Only one or the other should be specified. These are two ways to represent the "state" of a tool at this workflow step. Both are essentially maps from parameter names to parameter values.
tool_state is much more low-level and expects a flat dictionary with each value a JSON
dump. Nested tool structures such as conditionals and repeats should have all their values
in the JSON dumped string. In general tool_state may be present in workflows exported from
Galaxy but shouldn't be written by humans.
state can contain a typed map. Repeat values can be represented as YAML arrays. An alternative
to representing state this way is defining inputs with default values.
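The difference in shape between the two representations can be sketched in Python: a typed state map keeps nested values as real structures, while a tool_state-style map is flat and holds one JSON string per top-level key. The parameter names below are hypothetical, and a real tool_state exported by Galaxy may carry additional keys and encoding details that this sketch does not attempt to reproduce.

```python
import json
from typing import Any, Dict


def dump_state_values(state: Dict[str, Any]) -> Dict[str, str]:
    """Flatten a typed state map into one JSON string per top-level parameter."""
    return {name: json.dumps(value) for name, value in state.items()}


# Hypothetical typed state: a scalar plus a repeat written as a list.
state = {
    "num_lines": 3,
    "filters": [{"pattern": "chr1"}, {"pattern": "chr2"}],
}

flat_state = dump_state_values(state)
```

Here the whole filters repeat ends up inside a single JSON string, which is what makes the low-level form awkward to write by hand.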
Fields
out: Defines the parameters representing the output of the process. May be used to generate and/or validate the output object.
This can also be called 'outputs' for legacy reasons - but the resulting workflow document is not a valid instance of this schema.
doc: A documentation string for this object, or an array of strings which should be concatenated.
tool_id: The tool ID used to run this step of the workflow (e.g. 'cat1' or 'toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/4.0').
tool_shed_repository: The Galaxy Tool Shed repository that should be installed in order to use this tool.
tool_version: The tool version used to run this step of the workflow. For tool shed installed tools, the ID generally uniquely specifies a version and this field is optional.
errors: During Galaxy export there may be some problem validating the tool state, tool used, etc., which will be indicated by this field. The Galaxy user should be warned of these problems before the workflow can be used in Galaxy.
This field should not be used in human written Galaxy workflow files.
A typical problem is that the referenced tool is not installed; this can be fixed by installing the tool, re-saving the workflow, and then re-exporting it.
in: Defines the input parameters of the workflow step. The process is ready to run when all required input parameters are associated with concrete values. Input parameters include a schema for each parameter which is used to validate the input object. It may also be used to build a user interface for constructing the input object.
post_job_actions: Optional dict of post-job actions keyed by {ActionType}{OutputName}
compound strings. Same shape as the native post_job_actions field;
each value is a record with action_type, output_name,
action_arguments. Use the out: shorthand (rename:,
hide:, change_datatype:, etc.) for common actions; this
explicit form covers actions without an out: shorthand
(ValidateOutputsAction, etc.) and any case where the typed
record is preferred.
run: Specifies a subworkflow to run. May be an inline workflow definition, a URL string, or an @import reference dict.
when: If defined, only run the step when the expression evaluates to
true. If false the step is skipped. A skipped step
produces a null on each output.
The expression should be an ECMAScript 5.1 expression.
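The compound key used by the explicit post_job_actions form can be illustrated in Python. This is a sketch under stated assumptions: the helper name and the output name out_file1 are hypothetical, and using an empty action_arguments mapping when an action takes no arguments is a guess rather than something this description specifies.

```python
from typing import Any, Dict, Optional, Tuple


def post_job_action_entry(action_type: str, output_name: str,
                          action_arguments: Optional[Dict[str, Any]] = None
                          ) -> Tuple[str, Dict[str, Any]]:
    """Build the {ActionType}{OutputName} key and the matching record."""
    key = f"{action_type}{output_name}"
    record = {
        "action_type": action_type,
        "output_name": output_name,
        # Assumption: an empty mapping stands in for "no arguments".
        "action_arguments": action_arguments or {},
    }
    return key, record


key, record = post_job_action_entry("ValidateOutputsAction", "out_file1")
post_job_actions = {key: record}
```

The resulting dictionary has the shape the field description gives: one entry per action, keyed by the action type immediately followed by the output name.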
WorkflowStepInput §
TODO:
Fields
source: Specifies one or more workflow parameters that will provide input to the underlying step parameter.
default: The default value for this parameter to use if either there is no
source field, or the value produced by the source is null. The
default must be applied prior to scattering or evaluating valueFrom.
WorkflowStepOutput §
Associate an output parameter of the underlying process with a workflow
parameter. The workflow parameter (given in the id field) may be used
as a source to connect with input parameters of other workflow steps, or
with an output parameter of the process.
The id field is a unique identifier for this workflow output parameter. It is
the identifier to use in the source field of WorkflowStepInput
to connect the output value to downstream parameters.
Fields
ToolShedRepository §
Fields
changeset_revision: The revision of the tool shed repository this tool can be found in.
tool_shed: The URI of the tool shed containing the repository this tool can be found in - typically this should be toolshed.g2.bx.psu.edu.
WorkflowStepType §
Module types used by Galaxy steps. Galaxy's native format allows additional types such as data_input, data_collection_input, and parameter_input,
but these should be represented as inputs in Format 2.
Symbols
| symbol | description |
|---|---|
| tool | Run a tool. |
| subworkflow | Run a subworkflow. |
| pause | Pause computation on this branch of workflow until user allows it to continue. |
| pick_value | Select the first non-null value from multiple inputs. Used to merge branches of conditional or optional workflow paths. |
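The pick_value behaviour described in the table, selecting the first non-null value, can be sketched in Python as follows. The function name is hypothetical and the sketch covers only the behaviour stated here, not any other options Galaxy's module may offer. It pairs naturally with conditional steps, since a skipped step yields null on each of its outputs.

```python
from typing import Any, Iterable, Optional


def pick_first_non_null(values: Iterable[Any]) -> Optional[Any]:
    """Return the first value that is not None, or None if there is none."""
    for value in values:
        # Only None counts as null; 0, "" and False are ordinary values.
        if value is not None:
            return value
    return None


# Two branches: the first step was skipped, so its output is null.
branch_outputs = [None, "trimmed_reads.fastq"]
picked = pick_first_non_null(branch_outputs)
```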
Report §
Definition of an invocation report for this workflow. Currently the only field is 'markdown'.