Mapping Jobs to Destinations using TPV
Author(s): Nate Coraor, Björn Grüning, Nuwan Goonasekera, Mira Kuntz
Editor(s): Helena Rasche, Enis Afgan
Tester(s): Catherine Bromhead, Edwin den Haas
Overview
Questions:
- How can I configure job dependent resources, like cores, memory for my DRM?
- How can I map jobs to resources and destinations?
Objectives:
- Know how to map tools to job destinations
- Be able to use the dynamic job runner to make arbitrary destination mappings
- Understand the job resource selector config and dynamic rule creation
- The various ways in which tools can be mapped to destinations, both statically and dynamically
- How to write a dynamic tool destination (DTD)
- How to write a dynamic python function destination
- How to use the job resource parameter selection feature
Requirements:
- Slides: Connecting Galaxy to a compute cluster
- Hands-on: Connecting Galaxy to a compute cluster
Time estimation: 2 hours
Published: Jan 17, 2021
Last modification: Jun 14, 2024
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
PURL: https://gxy.io/GTN:T00012
Revision: 36
This tutorial builds heavily on the Connecting Galaxy to a compute cluster tutorial, and you are expected to have completed that tutorial first.
Now that you have a working scheduler, we will start configuring which jobs are sent to which destinations.
Comment: Galaxy Admin Training PathThe yearly Galaxy Admin Training follows a specific ordering of tutorials. Use this timeline to help keep track of where you are in Galaxy Admin Training.
Step 1ansible-galaxy Step 2backup-cleanup Step 3customization Step 4tus Step 5cvmfs Step 6apptainer Step 7tool-management Step 8reference-genomes Step 9data-library Step 10dev/bioblend-api Step 11connect-to-compute-cluster Step 12job-destinations Step 13pulsar Step 14celery Step 15gxadmin Step 16reports Step 17monitoring Step 18tiaas Step 19sentry Step 20ftp Step 21beacon
Mapping jobs to destinations
In order to run jobs in Galaxy, you need to assign them to a resource manager that can handle the task. This involves specifying the appropriate amount of memory and CPU cores. For production installations, the jobs must be routed to a resource manager like SLURM, HTCondor, or Pulsar. Some tools may need specific resources such as GPUs or multi-core machines to work efficiently.
Sometimes, your available resources are spread out across multiple locations and resource managers. In such cases, you need a way to route your jobs to the appropriate location. Galaxy offers several methods for routing jobs, ranging from simple static mappings to custom Python functions via dynamic job destinations.
The Galaxy project has introduced a library named Total Perspective Vortex (TPV) to simplify this process. TPV provides an admin-friendly YAML configuration that works for most scenarios. For more complex cases, TPV also allows you to embed Python code into the configuration YAML file and implement fine-grained control over jobs.
Lastly, TPV offers a shared global database of default resource requirements (more below). By leveraging this database, admins don’t have to figure out the requirements for each tool separately.
Writing a testing tool
To demonstrate a real-life scenario and TPV’s role in it, let’s plan on setting up a configuration where the VM designated for training jobs doesn’t run real jobs and hence doesn’t get overloaded. To start, we’ll create a “testing” tool that we’ll use in our configuration. This testing tool can run quickly, and without overloading our small machines.
Hands-on: Deploying a Tool
Create the directory files/galaxy/tools/ if it doesn’t exist and edit a new file in files/galaxy/tools/testing.xml with the following contents:
--- /dev/null
+++ b/files/galaxy/tools/testing.xml
@@ -0,0 +1,11 @@
+<tool id="testing" name="Testing Tool">
+    <command>
+        <![CDATA[echo "Running with '\${GALAXY_SLOTS:-1}' threads" > "$output1"]]>
+    </command>
+    <inputs>
+        <param name="input1" type="data" format="txt" label="Input Dataset"/>
+    </inputs>
+    <outputs>
+        <data name="output1" format="txt" />
+    </outputs>
+</tool>
If you haven’t worked with diffs before, this can be something quite new or different.
Suppose we have two versions of a grocery list, stored in two files. We’ll call them ‘old’ and ‘new’.
Input: Old
$ cat old
🍎
🍐
🍊
🍋
🍒
🥑
Output: New
$ cat new
🍎
🍐
🍊
🍋
🍍
🥑
We can see that they have some different entries. We’ve removed 🍒 because they’re awful, and replaced them with a 🍍.
Diff lets us compare these files
$ diff old new
5c5
< 🍒
---
> 🍍
Here we see that 🍒 is only in old, and 🍍 is only in new. Otherwise the files are identical.
There are a couple different formats to diffs, one is the ‘unified diff’
$ diff -U2 old new
--- old 2022-02-16 14:06:19.697132568 +0100
+++ new 2022-02-16 14:06:36.340962616 +0100
@@ -3,4 +3,4 @@
🍊
🍋
-🍒
+🍍
 🥑
This is basically what you see in the training materials, which gives you a lot of context about the changes:
- --- old is the ‘old’ file in our view
- +++ new is the ‘new’ file
- @@ these lines tell us where the change occurs and how many lines are added or removed.
- Lines starting with a - are removed from our ‘new’ file
- Lines with a + have been added.
So when you go to apply these diffs to your files in the training:
- Ignore the header
- Remove lines starting with - from your file
- Add lines starting with + to your file
The other lines (🍊/🍋 and 🥑) above just provide “context”, they help you know where a change belongs in a file, but should not be edited when you’re making the above change. Given the above diff, you would find a line with a 🍒, and replace it with a 🍍
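If it helps to experiment, the same unified diff can be produced with Python's standard `difflib` module. This is only an illustrative sketch: the two lists stand in for the ‘old’ and ‘new’ files, and `n=2` plays the role of `-U2`.

```python
import difflib

# The two grocery lists from above, one entry per line.
old = ["🍎", "🍐", "🍊", "🍋", "🍒", "🥑"]
new = ["🍎", "🍐", "🍊", "🍋", "🍍", "🥑"]

# n=2 asks for two lines of context around each change, like `diff -U2`.
diff_lines = list(
    difflib.unified_diff(old, new, fromfile="old", tofile="new", n=2, lineterm="")
)
print("\n".join(diff_lines))
```

The printed hunk header is `@@ -3,4 +3,4 @@`, followed by the context lines, the removed 🍒 line and the added 🍍 line.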
Added & Removed Lines
Removals are very easy to spot, we just have removed lines
--- old 2022-02-16 14:06:19.697132568 +0100
+++ new 2022-02-16 14:10:14.370722802 +0100
@@ -4,3 +4,2 @@
🍋
🍒
-🥑
And additions likewise are very easy, just add a new line, between the other lines in your file.
--- old 2022-02-16 14:06:19.697132568 +0100
+++ new 2022-02-16 14:11:11.422135393 +0100
@@ -1,3 +1,4 @@
🍎
+🍍
🍐
 🍊
Completely new files
Completely new files look a bit different, there the “old” file is /dev/null, the empty file in a Linux machine.
$ diff -U2 /dev/null old
--- /dev/null 2022-02-15 11:47:16.100000270 +0100
+++ old 2022-02-16 14:06:19.697132568 +0100
@@ -0,0 +1,6 @@
+🍎
+🍐
+🍊
+🍋
+🍒
+🥑
And removed files are similar, except with the new file being /dev/null
--- old 2022-02-16 14:06:19.697132568 +0100
+++ /dev/null 2022-02-15 11:47:16.100000270 +0100
@@ -1,6 +0,0 @@
-🍎
-🍐
-🍊
-🍋
-🍒
-🥑
Add the tool to the Galaxy group variables under the new item galaxy_local_tools:
--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -155,6 +155,9 @@ galaxy_config_templates:
 galaxy_extra_dirs:
   - /data
 
+galaxy_local_tools:
+- testing.xml
+
 # Certbot
 certbot_auto_renew_hour: "{{ 23 |random(seed=inventory_hostname) }}"
 certbot_auto_renew_minute: "{{ 59 |random(seed=inventory_hostname) }}"
Run the Galaxy playbook.
Input: Bash
ansible-playbook galaxy.yml
Reload Galaxy in your browser and the new tool should now appear in the tool panel. If you have not already created a dataset in your history, upload a random text dataset. Once you have a dataset, click the tool’s name in the tool panel, then click Execute.
Question: What is the tool’s output?
Running with '1' threads
Of course, this tool doesn’t actually use the allocated number of cores. In a real tool, you would call the tool’s underlying command with whatever flag that tool provides to control the number of threads or processes it starts, such as samtools sort -@ \${GALAXY_SLOTS:-1}.
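For tools whose wrapper calls a script rather than a binary with a thread flag, the same fallback can be applied inside the script. The sketch below assumes a hypothetical Python helper named `thread_count`; only the `GALAXY_SLOTS` environment variable comes from Galaxy itself.

```python
import os


def thread_count(env=None):
    """Return the number of threads a tool script should use."""
    env = os.environ if env is None else env
    # Mirrors the shell expansion ${GALAXY_SLOTS:-1}: unset or empty means 1.
    value = env.get("GALAXY_SLOTS") or "1"
    return int(value)


print(f"Running with '{thread_count()}' threads")
```

Passing a plain dictionary as `env` makes the helper easy to try out without changing the real environment, for example `thread_count({"GALAXY_SLOTS": "4"})`.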
Safeguard: TPV Linting
If we want to change something in production, it is always a good idea to have a safeguard in place. In our case, we would like to check the TPV configuration files for syntax errors, so that nothing breaks when we deploy faulty YAML files or change them quickly on the server. TPV-lint-and-copy works with two separate locations:
- one where you can safely edit your files
- and the actual production config directory that Galaxy reads.
Once you are done with your changes, you can run the script and it will automatically lint and copy over the files, if they are correct and mentioned in your job_conf.yml file, or in the inline job_conf in your group_vars/galaxyservers.yml.
And of course, Galaxy has an Ansible Role for that.
Hands-on: Adding the automated TPV-lint-and-copy script
Add the role to your requirements.yml.
--- a/requirements.yml
+++ b/requirements.yml
@@ -28,3 +28,6 @@
   version: 0.0.3
 - src: galaxyproject.slurm
   version: 1.0.2
+# TPV Linting
+- name: usegalaxy_eu.tpv_auto_lint
+  version: 0.4.3
Install the missing role
Input: Bash
ansible-galaxy install -p roles -r requirements.yml
Change your group_vars/galaxyservers.yml. We need to create a new directory where the TPV configs will be stored after linting, and add that directory name as a variable for the role. The default name is ‘TPV_DO_NOT_TOUCH’ for extra safety 😉. If you want a different name, you need to change the tpv_config_dir_name variable, too. We also need to create a directory, tpv_mutable_dir (a role default variable), where TPV configs are copied before linting.
--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -138,6 +138,8 @@ galaxy_config:
           - job-handlers
           - workflow-schedulers
 
+galaxy_job_config_file: "{{ galaxy_config_dir }}/galaxy.yml"
+
 galaxy_config_files_public:
   - src: files/galaxy/welcome.html
     dest: "{{ galaxy_mutable_config_dir }}/welcome.html"
@@ -154,6 +156,11 @@ galaxy_config_templates:
 
 galaxy_extra_dirs:
   - /data
+  - "{{ galaxy_config_dir }}/{{ tpv_config_dir_name }}"
+
+galaxy_extra_privsep_dirs:
+  - "{{ tpv_mutable_dir }}"
+tpv_privsep: true
 
 galaxy_local_tools:
 - testing.xml
Add the role to your galaxy.yml playbook.
--- a/galaxy.yml
+++ b/galaxy.yml
@@ -38,6 +38,7 @@
     - galaxyproject.slurm
     - usegalaxy_eu.apptainer
     - galaxyproject.galaxy
+    - usegalaxy_eu.tpv_auto_lint
     - role: galaxyproject.miniconda
       become: true
       become_user: "{{ galaxy_user_name }}"
At this point we won’t run the modified playbook just yet. Because TPV itself has not yet been installed, the tpv_auto_lint role would fail at this point. So first, we’ll have to install and configure TPV itself before the linter can work.
Configuring TPV
We want our tool to run with more than one core. To do this, we need to instruct Slurm to allocate more cores for this job. First however, we need to configure Galaxy to use TPV.
Hands-on: Adding TPV to your job configuration
Edit group_vars/galaxyservers.yml and add the following destination under your job_config section to route all jobs to TPV.
--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -24,34 +24,18 @@ galaxy_job_config:
   handling:
     assign: ['db-skip-locked']
   execution:
-    default: slurm
+    default: tpv_dispatcher
     environments:
       local_env:
         runner: local_runner
         tmp_dir: true
-      slurm:
-        runner: slurm
-        singularity_enabled: true
-        env:
-        - name: LC_ALL
-          value: C
-        - name: APPTAINER_CACHEDIR
-          value: /tmp/singularity
-        - name: APPTAINER_TMPDIR
-          value: /tmp
-      singularity:
-        runner: local_runner
-        singularity_enabled: true
-        env:
-        # Ensuring a consistent collation environment is good for reproducibility.
-        - name: LC_ALL
-          value: C
-        # The cache directory holds the docker containers that get converted
-        - name: APPTAINER_CACHEDIR
-          value: /tmp/singularity
-        # Apptainer uses a temporary directory to build the squashfs filesystem
-        - name: APPTAINER_TMPDIR
-          value: /tmp
+      tpv_dispatcher:
+        runner: dynamic
+        type: python
+        function: map_tool_to_destination
+        rules_module: tpv.rules
+        tpv_config_files:
+          - "{{ tpv_config_dir }}/tpv_rules_local.yml"
   tools:
     - class: local # these special tools that aren't parameterized for remote execution - expression tools, upload, etc
       environment: local_env
@@ -147,6 +131,8 @@ galaxy_config_files_public:
 galaxy_config_files:
   - src: files/galaxy/themes.yml
     dest: "{{ galaxy_config.galaxy.themes_config_file }}"
+  - src: files/galaxy/config/tpv_rules_local.yml
+    dest: "{{ tpv_mutable_dir }}/tpv_rules_local.yml"
 
 galaxy_config_templates:
   - src: templates/galaxy/config/container_resolvers_conf.yml.j2
Note that we set the default execution environment to the tpv_dispatcher, added the tpv_dispatcher itself as a dynamic runner, and removed all other destinations. Adding TPV as a runner will cause Galaxy to automatically install the total-perspective-vortex package on startup as a conditional dependency. Finally, we added a new config file named tpv_rules_local.yml, which we will create next.
Create a new file named tpv_rules_local.yml in the files/galaxy/config/ folder of your ansible playbook, so that it is copied to the config folder on the target. The file should contain the following content:
--- /dev/null
+++ b/files/galaxy/config/tpv_rules_local.yml
@@ -0,0 +1,30 @@
+tools:
+  .*testing.*:
+    cores: 2
+    mem: cores * 4
+
+destinations:
+  local_env:
+    runner: local_runner
+    max_accepted_cores: 1
+    params:
+      tmp_dir: true
+  singularity:
+    runner: local_runner
+    max_accepted_cores: 1
+    params:
+      singularity_enabled: true
+    env:
+      # Ensuring a consistent collation environment is good for reproducibility.
+      LC_ALL: C
+      # The cache directory holds the docker containers that get converted
+      APPTAINER_CACHEDIR: /tmp/singularity
+      # Singularity uses a temporary directory to build the squashfs filesystem
+      APPTAINER_TMPDIR: /tmp
+  slurm:
+    inherits: singularity
+    runner: slurm
+    max_accepted_cores: 16
+    params:
+      native_specification: --nodes=1 --ntasks=1 --cpus-per-task={cores}
+
In this TPV config, we have specified that the testing tool should use 2 cores. Memory has been defined as an expression and should be 4 times as much as cores, which, in this case, is 8GB. Note that the tool id is matched via a regular expression against the full tool id. For example, a full tool id for hisat may look like: toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy7
This enables complex matching, including matching against specific versions of tools.
Destinations must also be defined in TPV itself. Importantly, note that any destinations defined in the job conf are ignored by TPV. Therefore, we have moved all destinations from the job conf to TPV. In addition, we have removed some redundancy by using the “inherits” clause in the slurm destination. This means that slurm will inherit all of the settings defined for singularity, but selectively override some settings. We have additionally defined the native_specification param for SLURM, which is what SLURM uses to allocate resources per job. Note the use of the {cores} parameter within the native specification, which TPV will replace at runtime with the value of cores assigned to the tool.
Finally, we have also defined a new property named max_accepted_cores, which is the maximum number of cores this destination will accept. Since the testing tool requests 2 cores, but only the slurm destination is able to accept jobs that need more than 1 core, TPV will automatically route the job to the best matching destination, in this case, slurm.
Run the Galaxy playbook.
Input: Bash
ansible-playbook galaxy.yml
Click the rerun button on the last history item, or click Testing Tool in the tool panel, and then click the tool’s Run Tool button.
Question: What is the tool’s output?
Running with '2' threads
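To make the routing above more concrete, here is a minimal Python sketch of the same decisions: match the tool id with a regular expression, evaluate the mem expression, pick the first destination whose max_accepted_cores is large enough, and fill {cores} into the native specification. It is not TPV's implementation. The dictionaries are simplified, hypothetical copies of the rules above, and the use of `re.match` and first-fit selection are assumptions made for illustration.

```python
import re

# Hypothetical, simplified copies of the rules above (not TPV's data model).
TOOLS = {r".*testing.*": {"cores": 2, "mem": "cores * 4"}}
DESTINATIONS = {
    "local_env": {"max_accepted_cores": 1},
    "singularity": {"max_accepted_cores": 1},
    "slurm": {
        "max_accepted_cores": 16,
        "native_specification": "--nodes=1 --ntasks=1 --cpus-per-task={cores}",
    },
}


def resolve(tool_id):
    """Return (destination, cores, mem, native_spec) for a full tool id."""
    for pattern, rule in TOOLS.items():
        if re.match(pattern, tool_id):
            cores = rule["cores"]
            # The mem expression is evaluated with `cores` in scope.
            mem = eval(rule["mem"], {}, {"cores": cores})
            break
    else:
        raise LookupError(f"no rule matches {tool_id}")
    for name, dest in DESTINATIONS.items():
        if cores <= dest["max_accepted_cores"]:
            spec = dest.get("native_specification", "").format(cores=cores)
            return name, cores, mem, spec
    raise LookupError("no destination accepts this job")


print(resolve("testing"))
```

With these rules, the testing tool resolves to the slurm destination with 2 cores and 8 GB, which matches the tool output shown above. A tool id that matches no pattern raises `LookupError` in this sketch, whereas a real configuration would normally fall back to defaults.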
Configuring defaults
Now that we’ve configured the resource requirements for a single tool, let’s see how we can configure defaults for all tools, and reuse those defaults to reduce repetition.
Hands-on: Configuring defaults and inheritance
Edit your files/galaxy/config/tpv_rules_local.yml and add the following settings.
--- a/files/galaxy/config/tpv_rules_local.yml
+++ b/files/galaxy/config/tpv_rules_local.yml
@@ -1,4 +1,11 @@
+global:
+  default_inherits: default
+
 tools:
+  default:
+    abstract: true
+    cores: 1
+    mem: cores * 4
   .*testing.*:
     cores: 2
     mem: cores * 4
@@ -27,4 +34,3 @@ destinations:
     max_accepted_cores: 16
     params:
       native_specification: --nodes=1 --ntasks=1 --cpus-per-task={cores}
-
We have defined a global section specifying that all tools and destinations should inherit from a specified default. We have then defined a tool named default, whose properties are implicitly inherited by all tools at runtime. This means that our testing tool will also inherit from this default tool, but it explicitly overrides cores. We can also explicitly specify an inherits clause if we wish to extend a specific tool or destination, as previously shown in the destinations section.
Run the Galaxy playbook. When the new tpv_rules_local.yml is copied, TPV will automatically pick up the changes without requiring a restart of Galaxy.
Input: Bash
ansible-playbook galaxy.yml
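The effect of inheriting from the default tool can be pictured as a dictionary overlay: start from the parent's settings and let the child replace only what it defines. The sketch below is an assumption-laden illustration, not TPV code. `inherit` is a hypothetical helper, and the testing entry deliberately omits mem so that the inherited expression is visible.

```python
# Simplified stand-ins for the `default` tool and a tool that overrides cores.
DEFAULT_TOOL = {"cores": 1, "mem": "cores * 4"}
TESTING_TOOL = {"cores": 2}


def inherit(parent, child):
    """Overlay the child's settings on top of the parent's."""
    merged = dict(parent)
    merged.update(child)
    return merged


effective = inherit(DEFAULT_TOOL, TESTING_TOOL)
# The inherited mem expression is evaluated with the overridden cores value.
effective_mem = eval(effective["mem"], {}, {"cores": effective["cores"]})
print(effective, effective_mem)
```

Because the mem expression is evaluated after the overlay, the overridden cores value of 2 yields 8 for mem, even though the expression itself came from the default.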
TPV reference documentation
Please see TPV’s dedicated documentation for more information.
Configuring the TPV shared database
The Galaxy Project maintains a shared database of TPV rules so that admins do not have to independently rediscover ideal resource allocations for specific tools. These rules are based on settings that have worked well in the usegalaxy.* federation. The rule file can simply be imported directly, with local overrides applied on top.
Edit group_vars/galaxyservers.yml and add the location of the TPV shared rule file to the tpv_dispatcher destination.
--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -35,6 +35,7 @@ galaxy_job_config:
         function: map_tool_to_destination
         rules_module: tpv.rules
         tpv_config_files:
+          - https://gxy.io/tpv/db.yml
           - "{{ tpv_config_dir }}/tpv_rules_local.yml"
   tools:
     - class: local # these special tools that aren't parameterized for remote execution - expression tools, upload, etc
Note how TPV allows the file to be imported directly via its HTTP URL. As many local and remote rule files as necessary can be combined, with rule files specified later overriding any previously specified rule files. The TPV shared database does not define destinations, only cores and mem settings, as well as any required environment vars. Take a look at the shared database of rules and note that some tools have very large recommended memory settings, which may or may not be available within your local cluster. Nevertheless, you may still wish to execute these tools with memory adjusted to suit your cluster’s capabilities.
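The "later files override earlier files" behaviour described above can be sketched as an ordered merge. This is a hedged illustration only: `merge_rule_files` is a hypothetical helper, the tool name `hypothetical_assembler` and its values are invented, and TPV's real merge semantics may be richer than a per-tool dictionary update.

```python
def merge_rule_files(*rule_files):
    """Merge per-tool settings in order; later files win on conflicts."""
    merged = {}
    for rules in rule_files:
        for tool_id, settings in rules.get("tools", {}).items():
            merged.setdefault(tool_id, {}).update(settings)
    return merged


# Invented example data: a shared rule and a local override for the same tool.
shared = {"tools": {"hypothetical_assembler": {"cores": 16, "mem": 200}}}
local = {"tools": {"hypothetical_assembler": {"mem": 64}}}

combined = merge_rule_files(shared, local)
print(combined)
```

Here the local file only lowers mem, so the cores value from the shared file survives the merge.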
Edit your files/galaxy/config/tpv_rules_local.yml and make the following changes.
--- a/files/galaxy/config/tpv_rules_local.yml
+++ b/files/galaxy/config/tpv_rules_local.yml
@@ -31,6 +31,9 @@ destinations:
   slurm:
     inherits: singularity
     runner: slurm
-    max_accepted_cores: 16
+    max_accepted_cores: 24
+    max_accepted_mem: 256
+    max_cores: 2
+    max_mem: 8
     params:
       native_specification: --nodes=1 --ntasks=1 --cpus-per-task={cores}
These changes indicate that the destination will accept jobs that request up to max_accepted_cores: 24 and max_accepted_mem: 256. If the tool requests resources that exceed these limits, the tool will be rejected by the destination. However, once accepted, the resources will be forcibly clamped down to 2 and 8 at most because of the max_cores and max_mem clauses. (E.g. a tool requesting 24 cores would be accepted, but submitted with only 2 cores.) Therefore, a trick that can be used here to support job resource requirements in the shared database that are much larger than your destination can actually support, is to combine max_accepted_cores/mem/gpus with max_cores/mem/gpus to accept the job and then clamp it down to a supported range. This allows even the largest resource requirement in the shared database to be accommodated.
Comment: Clamping in practice
For the purposes of this tutorial, we’ve clamped down from 24 cores to 2 cores, and mem from 256 to 8, which is unlikely to work in practice. In production, you will probably need to manually test any tools that exceed your cluster’s capabilities, and decide whether you want those tools to run in the first place.
Run the Galaxy playbook.
Input: Bash
ansible-playbook galaxy.yml
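The accept-then-clamp behaviour of max_accepted_cores/mem and max_cores/mem can be summarised in a few lines of Python. This is a sketch of the idea under the values used in this tutorial, not TPV's own code, and `accept_and_clamp` is a hypothetical helper name.

```python
# The slurm destination limits configured above.
SLURM = {"max_accepted_cores": 24, "max_accepted_mem": 256, "max_cores": 2, "max_mem": 8}


def accept_and_clamp(requested_cores, requested_mem, dest):
    """Return the (cores, mem) actually submitted, or None if the job is rejected."""
    if requested_cores > dest["max_accepted_cores"]:
        return None
    if requested_mem > dest["max_accepted_mem"]:
        return None
    # Accepted jobs are clamped down to what the destination can really provide.
    return min(requested_cores, dest["max_cores"]), min(requested_mem, dest["max_mem"])


print(accept_and_clamp(24, 96, SLURM))  # accepted, then clamped
print(accept_and_clamp(32, 96, SLURM))  # rejected: too many cores requested
```

A request for 24 cores and 96 GB is accepted and submitted as 2 cores and 8 GB, while a request for 32 cores is rejected outright because it exceeds max_accepted_cores.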
Basic access controls
You may wish to apply some basic restrictions on which users are allowed to run specific tools. TPV accommodates user and role specific rules. In addition, TPV supports tagging of tools, users, roles and destinations. These tags can be matched up so that only desired combinations are compatible with each other. While these mechanisms are detailed in the TPV documentation, we will choose a different problem that highlights some other capabilities: restricting a tool so that only an admin can execute that tool.
Hands-on: Using conditionals to restrict a tool to admins only
Edit your files/galaxy/config/tpv_rules_local.yml and add the following rule.
--- a/files/galaxy/config/tpv_rules_local.yml
+++ b/files/galaxy/config/tpv_rules_local.yml
@@ -9,6 +9,15 @@ tools:
   .*testing.*:
     cores: 2
     mem: cores * 4
+    rules:
+      - id: admin_only_testing_tool
+        if: |
+          # Only allow the tool to be executed if the user is an admin
+          admin_users = app.config.admin_users
+          # last line in block must evaluate to a value - which determines whether the TPV if conditional matches or not
+          not user or user.email not in admin_users
+        fail: Unauthorized. Only admins can execute this tool.
+
 
 destinations:
   local_env:
Note the use of the if rule, which allows for conditional actions to be taken in TPV. An if block is evaluated as a multi-line python block, and can execute arbitrary code, but the last line of the block must evaluate to a value. That value determines whether the if conditional is matched or not. If the conditional is matched, the fail clause is executed in this case, and the message specified in the fail clause is displayed to the user. It is similarly possible to conditionally add job parameters, modify cores/mem/gpus and take other complex actions.
Run the Galaxy playbook to update the TPV rules.
Input: Bash
ansible-playbook galaxy.yml
Try running the tool as both an admin user and a non-admin user. Non-admins should not be able to run it. You can start a private browsing session to test as a non-admin, anonymous user. Anonymous users were enabled in your Galaxy configuration.
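The "last line must evaluate to a value" convention of the if block used in this hands-on can be reproduced in plain Python with the `ast` module: run every statement except the last, then evaluate the last one as an expression. This is one possible way to implement such a convention and is not claimed to be how TPV does it. `eval_block`, the `SimpleNamespace` stand-ins for `app` and `user`, and the email addresses are all hypothetical.

```python
import ast
from types import SimpleNamespace


def eval_block(source, context):
    """Run every statement in `source`, then return the value of the last line."""
    tree = ast.parse(source)
    *body, last = tree.body
    if not isinstance(last, ast.Expr):
        raise ValueError("last line must be an expression")
    scope = dict(context)
    exec(compile(ast.Module(body=body, type_ignores=[]), "<rule>", "exec"), scope)
    return eval(compile(ast.Expression(body=last.value), "<rule>", "eval"), scope)


RULE = """
# Only allow the tool to be executed if the user is an admin
admin_users = app.config.admin_users
not user or user.email not in admin_users
"""

# Stand-in objects; a list is used here for admin_users purely for illustration.
app = SimpleNamespace(config=SimpleNamespace(admin_users=["admin@example.org"]))
admin = SimpleNamespace(email="admin@example.org")
guest = SimpleNamespace(email="guest@example.org")

print(eval_block(RULE, {"app": app, "user": admin}))  # condition not matched
print(eval_block(RULE, {"app": app, "user": guest}))  # condition matched, so `fail` would apply
```

For the admin the block evaluates to False, so the fail clause would not fire. For the guest, and for an anonymous user passed as `None`, it evaluates to True.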
Job Resource Selectors
Certain tools can benefit from allowing users to select appropriate job resource parameters, instead of having admins decide resource allocations beforehand. For example, a user might know that a particular set of parameters and inputs to a certain tool needs a larger memory allocation than the standard amount given to that tool. Galaxy provides functionality to have extra form elements in the tool execution form to specify these additional job resource parameters. This of course assumes that your users are well behaved enough not to choose the maximum whenever available, although such concerns can be mitigated somewhat by the use of concurrency limits on larger memory destinations.
Such form elements can be added to tools without modifying each tool’s configuration file through the use of the job resource parameters configuration file.
Hands-on: Configuring a Resource Selector
Create and open templates/galaxy/config/job_resource_params_conf.xml.j2
--- /dev/null
+++ b/templates/galaxy/config/job_resource_params_conf.xml.j2
@@ -0,0 +1,7 @@
+<parameters>
+    <param label="Cores" name="cores" type="select" help="Number of cores to run job on.">
+        <option value="1">1 (default)</option>
+        <option value="2">2</option>
+    </param>
+    <param label="Time" name="time" type="integer" size="3" min="1" max="24" value="1" help="Maximum job time in hours, 'walltime' value (1-24). Leave blank to use default value." />
+</parameters>
This defines two resource fields, a select box where users can choose between 1 and 2 cores, and a text entry field where users can input an integer value from 1-24 to set the walltime for a job.
As usual, we need to tell Galaxy where to find this file:
--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -37,9 +37,17 @@ galaxy_job_config:
         tpv_config_files:
           - https://gxy.io/tpv/db.yml
           - "{{ tpv_config_dir }}/tpv_rules_local.yml"
+  resources:
+    default: default
+    groups:
+      default: []
+      testing: [cores, time]
   tools:
     - class: local # these special tools that aren't parameterized for remote execution - expression tools, upload, etc
       environment: local_env
+    - id: testing
+      environment: tpv_dispatcher
+      resources: testing
 
 galaxy_config:
   galaxy:
@@ -59,6 +67,7 @@ galaxy_config:
     object_store_store_by: uuid
     id_secret: "{{ vault_id_secret }}"
     job_config: "{{ galaxy_job_config }}" # Use the variable we defined above
+    job_resource_params_file: "{{ galaxy_config_dir }}/job_resource_params_conf.xml"
     # SQL Performance
     slow_query_log_threshold: 5
     enable_per_request_sql_debugging: true
@@ -140,6 +149,8 @@ galaxy_config_templates:
     dest: "{{ galaxy_config.galaxy.container_resolvers_config_file }}"
   - src: templates/galaxy/config/dependency_resolvers_conf.xml
     dest: "{{ galaxy_config.galaxy.dependency_resolvers_config_file }}"
+  - src: templates/galaxy/config/job_resource_params_conf.xml.j2
+    dest: "{{ galaxy_config.galaxy.job_resource_params_file }}"
 
 galaxy_extra_dirs:
   - /data
We have added a resources section. The group ID will be used to map a tool to job resource parameters, and the value of each group is a list of names from job_resource_params_conf.xml to include on the form of any tool that is mapped to that group.
We have also listed the testing tool as a tool which uses resource parameters. When listed here, the resource parameter selection form will be displayed.
Finally, we have specified that the job_resource_params_conf.xml.j2 should be copied across.
This sets everything up to use the job resource selector. We have:
- A set of “job resources” defined which will let the user select the number of cores and walltime.
- A job configuration which:
  - says that our testing tool should allow selection of the cores and time parameters
  - directs it to TPV’s tpv_dispatcher destination
This is a lot, but we’re still missing the last piece for it to work:
Configuring TPV to process resource parameters
Lastly, we need to write a rule in TPV that will read the value of the job resource parameter form fields and decide how to submit the job.
Hands-on: Processing job resource parameters in TPV
Edit files/galaxy/config/tpv_rules_local.yml and make the following changes:
--- a/files/galaxy/config/tpv_rules_local.yml
+++ b/files/galaxy/config/tpv_rules_local.yml
@@ -6,6 +6,8 @@ tools:
     abstract: true
     cores: 1
     mem: cores * 4
+    params:
+      walltime: 8
   .*testing.*:
     cores: 2
     mem: cores * 4
@@ -17,7 +19,13 @@ tools:
           # last line in block must evaluate to a value - which determines whether the TPV if conditional matches or not
           not user or user.email not in admin_users
         fail: Unauthorized. Only admins can execute this tool.
-
+      - id: resource_params_defined
+        if: |
+          param_dict = job.get_param_values(app)
+          param_dict.get('__job_resource', {}).get('__job_resource__select') == 'yes'
+        cores: int(job.get_param_values(app)['__job_resource']['cores'])
+        params:
+          walltime: "{int(job.get_param_values(app)['__job_resource']['time'])}"
 
 destinations:
   local_env:
@@ -45,4 +53,4 @@ destinations:
     max_cores: 2
     max_mem: 8
     params:
-      native_specification: --nodes=1 --ntasks=1 --cpus-per-task={cores}
+      native_specification: --nodes=1 --ntasks=1 --cpus-per-task={cores} --time={params['walltime']}:00:00
We define a conditional rule and check that the job_resource_params have in fact been defined by the user. If defined, we override the cores value with the one specified by the user. We also define a walltime param, setting it to 8 hours by default. We also override walltime if the user has specified it. It is important to note that you are responsible for parameter validation, including the job resource selector. This function only handles the job resource parameter fields, but it could do many other things - examine inputs, job queues, other tool parameters, etc.
Finally, we pass the walltime as part of the native specification.
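Since parameter validation is left to you, it can help to prototype the checks outside of TPV first. The sketch below takes a plain dictionary shaped like the `__job_resource` entry read by the rule above and applies the same limits as the resource selector form (1 or 2 cores, 1 to 24 hours). `resource_overrides` and `native_specification` are hypothetical helper names, and the 8 hour default mirrors the walltime default set in the rule file.

```python
def resource_overrides(param_dict, default_walltime=8):
    """Return (cores, walltime_hours), or None when the user kept the defaults."""
    resource = param_dict.get("__job_resource", {})
    if resource.get("__job_resource__select") != "yes":
        return None
    cores = int(resource["cores"])
    # A blank time field falls back to the default walltime.
    walltime = int(resource.get("time") or default_walltime)
    if cores not in (1, 2):
        raise ValueError(f"cores must be 1 or 2, got {cores}")
    if not 1 <= walltime <= 24:
        raise ValueError(f"walltime must be between 1 and 24 hours, got {walltime}")
    return cores, walltime


def native_specification(cores, walltime):
    """Build the Slurm options in the same shape as the destination above."""
    return f"--nodes=1 --ntasks=1 --cpus-per-task={cores} --time={walltime}:00:00"


selected = {"__job_resource": {"__job_resource__select": "yes", "cores": "2", "time": "12"}}
print(native_specification(*resource_overrides(selected)))
```

For the sample selection this prints `--nodes=1 --ntasks=1 --cpus-per-task=2 --time=12:00:00`. An out-of-range value raises `ValueError` instead of being passed on to Slurm.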
Comment: Rules for TPV code evaluation
Notice how the walltime parameter is wrapped in braces, whereas the cores value isn’t, yet they are both expressions. The reason for this difference is that when TPV evaluates an expression, all string fields (e.g. env, params) are evaluated as Python f-strings, while all non-string fields (integer fields like cores and gpus, float fields like mem, boolean fields like if) are evaluated as Python code-blocks. This rule enables the YAML file to be much more readable, but requires you to keep this simple rule in mind. Note also that Python code-blocks can be multi-line, and that the final line must evaluate to a value that can be assigned to the field.
Run the Galaxy playbook to update the TPV rules.
Input: Bash
ansible-playbook galaxy.yml
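The difference between f-string fields and code-block fields described in the comment on TPV code evaluation can be tried out in plain Python. This sketch only imitates the two evaluation styles with `eval`. The helper names are hypothetical and TPV's own evaluation machinery is not shown here.

```python
def eval_code_field(expression, context):
    """Evaluate a non-string field (cores, mem, ...) as a Python expression."""
    return eval(expression, {}, dict(context))


def eval_string_field(template, context):
    """Evaluate a string field (params, env, ...) as if it were an f-string body."""
    return eval("f" + repr(template), {}, dict(context))


context = {"cores": 2, "params": {"walltime": 12}}

print(eval_code_field("cores * 4", context))
print(eval_string_field("--cpus-per-task={cores} --time={params['walltime']}:00:00", context))
```

The first call returns the integer 8, because the whole field is code. The second returns the string `--cpus-per-task=2 --time=12:00:00`, because only the parts in braces are evaluated.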
Run the Testing Tool with various resource parameter selections
- Use default job resource parameters
- Specify job resource parameters:
- 1 core
- 2 cores
- Some value for walltime from 1-24
The cores parameter can be verified from the output of the tool. The walltime can be verified with scontrol:
Input: Bash
Your job number may be different.
scontrol show job 24
Output
Your output may look slightly different. Note that the TimeLimit for this job (which I gave a 12 hour time limit) was set to 12:00:00.
JobId=24 JobName=g24_multi_anonymous_10_0_2_2
   UserId=galaxy(999) GroupId=galaxy(999)
   Priority=4294901747 Nice=0 Account=(null) QOS=(null)
   JobState=COMPLETED Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:05 TimeLimit=12:00:00 TimeMin=N/A
   SubmitTime=2016-11-05T22:01:09 EligibleTime=2016-11-05T22:01:09
   StartTime=2016-11-05T22:01:09 EndTime=2016-11-05T22:01:14
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=debug AllocNode:Sid=gat2016:1860
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=localhost
   BatchHost=localhost
   NumNodes=1 NumCPUs=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,node=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=(null)
   WorkDir=/srv/galaxy/server/database/jobs/000/24
   StdErr=/srv/galaxy/server/database/jobs/000/24/galaxy_24.e
   StdIn=StdIn=/dev/null
   StdOut=/srv/galaxy/server/database/jobs/000/24/galaxy_24.o
   Power= SICP=0
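If you want to check the TimeLimit for many jobs, the KEY=VALUE layout of the scontrol output is easy to parse. The following sketch assumes you have already captured the text of `scontrol show job` somehow. `scontrol_fields` is a hypothetical helper, and the sample string is a shortened excerpt of the output above.

```python
import re


def scontrol_fields(text):
    """Split `scontrol show job` output into a dict of KEY=VALUE pairs."""
    # The key is everything up to the first "=", the value runs to the next whitespace.
    return dict(re.findall(r"(\S+?)=(\S*)", text))


sample = (
    "JobId=24 JobName=g24_multi_anonymous_10_0_2_2 "
    "RunTime=00:00:05 TimeLimit=12:00:00 TimeMin=N/A "
    "TRES=cpu=1,node=1"
)
fields = scontrol_fields(sample)
print(fields["TimeLimit"])
```

For the sample this prints `12:00:00`. Values that themselves contain an equals sign, such as the TRES entry, are kept intact because only the first `=` in each token separates key from value.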
Comment: Got lost along the way?If you missed any steps, you can compare against the reference files, or see what changed since the previous tutorial.
If you’re using git to track your progress, remember to add your changes and commit with a good commit message!
More on TPV
The goal of this tutorial is to provide a quick overview of some of the basic capabilities of TPV. However, there are numerous features that we have not covered, such as:
- Custom code blocks: execute arbitrary python code blocks, access additional context variables, etc.
- User and Role Handling: add scheduling constraints based on the user’s email or role
- Metascheduling support: perform advanced querying and filtering prior to choosing an appropriate destination
- Job resubmissions: resubmit a job if it fails for some reason
- Linting, formatting and dry-run: automatically format TPV rule files, catch potential syntax errors and perform a dry-run to check where a tool would get scheduled.
These features are covered in detail in the TPV documentation.
Further Reading
- The sample dynamic tool destination config file fully describes the configuration language
- Dynamic destination documentation
- Job resource parameters are not as well documented as they could be, but the sample configuration file shows some of the possibilities.
- usegalaxy.org’s job_conf.yml is publicly available for reference.
- usegalaxy.eu’s job_conf.xml is likewise (see the group_vars/galaxy.yml result)