Monitoring Galaxy and Pulsar with Sentry

Author(s)	Marius van den Beek
Editor(s)	Helena Rasche
Overview
Objectives:

Have an understanding of Sentry

Install Sentry

Configure Galaxy and Pulsar to send errors to Sentry

Monitor performance with Sentry

Requirements:

Slides: Ansible

Hands-on: Ansible

Slides: Galaxy Installation with Ansible

Hands-on: Galaxy Installation with Ansible

Slides: Running Jobs on Remote Resources with Pulsar

Hands-on: Running Jobs on Remote Resources with Pulsar

Time estimation: 1 hour

Published: Apr 19, 2023

Last modification: Mar 18, 2024

License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT

PURL: https://gxy.io/GTN:T00330

Revision: 6

Overview

Sentry is error-tracking software that helps admins and developers monitor and diagnose issues in their applications. It provides real-time alerts for errors and captures context for each error, such as stack traces and user feedback, so it is often possible to find and fix errors before users report them. Galaxy and Pulsar can log issues and failing tool runs to Sentry.

Agenda

Overview

Installing and Configuring

Installing and Configuring

Generate an error

Sending tool error reports to Sentry

Generating a tool error

Reporting errors from the Pulsar server

Comment: Galaxy Admin Training Path

The yearly Galaxy Admin Training follows a specific ordering of tutorials. Use this timeline to help keep track of where you are in Galaxy Admin Training.

Step 1

ansible-galaxy

Step 2

backup-cleanup

Step 3

customization

Step 4

tus

Step 5

cvmfs

Step 6

apptainer

Step 7

tool-management

Step 8

reference-genomes

Step 9

data-library

Step 10

dev/bioblend-api

Step 11

connect-to-compute-cluster

Step 12

job-destinations

Step 13

pulsar

Step 14

celery

Step 15

gxadmin

Step 16

reports

Step 17

monitoring

Step 18

tiaas

Step 19

sentry

Step 20

ftp

Step 21

beacon

We’re going to set up a local Sentry instance using docker-compose and connect Galaxy and Pulsar to that Sentry instance. Alternatively, you can use the hosted Sentry at https://sentry.io/.

Installing and Configuring

To proceed from here it is expected that:

Comment: Requirements for Running This Tutorial

You have set up a working Galaxy instance as described in the ansible-galaxy tutorial.

Installing and Configuring

First we need to add our new Ansible role to requirements.yml:

Hands On: Set up Sentry with Ansible
In your working directory, add the roles to your requirements.yml
--- a/requirements.yml
+++ b/requirements.yml
@@ -54,3 +54,6 @@
 # Training Infrastructure as a Service
 - src: galaxyproject.tiaas2
   version: 2.1.5
+# Sentry
+- name: mvdbeek.sentry_selfhosted
+  src: https://github.com/mvdbeek/ansible-role-sentry/archive/main.tar.gz
   
If you haven’t worked with diffs before, this can be something quite new or different.

Say we have two versions of a grocery list, stored in two files. We’ll call them ‘old’ and ‘new’.
Code In: Old
$ cat old
🍎
🍐
🍊
🍋
🍒
🥑
Code Out: New
$ cat new
🍎
🍐
🍊
🍋
🍍
🥑
We can see that they have some different entries. We’ve removed 🍒 because they’re awful, and replaced them with a 🍍.

Diff lets us compare these files
$ diff old new
5c5
< 🍒
---
> 🍍
Here we see that 🍒 is only in old, and 🍍 is only in new. Otherwise the files are identical.

There are a couple of different diff formats. One is the ‘unified diff’:
$ diff -U2 old new
--- old	2022-02-16 14:06:19.697132568 +0100
+++ new	2022-02-16 14:06:36.340962616 +0100
@@ -3,4 +3,4 @@
 🍊
 🍋
-🍒
+🍍
 🥑
This is basically what you see in the training materials which gives you a lot of context about the changes:

--- old is the ‘old’ file in our view

+++ new is the ‘new’ file

@@ these lines tell us where the change occurs and how many lines are added or removed.

Lines starting with a - are removed from our ‘new’ file

Lines with a + have been added.

So when you go to apply these diffs to your files in the training:

Ignore the header

Remove lines starting with - from your file

Add lines starting with + to your file

The other lines (🍊/🍋 and 🥑) above just provide “context”: they help you find where a change belongs in a file, but should not be edited when you’re making the change. Given the above diff, you would find the line with a 🍒 and replace it with a 🍍.
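The unified diff shown earlier can also be reproduced programmatically. The following is a minimal sketch using Python's standard difflib module; the helper name grocery_diff and the variable names are made up for this illustration and are not part of the training materials.

```python
import difflib

old = ["🍎", "🍐", "🍊", "🍋", "🍒", "🥑"]
new = ["🍎", "🍐", "🍊", "🍋", "🍍", "🥑"]


def grocery_diff(a, b, context=2):
    """Return the unified diff between two lists of lines (hypothetical helper)."""
    # n=context mirrors `diff -U2`; lineterm="" avoids doubled newlines when printing.
    return list(
        difflib.unified_diff(a, b, fromfile="old", tofile="new", n=context, lineterm="")
    )


diff_lines = grocery_diff(old, new)
print("\n".join(diff_lines))
```

The printed hunk has the same shape as the one above: two header lines, an @@ line, context lines prefixed with a space, one - line and one + line.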

Added & Removed Lines

Removals are easy to spot: the removed lines start with a -.
--- old	2022-02-16 14:06:19.697132568 +0100
+++ new	2022-02-16 14:10:14.370722802 +0100
@@ -4,3 +4,2 @@
 🍋
 🍒
-🥑
Additions are just as easy: add the new line between the surrounding context lines in your file.
--- old	2022-02-16 14:06:19.697132568 +0100
+++ new	2022-02-16 14:11:11.422135393 +0100
@@ -1,3 +1,4 @@
 🍎
+🍍
 🍐
 🍊
Completely new files

Completely new files look a bit different. There, the “old” file is /dev/null, the empty file on a Linux machine.
$ diff -U2 /dev/null old
--- /dev/null	2022-02-15 11:47:16.100000270 +0100
+++ old	2022-02-16 14:06:19.697132568 +0100
@@ -0,0 +1,6 @@
+🍎
+🍐
+🍊
+🍋
+🍒
+🥑
And removed files are similar, except with the new file being /dev/null
--- old	2022-02-16 14:06:19.697132568 +0100
+++ /dev/null	2022-02-15 11:47:16.100000270 +0100
@@ -1,6 +0,0 @@
-🍎
-🍐
-🍊
-🍋
-🍒
-🥑
Install the roles with:
Code In: Bash
ansible-galaxy install -p roles -r requirements.yml
Create a new playbook, sentry.yml with the following:
--- /dev/null
+++ b/sentry.yml
@@ -0,0 +1,7 @@
+- hosts: sentryservers
+  become: true
+  pre_tasks:
+    - pip:
+        name: docker-compose
+  roles:
+    - mvdbeek.sentry_selfhosted
   
During this tutorial we will install everything on the same host, but often one keeps the monitoring infrastructure (Grafana, InfluxDB, Sentry) on a separate host.
Edit the inventory file (hosts) and add a group for Sentry like:
--- a/hosts
+++ b/hosts
@@ -6,3 +6,6 @@ galaxyservers
 gat-0.oz.galaxy.training ansible_user=ubuntu
 [monitoring]
 gat-0.eu.galaxy.training ansible_connection=local ansible_user=ubuntu
+
+[sentryservers]
+gat-0.eu.training.galaxyproject.eu ansible_connection=local ansible_user=ubuntu
   
Ensure that the hostname is the full hostname of your machine.

Sentry requires its own (sub)domain. For the admin training we have set up the sentry.gat-N.eu.galaxy.training subdomain. If you run this tutorial outside of the training and cannot obtain a domain or subdomain for Sentry, you can use the free Duck DNS service to map an IP address to a domain name.
Edit the file group_vars/sentryservers.yml and set the following variables:
--- /dev/null
+++ b/group_vars/sentryservers.yml
@@ -0,0 +1,6 @@
+sentry_version: 23.3.1
+sentry_url: "https://{{ sentry_domain }}"
+sentry_docker_compose_project_folder: /srv/sentry
+sentry_superusers:
+  - email:  admin@example.com
+    password: "{{ vault_sentry_password }}"
   
We will add the associated admin password to the vault. Do that now:
Code In: Bash
ansible-vault edit group_vars/secret.yml
vault_sentry_password: 'some-super-secret-password'
Add the nginx routes
--- /dev/null
+++ b/templates/nginx/sentry.j2
@@ -0,0 +1,20 @@
+server {
+	# Listen on port 443
+	listen        *:443 ssl;
+	# The virtualhost is our domain name
+	server_name   "{{ sentry_domain }}";
+
+	# Our log files will go here.
+	access_log  syslog:server=unix:/dev/log;
+	error_log   syslog:server=unix:/dev/log;
+
+	location / {
+		# This is the backend to send the requests to.
+		proxy_pass "http://localhost:9000";
+
+		proxy_set_header Host $http_host;
+		proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+		proxy_set_header X-Forwarded-Proto $scheme;
+		proxy_set_header Upgrade $http_upgrade;
+	}
+}
   
And make sure the sentry nginx configuration is deployed
--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -219,6 +219,7 @@ nginx_servers:
   - redirect-ssl
 nginx_ssl_servers:
   - galaxy
+  - sentry
 nginx_enable_default_server: false
 nginx_conf_http:
   client_max_body_size: 1g
   
Run the sentry playbook to deploy sentry and the galaxy playbook to update the nginx configuration.
Code In: Bash
ansible-playbook sentry.yml galaxy.yml
Generate a project for Galaxy in Sentry:
Go to the domain you configured for your Sentry instance and log in with the admin email address from group_vars/sentryservers.yml and the password you stored in the vault. Click “continue” on the next page.
Click “Projects”, “Create Project”, “Python”, select “I’ll create my own alerts later”, and set “galaxy” as the Project Name.
You’ll see your project DSN, which will look like https://b0022427ee5345a8ad4cb072c73e62f4@sentry.gat-N.eu.galaxy.training/2. Galaxy needs this string to know where to send data.
To avoid requesting an additional certificate for communication between Galaxy and Sentry, Galaxy talks to Sentry directly on localhost:9000. Replace the host after the @ with localhost:9000 and, because that port is the backend nginx reaches over plain HTTP (see the proxy_pass line above), change the scheme to http.
We will add the Galaxy project DSN to the vault. Edit your group_vars/secret.yml and add the Sentry DSN.
Code In: Bash
ansible-vault edit group_vars/secret.yml
vault_galaxy_sentry_dsn: 'http://b0022427ee5345a8ad4cb072c73e62f4@localhost:9000/2'
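If you prefer not to edit the DSN by hand, the rewrite can be sketched in Python with the standard urllib.parse module. This is only an illustration: localize_dsn is a hypothetical helper, not part of Galaxy or Sentry, and switching the scheme to http is an assumption based on the nginx proxy_pass line pointing at http://localhost:9000.

```python
from urllib.parse import urlsplit, urlunsplit


def localize_dsn(dsn, local_host="localhost:9000", scheme=None):
    """Return `dsn` with the host after the '@' replaced by `local_host`.

    Hypothetical helper for illustration only.
    `scheme=None` keeps the scheme of the original DSN.
    """
    parts = urlsplit(dsn)
    if parts.username is None:
        raise ValueError("DSN has no public key before the '@'")
    # The public key stays, only the host (and optionally the scheme) changes.
    netloc = f"{parts.username}@{local_host}"
    return urlunsplit((scheme or parts.scheme, netloc, parts.path, "", ""))


public_dsn = "https://b0022427ee5345a8ad4cb072c73e62f4@sentry.gat-N.eu.galaxy.training/2"
print(localize_dsn(public_dsn, scheme="http"))
```

The project id at the end of the path (here /2) and the key before the @ are left untouched, since Sentry uses both to identify the project.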
Edit group_vars/galaxyservers.yml to reference the new vault secret:

This will let Galaxy know that captured logs should be sent to our Sentry instance. We will also enable sending performance metrics to Sentry by setting the sentry_traces_sample_rate to 0.5. This will send half of all transactions to Sentry. In a production environment you would reduce this to a smaller percentage of transactions.
--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -119,6 +119,8 @@ galaxy_config:
     # Monitoring
     statsd_host: localhost
     statsd_influxdb: true
+    sentry_dsn: "{{ vault_galaxy_sentry_dsn }}"
+    sentry_traces_sample_rate: 0.5
   gravity:
     process_manager: systemd
     galaxy_root: "{{ galaxy_root }}/server"
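To get a feel for what a sentry_traces_sample_rate of 0.5 means in practice, here is a small simulation. It is a sketch of random sampling in general, not the Sentry SDK's actual implementation; sampled_count and its parameters are hypothetical names chosen for this example.

```python
import random


def sampled_count(n_transactions, sample_rate, seed=0):
    """Count how many of `n_transactions` a random sampler would keep.

    Each transaction is kept independently with probability `sample_rate`.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(1 for _ in range(n_transactions) if rng.random() < sample_rate)


for rate in (1.0, 0.5, 0.05):
    print(f"rate={rate}: kept {sampled_count(10_000, rate)} of 10000 transactions")
```

A rate of 0.5 keeps roughly half of the traffic, which is convenient for a training instance; on a busy server a much smaller rate still yields plenty of traces while sending far less data.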
   
Run the galaxy playbook.
Code In: Bash
ansible-playbook galaxy.yml

Generate an error

Galaxy has a built-in route that intentionally generates an error. Just visit: /error

Hands On: Open the Galaxy Project in Sentry

Go to your Sentry instance and click on “Issues”. You should see a couple of issues; one of them should be the “Fake error” exception we generated by visiting https://galaxy.example.org/error.

Sending tool error reports to Sentry

In addition to sending logging errors to Sentry you can also collect failing tool runs in Sentry. For this we will set up the error reporting configuration file and reference it in galaxy.yml. The user_submission parameter controls whether all reports will be collected in Sentry (when set to false) or only those that have been reported manually (when set to true). For testing purposes we’ll also add a tool that will fail running so we can test that submitting tool errors to Sentry works as expected.
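The effect of user_submission can be summarised as a small truth table. The sketch below only models the behaviour described in the paragraph above; sent_to_sentry is a hypothetical function and not Galaxy's actual code.

```python
def sent_to_sentry(user_submission, manually_reported):
    """Decide whether a failed tool run ends up in Sentry (illustrative model)."""
    if user_submission:
        # Only failures that a user explicitly reported are forwarded.
        return manually_reported
    # Every failed tool run is forwarded automatically.
    return True


for user_submission in (False, True):
    for manually_reported in (False, True):
        outcome = sent_to_sentry(user_submission, manually_reported)
        print(f"user_submission={user_submission} reported={manually_reported} -> {outcome}")
```

With user_submission set to false, as in this tutorial, you see every failing job, which is useful for admins but can be noisy on a large server.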

Hands On: Update Galaxy config to send tool error reports

Create the files/galaxy/config/error_reports.yml file.

--- /dev/null
+++ b/files/galaxy/config/error_reports.yml
@@ -0,0 +1,2 @@
+- type: sentry
+  user_submission: false
   

Create a testing tool in files/galaxy/tools/job_properties.xml.

--- /dev/null
+++ b/files/galaxy/tools/job_properties.xml
@@ -0,0 +1,72 @@
+<tool id="job_properties" name="Test Job Properties" version="1.0.0">
+    <stdio>
+        <exit_code range="127" level="fatal" description="Failing exit code." />
+    </stdio>
+    <version_command>echo 'v1.1'</version_command>
+    <command><![CDATA[
+#if $thebool
+    echo 'The bool is true' &&
+    echo 'The bool is really true' 1>&2 &&
+    echo 'This is a line of text.' > '$out_file1' &&
+    cp '$out_file1' '$one' &&
+    cp '$out_file1' '$two' &&
+    sleep $sleepsecs
+#else
+    echo 'The bool is not true' &&
+    echo 'The bool is very not true' 1>&2 &&
+    echo 'This is a different line of text.' > '$out_file1' &&
+    sleep $sleepsecs &&
+    sh -c 'exit 2'
+#end if
+#if $failbool
+    ## use ';' to concatenate commands so that the next one is run independently
+    ## of the exit code of the previous one
+    ; exit 127
+#end if
+    ]]></command>
+    <inputs>
+        <param name="sleepsecs" type="integer" value="0" label="Sleep this many seconds"/>
+        <param name="thebool" type="boolean" label="The boolean property" />
+        <param name="failbool" type="boolean" label="The failure property" checked="false" />
+    </inputs>
+    <outputs>
+        <data name="out_file1" format="txt" />
+        <collection name="list_output" type="list" label="A list output">
+            <data name="one" format="txt" />
+            <data name="two" format="txt" />
+        </collection>
+    </outputs>
+    <tests>
+        <test expect_exit_code="0">
+            <param name="thebool" value="true" />
+            <assert_stdout>
+                <has_line line="The bool is true" />
+            </assert_stdout>
+            <assert_stderr>
+                <has_line line="The bool is really true" />
+            </assert_stderr>
+            <assert_command_version>
+                <has_text text="v1.1" />
+            </assert_command_version>
+        </test>
+        <test expect_exit_code="2">
+            <param name="thebool" value="false" />
+            <output name="out_file1" file="simple_line_alternative.txt" />
+            <assert_command>
+                <has_text text="very not" />
+            </assert_command>
+            <assert_stdout>
+                <has_line line="The bool is not true" />
+            </assert_stdout>
+            <assert_stderr>
+                <has_line line="The bool is very not true" />
+            </assert_stderr>
+        </test>
+        <test expect_exit_code="127" expect_failure="true">
+            <param name="thebool" value="true" />
+            <param name="failbool" value="true" />
+        </test>
+    </tests>
+    <help>
+    </help>
+</tool>
   

Edit group_vars/galaxyservers.yml to reference the error_reports.yml file and the new testing tool.

--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -121,6 +121,7 @@ galaxy_config:
     statsd_influxdb: true
     sentry_dsn: "{{ vault_galaxy_sentry_dsn }}"
     sentry_traces_sample_rate: 0.5
+    error_report_file: "{{ galaxy_config_dir }}/error_reports_file.yml"
   gravity:
     process_manager: systemd
     galaxy_root: "{{ galaxy_root }}/server"
@@ -173,6 +174,8 @@ galaxy_config_files:
     dest: "{{ galaxy_config.galaxy.themes_config_file }}"
   - src: files/galaxy/config/tpv_rules_local.yml
     dest: "{{ tpv_mutable_dir }}/tpv_rules_local.yml"
+  - src: files/galaxy/config/error_reports.yml
+    dest: "{{ galaxy_config.galaxy.error_report_file }}"
    
 galaxy_config_templates:
   - src: templates/galaxy/config/container_resolvers_conf.yml.j2
@@ -194,6 +197,7 @@ tpv_privsep: true
    
 galaxy_local_tools:
 - testing.xml
+- job_properties.xml
    
 # Certbot
 certbot_auto_renew_hour: "{{ 23 |random(seed=inventory_hostname)  }}"
   

Run the galaxy playbook.
Code In: Bash
```
ansible-playbook galaxy.yml
```

Generating a tool error

To generate a tool error, run the job properties testing tool and set the failbool parameter to true.
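The reason failbool reliably produces exit code 127 is the ';' in the tool's command block: unlike '&&', it runs the next command regardless of how the previous one ended. The following sketch demonstrates this with Python's subprocess module. It assumes a POSIX sh is available on the PATH, and exit_status is a hypothetical helper for this example.

```python
import subprocess


def exit_status(command):
    """Run `command` with sh and return its exit code (hypothetical helper)."""
    return subprocess.run(["sh", "-c", command]).returncode


# thebool=false, failbool=true: ';' runs `exit 127` even though the
# previous command already failed with exit code 2.
with_semicolon = exit_status("sh -c 'exit 2' ; exit 127")

# With '&&' the failing command short-circuits and `exit 127` is never reached.
with_and = exit_status("sh -c 'exit 2' && exit 127")

print(with_semicolon, with_and)
```

Only the first variant ends with 127, the one exit code the tool's stdio section marks as fatal.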

Hands On: Open the Galaxy Project in Sentry

Go to your Sentry instance and click on issues. You should see an issue for the tool run error.

Reporting errors from the Pulsar server

It is also possible to report errors from the Pulsar server. You can either use the Galaxy project we created before in Sentry or create a new project for Pulsar; we recommend a separate Pulsar project. Because the Pulsar server runs on a remote VM, it cannot reach Sentry via localhost: its DSN must point at the Sentry domain, and that domain needs a valid certificate.
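A quick sanity check for a DSN intended for a remote machine can be sketched as follows. The helper usable_from_remote_host is hypothetical, and sentry.example.org is a placeholder domain; the check only looks at the scheme and host and does not validate the certificate itself.

```python
from urllib.parse import urlsplit

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def usable_from_remote_host(dsn):
    """Rough check that a DSN uses https and a host other than localhost."""
    parts = urlsplit(dsn)
    return (
        parts.scheme == "https"
        and bool(parts.hostname)
        and parts.hostname not in LOOPBACK_HOSTS
    )


print(usable_from_remote_host("http://examplekey@localhost:9000/2"))
print(usable_from_remote_host("https://examplekey@sentry.example.org/3"))
```

The first DSN is the kind used for Galaxy on the same host and is rejected; the second is the shape Pulsar needs.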

Hands On: Add Sentry connection to Pulsar
Create a new project named pulsar in Sentry to obtain a new DSN.
We will add the project dsn to the vault. Edit your group_vars/secret.yml and add the sentry dsn.
Code In: Bash
ansible-vault edit group_vars/secret.yml
vault_pulsar_sentry_dsn: 'https://f2a8a00d30224c2c9800a8f79194a32a@/3'
Add the sentry dsn to the pulsar group variables.
--- a/group_vars/pulsarservers.yml
+++ b/group_vars/pulsarservers.yml
@@ -45,6 +45,7 @@ pulsar_yaml_config:
       - type: conda
         auto_init: true
         auto_install: true
+  sentry_dsn: "{{ vault_pulsar_sentry_dsn }}"
    
 # Pulsar should use the same job metrics plugins as Galaxy. This will automatically set `job_metrics_config_file` in
 # `pulsar_yaml_config` and create `{{ pulsar_config_dir }}/job_metrics_conf.yml`.
   
Run the pulsar playbook.
Code In: Bash
ansible-playbook pulsar.yml

Pulsar should now be set up to report errors to Sentry.

Comment: Got lost along the way?

If you missed any steps, you can compare against the reference files, or see what changed since the previous tutorial.

If you’re using git to track your progress, remember to add your changes and commit with a good commit message!

Citing this Tutorial

Marius van den Beek, Monitoring Galaxy and Pulsar with Sentry (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/admin/tutorials/sentry/tutorial.html Online; accessed TODAY
Hiltemann, Saskia, Rasche, Helena et al., 2023 Galaxy Training: A Powerful Framework for Teaching! PLOS Computational Biology 10.1371/journal.pcbi.1010752
Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012

Congratulations on successfully completing this tutorial!
