Server Maintenance: Cleanup, Backup, and Restoration

Author(s): Helena Rasche, Lucille Delisle, Nate Coraor

Overview
Questions:

How can I back up my Galaxy?

What data should be included?

How can I ensure jobs get cleaned up appropriately?

How do I maintain a Galaxy server?

What happens if I lose everything?

Objectives:

Learn about different maintenance steps

Set up PostgreSQL backups

Set up cleanups

Learn what to back up and how to recover

Requirements:

Slides: Galaxy Installation with Ansible

Hands-on: Galaxy Installation with Ansible

A VM with at least 2 vCPUs and 4 GB RAM, preferably running Ubuntu 18.04 - 20.04.

Time estimation: 30 minutes

Supporting Materials:

Slides

FAQs

Published: Apr 16, 2023

Last modification: Jul 13, 2023

License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT

PURL: https://gxy.io/GTN:T00324

Revision: 5

Keeping your Galaxy cleaned up is an important way to save disk space, which for many groups is the limiting factor in their deployment.

Backups are equally necessary: if you ever experience a system-level failure, they let you recover safely.

Agenda

Cleanups

User Created Files

Galaxy Created Files

Backups

Galaxy

Database Backups

Data Backup

Restoration

Restoring the Database

Restoring Galaxy

Restoring User Data

Comment: Galaxy Admin Training Path

The yearly Galaxy Admin Training follows a specific ordering of tutorials. Use this timeline to help keep track of where you are in Galaxy Admin Training.

Step 1

ansible-galaxy

Step 2

backup-cleanup

Step 3

customization

Step 4

tus

Step 5

cvmfs

Step 6

apptainer

Step 7

tool-management

Step 8

reference-genomes

Step 9

data-library

Step 10

dev/bioblend-api

Step 11

connect-to-compute-cluster

Step 12

job-destinations

Step 13

pulsar

Step 14

celery

Step 15

gxadmin

Step 16

reports

Step 17

monitoring

Step 18

tiaas

Step 19

sentry

Step 20

ftp

Step 21

beacon

Cleanups

Running a Galaxy produces two kinds of data that can be cleaned up to save space: files that users create and later delete or purge, and files that Galaxy creates itself.

User Created Files

You can use gxadmin to clean up user-created files. gxadmin is covered in more detail in its own dedicated tutorial.

Hands On: Installing gxadmin with Ansible

Edit your requirements.yml and add the following:

--- a/requirements.yml
+++ b/requirements.yml
@@ -11,3 +11,6 @@
   version: 0.3.1
 - src: usegalaxy_eu.certbot
   version: 0.1.11
+# gxadmin (used in cleanup, and later monitoring.)
+- src: galaxyproject.gxadmin
+  version: 0.0.12
   

Install the role with:

Code In: Bash

ansible-galaxy install -p roles -r requirements.yml

Add the role to your playbook:

--- a/galaxy.yml
+++ b/galaxy.yml
@@ -27,3 +27,4 @@
       become: true
       become_user: "{{ galaxy_user_name }}"
     - galaxyproject.nginx
+    - galaxyproject.gxadmin
   

Set up a cleanup task to run regularly:

--- a/galaxy.yml
+++ b/galaxy.yml
@@ -28,3 +28,11 @@
       become_user: "{{ galaxy_user_name }}"
     - galaxyproject.nginx
     - galaxyproject.gxadmin
+  post_tasks:
+    - name: Setup gxadmin cleanup task
+      ansible.builtin.cron:
+        name: "Cleanup Old User Data"
+        user: galaxy # Run as the Galaxy user
+        minute: "0"
+        hour: "0"
+        job: "SHELL=/bin/bash source {{ galaxy_venv_dir }}/bin/activate &&  GALAXY_LOG_DIR=/tmp/gxadmin/ GALAXY_ROOT={{ galaxy_root }}/server GALAXY_CONFIG_FILE={{ galaxy_config_file }} /usr/local/bin/gxadmin galaxy cleanup 60"
   

The job runs daily at midnight and purges datasets that have been deleted for more than 60 days.

Run the playbook
Code In: Bash
```
ansible-playbook galaxy.yml
```

Whenever gxadmin runs, it writes logs to /tmp/gxadmin, which you can check later.
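
To confirm that the nightly job is in place and producing output, something like the following may help. It is a minimal sketch: `latest_gxadmin_log` is a hypothetical helper name, and it only assumes that gxadmin writes files into the `GALAXY_LOG_DIR` set in the cron job (`/tmp/gxadmin`). No particular log file name is assumed.

```bash
# Hypothetical helper: print the name of the most recently modified file
# in a log directory. Defaults to /tmp/gxadmin, the GALAXY_LOG_DIR used
# in the cron job.
latest_gxadmin_log() {
    local log_dir="${1:-/tmp/gxadmin}"
    # ls -t sorts by modification time, newest first
    ls -t "$log_dir" 2>/dev/null | head -n 1
}

# Usage (on the Galaxy server):
#   sudo crontab -u galaxy -l     # list the galaxy user's cron entries
#   latest_gxadmin_log            # name of the newest log file, if any
```

If the helper prints nothing after the job should have run, the cron entry or the log directory is worth a closer look.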

Galaxy Created Files

Before we begin backing up our Galaxy data, let’s set up automated cleanups to ensure we back up only the minimal required set of data.

Hands On: Configuring Automated Cleanups

Edit galaxy.yml to install tmpwatch (if using RHEL/CentOS/Rocky) or tmpreaper (if using Debian/Ubuntu):

--- a/galaxy.yml
+++ b/galaxy.yml
@@ -21,6 +21,14 @@
     - name: Install Dependencies
       package:
         name: ['acl', 'bzip2', 'git', 'make', 'tar', 'python3-venv', 'python3-setuptools']
+    - name: Install RHEL/CentOS/Rocky specific dependencies
+      package:
+        name: ['tmpwatch']
+      when: ansible_os_family == 'RedHat'
+    - name: Install Debian/Ubuntu specific dependencies
+      package:
+        name: ['tmpreaper']
+      when: ansible_os_family == 'Debian'
   roles:
     - galaxyproject.galaxy
     - role: galaxyproject.miniconda
   

Edit group_vars/galaxyservers.yml and add a variable to enable Galaxy’s managed cleanup:

--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -2,6 +2,7 @@
 galaxy_create_user: true # False by default, as e.g. you might have a 'galaxy' user provided by LDAP or AD.
 galaxy_separate_privileges: true # Best practices for security, configuration is owned by 'root' (or a different user) than the processes
 galaxy_manage_paths: true # False by default as your administrator might e.g. have root_squash enabled on NFS. Here we can create the directories so it's fine.
+galaxy_manage_cleanup: true
 galaxy_layout: root-dir
 galaxy_root: /srv/galaxy
 galaxy_user: {name: "{{ galaxy_user_name }}", shell: /bin/bash}
   

Run the playbook
Code In: Bash
```
ansible-playbook galaxy.yml
```
Check out the cleanup task which has been generated in: /etc/cron.d/ansible_galaxy_tmpclean

This will set up tmpwatch (or tmpreaper, depending on your distribution) to clean up a few folders:

The job working directory. This is important if you set cleanup: onsuccess, so that old failed jobs are removed once you’re done debugging their failures.
The new file upload path, to catch uploaded temporary files that are no longer necessary.
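
Before trusting an automated reaper with a directory, it can be reassuring to preview what an age-based cleanup would match. The sketch below uses `find` rather than tmpwatch or tmpreaper themselves, so it only approximates their behaviour. `list_stale` is a hypothetical helper, and the path and 7-day threshold in the usage comment are placeholders for illustration, not values taken from the generated cron file.

```bash
# Hypothetical helper: list regular files under DIR that were last
# modified more than DAYS days ago. Nothing is deleted; candidates are
# only printed.
list_stale() {
    local dir="$1" days="$2"
    find "$dir" -mindepth 1 -type f -mtime +"$days" -print
}

# Usage (path and threshold are illustrative placeholders):
#   list_stale /path/to/job_working_directory 7
```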

Backups

There are a few important things to back up with your Ansible-managed Galaxy:

- Galaxy
  - The Galaxy-managed config files
  - The playbooks
- The Database
- The Data

Galaxy

By using Ansible, as long as you are storing your playbooks on another system, you are generally safe from failures of the Galaxy node, and you’ll be able to re-run your playbook at a later date.

However, playbooks often do not include:

Which tools you’ve installed (have you ever installed a tool outside of ephemeris? This might be lost!)
Conda environments, which will not always resolve identically over time. If strong guarantees of reproducibility are important, then consider backing these up as well.
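
One way to capture both of these is to snapshot them to files that can be stored alongside the playbooks. The following is a rough sketch, not a prescribed procedure: `snapshot_tools` and `snapshot_conda_env` are hypothetical helper names, the sketch assumes Ephemeris (`get-tool-list`) and `conda` are available on the PATH, and the URL, API key, and environment name in the usage comment are placeholders.

```bash
# Hypothetical helper: save the list of installed tools with Ephemeris.
# $1 = Galaxy URL, $2 = admin API key, $3 = output YAML file.
snapshot_tools() {
    get-tool-list -g "$1" -a "$2" -o "$3"
}

# Hypothetical helper: export one Conda environment definition.
# $1 = environment name, $2 = output file. Assumes conda is on the PATH.
snapshot_conda_env() {
    conda env export -n "$1" > "$2"
}

# Usage (all values are placeholders):
#   snapshot_tools https://galaxy.example.org "$GALAXY_API_KEY" our_tools.yml
#   snapshot_conda_env my-env my-env.yml
```

The resulting tool list is in the format that shed-tools install accepts, so it can be reused when rebuilding a server.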

Database Backups

We’re setting a couple of variables to control the automatic backups. The backups will be placed in the /data/backups folder, next to our user-uploaded Galaxy data.

Hands On: Configuring PostgreSQL Backups

Edit group_vars/dbservers.yml and add some variables to configure PostgreSQL backups:

--- a/group_vars/dbservers.yml
+++ b/group_vars/dbservers.yml
@@ -5,3 +5,7 @@ postgresql_objects_users:
 postgresql_objects_databases:
   - name: "{{ galaxy_db_name }}"
     owner: "{{ galaxy_user_name }}"
+
+# PostgreSQL Backups
+postgresql_backup_dir: /data/backups
+postgresql_backup_local_dir: "{{ '~postgres' | expanduser }}/backups"
   

This will set up our backups to run as a cron job.
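
A backup that silently stopped running is easy to miss, so a small freshness check can be worthwhile. The sketch below is an illustration under assumptions: `backup_is_fresh` is a hypothetical helper, the 48-hour threshold is arbitrary, and it assumes each backup run creates or updates at least one entry directly inside the backup directory.

```bash
# Hypothetical helper: succeed if DIR contains an entry modified within
# the last HOURS hours.
# $1 = backup directory (e.g. /data/backups), $2 = maximum age in hours.
backup_is_fresh() {
    local dir="$1" hours="$2"
    [ -d "$dir" ] || return 1
    # -mmin -N matches entries modified less than N minutes ago
    find "$dir" -mindepth 1 -maxdepth 1 -mmin -"$((hours * 60))" | grep -q .
}

# Usage:
#   backup_is_fresh /data/backups 48 || echo "WARNING: no recent PostgreSQL backup" >&2
```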

Data Backup

With Galaxy it is technically only necessary to back up your inputs, as the downstream files should, in theory, be re-creatable thanks to the reproducibility of Galaxy.

In practice, some groups choose either not to back up at all or to back up everything, often to extremely cheap and slow storage like Glacier or a tape library.

Most groups choose to implement this as a custom cron job, e.g.

post_tasks:
  - name: Setup backup cron job
    ansible.builtin.cron:
      name: "Backup User Data"
      minute: "0"
      hour: "5,2"
      job: "rsync -avr /data/galaxy/ backup@backup.example.org:/backups/$(date -I)/"
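
Before putting a job like this into cron, it may be worth running the same transfer by hand with rsync's `--dry-run` option to see what would be copied. The wrapper below is a sketch only: `backup_user_data` is a hypothetical name, and the host and paths in the usage comment simply mirror the example values of the cron job.

```bash
# Hypothetical helper: rsync SRC into a dated directory below DEST_BASE.
# $1 = source directory, $2 = destination base; any further arguments
# (for example --dry-run) are passed straight through to rsync.
backup_user_data() {
    local src="$1" dest_base="$2"
    shift 2
    # -a already implies recursion, so -r is not repeated here
    rsync -av "$@" "${src%/}/" "${dest_base%/}/$(date -I)/"
}

# Usage: preview first, then run for real
#   backup_user_data /data/galaxy backup@backup.example.org:/backups --dry-run
#   backup_user_data /data/galaxy backup@backup.example.org:/backups
```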

People who, let’s say, care strongly about backups will often insist that you need to version files. This is unnecessary in the Galaxy case, as files are essentially Write Once Read Many (WORM), which is a really good file storage practice. Files can be removed, so it isn’t the true WORM strategy that you’d use for e.g. audit logs, but it is close. Since files never get changed, keeping multiple versions is unnecessary.

Please consider communicating very well with your users what the data backup policy is.

Comment: Got lost along the way?

If you missed any steps, you can compare against the reference files, or see what changed since the previous tutorial.

If you’re using git to track your progress, remember to add your changes and commit with a good commit message!

Restoration

Sometimes failures happen! We’re sorry you have to read this section.

Restoring the Database

This procedure is more complicated; you can read about the restoration procedure in the associated PR.

This step assumes you have pre-existing backups in place. Check this first:

ls /data/backups/

If you have backups, you’re ready to restore:

# Stop Galaxy, you do NOT want galaxy to connect mid-restoration in case it
# tries to modify the database.
sudo systemctl stop galaxy

# Stop the database
sudo systemctl stop postgresql
# Ensure that it is stopped
sudo systemctl status postgresql

# Begin the restore procedure by becoming postgres:
sudo su - postgres

# Move the current, live database to a backup location just in case:
mkdir /tmp/test/

# ====
# NOTE THAT THIS NUMBER MAY BE DIFFERENT FOR YOU!
# You will need to change 12 to whatever version of postgres you're running
# in every subsequent command
# ====
mv /var/lib/postgresql/12/main/* /tmp/test/

# Copy the backup into place
rsync -av /data/backups/YOUR_LATEST_BACKUP/ /var/lib/postgresql/12/main
# Add the restore_command to postgresql.auto.conf, adjusting the path so that
# it points at the wal directory of your backup:
# restore_command = 'cp "/tmp/backup/current/wal/%f" "%p"'
$EDITOR /var/lib/postgresql/12/main/postgresql.auto.conf

# Touch a recovery file
touch /var/lib/postgresql/12/main/recovery.signal

# Leave the postgres shell
exit

# As a user with sudo rights, start the database again
sudo systemctl restart postgresql
sudo systemctl status postgresql
# Restart Galaxy
sudo systemctl start galaxy

If you encounter issues, we suggest reading Lucille’s log of her experiences restoring as you might encounter similar issues.
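
Since the version number in the paths above differs between systems, it can help to check which version directories actually exist before starting. The following is a small sketch with a hypothetical helper name, `pg_data_versions`. It assumes the Debian/Ubuntu layout used in the commands above, where data lives under /var/lib/postgresql/VERSION/main.

```bash
# Hypothetical helper: list the version-numbered directories under a
# PostgreSQL data root (default: /var/lib/postgresql).
pg_data_versions() {
    local root="${1:-/var/lib/postgresql}"
    local d
    for d in "$root"/*/; do
        d="${d%/}"      # strip the trailing slash
        d="${d##*/}"    # keep only the last path component
        case "$d" in
            ''|*[!0-9.]*) ;;            # skip names that are not digits/dots
            *) printf '%s\n' "$d" ;;
        esac
    done
}

# Usage:
#   pg_data_versions
```

Directories such as a local backups folder are skipped because their names are not numeric. On Debian/Ubuntu, `pg_lsclusters` reports similar information.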

Restoring Galaxy

Restoring Galaxy is easy via Ansible. You may want to ensure that users cannot log in during the restore, e.g. by disabling the routes in nginx.

ansible-playbook galaxy.yml

And if you are following best practices, you probably have your tools stored in a YAML file to use with Ephemeris:

shed-tools install -g https://galaxy.example.org -a <api-key> -t our_tools.yml

Restoring User Data

This should simply be a matter of rsyncing your data from the backup location back into /data/galaxy.

You've Finished the Tutorial

Key points

Use configuration management (e.g. Ansible)

Store configuration management in git

Back up the parts of Galaxy that can’t be recreated

Frequently Asked Questions

Have questions about this tutorial? Have a look at the available FAQ pages and support channels

Glossary

WORM: Write Once Read Many

Citing this Tutorial

Helena Rasche, Lucille Delisle, Nate Coraor, Server Maintenance: Cleanup, Backup, and Restoration (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/admin/tutorials/backup-cleanup/tutorial.html Online; accessed TODAY
Hiltemann, Saskia, Rasche, Helena et al., 2023 Galaxy Training: A Powerful Framework for Teaching! PLOS Computational Biology 10.1371/journal.pcbi.1010752
Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012
