OBIS marine indicators

Overview
Questions:
  • How can you retrieve OBIS data with Galaxy?

  • How can you analyse and visualise them?

Objectives:
  • Calculating and visualizing marine biodiversity indicators

Time estimation: 1 hour
Published: Apr 11, 2023
Last modification: Jun 27, 2024
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
PURL: https://gxy.io/GTN:T00332
Revision: 8

OBIS is a global open-access data and information clearing-house on marine biodiversity for science, conservation and sustainable development. To visualize its marine data, OBIS created the obisindicators package.

Obisindicators is an R library developed during the 2022 IOOS Code Sprint. Its purpose was to create an ES50 diversity index within hexagonal grids, following the diversity indicators notebook by Pieter Provoost. The package includes several examples, limited to 1M occurrences, that demonstrate its use. This tutorial will guide you through retrieving OBIS marine data and processing them in order to calculate and visualize multiple indicators.

The Galaxy tool for obisindicators computes 5 indicators: number of records, Shannon, Simpson, ES50 and Hill, which are explained in more detail later on.

In this tutorial, which is largely based on the OBIS indicators documentation, we will work on OBIS data extracted from the OBIS mapper. First, those data will be prepared.

Agenda

In this tutorial, we will cover:

  1. Upload of the data
  2. Use the galaxy tool
  3. Clean data Advanced Cut
  4. Get to the Portal
  5. Convert data csv-to-tabular
  6. Clean data Advanced Cut
  7. Ocean biodiversity indicators
  8. Number of records
  9. Shannon
  10. Simpson
  11. ES50
  12. Maxp
  13. Hill
  14. Index file
  15. Conclusion

Upload of the data

This first step consists of downloading OBIS data and uploading them to Galaxy.

Hands-on: Data upload

Create a new history for this tutorial and give it a name (example: “Obisindicators tutorial”) so that you can find it again later if needed.

To create a new history simply click the new-history icon at the top of the history panel:

UI for creating new history

Hands-on: Choose Your Own Tutorial

This is a "Choose Your Own Tutorial" section, where you can select between multiple paths. Click one of the buttons below to select how you want to follow the tutorial.

You can either use a Galaxy tool to download the data you want to study or download them from the OBIS portal.

Use the galaxy tool

Hands-on: Download your data
  1. OBIS occurences (Galaxy version 0.0.2) with the following parameters:
    • “Scientific name of the species”: Empty
    • “Taxon ID”: Empty
    • “Input latitude min (+north/-south)”: 41.5
    • “Input latitude max (+north/-south)”: 45.0
    • “Input longitude min (+east/-west)”: 7.5
    • “Input longitude max (+east/-west)”: 10.0
  2. Run Tool
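
The four bounds entered above describe a rectangular bounding box. As a minimal sketch, the rectangle can be written as a WKT polygon, a common text format for passing an area to occurrence services. The function name `bbox_to_wkt` is hypothetical, and whether the Galaxy tool builds its query this way internally is an assumption, not something stated in this tutorial.

```python
def bbox_to_wkt(lat_min, lat_max, lon_min, lon_max):
    """Return a WKT polygon for a latitude/longitude bounding box.

    Hypothetical helper for illustration only. WKT lists each corner as
    "longitude latitude" and repeats the first corner to close the ring.
    """
    if lat_min >= lat_max or lon_min >= lon_max:
        raise ValueError("minimum bounds must be smaller than maximum bounds")
    corners = [
        (lon_min, lat_min),
        (lon_max, lat_min),
        (lon_max, lat_max),
        (lon_min, lat_max),
        (lon_min, lat_min),  # close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


# Bounds used in the hands-on above
print(bbox_to_wkt(41.5, 45.0, 7.5, 10.0))
```

Note that latitude comes first in the tool form, whereas WKT expects longitude first; mixing the two up is a frequent source of empty query results.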

Clean data Advanced Cut

Hands-on: Clean your data
  1. Advanced Cut (Galaxy version 1.1.0) with the following parameters:
    • “File to cut”: Species occurences
    • “Cut by”: fields
      • “List of Fields”: basisOfRecord decimalLatitude decimalLongitude species individualCount

You should always have at least the 4 following columns: decimalLatitude, decimalLongitude, species and individualCount.

Get to the Portal

  1. Download the files from OBIS
OBIS portal welcome page.

Figure 1: OBIS portal
  • Go to the right panel and enter the criteria of your choice. Here, in “Area”, type Mediterranean and select “France: Mediterranean Sea”.
  • Click on Save at the top right
Guide to download OBIS data.

Figure 2: OBIS download
  • Then click on the 3 green lines at the top right and press Download
  • Enter your email in the pop-up window and press Yes, proceed
The OBIS portal while you wait for the download.

Figure 3: Waiting download
  • The download can take a while depending on the size of your dataset (here less than 15 min)
  • Then click on Download ZIP file
  • Don’t forget to unzip the file on your machine.

In the downloaded folder you should have your data in csv format (Occurence.csv). The file must contain at least 4 columns: latitude, longitude, species and record.

  2. Upload OBIS data
  • Open the Galaxy Upload Manager
  • Select Choose local files
  • Browse your computer and select the unzipped data file

  • Press Start (it can take a few seconds to get ready)
  3. Rename the dataset, “obis data” for example, and preview it

  4. Check that the datatype is csv or tabular

  • Click on the pencil icon for the dataset to edit its attributes
  • In the central panel, click the Datatypes tab at the top
  • In Assign Datatype, select the datatype from the “New type” dropdown
    • Tip: you can start typing the datatype into the field to filter the dropdown menu
  • Click the Save button

Convert data csv-to-tabular

Hands-on: Convert your data
  1. On your dataset in the history panel, click on the pencil icon
  2. At the top, click on Convert
  3. Press Create Dataset

Clean data Advanced Cut

Hands-on: Clean your data
  1. Advanced Cut (Galaxy version 9.3+galaxy0) with the following parameters:
    • “File to cut”: obis data
    • “Cut by”: fields
      • “Delimited by”: Tab
      • “Is there a header for the data’s columns ?”: Yes
        • “List of Fields”: basisOfRecord decimalLatitude decimalLongitude species individualCount

You should always have at least the 4 following columns: decimalLatitude, decimalLongitude, species and individualCount.
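
For readers who want to check the cleaning step outside Galaxy, the following is a minimal sketch of the same idea in Python: keep only the named columns of a tab-separated table with a header. The function name `select_columns` is hypothetical, and unlike the Galaxy tool this sketch writes the columns in the order they are requested, which may differ from the order in the input file.

```python
import csv
import io


def select_columns(tabular_text, fields, delimiter="\t"):
    """Keep only the named columns of a delimited table that has a header row.

    Hypothetical helper for illustration; it is not the Advanced Cut tool.
    """
    reader = csv.DictReader(io.StringIO(tabular_text), delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [name for name in fields if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(fields)
    for row in reader:
        writer.writerow([row[name] for name in fields])
    return out.getvalue()


# Tiny made-up table, only to show the call
sample = (
    "id\tdecimalLatitude\tdecimalLongitude\tspecies\tindividualCount\tbasisOfRecord\n"
    "1\t43.1\t8.2\tPosidonia oceanica\t3\tHumanObservation\n"
)
wanted = ["basisOfRecord", "decimalLatitude", "decimalLongitude", "species", "individualCount"]
print(select_columns(sample, wanted))
```

Raising an error when a column is missing makes it obvious early on that the downloaded file does not contain the 4 required columns.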

You are now all set to use your OBIS data for a diversity analysis.

Ocean biodiversity indicators

Hands-on: Ocean biodiversity indicators
  1. Ocean biodiversity indicators (Galaxy version 0.0.1) with the following parameters:
    • “Input table”: The result tabular of the Advanced Cut tool
    • “What character is the separator in your data? (Mostlikely a comma for a csv file and t for a tabular)”: Tabulator (\t)
    • “Select column containing the decimal value of the longitude “: c2
    • “Select column containing the decimal value of the latitude “: c3
    • “Select column containing the species “: c4
    • “Select column containing the number of records”: c5
    • “Type of projection for the map : select your coordinate reference system (CRS)”: Robinson Projection
    • “Choose a resolution for the discrete global grid”: 9

Note that you can rerun the tool and modify the resolution of the maps you want to create.

  2. Click on Execute
  3. You will see 5 outputs appear in the history panel, one for each of the indicators

Number of records

Map of the number of records.

Figure 4: Records map

In the index table, the number of records corresponds to the column n.

Shannon

The Shannon index expresses the uncertainty associated with the prediction of the species the next sampled individual belongs to. It assumes that individuals are randomly sampled from an infinitely large community, and that all species are represented in the sample.

Warning: OBIS uses records as a proxy for individuals. In practice, sampling is generally not random, the community is not infinitely large and not all species are represented in the sample.

The Shannon diversity index, also known as the Shannon-Wiener diversity index, is defined in OBIS as the sum over all species of $-fi*log(fi)$ with fi defined as $ni/n$, where n is the total number of records in the raster cell and ni is the total number of records for the ith species in the raster cell.
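
The definition above can be sketched in a few lines of Python. This is an illustration of the formula for a single cell, not the code of the obisindicators package; the function name `shannon_index` and the example counts are made up.

```python
import math


def shannon_index(records_per_species):
    """Shannon index of one cell: sum over species of -fi * log(fi), fi = ni / n."""
    n = sum(records_per_species.values())  # total number of records in the cell
    total = 0.0
    for ni in records_per_species.values():
        if ni > 0:  # species without records contribute nothing
            fi = ni / n
            total -= fi * math.log(fi)
    return total


# Two species with the same number of records: the index is ln(2), about 0.693
cell = {"Species A": 2, "Species B": 2}
print(shannon_index(cell))
```

A cell containing a single species gets an index of 0, and the value grows as records are spread more evenly over more species.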

Map showing the Shannon index.

Figure 5: Shannon map

Simpson

The measure equals the probability that two entities taken at random from the dataset of interest represent the same type. It equals:

$$ \lambda = \sum_{i=1}^{R} p_i^2 $$

where $R$ is richness (the total number of types in the dataset) and $p_i$ is the proportional abundance of the ith type.

Simpson’s index expresses the probability that any two individuals drawn at random from an infinitely large community belong to the same species. Note that small values are obtained in cells of high diversity and large values in cells of low diversity. This counterintuitive behavior is addressed with the Hill 2 number, which is the inverse of the Simpson index.

The Simpson biodiversity index is defined in OBIS as the sum over all species of $(ni/n)^2$ with n as the total number of records in the cell and ni the total number of records for the ith species.
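
As a minimal sketch of this definition for a single cell (the function name `simpson_index` and the example counts are made up, and this is not the package's own code):

```python
def simpson_index(records_per_species):
    """Simpson index of one cell: sum over species of (ni / n) ** 2."""
    n = sum(records_per_species.values())  # total number of records in the cell
    if n == 0:
        raise ValueError("the cell contains no records")
    return sum((ni / n) ** 2 for ni in records_per_species.values())


# One dominant species gives a value close to 1 (low diversity)
print(simpson_index({"Species A": 9, "Species B": 1}))
# Evenly spread records give a smaller value (higher diversity)
print(simpson_index({"Species A": 5, "Species B": 5}))
```

The two calls illustrate the counterintuitive direction of the index: the more diverse cell has the lower value.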

Warning: The Simpson index has the same assumptions as the Shannon index.

Map showing the Simpson index.

Figure 6: Simpson map

ES50

The expected number of marine species in a random sample of 50 individuals (records) is an indicator of marine biodiversity richness.

The ES50 is defined in OBIS as the \(sum(esi)\) over all species of the following per species calculation:

when $$ n - ni >= 50 $$ (with n as the total number of records in the cell and ni the total number of records for the ith-species)
    $$ esi = 1 - exp(lngamma(n-ni+1) + lngamma(n-50+1) - lngamma(n-ni-50+1) - lngamma(n+1)) $$
when $$ n >= 50 $$
    $$ esi = 1 $$
else
    $$ esi = NULL $$
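
The per species rule above can be sketched in Python with `math.lgamma`, which plays the role of lngamma. This is an illustration for a single cell under the definition given here, not the package's own code; the function name `es50` and the choice to return `None` for cells with fewer than 50 records are assumptions.

```python
import math


def es50(records_per_species, sample_size=50):
    """Expected number of species in a random sample of `sample_size` records.

    Returns None when the cell holds fewer than `sample_size` records,
    mirroring the NULL case of the definition.
    """
    n = sum(records_per_species.values())  # total number of records in the cell
    if n < sample_size:
        return None
    total = 0.0
    for ni in records_per_species.values():
        if n - ni >= sample_size:
            # log of the probability that the species is absent from the sample
            log_absent = (
                math.lgamma(n - ni + 1)
                + math.lgamma(n - sample_size + 1)
                - math.lgamma(n - ni - sample_size + 1)
                - math.lgamma(n + 1)
            )
            total += 1.0 - math.exp(log_absent)
        else:
            # the species is too abundant to be missed by a sample of this size
            total += 1.0
    return total


# 100 records: one species with 99 records, another with a single record
print(es50({"Species A": 99, "Species B": 1}))
```

In this example the rare species has one chance in two of being part of a 50-record sample drawn from 100 records, so the result is close to 1.5.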

Warning: ES50 assumes that individuals are randomly distributed, the sample size is sufficiently large, the samples are taxonomically similar, and that all of the samples have been taken in the same manner.

Map showing the ES50 index.

Figure 7: ES50 map

Maxp

Maxp is the largest share of the cell's records held by a single species: for each species, the total number of records ni is divided by the total number of records in the cell n, and the maximum is kept, i.e. \(Maxp = max(ni / n)\).

Map showing the Maxp index.

Figure 8: Maxp map

Hill

Hill 1

The Hill biodiversity index accounts for species’ relative abundance (number of records in OBIS). Hill1 can be roughly interpreted as the number of species with “typical” abundances, and is a commonly used indicator for marine biodiversity richness.

Hill 2

The Hill biodiversity index accounts for species’ relative abundance (number of records in OBIS) and discounts rare species, so Hill2 can be interpreted as the equivalent number of more dominant species and is therefore less sensitive to sample size than Hill1. The Hill index is a commonly used indicator for marine biodiversity richness.

The Hill numbers are calculated as shown below:
  • \(hill\_1 = exp(shannon)\)
  • \(hill\_2 = 1 / simpson\)
  • \(hill\_inf = 1 / maxp\)
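
The three relations can be checked on a small example. The sketch below recomputes Shannon, Simpson and Maxp for one cell and derives the Hill numbers from them; the function name `hill_numbers` and the example counts are made up for illustration.

```python
import math


def hill_numbers(records_per_species):
    """Hill numbers of one cell, derived from Shannon, Simpson and Maxp."""
    n = sum(records_per_species.values())  # total number of records in the cell
    proportions = [ni / n for ni in records_per_species.values() if ni > 0]
    shannon = -sum(p * math.log(p) for p in proportions)
    simpson = sum(p ** 2 for p in proportions)
    maxp = max(proportions)
    return {
        "hill_1": math.exp(shannon),
        "hill_2": 1 / simpson,
        "hill_inf": 1 / maxp,
    }


# Uneven cell: 3 records of one species and 1 record of another
print(hill_numbers({"Species A": 3, "Species B": 1}))
```

For a cell where all species have the same number of records, the three values coincide with the number of species; the more uneven the cell, the further hill_2 and hill_inf fall below hill_1.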

Index file

You also have a tabular file that summarises all the indicators.

Summary tabular of all the indices.

Figure 9: Tabular

Nb: the column sp is the number of species recorded in each cell. The number of records is given by the column n.

Conclusion

You have learned how to select and download an OBIS dataset for your region of interest, how to prepare the data, and finally how to compute diversity indicators and display them on maps.