Getting data into Galaxy

Published: May 24, 2017
Last Updated: Oct 6, 2022


Many ways to get data into your workspace

  1. Import using Get Data sources e.g. UCSC, SRA
  2. Import from a Galaxy Data Library
  3. Import using Upload File
    • Import from your computer
    • Directly enter text
    • Import from a URL
    • Import using FTP
    • Import directly into Collection
    • Import using Rule Builder


Best method depends on where the data is, and how big it is

Flowchart for getting data into Galaxy: SRA datasets should use the upload tool; if you have many or big datasets, use FTP; if they are from the web, use the URL upload.

.footnote[Source: Galaxy Community Hub]
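The flowchart's decision can be sketched as a small function. This is only an illustration: the slide gives no numbers, so the thresholds for "many" and "big" below are hypothetical, as are the `DataSource` and `suggest_import_method` names. It also assumes the many-or-big check applies only to files on your own computer.

```python
from dataclasses import dataclass

# Hypothetical cutoffs, chosen only for illustration; the slides give no numbers.
BIG_FILE_BYTES = 2 * 1024**3  # assumed "big" threshold (2 GiB)
MANY_FILES = 20               # assumed "many" threshold


@dataclass
class DataSource:
    location: str           # "sra", "web", or "local"
    n_files: int = 1
    largest_bytes: int = 0  # size of the largest file, in bytes


def suggest_import_method(src: DataSource) -> str:
    """Return the import route suggested by the flowchart description."""
    if src.location == "sra":
        return "upload tool"
    if src.location == "web":
        return "URL upload"
    # Assumption: the many-or-big rule is only relevant for local files.
    if src.n_files >= MANY_FILES or src.largest_bytes >= BIG_FILE_BYTES:
        return "FTP"
    return "upload from your computer"


if __name__ == "__main__":
    print(suggest_import_method(DataSource("local", n_files=50)))
```

Adjust the two thresholds to whatever your Galaxy server's administrators recommend.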


1. The Get Data toolbox section


Click on Get Data to expand it


.pull-right[.image-75[A typical list of data sources]]

Speaker Notes

This shows the list of data sources that were available on usegalaxy.org in mid 2017.


Two large data sources you can access through Galaxy are UCSC and SRA

.pull-right[Screenshot of toolbox with ucsc entered in search] .pull-left[galaxy toolbox with sra entered in search box.]


2. Import from Shared Data Library


.pull-right[galaxy top menu dropdown shared data, showing Data Libraries]


Select the files you want and send them to your history as datasets or as a collection

.image-75[data library screenshot with a number of datasets selected and export to history menu open]


3. Upload from your computer


.image-50[click on upload button]

Upload file form

Speaker Notes

This is probably the most commonly used tool for bringing data into Galaxy, and it is installed on almost every Galaxy server.


Choose files

Options for importing files from your laptop


Set Metadata

upload dialog from galaxy with a number of files queued.


Set datatype for all imported datasets


Manually set datatype for one dataset
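The two options above (one datatype for every queued file, or a manual choice for a single file) amount to a default plus per-file overrides. A minimal sketch of that idea follows; `plan_datatypes` is a hypothetical helper, not part of Galaxy, and the datatype names are only examples.

```python
def plan_datatypes(filenames, for_all="auto", per_file=None):
    """Map each queued file to a datatype.

    for_all  -- datatype applied to every file ("auto" stands for auto-detection)
    per_file -- optional {filename: datatype} overrides for single files
    """
    per_file = per_file or {}
    return {name: per_file.get(name, for_all) for name in filenames}


if __name__ == "__main__":
    queue = ["sample1.fastq", "sample2.fastq", "regions.txt"]
    # Set one datatype for all files, then override a single file by hand.
    plan = plan_datatypes(queue, for_all="fastqsanger",
                          per_file={"regions.txt": "tabular"})
    for name, datatype in plan.items():
        print(name, "->", datatype)
```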


Start upload process:

Ready to upload files. Click on start


You can then close the form

Ready to upload files. Click on start


All the items will appear in your history

Files are loaded into your current history.

They are ready to use when they turn green.


Directly enter text


Select Paste/Fetch data


Enter the data by typing (or pasting) it in the input box:

Select Paste/Fetch data

You can also set the datatype and genome build. Click Start, and then Close, and the new item shows up as Pasted Entry in your history.


Import using URL


The data might already be available on a web server somewhere. To avoid downloading data to your computer and uploading to Galaxy in two steps, you can instruct Galaxy to directly fetch the data from a given URL.

Select Paste/Fetch data


Enter the URLs (one per line) into the input box:

Select Paste/Fetch data

Click Start, and then Close, and the new items show up in your history with the URL as their name.
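If you have a long list of links, it can help to clean it up before pasting. The sketch below builds the one-URL-per-line text for the input box, dropping blanks and duplicates. The `build_url_box` name is hypothetical, and the set of accepted schemes is an assumption; check what your Galaxy server will actually fetch.

```python
from urllib.parse import urlparse

# Assumption: the schemes the server is willing to fetch from.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def build_url_box(urls):
    """Return text for the input box: one valid, de-duplicated URL per line."""
    seen = set()
    lines = []
    for raw in urls:
        url = raw.strip()
        if not url or url in seen:
            continue  # skip blank lines and repeats
        parts = urlparse(url)
        if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
            raise ValueError(f"not a fetchable URL: {url!r}")
        seen.add(url)
        lines.append(url)
    return "\n".join(lines)


if __name__ == "__main__":
    # example.org addresses are placeholders, not real datasets.
    print(build_url_box([
        "https://example.org/reads_1.fastq",
        "https://example.org/reads_2.fastq",
    ]))
```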


Import using FTP



Make sure you have an FTP client installed

.pull-right[.image-25[FileZilla]]


Establish FTP connection to your Galaxy server

FTP Connection Params


Successful connection

Successfully connected


Navigate to the files you want to transfer

Right click on the files and upload them.


FTP transfer in progress

FTP Transfer in progress…


FTP transfer complete

… and transfer complete.
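The same transfer can be scripted instead of using a graphical client. The following is a minimal sketch using Python's standard `ftplib`. It assumes the server accepts FTP over TLS, which is not stated in these slides; pass `ftp_factory=ftplib.FTP` for plain FTP. The host name and credentials in the usage example are placeholders, and `ftp_upload` is a hypothetical helper name.

```python
from ftplib import FTP_TLS
from pathlib import Path


def ftp_upload(host, user, password, paths, ftp_factory=FTP_TLS):
    """Upload each local file to the top level of the FTP account.

    Returns the list of remote file names that were stored.
    """
    uploaded = []
    # The factory connects on construction; the context manager closes the session.
    with ftp_factory(host) as ftp:
        ftp.login(user, password)
        if hasattr(ftp, "prot_p"):
            ftp.prot_p()  # encrypt the data channel when using FTP over TLS
        for path in map(Path, paths):
            with path.open("rb") as handle:
                ftp.storbinary(f"STOR {path.name}", handle)
            uploaded.append(path.name)
    return uploaded


if __name__ == "__main__":
    # Placeholder values: use the connection settings shown by your Galaxy server.
    ftp_upload("ftp.galaxy.example.org", "you@example.org", "your-password",
               ["reads_1.fastq.gz", "reads_2.fastq.gz"])
```

Files transferred this way do not appear in your history directly; they still have to be picked up from the upload form, as the next slides show.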


Where did my files go?

choose FTP files


choose FTP files

Speaker Notes

As you can see, this dialog also shows the FTP connection settings.


Import directly into Collection


Direct collection Start


Direct collection Build


Direct collection Name


Direct collection History


Import using Rule Based uploader


Rule Uploader


Import using Rule Based uploader

Learn how to use it in the dedicated Rule Based Uploader tutorial.


Thank you!

This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors! Galaxy Training Network Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.