{
  "metadata": {
  },
  "nbformat": 4,
  "nbformat_minor": 5,
  "cells": [
    {
      "id": "metadata",
      "cell_type": "markdown",
      "source": "<div style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em; padding: 0.5em;\">\n\n# Python - Lists &amp; Strings &amp; Dictionaries\n\nby [The Carpentries](https://training.galaxyproject.org/hall-of-fame/carpentries/), [Helena Rasche](https://training.galaxyproject.org/hall-of-fame/hexylena/), [Donny Vrins](https://training.galaxyproject.org/hall-of-fame/dirowa/), [Bazante Sanders](https://training.galaxyproject.org/hall-of-fame/bazante1/)\n\nCC-BY licensed content from the [Galaxy Training Network](https://training.galaxyproject.org/)\n\n**Objectives**\n\n- How can I store multiple values?\n\n**Objectives**\n\n- Explain why programs need collections of values.\n- Write programs that create flat lists, index them, slice them, and modify them through assignment and method calls.\n\n**Time Estimation: 1H**\n</div>\n",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-0",
      "source": "<blockquote class=\"agenda\" style=\"border: 2px solid #86D486;display: none; margin: 1em 0.2em\">\n<div class=\"box-title agenda-title\" id=\"agenda\">Agenda</div>\n<p>In this tutorial, we will cover:</p>\n<ol id=\"markdown-toc\">\n<li><a href=\"#lists\" id=\"markdown-toc-lists\">Lists</a></li>\n</ol>\n</blockquote>\n<h1 id=\"lists\">Lists</h1>\n<p>Doing calculations with a hundred variables called <code style=\"color: inherit\">pressure_001</code>, <code style=\"color: inherit\">pressure_002</code>, etc. would be at least as slow as doing them by hand. Using a <em>list</em> to store many values together solves that problems. Lists are surrounded by square brackets: <code style=\"color: inherit\">[</code>, <code style=\"color: inherit\">]</code>, with values separated by commas:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-1",
      "source": [
        "pressures = [0.273, 0.275, 0.277, 0.275, 0.276]\n",
        "print(f'pressures: {pressures}')\n",
        "print(f'length: {len(pressures)}')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-2",
      "source": "<h2 id=\"indexing\">Indexing</h2>\n<p>You can use an item’s index to fetch it from a list.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-3",
      "source": [
        "print(f'zeroth item of pressures: {pressures[0]}')\n",
        "print(f'fourth item of pressures: {pressures[4]}')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-4",
      "source": "<h2 id=\"replacement\">Replacement</h2>\n<p>Lists’ values can be changed or replaced by assigning a new value to the position in the list.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-5",
      "source": [
        "pressures[0] = 0.265\n",
        "print(f'pressures is now: {pressures}')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-6",
      "source": "<p>Note how the first item has changed from <code style=\"color: inherit\">0.273</code></p>\n<h2 id=\"appending\">Appending</h2>\n<p>Appending items to a list lengthens it. You can do <code style=\"color: inherit\">list_name.append()</code> to add items to the end of a list.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-7",
      "source": [
        "primes = [2, 3, 5]\n",
        "print(f'primes is initially: {primes}')\n",
        "primes.append(7)\n",
        "print(f'primes has become: {primes}')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-8",
      "source": "<p><code class=\"language-plaintext highlighter-rouge\">.append()</code> is a <em>method</em> of lists. It’s like a function, but tied to a particular object. You use <code style=\"color: inherit\">object_name.method_name</code> to call methods, which deliberately resembles the way we refer to things in a library.</p>\n<p>We will meet other methods of lists as we go along. Use <code style=\"color: inherit\">help(list)</code> for a preview. <code style=\"color: inherit\">extend</code> is similar to <code style=\"color: inherit\">append</code>, but it allows you to combine two lists.  For example:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-9",
      "source": [
        "teen_primes = [11, 13, 17, 19]\n",
        "middle_aged_primes = [37, 41, 43, 47]\n",
        "print(f'primes is currently: {primes}')\n",
        "primes.extend(teen_primes)\n",
        "print(f'primes has now become: {primes}')\n",
        "primes.append(middle_aged_primes)\n",
        "print(f'primes has finally become: {primes}')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-10",
      "source": "<p>Note that while <code style=\"color: inherit\">extend</code> maintains the “flat” structure of the list, appending a list to a list makes the result two-dimensional - the last element in <code style=\"color: inherit\">primes</code> is a list, not an integer.</p>\n<p>This starts to become a more complicated data structure, and we’ll use more of these later. A list containing both integers and a list can be called a “hetereogenous” list, since it has multiple different data types. This is relatively uncommon, most of the lists you’ll encounter will have a single data type inside of them. Sometimes you’ll see a list of lists, which can be used to store positions, like a chessboard.</p>\n<h2 id=\"list-indices\">List Indices</h2>\n<p>In computer science and programming we number the positions within a list starting from <code style=\"color: inherit\">0</code>, rather than from <code style=\"color: inherit\">1</code>.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-11",
      "source": [
        "# Position  0         1          2            3           4\n",
        "weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']\n",
        "print(weekdays[0])\n",
        "print(weekdays[4])\n",
        "print(weekdays[3])"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-12",
      "source": "<p>But if you try an access a position that is outside of the list, you’ll get an error</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-13",
      "source": [
        "print(weekdays[9])"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-14",
      "source": "<p>returns a <code style=\"color: inherit\">IndexError: list index out of range</code>.</p>\n<blockquote class=\"tip\" style=\"border: 2px solid #FFE19E; margin: 1em 0.2em\">\n<div class=\"box-title tip-title\" id=\"tip-reading-error-messages\"><button class=\"gtn-boxify-button tip\" type=\"button\" aria-controls=\"tip-reading-error-messages\" aria-expanded=\"true\"><i class=\"far fa-lightbulb\" aria-hidden=\"true\" ></i> <span>Tip: Reading Error Messages</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<p>So how do you read this?</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">1 | ---------------------------------------------------------------------------\n2 | IndexError                                Traceback (most recent call last)\n3 | /tmp/ipykernel_648319/137030145.py in &lt;module&gt;\n4 | ----&gt; 1 print(weekdays[9])\n5 |\n6 | IndexError: list index out of range\n</code></pre></div>  </div>\n<ol>\n<li>This is just a line of <code style=\"color: inherit\">-</code>s as a separator</li>\n<li><code style=\"color: inherit\">IndexError</code>, here Jupyter/CoCalc/etc are trying to be helpful and highlight the error for us. This is the important bit of information!</li>\n<li>This is the path to where the code is, Jupyter/CoCalc/etc create temporary files to execute your code.</li>\n<li>Here an arrow points to the line number where something has broken. 1 shows that it’s the first line within the cell, and it points to the print statement. Really it’s pointing at the <code style=\"color: inherit\">weekdays[9]</code> within the print statement.</li>\n<li>Blank</li>\n<li>This is where we normally look for the <strong>most important part of the Traceback</strong>. The error message. 
An <code style=\"color: inherit\">IndexError</code>, namely that the list index (9) is out of the range of possible values (the length of the list.)</li>\n</ol>\n</blockquote>\n<p>However, sometimes you want to access the very end of a list! You can either start at the beginning and count along to find the last item or second to last item, or you can use Negative Indices</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-15",
      "source": [
        "# Position  0         1          2            3           4\n",
        "weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']\n",
        "# Position  -5        -4         -3           -2          -1\n",
        "\n",
        "print(weekdays[-1])\n",
        "print(weekdays[-2])\n",
        "print(weekdays[-4])"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-16",
      "source": "<p>If you wanted to find the last value in a list, you could also use <code style=\"color: inherit\">len(elements)</code> and then subtract back to find the index you want</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-17",
      "source": [
        "elements[len(elements)-1]"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-18",
      "source": "<p>This is essentially how negative indexes work, except you don’t have to use <code style=\"color: inherit\">len(elements)</code>, that’s done for you automatically.</p>\n<h2 id=\"removing-items\">Removing Items.</h2>\n<p>You can use <code style=\"color: inherit\">del</code> to remove items from a list entirely. We use <code style=\"color: inherit\">del list_name[index]</code> to remove an element from a list (in the example, 9 is not a prime number) and thus shorten it. <code style=\"color: inherit\">del</code> is not a function or a method, but a statement in the language.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-19",
      "source": [
        "primes = [2, 3, 5, 7, 9]\n",
        "print(f'primes before removing last item: {primes}')\n",
        "del primes[4]\n",
        "print(f'primes after removing last item: {primes}')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-20",
      "source": "<h2 id=\"empty-lists\">Empty Lists</h2>\n<p>The empty list contains no values. When you want to make a new list, use <code style=\"color: inherit\">[]</code> on its own to represent a list that doesn’t contain any values. This is helpful as a starting point for collecting values, which we’ll see soon.</p>\n<h2 id=\"heterogeneous-lists\">Heterogeneous Lists</h2>\n<p>Lists may contain values of different types. A single list may contain numbers, strings, and anything else.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-21",
      "source": [
        "goals = [1, 'Create lists.', 2, 'Extract items from lists.', 3, 'Modify lists.']"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-22",
      "source": "<h2 id=\"strings-are-like-lists\">Strings are like Lists</h2>\n<p>Text is often called a “string” in the programming world. Strings of text like <code style=\"color: inherit\">name = \"Helena\"</code> or <code style=\"color: inherit\">patient_id = \"19237zud830\"</code> are very similar conceptually to lists. Except instead of being a list of numbers, they’re a lists of characters.</p>\n<p>In a number of older programming languages, strings are indeed arrays of numbers internally. However python hides a lot of that complexity from us, so we can just work with text.</p>\n<p>Still, many of the operations you use on lists, can also be used on strings as well! Strings can be indexed like lists so you can get single elements from lists.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-23",
      "source": [
        "element = 'carbon'\n",
        "print(f'zeroth character: {element[0]}')\n",
        "print(f'third character: {element[3]}')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-24",
      "source": "<p>Strings, however, cannot be modified, you can’t change a single letter in a string. Things that cannot be modified after creation are called <em>immutable</em> or sometimes <em>frozen</em>, compared to things which can be modified which are called <em>mutable</em>.\nPython considers the string to be a single value with parts, not a collection of values.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-25",
      "source": [
        "element[0] = 'C'"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-26",
      "source": "<h2 id=\"bounds\">Bounds</h2>\n<p>You cannot access values beyond the end of the list, this will result in an error. Python reports an <code style=\"color: inherit\">IndexError</code> if we attempt to access a value that doesn’t exist. This is a kind of <strong>runtime error</strong>, as it cannot be detected as the code is parsed. Imagine if you had a script which let you read in a file, depending on how many lines were in the file, whether index 90 was valid or invalid, would depend on how big your file was.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-27",
      "source": [
        "print(f'99th element of element is: {element[99]}')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-28",
      "source": "<h2 id=\"exercises\">Exercises</h2>\n<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<div class=\"box-title question-title\" id=\"question-checking-suffixes\"><i class=\"far fa-question-circle\" aria-hidden=\"true\" ></i> Question: Checking suffixes</div>\n<ol>\n<li>How could you check that the extension of a filename is <code style=\"color: inherit\">.csv</code></li>\n<li>Can you find another way? Maybe check the help page for <code style=\"color: inherit\">str</code></li>\n</ol>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<div class=\"box-title solution-title\" id=\"solution\"><button class=\"gtn-boxify-button solution\" type=\"button\" aria-controls=\"solution\" aria-expanded=\"true\"><i class=\"far fa-eye\" aria-hidden=\"true\" ></i> <span>Solution</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<ol>\n<li><code style=\"color: inherit\">a[-4:] == \"csv\"</code> (Here we use <code style=\"color: inherit\">==</code> for comparing two values)</li>\n<li><code style=\"color: inherit\">a.endswith('.csv')</code></li>\n</ol>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-29",
      "source": [
        "# Test code here!\n",
        "a = \"1234.csv\"\n",
        "b = \"1273.tsv\"\n",
        "c = \"9382.csv\"\n",
        "d = \"1239.csv\""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-30",
      "source": "<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<div class=\"box-title question-title\" id=\"question-say-it-loud\"><i class=\"far fa-question-circle\" aria-hidden=\"true\" ></i> Question: Say it loud!</div>\n<ol>\n<li>Can you find a method in the <code style=\"color: inherit\">str</code>’s help that converts the string to upper case</li>\n<li>or lower case?</li>\n<li>Can you use it to fix mixed case DNA sequence?</li>\n</ol>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<div class=\"box-title solution-title\" id=\"solution-1\"><button class=\"gtn-boxify-button solution\" type=\"button\" aria-controls=\"solution-1\" aria-expanded=\"true\"><i class=\"far fa-eye\" aria-hidden=\"true\" ></i> <span>Solution</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<ol>\n<li><code style=\"color: inherit\">\"shout it out\".upper()</code></li>\n<li><code style=\"color: inherit\">\"WHISPER THIS\".lower()</code></li>\n<li><code style=\"color: inherit\">terrible_sequence.upper()</code></li>\n</ol>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-31",
      "source": [
        "# Test answers here!\n",
        "print(\"shout it out\")\n",
        "print(\"WHISPER THIS\")\n",
        "# Fix this mess to be all capital\n",
        "terrible_sequence = \"AcTGAGccGGTt\"\n",
        "print(terrible_sequence)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-32",
      "source": "<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<div class=\"box-title question-title\" id=\"question-splitting\"><i class=\"far fa-question-circle\" aria-hidden=\"true\" ></i> Question: Splitting</div>\n<ol>\n<li>We use <code style=\"color: inherit\">.split()</code> to split a string by some character. Here we have a comma separated list of values, try splitting that up by a comma, but we actually wanted it separated by <code style=\"color: inherit\">|</code> characters. Can you split it up, and then re-join it with that new character?</li>\n<li>Does <code style=\"color: inherit\">help(str)</code> give you another option for replacing a character like that.</li>\n<li>What happens if you split by another value like <code style=\"color: inherit\">3</code>?</li>\n</ol>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<div class=\"box-title solution-title\" id=\"solution-2\"><button class=\"gtn-boxify-button solution\" type=\"button\" aria-controls=\"solution-2\" aria-expanded=\"true\"><i class=\"far fa-eye\" aria-hidden=\"true\" ></i> <span>Solution</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<ol>\n<li><code style=\"color: inherit\">data.split(\",\")</code></li>\n<li><code style=\"color: inherit\">data.replace(\",\", \"|\")</code></li>\n<li>Those characters will disappear! If you want to reconstruct the same string</li>\n</ol>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-33",
      "source": [
        "# Split me\n",
        "data = \"0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0\""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-34",
      "source": "<h1 id=\"slicing--dicing\">Slicing &amp; Dicing</h1>\n<p>All of the data types we’ve talked about today can be sliced, and this will be a key part of using lists.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-35",
      "source": [
        "elements = ['H', 'He', 'Li', 'Be', 'B', 'C', 'N', 'O', 'F']\n",
        "# Instead of accessing a single element\n",
        "print(elements[0])\n",
        "# We'll access a range\n",
        "print(elements[0:4])"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-36",
      "source": "<p>Accessing only a portion of a list is commonly used, say if you have a list of FastQ files from paired end sequencing, perhaps you want two of them at a time. You could access those with <code style=\"color: inherit\">[0:2]</code>.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-37",
      "source": [
        "# You don't need to start at 0\n",
        "print(elements[6:8])"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-38",
      "source": "\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-39",
      "source": [
        "# But your end should be bigger than your start.\n",
        "# What do you think this will return?\n",
        "# Make a guess before you run it\n",
        "print(elements[6:5])"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-40",
      "source": "<p>If you don’t supply an end value, Python will default to going to the end of the list. Likewise, if you don’t provide a start value, Python will use <code style=\"color: inherit\">0</code> as the start by default, until whatever end value you provide.</p>\n<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<div class=\"box-title question-title\" id=\"question-valid-and-invalid-slices\"><i class=\"far fa-question-circle\" aria-hidden=\"true\" ></i> Question: Valid and Invalid Slices</div>\n<p>Which of these do you think will be valid? Which are invalid? Predict what they will return:</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\"># 1\nelements = ['H', 'He', 'Li', 'Be', 'B', 'C', 'N', 'O', 'F']\n# 2\nelements[0:3]\n# 3\nelements[:3]\n# 4\nelements[-3:3]\n# 5\nelements[-8:-3]\n# 6\nelements[:]\n# 7\nelements[0:20]\n# 8\nelements['H':'Li']\n# 9\nelements[1.5:]\n</code></pre></div>  </div>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<div class=\"box-title solution-title\" id=\"solution-3\"><button class=\"gtn-boxify-button solution\" type=\"button\" aria-controls=\"solution-3\" aria-expanded=\"true\"><i class=\"far fa-eye\" aria-hidden=\"true\" ></i> <span>Solution</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<p>All of these are valid except the last two.</p>\n<ol>\n<li>If you dont’ fill in a position, Python will use the default. 0 for the left hand side of the <code style=\"color: inherit\">:</code>, and <code style=\"color: inherit\">len(elements)</code> for the right hand side.</li>\n<li>You can request a slice longer than your list (e.g. 
up to 20), but Python may not give you that many items back.</li>\n<li>List slicing can only be done with integers, not floats.</li>\n</ol>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-41",
      "source": [
        "# Check your answers here!"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-42",
      "source": "<h2 id=\"stride\">Stride</h2>\n<p>However, list slicing can be more complicated. You can additionally use a ‘stride’ parameter, which is how Python should strep through the list. To take every other element from a list:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-43",
      "source": [
        "values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]\n",
        "print(values[0:12:2]) # every other value\n",
        "print(values[1:12:2]) # every other value from the second value\n",
        "print(values[::2]) # the start and end are optional\n",
        "print(values[::3]) # every third value in the list."
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-44",
      "source": "<p>So list slicing together is either <code style=\"color: inherit\">list[low:high]</code> or <code style=\"color: inherit\">list[low:high:stride]</code>, where low and high are optional if you just want to go to the end of the list.</p>\n<h2 id=\"sorting\">Sorting</h2>\n<p>Lists occasionally need to be sorted. For example, you have a list of students you might want to alphabetise, and here you can use the function <code style=\"color: inherit\">sorted</code> to help you.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-45",
      "source": [
        "students = [\n",
        "    'Koos Christabella',\n",
        "    'Zackary Habiba',\n",
        "    'Jumana Rostam',\n",
        "    'Sorina Gaia',\n",
        "    'Kalyani Bessarion',\n",
        "    'Enéas Nirmala',\n",
        "    '王奕辰',\n",
        "    '刘依诺',\n",
        "]\n",
        "students = sorted(students)\n",
        "print(students)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-46",
      "source": "<blockquote class=\"tip\" style=\"border: 2px solid #FFE19E; margin: 1em 0.2em\">\n<div class=\"box-title tip-title\" id=\"tip-sorting-names-is-hard\"><button class=\"gtn-boxify-button tip\" type=\"button\" aria-controls=\"tip-sorting-names-is-hard\" aria-expanded=\"true\"><i class=\"far fa-lightbulb\" aria-hidden=\"true\" ></i> <span>Tip: Sorting names is hard!</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<p>Some people have 1 name, some have 4 or more! Some cultures have surnames first, some not. Sorting names is a complex situation, so be sure you consider your data before sorting and assuming it’s correct. Test with multiple values to make sure your code works!</p>\n</blockquote>\n<blockquote class=\"tip\" style=\"border: 2px solid #FFE19E; margin: 1em 0.2em\">\n<div class=\"box-title tip-title\" id=\"tip-results-can-be-dependent-on-analysis-order\"><button class=\"gtn-boxify-button tip\" type=\"button\" aria-controls=\"tip-results-can-be-dependent-on-analysis-order\" aria-expanded=\"true\"><i class=\"far fa-lightbulb\" aria-hidden=\"true\" ></i> <span>Tip: Results can be dependent on analysis order!</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<p>Some analyses (especially simulations) can be dependent on data input order or data sorting. This was recently seen in {% cite Bhandari_Neupane_2019 %} where the data files used were sorted one way on Windows, and another on Linux, resulting in different results for the same code and the same datasets! Yikes!</p>\n<p>If you know your analyses are dependent on file ordering, then you can use <code style=\"color: inherit\">sorted()</code> to make sure the data is provided in a uniform way every time.</p>\n<p>If you’re not sure whether your results depend on ordering, you can try sorting anyway. Or better yet, try randomising the list of inputs to make sure your code behaves properly in any scenario.</p>\n</blockquote>\n<h2 id=\"type-conversion\">Type Conversion</h2>\n<p>Just like converting <code style=\"color: inherit\">\"1.5\"</code> to a float with the <code style=\"color: inherit\">float()</code> function, or <code style=\"color: inherit\">3.1</code> to a string with <code style=\"color: inherit\">str()</code>, we can convert values to lists with the <code style=\"color: inherit\">list()</code> function, and to sets with <code style=\"color: inherit\">set()</code>:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-47",
      "source": [
        "# Convert text to a list\n",
        "print(list(\"sometext\"))"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-48",
      "source": "<p>Converting a list back into text is likewise possible, but you need to use the special function <code style=\"color: inherit\">join</code>. <code style=\"color: inherit\">join</code> is a method of <code style=\"color: inherit\">str</code> that accepts a list of strings:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-49",
      "source": [
        "word = ['c', 'a', 'f', 'e']\n",
        "print(\"-\".join(word))"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-50",
      "source": "<p>It takes the string you called it on and uses it as a separator, placing it between the items of the list you provide.</p>\n<h1 id=\"exercise-time\">Exercise Time</h1>\n<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<div class=\"box-title question-title\" id=\"question-fill-in-the-blanks\"><i class=\"far fa-question-circle\" aria-hidden=\"true\" ></i> Question: Fill in the Blanks</div>\n<p>Fill in the blanks so that the program below produces the output shown.</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">values = ____\nvalues.____(1)\nvalues.____(3)\nvalues.____(5)\nprint(f'first time: {values}')\nvalues = values[____]\nprint(f'second time: {values}')\n</code></pre></div>  </div>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">first time: [1, 3, 5]\nsecond time: [3, 5]\n</code></pre></div>  </div>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<div class=\"box-title solution-title\" id=\"solution-4\"><button class=\"gtn-boxify-button solution\" type=\"button\" aria-controls=\"solution-4\" aria-expanded=\"true\"><i class=\"far fa-eye\" aria-hidden=\"true\" ></i> <span>Solution</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">values = []\nvalues.append(1)\nvalues.append(3)\nvalues.append(5)\nprint(f'first time: {values}')\nvalues = values[1:]\nprint(f'second time: {values}')\n</code></pre></div>    </div>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-51",
      "source": [
        "# Fill in the blanks here!\n",
        "\n",
        "values = ____\n",
        "values.____(1)\n",
        "values.____(3)\n",
        "values.____(5)\n",
        "print(f'first time: {values}') # Should print [1, 3, 5]\n",
        "values = values[____]\n",
        "print(f'second time: {values}') # should print [3, 5]"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-52",
      "source": "<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<h2 id=\"how-large-is-a-slice\">How Large is a Slice?</h2>\n<p>If <code style=\"color: inherit\">start</code> and <code style=\"color: inherit\">stop</code> are both non-negative integers,\nhow long is the list <code style=\"color: inherit\">values[start:stop]</code>?</p>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<h2 id=\"solution\">Solution</h2>\n<p>The list <code style=\"color: inherit\">values[start:stop]</code> has up to <code style=\"color: inherit\">stop - start</code> elements.  For example,\n<code style=\"color: inherit\">values[1:4]</code> has the 3 elements <code style=\"color: inherit\">values[1]</code>, <code style=\"color: inherit\">values[2]</code>, and <code style=\"color: inherit\">values[3]</code>.\nWhy ‘up to’?\nIf <code style=\"color: inherit\">stop</code> is greater than the total length of the list <code style=\"color: inherit\">values</code>,\nwe will still get a list back but it will be shorter than expected.</p>\n</details>\n</blockquote>\n<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<h2 id=\"from-strings-to-lists-and-back\">From Strings to Lists and Back</h2>\n<p>Given this:</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">print(f\"string to list: {list('tin')}\")\nprint(f\"list to string: {''.join(['g', 'o', 'l', 'd'])}\")\n</code></pre></div>  </div>\n<ol>\n<li>What does <code style=\"color: inherit\">list('some string')</code> do?</li>\n<li>What does <code style=\"color: inherit\">'-'.join(['x', 'y', 'z'])</code> generate?</li>\n</ol>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<h2 id=\"solution-1\">Solution</h2>\n<ol>\n<li><a href=\"https://docs.python.org/3/library/stdtypes.html#list\"><code class=\"language-plaintext highlighter-rouge\">list('some string')</code></a> converts a string into a list containing all of its characters.</li>\n<li><a href=\"https://docs.python.org/3/library/stdtypes.html#str.join\"><code class=\"language-plaintext highlighter-rouge\">join</code></a> returns a string that is the <em>concatenation</em>\nof the string elements in the list, with the separator placed between each pair of elements. This results in\n<code style=\"color: inherit\">x-y-z</code>. The separator is the string on which <code style=\"color: inherit\">join</code> is called.</li>\n</ol>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-53",
      "source": [
        "# Test code here"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-54",
      "source": "<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<h2 id=\"working-with-the-end\">Working With the End</h2>\n<p>What does the following program print?</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">element = 'helium'\nprint(element[-1])\n</code></pre></div>  </div>\n<ol>\n<li>How does Python interpret a negative index?</li>\n<li>If a list or string has N elements,\nwhat is the most negative index that can safely be used with it,\nand what location does that index represent?</li>\n<li>If <code style=\"color: inherit\">values</code> is a list, what does <code style=\"color: inherit\">del values[-1]</code> do?</li>\n<li>How can you display all elements but the last one without changing <code style=\"color: inherit\">values</code>?\n(Hint: you will need to combine slicing and negative indexing.)</li>\n</ol>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<h2 id=\"solution\">Solution</h2>\n<p>The program prints <code style=\"color: inherit\">m</code>.</p>\n<ol>\n<li>Python interprets a negative index as counting from the end (as opposed to\ncounting from the beginning).  The last element is <code style=\"color: inherit\">-1</code>.</li>\n<li>The most negative index that can safely be used with a list of N elements is\n<code style=\"color: inherit\">-N</code>, which represents the first element.</li>\n<li><code style=\"color: inherit\">del values[-1]</code> removes the last element from the list.</li>\n<li><code style=\"color: inherit\">values[:-1]</code></li>\n</ol>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-55",
      "source": [
        "# Test code here"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-56",
      "source": "<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<h2 id=\"stepping-through-a-list\">Stepping Through a List</h2>\n<p>What does the following program print?</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">element = 'fluorine'\nprint(element[::2])\nprint(element[::-1])\n</code></pre></div>  </div>\n<ol>\n<li>If we write a slice as <code style=\"color: inherit\">low:high:stride</code>, what does <code style=\"color: inherit\">stride</code> do?</li>\n<li>What expression would select all of the even-numbered items from a collection?</li>\n</ol>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<h2 id=\"solution\">Solution</h2>\n<p>The program prints</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">furn\neniroulf\n</code></pre></div>    </div>\n<ol>\n<li><code style=\"color: inherit\">stride</code> is the step size of the slice.</li>\n<li>The slice <code style=\"color: inherit\">1::2</code> selects all even-numbered items from a collection: it starts\nwith element <code style=\"color: inherit\">1</code> (which is the second element, since indexing starts at <code style=\"color: inherit\">0</code>),\ngoes on until the end (since no <code style=\"color: inherit\">end</code> is given), and uses a step size of <code style=\"color: inherit\">2</code>\n(i.e., selects every second element).</li>\n</ol>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-57",
      "source": [
        "# Test code here"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-58",
      "source": "<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<h2 id=\"slice-bounds\">Slice Bounds</h2>\n<p>What does the following program print?</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">element = 'lithium'\nprint(element[0:20])\nprint(element[-1:3])\n</code></pre></div>  </div>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<h2 id=\"solution\">Solution</h2>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">lithium\n</code></pre></div>    </div>\n<p>The first statement prints the whole string, since a slice that extends beyond the end of the string is simply cut off at the end.\nThe second statement prints an empty string, because the slice starts at index <code style=\"color: inherit\">-1</code> (the last character, position 6), which lies after the stop position <code style=\"color: inherit\">3</code>, so no characters are selected.</p>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-59",
      "source": [
        "# Test code here"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-60",
      "source": "<h1 id=\"dictionaries\">Dictionaries</h1>\n<p>When you think of a dictionary, think of a real-life dictionary: it maps some key to a value, like a term to its definition:</p>\n<table>\n<thead>\n<tr>\n<th>Key</th>\n<th>Value</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td><code style=\"color: inherit\">Eichhörnchen</code></td>\n<td>Squirrel</td>\n</tr>\n<tr>\n<td><code style=\"color: inherit\">火锅</code></td>\n<td>Hot Pot</td>\n</tr>\n</tbody>\n</table>\n<p>Or a country to its population:</p>\n<table>\n<thead>\n<tr>\n<th>Key</th>\n<th>Value</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>South Sudan</td>\n<td>492,970</td>\n</tr>\n<tr>\n<td>Australia</td>\n<td>411,667</td>\n</tr>\n<tr>\n<td>Guinea</td>\n<td>1,660,973</td>\n</tr>\n<tr>\n<td>Morocco</td>\n<td>573,895</td>\n</tr>\n<tr>\n<td>Maldives</td>\n<td>221,678</td>\n</tr>\n<tr>\n<td>Wallis and Futuna</td>\n<td>1,126</td>\n</tr>\n<tr>\n<td>Eswatini</td>\n<td>94,874</td>\n</tr>\n<tr>\n<td>Namibia</td>\n<td>325,858</td>\n</tr>\n<tr>\n<td>Turkmenistan</td>\n<td>1,031,992</td>\n</tr>\n</tbody>\n</table>\n<p>In Python we create a dictionary with <code style=\"color: inherit\">{}</code> and use <code style=\"color: inherit\">:</code> to separate keys and values. Turning the above table into a Python dictionary, it would look like:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-61",
      "source": [
        "populations = {\n",
        "  \"South Sudan\": 492970,\n",
        "  \"Australia\": 411667,\n",
        "  \"Guinea\": 1660973,\n",
        "  \"Morocco\": 573895,\n",
        "  \"Maldives\": 221678,\n",
        "  \"Wallis and Futuna\": 1126,\n",
        "  \"Eswatini\": 94874,\n",
        "  \"Namibia\": 325858,\n",
        "  \"Turkmenistan\": 1031992,\n",
        "}"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-62",
      "source": "<p>You can see a string (the country name) being used for the key, and then the number (an integer) as the value. (Would a float make sense? Why or why not?)</p>\n<blockquote class=\"tip\" style=\"border: 2px solid #FFE19E; margin: 1em 0.2em\">\n<div class=\"box-title tip-title\" id=\"tip-other-names\"><button class=\"gtn-boxify-button tip\" type=\"button\" aria-controls=\"tip-other-names\" aria-expanded=\"true\"><i class=\"far fa-lightbulb\" aria-hidden=\"true\" ></i> <span>Tip: Other Names</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<p>Dictionaries are also sometimes called associative arrays (because they’re an array or list of values that associate a key to a value) or maps (because they map a key to a value), depending on what you’re reading.</p>\n</blockquote>\n<h2 id=\"methods\">Methods</h2>\n<p>You can access both the keys and the values:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-63",
      "source": [
        "print(populations.keys())\n",
        "print(populations.values())"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-64",
      "source": "<p>These will print out two list-like objects. They will become more useful in the future when we talk about looping over dictionaries and processing all of the values within.</p>\n<h2 id=\"accessing-values\">Accessing Values</h2>\n<p>Just like with lists, where you access a value by its position in the list, with dictionaries you access a value by its key:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-65",
      "source": [
        "print(populations[\"Namibia\"])"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-66",
      "source": "<p>And just like with lists, where using an index outside the range of the list is an error, trying to access a key that isn’t in the dictionary raises an error (a <code style=\"color: inherit\">KeyError</code>):</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-67",
      "source": [
        "print(populations[\"Mars\"])"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-68",
      "source": "<blockquote class=\"tip\" style=\"border: 2px solid #FFE19E; margin: 1em 0.2em\">\n<div class=\"box-title tip-title\" id=\"tip-dictionaries-are-faster-than-lists-for-looking-up-values\"><button class=\"gtn-boxify-button tip\" type=\"button\" aria-controls=\"tip-dictionaries-are-faster-than-lists-for-looking-up-values\" aria-expanded=\"true\"><i class=\"far fa-lightbulb\" aria-hidden=\"true\" ></i> <span>Tip: Dictionaries are faster than lists for looking up values</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<p>Just like in real life, searching a dictionary for a specific term is quite fast. Often a lot faster than searching a list for a specific value.</p>\n<p>For those of you old enough to remember the paper version of a dictionary, you knew that As would be at the start and Zs at the end, and probably Ms around the middle. And if you were looking for a word like “Squirrel”, you’d open the dictionary in the middle, maybe decide it was in the second half of the book, randomly choose a page in the second half, and you could keep deciding if it was “before” or “after” the current page, never even bothering to search the first half.</p>\n<p>Conceptually, with a list you can’t guess whether the item is in the first or second half. You need to search item by item, which would be like reading the dictionary page by page until you get to Squirrel.</p>\n</blockquote>\n<h2 id=\"modifying-dictionaries\">Modifying Dictionaries</h2>\n<p>Adding new values to a dictionary is easy; it’s very similar to replacing a value in a list.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-69",
      "source": [
        "# For lists we did\n",
        "x = ['x', 'y', 'z']\n",
        "x[0] = 'a'\n",
        "print(x)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-70",
      "source": "<p>For dictionaries, it’s essentially the same: we access the ‘place’ in the dictionary just like we did with a list, and set it to a value:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-71",
      "source": [
        "populations[\"Mars\"] = 6 # robots\n",
        "print(populations)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-72",
      "source": "<p>And similarly, removing items is the same as it was for lists:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-73",
      "source": [
        "print(x)\n",
        "del x[0] # Removes the first item\n",
        "print(x)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-74",
      "source": "<p>And with dictionaries you delete by specifying which key you want to remove:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-75",
      "source": [
        "del populations['Australia']\n",
        "print(populations)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-76",
      "source": "<h2 id=\"exercises\">Exercises</h2>\n<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<div class=\"box-title question-title\" id=\"question-dna-complement\"><i class=\"far fa-question-circle\" aria-hidden=\"true\" ></i> Question: DNA Complement</div>\n<p>DNA is usually in the form of dsDNA, a paired strand, where A maps to T and C maps to G and vice versa.\nBut when we’re working with DNA sequences in bioinformatics, we often only store one strand, because we can calculate the <em>complement</em> on the fly when we need it.</p>\n<p>Write a dictionary that lets you look up the letters A, C, T, and G and find their complements.</p>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<div class=\"box-title solution-title\" id=\"solution-5\"><button class=\"gtn-boxify-button solution\" type=\"button\" aria-controls=\"solution-5\" aria-expanded=\"true\"><i class=\"far fa-eye\" aria-hidden=\"true\" ></i> <span>Solution</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<p>You need to have the complements of every base. If you just defined ‘A’ and ‘C’, how would you look up the complement when you want to translate a ‘T’ or a ‘G’? It’s not easily possible to look up a key by its value, only to look up a key and find its value.</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">translation = {\n'A': 'T',\n'T': 'A',\n'C': 'G',\n'G': 'C',\n}\n</code></pre></div>    </div>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-77",
      "source": [
        "# Test code here!\n",
        "translation = {\n",
        "\n",
        "}\n",
        "print(translation)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-78",
      "source": "<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<div class=\"box-title question-title\" id=\"question-modifying-an-array\"><i class=\"far fa-question-circle\" aria-hidden=\"true\" ></i> Question: Modifying a dictionary</div>\n<p>Fill in the blanks to make the execution correct:</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">variants = {\n  'B.1.1.7': 26267,\n  'B.1.351': 439,\n}\nvariants[_____] =  _____\nprint(variants) # Should print {'B.1.1.7': 26267, 'B.1.351': 439, 'P.1': 384}\n__________\nprint(variants) # Should print {'B.1.1.7': 26267, 'B.1.351': 439, 'P.1': 384, 'B.1.617.2': 43486}\n# Maybe we've exterminated B.1.1.7 and B.1.351, remove their numbers.\ndel _______\ndel _______\nprint(variants[______]) # Should print 384\nprint(variants[______]) # Should print 43486\n</code></pre></div>  </div>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<div class=\"box-title solution-title\" id=\"solution-6\"><button class=\"gtn-boxify-button solution\" type=\"button\" aria-controls=\"solution-6\" aria-expanded=\"true\"><i class=\"far fa-eye\" aria-hidden=\"true\" ></i> <span>Solution</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">variants = {\n  'B.1.1.7': 26267,\n  'B.1.351': 439,\n}\nvariants['P.1'] = 384\nprint(variants) # Should print {'B.1.1.7': 26267, 'B.1.351': 439, 'P.1': 384}\nvariants['B.1.617.2'] = 43486\nprint(variants) # Should print {'B.1.1.7': 26267, 'B.1.351': 439, 'P.1': 384, 'B.1.617.2': 43486}\n# Maybe we've exterminated B.1.1.7 and B.1.351, remove their numbers.\ndel variants['B.1.1.7']\ndel variants['B.1.351']\nprint(variants['P.1']) # Should print 384\nprint(variants['B.1.617.2']) # Should print 43486\n</code></pre></div>    </div>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-79",
      "source": [
        "# Test code here!\n",
        "variants = {\n",
        "  'B.1.1.7': 26267,\n",
        "  'B.1.351': 439,\n",
        "}\n",
        "variants[_____] =  _____\n",
        "print(variants) # Should print {'B.1.1.7': 26267, 'B.1.351': 439, 'P.1': 384}\n",
        "__________\n",
        "print(variants) # Should print {'B.1.1.7': 26267, 'B.1.351': 439, 'P.1': 384, 'B.1.617.2': 43486}\n",
        "# Maybe we've exterminated B.1.1.7 and B.1.351, remove their numbers.\n",
        "del _______\n",
        "del _______\n",
        "print(variants[______]) # Should print 384\n",
        "print(variants[______]) # Should print 43486"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [

      ],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-80",
      "source": "<h1 id=\"choosing-the-right-data-type\">Choosing the Right Data Type</h1>\n<p>Choosing the correct data type can sometimes require some thought, and even discussion with colleagues. And don’t be afraid to search the internet for how other people have done it!</p>\n<table>\n<thead>\n<tr>\n<th>Data type</th>\n<th>Examples</th>\n<th>When to use it</th>\n<th>When <strong>not</strong> to use it</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>Boolean (<code class=\"language-plaintext highlighter-rouge\">bool</code>)</td>\n<td><code style=\"color: inherit\">True</code>, <code style=\"color: inherit\">False</code></td>\n<td>If there are only two possible states, true or false</td>\n<td>If your data is not binary</td>\n</tr>\n<tr>\n<td>Integer (<code class=\"language-plaintext highlighter-rouge\">int</code>)</td>\n<td>1, 0, -1023, 42</td>\n<td>Countable, singular items. How many patients are there, how many events did you record, how many variants are there in the sequence</td>\n<td>If doubling or halving the value would not make sense: do not use for e.g. patient IDs, or phone numbers. If these are integers you might accidentally do math on the value.</td>\n</tr>\n<tr>\n<td>Float (<code class=\"language-plaintext highlighter-rouge\">float</code>)</td>\n<td>123.49, 3.14159, -3.33334</td>\n<td>If you need more precision or partial values. Recording distance between places, height, mass, etc.</td>\n<td> </td>\n</tr>\n<tr>\n<td>Strings (<code class=\"language-plaintext highlighter-rouge\">str</code>)</td>\n<td>‘patient_12312’, ‘Jane Doe’, ‘火锅’</td>\n<td>To store free text, identifiers, sequence IDs, etc.</td>\n<td>If it’s truly a numeric value you can do calculations with, like adding or subtracting or doing statistics.</td>\n</tr>\n<tr>\n<td>List / Array (<code class=\"language-plaintext highlighter-rouge\">list</code>)</td>\n<td><code style=\"color: inherit\">['A', 1, 3.4, ['Nested']]</code></td>\n<td>If you need to store a list of items, like sequences from a file. Especially if you’re reading in a table of data from a file.</td>\n<td>If you want to retrieve individual values, and there are clear identifiers it might be better as a dict.</td>\n</tr>\n<tr>\n<td>Dictionary / Associative Array / map (<code class=\"language-plaintext highlighter-rouge\">dict</code>)</td>\n<td><code style=\"color: inherit\">{\"weight\": 3.4, \"age\": 12, \"name\": \"Fluffy\"}</code></td>\n<td>When you have identifiers for your data, and want to look them up by that value. E.g. looking up sequences by an identifier, or data about students based on their name. Counting values.</td>\n<td>If you just have a list of items without identifiers, it makes more sense to just use a list.</td>\n</tr>\n</tbody>\n</table>\n<h2 id=\"exercises\">Exercises</h2>\n<blockquote class=\"question\" style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em\">\n<div class=\"box-title question-title\" id=\"question-which-datatype\"><i class=\"far fa-question-circle\" aria-hidden=\"true\" ></i> Question: Which Datatype</div>\n<p>Which data type would you use to store each of the following?</p>\n<ol>\n<li>Chromosome Length</li>\n<li>Name</li>\n<li>Weight</li>\n<li>Sex</li>\n<li>Hair Colour</li>\n<li>Money/Currency</li>\n</ol>\n<br/><details style=\"border: 2px solid #B8C3EA; margin: 1em 0.2em;padding: 0.5em; cursor: pointer;\"><summary>👁 View solution</summary>\n<div class=\"box-title solution-title\" id=\"solution-7\"><button class=\"gtn-boxify-button solution\" type=\"button\" aria-controls=\"solution-7\" aria-expanded=\"true\"><i class=\"far fa-eye\" aria-hidden=\"true\" ></i> <span>Solution</span><span class=\"fold-unfold fa fa-minus-square\"></span></button></div>\n<ol>\n<li>Here you need to use an integer, a fractional or float value would not make sense. You cannot have half an A/C/T/G.</li>\n<li>Here a string would be a good choice. (And probably just a single <code style=\"color: inherit\">name</code> string, rather than a <code style=\"color: inherit\">first</code> and <code style=\"color: inherit\">last</code> name, as not all humans have two names! And some have more than two.)</li>\n<li>An integer is a good type for storing weight, if you are using a small unit (e.g. grams). Otherwise you might consider a float, but you’d need to be careful to format it properly (e.g. <code style=\"color: inherit\">{value:0.2f}</code>) when printing it out. It depends on the application.</li>\n<li>This is a case where you should consider carefully the application, but <code style=\"color: inherit\">bool</code> is usually the <em>wrong answer</em>. Are you recording patient data? Is their expressed gender the correct variable or do you need sex? {% cite Miyagi_2021 %} goes into detail on this multifaceted issue in a medical research context. For example chromosomal sex is also more complicated and cannot be stored with a true/false value, as people with <a href=\"https://en.wikipedia.org/wiki/Klinefelter_syndrome\">Klinefelter syndrome</a> exist. A string can be an ok choice here.</li>\n<li>There is a limited vocabulary humans use to describe hair colour, so a string can be used, or a data type we haven’t discussed! An <code style=\"color: inherit\">enum</code> is an <code style=\"color: inherit\">enumeration</code>, and when you have a limited set of values that are possible, you can use an <code style=\"color: inherit\">enum</code> to double check that whatever value is being used (or read from a file, or entered by a user) matches one of the “approved” values.</li>\n<li>A float is a good guess, but with floats come weird rounding issues. Often people choose to use an integer storing the value in cents (or fractional cents, to whatever the desired precision is).</li>\n</ol>\n</details>\n</blockquote>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "cell_type": "markdown",
      "id": "final-ending-cell",
      "metadata": {
        "editable": false,
        "collapsed": false
      },
      "source": [
        "# Key Points\n\n",
        "- A list stores many values in a single structure.\n",
        "- Use an item's index to fetch it from a list.\n",
        "- Values in a list can be replaced by assigning to them.\n",
        "- Appending items to a list lengthens it.\n",
        "- Use `del` to remove items from a list entirely.\n",
        "- The empty list contains no values.\n",
        "- Lists may contain values of different types.\n",
        "- Character strings can be indexed like lists.\n",
        "- Character strings are immutable.\n",
        "- Indexing beyond the end of the collection is an error.\n",
        "\n# Congratulations on successfully completing this tutorial!\n\n",
        "Please [fill out the feedback on the GTN website](https://training.galaxyproject.org/training-material/topics/data-science/tutorials/python-iterables/tutorial.html#feedback) and check there for further resources!\n"
      ]
    }
  ]
}