{
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 5,
  "cells": [
    {
      "id": "metadata",
      "cell_type": "markdown",
      "source": "<div style=\"border: 2px solid #8A9AD0; margin: 1em 0.2em; padding: 0.5em;\">\n\n# Python - Warm-up for statistics and machine learning\n\nby [Wandrille Duchemin](https://training.galaxyproject.org/hall-of-fame/wandrilled/)\n\nCC-BY licensed content from the [Galaxy Training Network](https://training.galaxyproject.org/)\n\n**Objectives**\n\n- to do\n\n**Objectives**\n\n- to do\n\n**Time Estimation: 1H**\n</div>\n",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-0",
      "source": "<blockquote class=\"agenda\" style=\"border: 2px solid #86D486;display: none; margin: 1em 0.2em\">\n<div class=\"box-title agenda-title\" id=\"agenda\">Agenda</div>\n<p>In this tutorial, we will cover:</p>\n<ol id=\"markdown-toc\">\n<li><a href=\"#basic-python\" id=\"markdown-toc-basic-python\">Basic python</a></li>\n</ol>\n</blockquote>\n<h2 id=\"basic-python\">Basic python</h2>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-1",
      "source": [
        "\n",
        "X = []\n",
        "\n",
        "for i in range(10):\n",
        "    X.append( i**2 )\n",
        "\n",
        "print(X)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-2",
      "source": "<p>[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-3",
      "source": [
        "\n",
        "for x in X:\n",
        "    print(x)\n",
        ""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-4",
      "source": "<p>0\n    1\n    4\n    9\n    16\n    25\n    36\n    49\n    64\n    81</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-5",
      "source": [
        "for x in X:\n",
        "    if x%2 == 1:\n",
        "        print(x,'is odd')\n",
        "    else:\n",
        "        print(x,'is even')"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-6",
      "source": "<p>0 is even\n    1 is odd\n    4 is even\n    9 is odd\n    16 is even\n    25 is odd\n    36 is even\n    49 is odd\n    64 is even\n    81 is odd</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-7",
      "source": [
        "# list comprehension is a very fine way of compressing all this\n",
        "\n",
        "X = [ i**2 for i in range(10) ]\n",
        "\n",
        "Xeven = [ x for x in X if x%2 == 0 ]\n",
        "Xodd = [ x for x in X if x%2 == 1 ]\n",
        "\n",
        "\n",
        "print( 'X    ', X )\n",
        "print( 'Xeven', Xeven )\n",
        "print( 'Xodd ', Xodd )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-8",
      "source": "<p>X     [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n    Xeven [0, 4, 16, 36, 64]\n    Xodd  [1, 9, 25, 49, 81]</p>\n<p><a href=\"#top\">back to the top</a></p>\n<h2 id=\"numpy-and-vectorized-operations\">numpy and vectorized operations</h2>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-9",
      "source": [
        "import numpy as np\n",
        "\n",
        "X_array = np.array(X)\n",
        "\n",
        "print(X_array)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-10",
      "source": "<p>[ 0  1  4  9 16 25 36 49 64 81]</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-11",
      "source": [
        "print(X_array / 2 )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-12",
      "source": "<p>[ 0.   0.5  2.   4.5  8.  12.5 18.  24.5 32.  40.5]</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-13",
      "source": [
        "print( np.exp(X_array ) )\n",
        "print( np.log(X_array ) )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-14",
      "source": "<p>[1.00000000e+00 2.71828183e+00 5.45981500e+01 8.10308393e+03\n     8.88611052e+06 7.20048993e+10 4.31123155e+15 1.90734657e+21\n     6.23514908e+27 1.50609731e+35]\n    [      -inf 0.         1.38629436 2.19722458 2.77258872 3.21887582\n     3.58351894 3.8918203  4.15888308 4.39444915]</p>\n<div class=\"language-plaintext highlighter-rouge\"><div><pre style=\"color: inherit; background: transparent\"><code style=\"color: inherit\">/tmp/ipykernel_490123/2855859755.py:2: RuntimeWarning: divide by zero encountered in log\n  print( np.log(X_array ) )\n</code></pre></div></div>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-15",
      "source": [
        "print( 'shape' , X_array.shape )\n",
        "print( 'mean ' , np.mean(X_array) )\n",
        "print( 'standard deviation' , np.std(X_array) )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-16",
      "source": "<p>shape (10,)\n    mean  28.5\n    standard deviation 26.852374196707448</p>\n<h3 id=\"linspace-and-arange\">linspace and arange</h3>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-17",
      "source": [
        "print( 'linspace 0,2,9 :' , np.linspace(0,2,9) , sep='\\t' )\n",
        "print( 'linspace -0.5,0.5,11 :' , np.linspace(-0.5,0.5,11) , sep='\\t' )\n",
        "print( 'linspace 10,0,11 :' , np.linspace(10,0,11) , sep='\\t' )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-18",
      "source": "<p>linspace 0,2,9 :\t[0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]\n    linspace -0.5,0.5,11 :\t[-0.5 -0.4 -0.3 -0.2 -0.1  0.   0.1  0.2  0.3  0.4  0.5]\n    linspace 10,0,11 :\t[10.  9.  8.  7.  6.  5.  4.  3.  2.  1.  0.]</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-19",
      "source": [
        "print( \"arange 0,2,0.1 :\", np.arange(1.5,2,0.1) , sep='\\t' )\n",
        "print( \"arange -1,1,0.125 :\", np.arange(-1,1,0.125) , sep='\\t' )\n",
        "print( \"arange 10,2 :\", np.arange(10,2,1) , sep='\\t' ) # reverse does not work!"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-20",
      "source": "<p>arange 0,2,0.1 :\t[1.5 1.6 1.7 1.8 1.9]\n    arange -1,1,0.125 :\t[-1.    -0.875 -0.75  -0.625 -0.5   -0.375 -0.25  -0.125  0.     0.125\n      0.25   0.375  0.5    0.625  0.75   0.875]\n    arange 10,2 :\t[]</p>\n<h2 id=\"basic-plotting\">Basic plotting</h2>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-21",
      "source": [
        "import matplotlib.pyplot as plt\n",
        "\n",
        "plt.plot( [0,1,2,3] , [10,5,7,0.2] )\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-22",
      "source": "<h3 id=\"adding-color-symbols-\">Adding color, symbols, …</h3>\n<p><code style=\"color: inherit\">matplotlib</code> offers many options to customize the appearance of your plot.</p>\n<p>Here are the (some) common arguments to <code style=\"color: inherit\">plot()</code> (which can also be applied to many other graphical representations):</p>\n<ul>\n<li><code style=\"color: inherit\">color</code> : could be given as a (red,green,blue) tuple, a <a href=\"https://matplotlib.org/3.1.0/gallery/color/named_colors.html\">name</a>, a hex code, …  (see <a href=\"https://matplotlib.org/tutorials/colors/colors.html\">Something better here</a> for all the options)</li>\n<li><code style=\"color: inherit\">marker</code> : symbols for the data point. <code style=\"color: inherit\">'.'</code> is a point, <code style=\"color: inherit\">'v'</code> a down triangle, … see <a href=\"https://matplotlib.org/3.3.3/api/markers_api.html#module-matplotlib.markers\">Something better here</a> for the list of possibilities.</li>\n<li><code style=\"color: inherit\">linestyle</code> : style of the line. <code style=\"color: inherit\">'-'</code> is solid, <code style=\"color: inherit\">'--'</code> is dashed, <code style=\"color: inherit\">''</code> for no line. See <a href=\"https://matplotlib.org/3.3.3/gallery/lines_bars_and_markers/linestyles.html\">Something better here</a> for more options</li>\n<li><code style=\"color: inherit\">linewidth</code> : width of the lines</li>\n<li><code style=\"color: inherit\">markersize</code> : size of the markers</li>\n</ul>\n<p>You are invited to experiment and explore these options. Here are a few examples:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-23",
      "source": [
        "y1 = [1,2,3,10,5]\n",
        "y2 = [10,9,7,5.5,6]\n",
        "y3 = [4,3,1.5,1]\n",
        "\n",
        "# green, dashed line, with circle markers\n",
        "plt.plot( y1, color = 'green', marker = 'o', linestyle = '--', linewidth = 2, markersize = 8 )\n",
        "\n",
        "# blue triangle with no line\n",
        "plt.plot( y2, color = 'blue', marker = 'v', linestyle = '' , markersize = 16 )\n",
        "\n",
        "# solid orange line\n",
        "plt.plot(y3, color = 'orange', marker = '', linestyle = '-', linewidth = 4 )\n",
        "\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-24",
      "source": "<p>Note that:</p>\n<ul>\n<li>you can call plot several time in a row to make several lines appear (only <code style=\"color: inherit\">plt.show()</code> causes the figure to appear)</li>\n<li>the frame of the picture automatically adjust to what it needs to show</li>\n</ul>\n<h3 id=\"multiple-subplots\">multiple subplots</h3>\n<p>Now would normally be when we show you how to add labels, titles and legends to figures.</p>\n<p>However, the way <code style=\"color: inherit\">matplotlib</code> is built, it is actually a bit more efficient to first learn how to create multiple subplots.</p>\n<p>Creating multiple plots is possible with the function <code style=\"color: inherit\">plt.subplots()</code>.\nAmon its many arguments, it takes:</p>\n<ul>\n<li><code style=\"color: inherit\">nrows</code> : number of subplot rows</li>\n<li><code style=\"color: inherit\">ncols</code> : number of subplot columns</li>\n<li><code style=\"color: inherit\">figsize</code> : tuple (width,height) of the figure</li>\n</ul>\n<p>This function creates a Figure and an Axes object.\nThe Axes object can be either :</p>\n<ul>\n<li>a simple Axe is there is 1 row and 1 columns</li>\n<li>a list of Axe objects if there is 1 row and multiple columns, or 1 column and multiple rows</li>\n<li>a list of lists of Axes objects if there is multiple rows and multiple columns</li>\n</ul>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-25",
      "source": [
        "y1 = [1,2,3,10,5]\n",
        "y2 = [10,9,7,5.5,6]\n",
        "y3 = [4,3,1.5,1]\n",
        "\n",
        "\n",
        "# subplots returns a Figure and an Axes object\n",
        "fig, ax = plt.subplots(nrows=1, ncols=2) # 2 columns and 1 row\n",
        "\n",
        "# ax is a list with two objects. Each object correspond to 1 subplot\n",
        "\n",
        "# accessing to the first column ax[0]\n",
        "ax[0].plot( y1, color = 'green', marker = 'o', linestyle = '--', linewidth = 2, markersize = 8 )\n",
        "\n",
        "# accessing to the second column ax[1]\n",
        "ax[1].plot( y2, color = 'blue', marker = 'v', linestyle = '' , markersize = 16 )\n",
        "ax[1].plot( y3, color = 'orange', marker = '', linestyle = '-' )\n",
        "\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-26",
      "source": "<p>Notice how we call <code style=\"color: inherit\">ax[0].plot(...)</code> instead of <code style=\"color: inherit\">plt.plot(...)</code> to specify in which subplots we want to plot.</p>\n<h3 id=\"multiple-subplots---continued\">multiple subplots - continued</h3>\n<p>Let’s see the same thing with several lines and several columns</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-27",
      "source": [
        "y1 = [1,2,3,10,5]\n",
        "y2 = [10,9,7,5.5,6]\n",
        "y3 = [4,3,1.5,1]\n",
        "y4 = [1,2,3,7,5]\n",
        "\n",
        "# 2 columns and 2 rows, and we also set the figure size\n",
        "fig, ax = plt.subplots(nrows=2, ncols=2 , figsize = (12,12))\n",
        "\n",
        "# ax is a list of two lists with two objects each.\n",
        "\n",
        "# accessing to the first row, first column : ax[0][0]\n",
        "ax[0][0].plot( y1, color = 'green', marker = 'o', linestyle = '--', linewidth = 2, markersize = 8 )\n",
        "\n",
        "# accessing to the first row, second column : ax[0][1]\n",
        "ax[0][1].plot( y2, color = 'blue', marker = 'v', linestyle = '' , markersize = 16 )\n",
        "\n",
        "# accessing to the second row, first column : ax[1][0]\n",
        "ax[1][0].plot( y3, color = 'orange', marker = 'x', linestyle = '-' )\n",
        "\n",
        "# accessing to the first row, second column : ax[1][1]\n",
        "ax[1][1].plot( y4, color = 'teal', linestyle = '-.' , linewidth=5 )\n",
        "\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-28",
      "source": "<h3 id=\"setting-up-labels\">setting up labels</h3>\n<p>To set the labels at the x-axis, y-axis and title, we use the method of the Axe object:</p>\n<ul>\n<li><code style=\"color: inherit\">.set_xlabel(...)</code></li>\n<li><code style=\"color: inherit\">.set_ylabel(...)</code></li>\n<li><code class=\"language-plaintext highlighter-rouge\">.set_title(...) </code></li>\n</ul>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-29",
      "source": [
        "y1 = [1,2,3,10,5]\n",
        "y2 = [10,9,7,5.5,6]\n",
        "y3 = [4,3,1.5,1]\n",
        "\n",
        "# subplots returns a Figure and an Axes object\n",
        "fig, ax = plt.subplots(nrows=1, ncols=2 , figsize=(10,5)) # 2 columns and 1 row\n",
        "\n",
        "\n",
        "# accessing to the first column ax[0]\n",
        "ax[0].plot( y1, color = 'green', marker = 'o', linestyle = '--', linewidth = 2, markersize = 8 )\n",
        "ax[0].set_xlabel('x-axis label')\n",
        "ax[0].set_ylabel('y-axis label')\n",
        "ax[0].set_title('plot 1')\n",
        "\n",
        "\n",
        "# accessing to the second column ax[1]\n",
        "ax[1].plot( y2, color = 'blue', marker = 'v', linestyle = '' , markersize = 16 )\n",
        "ax[1].plot( y3, color = 'orange', marker = '', linestyle = '-' )\n",
        "ax[1].set_xlabel('x-axis label')\n",
        "ax[1].set_ylabel('y-axis label')\n",
        "ax[1].set_title('plot 2')\n",
        "\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-30",
      "source": "<p><strong>setting up a legend</strong></p>\n<p>Each element we add to the figure using <code style=\"color: inherit\">plot()</code> can be given a label using the <code style=\"color: inherit\">label</code> argument.\nThen, a legend may be added to the figure using the <code style=\"color: inherit\">legend()</code> method.</p>\n<p>This <code style=\"color: inherit\">legend()</code> method can take a <code style=\"color: inherit\">loc</code> argument that specifies where it should be plotted.\nPossible values for this argument are: <code style=\"color: inherit\">'best' , 'upper right' , 'upper left' , 'lower left' , 'lower right' , 'right' , 'center left' , 'center right' , 'lower center' , 'upper center' , 'center'</code> (the default is <code style=\"color: inherit\">best</code>).</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-31",
      "source": [
        "\n",
        "fig, ax = plt.subplots(nrows=1, ncols=1 , figsize=(10,5)) # 2 columns and 1 row\n",
        "\n",
        "# NB : with 1 col and 1 row, ax is directly the sole subplot we have\n",
        "#      so to call it we just use ax.plot , ax.set_xlabel , ...\n",
        "\n",
        "ax.plot( y1, color = 'green', marker = 'o', linestyle = '--', linewidth = 2 , label = 'line A' )\n",
        "ax.plot( y2, color = 'blue', marker = 'v', linestyle = '' , markersize =  8 , label = 'line B' )\n",
        "ax.plot( y3, color = 'orange', marker = '', linestyle = '-' , linewidth = 2 , label = 'line C' )\n",
        "\n",
        "ax.set_xlabel('x-axis label')\n",
        "ax.set_ylabel('y-axis label')\n",
        "ax.set_title('plot with a legend')\n",
        "\n",
        "#adding a legend in the upper right\n",
        "ax.legend( loc='upper right')\n",
        "\n",
        "plt.show()\n",
        ""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-32",
      "source": "<h3 id=\"additional--writing-a-figure-to-a-file\">additional : writing a figure to a file</h3>\n<p>Writing a matplotlib figure to a file can be achieved simply by replacing the call to <code style=\"color: inherit\">plt.show()</code> to <code style=\"color: inherit\">plt.savefig(...)</code>.</p>\n<p><code style=\"color: inherit\">plt.savefig</code> takes a number of argument, the most commons are :</p>\n<ul>\n<li><code style=\"color: inherit\">fname</code> : name of the file to write the figure. The extension is used to determine the output format (.pdf,.png, .jpg , .svg ,  …). Many formats are supported, you can get a list with this command : <code style=\"color: inherit\">plt.gcf().canvas.get_supported_filetypes()</code></li>\n<li><code style=\"color: inherit\">dpi</code> : dots per inches , useful to set-up when saving to raster formats (ie., pixel-based such as png or jpeg). The actual size of the image is set using the argument <code style=\"color: inherit\">figsize</code> of <code style=\"color: inherit\">plt.subplots()</code></li>\n</ul>\n<blockquote class=\"comment\" style=\"border: 2px solid #ffecc1; margin: 1em 0.2em\">\n<div class=\"box-title comment-title\" id=\"comment\"><i class=\"far fa-comment-dots\" aria-hidden=\"true\" ></i> Comment</div>\n<p>in a jupyter notebook the figure will still be shown, whereas in a standard .py script it will not appear on screen.</p>\n</blockquote>\n<p>Here is a demonstration. Apply in on your side and verify that the file <code style=\"color: inherit\">testPlot.png</code> was created:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-33",
      "source": [
        "import matplotlib.pyplot as plt\n",
        "\n",
        "y1 = [1,2,3,10,5]\n",
        "y2 = [10,9,7,5.5,6]\n",
        "y3 = [4,3,1.5,1]\n",
        "\n",
        "\n",
        "# subplots returns a Figure and an Axes object\n",
        "fig, ax = plt.subplots(nrows=1, ncols=2 , figsize = (10,6) ) # 2 columns and 1 row\n",
        "\n",
        "# ax is a list with two objects. Each object correspond to 1 subplot\n",
        "\n",
        "# accessing to the first column ax[0]\n",
        "ax[0].plot( y1, color = 'green', marker = 'o', linestyle = '--', linewidth = 2, markersize = 8 )\n",
        "\n",
        "# accessing to the second column ax[1]\n",
        "ax[1].plot( y2, color = 'blue', marker = 'v', linestyle = '' , markersize = 16 )\n",
        "ax[1].plot( y3, color = 'orange', marker = '', linestyle = '-' )\n",
        "\n",
        "plt.savefig( 'testPlot.png' , dpi = 90  )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-34",
      "source": "<h2 id=\"exercise-0001--bringing-together-numpy-and-matplotlib\">Exercise 00.01 : bringing together numpy and matplotlib</h2>\n<p>Numpy arrays can be plotted as if they were lists.</p>\n<ol>\n<li>plot x and y, where:\n<ul>\n<li>y = 1/(1+exp(-x))</li>\n<li>x varies between -5 and 5 (plotting around a 100 points should suffice).</li>\n</ul>\n</li>\n<li><strong>Bonus :</strong> plot multiples lines : y = 1/(1+exp(-x*b)) , for the following values of b: 0.5 , 1 , 2 , 4.\n<ul>\n<li>x still varies between -5 and 5 (plotting around a 100 points should suffice).</li>\n<li>put a legend in your plot</li>\n</ul>\n</li>\n</ol>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-35",
      "source": [
        ""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-36",
      "source": "<p>You can load the solution directly in this notebook by uncommenting and running the following line:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-37",
      "source": [
        "# %load  -r -8 solutions/solution_00_01.py"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-38",
      "source": "<p>bonus question solution:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-39",
      "source": [
        "# %load  -r 9- solutions/solution_00_01.py"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-40",
      "source": "<h2 id=\"generating-random-numbers\">Generating random numbers</h2>\n<h3 id=\"the-basics\">the basics</h3>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-41",
      "source": [
        "import numpy.random as rd\n",
        "\n",
        "# random floats between 0 and 1\n",
        "for i in range(4):\n",
        "    print( rd.random() )\n",
        ""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-42",
      "source": "<p>0.6696103730869407\n    0.7426639266737763\n    0.6767219223242785\n    0.8602105555191791</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-43",
      "source": [
        "print( rd.random(size=10) ) # draw directly 10 numbers"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-44",
      "source": "<p>[0.37971723 0.80354745 0.4168427  0.70867247 0.17547126 0.43760884\n     0.75933345 0.06571168 0.45772397 0.67191214]</p>\n<h3 id=\"setting-the-seed-pseudorandomness-and-reproducibility\">setting the seed: pseudorandomness and reproducibility</h3>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-45",
      "source": [
        "rd.seed(42) # setting the seed to 42\n",
        "print( '1st draw' , rd.random(size=5) )\n",
        "print( '2nd draw' , rd.random(size=5) )\n",
        "rd.seed(42)\n",
        "print( 'after resetting seed' , rd.random(size=5) )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-46",
      "source": "<p>1st draw [0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]\n    2nd draw [0.15599452 0.05808361 0.86617615 0.60111501 0.70807258]\n    after resetting seed [0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]</p>\n<h3 id=\"beyond-the-uniform-distribution\">beyond the uniform distribution</h3>\n<p>numpy offers you quite a large <a href=\"https://docs.scipy.org/doc/numpy-1.15.0/reference/routines.random.html#distributions\">set of distributions you can draw from</a>.</p>\n<p>Let’s look at the normal distribution:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-47",
      "source": [
        "\n",
        "normalDraw = rd.normal(size = 1000 )\n",
        "\n",
        "print( 'mean ' , np.mean( normalDraw ) )\n",
        "print( 'stdev' , np.std( normalDraw ) )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-48",
      "source": "<p>mean  0.025354699638558926\n    stdev 1.0003731428167348</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-49",
      "source": [
        "normalDraw2 = rd.normal( loc = -2 , scale = 3 , size = 300 ) # loc chnages the location (mean), and scale changes the standard deviation\n",
        "\n",
        "print( 'mean ' , np.mean( normalDraw2 ) )\n",
        "print( 'stdev' , np.std( normalDraw2 ) )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-50",
      "source": "<p>mean  -1.9773491637651965\n    stdev 2.964622032924749</p>\n<p>of course, we could want to plot these drawn numbers:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-51",
      "source": [
        "plt.hist( normalDraw  , alpha = 0.5 , label='loc=0  , scale=1')\n",
        "plt.hist( normalDraw2 , alpha = 0.5 , label='loc=-2 , scale=3')\n",
        "plt.legend()\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-52",
      "source": "<h2 id=\"statistical-testing\">Statistical testing</h2>\n<p><code style=\"color: inherit\">numpy.random</code> let’s you draw random numbers ;\n<code style=\"color: inherit\">scipy.stats</code> implements the probability density functions, and Percent point function, as well as the most statistical tests.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-53",
      "source": [
        "import scipy.stats as stats\n",
        "\n",
        "# plotting the probability density function for one of the random draws we just made:\n",
        "\n",
        "x = np.linspace(-10,10,1001)\n",
        "\n",
        "normPDF = stats.norm.pdf( x , loc = -2 , scale = 3 )\n",
        "\n",
        "plt.hist( normalDraw2 , alpha = 0.5 , label='random draw' , density = True) # density=True normalizes the histogram so it is comparable to the PDF\n",
        "plt.plot(x,normPDF , label='PDF' )\n",
        "plt.legend()\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-54",
      "source": "<p>We can also get the expected quantiles of a distribution:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-55",
      "source": [
        "print( '95% quantile of a Chi-square distribution with 3 degrees of freedom:', stats.chi2.ppf(0.95 , df=3))\n",
        "print( 'fraction of a Chi-square distribution with 3 degrees of freedom above or equal to 5' ,\n",
        "      1 - stats.chi2.cdf( 5 , df=3 ) )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-56",
      "source": "<p>95% quantile of a Chi-square distribution with 3 degrees of freedom: 7.814727903251179\n    fraction of a Chi-square distribution with 3 degrees of freedom above or equal to 5 0.17179714429673354</p>\n<p>And you can apply some classical statistical tests:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-57",
      "source": [
        "# t-test for a difference in means between two independent random samples:\n",
        "rd.seed(73)\n",
        "\n",
        "s1 = rd.normal(size=67)\n",
        "s2 = rd.normal(size=54 , loc = 0.2)\n",
        "\n",
        "testStat , pval = stats.ttest_ind(s1,s2 , equal_var=True)  # equal variance : Student's t-test ; unequal : Welch's\n",
        "# almost all of these stats functions return a (test statistic, p-value) tuple\n",
        "\n",
        "print('result of the t-test')\n",
        "print('\\tt:',testStat)\n",
        "print('\\tp-value:',pval)"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-58",
      "source": "<p>result of the t-test\n        t: 0.26673986193074073\n        p-value: 0.7901311339594405</p>\n<h3 id=\"what-is-our-conclusion-for-these-tests-results-what-do-you-think-about-this\">What is our conclusion from these test results? What do you think about this?</h3>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-59",
      "source": [
        "\n",
        "# Kolmogorov-Smirnov test for a chi-square distribution\n",
        "\n",
        "sample = rd.chisquare(df=13 , size = 43)\n",
        "\n",
        "\n",
        "# kstest expects the cdf function of the reference distribution as its second argument\n",
        "# freezing the distribution with df=13 is how we handle the fact that we must set a parameter (degrees of freedom)\n",
        "refDistribution = stats.chi2(df=13).cdf\n",
        "\n",
        "testStat , pval = stats.kstest( sample , refDistribution )\n",
        "# alternative :\n",
        "# testStat , pval = stats.kstest( sample , lambda x : stats.chi2.cdf(x , df=13 ) )\n",
        "\n",
        "print('result of the Kolmogorov-Smirnov test comparing our sample to a Chi-square distribution with 13 degrees of freedom')\n",
        "print('\\tK:',testStat)\n",
        "print('\\tp-value:',pval)\n",
        ""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-60",
      "source": "<p>result of the Kolmogorov-Smirnov test comparing our sample to a Chi-square distribution with 13 degrees of freedom\n        K: 0.12249766392962913\n        p-value: 0.5003109000967569</p>\n<p>If you are interested, this <a href=\"https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/\">webpage</a> lists many commonly used statistical tests, with examples.</p>\n<p><a href=\"#top\">back to the top</a></p>\n<h2 id=\"bringing-together-numpy-numpyrandom-and-matplotlib\">Bringing together numpy, numpy.random, and matplotlib</h2>\n<p>The random generation functions return numpy arrays, so it is straightforward to combine them with other arrays:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-61",
      "source": [
        "# combining\n",
        "\n",
        "x = np.sort( rd.normal(loc=170 , scale = 23 , size = 100) )\n",
        "\n",
        "y_theoretical = 0.75 * x + 100 # simple linear relationship : y = a * x + b\n",
        "\n",
        "measurement_noise = rd.normal(scale = 10 , size = 100) # some noise associated with the measurement\n",
        "\n",
        "y_observed = y_theoretical + measurement_noise # observed = expected + noise\n",
        "\n",
        "fig,ax = plt.subplots(figsize=(8,8))\n",
        "plt.plot( x , y_theoretical , label = 'expected' )\n",
        "plt.plot( x , y_observed , marker = '.' , linestyle='' , alpha = 0.7 , label = 'observed')\n",
        "plt.legend()\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-62",
      "source": "<h2 id=\"the-briefest-intro-to-pandas\">The briefest intro to pandas</h2>\n<p><code style=\"color: inherit\">pandas</code> is a powerful library for data analysis, especially for tabular data.</p>\n<p>Basically, it provides a DataFrame object similar to R's data.frame and ties in neatly with the libraries we’ve just seen.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-63",
      "source": [
        "import pandas as pd\n",
        "\n",
        "df = pd.read_csv( 'data/beetle.csv' , index_col=0 ) # by default, pandas uses the first line as the header\n",
        "\n",
        "df.head()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-64",
      "source": "<div>\n<style scoped=\"\">\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th></th>\n<th>dose</th>\n<th>nexp</th>\n<th>ndied</th>\n<th>prop</th>\n<th>nalive</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<th>1</th>\n<td>49.1</td>\n<td>59</td>\n<td>6</td>\n<td>0.102</td>\n<td>53</td>\n</tr>\n<tr>\n<th>2</th>\n<td>53.0</td>\n<td>60</td>\n<td>13</td>\n<td>0.217</td>\n<td>47</td>\n</tr>\n<tr>\n<th>3</th>\n<td>56.9</td>\n<td>62</td>\n<td>18</td>\n<td>0.290</td>\n<td>44</td>\n</tr>\n<tr>\n<th>4</th>\n<td>60.8</td>\n<td>56</td>\n<td>28</td>\n<td>0.500</td>\n<td>28</td>\n</tr>\n<tr>\n<th>5</th>\n<td>64.8</td>\n<td>63</td>\n<td>52</td>\n<td>0.825</td>\n<td>11</td>\n</tr>\n</tbody>\n</table>\n</div>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-65",
      "source": [
        "Nrows, Ncols = df.shape\n",
        "print( 'number of rows:',Nrows, 'number of columns:', Ncols )\n",
        "print( 'column names' , df.columns )"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-66",
      "source": "<p>number of rows: 8 number of columns: 5\n    column names Index(['dose', 'nexp', 'ndied', 'prop', 'nalive'], dtype='object')</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-67",
      "source": [
        "df.describe()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-68",
      "source": "<div>\n<style scoped=\"\">\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th></th>\n<th>dose</th>\n<th>nexp</th>\n<th>ndied</th>\n<th>prop</th>\n<th>nalive</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<th>count</th>\n<td>8.000000</td>\n<td>8.000000</td>\n<td>8.000000</td>\n<td>8.000000</td>\n<td>8.000000</td>\n</tr>\n<tr>\n<th>mean</th>\n<td>62.800000</td>\n<td>60.125000</td>\n<td>36.375000</td>\n<td>0.602000</td>\n<td>23.750000</td>\n</tr>\n<tr>\n<th>std</th>\n<td>9.599702</td>\n<td>2.232071</td>\n<td>22.557466</td>\n<td>0.367937</td>\n<td>21.985385</td>\n</tr>\n<tr>\n<th>min</th>\n<td>49.100000</td>\n<td>56.000000</td>\n<td>6.000000</td>\n<td>0.102000</td>\n<td>0.000000</td>\n</tr>\n<tr>\n<th>25%</th>\n<td>55.925000</td>\n<td>59.000000</td>\n<td>16.750000</td>\n<td>0.271750</td>\n<td>4.750000</td>\n</tr>\n<tr>\n<th>50%</th>\n<td>62.800000</td>\n<td>60.000000</td>\n<td>40.000000</td>\n<td>0.662500</td>\n<td>19.500000</td>\n</tr>\n<tr>\n<th>75%</th>\n<td>69.675000</td>\n<td>62.000000</td>\n<td>54.750000</td>\n<td>0.919500</td>\n<td>44.750000</td>\n</tr>\n<tr>\n<th>max</th>\n<td>76.500000</td>\n<td>63.000000</td>\n<td>61.000000</td>\n<td>1.000000</td>\n<td>53.000000</td>\n</tr>\n</tbody>\n</table>\n</div>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-69",
      "source": [
        "# select a single column:\n",
        "df['dose']"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-70",
      "source": "<p>1    49.1\n    2    53.0\n    3    56.9\n    4    60.8\n    5    64.8\n    6    68.7\n    7    72.6\n    8    76.5\n    Name: dose, dtype: float64</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-71",
      "source": [
        "df[ ['ndied','nalive'] ] # select several columns"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-72",
      "source": "<div>\n<style scoped=\"\">\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th></th>\n<th>ndied</th>\n<th>nalive</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<th>1</th>\n<td>6</td>\n<td>53</td>\n</tr>\n<tr>\n<th>2</th>\n<td>13</td>\n<td>47</td>\n</tr>\n<tr>\n<th>3</th>\n<td>18</td>\n<td>44</td>\n</tr>\n<tr>\n<th>4</th>\n<td>28</td>\n<td>28</td>\n</tr>\n<tr>\n<th>5</th>\n<td>52</td>\n<td>11</td>\n</tr>\n<tr>\n<th>6</th>\n<td>53</td>\n<td>6</td>\n</tr>\n<tr>\n<th>7</th>\n<td>61</td>\n<td>1</td>\n</tr>\n<tr>\n<th>8</th>\n<td>60</td>\n<td>0</td>\n</tr>\n</tbody>\n</table>\n</div>\n<h3 id=\"plotting-dataframe-columns\">Plotting DataFrame Columns</h3>\n<p>Because <code style=\"color: inherit\">DataFrame</code> columns are iterable, they can be passed directly as arguments to <code style=\"color: inherit\">plot()</code>.</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-73",
      "source": [
        "\n",
        "# plotting the column dose along the x-axis and prop along the y-axis\n",
        "# I use the + marker, with a teal color.\n",
        "plt.plot(df['dose'] , df['prop'] , color = 'teal' , linestyle='' , marker = '+' , markersize=10 )\n",
        "plt.xlabel( 'dose' )\n",
        "plt.ylabel( 'proportion of dead' )\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-74",
      "source": "<p>DataFrame columns can be manipulated like numpy arrays:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-75",
      "source": [
        "\n",
        "## we can combine columns using normal operators\n",
        "Odds = df['nalive'] / df['ndied'] # the odds of being alive are nalive / ndied\n",
        "\n",
        "## adding a new column to the DataFrame is trivial:\n",
        "df['Odds'] = Odds\n",
        "\n",
        "\n",
        "## we can also apply numpy functions to them\n",
        "## note: nalive is 0 in the last row, so its odds are 0 and its log odds are -inf\n",
        "df['logOdds'] = np.log( df['Odds'] )\n",
        "\n",
        "\n",
        "plt.plot(df['dose'] , df['logOdds'] , color = 'teal' , linestyle='' , marker = '+' , markersize=10 )\n",
        "plt.xlabel( 'dose' )\n",
        "plt.ylabel( 'log Odds' )\n",
        "plt.show()\n",
        ""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-76",
      "source": "<h2 id=\"exercise-0002--tying-everything-together\">Exercise 00.02: tying everything together</h2>\n<ol>\n<li>Read the file <code style=\"color: inherit\">'data/kyphosis.csv'</code>.</li>\n<li>How many columns are there?</li>\n<li>What is the maximum Age?</li>\n<li>Create a new column <code style=\"color: inherit\">Stop</code>, corresponding to the sum of columns <code style=\"color: inherit\">'Start'</code> and <code style=\"color: inherit\">'Number'</code>.</li>\n<li>Plot the relationship between <code style=\"color: inherit\">'Age'</code> and <code style=\"color: inherit\">'Number'</code> (bonus point: use colors to indicate the presence or absence of kyphosis).</li>\n</ol>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-77",
      "source": [
        ""
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-78",
      "source": "<p>Solutions:</p>\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-79",
      "source": [
        "# %load  -r -7 solutions/solution_00_02.py"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-80",
      "source": "\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-81",
      "source": [
        "# %load  -r 8-9 solutions/solution_00_02.py"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-82",
      "source": "\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-83",
      "source": [
        "# %load  -r 11-12 solutions/solution_00_02.py"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-84",
      "source": "\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-85",
      "source": [
        "# %load  -r 14-15 solutions/solution_00_02.py"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-86",
      "source": "\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-87",
      "source": [
        "# %load  -r 17-22 solutions/solution_00_02.py"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-88",
      "source": "\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "id": "cell-89",
      "source": [
        "# %load  -r 24- solutions/solution_00_02.py"
      ],
      "cell_type": "code",
      "execution_count": null,
      "outputs": [],
      "metadata": {
        "attributes": {
          "classes": [
            "> In this tutorial, we will cover:"
          ],
          "id": ""
        }
      }
    },
    {
      "id": "cell-90",
      "source": "\n",
      "cell_type": "markdown",
      "metadata": {
        "editable": false,
        "collapsed": false
      }
    },
    {
      "cell_type": "markdown",
      "id": "final-ending-cell",
      "metadata": {
        "editable": false,
        "collapsed": false
      },
      "source": [
        "# Key Points\n\n",
        "- to do\n",
        "\n# Congratulations on successfully completing this tutorial!\n\n",
        "Please [fill out the feedback on the GTN website](https://training.galaxyproject.org/training-material/topics/data-science/tutorials/python-warmup-stat-ml/tutorial.html#feedback) and check there for further resources!\n"
      ]
    }
  ]
}