SQL Educational Game - Murder Mystery

Overview
Questions:
  • Who did the crime?

Objectives:
  • Explore SQL City and discover who committed the murder

  • Reinforce your experiences with SQL such as querying, filtering, and joining data.

Requirements:
Time estimation: 2 hours
Level: Intermediate Intermediate
Supporting Materials:
Published: Oct 11, 2021
Last modification: Nov 23, 2023
License: Copyright (c) 2018 NUKnightLab. Tutorial Content is licensed under MIT. The GTN Framework is licensed under MIT
purl PURL: https://gxy.io/GTN:T00108
version Revision: 6
Best viewed in a Jupyter Notebook

This tutorial is best viewed in a Jupyter notebook! You can load this notebook one of the following ways

Launching the notebook in Jupyter in Galaxy

  1. Instructions to Launch JupyterLab
  2. Open a Terminal in JupyterLab with File -> New -> Terminal
  3. Run wget https://training.galaxyproject.org/training-material/topics/data-science/tutorials/sql-game/data-science-sql-game.ipynb
  4. Select the notebook that appears in the list of files on the left.

Downloading the notebook

  1. Right click one of these links: Jupyter Notebook (With Solutions), Jupyter Notebook (Without Solutions)
  2. Save Link As..

This is not a tutorial like most GTN content but a fun exercise for you to play around and learn a bit about SQL in a more ‘practical’, and hopefully re-inforce the skills you covered in Basic and Advanced SQL skills. It makes use of the NUKnightLab/sql-mysteries SQL murder mystery project and released under open licenses:

Comment: Source

Original code for NUKnightLab/sql-mysteries is released under the MIT License. Original text and other content is released under Creative Commons CC BY-SA 4.0.

Download the database and connector:

# This preamble sets up the sql "magic" for jupyter. Use %%sql in your cells to write sql!
!python3 -m pip install ipython-sql sqlalchemy
!wget -c https://github.com/NUKnightLab/sql-mysteries/raw/master/sql-murder-mystery.db

Setup the database connection:

import sqlalchemy
engine = sqlalchemy.create_engine("sqlite:///sql-murder-mystery.db")
%load_ext sql
%sql sqlite:///sql-murder-mystery.db
%config SqlMagic.displaycon=False

Tables

Which tables are available to you? What columns do they contain? Here’s a handy reference for you:

import pandas as pd
from sqlalchemy import MetaData
m = MetaData()
m.reflect(engine)
results = []
for table in m.tables.values():
    results.append([table.name, ', '.join([c.name for c in table.c])])
pd.set_option('display.max_colwidth', None)
pd.DataFrame(results, columns=["Table", "Columns"])

Begin your search for the truth

A crime has taken place and the police are both useless and corrupt and it is up to you and your community to solve the mystery. They failed to secure their database, and now their crime scene reports are public, it is time to figure out who the murderer was.

You know that the crime was a murder that occurred sometime on Jan.15, 2018 and that it took place in SQL City.

All the clues to this mystery are buried in a huge database, and you need to use SQL to navigate through this vast network of information. Your first step to solving the mystery is to retrieve the corresponding crime scene report from the police department’s database. From there, you can use your SQL skills to find the murderer.

%%sql
select * from crime_scene_report limit 8;

Try using the ‘Insert → Cell Below’ functionality to keep track of important query results as you go!

Solution

Write the following queries in your SQL environment to check whether you’ve found the right murderer:

%%sql
INSERT INTO solution VALUES (1, "Insert the name of the person you found here");
SELECT value FROM solution;