For today's workshop we will be using the pandas library, the matplotlib library, and the seaborn library. We will also read data from the web with pandas-datareader. By the end of the workshop, participants should be able to use Python to tell a story about a dataset they build from an open data source.
Use Matplotlib to visualize data
Use Seaborn to explore data
Use pandas-datareader to access and compare development indicators from the World Bank
An important component of Jupyter notebooks is the Markdown cell. These cells allow you to type text, write mathematics, and even render a number of HTML tags. We will use Markdown cells to help describe our work in the Jupyter notebook.
To change a cell to a Markdown cell, you can use either the menu bar or the keyboard shortcut ctrl + m + m. From here, you can type Markdown syntax, @@0@@, HTML, and even more code styles. We will typically use the features demonstrated below:
    # Header 1
    ## Header 2
    ### Header 3
    *italic*
    **bold**
    ![](image/filepath.png)
    @@1@@
    @@2@@
Here are two cheatsheets to help you with markdown syntax and @@3@@ symbols.
One important note is to keep the files in our notebook directories organized. We will use the convention of having an image subdirectory where we will store our images and datasets. Thus, if we have a picture of a dog in our image folder, we can show it with a Markdown image tag that points to the file inside that folder.
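As a small sketch of this convention, the helper below builds the Markdown image tag for a file stored in the image folder. The helper name markdown_image and the file name dog.png are hypothetical, chosen only for illustration; the string it prints is what you would type into a Markdown cell.

```python
from pathlib import PurePosixPath


def markdown_image(filename, alt_text="", folder="image"):
    """Return a Markdown image tag for a file kept in the image subdirectory.

    `folder` defaults to the workshop convention of an `image` subdirectory.
    """
    # Markdown paths use forward slashes, so build the path POSIX-style.
    path = PurePosixPath(folder) / filename
    return f"![{alt_text}]({path})"


# dog.png is a hypothetical file name used only for this example.
tag = markdown_image("dog.png", alt_text="a dog")
print(tag)  # ![a dog](image/dog.png)
```

Pasting the printed tag into a Markdown cell displays the picture, provided the file really exists at that relative path.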
Functions are a very important idea for us. Here, we define a function that takes some input and returns an output. We can define these however we want, so let's examine a mathematical and a non-mathematical example.
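A minimal sketch of the two kinds of function might look like the following. The names f and greet, and the particular formula, are assumptions made for illustration rather than the workshop's own examples.

```python
def f(x):
    """Mathematical example: evaluate f(x) = x**2 + 1."""
    return x**2 + 1


def greet(name):
    """Non-mathematical example: build a greeting from a name."""
    return f"Hello, {name}!"


print(f(3))               # 10
print(greet("workshop"))  # Hello, workshop!
```

In both cases the pattern is the same: the def line names the function and its input, and the return line hands back the output.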
Today we will examine two different libraries for plotting with Python. The first is the standard matplotlib library. We will keep coming back to matplotlib, as it is a very powerful library. Harnessing this power sometimes requires deep understanding, but it can do most things you'd like. In the Jupyter notebook, we will use a magic command to make sure the plots stay in the notebook, import and abbreviate the pyplot library, and import and abbreviate the numpy library.
    %matplotlib notebook
    import matplotlib.pyplot as plt
    import numpy as np
Now, when we use these libraries, we preface any function with plt or np. Here are cheatsheets for each of the libraries.
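To see the plt and np prefixes in use, here is a minimal sketch that plots a sine curve. The choice of function, the labels, and the variable names are assumptions for illustration. The magic command is left out because it is only valid inside a notebook cell.

```python
import matplotlib.pyplot as plt
import numpy as np

# np. prefix: build 100 evenly spaced x values on [0, 2*pi] and evaluate sin.
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

# plt. prefix: create a figure with one set of axes, then draw on the axes.
fig, ax = plt.subplots()
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A first matplotlib plot")
ax.legend()

# In a notebook with the magic command active the figure appears on its own;
# in a plain script you would call plt.show() here.
```

Working through the axes object (ax.plot, ax.set_title) rather than calling plt.plot directly is a design choice that tends to scale better once a figure holds several plots.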
To find out more about each of these functions, we can use the built-in help. Learn more about each of the options above by executing cells with the built-in help function.
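As a hedged sketch of the built-in help, the cell below asks Python about the built-in len function, which keeps the example free of extra imports. The same pattern should apply to any function from the libraries imported earlier, for example help(plt.plot) or help(np.linspace).

```python
import inspect

# help(obj) prints the full documentation for obj in the cell output.
help(len)

# inspect.getdoc returns the same docstring as a string, which is handy
# when you only want a quick one-line summary.
summary = inspect.getdoc(len).splitlines()[0]
print(summary)
```

In a Jupyter notebook, typing a name followed by a question mark (such as len?) opens the same documentation in a pop-up pane.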
pandas can access certain data sources through a datareader. We will use pandas_datareader to investigate data from the World Bank. For more information, please see the pandas-datareader documentation.
We will explore other examples with the datareader later, but to start let's access the World Bank's data. For a full description of the available data, look over the source from the World Bank.
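One possible way to make this first request is sketched below using the wb module of pandas_datareader. The helper name fetch_world_bank_indicator, the indicator code NY.GDP.PCAP.KD (GDP per capita in constant US dollars), the countries, and the years are all assumptions chosen for illustration. The call needs an internet connection, so it is left commented out.

```python
def fetch_world_bank_indicator(indicator, countries, start, end):
    """Download one World Bank indicator as a pandas DataFrame.

    The result is indexed by country and year, with one column
    named after the indicator code.
    """
    # Imported here so the sketch still loads where pandas-datareader
    # is not installed; in a notebook you would import it at the top.
    from pandas_datareader import wb

    return wb.download(
        indicator=indicator, country=countries, start=start, end=end
    )


# Example call (requires internet access and pandas-datareader):
# gdp = fetch_world_bank_indicator("NY.GDP.PCAP.KD", ["US", "CA", "MX"], 2005, 2015)
# gdp.head()
```

The same module also offers wb.search, which takes a text pattern and returns matching indicators, so you can look up codes without leaving the notebook.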