*This notebook contains an excerpt from the [Python Data Science Handbook](http://shop.oreilly.com/product/0636920034919.do) by Jake VanderPlas; the content is available [on GitHub](https://github.com/jakevdp/PythonDataScienceHandbook).*
One common type of visualization in data science is that of geographic data.
Matplotlib's main tool for this type of visualization is the Basemap toolkit, one of several Matplotlib toolkits that live under the `mpl_toolkits` namespace.
Admittedly, Basemap feels a bit clunky to use, and often even simple visualizations take much longer to render than you might hope.
More modern solutions such as Leaflet or the Google Maps API may be a better choice for more intensive map visualizations.
Still, Basemap is a useful tool for Python users to have in their virtual toolbelts.
In this section, we'll show several examples of the type of map visualization that is possible with this toolkit.
Installation of Basemap is straightforward; if you're using conda you can type this and the package will be downloaded:
$ conda install basemap
We add just a single new import to our standard boilerplate:
Once you have the Basemap toolkit installed and imported, geographic plots are just a few lines away (the graphics in the following also require the `PIL` package in Python 2, or the `pillow` package in Python 3):
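A minimal sketch of such a plot, assuming Basemap, Matplotlib, and pillow are installed. The helper name `draw_globe`, the `GLOBE_KWARGS` dictionary, and the chosen center point (50°N, 100°W) are illustrative assumptions:

```python
# Keyword arguments for an orthographic ("globe") view.
# The center point (50N, 100W) is an illustrative choice.
GLOBE_KWARGS = dict(projection='ortho', resolution=None,
                    lat_0=50, lon_0=-100)


def draw_globe():
    """Draw a blue-marble globe; needs Basemap, Matplotlib, and pillow."""
    # Imported lazily so this module still loads where Basemap is absent.
    import matplotlib.pyplot as plt
    from mpl_toolkits.basemap import Basemap

    plt.figure(figsize=(8, 8))
    m = Basemap(**GLOBE_KWARGS)
    m.bluemarble(scale=0.5)  # downsample the background image for speed
    return m
```

Calling `draw_globe()` in a notebook should display the globe and return the map instance for further plotting.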
The meaning of the arguments to `Basemap` will be discussed momentarily.
The useful thing is that the globe shown here is not a mere image; it is a fully functioning Matplotlib axes that understands spherical coordinates and allows us to easily overplot data on the map! For example, we can use a different map projection, zoom in on North America, and plot the location of Seattle. We'll use an etopo image (which shows topographical features both on land and under the ocean) as the map background:
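One way this might look, as a sketch rather than a definitive recipe. The helper names `mark_city` and `demo`, the 8,000 km map size, and the map center (45°N, 100°W) are assumptions made for illustration. The key step is that calling the Basemap instance converts longitude and latitude in degrees to projected map coordinates:

```python
def mark_city(m, ax, lon, lat, label):
    """Project (lon, lat) with the map instance `m` and label it on `ax`."""
    # Calling a Basemap instance converts degrees to map (x, y) coordinates.
    x, y = m(lon, lat)
    ax.plot(x, y, 'ok', markersize=5)
    ax.text(x, y, ' ' + label, fontsize=12)
    return x, y


def demo():
    """North America on a Lambert conformal conic map, with Seattle marked."""
    # Imported lazily so this module still loads where Basemap is absent.
    import matplotlib.pyplot as plt
    from mpl_toolkits.basemap import Basemap

    fig, ax = plt.subplots(figsize=(8, 8))
    m = Basemap(projection='lcc', resolution=None,
                width=8E6, height=8E6,  # map size in meters (assumed)
                lat_0=45, lon_0=-100, ax=ax)
    m.etopo(scale=0.5, alpha=0.5)
    return mark_city(m, ax, -122.3, 47.6, 'Seattle')
```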
This gives you a brief glimpse into the sort of geographic visualizations that are possible with just a few lines of Python. We'll now discuss the features of Basemap in more depth, and provide several examples of visualizing map data. Using these brief examples as building blocks, you should be able to create nearly any map visualization that you desire.
The first thing to decide when using maps is what projection to use. You're probably familiar with the fact that it is impossible to project a spherical map, such as that of the Earth, onto a flat surface without somehow distorting it or breaking its continuity. These projections have been developed over the course of human history, and there are a lot of choices! Depending on the intended use of the map projection, there are certain map features (e.g., direction, area, distance, shape, or other considerations) that are useful to maintain.
The Basemap package implements several dozen such projections, all referenced by a short format code. Here we'll briefly demonstrate some of the more common ones.
We'll start by defining a convenience routine to draw our world map along with the longitude and latitude lines:
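A sketch of such a routine follows. It takes an existing Basemap instance `m`, so it imports nothing from Basemap itself. The name `draw_map`, the 15° and 30° grid spacing, and the styling values are illustrative choices:

```python
from itertools import chain


def draw_map(m, scale=0.2):
    """Draw a shaded-relief background plus a faint latitude/longitude grid."""
    m.shadedrelief(scale=scale)

    # A grid line every 15 degrees of latitude and 30 degrees of longitude.
    parallels = [-90 + 15 * i for i in range(13)]
    meridians = [-180 + 30 * i for i in range(13)]

    # Each call returns a dict mapping a value to a (lines, labels) tuple.
    lats = m.drawparallels(parallels)
    lons = m.drawmeridians(meridians)

    lat_lines = chain.from_iterable(lines for lines, _ in lats.values())
    lon_lines = chain.from_iterable(lines for lines, _ in lons.values())

    # Restyle every grid line so it does not overpower the background.
    for line in chain(lat_lines, lon_lines):
        line.set(linestyle='-', alpha=0.3, color='w')
```

Because the function only calls methods on `m`, the same routine can be reused for every projection discussed in this section.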
The simplest of map projections are cylindrical projections, in which lines of constant latitude and longitude are mapped to horizontal and vertical lines, respectively.
This type of mapping represents equatorial regions quite well, but results in extreme distortions near the poles.
The spacing of latitude lines varies between different cylindrical projections, leading to different conservation properties, and different distortion near the poles.
In the following figure we show an example of the equidistant cylindrical projection, which chooses a latitude scaling that preserves distances along meridians.
Other cylindrical projections are the Mercator (`projection='merc'`) and the cylindrical equal area (`projection='cea'`) projections.
The additional arguments to Basemap for this view specify the latitude (`lat`) and longitude (`lon`) of the lower-left corner (`llcrnr`) and upper-right corner (`urcrnr`) for the desired map, in units of degrees.
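To make the difference in latitude spacing concrete, here is a small pure-Python sketch of the vertical coordinate for three cylindrical projections on a unit sphere, with the equator as the standard parallel. These are the standard textbook formulas, not Basemap's implementation, and the function names are illustrative:

```python
import math


def equidistant_y(lat_deg):
    """Equidistant cylindrical: y is proportional to latitude."""
    return math.radians(lat_deg)


def mercator_y(lat_deg):
    """Mercator: spacing stretches toward the poles (diverges at 90 degrees)."""
    phi = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + phi / 2))


def equal_area_y(lat_deg):
    """Cylindrical equal-area: spacing compresses toward the poles."""
    return math.sin(math.radians(lat_deg))


# Compare where each projection places the same parallels.
for lat in (0, 30, 60, 85):
    print(lat, round(equidistant_y(lat), 3),
          round(mercator_y(lat), 3), round(equal_area_y(lat), 3))
```

In all three cases the horizontal coordinate is simply the longitude in radians; only the treatment of latitude differs, which is what changes the distortion near the poles.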
Pseudo-cylindrical projections relax the requirement that meridians (lines of constant longitude) remain vertical; this can give better properties near the poles of the projection.
The Mollweide projection (`projection='moll'`) is one common example of this, in which all meridians are elliptical arcs.
It is constructed so as to preserve area across the map: though there are distortions near the poles, the area of small patches reflects the true area.
Other pseudo-cylindrical projections are the sinusoidal (`projection='sinu'`) and Robinson (`projection='robin'`) projections.
The extra arguments to Basemap here refer to the central latitude (`lat_0`) and longitude (`lon_0`) for the desired map.
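As an illustration of how an equal-area pseudo-cylindrical projection is computed, the following sketch implements the standard forward Mollweide formulas on a unit sphere. It is not Basemap's code; the function name `mollweide` is illustrative, and the auxiliary angle is found with a simple Newton iteration:

```python
import math


def mollweide(lon_deg, lat_deg, lon_0=0.0):
    """Forward Mollweide projection on a unit sphere; returns (x, y)."""
    lam = math.radians(lon_deg - lon_0)
    phi = math.radians(lat_deg)

    # Solve 2*theta + sin(2*theta) = pi*sin(phi) for the auxiliary angle.
    theta = phi
    for _ in range(50):
        denom = 2 + 2 * math.cos(2 * theta)
        if denom < 1e-12:  # at (or extremely near) a pole: stop iterating
            break
        f = 2 * theta + math.sin(2 * theta) - math.pi * math.sin(phi)
        step = f / denom
        theta -= step
        if abs(step) < 1e-12:
            break

    # Meridians are arcs of ellipses; parallels are straight lines.
    x = 2 * math.sqrt(2) / math.pi * lam * math.cos(theta)
    y = math.sqrt(2) * math.sin(theta)
    return x, y
```

The whole map is an ellipse with semi-axes 2√2 and √2, whose area is 4π, the surface area of the unit sphere, which is consistent with the equal-area property.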
Perspective projections are constructed using a particular choice of perspective point, as if you photographed the Earth from a particular point in space (a point which, for some projections, technically lies within the Earth!).
One common example is the orthographic projection (`projection='ortho'`), which shows one side of the globe as seen from a viewer at a very long distance. As such, it can show only half the globe at a time.
Other perspective-based projections include the gnomonic projection (`projection='gnom'`) and stereographic projection (`projection='stere'`).
These are often the most useful for showing small portions of the map.
Here is an example of the orthographic projection:
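The geometry behind this view can also be sketched directly. The following is a minimal pure-Python version of the standard spherical orthographic formulas on a unit sphere; the function name `orthographic` is illustrative, and this is not Basemap's own implementation. The `visible` flag shows why only one hemisphere can appear at a time:

```python
import math


def orthographic(lon_deg, lat_deg, lon_0, lat_0):
    """Project onto a unit-sphere orthographic view centered at (lon_0, lat_0).

    Returns (x, y, visible); `visible` is False for points on the far side.
    """
    lam = math.radians(lon_deg - lon_0)
    phi = math.radians(lat_deg)
    phi0 = math.radians(lat_0)

    # Cosine of the angular distance from the center of the view.
    cos_c = (math.sin(phi0) * math.sin(phi)
             + math.cos(phi0) * math.cos(phi) * math.cos(lam))

    x = math.cos(phi) * math.sin(lam)
    y = (math.cos(phi0) * math.sin(phi)
         - math.sin(phi0) * math.cos(phi) * math.cos(lam))
    return x, y, cos_c >= 0
```

For a view centered at 50°N, 100°W, a point such as Seattle (47.6°N, 122.3°W) is visible, while the antipode of the center (50°S, 80°E) is not.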
A conic projection projects the map onto a single cone, which is then unrolled.
This can lead to very good local properties, but regions far from the focus point of the cone may become very distorted.
One example of this is the Lambert Conformal Conic projection (`projection='lcc'`), which we saw earlier in the map of North America.
It projects the map onto a cone arranged in such a way that two standard parallels (specified in Basemap by `lat_1` and `lat_2`) have well-represented distances, with scale decreasing between them and increasing outside of them.
Other useful conic projections are the equidistant conic projection (`projection='eqdc'`) and the Albers equal-area projection (`projection='aea'`).
Conic projections, like perspective projections, tend to be good choices for representing small to medium patches of the globe.
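The behavior of scale around the standard parallels can be checked numerically. The sketch below uses the standard spherical Lambert conformal conic scale-factor formula; the function name `lcc_scale` and the default standard parallels (33°N and 45°N) are illustrative assumptions, not values taken from Basemap:

```python
import math


def lcc_scale(lat_deg, lat_1=33.0, lat_2=45.0):
    """Scale factor of a spherical Lambert conformal conic projection.

    Assumes lat_1 != lat_2; the two-parallel formula is undefined otherwise.
    """
    phi, phi1, phi2 = (math.radians(v) for v in (lat_deg, lat_1, lat_2))

    def t(p):
        return math.tan(math.pi / 4 + p / 2)

    # Cone constant fixed by the two standard parallels.
    n = (math.log(math.cos(phi1) / math.cos(phi2))
         / math.log(t(phi2) / t(phi1)))

    # Equals 1 on both standard parallels by construction.
    return (math.cos(phi1) * t(phi1) ** n) / (math.cos(phi) * t(phi) ** n)


for lat in (20, 33, 39, 45, 60):
    print(lat, round(lcc_scale(lat), 4))
```

The printed values should equal 1 at the two standard parallels, dip slightly below 1 between them, and rise above 1 outside them.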
If you're going to do much with map-based visualizations, I encourage you to read up on other available projections, along with their properties, advantages, and disadvantages. Most likely, they are available in the Basemap package. If you dig deep enough into this topic, you'll find an incredible subculture of geo-viz geeks who will be ready to argue fervently in support of their favorite projection for any given application!
Earlier we saw the `bluemarble()` and `shadedrelief()` methods for projecting global images on the map, as well as the `drawparallels()` and `drawmeridians()` methods for drawing lines of constant latitude and longitude.
The Basemap package contains a range of useful functions for drawing borders of physical features like continents, oceans, lakes, and rivers, as well as political boundaries such as countries and US states and counties.
The following are some of the available drawing functions that you may wish to explore using IPython's help features:
Physical boundaries and bodies of water
drawcoastlines(): Draw continental coast lines
drawlsmask(): Draw a mask between the land and sea, for use with projecting images on one or the other
drawmapboundary(): Draw the map boundary, including the fill color for oceans.
drawrivers(): Draw rivers on the map
fillcontinents(): Fill the continents with a given color; optionally fill lakes with another color
Political boundaries
drawcountries(): Draw country boundaries
drawstates(): Draw US state boundaries
drawcounties(): Draw US county boundaries
Map features
drawgreatcircle(): Draw a great circle between two points
drawparallels(): Draw lines of constant latitude
drawmeridians(): Draw lines of constant longitude
drawmapscale(): Draw a linear scale on the map
Whole-globe images
bluemarble(): Project NASA's blue marble image onto the map
shadedrelief(): Project a shaded relief image onto the map
etopo(): Draw an etopo relief image onto the map
warpimage(): Project a user-provided image onto the map
For the boundary-based features, you must set the desired resolution when creating a Basemap image.
The `resolution` argument of the `Basemap` class sets the level of detail in boundaries, either `'c'` (crude), `'l'` (low), `'i'` (intermediate), `'h'` (high), `'f'` (full), or `None` if no boundaries will be used.
This choice is important: setting high-resolution boundaries on a global map, for example, can be very slow.
Here's an example of drawing land/sea boundaries, and the effect of the resolution parameter. We'll create both a low- and high-resolution map of Scotland's beautiful Isle of Skye. It's located at 57.3°N, 6.2°W, and a map of 90,000 × 120,000 meters (90 × 120 kilometers) shows it well:
Notice that the low-resolution coastlines are not suitable for this level of zoom, while high-resolution works just fine. The low level would work just fine for a global view, however, and would be much faster than loading the high-resolution border data for the entire globe! It might require some experimentation to find the correct resolution parameter for a given view: the best route is to start with a fast, low-resolution plot and increase the resolution as needed.
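A sketch of how such a comparison might be set up. The helper names `skye_kwargs` and `demo`, the gnomonic projection, and the fill colors are illustrative assumptions. Basemap interprets `width` and `height` in projection coordinates, which are meters. Depending on how Basemap was installed, the high-resolution boundary data may need to be installed separately:

```python
def skye_kwargs(resolution):
    """Basemap keyword arguments for a small map centered on Skye."""
    return dict(projection='gnom', lat_0=57.3, lon_0=-6.2,
                width=90000, height=120000,  # map extent in meters
                resolution=resolution)


def demo():
    """Draw low- and high-resolution coastlines side by side."""
    # Imported lazily so this module still loads where Basemap is absent.
    import matplotlib.pyplot as plt
    from mpl_toolkits.basemap import Basemap

    fig, axes = plt.subplots(1, 2, figsize=(12, 8))
    for ax, res in zip(axes, ['l', 'h']):
        m = Basemap(ax=ax, **skye_kwargs(res))
        m.fillcontinents(color='#FFDDCC', lake_color='#DDEEFF')
        m.drawmapboundary(fill_color='#DDEEFF')
        m.drawcoastlines()
        ax.set_title("resolution='{0}'".format(res))
    return fig
```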
Perhaps the most useful piece of the Basemap toolkit is the ability to over-plot a variety of data onto a map background.
For simple plotting and text, any `plt` function works on the map; you can use the `Basemap` instance to project latitude and longitude coordinates to `(x, y)` coordinates for plotting with `plt`, as we saw earlier in the Seattle example.
In addition to this, there are many map-specific functions available as methods of the `Basemap` instance.
These work very similarly to their standard Matplotlib counterparts, but have an additional Boolean argument `latlon`, which if set to `True` allows you to pass raw latitudes and longitudes to the method, rather than projected `(x, y)` coordinates.
Some of these map-specific methods are:
contour()/contourf(): Draw contour lines or filled contours
imshow(): Draw an image
pcolormesh(): Draw a pseudocolor plot for irregular/regular meshes
plot(): Draw lines and/or markers.
scatter(): Draw points with markers.
quiver(): Draw vectors.
barbs(): Draw wind barbs.
drawgreatcircle(): Draw a great circle.
We'll see some examples of a few of these as we continue. For more information on these functions, including several example plots, see the online Basemap documentation.
Recall that in Customizing Plot Legends, we demonstrated the use of size and color in a scatter plot to convey information about the location, size, and population of California cities. Here, we'll create this plot again, but using Basemap to put the data in context.
We start with loading the data, as we did before:
Next, we set up the map projection, scatter the data, and then create a colorbar and legend:
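The scatter and legend steps might be sketched as follows. The helper names `scatter_cities` and `add_size_legend` are hypothetical, as are the colormap, transparency, and legend sizes. The inputs are assumed to be array-likes of longitudes, latitudes, populations, and areas in km², and `m` is an existing Basemap instance:

```python
import math


def scatter_cities(m, lon, lat, population, area):
    """Scatter cities on map `m`: color = log10(population), size = area."""
    log_pop = [math.log10(p) for p in population]
    # latlon=True lets Basemap project the raw degrees itself.
    return m.scatter(lon, lat, latlon=True,
                     c=log_pop, s=area,
                     cmap='Reds', alpha=0.5)


def add_size_legend(ax, areas=(100, 300, 500)):
    """Add legend entries showing what each marker size means."""
    for a in areas:
        # Empty scatter calls exist only to create legend handles.
        ax.scatter([], [], c='k', alpha=0.5, s=a,
                   label='{0} km$^2$'.format(a))
    ax.legend(scatterpoints=1, frameon=False,
              labelspacing=1, loc='lower left')
```

A colorbar for the population scale can then be added with Matplotlib's usual `plt.colorbar()` call.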
This shows us roughly where larger populations of people have settled in California: they are clustered near the coast in the Los Angeles and San Francisco areas, stretched along the highways in the flat central valley, and avoiding almost completely the mountainous regions along the borders of the state.
As an example of visualizing some more continuous geographic data, let's consider the "polar vortex" that hit the eastern half of the United States in January of 2014. A great source for any sort of climatic data is NASA's Goddard Institute for Space Studies. Here we'll use the GIS 250 temperature data, which we can download using shell commands (these commands may have to be modified on Windows machines). The data used here was downloaded on 6/12/2016, and the file size is approximately 9MB:
The data comes in NetCDF format, which can be read in Python by the `netCDF4` library.
You can install this library as shown here:
$ conda install netcdf4
We read the data as follows:
The file contains many global temperature readings on a variety of dates; we need to select the index of the date we're interested in—in this case, January 15, 2014:
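NetCDF time axes typically store plain numbers relative to a reference date given in the variable's units (for example, "days since" some epoch), and the `netCDF4` library provides a `date2index` function for this kind of lookup. The standard-library sketch below only illustrates the idea; the function name `time_index`, the day-based units, and the exact-match behavior are assumptions, so check the actual units stored in the file:

```python
from bisect import bisect_left
from datetime import datetime


def time_index(day_offsets, epoch, when):
    """Return the index of `when` on a sorted 'days since epoch' time axis."""
    # Convert the requested date to the same units as the axis.
    target = (when - epoch).total_seconds() / 86400.0
    i = bisect_left(day_offsets, target)
    if i == len(day_offsets) or day_offsets[i] != target:
        raise ValueError('date not found on the time axis')
    return i
```

For an axis with entries at 0, 14, and 45 days after January 1, 2014, asking for January 15, 2014 would select the second entry.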
Now we can load the latitude and longitude data, as well as the temperature anomaly for this index:
Finally, we'll use the `pcolormesh()` method to draw a color mesh of the data.
We'll look at North America, and use a shaded relief map in the background.
Note that for this data we specifically chose a divergent colormap, which has a neutral color at zero and two contrasting colors at negative and positive values.
We'll also lightly draw the coastlines over the colors for reference:
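A sketch of this final step, written as a helper that takes an existing Basemap instance `m`. The name `draw_anomaly`, the `RdBu_r` colormap, and the ±8 color limit are illustrative assumptions; `lon` and `lat` are assumed to be 2-D coordinate grids (for example, from `numpy.meshgrid`) matching the shape of `anomaly`:

```python
def draw_anomaly(m, lon, lat, anomaly, limit=8):
    """Overlay a temperature-anomaly mesh on map `m` with symmetric limits."""
    m.shadedrelief(scale=0.5)
    # Symmetric vmin/vmax pin the colormap's neutral midpoint to zero.
    mesh = m.pcolormesh(lon, lat, anomaly, latlon=True,
                        cmap='RdBu_r', vmin=-limit, vmax=limit)
    # Light coastlines give context without hiding the colors.
    m.drawcoastlines(color='lightgray')
    return mesh
```

Keeping the color limits symmetric about zero is what makes a diverging colormap meaningful here: the neutral color then marks "no anomaly".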
The data paints a picture of the localized, extreme temperature anomalies that happened during that month. The eastern half of the United States was much colder than normal, while the western half and Alaska were much warmer. Regions with no recorded temperature show the map background.