Python Spatial Analysis and Mapping

#Python-Spatial-Analysis-and-Mapping

Agenda:

  • python spatial analysis with geopandas
  • projecting lat-long spatial data and shapefiles
  • mapping with basemaps

Quick overview of mapping and projections

#Quick-overview-of-mapping-and-projections

Some terminology:

#Some-terminology:
  • geoid: (that's gee-oid) the surface of the earth's gravity field, which approximates mean sea level
  • spheroid or ellipsoid (interchangeable terms): a model that smoothly approximates the geoid
  • datum: based on spheroid but incorporates local variations in the shape of the Earth. Used to describe a point on the Earth's surface, such as in latitude and longitude.
    • NAD83 (North American Datum 1983) uses the GRS80 spheroid
    • WGS84 (World Geodetic Survey 1984 datum) uses the WGS84 spheroid
    • The latitude and longitude coordinates of some point differ slightly based on the datum. GPS uses WGS84.
  • coordinate reference system (CRS) or spatial reference system (SRS): a series of parameters that define the coordinate system and spatial extent (aka, domain) of some dataset.
  • geographic coordinate system (GCS): specifies a datum, spheroid, units of measure (such as meters), and a prime meridian
  • projected coordinate system or map projection: projects a map of the Earth's 3-D spherical surface onto a flat surface that can be measured in units like meters. Here's a list of projections.
  • eastings and northings: the x and y coordinates of a projected map, usually measured in meters
  • false origin: the 0,0 origin point from which eastings and northings are measured on the map, usually the lower left corner rather than the center
  • PROJ.4: a library to convert/project spatial data with consistent CRS parameter names

Common CRS parameters (and their PROJ.4 names):

#Common-CRS-parameters-(and-their-PROJ.4-names):
  • datum (datum)
  • ellipse (ellps)
  • projection (proj)
    • the name of the projected coordinate system, such as Albers Equal Area (aea) or Lambert Conformal Conic (lcc)
  • standard parallels (lat_1, lat_2)
    • where the projection surface touches the globe - at the standard parallels, the projection shows no distortion
  • central meridian and latitude of origin (lon_0, lat_0)
    • the origin of the projection's x and y coordinates (eastings and northings) - usually the center of the map projection
  • false easting and false northing (x_0, y_0)
    • offsets to add to all your eastings and northings - usually used to make all the coordinates on the map positive numbers by starting 0,0 at the lower left corner rather than the center of the map (see false origin, above)

Common projection types:

#Common-projection-types:
  • equal area projections: maintain area at the expense of shape, distance, and direction - such as the Albers Equal Area projection
  • conformal projections: maintain shapes at the expense of area, distance, and direction - such as the Lambert Conformal Conic projection
  • equidistant projections: preserve distance from one point or along all meridians and parallels
  • azimuthal projections: maintain direction from one point to all other points - such as an orthographic projection
  • others compromise to minimize overall distortion or aim for aesthetic value - such as the Robinson projection

Part 1: spatial analysis

#Part-1:-spatial-analysis

We'll use geopandas, which spatializes pandas dataframes.

Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...

But these buffers are kind of weird because the data is not projected - it's all in lat-long degrees. Let's project it.

Part 2: projecting spatial data

#Part-2:-projecting-spatial-data

Let's define a projection we can use to convert and map our lat-long data. The parameters in the following dictionaries correspond to the projection parameters from PROJ4. Geopandas uses the pyproj library to convert spatial data, which in turn uses PROJ4 projection names and parameters.

You can figure out these parameter values either by approximating the lats and longs of your spatial data set, or by trial and error, or by looking up a reference like this one for UTM zone 11.

Geopandas needs your projection to be specified in a dict - you can create one manually, or use the function below to convert a PROJ4 string to a dict.

Loading output library...
Loading output library...
Loading output library...
Loading output library...

So that's our projected data and shapefile. Notice how the shape has changed, and how the units make more sense - they are in meters now. So our buffers are a 30km radius from each point.

It's also easy to save a geodataframe as a shapefile or as a geojson string (for easy leaflet mapping):

Now let's project our entire USA points data to a projection appropriate for the entire USA. We'll specify the datum, ellipsoid, projection name, standard parallels, central meridian and latitude of origin, false easting and false northing (because matplotlib basemap sticks the origin at the lower left corner), and measurement units.

Loading output library...
Loading output library...
Loading output library...
Loading output library...

Unprojected lat-long data (left) and projected data (right). The origin on the right is 0,0 like we'd expect for our false origin. Now let's make it look nice, with matplotlib basemap.

Part 3: basemaps

#Part-3:-basemaps

We'll use the matplotlib basemap toolkit

Loading output library...
Loading output library...
Loading output library...
Loading output library...

Final example: plot projected Europe data, from scratch

#Final-example:-plot-projected-Europe-data,-from-scratch

One last simple example, showing how easy it is to project and map spatial data from scratch in just a few lines of code.

Loading output library...

So what's the point of all this? Why not just use QGIS? Well if I'm just trying to make a one-off map, I'd just use QGIS. But if I were automating a workflow, I'd use Python: geopandas and basemap are fast for projecting, mapping, and spatial analysis especially when it's repetitive. But most of all, if I'm already working with pandas data, cleaning it, analyzing it, modeling it - I can create a nice map of it with just a few more lines of code.

If you're interested in more fine-grained control over plotting your basemap, you can project a shapefile and convert each piece of geometry inside it into a patch for matplotlib to plot (individually customizable). I describe this process in an old blog post I wrote: http://geoffboeing.com/2014/09/visualizing-summer-travels-part-6-projecting-spatial-data-python/