Jake Gluck, Nhien Theresa Phan
Do you have a favorite music artist or genre? Have they performed in your city? Rock and hip-hop are two very popular genres, and top music charts reflect this. However, these charts show neither where artists of a particular genre usually tour nor how much a genre is listened to in a particular city. This tutorial looks at the geographic distribution of tour locations for rock artists versus hip-hop artists. First, we demonstrate how to scrape the top artists of specific genres from last.fm. We then use data from setlist.fm to map these artists' tour locations. Finally, we make hypotheses about the data and, by plotting the tour locations with folium, analyze the geographic distribution of genres both worldwide and within cities to see whether certain areas are dominated by one genre.
You will need Python 3 and the following libraries:
folium can be installed using
First, we must retrieve the music artists whose tour dates we want to explore. last.fm is a music website where users can share their listening data and tag artists. By scraping its tag pages, we can get a list of the top artists in whatever genres we are interested in.
For this tutorial, we will be comparing rock and hip-hop. Scrape the first three pages of artist results for each genre. As each artist name is scraped, remove special characters such as ! so that we can later look the artist up through the setlist.fm API. Each page lists 22 artists, so we will have 66 rock artists and 66 hip-hop artists. We want to map 50 artists of each genre, but some artists may never have toured or may not have their tour locations recorded on setlist.fm. Retrieving 66 per genre leaves room for the missing data we might encounter when searching for these artists in the setlist.fm API.
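The scraping step could be sketched roughly as follows, using only the standard library. Several details here are assumptions rather than facts from last.fm: the URL pattern for tag pages, the CSS class artist-link (a placeholder, not the site's real markup), and the set of characters that gets stripped. Inspect a tag page in your browser and adjust all three.

```python
from html.parser import HTMLParser
from urllib.parse import quote


class ArtistNameParser(HTMLParser):
    """Collect the text of every <a> tag whose class list contains link_class."""

    def __init__(self, link_class):
        super().__init__()
        self.link_class = link_class
        self.names = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            classes = (dict(attrs).get("class") or "").split()
            self._inside = self.link_class in classes
            if self._inside:
                self.names.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.names[-1] += data


def tag_page_url(genre, page):
    """Assumed URL pattern for one page of a genre's top artists."""
    return f"https://www.last.fm/tag/{quote(genre)}/artists?page={page}"


def clean_name(name, unwanted="!"):
    """Drop unwanted characters; the default set is an assumption."""
    return "".join(ch for ch in name if ch not in unwanted).strip()


def parse_artists(html, link_class="artist-link"):
    """Return cleaned artist names found in one page of HTML."""
    parser = ArtistNameParser(link_class)
    parser.feed(html)
    return [clean_name(name) for name in parser.names if name.strip()]
```

To cover three pages per genre, download tag_page_url(genre, page) for pages 1 through 3 (for example with urllib.request.urlopen) and pass each decoded response body to parse_artists.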
In the next section, we will be searching for artists on setlist.fm. setlist.fm is a music website that records the setlists of artists' performances. Each setlist includes the location of that particular performance. Each artist on setlist.fm is identified by a unique ID from MusicBrainz, an open-source music encyclopedia. Define a function that takes an artist name obtained in the last section as a string, sends it to the MusicBrainz API, and returns the artist's MusicBrainz ID.
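One possible shape for the lookup function is sketched below, assuming the MusicBrainz web service's JSON artist search and taking the top hit as the match. The function names and the User-Agent string are placeholders for illustration; MusicBrainz asks clients to identify themselves, so replace the string with your own application name and contact address.

```python
import json
import urllib.parse
import urllib.request

MB_SEARCH = "https://musicbrainz.org/ws/2/artist/"


def search_url(artist_name):
    """Build an artist search URL that asks MusicBrainz for JSON."""
    params = {"query": f'artist:"{artist_name}"', "fmt": "json", "limit": 1}
    return MB_SEARCH + "?" + urllib.parse.urlencode(params)


def first_mbid(payload):
    """Return the ID of the top search hit, or None when nothing matched."""
    artists = payload.get("artists", [])
    return artists[0]["id"] if artists else None


def get_mbid(artist_name, user_agent="tour-map-tutorial/0.1 (you@example.com)"):
    """Look up one artist name; the User-Agent value is a placeholder."""
    request = urllib.request.Request(
        search_url(artist_name), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request) as response:
        return first_mbid(json.load(response))
```

Taking the first hit is a simplification: a name shared by several artists may resolve to the wrong one, so spot-check a few results by hand.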
Define a second function to query the setlist.fm API for artists' sets. The function takes in the number of sets we want to request and the MusicBrainz ID.
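A minimal sketch of such a function follows. It assumes version 1.0 of the setlist.fm REST API, an API key sent in the x-api-key header, 20 setlists per page, and a response in which each setlist carries a venue with a nested city. Check these field names against the current API documentation before relying on them. The api_key parameter is an addition for illustration.

```python
import json
import math
import urllib.request

SETLIST_URL = "https://api.setlist.fm/rest/1.0/artist/{mbid}/setlists?p={page}"


def pages_needed(num_sets, per_page=20):
    """Number of pages to request; 20 per page is an assumption."""
    return math.ceil(num_sets / per_page)


def extract_cities(page_json):
    """Pull 'City, Country' strings out of one page of setlists."""
    cities = []
    for setlist in page_json.get("setlist", []):
        city = setlist.get("venue", {}).get("city", {})
        name = city.get("name")
        country = city.get("country", {}).get("name")
        if name:
            cities.append(f"{name}, {country}" if country else name)
    return cities


def fetch_sets(num_sets, mbid, api_key):
    """Request enough pages to cover num_sets and return the city of each set."""
    cities = []
    for page in range(1, pages_needed(num_sets) + 1):
        request = urllib.request.Request(
            SETLIST_URL.format(mbid=mbid, page=page),
            headers={"x-api-key": api_key, "Accept": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            cities.extend(extract_cities(json.load(response)))
    return cities[:num_sets]
```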
Loop through the artist names we gathered and use the first function defined above to retrieve each MusicBrainz ID. This may take a while, because the MusicBrainz API is rate limited and throttles clients that send too many requests per second.
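The loop can be sketched like this, with a pause between requests to stay under the rate limit. The one-second default reflects the commonly cited MusicBrainz guideline of roughly one request per second and should be treated as an assumption. The lookup argument is whatever function you wrote to turn a name into an ID; collect_ids is a hypothetical helper name.

```python
import time


def collect_ids(names, lookup, pause=1.0):
    """Map each artist name to its MusicBrainz ID, pausing between requests.

    Names for which lookup returns None are left out of the result.
    """
    ids = {}
    for index, name in enumerate(names):
        if index:  # no need to wait before the very first request
            time.sleep(pause)
        mbid = lookup(name)
        if mbid is not None:
            ids[name] = mbid
    return ids
```

Passing the lookup function in as an argument also makes the loop easy to try out with a stand-in function before touching the real API.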
Now that we have the MusicBrainz ID for each artist, we can get all the pages of set information available from the setlist.fm API to create our final datasets. Create one dataframe for each genre, and only include artists with at least 50 sets on setlist.fm. This may take a while.
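The filtering rule can be illustrated with plain dictionaries, as in the sketch below. It assumes the sets have already been collected into a mapping from artist name to a list of city names. The column names and the cap of 50 artists per genre (taken from the goal stated earlier) are choices made for this example. The resulting rows can be handed to pandas.DataFrame(rows) to obtain the dataframe.

```python
def build_rows(sets_by_artist, genre, min_sets=50, max_artists=50):
    """Return one row per set for artists that have at least min_sets sets."""
    rows = []
    kept = 0
    for artist, cities in sets_by_artist.items():
        if len(cities) < min_sets:
            continue  # too little data for this artist
        if kept == max_artists:
            break  # we already have as many artists as we want to map
        kept += 1
        rows.extend({"artist": artist, "genre": genre, "city": city} for city in cities)
    return rows
```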
Save these data to .csv files, one for each genre.
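If the data are held as a list of row dictionaries, the standard csv module is enough to write them out, as in this sketch. The column names and any file names you pass in are assumptions for illustration. With a pandas dataframe, the DataFrame.to_csv method does the same job.

```python
import csv
import io

FIELDS = ["artist", "genre", "city"]  # assumed column names


def write_rows(rows, handle):
    """Write a header line followed by one CSV line per row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def rows_to_csv_text(rows):
    """Return the CSV content as a string, handy for a quick look."""
    buffer = io.StringIO()
    write_rows(rows, buffer)
    return buffer.getvalue()


def save_genre_csv(rows, path):
    """Save one genre's rows to path."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as handle:
        write_rows(rows, handle)
```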
Count the number of tour dates that have occurred in each city that appears in the data. We will use this information later to calculate percentages of rock vs. hip-hop concerts.
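The counting step can be sketched with collections.Counter, assuming each genre's tour dates are available as a list with one city name per concert. The function name and the keys of the result are illustrative.

```python
from collections import Counter


def genre_shares(rock_cities, hiphop_cities):
    """Count concerts per city and compute each genre's share in percent."""
    rock = Counter(rock_cities)
    hiphop = Counter(hiphop_cities)
    shares = {}
    for city in rock.keys() | hiphop.keys():
        total = rock[city] + hiphop[city]  # at least 1 for every city seen
        shares[city] = {
            "rock": rock[city],
            "hiphop": hiphop[city],
            "rock_pct": 100 * rock[city] / total,
            "hiphop_pct": 100 * hiphop[city] / total,
        }
    return shares
```

For example, a city with three rock concerts and one hip-hop concert gets a rock share of 75 percent and a hip-hop share of 25 percent.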
Before we can plot the visited cities on a map, we need to get the latitude and longitude from each city name using the Google Maps Geocoding API. You will need to log into your Google account and get an API key. Save this API key in a UTF-8 encoded text file. We can now use this API key to access the Google Maps Geocoding API.
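Reading the key and querying the API might look like the sketch below. It assumes the Geocoding API's JSON endpoint and its usual response shape (a status field and a results list whose first entry holds geometry.location). The function names and the key file path are placeholders.

```python
import json
import urllib.parse
import urllib.request

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def read_api_key(path):
    """Read the API key from a UTF-8 text file, dropping stray whitespace."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


def geocode_url(city, api_key):
    """Build the request URL for one city name."""
    query = urllib.parse.urlencode({"address": city, "key": api_key})
    return GEOCODE_URL + "?" + query


def parse_location(payload):
    """Return (lat, lng) from a geocoding response, or None if nothing matched."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def geocode(city, api_key):
    """Look up one city; returns (lat, lng) or None."""
    with urllib.request.urlopen(geocode_url(city, api_key)) as response:
        return parse_location(json.load(response))
```

Keep the key file out of version control, since anyone who has the key can make requests billed to your account.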
Search each city name in the data to get the latitude and longitude of each city. Add this information to the dataframe.
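Because many concerts share a city, it helps to geocode each distinct city only once. The sketch below shows one way to do that on a list of row dictionaries. The geocoder argument is assumed to be any function that maps a city name to a (lat, lng) pair or None, and dropping rows whose city cannot be located is a design choice of this example.

```python
def add_coordinates(rows, geocoder):
    """Return copies of rows with lat and lng added, one lookup per city."""
    cache = {}
    located = []
    for row in rows:
        city = row["city"]
        if city not in cache:
            cache[city] = geocoder(city)  # only the first occurrence hits the API
        if cache[city] is None:
            continue  # city could not be geocoded, so skip the row
        lat, lng = cache[city]
        located.append({**row, "lat": lat, "lng": lng})
    return located
```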
We can also add the coordinate data to our
Now that we have the latitude and longitude coordinates of our artists' tour dates, we can plot the tour locations on a map using folium, a library that adapts the leaflet.js mapping library for the Python ecosystem. We demonstrate how to install it with pip in the "Python dependencies" section of this tutorial, but detailed installation instructions can be found in the folium documentation.
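A first map along these lines could be built as sketched below, with one small circle per tour date, colored by genre. The row keys, the color choices, the map center, and the output file name are all assumptions for illustration. folium is imported inside the plotting function so that the helper above it can be tried without folium installed.

```python
GENRE_COLORS = {"rock": "red", "hip-hop": "blue"}  # arbitrary color choice


def marker_specs(rows):
    """Turn data rows into the location, color, and label of each marker."""
    return [
        {
            "location": [row["lat"], row["lng"]],
            "color": GENRE_COLORS.get(row["genre"], "gray"),
            "label": f'{row["artist"]}: {row["city"]}',
        }
        for row in rows
    ]


def build_tour_map(rows, outfile="tour_map.html"):
    """Plot every tour date as a circle marker and save the map as HTML."""
    import folium  # requires folium to be installed

    tour_map = folium.Map(location=[20, 0], zoom_start=2)  # rough world view
    for spec in marker_specs(rows):
        folium.CircleMarker(
            location=spec["location"],
            radius=3,
            color=spec["color"],
            fill=True,
            popup=spec["label"],
        ).add_to(tour_map)
    tour_map.save(outfile)
    return tour_map
```

Open the saved HTML file in a browser to pan and zoom. In a Jupyter notebook, the returned map object renders inline.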
The second map we want to create will plot one marker per city that appears in the data. Each marker can be clicked to reveal the percentage of rock concerts versus hip-hop concerts that have taken place in that city.
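This second map can be sketched with folium.Marker and a popup per city. The example assumes one dictionary per city holding its coordinates and the concert counts for each genre. The key names, the popup wording, and the output file name are illustrative.

```python
def popup_text(city, rock_count, hiphop_count):
    """Describe a city's split between rock and hip-hop concerts."""
    total = rock_count + hiphop_count  # at least 1 for any city in the data
    rock_pct = 100 * rock_count / total
    return f"{city}: {rock_pct:.0f}% rock, {100 - rock_pct:.0f}% hip-hop"


def build_city_map(cities, outfile="city_map.html"):
    """Place one clickable marker per city and save the map as HTML."""
    import folium  # requires folium to be installed

    city_map = folium.Map(location=[20, 0], zoom_start=2)  # rough world view
    for city in cities:
        folium.Marker(
            location=[city["lat"], city["lng"]],
            popup=popup_text(city["city"], city["rock"], city["hiphop"]),
        ).add_to(city_map)
    city_map.save(outfile)
    return city_map
```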