Tour Locations of Rock Artists vs. Hip-hop Artists

Jake Gluck, Nhien Theresa Phan

Introduction

Do you have a favorite music artist or genre? Have they performed in your city? Rock and hip-hop are two very popular genres, and top music charts reflect this. However, these charts don't show where artists of a particular genre usually tour, nor do they show how much a genre is listened to in a particular city. This tutorial looks into the geographic distribution of tour locations of rock artists versus hip-hop artists. First, we demonstrate how to scrape the top artists of specific genres from last.fm. We then use data from setlist.fm to find these artists' tour locations. Finally, we make hypotheses about the data and, by plotting the tour locations with folium, analyze the geographic distribution of the genres both worldwide and within cities to see whether certain areas are dominated by one genre.

Python dependencies

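You will need Python 3 and the following libraries (itertools, json, and time are part of the Python standard library; the others must be installed separately):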

  • bs4
  • folium
  • itertools
  • json
  • numpy
  • pandas
  • requests
  • time

folium can be installed using pip by running the command pip install folium.

Scraping and cleaning data from the last.fm website

First, we must retrieve the music artists whose tour dates we want to explore. last.fm is a music website where users can share their listening data and tag artists. By scraping its tag pages, we can get a list of the top artists in whatever genres we are interested in.

For this tutorial, we will be comparing rock and hip-hop. Scrape the first three pages of artist results for each genre. As each artist name is scraped, remove the special characters / and ! so that we can later look the artist up through the setlist.fm API. Each page lists 22 artists, so we will have 66 rock artists and 66 hip-hop artists. We want to map 50 artists of each genre, but some artists may never have toured or may have no tour locations recorded on setlist.fm, so retrieving 66 leaves room for the missing data we might encounter when searching for these artists in the setlist.fm API.
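
The scraping and cleaning step could be sketched roughly as follows. The URL pattern and the CSS selector are assumptions about last.fm's page layout and may need adjusting, and the function names are illustrative rather than taken from the original notebook. The session argument is any object with a requests-style get() method, such as requests.Session().

```python
import time
from urllib.parse import quote

# Assumed URL pattern for a tag's artist listing; check it against the live site.
TAG_URL = "https://www.last.fm/tag/{tag}/artists?page={page}"


def clean_artist_name(name):
    """Strip surrounding whitespace and remove the characters '/' and '!'."""
    return name.strip().replace("/", "").replace("!", "")


def tag_page_url(tag, page):
    """Build the listing URL for one page of a tag, e.g. 'hip-hop', page 2."""
    return TAG_URL.format(tag=quote(tag), page=page)


def parse_artist_names(html):
    """Extract cleaned artist names from the HTML of one listing page."""
    # Imported here so the helpers above can be used without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    # Assumption: each artist name is a link inside an element with this class.
    links = soup.select(".big-artist-list-title a")
    return [clean_artist_name(link.get_text()) for link in links]


def scrape_top_artists(tag, session, pages=3, delay=1.0):
    """Collect artist names from the first `pages` listing pages of a tag."""
    names = []
    for page in range(1, pages + 1):
        response = session.get(tag_page_url(tag, page))
        response.raise_for_status()
        names.extend(parse_artist_names(response.text))
        time.sleep(delay)  # pause between page requests
    return names
```

With the requests library, a call such as scrape_top_artists("rock", requests.Session()) would return about 66 names if each page lists 22 artists.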

Define API request functions

In the next section, we will be searching for artists on setlist.fm, a music website that records the setlists of artists' performances. Each setlist includes the location of that performance. Each artist on setlist.fm is identified by a unique ID from MusicBrainz, an open-source music encyclopedia. Define a function that sends an artist name obtained in the last section, as a string, to the MusicBrainz API and returns the matching MusicBrainz ID.
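
A minimal sketch of such a lookup function is shown here. It uses the MusicBrainz artist search endpoint and simply takes the first search result as the match, which is an assumption and may pick the wrong artist for ambiguous names. The function names and the User-Agent value are placeholders, and session is any object with a requests-style get() method.

```python
MUSICBRAINZ_URL = "https://musicbrainz.org/ws/2/artist/"


def extract_mbid(search_result):
    """Return the ID of the first artist in a search result, or None."""
    artists = search_result.get("artists", [])
    return artists[0]["id"] if artists else None


def get_mbid(artist_name, session):
    """Search MusicBrainz for an artist name and return the first match's ID."""
    response = session.get(
        MUSICBRAINZ_URL,
        params={"query": f'artist:"{artist_name}"', "fmt": "json", "limit": 1},
        # MusicBrainz asks clients to identify themselves; this is a placeholder.
        headers={"User-Agent": "tour-locations-tutorial/0.1 (you@example.com)"},
    )
    response.raise_for_status()
    return extract_mbid(response.json())
```

Returning None when nothing matches lets the caller skip artists that MusicBrainz does not know.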

Define a second function to query the setlist.fm API for artists' sets. The function takes in the number of sets we want to request and the MusicBrainz ID.
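
One way this second function might look is sketched below. The page size of 20, the JSON field names, and the handling of a 404 response are assumptions about the setlist.fm API that should be checked against its documentation. The extra session and api_key parameters are illustrative: session is any object with a requests-style get() method.

```python
import math

SETLISTS_URL = "https://api.setlist.fm/rest/1.0/artist/{mbid}/setlists"
SETS_PER_PAGE = 20  # assumption: the API returns 20 setlists per page


def pages_needed(num_sets, per_page=SETS_PER_PAGE):
    """Number of result pages required to cover `num_sets` setlists."""
    return math.ceil(num_sets / per_page)


def extract_locations(page):
    """Pull the city and country out of each setlist on one result page."""
    rows = []
    for setlist in page.get("setlist", []):
        city = setlist.get("venue", {}).get("city", {})
        rows.append({
            "city": city.get("name"),
            "country": city.get("country", {}).get("name"),
        })
    return rows


def get_sets(num_sets, mbid, session, api_key):
    """Request up to `num_sets` setlist locations for one MusicBrainz ID."""
    rows = []
    for page_number in range(1, pages_needed(num_sets) + 1):
        response = session.get(
            SETLISTS_URL.format(mbid=mbid),
            params={"p": page_number},
            headers={"x-api-key": api_key, "Accept": "application/json"},
        )
        if response.status_code == 404:
            break  # assumption: 404 means no (more) setlists for this artist
        response.raise_for_status()
        rows.extend(extract_locations(response.json()))
    return rows[:num_sets]
```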

Get MusicBrainz IDs for last.fm artists

Loop through the artist names we gathered and use the first function defined above to retrieve each MusicBrainz ID. This may take a while, as the MusicBrainz API is rate limited and throttles clients that send too many requests per second.
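
The loop might be sketched as follows, assuming a pause of about one second between requests is enough to stay under the rate limit. The lookup argument stands in for the ID lookup function described earlier (any callable that takes a name and returns an ID or None), and the sleep parameter exists only so the pause can be replaced in tests.

```python
import time


def lookup_all(names, lookup, delay=1.0, sleep=time.sleep):
    """Map each artist name to its ID, pausing between calls.

    Names for which `lookup` returns None are left out of the result.
    """
    ids = {}
    for name in names:
        mbid = lookup(name)
        if mbid is not None:
            ids[name] = mbid
        sleep(delay)  # stay under the API's rate limit
    return ids
```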

Getting tour location data

Now that we have the MusicBrainz ID for each artist, we can get all the pages of set information available from the setlist.fm API to create our final datasets. Create one dataframe for each genre, and only include artists with at least 50 sets on setlist.fm. This may take a while.

Save these data to .csv files, one for each genre.
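
As a rough sketch, the filtering and saving could look like this, assuming the sets for each artist have already been collected into a dictionary that maps an artist name to a list of row dictionaries. The function name, the column names, and the file name in the usage note are illustrative.

```python
import pandas as pd


def build_genre_frame(sets_by_artist, min_sets=50):
    """Combine per-artist rows into one DataFrame for a genre.

    Artists with fewer than `min_sets` recorded sets are dropped.
    """
    frames = []
    for artist, rows in sets_by_artist.items():
        if len(rows) < min_sets:
            continue  # too few recorded sets for this artist
        frame = pd.DataFrame(rows)
        frame.insert(0, "artist", artist)  # label every row with the artist
        frames.append(frame)
    if not frames:
        return pd.DataFrame(columns=["artist"])
    return pd.concat(frames, ignore_index=True)
```

A genre's data could then be saved with, for example, build_genre_frame(rock_sets).to_csv("rock.csv", index=False).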

Determine top cities

Count the number of tour dates that have occurred in each city that appears in the data. We will use this information later to calculate percentages of rock vs. hip-hop concerts.
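
A minimal pandas sketch of this count is given below. It assumes each genre dataframe has one row per tour date with city and country columns; those column names and the function name are assumptions, not taken from the original notebook.

```python
import pandas as pd


def count_concerts_by_city(rock_df, hiphop_df):
    """Count rock and hip-hop tour dates per city, sorted by total."""
    combined = pd.concat(
        [rock_df.assign(genre="rock"), hiphop_df.assign(genre="hiphop")],
        ignore_index=True,
    )
    counts = (
        combined.groupby(["city", "country", "genre"])
        .size()
        .unstack("genre", fill_value=0)
        # Make sure both genre columns exist even if one genre has no rows.
        .reindex(columns=["rock", "hiphop"], fill_value=0)
    )
    counts["total"] = counts["rock"] + counts["hiphop"]
    return counts.sort_values("total", ascending=False).reset_index()
```

Grouping on both city and country keeps cities that share a name in different countries apart.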

Getting latitude and longitude using Google Maps Geocoding API

Before we can plot the visited cities on a map, we need to get the latitude and longitude for each city name using the Google Maps Geocoding API. You will need to log in to your Google account and get an API key. Save this API key in a UTF-8 encoded text file. We can now use this API key to access the Google Maps Geocoding API.

Search each city name in the data to get the latitude and longitude of each city. Add this information to the dataframe.
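
The geocoding step could be sketched as follows. The sketch takes the first geocoding result as the match, which is an assumption that can misplace ambiguous city names. The function names are illustrative, and session is any object with a requests-style get() method.

```python
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def read_api_key(path):
    """Read the API key from a UTF-8 encoded text file."""
    with open(path, encoding="utf-8") as key_file:
        return key_file.read().strip()


def extract_coordinates(geocode_result):
    """Return (latitude, longitude) of the first result, or None."""
    if geocode_result.get("status") != "OK" or not geocode_result.get("results"):
        return None
    location = geocode_result["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def geocode_city(city, country, session, api_key):
    """Look up the coordinates of a city through the Geocoding API."""
    response = session.get(
        GEOCODE_URL,
        params={"address": f"{city}, {country}", "key": api_key},
    )
    response.raise_for_status()
    return extract_coordinates(response.json())
```

Geocoding each distinct city once and reusing the result for every tour date in that city keeps the number of API requests down.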

We can also add the coordinate data to our top_cities dataframe.

Mapping artists’ tour locations with folium

Now that we have the latitude and longitude coordinates of our artists' tour dates, we can plot the tour locations on a map using folium, a library that wraps the Leaflet.js mapping library for use from Python. We show how to install folium with pip in the "Python dependencies" section of this tutorial; detailed installation instructions can be found in the folium documentation.
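
A small sketch of plotting tour locations with folium follows. The row format (plain dictionaries with lat, lng, and genre keys), the colors, and the output file name are assumptions made for illustration.

```python
GENRE_COLORS = {"rock": "red", "hiphop": "blue"}  # illustrative colors


def marker_specs(rows):
    """Turn location rows into (lat, lng, color) tuples.

    Rows whose coordinates are missing (None) are skipped.
    """
    specs = []
    for row in rows:
        if row.get("lat") is None or row.get("lng") is None:
            continue
        specs.append((row["lat"], row["lng"], GENRE_COLORS[row["genre"]]))
    return specs


def build_tour_map(rows, output_path="tour_locations.html"):
    """Draw one small circle per tour location and save the map as HTML."""
    # Imported here so marker_specs can be used without folium installed.
    import folium

    tour_map = folium.Map(location=[20, 0], zoom_start=2)  # world view
    for lat, lng, color in marker_specs(rows):
        folium.CircleMarker(
            location=[lat, lng], radius=3, color=color, fill=True
        ).add_to(tour_map)
    tour_map.save(output_path)
    return tour_map
```

In a notebook, the returned map object can be displayed directly; elsewhere, the saved HTML file can be opened in a browser.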

Mapping individual cities

The second map plots one marker per city that appears in the data. Each marker can be clicked to reveal the percentage of rock concerts vs. hip-hop concerts that have occurred in that city.
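
The per-city markers might be built along these lines. The popup wording and the dictionary keys (city, lat, lng, rock, hiphop) are assumptions made for illustration.

```python
def genre_percentages(rock_count, hiphop_count):
    """Return (rock %, hip-hop %) of the concerts in one city."""
    total = rock_count + hiphop_count
    if total == 0:
        return 0.0, 0.0
    return 100 * rock_count / total, 100 * hiphop_count / total


def popup_text(city, rock_count, hiphop_count):
    """Text shown when a city's marker is clicked."""
    rock_pct, hiphop_pct = genre_percentages(rock_count, hiphop_count)
    return f"{city}: {rock_pct:.1f}% rock, {hiphop_pct:.1f}% hip-hop"


def add_city_markers(city_map, cities):
    """Add one clickable marker per city to an existing folium map."""
    # Imported here so the helpers above can be used without folium installed.
    import folium

    for city in cities:
        folium.Marker(
            location=[city["lat"], city["lng"]],
            popup=popup_text(city["city"], city["rock"], city["hiphop"]),
        ).add_to(city_map)
    return city_map
```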

Analysis

First, let's look at the raw numbers

Hip-hop vs. Rock Cities

We want to find which cities have more rock concerts and which have more hip-hop concerts. This map shows the top 200 cities in the world by total concerts in our data and marks each one as majority rock or majority hip-hop. Almost every city is majority rock; the only two exceptions are Brooklyn, New York and Warsaw, Poland.
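
The majority classification can be sketched as follows, assuming per-city counts are available as dictionaries with city, rock, and hiphop keys. The function names are illustrative, and the numbers in the usage note are made up rather than taken from our data.

```python
def majority_genre(rock_count, hiphop_count):
    """Label a city by whichever genre has more concerts."""
    if hiphop_count > rock_count:
        return "hiphop"
    if rock_count > hiphop_count:
        return "rock"
    return "tie"


def top_cities_by_majority(cities, limit=200):
    """Classify the `limit` cities with the most concerts overall."""
    ranked = sorted(
        cities, key=lambda c: c["rock"] + c["hiphop"], reverse=True
    )[:limit]
    return {c["city"]: majority_genre(c["rock"], c["hiphop"]) for c in ranked}
```

For example, a city with 5 rock concerts and 1 hip-hop concert is labeled "rock", and one with 1 rock concert and 2 hip-hop concerts is labeled "hiphop".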

Hip-hop vs. Rock Cities Gradient

This map contains the same information but colors each city on a gradient according to its ratio of hip-hop to rock concerts: 100% rock is pure white and 100% hip-hop is pure black. The results show that most of our top 200 cities lean heavily toward rock. However, a significant subset of cities is almost evenly split while still slightly favoring rock. These cities include Gold Coast, Australia; Miami, Florida; Georgetown, South Africa; and Indio, California.
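
One way to compute such a gradient color is sketched below, assuming a linear grayscale between white and black based on each city's share of hip-hop concerts. The function name is illustrative, and the original notebook may have computed the color differently.

```python
def gradient_color(rock_count, hiphop_count):
    """Map a city's genre mix to a gray hex color.

    All rock gives white (#ffffff); all hip-hop gives black (#000000).
    """
    total = rock_count + hiphop_count
    if total == 0:
        raise ValueError("city has no concerts")
    hiphop_share = hiphop_count / total
    level = round(255 * (1 - hiphop_share))  # 255 = white, 0 = black
    return f"#{level:02x}{level:02x}{level:02x}"
```

The resulting string can be passed as the color of a folium marker, so an evenly split city appears as a mid gray.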