Artist WordClouds


Skills Shown:


Web Scraping
Data Visualization
Natural Language Processing

As an avid music listener, I have always had a deep appreciation for great lyrics. In my opinion, an artist's lyrics can act like a mirror into their soul. Websites like Genius do a fantastic job of providing lyrics for a large number of artists. However, at this point, the website does not offer a way for users to summarize their favorite artist's lyrics.


Understanding this problem, I set out to find a way to condense an artist's lyrics. After much thought, I figured the best way to do this would be to create a word cloud. For those who are unaware, a word cloud "is a novelty visual representation of text data, typically used to depict keyword metadata (tags) on websites, or to visualize free form text. Tags are usually single words, and the importance of each tag is shown with font size or color." (Wikipedia)


Part 1


Initializing the Genius API. Genius offers a simple API for accessing artist and lyric data from its website (more information can be found in the Genius API documentation), so it made the most sense to get the lyrics from Genius.
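
As a rough illustration of what initializing the API involves, the sketch below builds an authenticated search request using only the standard library. It assumes you already have a client access token from Genius. The helper name `build_search_request` is hypothetical and is not taken from the original notebook. The request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request

GENIUS_API_BASE = "https://api.genius.com"


def build_search_request(query, access_token):
    """Build (but do not send) an authenticated Genius search request.

    `build_search_request` is a hypothetical helper for illustration only.
    """
    url = GENIUS_API_BASE + "/search?" + urllib.parse.urlencode({"q": query})
    # The access token is passed as an OAuth2 bearer token in the header.
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + access_token}
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the actual search; keeping construction separate makes it easy to check the URL and header before spending any API calls.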


Part 2


Scraping Song Titles


One of the problems with Genius is that the website has too much information. For many artists, there are entries for their appearances on guest shows (i.e. not their music). For this project, that information is not useful. So before I use the API, I have to scrape an artist's song titles from the web. I chose to scrape from two websites because, for some artists, one site lists more songs than the other. To be as thorough as possible, I want the maximum number of lyrics from an individual artist.
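
The scrape-and-merge idea can be sketched as follows. The markup is an assumption: the parser looks for `<a class="song-title">` elements, which is a made-up convention, so the tag and class would need to be adapted to whatever the real pages use. `SongTitleParser`, `extract_titles`, and `merge_titles` are hypothetical names.

```python
from html.parser import HTMLParser


class SongTitleParser(HTMLParser):
    """Collect the text of every <a class="song-title"> (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and "song-title" in classes:
            self._in_title = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_title:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_title:
            title = "".join(self._buffer).strip()
            if title:
                self.titles.append(title)
            self._in_title = False


def extract_titles(html):
    """Return the song titles found in one page of HTML."""
    parser = SongTitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


def merge_titles(*title_lists):
    """Merge titles from several sites, dropping case-insensitive duplicates."""
    seen = set()
    merged = []
    for titles in title_lists:
        for title in titles:
            key = title.casefold()
            if key not in seen:
                seen.add(key)
                merged.append(title)
    return merged
```

Merging on a case-folded key keeps the first spelling seen, so the site passed first decides how a duplicated title is written.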


Part 3


Getting the Lyrics. Once I had captured the song titles, I needed to capture the lyrics for these songs. Using the Genius API, I searched for every song individually. Because I may have upwards of 100 songs to search for, this takes a good chunk of time to run.
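
A minimal sketch of the per-song loop is shown below. It does not call Genius directly: `fetch_lyrics` is a caller-supplied function (hypothetical here) that takes a title and returns the lyrics or `None`. The pause between requests is my own assumption about polite API usage, and it is also part of why a run over a hundred songs takes a while.

```python
import time


def collect_lyrics(titles, fetch_lyrics, pause_seconds=1.0):
    """Look up each title in turn and return {title: lyrics} for the hits.

    `fetch_lyrics` is a hypothetical callable: title -> lyrics string or None.
    """
    lyrics_by_title = {}
    for index, title in enumerate(titles):
        if index > 0:
            # Wait between lookups so the API is not hit in a tight loop.
            time.sleep(pause_seconds)
        try:
            lyrics = fetch_lyrics(title)
        except Exception as error:
            # One failed lookup should not abort the whole run.
            print(f"Skipping {title!r}: {error}")
            continue
        if lyrics:
            lyrics_by_title[title] = lyrics
    return lyrics_by_title
```

Songs that cannot be found, or whose lookup raises an error, are simply left out of the result, so the later steps only see titles that actually have lyrics.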


Part 4


Cleaning the Data


OK, almost done. The most time-consuming task is now out of the way. However, I still need to process the language. The lyrics sourced from Genius contain punctuation, full sentences, and capital letters. To get the most accurate results, I must break these sentences down into individual words and make sure all words are lowercase.
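
The cleaning step described above might look roughly like this. It is a sketch using only the `re` module, not necessarily the approach the original notebook took. Dropping bracketed section labels such as `[Chorus]` is an extra assumption about how the lyrics are formatted, and `clean_lyrics` is a hypothetical name.

```python
import re

# Runs of letters, optionally joined by apostrophes (so "don't" stays whole).
WORD_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)*")


def clean_lyrics(lyrics):
    """Lowercase the lyrics, strip punctuation, and split into words."""
    # Lowercase and normalize curly apostrophes to straight ones.
    lowered = lyrics.lower().replace("\u2019", "'")
    # Assumption: bracketed labels like [Chorus] are not part of the lyrics.
    without_labels = re.sub(r"\[[^\]]*\]", " ", lowered)
    return WORD_PATTERN.findall(without_labels)
```

For example, `clean_lyrics("[Chorus]\nHello, World! Don't stop.")` yields the words `hello`, `world`, `don't`, and `stop`.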


Final Part


Creating the WordCloud


Here comes the fun part: outputting the WordCloud. Using the Python WordCloud library and the cleaned-up lyrics, I am able to create the word cloud.
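
One way to go from the cleaned words to an image is sketched below. It counts the words with `collections.Counter` and hands the counts to the `wordcloud` package, which must be installed separately. The image size, the background color, the optional stopword filter, and the function names are illustrative assumptions rather than the settings used for the original output.

```python
from collections import Counter


def word_frequencies(words, stopwords=frozenset()):
    """Count how often each word appears, skipping any given stopwords."""
    return Counter(word for word in words if word not in stopwords)


def save_word_cloud(frequencies, output_path):
    """Render the frequencies as a word cloud image and write it to disk."""
    # Imported here so the counting helper works without the package installed.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    # Larger counts produce larger words in the image.
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(output_path)
```

Calling `save_word_cloud(word_frequencies(words), "artist.png")` would write the image to a file; the frequencies must contain at least one word for the cloud to be generated.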
