Answering Business Questions using SQL

#Answering-Business-Questions-using-SQL

In this project, we're going to use SQL to answer business questions. The goal is:

  • Write a SQL query to extract the relevant data
  • Create plots where necessary to visualize the data
  • Write a short paragraph, drawing concolusions and explaining the data and/or visualizations

We are using the Chinook database. The Chinook database is provided as a SQLite database file called chinook.db. Its schema is below.

alt text

The database retains 'state', so if we run a qeury with a CREATE or DROP twice, the query will fail.

  • We'll create some helper functions in python to save some time.
  • We'll use a context manager to handle the connection to the SQLite database.
1
2
3
4
with sqlite3.connect('chinook.db') as conn:
    # we can now access `conn`
    # inside an indented block
    query_result = pd.read_sql(my_query, conn)

With a few extra lines, we can turn this code into a function:

1
2
3
with sqlite3.connect('chinook.db') as conn:
    conn.isolation_level = None
    conn.execute(c)

The conn.isolation_level = None line above tells SQLite to autocommit any changes.

Loading output library...

Selecting Albums to Purchase

#Selecting-Albums-to-Purchase

The Chinook record store has just signed a deal with a new record label, and you've been tasked with selecting the first three albums that will be added to the store, from a list of four. All four albums are by artists that don't have any tracks in the store right now - we have the artist names, and the genre of music they produce:

Artist NameGenre
RegalHip-Hop
Red TonePunk
Meteor and the GirlsPop
Slim Jim BitesBlues

The record label specializes in artists from the USA, and they have given Chinook some money to advertise the new albums in the USA, so we're interested in finding out which genres sell the best in the USA.

Numbe of tracks sold in The USA by genres

#Numbe-of-tracks-sold-in-The-USA-by-genres
Loading output library...
Loading output library...
Loading output library...
Loading output library...

Because there's no invoice line of which quantity is not eqault to 1, we can use count instead of sum.

Loading output library...
Loading output library...
Loading output library...

By far, Rock is the most popular genre in the USA. But, none of the artist that we are considering is in Rock. Let's narrow our fields to those genres.

Artist NameGenre
RegalHip-Hop
Red TonePunk
Meteor and the GirlsPop
Slim Jim BitesBlues
Loading output library...
Loading output library...

Among those 4 genres, Punk, Blues, Pop are rather popular ones.

Aanlysis on Sales Support Agent

#Aanlysis-on-Sales-Support-Agent

Let's see which sales support agents bring the most and why.

Loading output library...

This seems weirdly small number of employee. However, there is indeed only 3 agents. Jane bring the most amount of dollars. Let's dive deeper.

Loading output library...
Loading output library...

Jane has the most customers, however not by much. Let's see if Jane's customers are particularly more profitable.

Loading output library...

Again Jane has the most profitable customers, then Margaret, and then Steve. This seems to affect the total sales per agent.

Loading output library...

Among top 10 customers, 6 of them were assigned to Jane. This probably contributes to the fact that Jane performs better than other two. However, it's not conclusive that Jane is performing particularly better because we aren't sure what the assignment process is like. It could be just randomly assigned, which means it could be just sheere luck that she performed better than the other two.

Customer Analysis

#Customer-Analysis

The next task is to analyze the sales data for customers from each different country.

  • total number of customers
  • total value of sales
  • average value of sales per customer
  • average order value
Loading output library...
Loading output library...
Loading output library...
Loading output library...

Average Value of Sales per Customer

#Average-Value-of-Sales-per-Customer
Loading output library...
Loading output library...

Average Order Value

#Average-Order-Value
Loading output library...
Loading output library...

visualization

#visualization

To simplify the analysis, we are going to merge all the datasets by country.

Loading output library...
Loading output library...

From the result above, Czech Republic stands out on average sales per customer and order value. Which leads us to believe that even though the number of customer in Czech Republic is small, they have good buying power. With the right promotion we will likely be able to fulfill their needs more.

For the similar reason, even though the tendency is smaller, India might be a good place to implement the similar strategy for the similar reason.

Albums vs Individual Track Purchase

#Albums-vs-Individual-Track-Purchase

The Chinook store is setup in a way that allows customer to make purchases in one of the two ways:

  • purchase a whole album
  • purchase a collection of one or more individual tracks.

The store does not let customers purchase a whole album, and then add individual tracks to that same purchase (unless they do that by choosing each track manually). When customers purchase albums they are charged the same price as if they had purchased each of those tracks separately.

Album purchases still account for 18.6% of purchases. It is smaller than track purchases, but it is considerable portion. I would recommend against purchasing only tracks.