I listen to the Roots

I love music and data science

As I write this post, I am listening to Lazy Afternoon by The Roots and reminiscing about my college years. Those were the days! On a different note, I would like to analyze how The Roots' music changed between one of their first albums (Do You Want More?!!!??! - 1995) and one of their most recent albums (Undun - 2011).

Did Jimmy Fallon change one of my favorite hip-hop groups?


They joined Jimmy Fallon’s show in 2009.

Let’s load the appropriate libraries

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.6     ✓ dplyr   1.0.8
## ✓ tidyr   1.2.0     ✓ stringr 1.4.0
## ✓ readr   2.1.2     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(spotifyr)
library(scales)
## 
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
## 
##     discard
## The following object is masked from 'package:readr':
## 
##     col_factor
access_token <- get_spotify_access_token()

Step 1: Extract the Data

the_roots_df <- get_artist_audio_features('the roots') 

Step 2a: Filter the data down to the two albums

mod1_the_roots_df <- the_roots_df %>% 
  filter(album_name %in% c('Do You Want More?!!!??!','Undun'))

Step 2b: Keep a single version of each album

One issue with Spotify is that it can hold multiple versions of the same album, so filtering by album name alone can return duplicate tracks. To avoid this, I found the most recent album ID for each of the two albums and filtered on those IDs instead.
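One way to pick out the most recent version of each album is sketched here. It runs on a small made-up data frame rather than the real Spotify output. The album names, IDs, and dates are placeholders, and it assumes that "most recent" means the latest album_release_date and that the dates are full year-month-day strings (real data can contain year-only dates, which would need extra handling).

```r
library(dplyr)

# Toy stand-in for the output of get_artist_audio_features().
# All names, IDs, and dates are hypothetical.
toy_tracks <- tibble(
  album_name         = c("Album A", "Album A", "Album A", "Album B", "Album B"),
  album_id           = c("id_a_old", "id_a_new", "id_a_new", "id_b_1", "id_b_1"),
  album_release_date = c("1995-01-17", "2015-06-01", "2015-06-01",
                         "2011-12-02", "2011-12-02")
)

latest_album_ids <- toy_tracks %>%
  # One row per album version instead of one row per track
  distinct(album_name, album_id, album_release_date) %>%
  # Assumes complete YYYY-MM-DD dates
  mutate(album_release_date = as.Date(album_release_date)) %>%
  group_by(album_name) %>%
  # Keep the version with the latest release date
  slice_max(album_release_date, n = 1, with_ties = FALSE) %>%
  ungroup()

latest_album_ids
```

The album_id values that come out of a step like this are what would go into the filter on album_id.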

mod2_the_roots_df <- the_roots_df %>% 
  filter(album_id %in% c('14dfGE6B5TLYdrelQ7AOsa','3N0wHnD5Rd8jnTUvNqOXGz')) %>% 
  mutate(m1_valence = 2*valence - 1)
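
The m1_valence column rescales valence, which Spotify reports on a 0 to 1 scale, onto a -1 to 1 scale. A valence of 0.5 then lands at 0, so the plot can be split into a negative and a positive half. A quick check on a few illustrative values (not real track data):

```r
# Illustrative valence values spanning Spotify's 0-1 range
valence <- c(0, 0.25, 0.5, 0.75, 1)

# Same transformation as in the mutate() call
m1_valence <- 2 * valence - 1

m1_valence
# -1.0 -0.5 0.0 0.5 1.0
```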

Step 3a: Visualize the data

We want to see if there is a difference between the albums in terms of:

  1. Valence vs. energy
# Two-row placeholder data frame so ggplot has something to set up the axes with
df <- data.frame(x = c(0,1), y = c(0,1))
# Base plot: the valence/energy plane split into four mood quadrants
p1 <- df %>% 
  ggplot(aes(x = x, y = y)) +
  geom_blank() +
  # Quadrant boundaries: rescaled valence = 0 and energy = 0.5
  geom_vline(xintercept = 0,size = 1) +
  geom_hline(yintercept = 0.5,size = 1) +
  scale_x_continuous(limits = c(-1,1), 
                     expand = c(0, 0),
                     labels = label_number(accuracy = 0.1)) +
  scale_y_continuous(limits = c(0,1), expand = c(0, 0))  +
  theme_bw() +
  theme(axis.title = element_text(size = 18, face = 'bold')) +
  # Shade the negative-valence half red and the positive-valence half blue
  geom_rect(aes(xmin=-1, xmax=0, ymin=0, ymax=1),alpha = 0.15, fill = 'red') +
  geom_rect(aes(xmin=0, xmax=1, ymin=0, ymax=1),alpha = 0.15, fill = 'blue') +
  labs(x = 'Valence',y='Energy') +
  theme(axis.text = element_text(size = 12, face = 'bold'),
        plot.margin = margin(0.3, 0.5, 0.1, 0.5, "cm")
  ) +
  # Label each quadrant with its mood
  annotate("text", x=-0.5, y=0.25, label= "Sad",size = 15.5, color = 'white') +
  annotate("text", x=0.55, y=0.25, label= "Chill",size = 15.5, color = 'white') +
  annotate("text", x=-0.5, y=0.75, label= "Anger",size = 15.5, color = 'white') +
  annotate("text", x=0.55, y=0.75, label= "Happy",size = 15.5, color = 'white')

p1
p1 +
  geom_point(data = mod2_the_roots_df,aes(x = m1_valence, y = energy, color = album_name)) +
    scale_color_manual(values=c('#999999','#E69F00'))
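
To put numbers next to the picture, one option is to count how many tracks from each album fall in each quadrant. The sketch below is only an illustration on made-up values. label_quadrant() is a hypothetical helper, not part of spotifyr, and the cutoffs (rescaled valence of 0, energy of 0.5) are taken from the lines drawn on the plot.

```r
library(dplyr)

# Hypothetical helper: name the plot quadrant a track falls in.
# Points exactly on a boundary go to the right/upper quadrant.
label_quadrant <- function(m1_valence, energy) {
  case_when(
    m1_valence <  0 & energy <  0.5 ~ "Sad",
    m1_valence >= 0 & energy <  0.5 ~ "Chill",
    m1_valence <  0 & energy >= 0.5 ~ "Anger",
    m1_valence >= 0 & energy >= 0.5 ~ "Happy"
  )
}

# Made-up tracks standing in for mod2_the_roots_df
toy_tracks <- tibble(
  album_name = c("Album A", "Album A", "Album B", "Album B"),
  valence    = c(0.8, 0.3, 0.2, 0.6),
  energy     = c(0.7, 0.9, 0.3, 0.4)
)

quadrant_counts <- toy_tracks %>%
  mutate(m1_valence = 2 * valence - 1,
         quadrant   = label_quadrant(m1_valence, energy)) %>%
  # Number of tracks per album and quadrant
  count(album_name, quadrant)

quadrant_counts
```

With the real data, the same mutate() and count() steps could be applied to mod2_the_roots_df, which already has m1_valence, energy, and album_name.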
Immanuel Williams PhD
Lecturer of Statistics
