This project explores whether movies on Netflix are getting shorter, using exploratory data analysis to uncover trends in the entertainment industry.
The objective is to verify the trend of decreasing movie durations with exploratory data analysis techniques and to identify factors that could be contributing to changes in movie lengths.
The analysis confirmed a noticeable trend in the shortening of movie lengths on Netflix. Several factors influencing this trend were identified, including genre shifts, production constraints, and changing viewer preferences.
This project not only supported the initial hypothesis but also sharpened my analytical skills and provided insights into the dynamics of content duration within major streaming platforms. The findings have implications for content creators and marketers in the entertainment industry.
The dataset used for this analysis contains the following columns:
Column | Description |
---|---|
show_id | The ID of the show |
type | Type of show |
title | Title of the show |
director | Director of the show |
cast | Cast of the show |
country | Country of origin |
date_added | Date added to Netflix |
release_year | Year the title was released |
duration | Duration of the show in minutes |
description | Description of the show |
genre | Show genre |
import pandas as pd
# Load the CSV file and store as netflix_df
netflix_df = pd.read_csv("netflix_data.csv")
# Display the first few rows of the dataframe
print(netflix_df.head())
show_id | type | title | director | cast | country | date_added | release_year | duration | description | genre |
---|---|---|---|---|---|---|---|---|---|---|
s1 | TV Show | 3% | null | João Miguel, Bianca Comparato, Michel Gomes, Rodolfo Valente, Vaneza Oliveira, Rafael Lozano, Viviane Porto, Mel Fronckowiak, Sergio Mamberti, Zezé Motta, Celso Frateschi | Brazil | August 14, 2020 | 2020 | 4 | In a future where the elite inhabit an island paradise far from the crowded slums, you get one chance to join the 3% saved from squalor. | International TV |
s2 | Movie | 7:19 | Jorge Michel Grau | Demián Bichir, Héctor Bonilla, Oscar Serrano, Azalia Ortiz, Octavio Michel, Carmen Beato | Mexico | December 23, 2016 | 2016 | 93 | After a devastating earthquake hits Mexico City, trapped survivors from all walks of life wait to be rescued while trying desperately to stay alive. | Dramas |
s3 | Movie | 23:59 | Gilbert Chan | Tedd Chan, Stella Chung, Henley Hii, Lawrence Koh, Tommy Kuan, Josh Lai, Mark Lee, Susan Leong, Benjamin Lim | Singapore | December 20, 2018 | 2011 | 78 | When an army recruit is found dead, his fellow soldiers are forced to confront a terrifying secret that's haunting their jungle island training camp. | Horror Movies |
s4 | Movie | 9 | Shane Acker | Elijah Wood, John C. Reilly, Jennifer Connelly, Christopher Plummer, Crispin Glover, Martin Landau, Fred Tatasciore, Alan Oppenheimer, Tom Kane | United States | November 16, 2017 | 2009 | 80 | In a postapocalyptic world, rag-doll robots hide in fear from dangerous machines out to exterminate them, until a brave newcomer joins the group. | Action |
s5 | Movie | 21 | Robert Luketic | Jim Sturgess, Kevin Spacey, Kate Bosworth, Aaron Yoo, Liza Lapira, Jacob Pitts, Laurence Fishburne, Jack McGee, Josh Gad, Sam Golzari, Helen Carey, Jack Gilpin | United States | January 1, 2020 | 2008 | 123 | A brilliant group of students become card-counting experts with the intent of swindling millions out of Las Vegas casinos by playing blackjack. | Dramas |
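Before filtering, it can help to check how many titles of each type there are and where values are missing. The preview above already shows a `null` director and a TV show whose `duration` is 4, which is unlikely to be a length in minutes. The sketch below rebuilds only those five preview rows as a hypothetical `sample` frame (it is not the full dataset, and only a few columns are included) to illustrate the checks.

```python
import pandas as pd

# Hypothetical sample rebuilt from the five preview rows above (not the full dataset)
sample = pd.DataFrame({
    "show_id": ["s1", "s2", "s3", "s4", "s5"],
    "type": ["TV Show", "Movie", "Movie", "Movie", "Movie"],
    "director": [None, "Jorge Michel Grau", "Gilbert Chan", "Shane Acker", "Robert Luketic"],
    "duration": [4, 93, 78, 80, 123],
})

# Count missing values in each column
missing_per_column = sample.isna().sum()

# Count how many rows belong to each type
type_counts = sample["type"].value_counts()

print(missing_per_column)
print(type_counts)
```

The same two calls should work unchanged on `netflix_df` once the CSV is loaded.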
# Filter the data to remove TV shows and store as netflix_subset
netflix_subset = netflix_df[netflix_df["type"] != 'TV Show']
# Display the first few rows of the subsetted dataframe
print(netflix_subset.head())
show_id | type | title | director | cast | country | date_added | release_year | duration | description | genre |
---|---|---|---|---|---|---|---|---|---|---|
s2 | Movie | 7:19 | Jorge Michel Grau | Demián Bichir, Héctor Bonilla, Oscar Serrano, Azalia Ortiz, Octavio Michel, Carmen Beato | Mexico | December 23, 2016 | 2016 | 93 | After a devastating earthquake hits Mexico City, trapped survivors from all walks of life wait to be rescued while trying desperately to stay alive. | Dramas |
s3 | Movie | 23:59 | Gilbert Chan | Tedd Chan, Stella Chung, Henley Hii, Lawrence Koh, Tommy Kuan, Josh Lai, Mark Lee, Susan Leong, Benjamin Lim | Singapore | December 20, 2018 | 2011 | 78 | When an army recruit is found dead, his fellow soldiers are forced to confront a terrifying secret that's haunting their jungle island training camp. | Horror Movies |
s4 | Movie | 9 | Shane Acker | Elijah Wood, John C. Reilly, Jennifer Connelly, Christopher Plummer, Crispin Glover, Martin Landau, Fred Tatasciore, Alan Oppenheimer, Tom Kane | United States | November 16, 2017 | 2009 | 80 | In a postapocalyptic world, rag-doll robots hide in fear from dangerous machines out to exterminate them, until a brave newcomer joins the group. | Action |
s5 | Movie | 21 | Robert Luketic | Jim Sturgess, Kevin Spacey, Kate Bosworth, Aaron Yoo, Liza Lapira, Jacob Pitts, Laurence Fishburne, Jack McGee, Josh Gad, Sam Golzari, Helen Carey, Jack Gilpin | United States | January 1, 2020 | 2008 | 123 | A brilliant group of students become card-counting experts with the intent of swindling millions out of Las Vegas casinos by playing blackjack. | Dramas |
s7 | Movie | 122 | Yasir Al Yasiri | Amina Khalil, Ahmed Dawood, Tarek Lotfy, Ahmed El Fishawy, Mahmoud Hijazi, Jihane Khalil, Asmaa Galal, Tara Emad | Egypt | June 1, 2020 | 2019 | 95 | After an awful accident, a couple admitted to a grisly hospital are separated and must find each other to escape — before death finds them. | Horror Movies |
# Subset the Netflix movie data, keeping only the columns "title", "country", "genre", "release_year", "duration"
netflix_movies = netflix_subset[['title', 'country', 'genre', 'release_year', 'duration']]
# Display the first few rows of the new dataframe
print(netflix_movies.head())
title | country | genre | release_year | duration |
---|---|---|---|---|
7:19 | Mexico | Dramas | 2016 | 93 |
23:59 | Singapore | Horror Movies | 2011 | 78 |
9 | United States | Action | 2009 | 80 |
21 | United States | Dramas | 2008 | 123 |
122 | Egypt | Horror Movies | 2019 | 95 |
# Filter movies that are shorter than 1 hour and save as short_movies
short_movies = netflix_movies[netflix_movies['duration']<60]
# Display the first ten rows of the filtered dataframe
print(short_movies.head(10))
title | country | genre | release_year | duration |
---|---|---|---|---|
#Rucker50 | United States | Documentaries | 2016 | 56 |
100 Things to do Before High School | United States | Uncategorized | 2014 | 44 |
13TH: A Conversation with Oprah Winfrey & Ava DuVernay | null | Uncategorized | 2017 | 37 |
3 Seconds Divorce | Canada | Documentaries | 2018 | 53 |
A 3 Minute Hug | Mexico | Documentaries | 2019 | 28 |
A Christmas Special: Miraculous: Tales of Ladybug & Cat Noir | France | Uncategorized | 2016 | 22 |
A Family Reunion Christmas | United States | Uncategorized | 2019 | 29 |
A Go! Go! Cory Carson Christmas | United States | Children | 2020 | 22 |
A Go! Go! Cory Carson Halloween | null | Children | 2020 | 22 |
A Go! Go! Cory Carson Summer Camp | null | Children | 2020 | 21 |
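The short titles listed above cluster in a few genres. One way to quantify that is to count rows per genre and compare typical durations. This is a minimal sketch that uses only the ten rows shown above, stored in a hypothetical `short_sample` frame, so the numbers describe this excerpt rather than the whole catalogue.

```python
import pandas as pd

# Hypothetical sample: the ten short-movie rows shown above (not the full dataset)
short_sample = pd.DataFrame({
    "genre": [
        "Documentaries", "Uncategorized", "Uncategorized", "Documentaries",
        "Documentaries", "Uncategorized", "Uncategorized", "Children",
        "Children", "Children",
    ],
    "duration": [56, 44, 37, 53, 28, 22, 29, 22, 22, 21],
})

# Number of short titles in each genre
genre_counts = short_sample["genre"].value_counts()

# Median duration in minutes for each genre
median_by_genre = short_sample.groupby("genre")["duration"].median()

print(genre_counts)
print(median_by_genre)
```

Applying the same two lines to `short_movies` would give the counts for every title under 60 minutes.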
import matplotlib.pyplot as plt
# Create colors list based on genre
colors = []
for lab, row in netflix_movies.iterrows():
    if row['genre'] == "Children":
        colors.append('yellow')
    elif row['genre'] == "Documentaries":
        colors.append('brown')
    elif row['genre'] == "Stand-Up":
        colors.append('blue')
    else:
        colors.append('grey')
# Initialize a matplotlib figure object called fig and create a scatter plot
fig = plt.figure()
plt.scatter(netflix_movies['release_year'], netflix_movies['duration'], color = colors)
plt.xlabel('Release year')
plt.ylabel('Duration (min)')
plt.title('Movie Duration by Year of Release')
plt.show()
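The genre-to-color assignment can also be written without a row-by-row loop. The sketch below is an alternative to the loop above, not the method used in this project: it maps genres through a dictionary (here called `genre_colors`, a name introduced for illustration) and falls back to grey for every other genre. The small `sample` frame is hypothetical.

```python
import pandas as pd

# Same color choices as the loop above; any genre not listed falls back to grey
genre_colors = {"Children": "yellow", "Documentaries": "brown", "Stand-Up": "blue"}

# Hypothetical sample of genres for illustration
sample = pd.DataFrame(
    {"genre": ["Dramas", "Children", "Documentaries", "Stand-Up", "Horror Movies"]}
)

# Series.map returns NaN for genres missing from the dictionary; fillna supplies the default
colors = sample["genre"].map(genre_colors).fillna("grey").tolist()

print(colors)  # ['grey', 'yellow', 'brown', 'blue', 'grey']
```

On the full data, replacing `sample` with `netflix_movies` should produce the same list of colors as the loop.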
Based on the scatter plot, it is not definitively clear that movies are consistently getting shorter over time. There are fluctuations in movie durations across different release years, suggesting that other factors, such as genre and production constraints, may play a significant role in determining movie lengths.
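A scatter plot alone makes a trend hard to judge, so a numeric summary can complement it. The sketch below averages duration per release year and computes the Pearson correlation between year and duration, where a negative value would be consistent with movies getting shorter. The helper `yearly_mean_duration` and the five-row `sample` (taken from the movie preview earlier) are illustrative assumptions, and five rows are far too few to support any conclusion.

```python
import pandas as pd


def yearly_mean_duration(movies: pd.DataFrame) -> pd.Series:
    """Return the mean duration per release year, sorted by year."""
    return movies.groupby("release_year")["duration"].mean().sort_index()


# Hypothetical sample: the five movie rows previewed earlier (not the full dataset)
sample = pd.DataFrame({
    "release_year": [2016, 2011, 2009, 2008, 2019],
    "duration": [93, 78, 80, 123, 95],
})

yearly = yearly_mean_duration(sample)

# Pearson correlation between release year and duration
corr = sample["release_year"].corr(sample["duration"])

print(yearly)
print(f"Correlation between year and duration: {corr:.2f}")
```

Running the same function on `netflix_movies`, ideally once per genre, would show whether the overall pattern holds after accounting for the shorter genres highlighted in the plot.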