Project Case: Microsoft Looking to Explore Film

Posted by Justin Grisanti on May 11, 2021

Disclaimer: This is a case study. All scenarios involving the companies mentioned are fictional.

Microsoft has decided to join the movie industry, and they are looking to work with me to help them understand the industry. My aim is to help them decide the best way to move forward. After working countless hours on my code and creating visualizations, I think I have figured out what they should do. Here is my process:

Step 1: Develop Questions

  • Question 1: Which studio, genre and director generated the most profit?

  • Question 2: Do audience ratings impact profit?

  • Question 3: What drives revenue for the top film franchises?

Step 2: Import the Data

I started by importing data provided by Flatiron School. The data includes tables from IMDb, Rotten Tomatoes, Box Office Mojo, and TheMovieDB. I also scraped a table showing the revenue breakdown by franchise from Wikipedia. In all, the data I use comes from the following tables:

  • tn_budgets
  • imdb_title_basics
  • imdb_ratings
  • imdb_names
  • imdb_principals
  • bom_budgets
  • Wikipedia Table: List_of_highest-grossing_media_franchises

Step 3: Clean the Data

From there, I cleaned the data from each table using Python and various methods from pandas. This process was tedious and took the most time. However, it is the most important part, because inaccurate or incomplete data could undermine my visualizations, or even my recommendations.
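To give a sense of what this cleaning can look like, here is a minimal sketch and not my exact code. It assumes a budgets table whose dollar amounts are stored as strings such as "$1,234,567"; the column names, the sample rows, and the clean_money helper are all made up for illustration.

```python
import pandas as pd


def clean_money(series: pd.Series) -> pd.Series:
    """Turn strings like '$1,234,567' into floats; unparseable values become NaN."""
    stripped = series.astype(str).str.replace(r"[$,]", "", regex=True)
    return pd.to_numeric(stripped, errors="coerce")


# Hypothetical rows standing in for a budgets table (not real data).
raw = pd.DataFrame({
    "movie": ["Film A", "Film B", "Film B", "Film C"],
    "production_budget": ["$100,000,000", "$20,000,000", "$20,000,000", None],
    "worldwide_gross": ["$350,000,000", "$15,000,000", "$15,000,000", "$5,000,000"],
})

# Drop exact duplicate rows, then convert the money columns to numbers.
cleaned = raw.drop_duplicates().copy()
for col in ["production_budget", "worldwide_gross"]:
    cleaned[col] = clean_money(cleaned[col])

# Profit cannot be computed without both figures, so drop incomplete rows.
cleaned = cleaned.dropna(subset=["production_budget", "worldwide_gross"])
cleaned["profit"] = cleaned["worldwide_gross"] - cleaned["production_budget"]

print(cleaned)
```

Dropping rows with a missing budget is only one option. Depending on how many rows are affected, filling in or flagging the missing values might be the better choice.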

Step 4: Analyze Data

After I successfully cleaned the data, I started to analyze the data and perform joins using SQL. For each of the questions developed in Step 1, I created visualizations that could help me understand what recommendations I should make to Microsoft. Below are the visualizations that I have created:

Graph 1

Image 1 This graph shows which movie studios were able to generate the most profit.

Graph 2

Image 2 This graph shows which genre generated the most profit.

Graph 3

Image 3 This graph shows which directors were able to generate the most profit.

Graph 4

Image 4 This graph is a scatter plot that shows the relationship between film rating and profit generated. Darker points have more votes on IMDb.

Graph 5

Image 5 This graph shows the revenue breakdown for the top film franchises.
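The SQL joins behind graphs like these can be sketched roughly as follows. This is an illustration under assumptions, not my actual queries: the table names tn_budgets and imdb_title_basics come from the list in Step 2, but the column names, the join key (movie title), and the sample rows are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory database with tiny hypothetical versions of two tables.
conn = sqlite3.connect(":memory:")

budgets = pd.DataFrame({
    "movie": ["Film A", "Film B", "Film C"],
    "production_budget": [100.0, 20.0, 50.0],
    "worldwide_gross": [350.0, 15.0, 120.0],
})
basics = pd.DataFrame({
    "primary_title": ["Film A", "Film B", "Film C"],
    "genre": ["Action", "Drama", "Action"],
})
budgets.to_sql("tn_budgets", conn, index=False)
basics.to_sql("imdb_title_basics", conn, index=False)

# Join budgets to genres on the title, then total the profit per genre.
query = """
SELECT b.genre,
       SUM(t.worldwide_gross - t.production_budget) AS total_profit
FROM tn_budgets AS t
JOIN imdb_title_basics AS b
  ON t.movie = b.primary_title
GROUP BY b.genre
ORDER BY total_profit DESC
"""
profit_by_genre = pd.read_sql_query(query, conn)
conn.close()

print(profit_by_genre)
```

Joining on title alone is fragile, since remakes and re-releases can share a name. Adding the release year to the join condition is one way to reduce false matches.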

Step 5: Develop Conclusions

Question 1

From our three graphs related to question 1, we can determine the following: Walt Disney, Universal Studios, and Fox generate the most profit, followed by Warner Bros and Sony. From personal knowledge, these film studios tend to produce action-adventure franchises, as well as family-comedy movies. According to my ‘genres’ graph, Action, Adventure, and Comedy generate the most profit overall, which could be explained by the type of movies that these studios make. From the ‘directors’ graph, we can see the directors that generated the most profit, as well. Seven out of the ten directors create action/adventure movies, and the other three create family/comedy movies.

Question 2

When thinking about question 2, I wanted to see if profit was affected if a movie had poor ratings, or if higher ratings caused it to generate more profit. From my calculations, there was a 0.17 correlation coefficient between ratings and profit, which is a weak relationship and leads me to believe that movies can make a profit even if they rate poorly. Looking at the graph, I see two things: 1. It seems that the best-performing movies do tend to have higher ratings; 2. There seems to be an even distribution of movies that had a negative profit across good and bad ratings.
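For readers who want to see how a correlation coefficient like this is computed, here is a small sketch. The ratings and profit figures are invented for illustration, and the column names are assumptions. pandas computes the Pearson coefficient by default, and the manual version shows the formula it corresponds to.

```python
import pandas as pd

# Hypothetical ratings and profits (in millions); not the project data.
films = pd.DataFrame({
    "averagerating": [4.2, 5.5, 6.1, 6.8, 7.4, 8.0],
    "profit": [30.0, -5.0, 120.0, 10.0, -20.0, 200.0],
})

# Series.corr uses the Pearson method unless told otherwise.
r = films["averagerating"].corr(films["profit"])

# The same value by hand: covariance term over the product of spreads.
x = films["averagerating"] - films["averagerating"].mean()
y = films["profit"] - films["profit"].mean()
r_manual = (x * y).sum() / ((x ** 2).sum() * (y ** 2).sum()) ** 0.5

print(f"Pearson r = {r:.2f}, r squared = {r ** 2:.2f}")
```

Squaring the coefficient gives a rough sense of scale: with r = 0.17, r squared is about 0.03, so a straight-line fit on ratings would account for only around 3% of the variation in profit.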

Question 3

From looking at this data, I discovered that box office numbers don’t tell the whole story. When looking at all of the revenue generated by all of the franchises in our data, box office sales accounted for 43% and merchandise sales accounted for 37%. This shows me that there is more to a film franchise than box office sales. There definitely is potential for other ways to bring in revenue.
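A revenue-share calculation like the one above can be sketched as follows. The franchise names, revenue categories, and numbers are placeholders chosen for easy arithmetic, not the figures from the Wikipedia table.

```python
import pandas as pd

# Hypothetical revenue by source (in billions) for two made-up franchises.
franchises = pd.DataFrame({
    "franchise": ["Franchise A", "Franchise B"],
    "box_office": [40.0, 10.0],
    "merchandise": [30.0, 10.0],
    "home_video": [5.0, 5.0],
}).set_index("franchise")

# Total each revenue source across franchises, then convert to percentages.
totals = franchises.sum()
shares = (totals / totals.sum() * 100).round(1)

print(shares)
```

Summing across franchises before dividing weights large franchises more heavily. Averaging each franchise's own percentages instead would treat every franchise equally and could give a different picture.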

Step 6: Business Recommendations

1. Create a franchise or reboot of action/adventure movies, or a collection of family-friendly movies. The careers of the top 10 directors overall revolve around these types of movies, and they tend to bring in the most profit. I recommend looking to collaborate with one of these top directors to make films in these genres.

2. As ratings appear to be only weakly correlated with profit, I recommend spending more money on advertising, so we can draw people into theaters. Getting people interested in the franchise ‘brand’ itself is important for recurring franchise interest (see recommendation 3).

3. Focus on the merchandising potential of the movies. As shown in question 3, merchandise sales make up a big chunk of franchise revenue (37% in our data). Making family-friendly movies or action movies could help promote sales of stuffed animals, action figures, or apparel.

Reflection

This task was my first large data science project, and while it was hard, it was also fun to learn as much as I did! It can feel difficult to know where to begin when given the freedom to decide for yourself. Persistence is key, and while I can’t know for certain if my recommendations would succeed, I do know that the data driving these assumptions was handled to the best of my ability.