TV Show Popularity Analysis Using Data Mining with Python

Project Domain / Category

Sentiment Analysis

Abstract / Introduction

Television content production has grown rapidly over the years, with numerous shows across many genres
being produced and aired globally. In this rapidly evolving landscape, competition for audience attention
is intense. With an abundance of content across diverse genres and platforms, understanding what makes
a TV show popular is essential for content creators, producers, and broadcasters. This project, "TV Show
Popularity Analysis Using Data Mining with Python," aims to examine the factors driving TV show
popularity using data mining techniques and Python programming.

Specifically, the project proposes to use Python together with data mining techniques to analyze the
factors that influence the popularity of TV shows.

Functional Requirements:

1. Data Set:

The first step is to collect a dataset comprising various attributes of TV shows, including genre, cast,
ratings, release year, number of seasons and episodes, and viewers' comments. You can collect data from
online sources, databases, and APIs such as IMDb, TVDB, or Kaggle datasets related to TV shows. To meet
the project requirements, you may have to combine two or more datasets.
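Combining two datasets could look roughly like the sketch below. It assumes pandas is used and that both sources share a show title column. The column names and the rows are hypothetical placeholders, not data from IMDb, TVDB, or Kaggle.

```python
import pandas as pd

# Hypothetical show metadata, e.g. exported from one source.
shows = pd.DataFrame({
    "title": ["Show A", "Show B", "Show C"],
    "genre": ["Drama", "Comedy", "Drama"],
    "release_year": [2015, 2018, 2020],
    "seasons": [5, 3, 1],
})

# Hypothetical ratings and viewer comments, e.g. from a second source.
reviews = pd.DataFrame({
    "title": ["Show A", "Show A", "Show B"],
    "rating": [8.5, 9.0, 6.0],
    "comment": ["Loved it!", "Great cast.", "Not very funny."],
})

# Aggregate the reviews to one row per show.
per_show = reviews.groupby("title", as_index=False).agg(
    mean_rating=("rating", "mean"),
    n_comments=("comment", "count"),
)

# Left join so that shows without any reviews are kept (with missing values).
combined = shows.merge(per_show, on="title", how="left")
print(combined)
```

A left join is used here so that no show is silently dropped. Shows with no reviews end up with missing values, which then have to be handled during pre-processing.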

2. Data Pre-processing:

The second step is to clean and preprocess the dataset to ensure data quality and reliability, and to
prepare it for sentiment analysis. You have to perform the following preprocessing steps:
1. Text normalization: Lowercasing, removing special characters.
2. Tokenization: Splitting text into individual words or tokens.
3. Stop word removal: Eliminating common words that do not carry significant sentiment information.
4. Stemming or Lemmatization: Reducing words to their base form.
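The preprocessing steps above can be sketched with the Python standard library alone. This is only an illustration: the stop-word list is a tiny hand-picked sample, and `naive_stem` is a hypothetical, very rough suffix stripper standing in for a real stemmer or lemmatizer (for example, the ones provided by NLTK).

```python
import re

# Tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "was", "it", "this", "of", "and", "to", "i"}


def normalize(text):
    """Lowercase the text and replace special characters with spaces."""
    return re.sub(r"[^a-z0-9\s]", " ", text.lower())


def tokenize(text):
    """Split normalized text into tokens on whitespace."""
    return text.split()


def remove_stop_words(tokens):
    """Drop common words that carry little sentiment information."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip one common suffix. This is NOT the Porter algorithm."""
    for suffix in ("ing", "ed", "ly", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(comment):
    """Run all four steps on a single viewer comment."""
    tokens = tokenize(normalize(comment))
    tokens = remove_stop_words(tokens)
    return [naive_stem(t) for t in tokens]


print(preprocess("This show was AMAZING, I loved the acting!"))
# → ['show', 'amaz', 'lov', 'act']
```

The stems are not always real words ("amaz", "lov"). That is typical of stemming. Lemmatization would return dictionary forms instead, at the cost of needing a vocabulary.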

3. Data mining Techniques:

In the next step, you will analyse the prepared dataset by applying data mining techniques such as
clustering and classification.

In this step, you have to use different data mining techniques to perform the following tasks:
1. Implement clustering algorithms (e.g., K-means) to group TV shows based on similarities.
2. Use classification algorithms (e.g., decision trees, neural networks) to predict TV show popularity
categories.
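A minimal sketch of both tasks, assuming scikit-learn is used, is shown below. The feature columns (mean rating, number of seasons, mean comment sentiment), the six rows, and the "high"/"low" popularity labels are made up purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Hypothetical features per show: [mean_rating, seasons, mean_sentiment]
X = np.array([
    [9.0, 6, 0.8],
    [8.7, 5, 0.7],
    [8.9, 7, 0.9],
    [4.2, 1, -0.5],
    [3.9, 1, -0.6],
    [4.5, 2, -0.4],
])
# Hypothetical popularity category for each show above.
y = np.array(["high", "high", "high", "low", "low", "low"])

# Task 1: clustering. K-means is distance based, so the features are
# standardized first so that no single column dominates.
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_scaled)
print("Cluster per show:", cluster_ids)

# Task 2: classification. A shallow decision tree predicts the
# popularity category of a new, unseen show.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)
new_show = np.array([[8.0, 4, 0.6]])
predicted = tree.predict(new_show)
print("Predicted category:", predicted[0])
```

With a real dataset, the classifier should be evaluated on held-out data (for example with `train_test_split`) rather than on the rows it was trained on, and the number of clusters should be chosen by inspection or a score such as the silhouette coefficient.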

4. TV Show Recommendation:

1. Interpret the findings from the analysis and provide actionable recommendations for content creators and
broadcasters.
2. In this final recommendation phase, you have to give a list of the factors that play an important role in
the popularity of a TV show.
3. These factors can be labelled according to their importance.
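One possible way to rank and label factors is through the feature importances of a tree-based model. The sketch below assumes scikit-learn and uses synthetic data in which popularity depends only on the rating, so the outcome is known in advance. The feature names and the "high"/"medium"/"low" thresholds are arbitrary assumptions, not part of the project specification.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

feature_names = ["mean_rating", "seasons", "release_year"]

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(1, 10, n),        # mean_rating
    rng.integers(1, 10, n),       # seasons
    rng.integers(1990, 2024, n),  # release_year
])
# The synthetic label depends on the rating alone: 1 = popular, 0 = not.
y = (X[:, 0] > 7).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Pair each factor with its importance and sort from most to least important.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)


def importance_label(importance):
    """Map an importance score to a coarse label (thresholds are arbitrary)."""
    if importance >= 0.5:
        return "high"
    if importance >= 0.1:
        return "medium"
    return "low"


for name, importance in ranked:
    print(f"{name}: {importance:.2f} ({importance_label(importance)})")
```

Impurity-based importances describe what the model relied on, not what causes popularity, so the ranking should be read together with the clustering and sentiment findings before it is turned into recommendations.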

Note:
• More functional requirements can be added to each deliverable.
• A detailed document covering each deliverable and the tools and libraries to be used will be provided
after the project is selected.
• Python skills and prior knowledge of data mining are required. Please study the proposal thoroughly
before opting for the project.

Tools:
• Windows OS
• R software
• Python
• Online sentiment analysis tool

Supervisor:

Name: Rizwana Noor


Email ID: rizwana.noor@vu.edu.pk
Skype ID: rizwana.noor77
