This document provides details about a student intern working on a movie success prediction project using an IMDB movie dataset. The project aims to analyze factors like actors, directors, ratings, and box office data to predict the success of upcoming movies and help film producers choose casts and crews. Visualization and machine learning techniques will be used to model relationships in the data and identify trends. The results found highest revenue movies between 2006-2016 and average movie ratings by year.
This document provides details about a student intern working on a movie success prediction project using an IMDB movie dataset. The project aims to analyze factors like actors, directors, ratings, and box office data to predict the success of upcoming movies and help film producers choose casts and crews. Visualization and machine learning techniques will be used to model relationships in the data and identify trends. The results found highest revenue movies between 2006-2016 and average movie ratings by year.
This document provides details about a student intern working on a movie success prediction project using an IMDB movie dataset. The project aims to analyze factors like actors, directors, ratings, and box office data to predict the success of upcoming movies and help film producers choose casts and crews. Visualization and machine learning techniques will be used to model relationships in the data and identify trends. The results found highest revenue movies between 2006-2016 and average movie ratings by year.
This document provides details about a student intern working on a movie success prediction project using an IMDB movie dataset. The project aims to analyze factors like actors, directors, ratings, and box office data to predict the success of upcoming movies and help film producers choose casts and crews. Visualization and machine learning techniques will be used to model relationships in the data and identify trends. The results found highest revenue movies between 2006-2016 and average movie ratings by year.
SKILLS BUILD EMAIL ID: anilvudumula2002@gmail.com COLLEGE NAME: KRISHNA CHAITANYA INSTITUTE OF TECHNOLOGY & SCIENCES COLLEGE STATE: ANDHRA PRADESH INTERNSHIP DOMAIN: DATA ANALYTICS INTERNSHIP START & END DATE: PROBLEM STATEMENT: In this Movie Success Rate Prediction project, The Method of using the ratings of the films by the cast and crew has been an innovative and original way to solve the dilemma of film producers. Looking at the average ratings of each actor and director together with all the films they participated in should be able to give the producer a good idea of who to cast and who not to cast in a film that is to be out right now. So I decided to present some interesting stats about the popular movies,highest revenue movies and some other statistics about the movies which were released during 2006 to 2016. AGENDA: Load the Dataset and display First 15 Rows. Shape of the Dataset. Information about the Dataset. Drop all the Missing Values. Check for Duplicate Data. Get Overall Information About the Dataset. Display Number of Movies per Year. Top 10 Highest Revenue Movies. Average Rating Of Movies Year Wise. Classification Of Movies Based on Ratings. PROJECT OVERVIEW: The purpose of this Imdb movie dataset analysis project is to predict the success of any upcoming movie using data tools. For this purpose we have proposed a method that will analyze the cast and crew of the movie to find the success rate of the film using existing knowledge. Many factors like actors , directors , producers , budget , worldwide gross ,ratings and language will be considered as the algorithm to train and test the data. THE END USERS: 43%, or almost half, of IMDb TV viewers do not subscribe to a pay TV service, which is 25% more than the US general population. Among them, 26% are cord-cutters and 17% are cord- nevers . With IMDb TV, advertisers can effectively extend their reach beyond linear TV, given its high concentration of non-pay TV audiences. 46%, or nearly half, of IMDb TV viewers are family audiences, which is 23% higher than the US general population. For CPG, toys, and household products advertisers, IMDb TV can provide an effective way to reach a heavily family-oriented audience that align with their brands. This is especially beneficial to advertisers, as we’ve also found that IMDb TV viewers are more likely to purchase learning and technology toys, outdoor and sports toys. 62% of IMDb TV viewers fall in the 18-49 age cohort. In addition, 64% of IMDb TV’s audience is male, which is +32% higher than the US general population. IMDb TV can help brands engage the highly coveted male audience cohorts. 36% of IMDb TV audience is M18-49, which is +33% higher than the US general population. 19% of the IMDb TV audience is Male 18-34, which is +18% higher than the US general population. SOLUTION AND ITS VALUE : The main use of this project is it will helps the production houses who are making films and also to common audience. Based on the factors in IMDB producers choose cast and crew for their films Looking at the average ratings of each actor and director together with all the films they participated in should be able to give the producer a good idea of who to cast and who not to cast in a film that is to be out right now. The audience also have benefits from IMDB that they pick good movies to watch based on the ratings and reviews provided by IMDB. The Method of using the ratings of the films by the cast and crew has been an innovative and original way to solve the dilemma of film producers. MODELLING: A data science framework is a collection of libraries that provides data mining functionality, i.e., methods for exploring the data, cleaning it up, and transforming it into some more useful format that can be used for data processing or machine learning tasks. The whole process involves building visualizations to gain insights from your data. The main objective of data modeling is to help teams visualize the relationships between their data analytics to understand better how it all interconnects. In this project I use different modelling techniques to visualize the data some of them are bar plot scatter plot and histogram. RESULTS: After the analysis of the IMDB dataset I found some results regarding to the dataset. From 2006 to 2016 I get the data about number of movies releasing each year .the data is as follows 2006-44,2007-53,2008-52,2009-51,2010-60,2011-63,2012-64,2013-91,2014- 98,2015=127,2016-297. The Most popular movie during this period is Star wars : Episode VII As a result from this project I got average rating of a movie year wise. The project shows that the rating of a movie should effect the revenue of that movie. LINKS: