This repository contains analysis of IMDB data from multiple sources and analysis of movies/cast/box office revenues, movie brands and franchises


ashmitan/IMDB-Analysis


IMDB Data Analysis Pipeline

Objective:

The aim of the project is to analyse movie data from multiple sources (IMDB, MovieLens, The Numbers, and Box Office Mojo), covering movies, cast, box office revenues, movie brands, and franchises, and to perform the ETL processes using Talend.

Technologies Used:

  1. ER/Studio
  2. SQL Server Developer Edition
  3. Microsoft SQL Server Management Studio
  4. Talend Real-Time Data Platform 7.1
  5. Tableau Desktop
  6. Microsoft Power BI

Dataset Links:

  1. https://datasets.imdbws.com/
  2. https://www.boxofficemojo.com/franchise/?ref_=bo_nb_fr_secondarytab
  3. https://www.boxofficemojo.com/brand/?ref_=bo_nb_frs_secondarytab
  4. https://grouplens.org/datasets/movielens/25m/
  5. https://www.the-numbers.com/movies/franchises
  6. https://www.the-numbers.com/movies/franchise/Marvel-Cinematic-Universe#tab=summary
  7. https://www.the-numbers.com/movie/Avengers-The-(2012)#tab=box-office
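
The IMDb files behind the first link are gzip-compressed, tab-separated files that mark missing values with `\N`. A minimal sketch of reading one before loading it into staging is shown here. It assumes a locally downloaded copy such as `title.basics.tsv.gz`; the file name, the column names, and the helper names `read_imdb_tsv` and `read_imdb_gzip` are illustrative and not part of this repository.

```python
import csv
import gzip
import io
from typing import Iterable, Iterator, Optional


def read_imdb_tsv(lines: Iterable[str]) -> Iterator[dict[str, Optional[str]]]:
    """Yield one dict per data row, mapping IMDb's "\\N" marker to None."""
    # QUOTE_NONE: IMDb fields are not quoted, so quote characters are literal text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield {key: (None if value == r"\N" else value) for key, value in row.items()}


def read_imdb_gzip(path: str) -> Iterator[dict[str, Optional[str]]]:
    """Stream rows from a gzip-compressed IMDb file without unpacking it first."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from read_imdb_tsv(handle)


# Small in-memory example with hypothetical rows in the IMDb layout.
sample = io.StringIO(
    "tconst\ttitleType\tprimaryTitle\tstartYear\n"
    "tt0000001\tshort\tCarmencita\t1894\n"
    "tt0000002\tshort\tUntitled\t\\N\n"
)
for record in read_imdb_tsv(sample):
    print(record["tconst"], record["startYear"])
```

Streaming rows this way keeps memory use flat for the larger files; the Talend jobs in this project may read the files differently.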

Code Walkthrough:

Step 1: Run the following scripts in SSMS to set up the staging database

The Number - stage tables.sql

stg imdb tables - core tables.sql

stg imdb tables expanded part 2.sql

stg_ml_tables.sql
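
As an alternative to opening each file in SSMS, the scripts listed above could be run from the command line with `sqlcmd`. The sketch below only builds the commands. The server name, database name, and script folder are placeholders, the run order simply follows the order listed above (an assumption), and `-E` assumes Windows authentication.

```python
import subprocess
from pathlib import Path

# Script names as listed in Step 1; running them in this order is an assumption.
STAGING_SCRIPTS = [
    "The Number - stage tables.sql",
    "stg imdb tables - core tables.sql",
    "stg imdb tables expanded part 2.sql",
    "stg_ml_tables.sql",
]


def build_sqlcmd_commands(server: str, database: str, script_dir: str) -> list[list[str]]:
    """Return one sqlcmd argument list per staging script."""
    commands = []
    for name in STAGING_SCRIPTS:
        script = Path(script_dir) / name
        # -S server, -d database, -E trusted connection,
        # -b stop with a non-zero exit code on error, -i input file
        commands.append(
            ["sqlcmd", "-S", server, "-d", database, "-E", "-b", "-i", str(script)]
        )
    return commands


def run_staging_scripts(server: str, database: str, script_dir: str) -> None:
    """Run each script in turn and stop at the first failure."""
    for command in build_sqlcmd_commands(server, database, script_dir):
        subprocess.run(command, check=True)
```

Calling `run_staging_scripts("localhost", "IMDB_Staging", "scripts")` would execute the four scripts against a hypothetical `IMDB_Staging` database; passing argument lists rather than a single shell string avoids quoting problems with the spaces in the file names.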

Step 2: Open Talend and set up your database connections and input file connections

Once the connections succeed, run the jobs.

Step 3: Build visualizations in Tableau and Power BI

Refer to the Tableau workbook for the current visualizations; new use cases will be added soon. A Microsoft Power BI file will also be added soon.

Contact

Please feel free to reach out to ashmitan20@gmail.com with any questions or proposed changes.
