
A report on

CUTM Alumni Data Analysis and Visualization Using Python

Submitted by

Amit Kumar (190101120110)
Raj Gautam (190101120111)
Sandeep Kumar (190101120102)
Jyoti Kumari (190101120103)

Submitted to
DR. DHAWALESWAR RAO
Assistant Professor, CSE

School of Engineering & Technology


CENTURION UNIVERSITY OF TECHNOLOGY AND MANAGEMENT
ODISHA

School of Engineering & Technology
CENTURION UNIVERSITY OF TECHNOLOGY AND MANAGEMENT
ODISHA

CERTIFICATE

This is to certify that the dissertation entitled “CUTM Alumni Data Analysis and
Visualization Using Python” is a bona fide work done by Group 9 of B.Tech (CSE) in
partial fulfilment for the award of the degree of B.Tech in CSE, School of
Engineering & Technology of Centurion University of Technology and
Management, Odisha. The project work done and the report satisfy the
requirements for the award of the degree mentioned.

Supervisor: DR. DHAWALESWAR RAO
HOD: Debendra Maharana
Dean: Dr. Ashish Ranjan Dash

ACKNOWLEDGEMENT

We wish to express our profound and sincere gratitude to Mr. Anshuman Patnaik,
School of Engineering & Technology, CUTM Paralakhemundi, who guided us through
the intricacies of this project nonchalantly and with matchless magnanimity.

Group 6

Amit Kumar (190101120110)
Raj Gautam (190101120111)
Sandeep Kumar (190101120102)
Jyoti Kumari (190101120103)

CONTENTS

ABSTRACT
INTRODUCTION
Objective
The Proposal
System Requirement
Software Required
Libraries used in this project
Pandas
Why use Pandas?
What can Pandas do?
NumPy
Why Use NumPy?
Why is NumPy Faster than Lists?
Matplotlib
What is Matplotlib?
Dash
Plotly Express
Implementation
CONCLUSION
References

ABSTRACT
In this project we analyse the CUTM alumni dataset and visualize it with the help of various graphs such as bar,
pie, and line charts. We use several libraries, including NumPy, Pandas, Matplotlib, Dash, Plotly, and
Plotly Express. We have also framed numerous questions about the dataset in order to extract meaningful and
important information.

INTRODUCTION
Python provides numerous libraries for data analysis and visualization, mainly NumPy, pandas, Matplotlib, and
seaborn. In this section, we discuss the pandas library for data analysis and visualization, which is an open-source
library built on top of NumPy.

The alumni dataset is in xlsx format. It has 27 columns and 1625 rows. It contains data on the alumni who
graduated from CUTM up to 2021, in columns such as Company, Future plans, Email, and
Mobile.
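
As an illustration of how such a workbook could be loaded and inspected, here is a minimal sketch. The function names `load_alumni` and `summarize` and the sample rows are assumptions made for this example, not part of the project code; the real file name is not given in this report, so the sketch runs on a tiny stand-in table instead.

```python
import pandas as pd


def load_alumni(path):
    """Read an xlsx workbook into a DataFrame (pandas needs openpyxl for this)."""
    return pd.read_excel(path)


def summarize(df):
    """Return the row count, column count, and missing values per column."""
    n_rows, n_cols = df.shape
    missing = df.isna().sum().to_dict()
    return n_rows, n_cols, missing


# Made-up stand-in rows so the sketch runs without the real workbook.
sample = pd.DataFrame({
    "Company": ["Acme", None, "Globex"],
    "Future plans": ["Job", "Higher studies", None],
    "Email": ["a@example.com", "b@example.com", "c@example.com"],
})

rows, cols, missing = summarize(sample)
print(rows, cols, missing)
```

On the real dataset, `summarize(load_alumni(path))` would be expected to report 1625 rows and 27 columns, as described above.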

Objective
The project aims to achieve the following objectives:

➢ To extract important information from the dataset.
➢ To analyse the dataset.
➢ To visualize the dataset and draw useful insights from it.

The Proposal
It is proposed to build this project in Jupyter Notebook, which is a convenient interactive environment for Python
work. We will frame about 100 questions on the dataset and then draw approximately 20 graphs. We will also
implement a Dash dashboard in this project.

System Requirement
A computer with at least:
· 4 GB RAM
· 1 TB hard disk
· i3 processor
· 2 GB graphics card

Software Required
• Anaconda Navigator
• Jupyter Notebook

Libraries used in this project

Pandas
Pandas is a software library written for the Python programming language for data manipulation and analysis. In
particular, it offers data structures and operations for manipulating numerical tables and time series. It is free
software released under the three-clause BSD license.

Why use Pandas?


Pandas allows us to analyze large datasets and draw conclusions based on statistical theory.

Pandas can clean messy data sets, and make them readable and relevant.

Relevant data is very important in data science.

What can Pandas do?


Pandas can answer questions about the data, such as:

• Is there a correlation between two or more columns?

• What is average value?

• Max value?

• Min value?

Pandas can also delete rows that are not relevant or that contain wrong values, such as empty or NULL values. This
is called cleaning the data.
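
The questions listed above map onto a handful of pandas calls. The sketch below uses a small made-up table; the column names `passing_year` and `package_lpa` are hypothetical and are not taken from the alumni dataset.

```python
import pandas as pd

# Made-up data; the last row has an empty (NaN) value to be cleaned.
df = pd.DataFrame({
    "passing_year": [2018, 2019, 2020, 2021, None],
    "package_lpa": [3.0, 3.5, 4.0, 4.5, 5.0],
})

# Cleaning: drop every row that contains an empty value.
clean = df.dropna()

# Is there a correlation between two columns?
corr = clean["passing_year"].corr(clean["package_lpa"])

# What are the average, maximum, and minimum values?
avg = clean["package_lpa"].mean()
highest = clean["package_lpa"].max()
lowest = clean["package_lpa"].min()

print(len(clean), round(corr, 2), avg, highest, lowest)
```

`dropna()` returns a new DataFrame and leaves `df` unchanged, so the original data stays available for comparison.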

NumPy
NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and
matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Why Use NumPy?


In Python we have lists that serve the purpose of arrays, but they are slow to process for numerical work.

NumPy aims to provide an array object that can be up to 50x faster than traditional Python lists.

The array object in NumPy is called ndarray. It provides many supporting functions that make working
with ndarray very easy.

Arrays are very frequently used in data science, where speed and resources are very important.
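
A short sketch of what working with an ndarray looks like in practice. The marks used here are made-up values chosen only for illustration.

```python
import numpy as np

marks = [60, 75, 90]      # a plain Python list
arr = np.array(marks)     # the same values as an ndarray

# Arithmetic applies to every element at once, with no explicit loop.
bonus = arr + 5

# Supporting functions and attributes that come with the ndarray.
average = arr.mean()
print(type(arr).__name__, arr.shape, arr.ndim)
print(bonus.tolist(), average)
```

With a list, `marks + 5` would raise a `TypeError`, and adding 5 to each element would need a loop or a list comprehension.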

Why is NumPy Faster than Lists?
NumPy arrays are stored in one contiguous block of memory, unlike lists, so processes can access and manipulate
them very efficiently.

This behavior is called locality of reference in computer science.

This is the main reason why NumPy is faster than lists. NumPy is also optimized to work with modern CPU architectures.
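
The following sketch checks the memory layout and compares the time taken to double every element of a list and of an array. The array size and repeat count are arbitrary choices for this example, and the measured times depend on the machine, so no particular speed-up is claimed here.

```python
import timeit

import numpy as np

n = 100_000
values = list(range(n))   # Python list of separate int objects
arr = np.arange(n)        # ndarray holding the same numbers

# A freshly created array occupies one contiguous block of memory.
print("contiguous:", arr.flags["C_CONTIGUOUS"])

# Time 20 repetitions of doubling every element with each approach.
list_time = timeit.timeit(lambda: [v * 2 for v in values], number=20)
array_time = timeit.timeit(lambda: arr * 2, number=20)

print(f"list:  {list_time:.4f} s")
print(f"array: {array_time:.4f} s")
```

Both approaches produce the same numbers; only the time taken differs.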

Matplotlib
What is Matplotlib?
Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.

Matplotlib was created by John D. Hunter.

Matplotlib is open source and we can use it freely.

Matplotlib is mostly written in Python; a few segments are written in C, Objective-C and JavaScript for platform
compatibility.

Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension
NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits
like Tkinter, wxPython, Qt, or GTK.
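
A minimal sketch of a bar chart drawn through the object-oriented API described above. The company names and counts are made up for illustration, and the `Agg` backend is selected only so that the example runs without a display; in a Jupyter notebook the figure would appear inline instead.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Made-up counts of alumni per company.
companies = ["Company A", "Company B", "Company C"]
counts = [12, 7, 4]

# Figure and Axes objects are the core of the object-oriented API.
fig, ax = plt.subplots()
bars = ax.bar(companies, counts)
ax.set_xlabel("Company")
ax.set_ylabel("Number of alumni")
ax.set_title("Alumni per company (made-up data)")

# Render the figure to an in-memory PNG instead of a file on disk.
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)
```

Replacing `ax.bar` with `ax.plot` or `ax.pie` gives the line and pie variants of the same chart.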

Dash
Dash is an open-source Python framework used for building analytical web applications. It is a powerful library that
simplifies the development of data-driven applications. It’s especially useful for Python data scientists who aren’t
very familiar with web development. Users can create interactive dashboards in their browser using Dash.

Built on top of Plotly.js, React, and Flask, Dash ties modern UI elements like dropdowns, sliders and graphs directly
to your analytical Python code.

Dash apps consist of a Flask server that communicates with front-end React components using JSON packets over
HTTP requests.

Dash applications are written purely in Python, so no HTML or JavaScript is necessary.

Plotly Express
Plotly Express is a new high-level Python visualization library: it’s a wrapper for Plotly.py that exposes a simple
syntax for complex charts. Inspired by Seaborn and ggplot2, it was specifically designed to have a terse, consistent
and easy-to-learn API: with just a single import, you can make richly interactive plots in just a single function call,
including faceting, maps, animations, and trendlines.

Implementation

DASHBOARD

CONCLUSION
While making this project we learnt the basics of Python, NumPy, pandas, Matplotlib (pyplot), and
Dash, and we got a working idea of how to plot bar, line, and pie graphs. Making this project was both
fun and a good learning experience.

References
1) www.geeksforgeeks.org
2) www.tutorialspoint.com
3) https://numpy.org/
4) https://pandas.pydata.org/
5) https://matplotlib.org/
6) https://dash.plotly.com/introduction
