
NATIONAL INSTITUTE OF FASHION TECHNOLOGY

PATNA

DATA ANALYTICS AND R


END TERM JURY ASSIGNMENT

Submitted by:
Debamita Basak (BFT/21/400)
Shreya Srivastava (BFT/21/459)
Navya Angel (BFT/21/763)

Under the guidance of:


Ms. Nilima Regina Topno
Associate Professor, NIFT Patna

Department of Fashion Technology


Batch 2021-25
Abstract

In today's data-driven world, the ability to effectively analyze data is crucial for making
informed decisions and gaining valuable insights. Data analytics tools, such as R, provide
powerful capabilities for analyzing and visualizing data, making them indispensable for
researchers, analysts, and decision-makers across various industries.
This paper examines the process of analyzing data using R, a popular programming language and
software environment for statistical computing and graphics. We also discuss the advantages of
using R for data analysis, including its flexibility, extensibility, and rich ecosystem of packages.
By providing a comprehensive overview of the process of analyzing data using R, this paper
aims to help researchers and practitioners effectively leverage data analytics tools to extract
meaningful insights from their data and make informed decisions.

Certificate

This is to certify that this report, titled “Analysis of data using data analytics tool”, is based on the
original research work of Debamita Basak, Shreya Srivastava and Navya Angel, conducted under the
guidance of Ms. Nilima Regina Topno, Associate Professor at NIFT Patna, towards partial fulfillment of
the requirement for the award of the Bachelor’s Degree of the National Institute of Fashion Technology,
Patna. No part of this work has been copied from any other source. Material, wherever borrowed, has
been duly acknowledged.

Date:

Signature:
Debamita Basak
Navya Angel
Shreya Srivastava

Signature:
Ms. Nilima Regina Topno
Associate Professor
NIFT-Patna

Acknowledgement

We extend our heartfelt gratitude to the National Institute of Fashion Technology for giving us
the opportunity to undertake this assignment. Foremost, we would like to thank our subject faculty
for ‘Data Analytics and R’ at NIFT, Ms. Nilima Regina Topno, for the invaluable feedback and
guidance provided on this assignment throughout the classes. This could not have been achieved
without that support.
Lastly, we take the opportunity to thank all the people who guided us through the entire process,
and the fellow students at NIFT who imparted the knowledge and skills that we required to
complete this document.

Yours sincerely,
Debamita Basak
Navya Angel
Shreya Srivastava

Table of Contents
Abstract
Certificate
Acknowledgement
Introduction to Data Analytics and R
Applications of Data Analytics and R
Advantages of using Data Analytics and R
Our Survey
Responses of Survey
Analysis of Data
  Analysis 1- Linear Regression Line
  Analysis 2- Multiple Regression Line
  Analysis 3- Time-series
  Analysis 4- Pie chart
Conclusion

Table of Figures

Figure 1: Data Analytics with R Programming
Figure 2: Applications of Data Analytics and R
Figure 3: Advantages of using R
Figure 4: Linear regression line
Figure 5: Output of linear regression line 1
Figure 6: Output of linear regression line 2
Figure 7: Multiple Regression
Figure 8: Multiple Regression Output
Figure 9: Time-series graph
Figure 10: Time-series Output
Figure 11: Pie Chart
Figure 12: Pie-chart output

Introduction to Data Analytics and R

• Data analytics is the process of analyzing raw data to extract meaningful insights and
make informed decisions. It involves various techniques and tools to uncover patterns,
trends, and correlations in data. R is a programming language and software environment
commonly used for data analysis and statistical computing.
• R provides a wide variety of statistical and graphical techniques, and it is highly
extensible, allowing for easy integration with other applications and data sources. It is
particularly popular in academia and among data scientists for its powerful capabilities in
data visualization, data manipulation, and statistical modeling.
• In data analytics, R can be used to perform tasks such as data cleaning, exploratory data
analysis, hypothesis testing, and predictive modeling. Its flexibility and rich ecosystem of
packages make it a versatile tool for a wide range of data analytics tasks.
• Data analytics with R is about using data to gain insights, make predictions, and drive
decision-making processes, making it a valuable skill for anyone working with data in
various fields such as business, science, and academia.

Figure 1: Data Analytics with R Programming

Applications of Data Analytics and R

Data analytics and R can be used in a wide range of fields and industries. Some common
applications include:
• Business and Finance: Analyzing sales data, financial trends, and market data to make
informed business decisions. R can be used for financial modeling, risk analysis, and
portfolio management.
• Healthcare: Analyzing patient data, medical records, and clinical trials to improve
patient outcomes and streamline healthcare processes. R can be used for epidemiological
studies, clinical research, and healthcare analytics.
• Marketing and Advertising: Analyzing customer behavior, market trends, and
advertising campaigns to optimize marketing strategies. R can be used for market
segmentation, customer profiling, and campaign analytics.
• E-commerce: Analyzing customer preferences, sales data, and website traffic to improve
user experience and increase sales. R can be used for product recommendations, pricing
optimization, and customer churn analysis.
• Manufacturing and Supply Chain: Analyzing production data, supply chain logistics,
and quality control processes to improve efficiency and reduce costs. R can be used for
demand forecasting, inventory management, and process optimization.
• Education: Analyzing student performance data, learning outcomes, and educational
trends to improve teaching methods and curriculum design. R can be used for educational
research, student assessment, and program evaluation.

Figure 2: Applications of Data Analytics and R

Advantages of using Data Analytics and R

Using data analytics and R offers several advantages:


• Data Management: R provides tools to import, clean, and manipulate data, making it
easier to prepare data for analysis.
• Statistical Analysis: R has a vast array of packages for statistical analysis, making it a
powerful tool for exploring data and deriving insights.
• Visualization: R has excellent data visualization capabilities, allowing you to create
high-quality graphs and plots to communicate your findings effectively.
• Reproducibility: R scripts can be saved and rerun, ensuring that analyses are
reproducible and making collaboration easier.
• Community Support: R has a large and active user community, so you can find help and
resources easily.
• Integration: R can easily integrate with other languages and tools, making it versatile for
different analytical needs.
• Scalability: R can handle complex analyses and datasets up to the size of available memory,
making it suitable for a wide range of applications.
• Cost-Effective: R is open-source and free to use, making it a cost-effective option for
data analysis.
• Strong statistical foundation: R has built-in support for linear and nonlinear modeling, time
series analysis, and hypothesis testing.

Figure 3: Advantages of using R

Our Survey

Online spaces that users utilise to interact, share, communicate, and create or maintain connections
with others, whether for academic purposes, entertainment, or sociability, are referred to as social
networking spaces. The popularity of social networking as a communication tool is rapidly growing,
largely due to the successful expansion of mobile device applications. Young adults in particular are
becoming accustomed to chatting about their hobbies, keeping in touch with teachers, friends,
and family online, and sharing details of their day-to-day experiences.
Social media allows individuals to stay in tune with friends and relatives. Some people use
various social media applications to network and find career opportunities, connect with people
across the world who have like-minded interests, and share their own thoughts, feelings, and
insights online. Social media also has various educational uses.
To examine social networking usage and its users' behaviour, a reliable and valid questionnaire
needs to be developed. The purpose of this study is therefore to bridge this gap and validate the
developed questionnaire regarding its psychometric properties by specifying its accuracy and
consistency of measurement.
This study examines social media usage and users' behaviour. Our aim is to discover people's
favourite social media platforms, their preferred time of use, how they want to see content, how
satisfied they are with privacy policies, and their overall behaviour toward social media
websites.
We designed a survey to understand people's behaviour on social media websites, which plays a
major role in purchasing products or services online. The first section asks for information about
the respondent's background (e.g., name, age, gender, occupation). The second section includes
questions on how many accounts they have on different social media websites, which device
they use to view social media sites, what time they are most active on social media, how satisfied
they are with social media privacy policies, whether privacy policies affect the way they post on
social media, and their purpose for using social media. For some questions, the response options
are coded as 0 = never, 1 = occasionally, 2 = often, and 3 = very often. We analysed the data using
R, carrying out a linear regression, a multiple regression, and a time-series analysis, and drawing
a pie chart.
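
The 0 to 3 response coding can be handled in R as an ordered factor, which keeps the labels attached to the numbers and makes frequency tables readable. The following is only a minimal sketch: the vector of codes is made up for illustration and is not taken from the survey responses.

```r
# Made-up response codes for illustration only (not the survey data)
codes <- c(0, 2, 3, 1, 2, 2, 0, 3)

# Attach the labels from the questionnaire to the numeric codes
freq <- factor(codes,
               levels  = 0:3,
               labels  = c("never", "occasionally", "often", "very often"),
               ordered = TRUE)

# Frequency table of the labelled responses
counts <- table(freq)
counts

# Because the factor is ordered, comparisons respect the scale
n_often_or_more <- sum(freq >= "often")
```

Keeping the original numeric codes in a separate column is still useful when the values are needed as numbers, for example as a regression variable.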

Responses of Survey

Analysis of Data

After linking the survey responses to an Excel sheet, we imported the data into RStudio for
analysis. Using R's data manipulation and statistical analysis tools, we were able to gain deeper
insights into social media usage based on the responses.
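
The long, dot-separated column names that appear in the analysis code come from the way read.csv() imports headers: by default it converts them to syntactically valid R names, replacing spaces and punctuation with dots. A small sketch of this behaviour follows. The inline CSV is made up, and the header wording is an assumption reconstructed from the column name used in this report.

```r
# Made-up CSV text; the header wording is an assumption reconstructed
# from the column name used in this report
csv_text <- 'Age,"How many hours do you spend on social media sites?"
21,3
19,5'

# Default import: headers are converted to syntactically valid names
d <- read.csv(text = csv_text)
colnames(d)

# check.names = FALSE keeps the headers exactly as written in the file
d_raw <- read.csv(text = csv_text, check.names = FALSE)
colnames(d_raw)

# With the converted names, a column can be accessed with $
hours <- d$How.many.hours.do.you.spend.on.social.media.sites.
```

Calling colnames() right after the import, as the analysis code does, is the simplest way to see which names R has actually assigned.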

Analysis 1- Linear Regression Line


This regression line shows the relationship between the age of the respondents and the number of
hours they spend on social media sites.
In R, a linear regression line is a straight line that best fits a set of data points. It is used to model
the relationship between a dependent variable (Y) and an independent variable (X).
The linear regression line is represented by the equation Y = a + bX, where a is the intercept and
b is the slope. Here Y is the number of hours and X is the age.
Code:
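# Step 1: Read the survey responses from file
ads <- read.csv("C:/my assignments/Semester 6/Data Analytics and R/jury response.csv")

# Step 2: Inspect the data
View(ads)      # opens the data viewer in RStudio
nrow(ads)      # number of responses
ncol(ads)      # number of columns
colnames(ads)  # column names as imported by read.csv

# Step 3: Extract the two variables
Age <- ads$Age
hours <- ads$How.many.hours.do.you.spend.on.social.media.sites.

# Step 4: Scatter plot of hours against Age (default, then styled)
plot(Age, hours)
plot(Age, hours, pch = 16, cex = 1, col = 'blue',
     main = 'Age vs hours', xlab = 'Age', ylab = 'hours')

# Step 5: Fit the model hours = a + b * Age and inspect it
model <- lm(hours ~ Age)
summary(model)
attributes(model)
coefficients(model)

# Step 6: Add the fitted line to the scatter plot
abline(model)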
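
To make the equation Y = a + bX concrete, the sketch below computes the intercept and slope by hand from the least-squares formulas and compares them with the values returned by lm(). The data are made up for illustration and are not the survey responses, so the numbers say nothing about the actual results.

```r
# Made-up data for illustration only (not the survey responses)
age   <- c(18, 19, 21, 22, 24, 27, 30, 35)
hours <- c(6, 5, 5, 4, 4, 3, 3, 2)

# Least-squares estimates computed by hand
b <- cov(age, hours) / var(age)      # slope
a <- mean(hours) - b * mean(age)     # intercept

# The same estimates from lm()
fit <- lm(hours ~ age)
coef(fit)

# Predicted hours for a hypothetical 25-year-old: a + b * 25
pred <- predict(fit, newdata = data.frame(age = 25))

# Share of the variation in hours explained by age
r2 <- summary(fit)$r.squared
```

In the output of summary(), the Estimate column holds a and b, and Multiple R-squared is the value stored in r2 here.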

Figure 4: Linear regression line

Figure 5: Output of linear regression line 1

Figure 6: Output of linear regression line 2

Analysis 2- Multiple Regression Line
This multiple regression models the age of the respondents as a function of the hours they spend
on social media and the number of years they have been using social networking.

In R, a multiple regression is a statistical model that allows us to analyze the relationship
between multiple independent variables and a dependent variable.
Code:
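# Step 1: Read the survey responses from file
ads <- read.csv("C:/my assignments/Semester 6/Data Analytics and R/jury response.csv")

# Step 2: Extract the three variables and combine them in one data frame
Age <- ads$Age
hours <- ads$How.many.hours.do.you.spend.on.social.media.sites.
Years <- ads$Foe.how.many.years.have.you.been.using.social.networking.
survey.data <- data.frame(Age, hours, Years)

# Step 3: Scatter-plot matrix of every pair of variables (default, then coloured)
plot(survey.data)
plot(survey.data, col = 'red')

# Step 4: Fit Age as a function of hours and Years
multiple.regression <- lm(Age ~ hours + Years, data = survey.data)
summary(multiple.regression)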
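
As a rough illustration of what lm() estimates when there are two predictors, the sketch below fits the same kind of model on made-up data and reproduces the coefficients from the normal equations. The data frame d and its values are hypothetical and are not the survey responses.

```r
# Made-up data for illustration only (not the survey responses)
d <- data.frame(
  age   = c(18, 19, 21, 22, 24, 27, 30, 35),
  hours = c(6, 5, 5, 4, 4, 3, 3, 2),
  years = c(3, 4, 5, 5, 8, 9, 12, 15)
)

fit <- lm(age ~ hours + years, data = d)

# Design matrix: a column of ones (intercept) plus one column per predictor
X <- model.matrix(fit)

# Normal equations: solve (X'X) beta = X'y
beta <- solve(crossprod(X), crossprod(X, d$age))

# Side-by-side comparison of the two sets of estimates
cbind(lm = coef(fit), normal_equations = as.vector(beta))

# Each slope is the expected change in age for a one-unit change in that
# predictor, with the other predictor held fixed
new_resp <- data.frame(hours = 4, years = 6)
pred <- predict(fit, newdata = new_resp)
```

lm() itself uses a QR decomposition rather than the normal equations, so the two columns agree only up to small numerical differences.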

Figure 7: Multiple Regression

Figure 8: Multiple Regression Output

Analysis 3- Time-series
This time-series graph plots the age of the respondents against a quarterly time index running
from 2004 to 2023.

In R, a time series is a series of data points indexed in time order. In base R, time series data is
typically stored as a ts object: a numeric vector (or matrix) together with a start time, an end
time, and a frequency, which is the number of observations per unit of time. This index allows
for easy manipulation and analysis of temporal patterns.
Code:
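# Step 1: Read the survey responses from file
responses <- read.csv("C:/my assignments/Semester 6/Data Analytics and R/jury response.csv",
                      header = TRUE)

# Step 2: Inspect the data
View(responses)
responses
class(responses)   # "data.frame"

# Step 3: Convert the Age column to a quarterly time series (2004 to 2023)
timets <- ts(responses$Age, start = 2004, end = 2023, frequency = 4)
timets
class(timets)      # "ts"

# Step 4: Plot the series (default, then styled)
plot(timets)
plot(timets, col = 'purple', main = 'Age vs Time')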
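
The way ts() builds its time index can be seen on a small example. The sketch below uses made-up values, not the survey responses. It also shows that when both start and end are supplied, the number of observations is fixed by the index, and ts() recycles or truncates the data to fit.

```r
# Made-up values for illustration only (not the survey responses)
values <- c(21, 22, 20, 23, 25, 24, 22, 21)

# Quarterly series starting in the first quarter of 2004;
# the end point follows from the number of values
q <- ts(values, start = c(2004, 1), frequency = 4)
tsp(q)     # start, end, frequency
time(q)    # the time index of each observation

# With both start and end given, the length is set by the index:
# (2023 - 2004) * 4 + 1 quarterly points, filled by recycling the values
fixed <- ts(values, start = 2004, end = 2023, frequency = 4)
n_fixed <- length(fixed)
```

Supplying only start and frequency lets the series end wherever the data end, which avoids recycling.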

Figure 9: Time-series graph

Figure 10: Time-series Output

Analysis 4- Pie chart
This pie chart shows why the respondents use social networking sites.
In R, a pie chart is a circular statistical graphic divided into slices to illustrate numerical
proportions. Each slice represents a proportion of the whole, and the size of each slice is
proportional to the quantity it represents.
Code:
# Counts taken from the survey responses, one per purpose
x <- c(47, 20, 21, 27, 43, 27, 17)
labels <- c("contact and connect with family",
            "raise awareness",
            "to feel a sense of belonging",
            "to create content",
            "sharing or liking posts",
            "keep up with news and trends",
            "help with studies")

# Basic pie chart, then the same chart with a title and colours
pie(x, labels)
pie(x, labels, main = "Main purpose of using social networking sites", col = rainbow(length(x)))
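
Since each slice is proportional to its count, it can help to print the proportions on the chart. The sketch below reuses the counts from the code above and expresses each as a percentage of the total of the counts. The shortened labels are abbreviations chosen here for readability and are not part of the original survey.

```r
# Counts from the report's pie-chart code
x <- c(47, 20, 21, 27, 43, 27, 17)

# Shortened labels (abbreviations chosen here for readability)
short <- c("family", "awareness", "belonging", "content",
           "sharing/liking", "news/trends", "studies")

# Each count as a percentage of the total of the counts
pct <- round(100 * x / sum(x), 1)

# Append the percentage to each label and draw the chart
pie(x, labels = paste0(short, " (", pct, "%)"),
    main = "Main purpose of using social networking sites",
    col = rainbow(length(x)))
```

Because the percentages are rounded individually, they may not add up to exactly 100.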

Figure 11: Pie Chart

Figure 12: Pie-chart output

Conclusion

Using data analytics, particularly with tools like R, to analyze a survey on the impact of social
media usage and behavior can provide valuable insights. R's flexibility in data manipulation and
visualization makes it well-suited for this task. Through statistical analysis, patterns and trends in
social media usage can be identified, such as the frequency of usage, preferred platforms, and the
impact on mental health or social interactions.
Moreover, R can help in identifying correlations between social media behavior and other
variables, such as age, gender, or personality traits. This can lead to a deeper understanding of
how different demographic groups engage with social media and how it influences their
behavior.
