Big Data

By: Amit Shankar


Data Science is not R.
Data Science is not Python.
Data Science is not SQL.
Data Science is not Excel.
Data Science is not SAS.
Data Science is not Statistics.
Data Science is not Experiments.
Data Science is not Tableau.
Data Science is not Visualisation.
Data Science is not Spark.
Data Science is not TensorFlow.

Data Science is using the above tools and techniques, and if required inventing new tools
and techniques, to solve a problem using “data” in a “scientific” way.
Data Science tools
Types of data analytics
Statistics Vs Machine learning
Introduction to R
History

● R was created by Ross Ihaka and Robert Gentleman at the University of Auckland.
● First appeared: August 1993
● R and its libraries implement a wide variety of statistical and graphical techniques, including linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, and others.
What is R
• R is an environment for data manipulation, statistical computing, data analysis and data visualization.
• Effective data handling and storage of output.
• Supports both simple and complex data analysis.
• Has its own programming language.
• Similar to the S language (R is an implementation of S; S-PLUS is a commercial implementation).
• More than 10,000 contributed packages.
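As a small illustration of the points above, the sketch below handles data, computes a summary statistic and fits a simple linear model in a few lines. It assumes the built-in `cars` data set (speed and stopping distance) that ships with R; the variable names are chosen only for this example.

```r
# Data handling: the built-in 'cars' data frame has columns speed and dist
data(cars)
n_rows <- nrow(cars)

# Statistical computing: a simple summary statistic
mean_speed <- mean(cars$speed)

# Data analysis: fit a linear model of stopping distance on speed
fit <- lm(dist ~ speed, data = cars)
slope <- unname(coef(fit)["speed"])

print(n_rows)
print(mean_speed)
print(slope)
```

Calling `summary(fit)` afterwards prints the full regression table.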
Why R?
• No cost
• Statistical computing environment
• Open source
• Easy language
• Code can be saved, rerun and stored as scripts
• Available for all major platforms (Windows, macOS, Linux)
• Built-in and contributed packages are available. Users can create their own
packages
• Interpreted language, not a compiled one
• Error indication
• Graphics can be saved in different formats (e.g. PDF, PNG, JPEG)
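The last point can be sketched as follows: open a graphics device, draw the plot, then close the device so the file is written. This is a minimal example that assumes the built-in `cars` data set and writes to a temporary directory; the file name `scatter_example.pdf` is only an illustration.

```r
# Choose an output file in the session's temporary directory
out_file <- file.path(tempdir(), "scatter_example.pdf")

# Open a PDF graphics device (width and height are in inches)
pdf(file = out_file, width = 6, height = 4)

# Draw the plot; it goes to the file instead of the screen
plot(cars$speed, cars$dist,
     xlab = "Speed", ylab = "Stopping distance",
     main = "cars data set")

# Close the device so the file is actually written
invisible(dev.off())

print(file.exists(out_file))
```

Other formats work the same way: replace `pdf()` with `png()` or `jpeg()`, whose `width` and `height` are in pixels by default.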
Library in R
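• In R, a package is a collection of R functions, data and compiled
code. The location where the packages are stored is called the library.
• Packages shipped with R: base and recommended packages (e.g. MASS, mgcv)
• Special-purpose packages, loaded when needed
library(spatial)          # load the spatial package
library(help = spatial)   # list the contents of the spatial package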
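To see the difference between the library (the storage location) and a package (what gets loaded), a minimal sketch follows. It uses `splines`, one of the packages that ships with every R installation, as a stand-in for `spatial`, which may not be installed everywhere.

```r
# The library: directories where installed packages are stored
lib_dirs <- .libPaths()
print(lib_dirs)

# Load a package from the library into the current session
library(splines)

# The package now appears on the search path
attached <- "package:splines" %in% search()
print(attached)

# List some of the functions the package provides
print(head(ls("package:splines")))
```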
Packages in R
• install.packages("rmeta")
• install.packages("xlsx")
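A common pattern is to install a package only when it is missing and then load it. The helper below, `ensure_package()`, is a hypothetical name introduced for this sketch; it assumes a CRAN mirror is configured if an installation is actually needed.

```r
# Hypothetical helper: install a package only if it is not already available,
# then attach it to the session
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs internet access and a CRAN mirror
  }
  library(pkg, character.only = TRUE)
  invisible(pkg %in% loadedNamespaces())
}

# 'stats' ships with R, so nothing is downloaded here
ok <- ensure_package("stats")
print(ok)
```

The same call with `"rmeta"` or `"xlsx"` would download the package from CRAN on first use.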
Help in R
• Help menu
• Web search (Google)
• ?mean
• help.search("data input")
• help() / help.start()
• find("lowess")
• apropos("lm")
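The lookup functions above return ordinary R values, so their results can be stored and inspected. A minimal sketch, assuming a default R session in which the `stats` package is attached:

```r
# apropos() returns the names of objects matching a pattern
lm_names <- apropos("^lm")
print(lm_names)

# find() reports which attached package provides an object
lowess_home <- find("lowess")
print(lowess_home)
```

`?mean` and `help("mean")` open the same help page; `help.search("data input")` searches the installed documentation when the exact function name is not known.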
Example and demonstration
• example(lm)
• demo(persp)
• demo(graphics)
Quit
• q() or quit() (R is case-sensitive, so Q() does not work)
Command line and Script
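• Command line: type a command in the console and press Enter to run it
• Script: write commands in a script file and run the current line or selection with Ctrl+R (R GUI on Windows; RStudio uses Ctrl+Enter)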
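A whole script can also be run at once with `source()`. The sketch below writes a two-line script to a temporary file and runs it; the file name `first_script.R` and the separate environment `env` are assumptions made only to keep the example self-contained.

```r
# Write a small script to a temporary file
script_path <- file.path(tempdir(), "first_script.R")
writeLines(c("x <- c(2, 4, 6, 8)",
             "m <- mean(x)"),
           script_path)

# Run every line of the script; results are stored in 'env'
env <- new.env()
source(script_path, local = env)

print(env$m)
```

In everyday use, `source("my_script.R")` without the `local` argument runs the script in the current workspace.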
Introduction to RStudio
• Interface between R and the user
• Easy for beginners
• Help in coding
• Suggestions
• 4 panes:
Script
Console
Environment
Output
