Statistics With R
• Data is the new oil. – Clive Humby (the famous phrase was later embraced by
the World Economic Forum in 2011)
• Data is the new oil. We need to find it, extract it, distribute it and monetize it. –
David Buckingham
• The world’s most valuable resource is no longer oil, but data. –The Economist,
May 2017
• The data is not only new oil, but also new soil. –Mukesh Ambani (Hindustan
Times Leadership Summit, 2017)
• Fundamentals of R
• Overview of the language
• Input and Output of Data
• Operators in R
• Variables in R
History
• When you start the R system, the main window (R GUI) appears with a
sub-window (R Console).
• In the ‘Console’ window, the cursor is waiting for you to type in some R
commands.
R Introduction
• These objects can then be used in other calculations. To print an object, just
enter its name. There are some restrictions when naming an object:
• Object names cannot contain ‘strange’ symbols like !, +, -, #.
• A dot (.) and an underscore (_) are allowed. A name may start with a dot, but not with an underscore.
• Object names can contain a number but cannot start with a number.
• R is case sensitive: X and x are two different objects, as are temp and temP.
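The naming rules above can be tried directly in the console. This is a minimal sketch. The object names other than temp and temP are made up for illustration, and `<-` is used for assignment (`=` also works at the top level).

```r
# Case sensitivity: temp and temP are two separate objects
temp <- 20
temP <- 25
temp + temP            # objects can be used in other calculations

# Dots and underscores are allowed inside a name
my.value <- 1
my_value <- 2

# A name may start with a dot (such objects are hidden from a plain ls())
.hidden <- 3

# A name may contain a number, but cannot start with one
value2 <- 4
# 2value <- 5          # uncommenting this line gives a syntax error

# Entering the name of an object prints it
value2
```

Note that a name starting with a dot followed by a digit (for example .2x) is also rejected, because R reads it as a number.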
A few important points:
1. Logical
• a1=TRUE or a1=T
• a2=FALSE or a2=F
2. Numeric
• a3=4
• a4=7.6
3. Integer
• a5=10L (the L suffix makes the value an integer; a plain 10 is stored as numeric)
4. Complex
• z=3+4i or z=complex(real=3,imaginary=4)
5. Character or String
• char="Hello, how are you?"
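The five basic types listed above can be checked with class(). The sketch below reuses the object names from the list. The only assumption is that the integer is written with the L suffix, since a bare 10 is stored as numeric.

```r
a1   <- TRUE                     # logical (T is shorthand for TRUE)
a4   <- 7.6                      # numeric
a5   <- 10L                      # integer: note the L suffix
z    <- 3 + 4i                   # complex, same as complex(real = 3, imaginary = 4)
char <- "Hello, how are you?"    # character string (use straight quotes)

class(a1)    # "logical"
class(a4)    # "numeric"
class(a5)    # "integer"
class(z)     # "complex"
class(char)  # "character"

class(10)    # "numeric": without the L suffix, 10 is not an integer
```

Writing TRUE and FALSE in full is generally safer than T and F, because T and F are ordinary variables that can be overwritten.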
Types of Operators:
• R ships with some built-in datasets in the package named ‘datasets’.
• data(), command to see the list of these built-in datasets.
• View(iris), command to view a dataset named ‘iris’.
• dim(iris), command to see the dimensions of the dataset.
• names(iris), command to see the names of all the columns in the dataset.
• summary(iris), to summarise a particular dataset.
• mean(trees$Height), command to view the mean value of the ‘Height’ column in
the ‘trees’ dataset.
• median(trees$Height), command to view the median value.
• plot(trees$Height,trees$Volume), command to create a scatterplot.
• hist(trees$Height), command to create a histogram.
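The commands above can be run together as a short session. This sketch uses only the built-in iris and trees datasets. head() stands in for View(), which needs an interactive viewer, and the axis labels are added for readability.

```r
# Inspect the iris dataset
dim(iris)        # number of rows and columns
names(iris)      # column names
head(iris)       # first six rows (non-interactive alternative to View)
summary(iris)    # per-column summary statistics

# Summary statistics for one column of the trees dataset
mean(trees$Height)
median(trees$Height)

# Scatterplot of Volume against Height, then a histogram of Height
plot(trees$Height, trees$Volume, xlab = "Height", ylab = "Volume")
hist(trees$Height, xlab = "Height", main = "Histogram of tree height")
```

The `$` operator selects a single column from a dataset, so trees$Height is the vector of heights that mean(), median() and hist() work on.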
Packages