R is a programming language developed by Ross Ihaka and Robert Gentleman in 1993. R offers an extensive catalog of statistical and graphical methods, including machine learning algorithms, linear regression, time series and statistical inference, to name a few. Most of the R libraries are written in R, but for heavy computational tasks, C, C++ and Fortran code is preferred.

R is trusted not only in academia: many large companies also use the R programming language, including Uber, Google, Airbnb and Facebook.

Data analysis with R is done in a series of steps: programming, transforming, discovering, modeling and communicating the results.
• Program: R is a clear and accessible programming tool.
• Transform: R is made up of a collection of libraries designed specifically for data science.
• Discover: Investigate the data, refine your hypothesis and analyze them.
• Model: R provides a wide array of tools to capture the right model for your data.
• Communicate: Integrate code, graphs, and outputs into a report with R Markdown, or build Shiny apps to share with the world.

What is R used for?
• Statistical inference
• Data analysis
• Machine learning algorithms

• The R language is an open source program maintained by the R core development team, a team of volunteer developers from across the globe.
• The R language is used for performing statistical operations.
• It is available from the R Project website, www.r-project.org.
• R is a command-line driven program.
• The user enters commands at the prompt (> by default) and each command is executed one at a time.
• Many routines have been written for R analytics by people all over the world and made freely available from the R Project website as packages.
• The basic installation (for Linux, Windows, or Mac) contains a powerful set of tools for most purposes.
• R is a consolidated environment for performing statistical operations and generating R data analysis reports in graphical or text formats.
• Commands entered in the console are evaluated and executed.
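The transform, discover and model steps listed above can be sketched in a few lines of base R. This is only an illustrative sketch: it assumes the built-in mtcars data set, and the column name wt_kg and the choice of a simple linear regression are assumptions made for the example, not part of the original notes.

```r
# Transform: derive a new column from the built-in mtcars data set.
cars <- mtcars
cars$wt_kg <- cars$wt * 1000 * 0.45359237  # wt is recorded in units of 1000 lbs

# Discover: summarise the variables of interest and check how they relate.
summary(cars[, c("mpg", "wt_kg")])
cor(cars$mpg, cars$wt_kg)

# Model: fit a simple linear regression of fuel economy on weight.
fit <- lm(mpg ~ wt_kg, data = cars)

# Communicate: print the rounded coefficients of the fitted model.
round(coef(fit), 4)
```

In a real analysis the communicate step would more likely be a plot or an R Markdown report; printing the coefficients keeps the sketch self-contained.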
ADVANTAGES AND TRADEOFFS OF USING R

• R is a programming language, mainly dealing with the statistical computation of data and graphical representations.
• R is often described as a different implementation of the S language, which was developed at Bell Laboratories by John Chambers and his colleagues. Many data science experts also compare it with its licensed contemporary tool, SAS.
• The various offerings of this tool include linear and non-linear modelling, classical statistical tests, time-series analysis, clustering and graphical representation.
• It can be referred to as an integrated suite of software facilities for data manipulation, calculation and data visualization.
• The R environment is a well-developed space for the R programming language, inclusive of user-defined recursive functions as well as input and output facilities.
• Although it is a relatively new data analytics tool in the IT sphere, it is already very popular amongst data enthusiasts.
• A number of advantages make this data analytics tool so popular amongst data scientists.

Firstly, it is among the most comprehensive statistical analysis packages available, which works in its favour. The tool strives to incorporate all of the standard statistical tests, models and analyses, and it provides an effective language for managing and manipulating data.

One of the biggest advantages of this tool is that it is entirely open source. This means that it can be downloaded very easily and is free of cost. This is also the main reason why communities have formed around it that strive to develop the various aspects of the tool.
Currently, there are about 19 developers, including practising professionals from the IT industry, who help in refining this software. This is also the reason why many of the latest technological developments arrive in this software before they are seen anywhere else.

Advantages of R Programming Language:
• R is a comprehensive statistical analysis package, and new technology and ideas often appear first in R.
• R is open-source software, so anyone can use and change it.
• We can run R anywhere and at any time, and even sell it under the conditions of the license.
• R is good for GNU/Linux and Microsoft Windows. R is cross-platform and runs on many operating systems.
• In R, anyone is welcome to provide bug fixes, code enhancements, and new packages.

Disadvantages of R Programming Language:
• In R, the quality of some packages is less than perfect.
• In R, there is no one to complain to if something doesn't work. R is a software application that many people devote their own time to developing.
• R commands give little thought to memory management, and so R can consume all available memory.

R ENVIRONMENT

The name R is used to describe both the R language and the R software environment that is used to run code written in the language. The R software can be run on Windows, Mac OS X, and Linux.

COMMAND LINE INTERFACE

The command line
• The R command line interface consists of a prompt, usually the > character.
• We type code written in the R language and, when we press Enter, the code is run and the result is printed out.
• A very simple interaction with the command line looks like this:

    > 1 + 3 + 5 + 7
    [1] 16

Examples of R code will be displayed like this, with the R code preceded by a prompt, >, and the results of the code (if any) displayed below the code.
The format of the displayed result will vary because there can be many different kinds of results from running R code.

• One way to write R code is simply to enter it interactively at the command line as shown above. This interactivity is beneficial for experimenting with R or for exploring a data set in a casual manner.
• A superior approach in general is to write R code in a file and get R to read the code from the file.

cut-and-paste
One way to work is to write R code in a text editor and then cut-and-paste bits of the code from the text editor into R. Some editors can be associated with an R session and allow submission of code chunks via a single key stroke (e.g., the Windows GUI provides a script editor with this facility).

source()
Another option is to read an entire file of R code into R using the source() function. For example, if we have a file called code.R containing R code, then we can run the R code by typing the following at the R command line:

    > source("code.R")

R reads the code from the file and runs it, one expression at a time.

RSTUDIO

RStudio is an integrated development environment (IDE) for the R language. It is a code editor and development environment, with some nice features that make code development in R easy and fun.

Features of RStudio
• Code highlighting that gives different colors to keywords and variables, making it easier to read
• Automatic bracket matching
• Code completion, so as to reduce the effort of typing the commands in full
• Easy access to R Help, with additional features for exploring functions and parameters of functions
• Easy exploration of variables and values

RStudio is available free of charge for Linux, Windows, and Mac devices, which makes it a good option to use with R. To open RStudio, click the RStudio icon in the menu system or on the desktop.
Components of RStudio
• Source: The top left corner of the screen contains a text editor that lets the user work with source script files. Multiple lines of code can also be entered here. Users can save R script files to disk and perform other tasks on the script.
• Console: The bottom left corner is the R console window. The console in RStudio is identical to the console in RGui. All the interactive work of R programming is performed in this window.
• Workspace and History: The top right corner is the R workspace and history window. This provides an overview of the workspace, where the variables created in the session along with their values can be inspected. This is also the area where the user can see a history of the commands issued in R.
• Files, Plots, Packages, and Help: The bottom right corner gives access to the following tools:
  • Files: This is where the user can browse folders and files on a computer.
  • Plots: This is where R displays the user's plots.
  • Packages: This is where the user can view a list of all the installed packages.
  • Help: This is where you can browse the built-in Help system of R.
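Returning to the source() function described in the command line section, the following is a minimal sketch of running a script file from the console. It assumes a throwaway script written to a temporary file as a stand-in for code.R; the variable names x and total are made up for the example.

```r
# Write a small script to a temporary file (a stand-in for code.R).
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(1, 3, 5, 7)",
  "total <- sum(x)"
), script_path)

# Run the whole file; the objects it creates appear in the workspace.
source(script_path)
total

# echo = TRUE prints each expression as it is evaluated, much like the console.
source(script_path, echo = TRUE)
```

In RStudio the same effect is usually achieved from the Source pane, but calling source() directly works in any R console.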
GETTING STARTED WITH R: INSTALLATION AND ORIENTATION

R & RStudio: Installation & Orientation
We need to be able to perform calculations quickly and easily. A few calculations can be done by hand, but most require a computer. R is available for Mac, Windows, and Linux. It is also free, which is a great advantage.

Installation

Base R
To use R, we first have to install it. R is an open source project, and the latest version is available for download at "The Comprehensive R Archive Network" (CRAN): http://cran.r-project.org

RStudio
Once we have downloaded and installed R, we could be done and start right away using R and the basic interface that it comes with, but we find that interface to be a bit limited. A better interface comes from a separate open source project: RStudio. The current stable version of RStudio can be downloaded at http://rstudio.org/download/desktop.

Orientation

The Console
On the left is the console. This is the main way that you interact with R, issuing commands and reading the results that they produce. Example using simple math:

    2 + 2
    10 * 5 - 4 / 0.5

Installation guide for R and RStudio

Step 1: Install R
1. Download the R installer from https://cran.r-project.org/
2. Run the installer. Default settings are fine. If you do not have admin rights on your laptop, then ask your local IT support. In that case, it is important that you also ask them to give you full permissions to the R directories. Without this, you will not be able to install additional packages.

Step 2: Install RStudio
1. Download RStudio: https://www.rstudio.com/products/rstudio/download/
2. Once the installation of R has completed successfully (and not before), run the RStudio installer.
3. If you do not have administrative rights on your laptop, step 2 may fail. Ask your IT support or download a pre-built zip archive of RStudio which doesn't need installing. The link for this is towards the bottom of the download page, highlighted in Image 2.
   a. Download the appropriate archive for your system (Windows/Linux only; the Mac version can be installed into your personal "Applications" folder without admin rights).
   b. Double clicking on the zip archive should automatically unpack it on most Windows machines.

Step 3: Check that R and RStudio are working
1. Open RStudio. It should open a window that looks similar to Image 3 below.
2. In the left hand window, by the '>' sign, type '4+5' (without the quotes) and hit enter. An output line reading '[1] 9' should appear. This means that R and RStudio are working.
3. If this is not successful, contact us or your local IT support for further advice.

Step 4: Install R packages required for the workshop
1. Click on the tab 'Packages' then 'Install' as shown in Image 4. Or Tools -> Install packages.
2. Install the following packages: mixOmics version 6.1.0, mvtnorm, RColorBrewer, corrplot, igraph (see Image 4). For Apple Mac users, if you are unable to install the mixOmics imported library rgl, you will need to install the XQuartz software first: www.xquartz.org/
3. Check that the packages are installed by typing 'library(mixOmics)' (without the quotes) in the prompt and press enter (see Image 5).
4. Then type 'sessionInfo()' and check that mixOmics version 6.1.0 has been installed (see Image 5).

Figure 4: Click on Install to install R packages.
Figure 5: Specify the list of packages to be installed.

R PACKAGES

What is a package in R programming?
A package is a set of R functions and data sets, and the library is a folder on your computer which stores the files for those package(s).

Installing Packages
While the basic R program can do an immense amount of very powerful statistical work, some of the real power of the system comes through the packages that have been written for it. Most packages can be found at CRAN (The Comprehensive R Archive Network), but we don't actually need to go to the website to install them. The main method of installation is quite simple.
Assuming we know the name of the package, we can use the install.packages() command in the console. Installing the plyr package would be done as follows:

    install.packages("plyr")

An alternative method in RStudio is to go to the "Packages" tab and click on the "Install Packages" button at the top of the table of currently installed packages. This will pop up a window in which we can type the names of the package(s) we want to install. For now, try installing the plyr, ggplot2, and knitr packages. We will be using those (and some others) later in the semester. Make sure that the "Install dependencies" checkbox is checked, as this will allow R to automatically install the other packages that those packages depend on.

Loading Packages in R
To load a package that is already installed on our system, we call the library function. It is widely agreed that naming this function library was a mistake, and that calling it load_package would have saved a lot of confusion. But since the function has existed for so long, it is difficult to rename.

If we have a standard edition of R (i.e., we have not built some customized version from source), the package named 'lattice' should already be installed, but it won't automatically get loaded. We can load it using the library function:

    library(lattice)

We can now use all the functions included in the lattice package. Note that the name of the package can be passed to library without being enclosed in quotes. If you want to programmatically pass the name of a package to library, then you need to set the argument 'character.only' to TRUE. This is mildly useful if you have a large number of packages to load:

    pkgs <- c("lattice", "rpart")
    for (pkg in pkgs) library(pkg, character.only = TRUE)

List of all the packages installed:

    library()
When we execute the above code snippet, it produces output like the following, which may vary based on the local settings of your system.

    Packages in library 'C:/Program Files/R/R-3.2.2/library':

    base        The R Base Package
    boot        Bootstrap Functions (Originally by Angelo Canty for S)
    class       Functions for Classification
    cluster     "Finding Groups in Data": Cluster Analysis Extended Rousseeuw et al.
    codetools   Code Analysis Tools for R
    compiler    The R Compiler Package

Installing any New Package in R
There are two ways to add new packages to R. The first is to install the package directly from the CRAN repository, like this:

    install.packages("Package Name")

    # Install the package named "XML".
    install.packages("XML")

The second is to download the package to your local machine and then install it manually.

The Search Path
We can see the packages which are loaded by means of the search function:

    > search()
    ## [1] ".GlobalEnv"        "package:stats"     "package:graphics"
    ## [4] "package:grDevices" "package:utils"     "package:datasets"
    ## [7] "package:methods"   "Autoloads"         "package:base"

This list shows the order of places in which R will look when trying to find any variable. The global environment always comes first, followed by the most recently loaded packages.

MAINTAINING PACKAGES IN R

After your packages are installed, you will frequently want to update them in order to have the latest versions. This is possible using update.packages. By default, the function will ask before updating each package.

    update.packages(ask = FALSE)  # this won't ask before updating each package

You may also want to delete a package. This is possible using remove.packages(). Example:

    remove.packages("zoo")
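The installing, loading and search path steps described in this section can be combined in a small helper. This is a hedged sketch rather than an established function: load_or_install is a hypothetical name introduced here, and the example uses the tools package only because it ships with every R installation, so nothing has to be downloaded.

```r
# Hypothetical helper: load a package by name, installing it first if needed.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs network access and a configured CRAN mirror
  }
  # character.only = TRUE lets us pass the package name as a string.
  library(pkg, character.only = TRUE)
}

# tools is part of every R installation, so no download is expected here.
load_or_install("tools")

# A newly attached package is placed right after .GlobalEnv on the search path.
search()[1:2]

# Report the installed version of a package.
packageVersion("tools")
```

The same helper could be called in a loop over a character vector of package names, in the spirit of the character.only example above.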
