Exercise - Commands in Blue, Comments in Green, Outputs in Black
Depending on the format of a data set, you may need to select a specific column before
calculating the standard deviation, mean, etc.:
> x = somedataset[, 1] # use your own data set's name here; this selects just the first column
# indexing is [row, column]
> sd(x)
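As a concrete instance of the column selection above, the following minimal sketch uses the built-in mtcars data frame in place of somedataset. The choice of mtcars and the names first_col and first_row are assumptions made for illustration; they are not part of the exercise.

```r
# Illustration only: mtcars stands in for 'somedataset'
first_col <- mtcars[, 1]   # all rows, first column (mpg); indexing is [row, column]
first_row <- mtcars[1, ]   # for contrast: first row, all columns
sd(first_col)              # standard deviation of the first column
mean(first_col)            # mean of the same column
```

Leaving the row position empty keeps every row, which is why the comma comes before the 1.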
R, and therefore RStudio, comes with many preloaded datasets. To browse them, use the following:
> data() # a list opens that you can scroll through; if you hover over a
# dataset, a description will appear
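The same list can also be read programmatically rather than by scrolling. This is a minimal sketch, assuming the datasets package that ships with base R; the variable names are illustrative.

```r
# data() returns an object whose $results matrix has one row per dataset
available <- data(package = "datasets")$results
dataset_names <- available[, "Item"]     # dataset names
dataset_titles <- available[, "Title"]   # one-line descriptions
head(dataset_names)
# Look up the description of a single dataset
dataset_titles[dataset_names == "AirPassengers"]
```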
> data("AirPassengers")
> View(AirPassengers) # opens the dataset so you can scroll through its contents;
# it appears at the top left of RStudio
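Before computing statistics it can help to check what kind of object was loaded. AirPassengers is a monthly time series (class ts) rather than a data frame, so the functions that follow can be applied to it directly, without selecting a column first. A short sketch using base R functions only:

```r
data("AirPassengers")
class(AirPassengers)       # "ts": a time series, not a data frame
length(AirPassengers)      # number of monthly observations
frequency(AirPassengers)   # observations per year
start(AirPassengers)       # first year and month
end(AirPassengers)         # last year and month
```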
> mean(AirPassengers)
[1] 280.2986
> median(AirPassengers)
[1] 265.5
> min(AirPassengers)
[1] 104
> max(AirPassengers)
[1] 622
> sd(AirPassengers)
[1] 119.9663
> quantile(AirPassengers)
   0%   25%   50%   75%  100%
104.0 180.0 265.5 360.5 622.0
> summary(AirPassengers)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  104.0   180.0   265.5   280.3   360.5   622.0
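The quartiles reported above also give the interquartile range, a measure of spread that is often read alongside the standard deviation. A minimal sketch; the names q and iqr_manual are illustrative:

```r
q <- quantile(AirPassengers)                   # 0%, 25%, 50%, 75%, 100%
iqr_manual <- unname(q["75%"] - q["25%"])      # third quartile minus first quartile
iqr_manual                                     # 360.5 - 180.0 = 180.5
IQR(AirPassengers)                             # built-in equivalent
quantile(AirPassengers, probs = c(0.1, 0.9))   # other percentiles via 'probs'
```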
> hist(AirPassengers)
> boxplot(AirPassengers)
> plot(AirPassengers)
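The three plotting calls above accept the usual base-graphics arguments for titles and axis labels. The sketch below adds illustrative label text (the wording of the titles is an assumption, not part of the exercise) and a per-month boxplot built with cycle(), which returns each observation's position within the year.

```r
hist(AirPassengers, main = "Histogram of AirPassengers", xlab = "Passengers")
boxplot(AirPassengers, main = "Boxplot of AirPassengers", ylab = "Passengers")
plot(AirPassengers, main = "AirPassengers over time", ylab = "Passengers")
# One box per calendar month (1 = January, ..., 12 = December)
month_index <- cycle(AirPassengers)
boxplot(AirPassengers ~ month_index, xlab = "Month", ylab = "Passengers")
```

Grouping by month separates the seasonal pattern from the overall spread that the single boxplot shows.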