
Interpretations of datasets discussed in lab classes

Pramendra Kumar Singh

Interpretations of datasets
1. melsyd
library(GGally)

Warning: package 'GGally' was built under R version 4.3.2

Loading required package: ggplot2

Warning: package 'ggplot2' was built under R version 4.3.2

Registered S3 method overwritten by 'GGally':
  method from
  +.gg   ggplot2

library(fpp2)

Warning: package 'fpp2' was built under R version 4.3.2

Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo

── Attaching packages ────────────────────────────────────────────── fpp2 2.5 ──
✔ forecast  8.21.1     ✔ expsmooth 2.3
✔ fma       2.5

Warning: package 'forecast' was built under R version 4.3.2

Warning: package 'fma' was built under R version 4.3.2

Warning: package 'expsmooth' was built under R version 4.3.2

help(melsyd)

starting httpd help server ... done

data(melsyd)
head(melsyd)

Time Series:
Start = c(1987, 26)
End = c(1987, 31)
Frequency = 52
         First.Class Business.Class Economy.Class
1987.481       1.912             NA        20.167
1987.500       1.848             NA        20.161
1987.519       1.856             NA        19.993
1987.538       2.142             NA        20.986
1987.558       2.118             NA        20.497
1987.577       2.048             NA        20.770

autoplot(melsyd[,"Economy.Class"]) +
ggtitle("Economy class passengers: Melbourne-Sydney") +
xlab("Year") +
ylab("Thousands")

The plot reveals the following details:


• There are some missing observations around the second half of 1987.

• There was a period in 1989 when no passengers were carried. This may be due to an
industrial dispute.

• There was a period of reduced load in 1992. This coincides with a trial in which
some economy class seats were replaced by business class seats.

• A large increase in passenger load occurred in the second half of 1991. This may be
due to the Australian government voting to repeal the Airlines Agreement Act with
effect from October 1991. This government deregulation resulted in dramatic growth
for the industry and easier operations. [1]

• There are some large dips in load around the start of each year. These can be due to
holiday effects.

• There is a long-term fluctuation in the level of the series which increases during
1987, decreases in 1989, and increases again through 1990 and 1991.
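
The missing observations and the zero-load period noted above can also be located programmatically rather than read off the plot. The following is a minimal sketch in base R and is not part of the original analysis: find_runs() is a hypothetical helper, and the weekly series is synthetic, standing in for melsyd[,"Economy.Class"].

```r
# Hypothetical helper (not from the original report): list runs of
# consecutive observations where `flag` is TRUE, with start/end times.
find_runs <- function(x, flag) {
  stopifnot(length(x) == length(flag))
  flag <- as.vector(flag)
  flag[is.na(flag)] <- FALSE
  r <- rle(flag)
  ends <- cumsum(r$lengths)
  starts <- ends - r$lengths + 1
  keep <- r$values
  tt <- as.numeric(time(x))
  data.frame(
    start = tt[starts[keep]],
    end   = tt[ends[keep]],
    n_obs = r$lengths[keep]
  )
}

# Synthetic weekly series (assumption: illustrative values only)
set.seed(1)
y <- ts(round(runif(104, 18, 25), 1), start = c(1987, 1), frequency = 52)
y[10:13] <- NA   # a gap of missing observations
y[60:66] <- 0    # a period with no passengers carried

missing_runs <- find_runs(y, is.na(y))
zero_runs    <- find_runs(y, !is.na(y) & y == 0)
print(missing_runs)
print(zero_runs)
```

Assuming fpp2 is loaded, the same helper could be applied to the real data, for example find_runs(melsyd[,"Economy.Class"], is.na(melsyd[,"Economy.Class"])).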

2. a10
autoplot(a10) +
ggtitle("Antidiabetic drug sales") +
ylab("$ million") +
xlab("Year")

• There is a clear and increasing trend in the sales.

• There is a strong seasonal pattern whose size increases as the level of the series
increases, so the variance of sales grows with the level of sales.

• There is a sudden drop at the start of each year. This may be caused by a
government subsidisation scheme that makes it cost-effective for patients to
stockpile drugs at the end of the calendar year.
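
A seasonal swing that grows with the level of the series is often handled by working on the log scale. The sketch below is an illustration only and was not part of the original lab work: it uses a synthetic monthly series (not a10) whose level is held constant within each year to keep the arithmetic transparent, and yearly_range() is a hypothetical helper.

```r
# Synthetic monthly series with a rising level and multiplicative
# seasonality (assumption: illustrative stand-in for a10).
n_years <- 10
month_effect <- 1 + 0.3 * sin(2 * pi * (1:12) / 12)
level <- rep(seq(5, 25, length.out = n_years), each = 12)
y <- ts(level * rep(month_effect, n_years), start = c(1992, 1), frequency = 12)

# Hypothetical helper: within-year range (max - min) for each calendar year
yearly_range <- function(x) {
  yr <- floor(as.numeric(time(x)) + 1e-8)
  tapply(as.numeric(x), yr, function(v) diff(range(v)))
}

raw_range <- yearly_range(y)
log_range <- yearly_range(log(y))

# Ratio of the last year's seasonal swing to the first year's:
# a value near 1 means the swing no longer depends on the level.
raw_growth <- unname(raw_range[length(raw_range)] / raw_range[1])
log_growth <- unname(log_range[length(log_range)] / log_range[1])
print(c(raw = raw_growth, log = log_growth))
```

On the real data, autoplot(log(a10)) would be one way to check whether the seasonal swings look more even after the transformation.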
ggseasonplot(a10, polar=TRUE) +
ylab("$ million") +
ggtitle("Polar seasonal plot: antidiabetic drug sales")

• There was an unusually small number of sales in March 2008 (most other years
show an increase between February and March).

• The small number of sales in June 2008 is probably due to incomplete counting of
sales at the time the data was collected.
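
The February-to-March comparison read off the seasonal plot can also be tabulated year by year. This is a minimal sketch on a synthetic monthly series (illustrative values, not the a10 data); the variable names are assumptions introduced for the example.

```r
# Synthetic monthly series: four years with the same seasonal shape,
# rising from February to March in a typical year.
season <- c(10, 12, 19, 15, 16, 15, 17, 18, 17, 19, 20, 26)
y <- ts(rep(season, 4) + rep(c(0, 2, 4, 6), each = 12),
        start = c(2005, 1), frequency = 12)
y[39] <- y[38] - 6  # make March 2008 unusually low (obs 39 = Mar 2008)

values <- as.numeric(y)
month  <- as.integer(cycle(y))
year   <- floor(as.numeric(time(y)) + 1e-8)

# Change from February to March in each year
feb_to_mar <- values[month == 3] - values[month == 2]
names(feb_to_mar) <- year[month == 2]
print(feb_to_mar)

# Years in which sales fell between February and March
unusual_years <- names(feb_to_mar)[feb_to_mar < 0]
print(unusual_years)
```

The same steps should work on a10 once fpp2 is loaded, provided the series covers whole calendar years or is first trimmed with window().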

3. visnights
autoplot(visnights)

With all the series overlaid on one set of axes, this plot is too cluttered to interpret.
Thus, we will focus on multi-panel plots.

autoplot(visnights[,1:5], facets=TRUE) +
ylab("Number of visitor nights each quarter (millions)")

• There is a clear seasonal pattern in visitor nights for the New South Wales (NSW)
North Coast and NSW South Coast regions, and the two series follow the same pattern.

• There is a huge rise in visitors to NSW Metropolitan in the year 2000. This may be
due to the Sydney 2000 Olympics.
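
A seasonal pattern of this kind can be summarised by averaging each quarter across years. The sketch below uses a synthetic quarterly series, not a visnights column; q_effect and the other names are assumptions introduced for illustration.

```r
# Synthetic quarterly series with a fixed quarterly effect and a mild trend
# (assumption: illustrative values only).
q_effect <- c(2.0, -0.5, -1.0, -0.5)
y <- ts(8 + rep(q_effect, 10) + 0.02 * (1:40),
        start = c(1998, 1), frequency = 4)

# Mean of the series in each quarter, pooled over all years
quarter_means <- tapply(as.numeric(y), as.integer(cycle(y)), mean)
print(round(quarter_means, 2))

# Quarter with the highest average
peak_quarter <- as.integer(names(quarter_means)[which.max(quarter_means)])
print(peak_quarter)
```

Applied to a single region, for example visnights[,"NSWNthCo"] (column name assumed from the fpp2 dataset), large and stable differences between the quarterly means would support reading the pattern as seasonal.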
autoplot(visnights[,6:10], facets=TRUE) +
ylab("Number of visitor nights each quarter (millions)")

• There is a seasonal pattern in visitors to Queensland North Coast and also in
visitors to South Australia Coast. This may be due to the seasonal nature of tourism
to those parts of Australia.

• Visitor nights in the metropolitan regions of the states almost always spike
during holiday periods.

autoplot(visnights[,11:15], facets=TRUE) +
ylab("Number of visitor nights each quarter (millions)")

• There is a strong and similar seasonal pattern in all of Victoria's regions. This may
reflect Victoria's popularity as a travel destination at particular times of the year.

• Visitor nights in South Australia Inner are almost flat over the whole period.

autoplot(visnights[,16:20], facets=TRUE) +
ylab("Number of visitor nights each quarter (millions)")

• There is a significant spike in 2004 in Western Australia Inner.

To check the relationships between the various series, ggpairs() was used.

ggpairs(as.data.frame(visnights[,1:5]),lower = list(continuous =
wrap("smooth", size=0.5,color="blue")))

• There is a strong positive relationship between visitors to NSW Metropolitan and
NSW North Coast, NSW Metropolitan and NSW South Coast, and NSW North Coast and
NSW South Coast. Visitor nights in these parts of the state tend to rise and fall
together, which suggests that people travelling to NSW favour the state as a whole.

• There is a slight relationship between visitors to NSW North Inland and visitors to
NSW South Inland.
ggpairs(as.data.frame(visnights[,6:10]),lower = list(continuous =
wrap("smooth", size=0.5,color="blue")))

• There is a strong positive relationship between visitors to Queensland Central and
Queensland North Coast, and between South Australia Metropolitan and South
Australia Coast. Visitor nights in different parts of the same state tend to rise and
fall together.

• There is a negative relationship between visitors to Queensland Central and South
Australia Metropolitan, Queensland Central and South Australia Coast, and
Queensland North Coast and South Australia Coast. The visitor patterns in these
regions move in opposite directions, so some of these regions may be competing
for similar visitor groups.

ggpairs(as.data.frame(visnights[,11:15]),lower = list(continuous =
wrap("smooth", size=0.5,color="blue")))

• There is a very strong positive relationship between all of Victoria's regions. Thus,
visitor nights across Victoria tend to move together, and visitors appear to favour
the state as a whole.

ggpairs(as.data.frame(visnights[,16:20]),lower = list(continuous =
wrap("smooth", size=0.5,color="blue")))

• There is a positive relationship between all of Western Australia's (WAU) regions,
but the strongest relationship is between visitors to WAU Inner and WAU Coast.
Thus people visiting WAU tend to visit both regions.
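
The strength and sign of the relationships seen in the scatterplot matrices can be put into numbers with a correlation matrix. The following is a sketch on synthetic data, with hypothetical region names, and is not output from the original analysis.

```r
# Synthetic "regions" driven by a shared component (assumption: made-up
# data standing in for a block of visnights columns).
set.seed(123)
n <- 76
common <- rnorm(n)
regions <- data.frame(
  RegionA = 5 + common + rnorm(n, sd = 0.3),        # moves with the shared component
  RegionB = 3 + 0.8 * common + rnorm(n, sd = 0.3),  # moves with it as well
  RegionC = 4 - 0.7 * common + rnorm(n, sd = 0.3),  # moves against it
  RegionD = 2 + rnorm(n)                            # unrelated to it
)

# Pairwise Pearson correlations, rounded for reading
cor_mat <- round(cor(regions), 2)
print(cor_mat)
```

With fpp2 loaded, round(cor(as.data.frame(visnights[,1:5])), 2) would give the corresponding numbers for the first block of regions. A high correlation shows that two series move together; it does not by itself show that visitors to one region cause visits to another.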

References
1. https://go.gale.com/ps/i.do?p=AONE&u=googlescholar&id=GALE|A14561001&v=2.1&it=r&sid=AONE&asid=eb8b76de#:~:text=The%20deregulation%20of%20domestic%20aviation,market%20entry%20and%20aircraft%20importation.
