
In the graph above, the x-axis shows the days of the month and the y-axis shows the five cities.

A box plot is used to show the data distribution. Here, the first city is ‘Waipa_NZ’, and its tweets
run from day 1 to day 30. The blue box represents negative polarity; it starts at day 7 and ends at
day 21, which means that most of the negative tweets in Waipa were generated from day 7 to day 21.
The line in the middle of the box represents the median value. The orange box shows that most of the
positive tweets were generated from day 7 to day 20. The green box shows that most of the neutral
tweets were generated from day 7 to day 22. To describe the data distribution, we split the data into
four equal parts, ending at 25%, 50%, 75%, and 100% of the data. The line below the box covers the
first 25% of the data, the box covers the data from 25% to 75%, and the line above the box covers
the last part of the data. We use the data inside the box for the analysis because that part shows
the main features of the data.
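
The four-part split described above can be sketched in a few lines of Python. This is only an illustration: the list `negative_tweet_days` is made-up data, not the data behind the graph, and the function name `five_number_summary` is hypothetical. It uses the standard-library `statistics.quantiles` function with the "inclusive" method, which is one of several common ways to compute quartiles.

```python
import statistics

def five_number_summary(days):
    """Return (minimum, Q1, median, Q3, maximum) for a list of day numbers."""
    ordered = sorted(days)
    # n=4 asks for the three cut points that split the data into four parts.
    # method="inclusive" interpolates between data points, treating the
    # list as the complete data set.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical data: the day of the month on which each negative tweet was posted.
negative_tweet_days = [1, 5, 7, 9, 12, 14, 15, 18, 21, 24, 30]

low, q1, median, q3, high = five_number_summary(negative_tweet_days)
print(f"line below the box: day {low} to day {q1}")
print(f"box (middle half):  day {q1} to day {q3}, median at day {median}")
print(f"line above the box: day {q3} to day {high}")
```

The box runs from Q1 to Q3, so it always contains the middle half of the tweets, which is why the description reads the "most tweets" range from the box edges.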

In Auck_NZ, the negative tweets run from day 1 to day 30, but most of them were generated from day 14
to day 27, as the box shows. The positive tweets also run from day 1 to day 30, but most of them were
generated from day 10 to day 27. The neutral tweets also start from day 1, and most of them were
generated from day 14 to day 27. The line in each box shows the median of the data.

In the case of ‘well_NZ’, the dot marks an outlier, and the box starts at day 16. This means that
day 1 has some negative tweets, and after that the negative tweets run from day 8 to day 26. We cannot
start the box plot from day 1 because there are no negative tweets from day 2 to day 7, but we still
have to represent the presence of negative tweets on day 1, so that day is drawn as a dot. This point
is called an outlier.
In summary, the box shows the days on which most tweets were posted, the line in the box shows the
median of the data, and the dot shows the outliers. Showing the data distribution means showing all
the data points in order to understand the nature of the data.
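
As a rough illustration of how a point ends up drawn as a dot, the sketch below applies the widely used 1.5 × IQR rule: any value more than 1.5 box-lengths away from the box is treated as an outlier. The text does not say which rule the plotting tool used, so this rule is an assumption, and the list `negative_tweet_days` is made-up data chosen to resemble the well_NZ description (one tweet on day 1, the rest from day 8 to day 26). The function name `split_outliers` is hypothetical.

```python
import statistics

def split_outliers(days, k=1.5):
    """Split day numbers into (inliers, outliers) using the k * IQR fences."""
    q1, _, q3 = statistics.quantiles(sorted(days), n=4, method="inclusive")
    iqr = q3 - q1                  # length of the box
    lower_fence = q1 - k * iqr     # values below this are outliers
    upper_fence = q3 + k * iqr     # values above this are outliers
    inliers = [d for d in days if lower_fence <= d <= upper_fence]
    outliers = [d for d in days if d < lower_fence or d > upper_fence]
    return inliers, outliers

# Hypothetical data: one negative tweet on day 1, none from day 2 to day 7,
# then tweets from day 8 to day 26.
negative_tweet_days = [1, 8, 14, 16, 16, 17, 18, 19, 20, 20, 21, 22, 22, 23, 26]

inliers, outliers = split_outliers(negative_tweet_days)
print("lines span day", min(inliers), "to day", max(inliers))
print("outlier days (drawn as dots):", outliers)
```

With this sample, the lines extend only as far as the nearest tweets inside the fences, and the isolated day 1 is reported separately, which matches the way the dot is described above.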
