A5 Csai Problem
Subhasis Ray∗
Note
This is an individual assignment. You may not consult your peers or use AI
tools for these tasks except where explicitly asked to. Any misconduct
will result in a score of 0 for the entire assignment and will be noted and
reported to the Academic Integrity Committee.
Submit to CodePost: (1) all code in a single Python script, with a comment
indicating the problem number before the solution to each problem, and (2)
plots under headings indicating the problem numbers in a Word document on
CodePost.
General rules for plots:
• Always label your axes. When multiple axes share the same
axis, you can label it once. Your label should indicate the
unit where applicable.
• If there are multiple axes in the same figure, provide a
meaningful title for each axes.
• If there are multiple plots in the same axes, provide a
legend.
• Provide a colorbar when making plots with color-coded
values (colormap).
• Ensure legibility of your plots (e.g., adjust marker sizes so
that they do not overlap, control spacing between axes so
the ticklabels are visible, customize and rotate ticklabels,
etc.)
• Title the figure.
• Briefly describe your observations from the plots.
∗ subhasis.ray@plaksha.edu.in
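The plotting rules above can be illustrated with a minimal sketch. It assumes matplotlib and NumPy are available and uses synthetic data, not the assignment dataset; all variable names, labels, and units here are placeholders chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, purely for illustration (not the assignment dataset).
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 50)          # time in seconds
x, y = rng.normal(size=(2, 200))    # scatter coordinates
c = np.hypot(x, y)                  # value encoded as color

# Two axes in one figure; constrained layout keeps tick labels from overlapping.
fig, (ax_line, ax_scatter) = plt.subplots(
    1, 2, figsize=(9, 4), constrained_layout=True
)

# Multiple plots in the same axes: label both axes, add a title and a legend.
ax_line.plot(t, np.sin(t), label="sin(t)")
ax_line.plot(t, np.cos(t), label="cos(t)")
ax_line.set_xlabel("Time (s)")
ax_line.set_ylabel("Amplitude (a.u.)")
ax_line.set_title("Two plots in one axes")
ax_line.legend()

# Color-coded values: small markers for legibility, plus a labeled colorbar.
points = ax_scatter.scatter(x, y, c=c, s=12, cmap="viridis")
ax_scatter.set_xlabel("x (a.u.)")
ax_scatter.set_ylabel("y (a.u.)")
ax_scatter.set_title("Color-coded values")
fig.colorbar(points, ax=ax_scatter, label="Distance from origin (a.u.)")

# Title for the whole figure.
fig.suptitle("Demonstration of the plotting rules")
```

Calling `plt.show()` or `fig.savefig(...)` afterwards displays or stores the figure.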
1 Introduction
This dataset is from Loghub, also available on Zenodo, and the related article
is:
2. Which are the top five programs in these log messages? Plot the share
of each of these, with the remaining programs grouped together as the
category "other", in a pie chart. 5 marks
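One possible way to group the counts for such a chart is sketched below. It assumes the program name has already been extracted for each log message; the list `programs`, the helper names, and the sample values are hypothetical and do not come from the dataset.

```python
from collections import Counter


def top_n_with_other(programs, n=5):
    """Return (labels, values) for the n most frequent programs plus 'other'."""
    counts = Counter(programs)
    top = counts.most_common(n)
    labels = [name for name, _ in top]
    values = [count for _, count in top]
    # Everything outside the top n is pooled into a single category.
    rest = sum(counts.values()) - sum(values)
    if rest > 0:
        labels.append("other")
        values.append(rest)
    return labels, values


def plot_program_share(labels, values):
    """Draw the pie chart; matplotlib is imported here so the counting
    helper above can be used without it."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.pie(values, labels=labels, autopct="%1.1f%%")
    ax.set_title("Share of log messages by program")
    return fig


# Hypothetical program names, one entry per log message.
programs = (
    ["sshd"] * 6 + ["cron"] * 5 + ["kernel"] * 4
    + ["su"] * 3 + ["ftpd"] * 2 + ["login", "named"]
)
labels, values = top_n_with_other(programs)
```

Passing `labels` and `values` to `plot_program_share` produces the chart; the grouping step is kept separate so the counts can be inspected before plotting.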
4 Hypothesis testing
1. Do you notice a difference in the number of events in two time windows
in this data? Form a testable hypothesis about this and conduct
a hypothesis test. Summarize your results and make boxplots to compare
the event rates / counts in these time windows. 5 marks
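As a rough sketch of what such a test could look like, the code below runs a two-sided permutation test on the difference in mean event counts between two windows. The choice of a permutation test, the per-minute binning, and the sample counts are all assumptions made for illustration; other tests may be equally or more appropriate for the actual data.

```python
import random


def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Null hypothesis: both samples come from the same distribution,
    so the group labels are exchangeable.
    """
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = sum(left) / len(left) - sum(right) / len(right)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


def plot_boxplots(a, b):
    """Side-by-side boxplots of the two windows (requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.boxplot([a, b])
    ax.set_xticks([1, 2])
    ax.set_xticklabels(["Window A", "Window B"])
    ax.set_ylabel("Events per minute")
    ax.set_title("Event counts in two time windows")
    return fig


# Hypothetical events-per-minute counts in two windows (not real data).
window_a = [12, 15, 11, 14, 13, 16, 12, 15]
window_b = [21, 19, 24, 22, 20, 23, 25, 21]
observed, p_value = permutation_test(window_a, window_b)
```

A small `p_value` would lead to rejecting the null hypothesis at the chosen significance level; `plot_boxplots(window_a, window_b)` gives the visual comparison.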