ICT110_Task_3_Sem_1_2024
Task 3
Semester 1, 2024
ICT110 Introduction to Data Science
Task 3
Submit your assignment to Canvas – Assignments - Task 3. Please follow the submission
instructions in Canvas.
The assignment will be marked out of a total of 100 marks and forms 45% of the total
assessment for the course. ALL assignments will be checked automatically for plagiarism by the
Turnitin system provided through Canvas.
Refer to your Course Outline or the Course Web Site for a copy of the “Student Misconduct,
Plagiarism and Collusion” guidelines.
Late submission will be penalised according to the policy in the course outline. Please note
Saturday and Sunday are included in the count of days late.
Requests for an extension to an assignment MUST be made to the course coordinator prior to
the date of submission. Requests made on the day of submission or after the submission date
will only be considered in exceptional circumstances. Assignment submission extensions
will only be granted in accordance with the official University guidelines.
Background
A series of data sets are provided, and you are welcome to choose whichever you are
interested in.
SPORT:
ACCIDENT:
This data has been extracted from the Queensland Road Crash Database.
WINE:
This dataset is related to red variants of the Portuguese "Vinho Verde" wine.
BUSINESS:
Assignment Task
You are a member of the team and need to perform data analysis on selected attributes.
Describe the data. Provide a comprehensive overview of the data and its attributes, such as
how many attributes there are, what type each one is, and what it describes. Exploratory Data Analysis.
Describe the finding/s: what did you find, what did you predict, and what do you think is
important?
You have been requested to prepare a data analysis report about your work and explain your
findings. The potential audiences include other researchers, business representatives, and
government agencies. They may have limited ICT or mathematical knowledge. Therefore, the
report should be technical but have clear explanations describing the findings.
1. Introduction
Introduce the problem. Include background material as appropriate: who cares about this
problem, what impact it has, where does the data come from, what are the dimensions and
structure of the data.
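As a minimal sketch of how the dimensions and structure of a data set can be reported in R:
the data frame below is a made-up stand-in, and its name, column names and values are
assumptions for illustration only, not taken from any of the provided data sets.

```r
# Hypothetical stand-in data; replace with the data set you have chosen.
accidents <- data.frame(
  crash_year  = c(2019L, 2020L, 2020L, 2021L),
  severity    = factor(c("Minor", "Fatal", "Minor", "Hospitalisation")),
  speed_limit = c(60, 100, 50, 80)
)

# Dimensions: number of records (rows) and number of attributes (columns).
dims <- dim(accidents)

# Type of each attribute (integer, factor, numeric, ...).
col_types <- sapply(accidents, class)

# Compact overview of the structure: names, types and the first few values.
str(accidents)
```

The values in dims and col_types can be quoted directly in the Introduction when describing
the data.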
2. Data Setup
Describe how to load the data, and how the pre-processing is performed.
The original dataset is not ready for analysis and it is different from the data forms that we
are familiar with in previous practices. This means we need to do some pre-processing, either
for the whole dataset, or for a subset of the dataset required for each sub task described later.
Once you have some ideas for exploratory or advanced analysis, you need to adjust the form
of the dataset. This can be achieved either by manipulating records in R by transposition or
subsetting, or with other tools (e.g. Notepad or Excel) before reading them into R. For
simplicity, you can also rename the attributes.
Please clearly explain the way you have cleaned the data in this section. If you use Excel,
please still explain the steps that you used for cleaning.
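The loading and cleaning steps described above could be sketched in R as follows. This is
only an illustration under stated assumptions: the file, its column headers and its values
are hypothetical and are written to a temporary file so that the script runs on its own.

```r
# Hypothetical raw file, written to a temporary path so the sketch is self-contained.
raw_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Fixed Acidity,Alcohol %,Quality Score",
  "7.4,9.4,5",
  "7.8,9.8,5",
  "NA,10.0,6",
  "11.2,9.8,6"
), raw_path)

# Load the data; check.names = FALSE keeps the headers exactly as written in the file.
wine_raw <- read.csv(raw_path, check.names = FALSE)

# Rename the attributes to short names that are easier to use in code.
names(wine_raw) <- c("fixed_acidity", "alcohol", "quality")

# Remove records that contain missing values.
wine_clean <- na.omit(wine_raw)

# Subsetting: keep only some records (rows) and some attributes (columns).
wine_subset <- subset(wine_clean, quality >= 6, select = c(alcohol, quality))

# Transposition: swap rows and columns (the result is a matrix).
wine_t <- t(wine_clean)
```

Dropping incomplete records with na.omit() is just one possible choice; whichever cleaning
rule you use, state it and justify it in this section.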
3. Exploratory Data Analysis
One-variable analysis studies one variable (one column/attribute) each time. It is up to you to
decide which attribute/variable you use for this analysis but the attribute you select needs to be
related to the research objectives.
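A one-variable analysis of a numeric attribute might look like the sketch below. The
attribute name and its values are assumptions made up for illustration. The histogram is
computed with plot = FALSE so that the script runs without a graphics window; calling
plot(h) afterwards draws it.

```r
# Hypothetical values for one numeric attribute (alcohol content, % by volume).
alcohol <- c(9.4, 9.8, 9.8, 10.0, 9.4, 10.5, 11.2, 9.5, 12.8, 10.1)

# Centre and spread of the attribute.
alcohol_mean   <- mean(alcohol)
alcohol_median <- median(alcohol)
alcohol_sd     <- sd(alcohol)

# Five-number summary plus the mean.
summary(alcohol)

# Distribution: count how many values fall into each 1-unit bin from 9 to 13.
h <- hist(alcohol, breaks = seq(9, 13, by = 1), plot = FALSE)
h$counts
```

For a categorical attribute, table() gives the count of records in each category and
barplot() can display those counts.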
A two-variable analysis studies the relation between two variables. It is up to you to decide
which attributes/variables you use for this analysis but the attributes you select need to be
related to the research objectives.
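As a hedged sketch of a two-variable analysis, the code below measures the relation
between two numeric attributes with a correlation coefficient and compares group means.
The attribute names and values are hypothetical.

```r
# Hypothetical paired observations of two attributes.
alcohol <- c(9.4, 9.8, 10.0, 10.5, 11.2, 12.8)
quality <- c(5, 5, 6, 6, 7, 7)

# Pearson correlation: ranges from -1 to 1, with 0 meaning no linear relation.
r <- cor(alcohol, quality)

# Significance test for the correlation (reports a p-value and confidence interval).
r_test <- cor.test(alcohol, quality)

# Mean of one attribute within each level of the other.
mean_alcohol_by_quality <- tapply(alcohol, quality, mean)
mean_alcohol_by_quality
```

A scatter plot, for example plot(alcohol, quality), is the usual visual companion to these
numbers in the report.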
4. Advanced Analysis (AT LEAST TWO OF THE FOLLOWING; you can do the same
type twice on different data)
Briefly explain the concept of linear regression (with references). It is up to you to decide
which attributes/variables you use for this analysis but the attributes you select need to be
related to the research objectives.
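A simple linear regression can be fitted in R with lm(). The sketch below uses made-up
data in which the response rises by roughly 0.5 for each unit of the predictor; the
attribute names, values and the small alternating noise term are all assumptions for
illustration.

```r
# Hypothetical data: a predictor and a response with a roughly linear relation.
wine <- data.frame(alcohol = seq(9, 13.5, by = 0.5))
wine$quality <- 1 + 0.5 * wine$alcohol + rep(c(0.1, -0.1), times = 5)

# Fit the model: quality = intercept + slope * alcohol.
model <- lm(quality ~ alcohol, data = wine)

# Estimated slope: change in the response per one-unit change in the predictor.
slope <- coef(model)[["alcohol"]]

# R-squared: proportion of the variation in the response explained by the model.
r_squared <- summary(model)$r.squared

# Predict the response for a new value of the predictor.
predicted <- predict(model, newdata = data.frame(alcohol = 12))
```

In the report, summary(model) gives the coefficients, their p-values and R-squared in one
table, and abline(model) adds the fitted line to a scatter plot.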
Briefly explain the concept of clustering and k-means (with references). Perform one clustering
analysis. It is up to you to decide which attribute(s) you use for this analysis but the
attribute(s) you select need to be related to the research objectives.
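A k-means clustering analysis can be run with the kmeans() function in base R. The sketch
below is an illustration only: the two attributes, their values and the choice of two
clusters are assumptions, and the data are made up so that two groups are clearly separated.

```r
# Fix the random seed so the result is reproducible (k-means starts from random centres).
set.seed(110)

# Hypothetical data with two numeric attributes.
wine <- data.frame(
  alcohol   = c(9.2, 9.4, 9.5, 9.8, 10.0, 12.0, 12.3, 12.5, 12.8, 13.0),
  sulphates = c(0.50, 0.52, 0.55, 0.56, 0.58, 0.80, 0.82, 0.85, 0.88, 0.90)
)

# Standardise the attributes so that neither dominates the distance calculation.
wine_scaled <- scale(wine)

# Run k-means with k = 2; nstart = 10 tries 10 random starts and keeps the best.
fit <- kmeans(wine_scaled, centers = 2, nstart = 10)

# Attach the cluster label to each record and summarise each cluster.
wine$cluster <- fit$cluster
aggregate(cbind(alcohol, sulphates) ~ cluster, data = wine, FUN = mean)
```

The number of clusters is a choice you need to justify; one common approach is to compare
fit$tot.withinss across several values of k.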
5. Conclusion
Sum up your findings and provide some insight into the findings.
6. Reflections
In this part, discuss any difficulties you had performing the analysis and how you solved
those difficulties. Reflect on how the analysis process went for you, what you learnt, and
what you might do differently next time. Aim to write one paragraph.
For all data analysis (Sections 3 & 4), you need to provide both an R script file and an
explanation of the code (as comments in the code). Please submit a single R code file as part
of your submission for compiling and running. Your R code MUST run.
The marking rubric is viewable on Canvas.
Report Format
Your report should be 1,200+ words. The report MUST be formatted using the
following guidelines:
Referencing
References for the explanation of decision trees and linear regression are required. These
references should follow the Harvard or APA method of referencing. Note that ALL references
should be from journal articles, conference papers, technical papers or a recognized expert in
the field. Use the library databases or Google Scholar to find appropriate articles. DO NOT use
Wikipedia as a reference.
Assignment grades will be available on Canvas two weeks after the submission date. Details of
marking will also be accessible via online rubrics on Canvas.
Where an assignment is undergoing investigation for alleged plagiarism or collusion the grade
for the assignment and the assignment will be withheld until the investigation has concluded.
Assignment Advice
This assignment will take many weeks to complete and will require a good understanding of
data science theories and practices for successful completion. It is imperative that students take
heed of the following points in relation to doing this assignment:
1. Ensure that you clearly understand the requirements for the assignment – what must
be done and what are the deliverables.
2. If you do not understand any of the assignment requirements – Please ASK the course
coordinator or your tutor.
3. Each time you work on any aspect of the assignment, reread the assignment
requirements to ensure that what is required is clearly understood.
End of Assignment