TM5140 Data-Driven Decision Making: Assignment 1
Assignment 1
12/02/2022
Introduction
This assignment is based on Chapter 1 through Chapter 3, as well as the lab sessions covered in each
chapter.
Assignment guidelines:
1. Data preparation
For this assignment, you will select a research question related to the Sri Lankan economy (e.g., education,
agriculture, health, transportation, business, export, environment, telecommunication, retail, IT, and so on),
identify a suitable (recent) data set with which to investigate the question, plan appropriate analyses,
implement them using R software and tidy tools, and prepare a brief report using rmarkdown.
Your research question should be related to the Sri Lankan economy. Once you have thought of a few different
research questions in which you are interested, find an appropriate dataset for the analysis. An ideal dataset
would be something related to or from your own research, but if this is not available, you may find something
on the internet (e.g., Google Trends data), in an annual report, or among the indicators published by different organizations
in Sri Lanka. The data set must be small enough and/or formatted appropriately for you to
analyze it, but also large enough with enough different variables to demonstrate your knowledge gained in
Chapters 1, 2 and 3 and the lab sessions.
Save the dataset as a csv file in the following format:
Data_(Sector)_(YourStudentIDNumber).csv
Example:
If you select a dataset from the education sector and your student ID number is XX1111111,
then your CSV data file should be saved as Data_Education_XX1111111.csv
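The naming rule above can be sketched in base R as follows. This is only a minimal illustration: the sector, the student ID and the small data frame `my_data` are hypothetical placeholders (the values are made up), and you would substitute your own.

```r
# Hypothetical values: replace with your own sector and student ID.
sector     <- "Education"
student_id <- "XX1111111"

# Build the file name Data_(Sector)_(YourStudentIDNumber).csv
data_file <- sprintf("Data_%s_%s.csv", sector, student_id)

# Placeholder data frame standing in for your real dataset (values are made up).
my_data <- data.frame(
  year  = c(2019, 2020, 2021),
  value = c(1, 2, 3)
)

# row.names = FALSE avoids writing an extra, unnamed index column.
write.csv(my_data, file = data_file, row.names = FALSE)
```

Building the name with sprintf() rather than typing it by hand keeps the data file and the readme file names consistent if you reuse the same two variables.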
2. Data documentation
The documentation (i.e., the “Readme.txt” file) that accompanies each project data set is as important as the
data itself. This information permits collaborators and other analysts to understand any limitations or special
characteristics of the data that may impact its use. The following outline and content are recommended and
should be adhered to as closely as possible to make the documentation consistent across all data sets.
Data Set Documentation/Readme Outline:
a) Save your readme file as a text file in the following format:
Readme_(Sector)_(YourStudentIDNumber).txt
b) Your readme file should contain the following:
• Sector:
• Title: A descriptive title that matches the content of the data set
• Source: Data source (e.g., web pages, annual reports, publications). Provide web address references, if
any (i.e., links to any publications or additional documentation, such as the project website, if available).
• Format: A data frame with xx rows and yy variables.
Please replace xx (number of rows in your dataset) and yy (number of columns in your dataset) according to
your dataset.
• Provide a SEPARATE description for EACH column of your dataset using the following format.
• Other: Other remarks. Any other information related to the dataset, if any (e.g., special or unusual
incidents)
If you have multiple datasets, save them as separate CSV files and prepare a separate readme file
for each dataset.
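As a rough aid, the readme outline above can be drafted from the data frame itself, so that the row and column counts in the "Format" line always match the data. The sketch below is an assumption-laden starting point, not a required method: the sector, the student ID and `my_data` are hypothetical placeholders, and the one-line-per-column layout is only a stand-in for the column description format specified in this assignment. All text in angle brackets still has to be written by hand.

```r
# Hypothetical values: replace with your own sector and student ID.
sector     <- "Education"
student_id <- "XX1111111"

# Placeholder data frame standing in for your real dataset (values are made up).
my_data <- data.frame(
  year  = c(2019, 2020, 2021),
  value = c(1, 2, 3)
)

readme_lines <- c(
  paste("Sector:", sector),
  "Title: <descriptive title of the data set>",
  "Source: <data source and web address, if any>",
  # nrow() and ncol() fill in xx and yy from the data itself.
  sprintf("Format: A data frame with %d rows and %d variables.",
          nrow(my_data), ncol(my_data)),
  # One placeholder line per column, to be completed by hand.
  paste0(names(my_data), ": <description of this column>"),
  "Other: <other remarks, if any>"
)

# Build the file name Readme_(Sector)_(YourStudentIDNumber).txt and write it.
readme_file <- sprintf("Readme_%s_%s.txt", sector, student_id)
writeLines(readme_lines, readme_file)
```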
3. Analysis
Produce THREE meaningful visual representations based on the dataset using the R programming language
(ggplot2 R package) and interpret the results. This assignment is based on your understanding of the
concepts discussed in the first three chapters and the lab sessions.
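For orientation only, one ggplot2 visualization might be structured as in the sketch below. The data frame `toy` and all of its values are made up purely to make the example runnable; they are not real statistics about any sector. The choice of a line chart is likewise just an assumption, and your own plots should be driven by your research question.

```r
library(ggplot2)

# Made-up placeholder values purely for illustration; not real statistics.
toy <- data.frame(
  year   = rep(2017:2021, times = 2),
  sector = rep(c("Sector A", "Sector B"), each = 5),
  value  = c(10, 12, 11, 14, 15, 8, 9, 12, 13, 12)
)

# Map year to x, value to y, and sector to colour; draw one line per sector.
p <- ggplot(toy, aes(x = year, y = value, colour = sector)) +
  geom_line() +
  geom_point() +
  labs(
    title  = "Illustrative trend by sector (toy data)",
    x      = "Year",
    y      = "Value (arbitrary units)",
    colour = "Sector"
  ) +
  theme_minimal()

print(p)
```

Informative titles, axis labels with units and a legend title make each figure easier to interpret in the written report.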
4. Result communication
Use rmarkdown to create a brief report (fewer than five pages) based on your findings. Include the visualizations
with the R code and a brief explanation of each visualization in your answer.
The following topics should be addressed in your report:
• Research objectives
• Three meaningful visual representations
• Interpretation of the visual representations
• References: This is the last section of your report. Here you should provide all the references that you
used for your report writing.
Ethical writers ALWAYS acknowledge the contributions of others and the source of their ideas. Any text taken
verbatim from another author must be enclosed in quotation marks. Please acknowledge every source
(including R code) you use in your writing, whether you paraphrase it, summarize it or enclose it in quotations.
At the end of this activity you will have the following documents:
1. Data_(Sector)_(YourStudentIDNumber).csv
2. Readme_(Sector)_(YourStudentIDNumber).txt
3. Report_(Sector)_(YourStudentIDNumber).rmd
4. Report_(Sector)_(YourStudentIDNumber).pdf (the rendered report may be .doc, .pdf or .html)
If you have multiple datasets, name the data and readme files as follows:
1. Data_(Sector)_(YourStudentIDNumber)_(dataset_1).csv
Data_(Sector)_(YourStudentIDNumber)_(dataset_2).csv
2. Readme_(Sector)_(YourStudentIDNumber)_(dataset_1).txt
Readme_(Sector)_(YourStudentIDNumber)_(dataset_2).txt
If you have any issues with submitting your assignment due to any unexpected circumstances, you can apply
for an extension or special consideration. For special consideration or deadline extensions, please send me
an email (priyangad@uom.lk) on or before 5 March 2022, informing me of your request with the email title:
5140_Assignment1_deadline_extension_(yourIndexNumber). Otherwise, a penalty will be applied for late
submission of assignments (0.5 marks for each extra day).
Discussion Forum
I have initiated a discussion forum for this assignment (Assignment 1-Forum). Use this forum to discuss any
questions or concerns related to this project. I am happy to participate in the forum discussion, but I want
you to take part in this discussion and help each other.