
Social Data Visualisation

CCB302, Digital Media Analytics, Assessment Two

Due Date: Refer to Blackboard
Length: 2000 words (maximum)
Weight: 50%
Individual/Group: Individual
Submission Items:
- A dataset
- A report (Word or PDF)
- A Tableau Workbook (Packaged Workbook in .twbx format)

WHAT YOU NEED TO DO

In this assignment, you will need to propose a digital media analytics project, collect data for it,
visualise and analyse the data, and write a report.

There are a few steps to doing this assessment.

Step One: Decide on Topic and Context

Think of a topic you would like to investigate or research. This could be a reflection on the career
path you have chosen, a job you see yourself in, a project you have done in a job or are going to do
for another unit, or simply a hypothetical project that you can think of for this assignment. It does not
need to be a full-scale, big project, so please consider the time you have for this assignment when
thinking about it.

IMPORTANT NOTE

For this unit, you have received an ethical clearance to collect public social media data and
perform digital media analytics. However, since the ethical clearance is at a unit-level, there
are limitations on the topics that you can choose for data collection and analysis. Any
research about sensitive or controversial issues, or topics that may cause harm to you or
the participants (including psychological harm) falls outside the scope of the ethical
clearance, so please make sure to carefully choose your topics.

If you are unsure about the topic you have chosen, please consult with your tutor before
collecting any data.

Step Two: Test Your Topic and Approach

You will use TAGS to collect Twitter data. Twitter’s API limits the collection of data to the past 7
days or so, but as you learned in the tutorials, you will be able to collect data on an ongoing basis using
your search terms. In this step, you will need to test whether the topic of choice or the set of search
terms you have chosen for your project can provide a good amount of data. If your search terms or
topic are too narrow and specific, you may not be able to collect enough tweets to do this assignment
properly. On the other hand, if your search terms and topic are too broad, or focus on a topic with a
very high level of tweeting activity, you may end up with more data than a personal computer can
handle, or your dataset may contain a high level of noise.

Therefore, in step 2, you will need to test your search terms, topic, and approach, until you find a
suitable balance between the query and data size. If your query yields few tweets, or if the results contain a lot
of noise, you may need to reconsider the topic. You can run this test using Twitter’s search function
to investigate how many tweets are posted on your topic each minute, hour, or day. Alternatively,
you can collect some data via TAGS using your search terms, and examine how many tweets you were
able to collect.
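
If you run a short test collection, a small script can show how quickly tweets arrive. The sketch below is only an illustration: it assumes each row of your export has a created_at column in Twitter's usual timestamp format (for example "Fri Mar 05 10:15:00 +0000 2021"), and the function name tweets_per_bucket is made up for this example. Check the column names in your own sheet before using it.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp format of the "created_at" column,
# e.g. "Fri Mar 05 10:15:00 +0000 2021". Adjust if your export differs.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def tweets_per_bucket(rows, bucket="hour", time_column="created_at"):
    """Count tweets per hour (default) or per day in a test collection."""
    pattern = "%Y-%m-%d %H:00" if bucket == "hour" else "%Y-%m-%d"
    counts = Counter()
    for row in rows:
        stamp = datetime.strptime(row[time_column], CREATED_AT_FORMAT)
        counts[stamp.strftime(pattern)] += 1
    # Sort by time so the result reads as a simple timeline.
    return dict(sorted(counts.items()))


# Hypothetical sample rows standing in for a small test collection.
sample = [
    {"created_at": "Fri Mar 05 10:15:00 +0000 2021"},
    {"created_at": "Fri Mar 05 10:45:00 +0000 2021"},
    {"created_at": "Fri Mar 05 11:05:00 +0000 2021"},
    {"created_at": "Sat Mar 06 09:00:00 +0000 2021"},
]
hourly = tweets_per_bucket(sample, "hour")
daily = tweets_per_bucket(sample, "day")
print(hourly)
print(daily)
```

A handful of tweets per day suggests the search terms are too narrow, while thousands per hour suggest they are too broad for a personal computer.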

You may need to cycle back and forth between steps 1 and 2 until you are confident about your topic
and search terms.

Step Three: Collect Data

Depending on the topic you have chosen, the scope you have set for your project, and the level of
tweeting activity, you will need to either do one pass at data collection, or let your TAGS collector run
for some time (a day or more).

There is no set minimum or maximum for the amount of data you need to collect for this assignment,
but as a rule of thumb, to be able to do proper analysis in the next steps, you will need at least 10,000
to 15,000 tweets.

Make sure to document all the search parameters you use in TAGS. You will need to write about them
in your report.

IMPORTANT NOTE

Always back up your data. Things can always go wrong. Sometimes, opening a dataset file
with some tools might change the contents of some cells. For instance, Microsoft Excel
tends to automatically convert long numbers, such as tweet IDs, into scientific notation when
opening a file. If the file is then saved, this will practically make it unusable, and you may have to collect the data
again. Always have the original file saved somewhere, make a copy, and work on the copy,
so that if something goes wrong, you still have the original file.
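
The advice above can be partly automated. The following sketch is a hedged illustration, not a required step: it copies the original export to a working file and counts IDs that already look like scientific notation. It assumes the dataset is a CSV file with an id_str column; the file names and function names are hypothetical.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def make_working_copy(original, suffix="_working"):
    """Copy the original export and return the path of the copy to work on."""
    original = Path(original)
    copy_path = original.with_name(original.stem + suffix + original.suffix)
    shutil.copy2(original, copy_path)
    return copy_path


def looks_like_scientific_notation(value):
    """Return True for values such as '1.36785E+18'."""
    text = value.strip().upper()
    if "E+" not in text:
        return False
    mantissa, _, exponent = text.partition("E+")
    return mantissa.replace(".", "", 1).isdigit() and exponent.isdigit()


def count_damaged_ids(csv_path, id_column="id_str"):
    """Count rows whose ID has been converted to scientific notation."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return sum(
            looks_like_scientific_notation(row[id_column])
            for row in csv.DictReader(handle)
        )


# Demonstration on a throwaway file; replace with the path to your own export.
with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "tweets.csv"
    original.write_text(
        "id_str,from_user\n1367851234567890123,alice\n1.36785E+18,bob\n",
        encoding="utf-8",
    )
    working = make_working_copy(original)
    damaged = count_damaged_ids(working)
    copy_name = working.name
    original_still_exists = original.exists()
print(copy_name, damaged)
```

If the count is above zero, the file has probably been saved by a spreadsheet tool at some point, and it is safer to go back to the untouched original.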

Step Four: Investigate the Dataset

Open your dataset file(s) in Tableau. Look at the columns and/or datapoints in the dataset, and think
about which ones are most useful for the project you have outlined in step one. What kinds of insights
can you get from each datapoint or a combination of them? What sorts of analyses are possible? In
short, what can you do with this dataset?

Your dataset might have tens of columns. You may not need all of them for the analysis, and some of
them may be more important than others for the task at hand. Reflect on these, and decide which
columns or datapoints are best to analyse for the project you have designed.
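
One quick way to see which columns are worth analysing is to check how often each one is actually filled in. The sketch below is an optional aid alongside Tableau, written under assumptions: the column names in the sample (from_user, geo_coordinates, in_reply_to_screen_name) are illustrative and may not match your export, and column_fill_rates is a made-up helper name.

```python
def column_fill_rates(rows):
    """Return {column: share of rows with a non-empty value}, fullest first."""
    if not rows:
        return {}
    total = len(rows)
    rates = {}
    for column in rows[0].keys():
        filled = sum(1 for row in rows if (row.get(column) or "").strip())
        rates[column] = filled / total
    return dict(sorted(rates.items(), key=lambda item: item[1], reverse=True))


# Hypothetical sample rows; in practice these would come from csv.DictReader.
sample = [
    {"from_user": "alice", "geo_coordinates": "", "in_reply_to_screen_name": "bob"},
    {"from_user": "bob", "geo_coordinates": "", "in_reply_to_screen_name": ""},
    {"from_user": "carol", "geo_coordinates": "loc: -27.47,153.02", "in_reply_to_screen_name": "alice"},
    {"from_user": "dave", "geo_coordinates": "", "in_reply_to_screen_name": ""},
]
rates = column_fill_rates(sample)
for column, rate in rates.items():
    print(f"{column}: {rate:.0%}")
```

A column that is empty in most rows is unlikely to support a strong argument in the report, however interesting it looks.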

Step Five: Analyse the Data

Once you have decided on the most relevant datapoints and analytical approach for the job, it is time
to do the analysis. Analyse and visualise the data, and take notes of your findings. Obtain as much
insight about the dataset as possible.

The analysis needs to be done in Tableau, plus either Gephi or Leximancer. You do not need to use
both Gephi and Leximancer; one is enough. However, please feel free to use both if it helps your
report and analysis.

Tool 1: Tableau

Tableau

There is no set minimum or maximum for the number of visualisations you create. The main thing
you need to focus on is to make enough visualisations to be able to provide a rich, critical, and
insightful report. Too few visualisations, and the report may end up very descriptive. Too many
visualisations, and your report may not be able to go in-depth. If you are not sure, aim for around 5
to 10 visualisations.

NOTE: Make sure to save your Tableau workbook in .twbx (packaged workbook) format. You
will need to submit it as part of the assessment.

Tool 2: Gephi or Leximancer

Gephi

Your project may need to rely on some insights about the interactions among users. This is achievable
through social network analysis in Gephi. Prepare, analyse, and visualise the network(s) in Gephi, and
take note of your findings.

NOTE: Export your network image and make sure to include it in the report. This is part of your
assessment. You do not need to submit the network file itself, but without the image in the report, we
will not have evidence of the network analysis.
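
Preparing a network usually means turning tweets into an edge list of who mentions whom. The sketch below shows one possible approach, not the method taught in the tutorials. It assumes the export has from_user and text columns, finds mentions with a simple @username pattern, and writes a CSV with Source, Target, and Weight headers, which Gephi can import as an edge table. The function names are hypothetical.

```python
import csv
import io
import re
from collections import Counter

# Simple pattern for @mentions; it may also match text that only looks
# like a mention (for example part of an email address).
MENTION_PATTERN = re.compile(r"@(\w{1,15})")


def mention_edges(rows, user_column="from_user", text_column="text"):
    """Count how often each user mentions each other user."""
    edges = Counter()
    for row in rows:
        source = row[user_column].lower()
        for target in MENTION_PATTERN.findall(row[text_column]):
            target = target.lower()
            if target != source:  # ignore self-mentions
                edges[(source, target)] += 1
    return edges


def write_edge_list(edges, handle):
    """Write the edges as a CSV with Source, Target, and Weight columns."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])


# Hypothetical sample rows standing in for a collected dataset.
sample = [
    {"from_user": "alice", "text": "RT @bob: new single out now"},
    {"from_user": "alice", "text": "@bob @carol have you heard it?"},
    {"from_user": "carol", "text": "thanks @alice"},
    {"from_user": "bob", "text": "talking to myself @bob"},
]
edges = mention_edges(sample)
buffer = io.StringIO()
write_edge_list(edges, buffer)
print(buffer.getvalue())
```

In practice you would write to a file opened with newline="" instead of an in-memory buffer. Note that this sketch treats retweets and replies as mentions; whether that suits your project is an analytical decision to explain in the report.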

Leximancer

Your project may need some qualitative insights about the content of tweets. Given that you probably
have at least a few thousand tweets in your dataset, you may need to rely on a computational textual
analysis tool like Leximancer. Prepare, clean, visualise, and analyse the data in Leximancer, and take
notes of your findings from the semantic network analysis.

NOTE: Export your semantic network image and make sure to include it in the report. This is part
of your assessment. You do not need to submit the underlying network data, but without the image
in the report, we will not have evidence of the textual analysis.
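
Cleaning the text before a textual analysis often means removing material that would otherwise show up as noise. The sketch below illustrates one possible cleaning pass on the tweet text itself; it makes no assumptions about Leximancer's own settings, and the choice of what to strip (retweet prefixes, links, HTML entities) is an example rather than a rule. The function name clean_tweet is hypothetical.

```python
import html
import re

URL_PATTERN = re.compile(r"https?://\S+")
RETWEET_PREFIX = re.compile(r"^RT @\w+:\s*")
WHITESPACE = re.compile(r"\s+")


def clean_tweet(text):
    """Remove the retweet prefix and links, decode entities, tidy spacing."""
    text = html.unescape(text)            # "&amp;" becomes "&"
    text = RETWEET_PREFIX.sub("", text)   # drop a leading "RT @user: "
    text = URL_PATTERN.sub("", text)      # drop links
    return WHITESPACE.sub(" ", text).strip()


cleaned = clean_tweet("RT @bob: New single &amp; video out now https://t.co/abc123")
print(cleaned)
```

Keep a note of every cleaning decision, since removing retweet prefixes or links changes what the semantic network can show.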

Step Six: Write the Report

Write a summative report of your findings, and discuss the range of insights you have obtained from
the analysis of the dataset.

You can structure your report in any way that you deem fit, depending on the project you have
proposed and scoped. Below is one possible structure.

Context and Topic [~250 words]

Briefly tell the reader what this project is about. What is the context? Who are you? Why are you doing
this project? What are the main questions this report is going to answer?

You can use first-person language to describe this step. For instance:

I am a data journalist working in a news agency that primarily writes for a younger audience. We are
currently writing an investigative feature article about anecdotal reports of Twitter content farms that hijack
popular topics among young people, in order to spread misinformation. In our preliminary surveying of
tweets, one of the key topics that fits these descriptions is KPOP. We have noticed that many accounts on
Twitter post mis- and disinformation, and include KPOP related hashtags in their tweets (e.g. #BTS,
#BlackPink, #BLINK, etc.). As part of this investigation, I have been tasked with collecting a
representative sample of tweets containing KPOP related hashtags, and examining the data to find empirical
evidence of such activities.

After setting up the context, tell the reader about the key questions your project is going to answer.
The data journalist in the example above, for instance, needs to list a number of questions that help
them find empirical evidence of the content farms. This could involve some user metrics, some
temporal metrics, and some network metrics.

Make your questions as specific as possible, so that the reader can easily follow what your report is
going to do and what questions it answers.

Data Collection Parameters [~100 words]

In a couple of sentences, tell the reader how you collected the data (the search terms and search types
used), what the timeline was (the dates the dataset covers), and how many unique tweets you were
able to collect.

This information is required for your report.
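
The number of unique tweets and the dates covered can be checked with a few lines of code. This is a minimal sketch under assumptions: it expects rows with id_str and created_at columns, treats rows sharing an id_str as duplicates, and assumes Twitter's usual timestamp format. The function name collection_summary is made up for this example.

```python
from datetime import datetime

# Assumed timestamp format of the "created_at" column.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def collection_summary(rows, id_column="id_str", time_column="created_at"):
    """Summarise unique tweet count and date range for the report."""
    if not rows:
        raise ValueError("The dataset is empty.")
    unique = {}
    for row in rows:
        # Keep the first row seen for each tweet ID.
        unique.setdefault(row[id_column], row)
    stamps = [
        datetime.strptime(row[time_column], CREATED_AT_FORMAT)
        for row in unique.values()
    ]
    return {
        "unique_tweets": len(unique),
        "duplicates_removed": len(rows) - len(unique),
        "first_tweet": min(stamps).isoformat(),
        "last_tweet": max(stamps).isoformat(),
    }


# Hypothetical sample rows; the second row duplicates the first.
sample = [
    {"id_str": "1001", "created_at": "Fri Mar 05 10:15:00 +0000 2021"},
    {"id_str": "1001", "created_at": "Fri Mar 05 10:15:00 +0000 2021"},
    {"id_str": "1002", "created_at": "Sat Mar 06 09:00:00 +0000 2021"},
]
summary = collection_summary(sample)
print(summary)
```

Read the IDs as text rather than numbers when loading your own file, so that long tweet IDs are compared exactly.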

Main Body (including any relevant sub-headings) [~1400 words]

Write the main body of the report, and discuss your findings. Provide as much information as possible,
and focus on the key questions that you proposed at the beginning of the report. You can structure
this section around the analytical tools that you have used, or around your thematic findings. For
example, you can dedicate a section to the findings in Tableau, and then another section to your
findings in Gephi or Leximancer. Alternatively, you may decide that it is better to provide user metrics
first, and include both Tableau and Gephi findings in this section, and then move to temporal findings.
The key factor in making such decisions is the logical flow of arguments and findings in the report.

Make sure to include screenshots of your visualisations in the report, so that the reader is able to easily
follow the flow of arguments and findings in the report.

Conclusions [~250 words]

Summarise the findings, and include information about any limitations in the dataset and/or your
analytical approach, any ethical considerations, the future directions this project can take, or any
recommendations that stem from your findings.

SUBMISSION INFORMATION

What You Need to Submit

You need to submit three items:

- The summative report you wrote. Submit this report in Adobe PDF or Microsoft Word
format.
- The Tableau workbook that you created
  o VERY IMPORTANT: Submit your Tableau workbook in the .twbx format
  (Packaged Workbook)
- The dataset that you collected

NOTE: On submission, you are declaring that, unless otherwise acknowledged, this submission is
wholly your own work and has not previously been used or submitted. You understand that this work may
be submitted for a plagiarism check and consent to this taking place.

Moderation

All staff who are assessing your work meet to discuss and compare their judgements before marks or
grades are finalised. Refer to MOPP C/5.1.7.

Academic Integrity

As a student of the QUT academic community, you are asked to uphold the principles of academic
integrity during your course of study. QUT sets expectations and responsibilities of students,
specifically stating that students “adopt an ethical approach to academic work and assessment in
accordance with this policy and the Student Code of Conduct (E/2.1)”. Students need to be aware
that academic integrity refers to text and non-text sources, i.e. “copying or adapting non-text based
material created by others, such as diagrams, designs, musical score, audio-visual materials, art work,
plans, code or photographs without appropriate acknowledgement” (MOPP C/5.3.6 Academic
Integrity). It also includes self-plagiarism, which “involves the re-use by a student of their own work
without appropriate acknowledgement of the source. Students should seek express consent from the
unit coordinator prior to re-using their own work in an assessment submission” (MOPP C/5.3.6
Academic Integrity).

Students are expected to demonstrate their own understanding and thinking using ideas provided by
‘others’ to support and inform their work, always acknowledging the source. While we encourage peer
learning, it is not appropriate to share assignments with other students unless your assessment piece
has been stated as being a group assignment. If you do share your assignment with another student,
and they copy all or part of your assignment for their submission, this is considered collusion and you
may be reported for academic misconduct. If you are unsure and need more information, please refer
to:

http://www.mopp.qut.edu.au/C/C_05_03.jsp#C_05_03.03.mdoc.
Criteria (graded as High Distinction, Distinction, Credit, Pass, Marginal Fail, or Fail)

PROJECT DESIGN AND EXECUTION (25%)

Project scope (10%)

- High Distinction: Very well scoped project that shows a high level of understanding of the potentials and limitations of social media data.
- Distinction: Very well scoped project that shows a high level of understanding of the potentials and limitations of social media data, with only minor inaccuracies.
- Credit: Well scoped project that shows a good level of understanding of the potentials and limitations of social media data, with some inaccuracies.
- Pass: The scope of the project has merit overall, but does not properly fit the potentials and limitations of social media data.
- Marginal Fail: The project does not have a clear scope, and social media data cannot answer the questions set for the project.
- Fail: No evidence of learning in this criterion.

Data collection (15%)

- High Distinction: Very well documented data collection. The size of the dataset and query parameters fit the project scope perfectly.
- Distinction: Well documented data collection. The size of the dataset and query parameters fit the project appropriately, but minor inaccuracies in parameters.
- Credit: Well documented data collection, with omissions that impact replicability. The size of the dataset and query parameters fit the project overall, but include inaccuracies that impact analysis.
- Pass: Minimally documented data collection. The size of the dataset and query parameters can only provide explorative, preliminary insights.
- Marginal Fail: Undocumented data collection. The size of the dataset and query parameters only provide minimal insights.
- Fail: No evidence of learning in this criterion.

ANALYSIS (30%)

Analytical approach (15%)

- High Distinction: Highly appropriate, creative, and innovative use of various chart types and analytical tools. Error-free treatment of data fields, regarding both analytical and interpretive implications. Metrics selected and analysed appropriately and sufficiently answer project’s questions.
- Distinction: Appropriate, creative, and innovative use of various chart types and analytical tools. Well planned treatment of data fields, regarding both analytical and interpretive implications, with only minor inaccuracies that do not impact interpretations. Metrics selected and analysed appropriately and sufficiently answer majority of project’s questions.
- Credit: Some creative and innovative use of various chart types and analytical tools. Good treatment of data fields, but with some inaccuracies which impact their interpretive implications. Metrics selected and analysed answer some of the project’s questions.
- Pass: Basic visualisations that show minimal engagement with the range of chart types and analytical tools. Treatment of data fields shows some merit, but contains inaccuracies that significantly impact interpretations. Metrics selected and analysed provide only descriptive answers to project’s questions.
- Marginal Fail: Basic visualisations that show minimal engagement with the range of chart types and analytical tools. Treatment of data fields contains major inaccuracies that significantly impact interpretations. Metrics selected and analysed do not provide answers to project’s questions.
- Fail: No evidence of learning in this criterion.

Technical approach (15%)

- High Distinction: Highly expressive, rich, and communicative visualisations. Highly appropriate use of analytical tools. Visualisations are labelled, marked, and/or annotated appropriately. Considerate and thoughtful number of visualisations have been created to properly answer project’s questions.
- Distinction: Expressive, rich, and communicative visualisations. Appropriate use of analytical tools, only with minor inaccuracies. Visualisations are labelled, marked, and/or annotated appropriately, with minor omissions. Considerate and thoughtful number of visualisations have been created to properly answer project’s questions.
- Credit: Generally expressive visualisations, but some analytical potentials not fully operationalized. Visualisations are properly labelled, but some information is missing or inaccurately marked, labelled, or annotated. A good range of visualisations have been created to answer project’s questions, but some key aspects remain underexplored.
- Pass: Visualisations are communicative, but show minimal use of the analytical tools’ potentials. Visualisations are minimally labelled, marked, or annotated. A range of visualisations have been created to answer project’s questions, but key aspects remain unexplored.
- Marginal Fail: Visualisations are not communicative and show minimal use of the analytical tools. Visualisations are not labelled, marked, or annotated. Visualisations cannot answer project’s questions.
- Fail: No evidence of learning in this criterion.

REPORT (45%)

Analytical elements (35%)

- High Distinction: Provides a logically structured, highly critical, and in-depth report of findings. The report goes beyond mere visual analysis, and engages in appropriate, context-aware, qualitative examination of datasets. There is a clear connection between the findings and analytics. Report shows a high awareness of ethical implications, interpretive limitations and considerations, and/or data-driven recommendations.
- Distinction: Provides a logically structured, critical, and in-depth report of findings. The report goes beyond mere visual analysis, and engages in appropriate, context-aware analysis, with only minor inaccuracies. The connection between the findings and analytics is clear in the majority of the report. Report shows a high awareness of ethical implications, interpretive limitations and considerations, and/or data-driven recommendations, with only minor omission of key factors.
- Credit: Provides a generally logical and well structured report, but some parts remain underexplored. The report is mainly based on visual analysis, with some inaccuracies in interpretation of findings. The connection between the findings and analytics is generally clear. Report shows some awareness of ethical implications, interpretive limitations and considerations, and/or data-driven recommendations, with some omission of key factors.
- Pass: Provides an acceptably structured report, but logical flow is not always clear. The report is predominantly based on visual analysis, with major inaccuracies in interpretation of findings. The connection between the findings and analytics is present, but is left to the reader to identify. Report shows little awareness of ethical implications, interpretive limitations and considerations, and/or data-driven recommendations.
- Marginal Fail: Provides a poorly structured report, with a lack of apparent logical flow. The report is solely based on visual analysis, and does not interpret findings. There is no connection between the findings and analytics. Report does not show an understanding of ethical implications, interpretive limitations and consideration, and/or data-driven recommendations.
- Fail: No evidence of learning in this criterion.

Technical elements (10%)

- High Distinction: Report shows proper proof reading and accurate, professional use of language. The report adheres by the word limits.
- Distinction: Report shows proper proof reading and accurate, professional use of language, with only minor inaccuracies. The report adheres by the word limits.
- Credit: Report shows proper proof reading and accurate, professional use of language, with some inaccuracies. The report is slightly above the word limits.
- Pass: Report shows lack of proper proof reading, with inaccuracies in language use. The report exceeds word limits by a large margin.
- Marginal Fail: Report does not show proper proof reading. The report is substantially over/under word limits.
- Fail: No evidence of learning in this criterion.
