
Computers in Human Behavior 115 (2021) 106582

Contents lists available at ScienceDirect

Computers in Human Behavior


journal homepage: http://www.elsevier.com/locate/comphumbeh

Full length article

Exploring the relationship between social presence and learners' prestige in MOOC discussion forums using automated content analysis and social network analysis
Wenting Zou a,*, Xiao Hu b, Zilong Pan a, Chenglu Li a, Ying Cai a, Min Liu a

a Learning Technologies Program, The University of Texas at Austin, 1912 Speedway Stop D5700, Austin, TX 78712, USA
b Human Communication, Development, and Information Sciences, The University of Hong Kong, Pok Fu Lam, Hong Kong

A R T I C L E  I N F O

Keywords: Massive open online courses; Social presence; Discussion forums; Natural language processing; Content analysis; Social network analysis

A B S T R A C T

Research has repeatedly proven the importance of social interactions in online learning contexts such as Massive Open Online Courses (MOOCs), where learners often report isolation and a lack of peer support. Previous studies of social presence suggested that the ways learners present themselves socially online affect their learning outcomes. In order to further understand the role of learners' social presence, this study examines the relationship between social presence and learners' prestige in the learner network of a MOOC. An automated text classification model based on the latest machine learning techniques was developed to identify different social presence indicators in forum posts, while two metrics in social network analysis (SNA), in-degree and authority score, were used to measure learners' prestige in the learner network. Results revealed that certain social presence indicators, such as Asking questions, Expressing gratitude, Self-disclosure, Sharing resources and Using vocatives, have positive correlations with learners' prestige, while expressions of Disagreement/doubts/criticism and Negative emotions were counterproductive to learners' prestige. The findings not only reinforce the importance of social presence in online learning, but also shed light on strategies for leveraging social presence to improve individuals' prestige in social learning contexts like MOOCs.

1. Introduction

Despite the popularity of Massive Open Online Courses (MOOCs), studies have found that many MOOCs had low completion rates of less than 10% (Fidalgo-Blanco, Sein-Echaluce, & García-Peñalvo, 2016; Gütl, Rizzardini, Chang, & Morales, 2014; Joo, So, & Kim, 2018). For the past few years, many researchers have been trying to find potential factors that may explain the low engagement and low completion rates in MOOCs. These factors often include course design, learners' motivation, social interaction, peer support, scaffolding from the instructors, learners' characteristics, etc. (Fidalgo-Blanco et al., 2016; García-Peñalvo, Fidalgo-Blanco, & Sein-Echaluce, 2018; Liu, Zou, Shi, Pan, & Li, 2019; Stiller & Bachmaier, 2017). Among them, the social factor was found to play an important role in learners' engagement and performance in MOOCs through stimulating higher-order thinking skills (Al-Rahmi, Alias, Othman, Marin, & Tur, 2018; Wang, Guo, He, & Wu, 2019; Zhang, Yin, Luo, & Yan, 2017). In order to create a more socially conducive environment and decrease learners' feelings of isolation in online learning, recent studies have been focusing on the social dimension of learning in MOOCs, especially through the analysis of MOOC discussion forums, to better understand the patterns of students' interactions and struggles along the way (Poquet & Dawson, 2016; Rosé et al., 2014; Wise & Cui, 2018).

To study learners' social behaviors in MOOC forums, quantitative approaches were frequently adopted, such as using surveys to ask learners to report their level of engagement (Liu et al., 2019), or simply counting the frequencies of social behaviors, including posting, replying and liking posts, by using the system-generated log data (Anderson, Huttenlocher, Kleinberg, & Leskovec, 2014; De Barba, Kennedy, & Ainley, 2016). However, the quantity of participation does not necessarily guarantee the quality of participation (Meyer, 2004). Therefore, instead of solely collecting survey or log data, some researchers started to conduct content analyses of the discussion posts to obtain an in-depth understanding of students' social interactions in the forums. For example, some analyzed students' posts to identify topics or themes that emerged from discussions (Ezen-Can, Boyer, Kellogg, & Booth, 2015), or

* Corresponding author.
E-mail address: ellenzou@utexas.edu (W. Zou).

https://doi.org/10.1016/j.chb.2020.106582
Received 14 January 2020; Received in revised form 20 September 2020; Accepted 21 September 2020
Available online 23 September 2020
0747-5632/© 2020 Elsevier Ltd. All rights reserved.

examine students' sentiments towards the topics discussed (Wen, Yang, & Rose, 2014), or monitor students' social and cognitive behaviors in relation to learning outcomes (Gašević, Joksimović, Eagan, & Shaffer, 2019). Building on these prior explorations, this study focuses on examining learners' social presence, conceptualized as the ways one presents himself or herself socially and often considered an important factor that predicts learners' perceived learning and satisfaction (Arbaugh, 2008; Cobb, 2011; Gunawardena & Zittle, 1997). Specifically, this study aims to examine the role social presence plays in one's prestige within the learner network of a MOOC. The learner network is constructed by all learners who interacted with one another in the forums of the MOOC. Previous literature has established positive correlations between learners' prestige, often quantified by the number of replies an individual receives from others, and achievement of better learning outcomes (Cho, Gay, Davidson, & Ingraffea, 2007; Hommes et al., 2012; Vargas et al., 2018; Yang & Tang, 2003).

Although the importance of social presence in online learning settings has been well documented (Joksimović, Gašević, Kovanović, Riecke, & Hatala, 2015; Picciano, 2002; Rovai, 2002), the majority of research in this area is situated within the formal education context rather than MOOCs. In the context of MOOCs, learners may have very different engagement patterns compared to formal online courses with synchronous participation by a bounded cohort of students. Some researchers pointed out that the assumption of a continuity of interactions in traditional online courses does not hold in MOOC contexts (Poquet et al., 2018). In MOOCs, a small group of learners may participate persistently throughout the course, whereas a large number of learners participate intermittently, meaning that they engage and disengage randomly. Given the unique participation patterns in MOOCs, there is a high possibility that social presence may unfold differently in MOOCs compared to traditional small-scale online courses. This study attempts to fill the gap by exploring social presence and its relationship with learners' prestige in the learner network of a MOOC. The findings may provide an in-depth understanding of learners' social presence in a constructivist online learning context like MOOCs, as well as inform MOOC learners on how to strategically present themselves in the course to achieve maximum prestige in the learning community, engage in more valuable conversations and ultimately achieve productive learning outcomes. Besides, as a methodological contribution, this study proposes a computational linguistics model based on the latest deep learning techniques to analyze students' forum posts. This deep learning enabled text classification model achieved higher accuracy compared to traditional machine learning models, and it has the potential to be applied in similar contexts to address the challenge of processing large-scale text data generated by learners.

2. Theoretical background

2.1. Social presence in online learning

Vygotsky's social constructivist theory (1980) proposed that knowledge is co-constructed and that individuals learn from one another in a knowledge community. Wenger, Trayner, and De Laat (2011) define communities as "the development of a shared identity around a topic or set of challenges". A community is formed by people who share a concern, a set of problems, or a passion about a topic, and deepen their knowledge and expertise in a particular area by interacting on an ongoing basis. As a result, they develop a unique perspective on their topic as well as a body of common knowledge, practices, and approaches. They also develop personal relationships and a collective identity through participating in a community (Wenger, McDermott, & Snyder, 2002). Nistor, Dascalu, Serafin, and Trausan-Matu (2018) argued that this collective identity implies the development of further social, cognitive and emotional characteristics, such as shared history and collective memory (Malinen, 2015), social networks and social capital (Brint, 2001), group cohesion (Garrison & Arbaugh, 2007; Henri & Pudelko, 2003), and sense of community (McMillan & Chavis, 1986; Rovai, 2002). Under the umbrella of this generic community definition, researchers defined more specific community types. For example, Garrison, Anderson, and Archer (2001) proposed the Community of Inquiry (CoI) framework, which addresses the teaching, cognitive and social aspects of learning while learners engage in online communities. The CoI framework posits that in the absence of face-to-face interaction, participants in online learning environments must strive to recreate the social and knowledge-building processes that occur via the moment-by-moment negotiation of meaning found in the classroom. In the context of MOOCs, these dynamics of interaction and negotiation are manifested in the discussion forums.

To cultivate a sense of community in MOOCs, it is essential to encourage participants' social presence in order to foster social interactions for the meaning-making and knowledge-building processes. Social presence has been a concept of interest to researchers and practitioners since the emergence of distance education (McKerlich, Riis, Anderson, & Eastman, 2011). Originally developed to explain the effect that telecommunications media can have on communication, social presence was used to describe the degree of salience (i.e., quality or state of "being there") between two communicators using a communication medium (Short, Williams, & Christie, 1976). In the CoI framework, social presence is conceptualized as "the ability of participants in a CoI to project their personal characteristics into the community, thereby presenting themselves to other participants as 'real people'" (Garrison et al., 2001, p. 89). The CoI framework proposed that social presence, along with teaching and cognitive presence, are three essential elements to develop a successful learning community. Researchers highlighted the importance of social presence in online learning, because it helps students translate online activities into peer interactions that feel real during the process of interpersonal communications (Pursel, Zhang, Jablokow, Choi, & Velegol, 2016). Peer interactions not only make students feel more connected with one another within the community, they also enable students to engage in more meaning-making conversations with peers, providing them more opportunities to practice higher-order thinking that may ultimately lead to improved learning outcomes (Dixson, 2010).

Previous studies provide ample evidence that social presence enhances cognition in social learning environments (Arbaugh, 2008; Blau & Caspi, 2008; Cobb, 2011; Gunawardena et al., 2001; Gunawardena & Zittle, 1997; Kozan & Richardson, 2014). For example, Gunawardena and Zittle (1997) examined the influence of social presence as a predictor of satisfaction within a computer-mediated conferencing (CMC) environment and found that social presence accounted for 58% of the variance in student satisfaction. In a later study, Gunawardena et al. (2001) found that social presence facilitates the building of trust and self-disclosure. Arbaugh (2008) examined 55 online MBA courses to determine if social presence, cognitive presence and teaching presence could predict students' learning outcomes. He found that social presence was positively associated with students' perceived learning. Cobb's (2011) work on nursing education found that social presence was also highly correlated with both student satisfaction and perceived learning. Specifically, social presence accounted for 36% of the variance in perceived learning.

Despite the purported benefits of social presence in helping learners establish rapport with peers and achieve better perceived learning and satisfaction, one limitation of previous studies is that most of them examined social presence in small-scale online courses with a relatively homogeneous group of learners who participated at the same pace (Arbaugh, 2008; Cobb, 2011). In MOOCs, where learners engage in different patterns, their social presence has not been thoroughly studied. Besides, most literature studied the correlation between social presence and learning outcomes but rarely explored how social presence mediates the learning process that leads to improved learning outcomes. Given these limitations, this study attempts to fill the gaps by focusing on the association between social presence and learners'


prestige in the learner network of a MOOC, given the assumption that higher prestige leads to better learning outcomes.

2.2. Social network analysis in MOOCs

In recent years, Social Network Analysis (SNA) has gained increasing attention as a methodology in the domain of education (De Laat, Lally, Lipponen, & Simons, 2007). SNA is often applied to analyze the networks of interconnected forum users to investigate structural patterns and the underlying relational organization of learning communities in online courses. The gap in participation between active and inactive learners was studied by Poquet and Dawson (2016). Their findings showed that regular forum users shaped a denser and more centralized communication network due to more opportunities to establish connections with peers. On the level of individual learners, studies used various centrality metrics to measure the positions/roles of individuals in a network, such as degree centrality (the average of the number of in-links and out-links), betweenness centrality (which measures how often a node appears on the shortest path between nodes in the network) and closeness centrality (which measures the average distance from a given starting node to all other nodes in the network). Studies have been investigating how learners' centrality metrics predict their learning outcomes. For example, Joksimović et al. (2016) examined the association between degree, closeness, and betweenness centrality and academic performance (completion and distinction status). Results showed that degree centrality was significantly associated with learning outcome across two MOOCs; effects of betweenness and closeness were only found in one MOOC but not in the other. Jiang, Fitzhugh, and Warschauer (2014) also examined associations between social centrality and academic performance (certificate, completion, and distinction status). They conducted the study on MOOCs in algebra and finance. The results found from the two courses were inconsistent: degree and betweenness were positively correlated with learning performance in the algebra course, while no significant correlation was found between any centrality metric and learning performance in the finance course. These inconsistent results were echoed in some other studies (Houston, Brady, Narasimham, & Fisher, 2017), suggesting that the correlations between learners' centrality and their learning outcomes can be quite ambiguous. Besides the aforementioned centrality metrics, authority score (Kleinberg, 1999) is an alternative often used to gauge how valuable the information an individual provides is and how important his or her connections are. In the context of MOOCs, students with a higher authority score are those who attract more attention in the discussions and have more connections with equally prestigious peers. In other words, they are popular students in the forums because the nature of their posts is in some way interesting or remarkable from the others' point of view, thus triggering more responses than others' posts do. Studies found a strong negative correlation between authority score and drop-out rate, meaning that students with higher authority scores were less likely to drop out of MOOCs (Rosé et al., 2014; Yang, Sinha, Adamson, & Rosé, 2013).

So far, the use of SNA to examine how learning unfolds in MOOC forums has been relatively limited in the literature. Besides, the majority of the studies adopted a quantitative approach to analyze learners' roles in forums, which yielded limited insights regarding the qualitative nature of learners' interactions. There is a need to examine learners' posts in relation to their positions in the network to provide rich contextual information for researchers to understand the dynamics of learner interactions (Dowell et al., 2015). To this end, Nistor, Dascalu, Tarnai, and Trausan-Matu (2020) attempted to use both network centrality metrics and dialog analysis (topic complexity, dialog complexity, etc.) to identify the socio-cognitive structures of online learning communities, namely classifying the central, intermediate, and peripheral participants in the communities. The peripheral participants had the lowest communicative centrality (both in-degree and out-degree) while the central participants had the highest communicative centrality. They also found that topic complexity and dialog complexity have different predictive power in identifying the central, intermediate, and peripheral layers of the communities. This study is an early exploration of the socio-cognitive structures of learner networks through both network analysis and content analysis.

2.3. Content analysis in MOOC discussion forums

Vygotsky's social constructivist theory stressed that knowledge construction is socially situated and shaped in collaboration through language. Thus, through language, people are able to learn and to achieve knowledge that would be impossible without social interaction. In the context of MOOC discussion forums, the language that enables knowledge construction manifests itself in the textual data, such as posts and response threads, which are useful in understanding learners' emotions, cognitive processes, and social interactions across various subject contexts (Chung & Pennebaker, 2014). The increasing popularity of MOOCs gave rise to enormous amounts of textual data from learners' online discourse, which poses challenges for instructors and researchers to efficiently process the data to generate insights to inform course design and facilitation. Therefore, many researchers started to incorporate Natural Language Processing (NLP) techniques to facilitate content analysis for different purposes. For instance, Dascalu, Dessus, Bianco, Trausan-Matu, and Nardy (2014) proposed an NLP-enhanced platform, ReaderBench, to assist learners, teachers, and researchers in interpreting large amounts of textual data, supporting tasks such as assessing reading materials and detecting reading strategies. By using the platform, Dascalu, McNamara, Trausan-Matu, and Allen (2018) were able to conduct cohesion network analysis (CNA), which is an enriched SNA that involves text content and discourse structure analysis, considering semantic cohesion while modeling interactions between participants. They used CNA to analyze chat conversations from students and assess the quality of students' degree of participation. They then compared the results of CNA to human-scored results on the basis of relevance to the central concepts of the conversation. The results revealed that the CNA indices were strongly correlated with the human evaluations of the conversations, and predicted 54% of the variance in the human ratings of participation. Their study demonstrates promising support for the use of a combination of automated content analysis and network analysis to evaluate learners' degree of engagement. In another study, Wise, Cui, Jin, and Vytasek (2017) used an NLP-incorporated tool, LightSide Researcher's Workbench, to categorize MOOC discussion threads based on their relevance to the course content. The results demonstrated good reliability (accuracy ranged from 0.80 to 0.85) for all starting and reply posts in the online statistics and psychology courses. Besides determining content relevance, the advancement of NLP techniques enables researchers to analyze textual data from various perspectives, such as identifying the topics being discussed in the forums (Atapattu & Falkner, 2016), analyzing learners' emotions (Hu, Dowell, Brooks, & Yan, 2018; Xing, Tang, & Pei, 2019), differentiating types of cognitive behaviors (Kovanović et al., 2016), and detecting confusion during discussions (Agrawal, Venkatraman, Leonard, & Paepcke, 2015) and help-seeking behaviors (Almatrafi, Johri, & Rangwala, 2018).

With the trend of research interests shifting from quantitative analysis to qualitative analysis based on large-scale text data in MOOCs, the number of studies in content analysis has been increasing rapidly in recent years. However, these early explorations of students' online discourse still face many challenges. For one thing, coarse-grained analysis (e.g., binary classification such as negative/positive emotions or on-topic/off-topic posts) often leads to ambiguous results. For another, the existing text analysis models often rely heavily on the topics or subjects of the posts, thus lacking the generalizability to accurately analyze discourse from a different discipline (Kovanović et al., 2016). To address these issues, this study proposes a more robust text analysis model based on the latest algorithms in deep learning to analyze text data on a fine-grained level. Meanwhile, the model is built upon a well-validated theoretical framework of social presence (Shea et al., 2010) to systematically and thoroughly examine students' posts. The findings can provide more nuanced and insightful details to understand the social dynamics in MOOC forums.

2.4. Automatic text analysis with deep learning

In recent years, deep learning has rapidly gained popularity in the field of computer science. It allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state of the art in the fields of speech recognition, visual recognition, object detection and many other domains (LeCun, Bengio, & Hinton, 2015). In the field of NLP, deep learning algorithms have also been widely used. Among these algorithms, BERT (Bidirectional Encoder Representations from Transformers) has stood out with exceptional performance ever since it was introduced by researchers at Google AI in 2018 (Devlin, Chang, Lee, & Toutanova, 2018). It was considered revolutionary in the machine learning community by excelling in a wide variety of NLP tasks due to (1) its original self-supervised learning methods, such as masked language modeling and next sentence prediction (Devlin et al., 2018), and (2) its use of the transformer architecture, which makes use of the self-attention mechanism to better learn and retain the meaning of a sequence of words (Vaswani et al., 2017). BERT's innovative self-supervised learning methods as well as its use of self-attention allow it to better capture the contextual meanings of sentences. Besides the advantage of contextual learning, BERT's superior performance on various downstream tasks such as text classification is also due to leveraging the benefits of its pretrained parameters for transfer learning. Transfer learning allows models to inherit an outstanding base performance on tasks such as natural language inference and paraphrasing (Fedus, Goodfellow, & Dai, 2018), which helps generate more accurate results for tasks such as text classification (Liu, Sun, Lin, & Wang, 2016). Recent works have shown that BERT for transfer learning can significantly improve model performance even with limited data (Lan et al., 2019). For example, Liu et al. (2019) achieved an accuracy of 83.2% on a dataset for high school English reading comprehension by optimizing BERT, while the previous best prediction accuracy on the same dataset was 44.1% in 2017. Nogueira and Cho (2019) adopted BERT to enter a data mining competition for question-answering. Their results ranked 1st among all the contestants, and outperformed the previous best score by 27%. Therefore, in this study, BERT was selected for text analysis.

3. Research questions

The review of previous literature reveals that: (1) how learners' social presence mediates the learning process in the MOOC context has not been thoroughly studied; specifically, the impact of social presence on students' positions in the learner network is unclear; (2) learners' posts provide rich contextual information for researchers to understand the dynamics of learner interactions, yet few studies adopted content analysis in combination with SNA; (3) the existing text classification models still face many challenges in analyzing text based on more sophisticated theoretical models. In response to these issues, the current study attempts to build a more robust text classification model to identify different social presence indicators in learners' posts, and to examine which social presence indicators correlate with their prestige in the learner network, measured by both in-degree and authority score. Specifically, this study aims to answer the following questions:

(1) How does learners' social presence correlate with their prestige in the learner network in the MOOC?
(2) How much of the variance in students' prestige can be explained by their social presence?
(3) What social presence indicators are more common among learners with high and low prestige?
(4) Among the learners with high prestige, what social presence indicators are more common in their posts that received high response rates?

4. Methods

4.1. Research context

Participants were 456 students who registered in Data Visualization for Storytelling and Discovery in Journalism, a professional development MOOC designed and managed by a large research university in the Southern United States. This MOOC primarily targets journalism professionals, but it is also open to the public for free. It was deployed on Moodle and lasted around four weeks in June 2018. This MOOC consists of four modules. In each module, there are learning materials such as readings, instructional videos, quizzes, and a discussion forum. Students were encouraged to participate in the forum by the end of each module to discuss certain questions regarding the topic of the module and reflect on the concepts and techniques they had learned. Before posting in the forums and interacting with peers, learners were expected to watch the instructional videos and finish the readings and quizzes that may help them become familiarized with the new concepts and techniques. Typically, there would be several open-ended questions/tasks posted by the instructor in each forum that required learners to respond based on what they had learned from the videos and readings. Specifically, there were many questions/tasks that required learners to create data visualization artifacts (e.g., interactive graphs/maps) on their chosen topics. Examples of these artifacts include graphs/maps of the mortality rate of Ebola in Africa, the suicide rate in Singapore, consumer satisfaction in France, etc. Given the diverse backgrounds of learners, they chose vastly different topics. After posting their own answers to the questions, learners were encouraged to read their peers' posts and post comments. Through the constant conversations in which learners present their own ideas and critique one another's answers, they deepen their understanding of the topic and co-construct knowledge by stimulating and participating in critical discourses. To obtain a completion certificate for the MOOC, learners had to answer 75% of the questions in the forums and reply to at least two posts in each forum to give feedback to their peers.

In this study, the learner network was formed by aggregating the number of interactions among learners across all four forums (one forum for each module). Interaction is defined as "replying to" others' posts. Each learner is a node, while an edge is established when one learner replies to another in the forum, regardless of which forum the interaction occurs in.

4.2. Building a text classification model to identify social presence indicators

In the context of MOOCs, with a massive learner base that generates huge amounts of text data, the traditional qualitative analysis approach with manual coding is labor-intensive and impractical. To address this methodological challenge, this study proposes to adopt the latest machine learning techniques to build a text classification model that automatically classifies the forum posts based on a well-validated theoretical framework of social presence (Table 1). This social presence framework was first proposed by Rourke, Anderson, Archer, and Garrison (1999) and revised by Shea et al. (2010). It hypothesizes modes of social presence, including the textual demonstration of affective expression, open communication, and group cohesion, which are necessary to establish a sense of trust and, ideally, membership in a community dedicated to joint knowledge construction. Under these three dimensions, there are multiple indicators that further describe different aspects of social presence. We excluded several indicators in the original framework (e.g., Addresses the group using inclusive pronouns, Use of humor,


Table 1
Theoretical framework and coding scheme of social presence, adapted from Shea et al. (2010).

Affective expression
- Positive emotions: pleasant emotions. Example: "i think the information you gathered is really interesting"
- Negative emotions: unpleasant emotions. Example: "I got a bit confused now because i was later working with a lot of similar ones"
- Self-disclosure: presents details of one's personal life, such as professional and family background, life experiences, personal preferences, etc. Example: "I'm also from ohio (dayton), so i get your hot water situation"

Open communication
- Referencing others: direct references to the contents of others' posts. Example: "Like @Becky suggested with 11 countries listed it could well be displayed on a world map"
- Asking questions: raises questions to others. Examples: "What do you think?" "Improvements to suggest?"
- Complimenting others: remarks expressing praise and admiration of others. Example: "I find it very informative, also you did a great job with the legends and notes attached"
- Expressing gratitude: grateful remarks to others; thanking others. Example: "Thanks, your comments are valid and important"
- Expressing agreement: agrees with others. Example: "I think you're right, it seems to describe the relationship better"
- Disagreement/doubts/criticism: disagrees with others, or expresses doubts, concerns, or criticisms. Example: "I don't think map chart can add any value for this life expectancy information"
- Offering advice: offers suggestions to others. Example: "I would suggest creating three charts for each of the year"
- Personal opinion/reflection: personal opinions/judgments on a specific topic, or reflections on one's learning/problem-solving process. Example: "my trick was to use the 'who won where' map three times to make the comparisons easier"
- Sharing resources: provides external resources that support learning and problem-solving. Example: "You can check out some examples here: https://www.tableau.com/best-beautiful-data-visualization-examples"

Group cohesion
- Using vocatives: addresses or refers to others by name. Example: "Hi Chris, I went to the link that you included in your critique"

Unconventional emotion expression) since they were extremely rare (N < 10) in our dataset. We removed them because such a low occurrence could not yield a reliable training dataset for our model. Table 1 shows the social presence indicators we retained for training the model.

A total of 4650 posts, consisting of 23,755 individual sentences, were extracted from the forums in the MOOC. There were three phases in building and validating the text classification model: (1) 3500 sentences were randomly selected and manually coded into the different categories of social presence (Table 1). Since the posts in this MOOC were relatively long and hard to describe with a single indicator of social presence, we examined the posts at the sentence level, meaning that each sentence was assigned a social presence code. Three researchers were involved in this qualitative coding process. Each was randomly assigned around 1200 sentences for manual coding. They met regularly and compared codes until the agreement rate reached 100%. (2) All labeled data were used to train and validate the text classification model by testing different algorithms. (3) The best performing algorithm from phase 2 was applied to automatically classify the remaining unlabeled sentences into categories of social presence.

Among the thirteen social presence indicators we retained, the machine learning approach was applied to identify only ten of them. The other three indicators, namely Asking questions, Sharing resources and Using vocatives, were identified by detecting the presence of certain linguistic markers. Specifically, a sentence was labeled as Asking questions if it ended with a question mark; it was labeled as Sharing resources if it contained a hyperlink, which indicates the sharing of content on an external webpage; and Using vocatives, defined as addressing or referring to others by name, was identified based on the presence of a name from the students' name list in the MOOC. We used this simpler approach to capture these three indicators because it is more efficient and effective than a machine learning approach. For model training and evaluation, we used the Python packages Transformers by HuggingFace (Wolf et al., 2019) and Scikit-learn (Pedregosa et al., 2011).

4.2.1. Feature extraction and building the classifier
We adopted three features in building the text classifier: Part-of-Speech (POS), Named Entity Recognition (NER), and BERT's word embedding (WE) features (see the description of each in Table 2). Specifically, we consider NER an important feature due to the context of this MOOC: learners' responses in the forums involved many names of specific organizations, locations, numbers, and durations. We trained on raw text inputs concatenated with POS and NER tags in order to test whether these linguistic features would add any value to the overall performance over training on raw data alone. To handle the POS and NER tags properly, we first specified a comprehensive list of POS and NER tags as special tokens for BERT's tokenizer so that these tags would not be split by BERT's subword tokenizer. We then resized BERT's token embeddings to the new vocabulary size so that these special tokens would get learned vector values during training.

Table 2
Description of features extracted.
  Part-of-Speech (POS) features: the grammar tag of each word, such as verb, adjective, adverb, or punctuation.
  Named Entity Recognition (NER) features: information extracted from words, such as names of people, organizations, and locations.
  BERT's word embedding (WE) features: numeric representations of words using BERT's word embeddings (e.g., the words "book" and "dog" may have very different numeric vector representations that are distant from one another, since they are usually used in different contexts, whereas "cat" and "dog" may have representations that are closer to one another).

To compare the performance of BERT with traditional machine learning algorithms, we also used Random Forest to build our classifier to provide a baseline performance. Random Forest was selected because its ensemble of decision trees usually yields desirable results (Kovanović et al., 2016; Liu, Kidziński, & Dillenbourg, 2016; Xu, Guo, Ye, & Cheng, 2012). The Python packages PyTorch and Scikit-learn were used to test the two types of supervised machine learning algorithms: transfer learning with BERT and Random Forest.

The Python package Bert-as-service was used to extract the word embedding features for Random Forest, while BERT applied its inherent word embeddings to build the classifier. The aforementioned POS and NER tags were added to the raw data for training with Random Forest. For BERT, however, simply concatenating POS and NER tags with raw text would be problematic, as BERT would treat these tags as unknown words. Therefore, we customized BERT's word embeddings by specifying the linguistic markers as special tokens, so that they would have contextual numeric representations distinct from the tokens in raw text.

Depending on the algorithm, the labeled dataset was split differently in the training process. For Random Forest, 80% of the dataset was sampled randomly for 5-fold cross validation while the rest was used for testing.
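This Random Forest splitting scheme can be sketched with the standard library (an illustrative sketch; the function and variable names are not from the study's code):

```python
import random

def split_for_random_forest(sentences, test_ratio=0.2, n_folds=5, seed=42):
    """Hold out a random test set, then cut the remainder into k folds,
    mirroring the 80/20 split with 5-fold cross validation described above."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    test_set, train_set = shuffled[:n_test], shuffled[n_test:]
    fold_size = len(train_set) // n_folds
    folds = [train_set[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]
    folds[-1].extend(train_set[n_folds * fold_size:])  # leftovers go to the last fold
    return test_set, folds
```

Each fold then serves once as the validation split while the remaining folds are used for training.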

For BERT, the dataset was split into three parts: training, validation, and testing sets with a ratio of 60%, 20%, and 20%, respectively. Specifically, 60% of the labeled data were used to train the BERT model to detect features and classify posts. Cross-validation was not used because deep learning models like BERT are usually much more expensive to train than traditional machine learning models; it would be impractical to apply cross validation since our BERT model has 12 layers with 110 million parameters in the training process. Therefore, we used the 20% of the labeled data not included in the training set to validate the BERT models. The remaining 20% served as testing data to compare the labels generated by the BERT model with the labels given by human coders.

4.2.2. Model evaluation
Metrics such as accuracy, precision, recall, F-measure, and the Matthews correlation coefficient were used to show the performance of each model (see Table 3). After comparison, the model built on BERT and NER features was chosen to analyze the remaining posts, due to its superior performance over the other models. It is interesting to note that although prior research oftentimes concluded that adding linguistic features improved the performance of Random Forest, this study shows decreased robustness of Random Forest when the linguistic features were used in conjunction with word embeddings.

4.3. Measuring learners' prestige in the forums
Learners' prestige in the forums was measured using two SNA parameters: in-degree centrality and authority score. Specifically, in-degree centrality is determined by the total number of replies one receives from others; a learner with a high in-degree value receives many replies from peers. Authority score, on the other hand, goes one step further by taking into account the quality of one's connections. A high authority score suggests that a learner's posts not only receive many responses, but that those responses come from peers who themselves received many replies. This implies that the posts of high authority learners may be more valuable and worthy of further discussion than those of low authority learners. The calculations of both in-degree centrality and authority score were completed in Gephi, an open-source network analysis and visualization software package.

4.4. Correlation analysis between learners' social presence and their prestige in the forum
After calculating the frequency of occurrence of each type of social presence for every individual learner, we converted the frequency of occurrence into a percentage of occurrence, meaning that each social presence indicator was expressed as a percentage of all the posts of each learner. For example, if a learner posted ten sentences and two of them were questions, then the percentage of Asking questions for this learner is 0.2. The purpose of using percentages is to control for the effect of the number of posts on learners' prestige by focusing solely on the impact of the different social presence indicators. Correlation analyses were then conducted to detect the correlation between social presence and learners' prestige in the forum, measured by in-degree centrality and authority score.

4.5. Using social presence to predict learners' prestige in the learner network
In order to explore how much of the variance in learners' prestige (both in-degree and authority scores) can be explained by social presence, multiple linear regression analysis was conducted, with social presence indicators as independent variables and students' in-degree and authority scores as dependent variables. Logarithmic transformation was performed on the data before running the regression analysis due to the high skewness of the data.

4.6. Comparing social presence in the posts of learners with high and low prestige
To compare how social presence differs between learners with high and low prestige, we selected the top 20% of learners (n = 84) based on their authority scores and the bottom 20% of learners (n = 88), all with an authority score of 0, and analyzed all of their posts to see whether there are distinct characteristics in terms of social presence. Authority score was chosen to measure learners' prestige because conceptually it captures learners' level of connection in a network by taking into account both the quantity and quality of one's social ties. We then conducted statistical tests on the percentage of each type of social presence in their posts to compare whether the differences between these two groups were significant.
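The two prestige measures defined in Section 4.3 were computed in Gephi; purely as an illustration, the same quantities can be approximated from a list of reply edges (a hedged sketch with hypothetical edge data, not the study's actual computation):

```python
def prestige_scores(reply_edges, iterations=50):
    """In-degree (replies received) and a HITS-style authority score
    from (replier, author) pairs. A sketch of Section 4.3's measures,
    not the study's actual Gephi computation."""
    nodes = {n for edge in reply_edges for n in edge}
    in_degree = {n: 0 for n in nodes}
    for _, author in reply_edges:
        in_degree[author] += 1
    hub = {n: 1.0 for n in nodes}
    authority = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # authority grows with the hub scores of those who reply to you
        authority = {n: sum(hub[s] for s, d in reply_edges if d == n) for n in nodes}
        hub = {n: sum(authority[d] for s, d in reply_edges if s == n) for n in nodes}
        a_norm = sum(v * v for v in authority.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        authority = {n: v / a_norm for n, v in authority.items()}
        hub = {n: v / h_norm for n, v in hub.items()}
    return in_degree, authority
```

A learner replied to by well-connected peers thus scores higher on authority than one with the same in-degree but less connected repliers.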

Table 3
Evaluation of machine learning models for text classification. (MCC = Matthews correlation coefficient.)
  Model          Features         Accuracy  Precision  Recall  F1    MCC
  BERT           Raw text         0.80      0.81       0.75    0.77  0.77
  BERT           POS              0.80      0.81       0.78    0.79  0.77
  BERT           NER              0.83      0.82       0.80    0.81  0.81
  BERT           POS + NER        0.80      0.79       0.78    0.79  0.77
  Random Forest  Raw text         0.33      0.57       0.24    0.23  0.30
  Random Forest  POS              0.37      0.52       0.27    0.26  0.33
  Random Forest  NER              0.39      0.62       0.29    0.30  0.36
  Random Forest  POS + NER        0.44      0.65       0.35    0.35  0.41
  Random Forest  WE               0.56      0.57       0.46    0.46  0.51
  Random Forest  POS + WE         0.51      0.59       0.41    0.41  0.45
  Random Forest  NER + WE         0.55      0.57       0.45    0.45  0.49
  Random Forest  POS + NER + WE   0.51      0.53       0.40    0.40  0.44

4.7. The social presence of posts with high response rate within high authority learners
To delve deeper into the characteristics of posts that received the most replies, we extracted all posts from learners in the high authority group (n = 84). Since not all of their posts had an equally high response rate, we included only the top 20% of their posts (n = 318) in terms of number of responses to identify their social presence features. We selected posts within the high authority group in order to explore whether there are consistent patterns of social presence in their most responded posts.

5. Findings

5.1. The correlations between social presence and learners' prestige in the forum
Spearman's correlation analyses were conducted to detect the correlation between social presence and learners' prestige in the forum, measured by in-degree centrality and authority score. We chose Spearman's correlation since the data was not normally distributed. Due to the conceptual similarity between in-degree centrality and authority, the correlation patterns between these two prestige measures and learners' social presence were remarkably similar (Table 4).
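Spearman's rho used here is simply the Pearson correlation computed on ranks; a minimal standard-library sketch (it assumes no tied values, unlike the full procedure in statistical packages):

```python
from statistics import mean

def spearman_rho(x, y):
    """Spearman's rank correlation for two equal-length samples without ties."""
    def to_ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        ranks = [0.0] * len(values)
        for position, index in enumerate(order, start=1):
            ranks[index] = float(position)
        return ranks
    rx, ry = to_ranks(x), to_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Because it operates on ranks, any monotonic relationship yields rho = 1 (or -1), which is why it suits skewed, non-normal data like these.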

Table 4
The correlations between social presence, in-degree and authority score.
  Indicator                      In-degree  Authority
  Total number of sentences      .744***    .708***
  Offering advice                .136**     .138**
  Expressing agreement           .150**     .134**
  Complimenting others           .181***    .201***
  Disagreement/doubts/criticism  -.100*     -.121**
  Expressing gratitude           .375***    .380***
  Negative emotions              -.152**    -.138**
  Personal opinion/reflection    -.044      -.083
  Positive emotions              .192***    .231***
  Self-disclosure                .407***    .401***
  Asking questions               .313***    .241***
  Referencing others             .173***    .139**
  Sharing resources              .478***    .454***
  Using vocatives                .380***    .390***
***. Correlation is significant at the 0.001 level (corrected by the sequential Bonferroni method). **. Correlation is significant at the 0.01 level. *. Correlation is significant at the 0.05 level.

A very high correlation was found between in-degree and authority scores (r = 0.932), indicating that they are similar to one another in terms of measuring learners' prestige in a network.

In-degree value describes the number of replies a learner receives in the discussion forums. Results from the correlation analyses showed that there were significant correlations between in-degree and five social presence indicators, namely Expressing gratitude, Self-disclosure, Asking questions, Sharing resources and Using vocatives, though the correlations were moderate (Cohen, 2013). This implies that learners with relatively higher response rates devoted a higher percentage of their posts to expressing gratitude to others (r = 0.375), disclosing personal experiences (r = 0.407), asking questions about course content or feedback from peers (r = 0.313), sharing external resources (r = 0.478) and addressing peers by name (r = 0.380). Likewise, these social presence indicators also moderately correlated with learners' authority scores, except for Asking questions, which has a smaller correlation coefficient (r = 0.241) with authority scores, indicating that Asking questions had a weaker association with one's authority score than with in-degree value.

Besides the five social presence indicators mentioned above, both in-degree and authority scores also correlated with Positive emotions, Offering advice, Expressing agreement, Complimenting others, and Referencing others, but the correlations were relatively weak (r < 0.3), indicating that these social presence indicators still affected learners' in-degree and authority, though the effects were less powerful. Interestingly, two social presence indicators, Disagreement/doubts/criticism and Negative emotions, were found to have negative, though weak, correlations with in-degree and authority scores (r > -0.2), meaning learners who expressed Disagreement/doubts/criticism or Negative emotions were less likely to receive responses from peers, which resulted in lower prestige in the forum. Last but not least, Personal opinion/reflection was the only type of social presence that had no significant correlation with in-degree or authority, implying that simply stating personal thoughts or reflecting on one's learning does not contribute to one's status in the network in any way.

5.2. Using social presence to predict learners' prestige in the learner network
In order to examine how much of the variance in students' prestige can be explained by their social presence, multiple linear regression analysis was conducted, with twelve social presence indicators as independent variables and students' in-degree and authority scores as dependent variables. We excluded Personal opinion/reflection from the independent variables because it was not significantly correlated with learners' prestige (Table 4). We also examined the correlations among the twelve social presence indicators to check for multicollinearity, but did not find any pair of them displaying inter-correlations over 0.9 (Nistor et al., 2020). Therefore, we included all of them in the prediction model. Logarithmic transformation was conducted due to the high skewness of the data. The results for the in-degree measure showed a significant R² of 0.58, F(12, 455) = 50.35, p < .001. Specifically, six social presence indicators, namely Offering advice, Expressing gratitude, Positive emotions, Self-disclosure, Asking questions and Sharing resources, were found to be significant predictors of the in-degree measure (see Table 5), which implies that these social presence indicators in combination accounted for 58% of the variance in the number of replies students received from peers.

Table 5
Regression analysis between social presence indicators and in-degree measure.
  Effect            Estimate  SE    95% CI [LL, UL]  p
  Intercept         .522      .063  [.398, .647]     .000***
  Advice            .131      .053  [.026, .235]     .014*
  Gratitude         .142      .037  [.070, .214]     .000***
  Positive emotion  .152      .052  [.049, .255]     .004**
  Self-disclosure   .236      .071  [.096, .376]     .001**
  Question          .129      .050  [.030, .228]     .011*
  Share resources   .285      .058  [.171, .398]     .000***
*p < .05. **p < .01. ***p < .001.

When using authority scores as the dependent variable, the results also showed a significant R² of 0.43, F(12, 455) = 28.23, p < .001. Six social presence indicators, Offering advice, Expressing gratitude, Positive emotions, Self-disclosure, Sharing resources and Using vocatives, were found to be significant predictors of authority score (see Table 6), suggesting that these social presence indicators in combination explained 43% of the variance in learners' authority scores. Compared with the results using the in-degree measure as the dependent variable, Asking questions contributes to the prediction of the number of replies one received but has no impact on one's authority score, whereas Using vocatives contributes to the prediction of one's authority score but has no impact on the number of replies one received.

5.3. Comparing social presence in the posts of learners with high and low prestige
To compare how social presence differed between learners with high and low prestige in the learner network, we selected the top and bottom 20% of learners based on their authority scores, which is a more comprehensive measure of prestige than in-degree value. All of the selected learners' posts were analyzed to see whether there were distinct characteristics in terms of social presence. We calculated the percentage of each type of social presence in the posts each learner created throughout the course, then conducted Mann–Whitney U tests to examine whether the differences between the high and low authority groups were significant. We chose the Mann–Whitney U test since the data was not normally distributed. Ten social presence indicators were found to have significant differences between the high and low authority groups (see Table 7, Appendix A): Complimenting others, Disagreement/doubts/criticism, Expressing gratitude, Negative emotions, Personal opinion/reflection, Self-disclosure, Asking questions, Referencing others, Sharing resources, and Using vocatives.

5.3.1. Least occurred social presence indicators in high authority learners
Consistent with the correlation results, two social presence indicators, Disagreement/doubts/criticism and Negative emotions, occurred at a low percentage in high authority learners' posts (Fig. 1). Specifically, 89% of high authority learners expressed Disagreement/doubts/criticism at a very low percentage (less than 5% of their posts), whereas only 48% of the low authority learners expressed Disagreement/doubts/criticism at such a low ratio. Similarly, the majority of high authority learners (78%) expressed a low percentage of Negative emotions (less than 5% of their posts), while only 38% of the low authority learners expressed so little Negative emotions.
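The Mann–Whitney U statistic behind these group comparisons reduces to a pairwise count; a minimal sketch (the p-values reported in Table 7 come from standard statistical software and are omitted here):

```python
def mann_whitney_u(group_a, group_b):
    """U statistic: for every cross-group pair, count 1 when the value from
    group_a exceeds the value from group_b, and 0.5 on ties."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

Because it compares ranks rather than raw values, the test makes no normality assumption, which is why it was chosen for these skewed percentage data.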

Table 6
Regression analysis between social presence indicators and authority score.
  Effect                Estimate  SE    95% CI [LL, UL]  p
  Intercept             -.006     .003  [-.011, -.001]   .027*
  Offering advice       .007      .002  [.003, .012]     .002**
  Expressing gratitude  .005      .002  [.001, .008]     .004**
  Positive emotions     .005      .002  [.001, .010]     .016*
  Self-disclosure       .012      .003  [.006, .018]     .000***
  Sharing resources     .009      .002  [.004, .014]     .000***
  Using vocatives       .008      .002  [.003, .012]     .001**
*p < .05. **p < .01. ***p < .001.

Table 7
Comparing social presence in the posts of high and low authority learners.
  Indicator                      High authority group   Low authority group   p
                                 N    Mean rank         N    Mean rank
  Offering advice                84   89.77             88   87.38             .394
  Expressing agreement           84   88.54             88   84.55             .587
  Complimenting others           84   101.86            88   71.84             .000*
  Disagreement/doubts/criticism  84   77.15             88   95.42             .016*
  Expressing gratitude           84   114.65            88   59.62             .000*
  Negative emotions              84   73.67             88   98.75             .001*
  Personal opinion/reflection    84   70.74             88   101.55            .000*
  Positive emotions              84   92.69             88   80.59             .110
  Self-disclosure                84   100.82            88   72.84             .000*
  Asking questions               84   93.78             88   79.55             .004*
  Referencing others             84   107.00            88   66.93             .000*
  Sharing resources              84   116.04            88   58.30             .000*
  Using vocatives                84   107.64            88   66.32             .000*
*. The significance level is 0.05 (two-tailed).

5.3.2. Least occurred social presence indicators in low authority learners
In low authority learners, six social presence indicators were found to occur at a low percentage: Complimenting others, Expressing gratitude, Asking questions, Self-disclosure, Sharing resources and Using vocatives. For instance, as shown in Fig. 1, 85% of the low authority learners devoted less than 5% of their posts to Sharing resources. By comparison, only 32% of the high authority learners shared resources at such a low proportion. Such a stark contrast in the ratio of Sharing resources implies that learners may receive significantly more attention in the forum if they share more resources in their posts. The same pattern was also found in Complimenting others, Expressing gratitude, Self-disclosure, Asking questions and Using vocatives, which were found to occur much more frequently in high authority learners than in low authority learners.

5.3.3. Commonalities between the high and low authority groups
Referencing others was found to be the least occurring category among all social presence indicators, in both the low authority and the high authority group. Personal opinion/reflection, on the other hand, exhibited a relatively balanced distribution of ratios in both groups (Fig. 1). However, there were still noticeable variances between the two groups. For example, almost half of the low authority learners (49%) devoted more than 30% of their posts to expressing personal opinions and reflections, while only 14% of high authority learners did the same. This pattern implies that, in general, Personal opinion/reflection occupied less space in high authority learners' posts compared to those of the low authority learners.

5.4. The social presence of posts with high response rate
To delve deeper into the social presence of high authority learners and identify the consistent patterns of their highly responded posts, we extracted all of the high authority learners' posts and analyzed the top 20% of them (n = 318) in terms of number of responses. Fig. 2 shows the average percentage of each social presence indicator in their highly responded posts. It is noticeable that Personal opinion/reflection is the dominant social presence indicator, occupying 52% of these high authority learners' most responded posts. This is expected, since those who received many responses oftentimes contributed a lot of original thoughts and reflection (typically explaining how they approached the problem), which ultimately helped them achieve high prestige in the forum. Expressing gratitude, the second most frequent social presence indicator, makes up 11% of their posts. This is also anticipated, since the well-crafted posts of these high authority learners triggered a lot of feedback from peers, which led to them thanking peers for their feedback. The amount of gratitude high authority learners expressed also reflects that they engaged in more conversations with different peers than average or low authority learners did. Using vocatives and Sharing resources both occupy around 7% of their posts. Using vocatives is typically used in combination with Expressing gratitude (e.g., "Hi Alex! Thanks for your feedback"). Therefore, the frequent occurrence of Using vocatives also reflects the frequent conversations high authority learners had with their peers. Besides, Sharing resources also seems to be a good approach to trigger responses. Interestingly, among these highly responded posts, high authority learners expressed more Positive emotions (6%) than Negative emotions (4%).

6. Discussions and conclusions

In order to understand learners' social presence in the MOOC context, this study examined the relationship between social presence and learners' prestige in the learner network of a MOOC. An automated text classification model based on the latest machine learning techniques was developed to identify different social presence indicators from forum posts, while two metrics (in-degree and authority score) in social network analysis (SNA) were used to measure learners' prestige in the learner network. Results revealed that certain social presence indicators such as Asking questions, Expressing gratitude, Self-disclosure, Sharing resources and Using vocatives have positive correlations with learners' prestige, while the expressions of Disagreement/doubts/criticism and Negative emotions were counterproductive to learners' prestige. The findings will inform MOOC learners how to strategically present themselves in the discussion forums to increase the possibilities of peer interaction and achieve productive learning outcomes. For MOOC instructors, this study will potentially inform how to present themselves to gain greater influence in the learning network and mediate the discussions more effectively as facilitators.

To combat the challenge of analyzing text data at large scale, this study proposed a text classifier based on BERT, a recent and revolutionary model in NLP, to analyze forum posts in a MOOC in terms of social presence. Our model leverages BERT's exceptional capacity to achieve higher accuracy due to its sensitivity to the contextual information of words and its powerful pre-trained data, which is significant progress compared to traditional methods that isolate words from their contexts, such as the BOW approach (Almatrafi et al., 2018; Wise, Cui, & Vytasek, 2016; Wise et al., 2017). By comparing the performance of BERT models with different added linguistic features, this study concluded that BERT with NER yielded the best results on our dataset. Our model achieved performance in a more challenging text classification task (classifying text into ten categories) comparable to that of previous classifiers trained to classify text into relatively fewer categories (typically less than five) (Hu et al., 2018; Kovanović et al., 2016; Xing et al., 2019).

Correlation analyses were conducted to investigate the association between social presence and learners' prestige. Results revealed that both in-degree and authority scores were positively correlated with certain social presence indicators including Asking questions, Expressing gratitude, Self-disclosure, Sharing resources and Using vocatives.
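The per-learner percentages on which these findings rest (Section 4.4) are simple proportions of coded sentences; a sketch with illustrative labels:

```python
from collections import Counter

def indicator_percentages(sentence_labels):
    """Share of each social presence indicator among a learner's coded sentences."""
    counts = Counter(sentence_labels)
    total = len(sentence_labels)
    return {label: count / total for label, count in counts.items()}
```

For a learner with ten coded sentences, two of which are questions, Asking questions comes out at 0.2, matching the example given in Section 4.4.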

Fig. 1. Comparisons of four social presence indicators between high and low authority learners.

This implies that to gain higher popularity or prestige, one should include a higher percentage of these five social presence components in one's posts. An example could be: "Hi Chris, I am glad to hear that you had a good time in Paris, which is my hometown! And thank you for your feedback on my graph, I created it using Tableau (www.tableau.com). I am a beginner so i don't know many other visualization tools, would you recommend some?" This example covers the five social presence indicators that had relatively stronger correlations with a learner's popularity and prestige: it starts with addressing an individual by name, discloses personal information, expresses gratitude, shares the link of a tool used to complete the task, and ends with a question. By combining these essential elements in a post, the author personalizes the conversation, provides valuable resources, and presents himself/herself as a friendly and constructive member of the community, and is thus more likely to achieve higher prestige in the forums.

Besides these five social presence indicators, weaker but significant correlations with in-degree and authority scores were found for Offering advice, Expressing agreement, Complimenting others, Positive emotions, and Referencing others, revealing that giving advice, agreement, compliments, and positive emotions, as well as referencing others' messages, may also help learners foster more interactive dialogues and gain popularity in the forum, though less effectively than Asking questions, Expressing gratitude, Self-disclosure, Sharing resources and Using vocatives.

Despite the discrepancies regarding the effects of social presence in previous studies, with some finding that social presence makes a significant contribution to learning outcomes (Cobb, 2011; Gunawardena & Zittle, 1997) while others showed that it served only small or ancillary functions in learning (Díaz, Swan, Ice, & Kupczynksi, 2010; Ke, 2010; Shea & Bidjerano, 2009), the results of this study established a positive link between social presence and learners' network prestige in the learning community of a MOOC.
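Three of these influential indicators (Asking questions, Sharing resources, Using vocatives) are the ones the pipeline detected with surface rules rather than the classifier (Section 4.2); such rules can be sketched as follows (the regular expression and name list are illustrative assumptions, not the study's exact rules):

```python
import re

def rule_based_labels(sentence, student_names):
    """Detect Asking questions, Sharing resources, and Using vocatives
    from surface markers, as described in Section 4.2."""
    labels = []
    if sentence.rstrip().endswith("?"):
        labels.append("Asking questions")
    if re.search(r"https?://\S+|www\.\S+", sentence):
        labels.append("Sharing resources")
    if any(name in sentence for name in student_names):
        labels.append("Using vocatives")
    return labels
```

Applied to the example post above, such rules would flag the vocative "Chris", the Tableau link, and the closing question.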

An interesting finding was that Disagreement/doubts/criticism was negatively correlated with learners' prestige, suggesting that the expression of Disagreement/doubts/criticism is linked to a decrease in one's prestige in the forums. However, the correlations were small (r < 0.2) (Cohen, 2013). In a social learning context, it is not uncommon to see the emergence of disagreements, typically when a learner raises doubts or concerns when critiquing the work of others. In doing so, learners may risk offending others by pointing out problems, which is not conducive to establishing rapport with peers. Nonetheless, it is important to bring up conflicting viewpoints, because dissenting information may disrupt one's existing cognitive framework and create a state of disequilibrium, triggering a learner to reflect on his or her prior knowledge and make adjustments to accommodate the new information. This process enables higher order thinking and is crucial for knowledge construction, according to the constructivist view of learning (Piaget, 2013; Vygotsky, 1980). Therefore, it is beneficial and sometimes necessary to express Disagreement/doubts/criticism.

trigger responses, learners who experience negative emotions may also consider turning their frustration into specific questions and ask peers or the instructor for possible assistance.

Fig. 2. The social presence of high authority learners' posts with high response rate.

Personal opinion/reflection was the only type of social presence that had no significant correlation with in-degree or authority, implying that simply stating personal thoughts or reflecting on one's learning does not contribute to one's status in the network in any way. In our research context, each module of the MOOC required students to complete a data visualization related task (e.g., creating or critiquing a figure) and post their results and reflections in the forums. Completing the required tasks and writing a brief reflection (coded as Personal opinion/reflection) on the task were the minimum requirements to get the course credit. As a result, most students did only the minimum for the course credit, while some committed students put in extra time and effort to read and comment on the posts of their peers, which eventually helped them achieve higher authority in the learner network. Therefore, it is expected in this context that Personal opinion/reflection does not distinguish learners with high and low prestige.

The multiple linear regression analysis showed that six social presence indicators, namely Offering advice, Expressing gratitude, Positive emotions, Self-disclosure, Asking questions and Sharing resources, were significant predictors of the in-degree measure and in combination accounted for 58% of the variance in in-degree, whereas Offering advice, Expressing gratitude, Positive emotions, Self-disclosure, Sharing resources and Using vocatives were found to be significant predictors of authority score and in total explained 43% of the variance in learners' authority scores. These results suggest that when learners provide advice, express gratitude and positive emotions, and share personal background/stories and relevant resources, they are more likely to elevate their centrality in the learner network, which is exhibited both by attracting more replies and by connecting with more prestigious peers. In our research context, what students were required to do in the forums was typically to complete a data visualization task (creating an interactive graph of a given or self-chosen theme) and to critique the work of peers. The social presence indicators we found to significantly predict learners' prestige seem to follow the typical interaction patterns that emerged from the collaborative dialog in the forums. That is, a learner started a thread, posted his or her response
Our findings also suggest that the expression of Negative emotions is to the assignment that oftentimes contained external links to his or her
negatively correlated with learners’ prestige, which is echoed by the data visualization artifact and/or the dataset/tools he or she used to
relatively lower percentage of Negative emotions (4%) in high authority create that artifact (Sharing resources). His or her post then attracted
learners’ most responded posts. However, due to the diverse back­ peers’ positive comments and advices (Positive emotions and Offering
grounds of learners in MOOC contexts, it is natural that some of them advice), with the disclosure of one’s personal background/experience
experienced struggles because of the lack of prerequisite knowledge, (Self-disclosure) and expressions of gratitude for others’ new information
resulting in the emergence of negative emotions such as feeling chal­ and advice (Expressing gratitude) weaving in and out of the conversation
lenged and confused. These posts might not invite as many responses as threads. This flow of communication reflects a typical process of expe­
riential learning, in which participation in specific activities and the
those that convey more positive emotions, they are still necessary
because voicing confusion, frustration and complaints help MOOC in­ subsequent negotiation of meaning trigger a cognitive process based on
cycles of making an experience, reflecting on it in terms of observable
structors locate the struggling learners and the difficulties in the course
content. Over the last five years, there have been a lot of research efforts artifacts or abstract concepts, and planning follow-up experiences (Kolb,
1984). More interestingly, the regression results of using in-degree
focusing on identifying learners’ emotions, confusions and help-seeking
behaviors by analyzing their forum posts (Agrawal et al., 2015; Alma­ measure as dependent variables showed that Asking questions contrib­
utes to the prediction of in-degree, but not to one’s authority score;
trafi et al., 2018; Chandrasekaran, Kan, Tan, & Ragupathi, 2015;
Hecking, Hoppe, & Harrer, 2015). These studies developed new tech­ whereas Using vocatives significantly predicts one’s authority score, but
has no impact on the in-degree. This finding suggests that asking ques­
niques attempting to accurately detect learners’ struggles during the
learning process. As these new techniques are gradually being incor­ tions may increase the number of replies one receives, but not neces­
sarily attract the responses from well-connected peers. Whereas
porated in the MOOC forums, learners’ explicit expression of negative
emotions when encountering difficulties will help alert the instructors to addressing peers by names (Using vocatives) in forum conversations,
though may not have a significant impact on triggering more responses,
step in and provide timely support. Previous studies on MOOCs provided
ample evidence that struggling learners were most vulnerable to lose does help to establish social ties with more well-connected peers. Ac­
cording to Rourke et al. (1999), the researchers who first proposed the
motivation and drop out due to delayed or lack of timely support
(Agrawal et al., 2015; Wen et al., 2014; Xing, Chen, Stein, & Marcin­ social presence framework, group cohesion is exemplified by activities
kowski, 2016). However, in online learning contexts with massive that build and sustain a sense of group commitment. Vocatives are an
number of learners, it is hard for the instructors to identify struggling important expression of group cohesion because they connote feelings of
learners if the learner did not voice their problems or ask for help. Based closeness and association. The teacher immediacy literature has also
on our finding that Asking questions in a post is an effective strategy to discovered an empirical connection between addressing students by
name and cognitive, affective, and behavioral learning (Christenson &


Menzel, 1998; Gorham, 1988; Gorham & Christophel, 1990; Sanders & Wiseman, 1990). Seeking to explain this connection, Kelly and Gorham (1988) found support for a relationship between vocatives and immediacy of recall. Eggins and Slade (1997) support the use of vocatives to facilitate social presence, noting that “the use of vocatives would tend to indicate an attempt by the addresser to establish a closer relationship with the addressee”. Our findings imply that, besides the aforementioned five social presence indicators, asking questions and explicitly addressing a post to a specific peer in a thread by using names are conducive to increasing one’s likelihood of attracting more attention and building social ties with more well-connected peers, which in turn elevates one’s own centrality in the learning community.

According to Lave and Wenger (1991), participants in a learning community display a certain diversity in terms of their positions in the relatively stable communication and interaction patterns (peripheral vs. intermediate vs. central participants). The social component of these structures rests on the social identity of their members within the community, and is inseparably intertwined with the cognitive component. Specifically, central participants assume more responsibility and perform more difficult tasks than peripheral members; therefore, their identity is that of an expert, which is both a cognitive attribute and a socially negotiated status (Lave & Wenger, 1991). In our context, the act of Sharing resources (providing external links to one’s data visualization artifact or the dataset/tools used to create that artifact) characterizes a more committed learner who put in the effort to explore interactive tools to build a more sophisticated artifact and shared the details/resources of how to build it, compared to less committed learners who simply copied and pasted the static figures they created into the forum. Besides, those who were willing to give advice to others also demonstrated more commitment to learning or more expertise in the topic. Furthermore, the use of vocatives, positive tones, the expression of gratitude, and the disclosure of one’s own background/stories all manifest one’s attempt to establish closer relationships with others. This echoes the study of Gunawardena et al. (2001), who found that social presence facilitates the building of trust. All these communication patterns help us distinguish central and peripheral participants based on Lave and Wenger’s (1991) definition. The types of social presence we identified in the regression analysis facilitate knowledge construction in a learning community, which is reified in material or conceptual artifacts that enable participation at a higher level. This interplay of participation and reification, as described by Wenger (1998), is the quintessence of the learning process in communities of practice.

The comparisons of social presence between learners with high and low authority scores echo the patterns found in the correlation analyses, with high authority learners composing their posts with a higher percentage of Asking questions, Expressing gratitude, Self-disclosure, Sharing resources and Using vocatives, and a low percentage of Disagreement/doubts/criticism and Negative emotions. Additionally, Personal opinion/reflection occupied a high percentage in the posts of both high and low authority learners, while Referencing others occurred equally infrequently in both groups. These findings imply that both Personal opinion/reflection and Referencing others had a marginal association with learners’ prestige in the network.

Interestingly, though Personal opinion/reflection did not make a significant contribution to learners’ authority score, it dominated a high ratio (32%–65%) of the most responded posts of high authority learners. Conversely, Asking questions and Self-disclosure had significant correlations with learners’ authority score, yet they seemed to occur at a lower percentage among the most popular posts. These patterns are expected because most learners only asked a limited number of questions, or shared a very brief personal experience, in one post. The majority of their posts were devoted to explaining their thoughts and the way they approached the questions or solved the problems they were assigned. Another interesting finding in the most responded posts is that Offering advice has the potential to attract replies from peers, though this pattern only appears in 5% of the most responded posts.

In all, while previous studies mostly focused on the relationship between network centrality and learning outcomes (Jiang et al., 2014; Joksimović et al., 2016), this study filled a gap in the existing literature on social presence by presenting the association between learners’ network centrality and social presence. By introducing social presence as a framework to examine posts in the discussion forum, this study revealed that learners’ prestige in MOOC forums was also related to the ways they presented themselves. Our findings are significant not only in reinforcing the crucial role of social presence in active, participative learning activities in online learning environments, but also in lending insight into learners’ strategies of leveraging social presence to become more influential in social learning contexts like MOOCs.

7. Limitation

In this study, we examined the link between social presence and learners’ prestige in the learning community of a MOOC. Due to the unique context of this study, the results may not be generalizable. Further studies are needed to investigate the impact of social presence using datasets from MOOCs on other topics to test the reliability and validity of our findings. Methodologically, this study used SNA and content analysis to examine the role social presence played in learners’ engagement during the learning process. Future studies may explore other integrative methods that combine network and semantic analysis, such as cohesion network analysis, to model the interactions between learners from both social and cognitive perspectives.

CRediT authorship contribution statement

Wenting Zou: Conceptualization, Methodology, Writing - original draft. Xiao Hu: Supervision, Conceptualization, Methodology, Writing - review & editing. Zilong Pan: Methodology, Visualization, Writing - review & editing. Chenglu Li: Data curation, Software, Writing - review & editing. Ying Cai: Writing - review & editing. Min Liu: Supervision.

Appendix A. Comparisons of social presence between high and low authority learners


References

Agrawal, A., Venkatraman, J., Leonard, S., & Paepcke, A. (2015). YouEDU: Addressing confusion in MOOC discussion forums by recommending instructional video clips. In Proceedings of the 8th international conference on education data mining (pp. 297–304). New York, NY, USA: ACM.
Al-Rahmi, W. M., Alias, N., Othman, M. S., Marin, V. I., & Tur, G. (2018). A model of factors affecting learning performance through the use of social media in Malaysian higher education. Computers & Education, 121, 59–72.
Almatrafi, O., Johri, A., & Rangwala, H. (2018). Needle in a haystack: Identifying learner posts that require urgent response in MOOC discussion forums. Computers & Education, 118, 1–9.
Anderson, A., Huttenlocher, D., Kleinberg, J., & Leskovec, J. (2014, April). Engaging with massive online courses. In Proceedings of the 23rd international conference on world wide web (pp. 687–698). ACM.
Arbaugh, J. (2008). Does the community of inquiry framework predict outcomes in online MBA courses? International Review of Research in Open and Distance Learning, 9(2), 1–21.
Atapattu, T., & Falkner, K. (2016, April). A framework for topic generation and labeling from MOOC discussions. In Proceedings of the third (2016) ACM conference on learning@ scale (pp. 201–204). ACM.
Blau, I., & Caspi, A. (2008). Social presence in online discussion groups: Testing three conceptions and their relations to perceived learning. Social Psychology of Education, 11(3), 323–346.
Chandrasekaran, M. K., Kan, M. Y., Tan, B. C., & Ragupathi, K. (2015). Learning instructor intervention from MOOC forums: Early results and issues. arXiv preprint arXiv:1504.07206.
Cho, H., Gay, G., Davidson, B., & Ingraffea, A. (2007). Social networks, communication styles, and learning performance in a CSCL community. Computers & Education, 49(2), 309–329.
Christenson, L., & Menzel, K. (1998). The linear relationship between student reports of teacher immediacy behaviors and perceptions of state motivation, and of cognitive, affective and behavioral learning. Communication Education, 47, 82–90.
Chung, C. K., & Pennebaker, J. W. (2014). Using computerized text analysis to track social processes. The Oxford Handbook of Language and Social Psychology, 219–230.
Cobb, S. C. (2011). Social presence, satisfaction, and perceived learning of RN-to-BSN students in web-based nursing courses. Nursing Education Perspectives, 32(2), 115.
Cohen, J. (2013). Statistical power analysis for the behavioral sciences. Academic Press.
Dascalu, M., Dessus, P., Bianco, M., Trausan-Matu, S., & Nardy, A. (2014). Mining texts, learner productions and strategies with ReaderBench. In Educational data mining (pp. 345–377). Cham: Springer.
Dascalu, M., McNamara, D. S., Trausan-Matu, S., & Allen, L. K. (2018). Cohesion network analysis of CSCL participation. Behavior Research Methods, 50(2), 604–619.
De Barba, P. G., Kennedy, G. E., & Ainley, M. D. (2016). The role of students’ motivation and participation in predicting performance in a MOOC. Journal of Computer Assisted Learning, 32(3), 218–231.
De Laat, M., Lally, V., Lipponen, L., & Simons, R. J. (2007). Investigating patterns of interaction in networked learning and computer-supported collaborative learning: A role for social network analysis. International Journal of Computer-Supported Collaborative Learning, 2(1), 87–103.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Díaz, S. R., Swan, K., Ice, P., & Kupczynski, L. (2010). Student ratings of the importance of survey items, multiplicative factor analysis, and the validity of the community of inquiry survey. The Internet and Higher Education, 13(1–2), 22–30.
Dixson, M. D. (2010). Creating effective student engagement in online courses: What do students find engaging? The Journal of Scholarship of Teaching and Learning, 1–13.
Dowell, N. M., Skrypnyk, O., Joksimovic, S., Graesser, A. C., Dawson, S., Gašević, D., & Kovanovic, V. (2015). Modeling learners’ social centrality and performance through language and discourse. In Proceedings of the 8th international conference on educational data mining (pp. 205–257). Madrid: IEDMS.
Eggins, S., & Slade, D. (1997). Analyzing casual conversation. Washington, DC: Cassell.
Ezen-Can, A., Boyer, K. E., Kellogg, S., & Booth, S. (2015, March). Unsupervised modeling for understanding MOOC discussion forums: A learning analytics approach. In Proceedings of the fifth international conference on learning analytics and knowledge (pp. 146–150). ACM.
Fedus, W., Goodfellow, I., & Dai, A. M. (2018). MaskGAN: Better text generation via filling in the_. In Proceedings of the sixth international conference on learning representations (ICLR).
Fidalgo-Blanco, Á., Sein-Echaluce, M. L., & García-Peñalvo, F. J. (2016). From massive access to cooperation: Lessons learned and proven results of a hybrid xMOOC/cMOOC pedagogical approach to MOOCs. International Journal of Educational Technology in Higher Education, 13(1), 24.
García-Peñalvo, F. J., Fidalgo-Blanco, Á., & Sein-Echaluce, M. L. (2018). An adaptive hybrid MOOC model: Disrupting the MOOC concept in higher education. Telematics and Informatics, 35(4), 1018–1030.
Garrison, D. R., Anderson, T., & Archer, W. (2001). Critical thinking, cognitive presence, and computer conferencing in distance education. American Journal of Distance Education, 15(1), 7–23.
Garrison, D. R., & Arbaugh, J. B. (2007). Researching the community of inquiry framework: Review, issues, and future directions. The Internet and Higher Education, 10(3), 157–172.
Gašević, D., Joksimović, S., Eagan, B. R., & Shaffer, D. W. (2019). SENS: Network analytics to combine social and cognitive perspectives of collaborative learning. Computers in Human Behavior, 92, 562–577.
Gorham, J. (1988). The relationship between verbal teacher immediacy behaviors and student learning. Communication Education, 37, 40–53.
Gorham, J., & Christophel, D. (1990). The relationship of teachers’ use of humor in the classroom to immediacy and student learning. Communication Education, 39, 46–61.
Gunawardena, C. N., Nolla, A. C., Wilson, P. L., Lopez-Islas, J. R., Ramirez-Angel, N., & Megchun-Alpizar, R. M. (2001). A cross-cultural study of group process and development in online conferences. Distance Education, 22(1), 85–121.
Gunawardena, C. N., & Zittle, F. J. (1997). Social presence as a predictor of satisfaction within a computer-mediated conferencing environment. American Journal of Distance Education, 11(3), 8–26.
Gütl, C., Rizzardini, R. H., Chang, V., & Morales, M. (2014, September). Attrition in MOOC: Lessons learned from drop-out students. In International workshop on learning technology for education in cloud (pp. 37–48). Cham: Springer.
Hecking, T., Hoppe, H. U., & Harrer, A. (2015, August). Uncovering the structure of knowledge exchange in a MOOC discussion forum. In 2015 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM) (pp. 1614–1615). IEEE.
Henri, F., & Pudelko, B. (2003). Understanding and analysing activity and learning in virtual communities. Journal of Computer Assisted Learning, 19(4), 474–487.
Hommes, J., Rienties, B., de Grave, W., Bos, G., Schuwirth, L., & Scherpbier, A. (2012). Visualising the invisible: A network approach to reveal the informal social side of student learning. Advances in Health Sciences Education, 17(5), 743–757.
Houston, S. L., II, Brady, K., Narasimham, G., & Fisher, D. (2017, April). Pass the idea please: The relationship between network position, direct engagement, and course performance in MOOCs. In Proceedings of the fourth (2017) ACM conference on learning@ scale (pp. 295–298). ACM.
Hu, J., Dowell, N., Brooks, C., & Yan, W. (2018, June). Temporal changes in affiliation and emotion in MOOC discussion forum discourse. In International conference on artificial intelligence in education (pp. 145–149). Cham: Springer.
Jiang, S., Fitzhugh, S. M., & Warschauer, M. (2014, July). Social positioning and performance in MOOCs. In Workshop on graph-based educational data mining (Vol. 14).
Joksimović, S., Gašević, D., Kovanović, V., Riecke, B. E., & Hatala, M. (2015). Social presence in online discussions as a process predictor of academic performance. Journal of Computer Assisted Learning, 31(6), 638–654.
Joksimović, S., Manataki, A., Gašević, D., Dawson, S., Kovanović, V., & De Kereki, I. F. (2016, April). Translating network position into performance: Importance of centrality in different network configurations. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 314–323). ACM.
Joo, Y. J., So, H. J., & Kim, N. H. (2018). Examination of relationships among students’ self-determination, technology acceptance, satisfaction, and continuance intention to use K-MOOCs. Computers & Education, 122, 260–272.
Ke, F. (2010). Examining online teaching, cognitive, and social presence for adult students. Computers & Education, 55(2), 808–820.
Kelly, D., & Gorham, J. (1988). Effects of immediacy on recall of information. Communication Education, 37, 198–207.
Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5), 604–632.
Kolb, D. A. (1984). Experiential learning. Englewood Cliffs, NJ: Prentice Hall.
Kovanović, V., Joksimović, S., Waters, Z., Gašević, D., Kitto, K., Hatala, M., et al. (2016, April). Towards automated content analysis of discussion transcripts: A cognitive presence case. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 15–24). ACM.
Kozan, K., & Richardson, J. C. (2014). Interrelationships between and among social, teaching, and cognitive presence. The Internet and Higher Education, 21, 68–73.
Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., & Soricut, R. (2019). ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.
Liu, W., Kidziński, Ł., & Dillenbourg, P. (2016a). Semi-automatic annotation of MOOC forum posts. In State-of-the-art and future directions of smart learning (pp. 399–408). Singapore: Springer.
Liu, Y., Sun, C., Lin, L., & Wang, X. (2016b). Learning natural language inference using bidirectional LSTM model and inner-attention. arXiv preprint arXiv:1605.09090.
Liu, M., Zou, W., Shi, Y., Pan, Z., & Li, C. (2019). What do participants think of today’s MOOCs: An updated look at the benefits and challenges of MOOCs designed for working professionals. Journal of Computing in Higher Education, 1–23.
Malinen, S. (2015). Understanding user participation in online communities: A systematic literature review of empirical studies. Computers in Human Behavior, 46, 228–238.
McKerlich, R., Rils, M., Anderson, T., & Eastman, B. (2011). Student perceptions of teaching presence, social presence, and cognitive presence in a virtual world. MERLOT Journal of Online Learning and Teaching, 7(3), 324–336.
McMillan, D. W., & Chavis, D. M. (1986). Sense of community: A definition and theory. Journal of Community Psychology, 14(1), 6–23.
Meyer, K. A. (2004). Putting the distance learning comparison study in perspective: Its role as personal journey research. Online Journal of Distance Learning Administration, 7(1). Retrieved from http://www.westga.edu/~distance/ojdla/spring71/meyer71.pdf.
Nistor, N., Dascalu, M., Serafin, Y., & Trausan-Matu, S. (2018). Automated dialog analysis to predict blogger community response to newcomer inquiries. Computers in Human Behavior, 89, 349–354.
Nistor, N., Dascalu, M., Tarnai, C., & Trausan-Matu, S. (2020). Predicting newcomer integration in online learning communities: Automated dialog assessment in blogger communities. Computers in Human Behavior, 105, 106202.
Nogueira, R., & Cho, K. (2019). Passage re-ranking with BERT. arXiv preprint arXiv:1901.04085.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., … Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12, 2825–2830.
Piaget, J. (2013). The construction of reality in the child (Vol. 82). Routledge.
Picciano, A. G. (2002). Beyond student perceptions: Issues of interaction, presence, and performance in an online course. Journal of Asynchronous Learning Networks, 6(1), 21–40.
Poquet, P., & Dawson, D. (2016, April). Untangling MOOC learner networks. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 208–212). ACM.
Poquet, O., Kovanović, V., de Vries, P., Hennis, T., Joksimović, S., Gašević, D., et al. (2018). Social presence in massive open online courses. International Review of Research in Open and Distance Learning, 19(3).
Pursel, B. K., Zhang, L., Jablokow, K. W., Choi, G. W., & Velegol, D. (2016). Understanding MOOC students: Motivations and behaviours indicative of MOOC completion. Journal of Computer Assisted Learning, 32(3), 202–217.
Rosé, C. P., Carlson, R., Yang, D., Wen, M., Resnick, L., Goldman, P., et al. (2014, March). Social factors that contribute to attrition in MOOCs. In Proceedings of the first ACM conference on Learning@ scale (pp. 197–198). ACM.
Rourke, L., Anderson, T., Archer, W., & Garrison, D. R. (1999). Assessing social presence in asynchronous text-based computer conferencing. The Journal of Distance Education, 14(3), 51–70.
Rovai, A. P. (2002). Sense of community, perceived cognitive learning, and persistence in asynchronous learning networks. The Internet and Higher Education, 5(4), 319–332.
Sanders, J., & Wiseman, R. (1990). The effects of verbal and nonverbal teacher immediacy on perceived cognitive, affective, and behavioral learning in the multicultural classroom. Communication Education, 39, 341–353.
Shea, P., & Bidjerano, T. (2009). Community of inquiry as a theoretical framework to foster “epistemic engagement” and “cognitive presence” in online education. Computers & Education, 52(3), 543–553.
Shea, P., Hayes, S., Vickers, J., Gozza-Cohen, M., Uzuner, S., Mehta, R., & Rangan, P. (2010). A re-examination of the community of inquiry framework: Social network and content analysis. The Internet and Higher Education, 13(1–2), 10–21.
Short, J. A., Williams, E., & Christie, B. (1976). The social psychology of telecommunications. New York: John Wiley & Sons.
Stiller, K. D., & Bachmaier, R. (2017). Dropout in an online training for trainee teachers. European Journal of Open, Distance and E-Learning, 20(1), 80–95.
Vargas, D. L., Bridgeman, A. M., Schmidt, D. R., Kohl, P. B., Wilcox, B. R., & Carr, L. D. (2018). Correlation between student collaboration network centrality and academic performance. Physical Review Physics Education Research, 14(2), Article 020112.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998–6008).
Vygotsky, L. S. (1980). Mind in society: The development of higher psychological processes. Harvard University Press.
Wang, W., Guo, L., He, L., & Wu, Y. J. (2019). Effects of social-interactive engagement on the dropout ratio in online learning: Insights from MOOC. Behaviour & Information Technology, 38(6), 621–636.
Wenger, E. (1998). Communities of practice: Learning, meaning, and identity. Cambridge, UK: Cambridge University Press.
Wenger, E., McDermott, R. A., & Snyder, W. (2002). Cultivating communities of practice: A guide to managing knowledge. Harvard Business Press.
Wenger, E., Trayner, B., & De Laat, M. (2011). Promoting and assessing value creation in communities and networks: A conceptual framework (Vol. 20, pp. 2010–2011). The Netherlands: Ruud de Moor Centrum.
Wen, M., Yang, D., & Rose, C. (2014, July). Sentiment analysis in MOOC discussion forums: What does it tell us? In Educational data mining 2014.
Wise, A. F., & Cui, Y. (2018). Learning communities in the crowd: Characteristics of content related interactions and social relationships in MOOC discussion forums. Computers & Education, 122, 221–242.
Wise, A. F., Cui, Y., Jin, W., & Vytasek, J. (2017). Mining for gold: Identifying content-related MOOC discussion threads across domains through linguistic modeling. The Internet and Higher Education, 32, 11–28.
Wise, A. F., Cui, Y., & Vytasek, J. (2016, April). Bringing order to chaos in MOOC discussion forums with content-related thread identification. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 188–197). ACM.
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., & Brew, J. (2019). HuggingFace’s transformers: State-of-the-art natural language processing. ArXiv, arXiv-1910.
Xing, W., Chen, X., Stein, J., & Marcinkowski, M. (2016). Temporal prediction of dropouts in MOOCs: Reaching the low hanging fruit through stacking generalization. Computers in Human Behavior, 58, 119–129.
Xing, W., Tang, H., & Pei, B. (2019). Beyond positive and negative emotions: Looking into the role of achievement emotions in discussion forums of MOOCs (p. 100690). The Internet and Higher Education.
Xu, B., Guo, X., Ye, Y., & Cheng, J. (2012). An improved random forest classifier for text categorization. Journal of Computers, 7(12), 2913–2920.
Yang, D., Sinha, T., Adamson, D., & Rosé, C. P. (2013, December). Turn on, tune in, drop out: Anticipating student dropouts in massive open online courses. In Proceedings of the 2013 NIPS data-driven education workshop (Vol. 11, p. 14).
Yang, H. L., & Tang, J. H. (2003). Effects of social network on students’ performance: A web-based forum study in Taiwan. Journal of Asynchronous Learning Networks, 7(3), 93–107.
Zhang, M., Yin, S., Luo, M., & Yan, W. (2017). Learner control, user characteristics, platform differences, and their role in adoption intention for MOOC learning in China. Australasian Journal of Educational Technology, 33(1).
