
Proc. of 2020 7th Int. Conf. on Information Tech., Computer, and Electrical Engineering (ICITACEE)

Classification of Big Five Personality Behavior Tendencies Based On Study Field with Twitter Analysis Using Support Vector Machine

Denis Eka Cahyani
Department of Informatics
Universitas Sebelas Maret
Surakarta, Indonesia
denis.eka@staff.uns.ac.id

Anas Falih Faishal
Department of Informatics
Universitas Sebelas Maret
Surakarta, Indonesia
anas_ishal@student.uns.ac.id

Abstract— Behavioral tendencies can be seen from a person's characteristics or personality. A person's reaction to something can be seen on Twitter through the words written in tweets, so that their character or personality can be inferred. Personality can change under the influence of several factors, one of which is the field of study. The field of study a person pursues shapes the environmental factors present during the study process, for example the study environment during lectures. Therefore, this study conducted a Twitter analysis to classify Big Five Personality behavioral tendencies based on the field of study. Fields of study are grouped into seven groups, namely art/humanities, law, economics, medicine, political science, psychology, and science. The classification is carried out in several scenarios that differ in the number of tweets used. The highest classification accuracy is obtained with 300 tweets, with an accuracy of 80.5% for SVM and 82% for MNB. Compared with previous research, the Big Five traits found agree for every field of study except economics.

Keywords—BFI, Big Five Personality, Field of Study, SVM, Twitter

I. INTRODUCTION
Behavior is a human activity that can be observed and studied [1]. The dominant aspect in determining a person's behavior is personality. Personality reflects a person's characteristics and habits in behaving. A person has a personality from childhood, and it can change over time because it is influenced by several factors, such as education and the environment. One example of an educational and environmental factor that influences personality is the field of study pursued. Fields of study such as medicine or law influence both education and the environment [2].
One theory of personality that has been widely studied is the Big Five Personality. The Big Five began to be discovered by Klages in 1926, from 4500 traits that were condensed into five big traits [3]. The five traits are Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism (OCEAN). A person's traits can be determined in several ways, one of which is a personality test. One of the instruments for the Big Five Personality is the Big Five Inventory (BFI). The BFI is the easiest and most efficient Big Five personality test compared with other Big Five tests or instruments. Its advantage is its short length, which saves a lot of testing time. This avoids boredom and fatigue among respondents, so that the best results are obtained [3].
Besides tests, personality is also visible in the characteristics of daily life. Nowadays, people's activities can be seen through their social media, such as daily activities, expressions, and opinions. This public social media information can be used as an observation to find out personality. Twitter is a frequently and widely used social medium; according to the official page kominfo.go.id, Indonesia ranks among the top five countries by number of Twitter users [4]. Twitter also ranks highest for bridging social capital compared with Instagram, Facebook, and Snapchat [5]. Information about the behavioral tendencies of Twitter users can be obtained from the tweets they write.
The relationship between Big Five traits and the field of study has been widely discussed. The work in [6] summarizes, in a systematic review, consistent results on personality across the fields of art or humanities, law, economics, medicine, political science, psychology, and science. The study in [7] built a personality prediction system from Indonesian Twitter user information. Evaluation using 10-fold cross-validation showed that the system achieved its highest average accuracy of 76.2310% with Support Vector Machine and 97.9962% with XGBoost. The work in [8] discusses how information from Twitter users can be used to describe Big Five personality. Testing used the Support Vector Regression method, and the best result came from a model combining social behavior features and linguistic bigrams, with the smallest MAE of 0.2739. The work in [9] presents a method for predicting the personality of Twitter users from general information in Twitter profiles. Testing with ZeroR and Gaussian Process using 10-fold cross-validation yielded an MAE of 11-18% across all five traits. The work in [10] classified personality based on Twitter text using several classification methods, namely MNB, KNN, and SVM, with English and Indonesian datasets. The accuracies obtained were 63% for MNB, 60% for KNN, and 61% for SVM. In [11], classification with Twitter data found that Support Vector Machine (SVM), with an accuracy of 99.89%, outperformed Multinomial Naïve Bayes (MNB).
Based on the discussion above, this study conducts a Twitter analysis of UNS (Universitas Sebelas Maret) students to classify behavioral tendencies based on the field of study using the traits of the Big Five Personality. The classification uses the SVM and MNB methods for comparison.

II. STUDY LITERATURE
A. Big Five Personality
The Big Five model is a personality structure that has been widely studied and serves as a reference for research that measures a person's personality [9]. The five factors described by Goldberg [3] begin with Extraversion (E), which relates to relationships. Extroverted individuals tend to be

978-1-7281-7226-2/20/$31.00 ©2020 IEEE
Authorized licensed use limited to: Central Michigan University. Downloaded on May 14,2021 at 05:04:48 UTC from IEEE Xplore. Restrictions apply.
more sociable, friendly, and hospitable. Extraversion is characterized by a tendency to be active, confident, and dominant, and to show more positive emotions. In contrast to Extraversion, Introversion (I) tends to be quiet, aloof, shy, and calm [12].
Agreeableness (A) is the tendency to go along with others. Individuals high in Agreeableness are trusting and cooperative, whereas individuals low in Agreeableness are cold, antagonistic, and slow to comply or trust. Agreeableness has a consistent relationship with social support [12]. Conscientiousness (C) means being careful; such individuals are usually responsible, persistent, organized, and reliable, and can be described as ambitious, disciplined, and focused on achievement. A high Conscientiousness score indicates a hard-working, sensitive, punctual, and persevering person. Conversely, individuals with low scores tend to be lazy, careless, and disorganized, give up easily, and lack a sense of direction [13].

TABLE I. THE BFI ASSESSMENT
Traits              Statement Number
Extraversion        1, 6R, 11, 16, 21R, 26, 31R, 36
Agreeableness       2R, 7, 12R, 17, 22, 27R, 32, 37R, 42
Conscientiousness   3, 8R, 13, 18R, 23R, 28, 33, 38, 43R
Neuroticism         4, 9R, 14, 19, 24R, 29, 34R, 39
Openness            5, 10, 15, 20, 25, 30, 35R, 40, 41R, 44

TABLE II. RESEARCH REVIEW OF BIG FIVE [6]
Field of Study      Big Five Traits
Arts/Humanities     (+) Neuroticism, Openness, and Agreeableness; (-) Conscientiousness
Psychology          (+) Neuroticism, Openness, and Agreeableness
Political Science   (+) Openness and Extraversion
Medicine            (+) Openness, Extraversion and Agreeableness
Economics           (+) Extraversion
Law                 (+) Extraversion
Sciences            (-) Agreeableness
Neuroticism (N) relates to emotional stability. Individuals with high scores tend to be anxious, emotional, temperamental, fragile, and self-pitying. Conversely, individuals with low scores tend to be content, gentle, and calm. Openness (O) concerns openness to new things, such as seeking different and diverse experiences. Individuals with high scores tend to be creative, imaginative, curious, and broad-minded. These five traits are better known by the acronym OCEAN [13].

B. Big Five Inventory
The Big Five Inventory (BFI) is one of the Big Five Personality instruments, created to meet the need for a short instrument that measures the prototypical components of the Big Five. The BFI consists of 44 items developed to capture personality ratings. Its aim is to provide a short inventory that allows flexible and efficient assessment. One or two characteristic adjectives serve as the core of each item, supplemented by clarifying or contextual information. For example, one BFI item for the Openness trait is “whether original, emerged with new ideas”, and one for the Conscientiousness trait is “Persevering until the task was completed” [3].
Each statement is scored from 1 to 5. Several statements express the opposite of a trait and are reverse-scored, so that strong agreement scores 1, and so on. To obtain the BFI result, the item scores are averaged per trait. The trait with the highest average is the individual's dominant trait. The statement numbers for each trait are listed in Table I, where a number marked with 'R' is reverse-scored [3].

C. Field Study
A field of study is a body of knowledge that is taught and researched and that is recognized by academic journals. In [6], several fields of study show results that are consistent with personality classifications. The traits predicted for each field of study are shown in Table II [6].

D. TF-IDF
TF-IDF (Term Frequency-Inverse Document Frequency) is the product of TF (the frequency of a term in a document) and IDF [14]. TF measures how often a term occurs in a document. Each document has a different length, and a term is likely to appear more often in a longer document.

$$\mathrm{tf}_{t,d} = f_{t,d} \tag{1}$$

where $f_{t,d}$ is the frequency of term $t$ in document $d$.
IDF measures how important a term is. When TF is calculated, all terms are considered equally important. However, certain terms such as “are” and “from” may appear many times yet carry little importance, so the weight is adjusted using IDF [14]. The weight $W$ of each term $t$ is obtained from the TF-IDF product in Equation 3.

$$\mathrm{idf}_{t} = \log\frac{N}{\mathrm{df}_{t}} \tag{2}$$

$$W_{t,d} = \mathrm{tf}_{t,d} \times \mathrm{idf}_{t} \tag{3}$$

where $N$ is the number of documents and $\mathrm{df}_{t}$ is the number of documents that contain term $t$.

E. Support Vector Machine
The basic concept of SVM classification is to construct an optimal hyperplane in a higher-dimensional feature space, onto which the data are mapped, while minimizing risk [15]. The hyperplane is built from support vectors, the data points closest to the hyperplane. These points lie on the boundary between the classes: those of the first class are called positive support vectors (+) and those of the second class negative support vectors (-). The distance between the support vectors is called the margin, and a maximum margin indicates a good hyperplane. The equation of the hyperplane is as follows:

$$f(x) = w^{T}x + b \tag{4}$$

where $w$ is an n-dimensional vector and $b$ is the bias term.
Lines are drawn through the data closest to the hyperplane, the support vectors (Figure 1), where

$$f(x_i) \geq 1; \quad \text{class 1} \tag{5}$$

$$f(x_i) \leq -1; \quad \text{class 2} \tag{6}$$

If the data cannot be separated linearly, [16] states that the input data can be mapped into a higher-dimensional dot-product space known as the feature space. The data remain nonlinear in the input space, but they can be transformed into the feature space, where the hyperplane is then sought.

Fig. 1. SVM Illustration [16]
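The decision rule in Equations 4 to 6 can be illustrated with a short Python sketch. This is only a minimal illustration, not the implementation used in the study: the function names and the example values of w and b are hypothetical, and a point that satisfies neither condition is simply reported as lying inside the margin.

```python
from typing import Sequence


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Evaluate f(x) = w . x + b from Equation 4."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same number of dimensions")
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def margin_region(w: Sequence[float], b: float, x: Sequence[float]) -> str:
    """Apply the conditions of Equations 5 and 6 to one data point."""
    value = decision_value(w, b, x)
    if value >= 1:
        return "class 1"
    if value <= -1:
        return "class 2"
    return "inside margin"  # strictly between the two margin boundaries


# Hypothetical hyperplane parameters, chosen only for illustration.
w_example = [0.5, -1.0]
b_example = 0.25

for point in ([4.0, 0.0], [0.0, 2.0], [1.0, 0.5]):
    print(point, margin_region(w_example, b_example, point))
```

With these assumed parameters the three points evaluate to 2.25, -1.75, and 0.25, so they fall in class 1, in class 2, and inside the margin respectively.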

Fig. 2. Illustration of mapping input data to feature space [16]

Because it is difficult to find the right transformation function $\phi(x)$ for a given dataset, a kernel function can be used instead. The kernel must satisfy Mercer's theorem, which requires the resulting kernel matrix to be positive semi-definite. Commonly used kernel functions are Linear, Polynomial, Sigmoid, and Radial Basis Function (RBF) [16].

F. Confusion Matrix
Testing in this study uses a confusion matrix. Accuracy does not distinguish between correct labels of different classes, so correctly classified negative cases are also included in its calculation [17]. Accuracy is calculated with Equation 7. Precision is the proportion of predicted positive cases that are true positives, and is calculated with Equation 8. Recall is the proportion of real positive cases that are correctly predicted positive [17], and is calculated with Equation 9.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{8}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{9}$$

where $TP$, $TN$, $FP$, and $FN$ are the numbers of true positives, true negatives, false positives, and false negatives.

III. METHODOLOGY
The research methodology consists of data collection, preprocessing, word weighting, classification, and evaluation. Figure 3 shows the research method.

Fig. 3. Research Methodology (flow: Data, Preprocessing, Word Weighting, Classification, Evaluation)

A. Data Collection
The research uses two types of data, namely BFI test data and Twitter crawling data. The data were taken from UNS students of the FISIP, FSRD, FKIP, FEB, FH, FK, and FMIPA faculties. The data consist of the students' personal data, their Twitter account ids, and their completed BFI test questionnaires. The students are then grouped by field of study into 7 groups, namely art or humanities, economics, law, medicine, political science, psychology, and science. The BFI test data, previously validated by an expert lecturer from UNS Psychology, are used to determine the Big Five traits of each student. BFI test results take the form of scores in the range 1 to 5. Each student is labeled according to the highest score, which marks that trait as a tendency (high); the other traits are considered weak (low).
The Twitter data are crawled by student account id using the R programming language. The resulting tweets are divided into 4 classification scenarios: scenario 1 uses 30 tweets per account, scenario 2 uses 100 tweets, scenario 3 uses 300 tweets, and scenario 4 uses all of the tweets. These scenarios are used because the number of tweets differs between Twitter accounts.

B. Preprocessing
In the text preprocessing stage, the tweets are processed in several steps. The first step changes all uppercase letters to lowercase (case folding). Next, Twitter features that are not used in the classification are deleted, namely mentions, hashtags, and URL links. Unused characters such as numbers, punctuation, and special characters are then deleted, followed by words that carry no meaning, such as conjunctions, and words that are not Indonesian. After that, slang words are normalized to their standard form. Finally, the stemming step reduces affixed words to their base words.

C. Word Weighting
The next stage weights the terms or words that will be classified. Word weighting uses TF-IDF, which measures the frequency of a term (Equation 1) and how important a term is (Equation 2). The product of the term frequency and the importance of the term is the word weight (Equation 3).

D. Classification
The classification stage uses two methods, namely SVM and MNB. In the first case the data are classified using SVM after weighting with TF-IDF, and in the second case the data are classified using MNB after weighting with TF-IDF. The SVM method is used to classify students into 5 trait classes, namely Agreeableness, Openness, Neuroticism, Conscientiousness, and Extraversion, where the trait with the highest score is labeled high and the others are considered low. The stages of classification are shown in Figure 4.

Fig. 4. Classification Stage
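The BFI scoring and labeling rule described in this paper (reverse-score the items marked 'R' in Table I, average per trait, label the highest trait as high) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, reverse scoring is taken to be 6 minus the raw answer on the 1 to 5 scale, and ties between traits are broken by taking the first trait in table order.

```python
from statistics import mean

# Scoring key transcribed from Table I; a trailing "R" marks a reverse-scored item.
BFI_KEY = {
    "Extraversion": "1 6R 11 16 21R 26 31R 36",
    "Agreeableness": "2R 7 12R 17 22 27R 32 37R 42",
    "Conscientiousness": "3 8R 13 18R 23R 28 33 38 43R",
    "Neuroticism": "4 9R 14 19 24R 29 34R 39",
    "Openness": "5 10 15 20 25 30 35R 40 41R 44",
}


def trait_scores(answers):
    """Average the 1-5 answers per trait; `answers` maps item number -> score."""
    scores = {}
    for trait, key in BFI_KEY.items():
        values = []
        for token in key.split():
            raw = answers[int(token.rstrip("R"))]
            # Assumed reverse scoring: 5 becomes 1, 4 becomes 2, and so on.
            values.append(6 - raw if token.endswith("R") else raw)
        scores[trait] = mean(values)
    return scores


def label_traits(scores):
    """Mark the highest-scoring trait as 'high' and every other trait as 'low'."""
    dominant = max(scores, key=scores.get)  # first trait wins on a tie
    return {trait: "high" if trait == dominant else "low" for trait in scores}


# Scores of account A as listed in Table III.
account_a = {"Extraversion": 3.9, "Agreeableness": 3.2, "Conscientiousness": 2.8,
             "Neuroticism": 4.0, "Openness": 3.4}
print(label_traits(account_a))
```

For account A the highest score is 4.0 for Neuroticism, so that trait is labeled high and the remaining four are labeled low.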

E. Evaluation
After the data have been processed and classified, the final stage evaluates the classification results. The evaluation in this study measures accuracy (Equation 7), precision (Equation 8), and recall (Equation 9). In the evaluation stage, the classification results of the system are compared with the questionnaire results that have been labeled by humans.

IV. RESULT AND DISCUSSION
The results and analysis of this study are as follows.

A. Data Collection
Data were collected from 91 students and their Twitter accounts. The number of students in each field of study is 8 art/humanities students, 10 economics students, 6 law students, 26 medicine students, 10 political science students, 13 psychology students, and 18 science students. The results of the BFI questionnaire were validated by UNS Psychology lecturers. Examples of BFI test scores are shown in Table III, where the highest value among each account's traits marks the dominant trait used as the classification label.

TABLE III. EXAMPLE OF BFI SCORE RESULT
Field of Study     ID   E    A    C    N    O
Arts/Humanities    A    3.9  3.2  2.8  4.0  3.4
Economics          J    3.3  3.9  3.1  4.1  3.4
Law                T    2.4  3.9  3.4  2.8  3.2
Medicine           Y    3.8  4.2  3.6  2.1  4.0
Psychology         Y1   2.6  3.7  3.0  3.9  3.2
Sciences           Y2   3.0  3.7  3.2  3.0  3.6
E = Extraversion, A = Agreeableness, C = Conscientiousness, N = Neuroticism, O = Openness

A total of 49919 tweets were obtained from the 91 accounts. Table IV shows the distribution of students over the Big Five traits for each tweet-count scenario.

TABLE IV. DISTRIBUTION OF STUDENTS TO BIG FIVE TRAITS
Traits              Distribution of Students
                    Scen 1  Scen 2  Scen 3  Scen 4
Extraversion        7       5       3       9
Agreeableness       29      21      17      29
Conscientiousness   3       2       1       2
Neuroticism         14      8       5       17
Openness            23      21      14      34
Total               76      57      40      91

B. Preprocessing
The text preprocessing stage consists of seven steps. The sample data used is “@EventKotaSolo Ngerasa punya jiwa musik? yuk cus ikutan acara ini :) more info https://t.co/t4FnAgdr0E #pingfest2015 https://t.cp/SCELytdmbo”. Table V shows the text preprocessing of this one tweet.

TABLE V. THE EXAMPLE OF TEXT PREPROCESSING
Text Preprocessing Stage               The results of Text Preprocessing
Case Folding                           @eventkotasolo ngerasa punya jiwa musik? yuk cus ikutan acara ini :) more info https://t.co/t4FnAgdr0E #pingfest2015 https://t.cp/SCELytdmbo
Remove mention, hashtags and link      ngerasa punya jiwa musik? yuk cus ikutan acara ini :) more info
Remove number and special character    ngerasa punya jiwa musik yuk cus ikutan acara ini more info
Normalization                          ngerasa punya jiwa musik yuk ikutan acara ini more info
Remove Foreign Word                    ngerasa punya jiwa musik yuk ikutan acara ini
Eliminate stopword                     ngerasa jiwa musik yuk ikutan acara
Stemming                               rasa jiwa musik yuk ikut acara

C. Word Weighting
Weighting uses the TF-IDF method to measure the frequency and number of terms that appear in tweets. Examples of words weighted with the TF-IDF calculation, namely “mundur”, “Tuhan”, “sakit”, and “kepala”, are shown in Table VI.

TABLE VI. EXAMPLE OF TF-IDF
Term     TF            DF   IDF                TF-IDF
         T1  T2  T3                            T1    T2    T3
Mundur   1   1   0     2    Log(4/2) = 0.30    0.30  0.30  0
Tuhan    2   0   0     2    Log(4/2) = 0.30    0.60  0     0
Sakit    0   1   1     2    Log(4/2) = 0.30    0     0.30  0.30
Kepala   0   0   2     2    Log(4/2) = 0.30    0     0     0.60

D. Classification
SVM classification is performed on each Big Five trait, with the data divided using 10-fold cross-validation. Each trait is classified under 4 dataset scenarios: scenario 1 uses 30 tweets, scenario 2 uses 100 tweets, scenario 3 uses 300 tweets, and scenario 4 uses all tweets. An example of a classification calculation with SVM, using the distribution of vector data for the Extraversion trait, is shown in Figure 5.

Fig. 5. Examples of Data “Extraversion” Distribution

In Figure 5 the red line is the hyperplane; above the line is the extraversion class, and below the line is the non-extraversion class. Points e and f are the points closest to the boundary, and they are therefore taken as support vectors.
The hyperplane is determined by a weight vector $\vec{w}$, which can be obtained from the difference between points f and e:

$$\vec{w} = (3,3) - (2,1) = (a, 2a) \tag{10}$$

Then Equations (5) and (6) give:

$$a + 2a + b = -1; \quad \text{with point } e \tag{11}$$

$$3a + 6a + b = 1; \quad \text{with point } f \tag{12}$$

The value of $b$ is obtained from Equation 12,

$$b = 1 - 9a \tag{13}$$

The value of $a$ is obtained from Equations (11) and (13),

$$a + 2a + 1 - 9a = -1$$
$$6a = 2$$
$$a = 1/3$$

So the value of $b$ from Equation (13) is

$$b = 1 - 9 \cdot \tfrac{1}{3} = -2$$

The function $f(x)$ obtained from Equations (4) and (10) is

$$f(x) = \tfrac{1}{3}x_1 + \tfrac{2}{3}x_2 - 2 \tag{14}$$

Suppose the data point labeled b in Figure 5 is entered. The value of $f(x)$ obtained is 3, which meets the requirement of Equation (5), so data point b belongs to the extraversion class.
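The arithmetic of this worked example can be checked with a few lines of Python. This is a sketch for verification only, using exact fractions: the variable names are hypothetical, the coordinates (1, 1) for point e are an assumption implied by the coefficients a + 2a on the left-hand side of Equation 11, and (5, 5) is a hypothetical point chosen because the coordinates of data point b are not given in the text.

```python
from fractions import Fraction

# Margin conditions from Equations 11 and 12, with w = (a, 2a):
#   3a + b = -1   (a + 2a + b, support vector e)
#   9a + b =  1   (3a + 6a + b, support vector f)
coef_e, coef_f = Fraction(3), Fraction(9)

# Subtracting the first condition from the second eliminates b.
a = Fraction(1 - (-1)) / (coef_f - coef_e)
b = 1 - coef_f * a  # Equation 13
w = (a, 2 * a)


def f(x1, x2):
    """Decision function of Equation 14 built from the solved a and b."""
    return w[0] * x1 + w[1] * x2 + b


print("a =", a, "b =", b)
print("f at f=(3, 3):", f(3, 3))   # should sit on the +1 margin
print("f at e=(1, 1):", f(1, 1))   # should sit on the -1 margin (assumed coordinates)
print("f at (5, 5):", f(5, 5))     # hypothetical point with f(x) = 3
```

The solved values are a = 1/3 and b = -2, which reproduce Equation 14, and the two margin conditions hold at (3, 3) and (1, 1). Note that with the coordinates (2, 1) written for e in Equation 10, the same function gives -2/3 rather than -1, so the check relies on the assumed coordinates stated above.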
E. Evaluation and Analysis
The test uses several scenarios for the amount of data, namely 30 tweets (scenario 1), 100 tweets (scenario 2), 300 tweets (scenario 3), and all tweets (scenario 4). Cross-validation with 10 iterations is then used to divide the data of each scenario into training data and testing data.
The highest average accuracy of the SVM method is 80.5% in scenario 3 (Figure 8), followed by 79.6% in scenario 2, 79.5% in scenario 4, and 77.7% in scenario 1. This suggests that data with a large and balanced number of tweets give better accuracy. Figure 6 shows that the Conscientiousness trait has higher accuracy than the other traits in scenarios 1, 2, 3, and 4, while the Agreeableness trait has the lowest accuracy in scenarios 1, 2, and 3. In scenario 4, the Openness trait has the lowest accuracy of all traits. This is due to the imbalanced distribution of the data, where the Conscientiousness trait has the fewest students and the Agreeableness trait has the most students in scenarios 1, 2, and 3 (Table IV).

Fig. 6. SVM Accuracy Results

Experiments were also carried out using the MNB method. The MNB results in Figure 8 show the same pattern as SVM, with scenario 3 having the highest average accuracy of 82%. The second highest is scenario 4 with 79.3%, followed by scenario 2 with 78.2% and scenario 1 with 73.5%. Figure 7 also shows that the Conscientiousness trait has the highest accuracy compared with the other traits in scenarios 1, 2, 3, and 4.

Fig. 7. MNB Accuracy Results

A comparison of SVM and MNB accuracy shows that the two methods follow almost the same pattern, with the highest average accuracy in scenario 3 and the lowest in scenario 1 (Figure 8). In scenarios 1 and 2 the average SVM accuracy is higher, at 77.7% and 79.6%, than that of MNB, at 73.5% and 78.2%, while in scenario 3 the average MNB accuracy of 82% is higher than the SVM accuracy of 80.5%. This suggests that SVM works better than MNB on smaller amounts of data and can work well with less information.

Fig. 8. Comparison of SVM and MNB Accuracy

For Precision, scenario 3 has the highest average value, 72% for both SVM and MNB. In scenarios 2 and 4, MNB is better than SVM (Figure 9). For Recall, SVM and MNB have the same values in scenarios 1 and 2, namely 78% and 80% (Figure 10). The highest recall in scenario 3 is obtained by SVM with 81%, and in scenario 4 SVM also has the better value with 79%.

Fig. 9. Comparison of SVM and MNB Precision

Fig. 10. Comparison of SVM and MNB Recall

Compared with Anna Vedel's research [6], the results of this study show almost the same distribution of traits for each field of study. The field of economics differs from the results of the Vedel review because the data obtained for this field are limited to a small number of accounts, none of which has the Extraversion trait. Table VII shows the comparison

of the traits of each field of study between the results of Anna Vedel and the questionnaire.

TABLE VII. COMPARISON TRAITS OF EACH FIELD OF STUDY
Field of Study      Results of Anna Vedel [6]                 Results of Questionnaire                                    Similarity
Art/Humanities      Neuroticism, Openness, Agreeableness      Neuroticism, Openness, Agreeableness, Extraversion          Neuroticism, Openness, Agreeableness
Economics           Extraversion                              Agreeableness, Openness, Neuroticism                        -
Law                 Extraversion                              Agreeableness, Extraversion                                 Extraversion
Medicine            Openness, Extraversion, Agreeableness     Agreeableness, Openness                                     Agreeableness, Openness
Political Science   Openness, Extraversion                    Openness, Neuroticism, Agreeableness, Conscientiousness     Openness
Psychology          Neuroticism, Openness, Agreeableness      Agreeableness, Openness, Extraversion, Neuroticism          Agreeableness, Openness, Neuroticism
Sciences            Agreeableness                             Agreeableness, Openness                                     Openness

The behavioral tendencies of the Big Five Personality differ significantly by field of study, and most of them are in accordance with the reference in [6]; only the field of economics has nothing in common. In the field of art/humanities, the behavioral tendencies are innovative and curious, but also emotional and easily anxious when faced with a problem (Openness, Neuroticism, Agreeableness). Law students have behavioral tendencies that are energetic, ambitious, communicative, and dominant in groups, and they tend to be easily bored (Extraversion). Medicine students tend to be helpful, cooperative, forgiving, and merciful, as well as innovative and highly curious (Agreeableness, Openness). Political science students have behavioral tendencies that are easily focused and highly tolerant, with creative and imaginative thinking (Openness). Psychology students have behavioral tendencies that are almost the same as those of medicine students: they like to help and are cooperative, highly tolerant, and innovative (Agreeableness, Openness, Neuroticism). Science students tend to behave in a friendly way, avoid conflict, and like to go along with others (Agreeableness).

V. CONCLUSION AND FUTURE WORK
Classification using the SVM method works well for tweet analysis of the behavioral tendencies of the Big Five Personality. The highest accuracy reached 80.5% with SVM and 82% with MNB, using scenario 3 with 300 tweets. Scenario 3 also has the highest Precision and Recall values, with a Precision of 72% for both SVM and MNB and the highest Recall of 81% with SVM. The behavioral tendencies of the Big Five Personality differ significantly by field of study, and most of them are in accordance with the previous research used as a reference.
Future research can relax the limits on the dataset, for example by adding case studies of university students in Surakarta or across Indonesia and by extending the duration of data collection. Further research can also compare personality parameters other than the Big Five Personality, such as Dark Trait Personality, use personality labeling directed by expert psychologists, and use other features to determine a person's personality.

REFERENCES
[1] S. Notoatmodjo, “Pendidikan Dan Perilaku Kesehatan”, Jakarta: Rineka Cipta, 2003.
[2] S. Hakimi, E. Hejazi, M. G. Lavasani, “The Relationships Between Personality Traits and Students’ Academic Achievement”, Journal of Procedia - Social and Behavioral Sciences, Volume 29, pp. 836-845, 2011.
[3] O. P. John and S. Srivastava, “The Big Five Trait Taxonomy: History, measurement, and theoretical perspectives”, Handbook of Personality, Guilford Press, Vol. 2, pp. 102-138, 1999.
[4] K. M. Carley, M. M. Malik, M. Kowalchuck, J. Pfeffer, and P. Landwehr, “Twitter usage in Indonesia,” 2015.
[5] J. Phua, S.V. Jin, and J.J. Kim, “Uses and gratifications of social networking sites for bridging and bonding social capital: A comparison of Facebook, Twitter, Instagram, and Snapchat”, Computers in Human Behavior, no. 72, pp. 115-122, 2017.
[6] A. Vedel, “Big Five personality group differences across academic majors: A systematic review”, Personality and Individual Differences Journal/ELSEVIER, 92, pp. 1-10, 2016.
[7] V. Ong, A.D.S. Rahmanto, Williem, D. Suhartono, et al., “Personality Prediction Based on Twitter Information in Bahasa Indonesia”, in Proceedings of the Federated Conference on Computer Science and Information Systems, Vol. 11, pp. 367-372, 2017.
[8] A.T. Damanik and M.L. Khodra, “Prediksi Kepribadian Big 5 Pengguna Twitter dengan Support Vector Regression”, Jurnal Cybermatika, vol. 3, issue 1, pp. 14-22, 2015.
[9] J. Golbeck, C. Robles, M. Edmondson, and K. Turner, “Predicting personality from twitter,” in Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on, pp. 149-156, 2011.
[10] B.Y. Pratama and R. Sarno, “Personality classification based on Twitter text using Naive Bayes, KNN and SVM”, Proceedings of 2015 International Conference on Data and Software Engineering, ICODSE 2015, pp. 170-174, 2016.
[11] D. A. Kurniawan, S. Wibirama and N. A. Setiawan, “Real-time traffic classification with Twitter data mining”, 2016 8th International Conference on Information Technology and Electrical Engineering (ICITEE), Yogyakarta, 2016.
[12] A.B. Bakker, K.I. Van Der Zee, K.A. Lewig, and M.F. Dollard, “The Relationship Between the Big Five Personality Factors and Burnout: A Study Among Volunteer Counselors”, The Journal of Social Psychology, Vol. 135, issue 5.
[13] T.A. Judge, C.A. Higgins, C.J. Thoresen, and M.R. Barrick, The big five personality traits, general, Journal of Personnel Psychology, volume 52, 1999.
[14] J. Han and M. Kamber, “Data Mining Concepts and Techniques”, San Francisco: Elsevier, 2006.
[15] I. Kumar, J. Virmani, H.S. Bhadauria, and M.K. Panda, “Classification of Breast Density Patterns Using PNN, NFC, and SVM Classifiers”, Soft Computing Based Medical Image Analysis, Elsevier Inc, 2018.
[16] R. Gholami and N. Fakhari, “Support Vector Machine: Principles, Parameters, and Applications”, in Handbook of Neural Computation, pp. 515-535, 2017.
[17] D.M. Powers, “Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation”, Journal of Machine Learning Technologies, volume 2, issue 1, pp. 37-63, 2011.
