
Modeling human performance in complex search and retrieve environment using
supervised and unsupervised machine learning techniques

Shashank Uttrani [0000-0003-2601-2125], Sakshi Sharma [0000-0002-0229-0289],
Mahavir Dabas [0000-0002-5171-5546], Bhavik Kanekar [0000-0002-7964-1083],
and Varun Dutt [0000-0002-2151-8314]

Applied Cognitive Science Lab, Indian Institute of Technology Mandi, HP, India – 175075
shashankuttrani@gmail.com

Abstract. Prior research has evaluated human performance in complex
search-and-retrieve environments. However, little is known about how supervised and
unsupervised machine learning techniques could account for human performance
in such complex scenarios. The primary objective of this research is to develop
supervised and unsupervised machine learning models to account for human
actions in a complex search-and-retrieve environment. A total of 50 participants
were recruited to perform as human agents in a
simulation developed using Unity 3D. The environment consisted of four
buildings with targets (having positive reward) and distractors (having negative
reward) present inside those buildings. Participants were tasked to explore the
environment and maximize their score by collecting as many target items as possible while
avoiding the distractor items present in the environment. The experiment consisted
of training and test phases, which differed in the availability of feedback, the items
present in the environment, and the duration. Next, machine learning models
were developed to account for human actions using supervised and unsupervised
techniques such as Decision Tree, Random Forest, Support Vector Classifier
(SVC), Multilayer Perceptron (MLP), and K-nearest neighbor (KNN). These
models were trained using data collected in the training phase of the experiment,
and their performance was evaluated on data collected in the test phase. Results
revealed that KNN, an unsupervised learning model, performed better in
predicting human actions on the test dataset than supervised learning
models such as decision tree, random forest, MLP, and SVC. We highlight the
main implications of our findings for the human factors research community.

Keywords: Search-and-Retrieve task, Multilayer Perceptron, Support Vector
Classifier, K-nearest neighbor, Human Performance.

1 Introduction

Prediction of human actions in complex, cognitively demanding tasks has been a
topic of great interest in the neuroscience, psychology, and machine learning
communities [1]. In simple terms, the problem of predicting human actions can be viewed
as an n-class classification problem [2]. Numerous classification algorithms have been
developed and enhanced to address n-class classification, such as
decision tree [3], random forest [4], support vector classifier (SVC) [5], multilayer
perceptron (MLP) [6], and K-nearest neighbor (KNN) [7]. These machine learning algorithms
are either supervised or unsupervised in nature.
Decision trees have been employed to recognize human actions using smartphone
sensor data [8]. The authors showed that human behavior and action data can be captured
using the accelerometer present in smartphones [8]. These data can be used to recognize
the orientation and actions of humans, such as standing, sitting, sleeping, walking, and
running. Fan et al. [8] used the decision tree algorithm to recognize and predict human
actions with high accuracy from the accelerometer data.
Similarly, support vector machines have been used to recognize human actions in
videos using spatio-temporal feature descriptors [9]. The authors used surveillance and
sports videos to capture and label human actions and trained a support vector
classifier to recognize human actions from those videos [9]. This technique provided a
faster method to recognize human actions than decision trees [9].
Various state-of-the-art reinforcement learning and cognitive models have also been
developed to account for human actions in complex scenarios [10]. Recent
developments have shown that deep reinforcement learning algorithms such as the Deep
Q-learning network (DQN) [11] have been successful in defeating humans in video
games like Atari [12]. Also, sophisticated models like Soft Actor-Critic [13] have been
used to play highly complex and strategic games like DOTA [14] and Call of Duty [15].
Research has also demonstrated the use of cognitive models developed using cognitive
architectures like ACT-R [16] in modeling human actions in aviation, such as
air-refueling and aircraft maneuvering [17].
Such developments in the field of computational modeling have led to increased
interest in modeling human actions in complex search-and-retrieve environments
such as an ongoing military operation [10]. Researchers have evaluated human
performance in simple and complex cognitive tasks; however, little is known about how
machine learning algorithms would account for human actions in complex cognitive
tasks.
Although prior research has evaluated human performance in various cognitively
demanding tasks and developed machine learning models to predict human agents' actions
[10], little is known about how these machine learning models would account for
human actions in complex search-and-retrieve simulated environments. The primary
objective of this research is to develop computational models to predict human actions
in a complex search-and-retrieve environment using supervised and unsupervised
machine learning techniques such as decision tree, random forest, SVC, MLP, and
KNN.
In what follows, we detail the experiment design, participant demographics, and
models used to account for human actions in the search-and-retrieve environment.
Next, we present the model results and discuss the implications of our findings.
Finally, we close the paper by describing the limitations of our study
and the future scope of this research.

2 Methodology

2.1 Participants

Fifty participants were recruited from the Indian Institute of Technology Mandi to
perform as human agents in the search-and-retrieve simulation experiment.
Participants' ages ranged between 18 and 31 years, with a mean of 25.5 years
and a standard deviation of 3.4 years. Out of the fifty recruited
participants, 70% were male and the rest were female. More than 90% of the
participants were pursuing postgraduate degrees, while the rest were pursuing
bachelor's degrees. About 90% of the participants had a major in STEM-related subjects,
while the rest belonged to Arts and Humanities. Upon successful completion of the
study, participants were thanked and remunerated with a base payment of INR 40 (USD
0.22) for their participation in the study. The top three scorers of the experiment were
provided a bonus of INR 20 each.

2.2 Experiment Design

The search-and-retrieve simulation environment was developed using Unity 3D [18], a
professional game development engine, to collect human data. The simulation consisted
of four under-construction buildings with target and distractor items available in the
environment. The objective of the simulation game was to maximize the score by
collecting as many target items as possible while avoiding distractor
items. Upon collecting each target item, the human agent received a positive reward of +5
points, whereas upon collecting each distractor item, a negative reward of -5 points was
awarded. The experiment was divided into training and test phases, which differed in the
availability of feedback, the number of items present in the environment (target or
distractor), and the duration of the phase. In the training phase, feedback was present
in the form of the score, whereas in the test phase, feedback was absent. Moreover, the
phases differed in the number of target items (14 in training and 28 in test) and distractor
items (7 in training and 14 in test). Furthermore, the training phase was 15 minutes long,
whereas the test phase was 10 minutes long.
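The reward rule above reduces to simple arithmetic. The following is a minimal sketch of it; the function name `phase_score` and the constant names are hypothetical and do not come from the study's Unity implementation.

```python
TARGET_REWARD = 5       # points per target item (from the text)
DISTRACTOR_REWARD = -5  # points per distractor item (from the text)


def phase_score(targets_collected: int, distractors_collected: int) -> int:
    """Return the score for one phase given the number of items collected."""
    return (targets_collected * TARGET_REWARD
            + distractors_collected * DISTRACTOR_REWARD)


# Upper bounds on the score: every target collected, no distractor touched.
max_training_score = phase_score(14, 0)  # 14 targets in the training phase
max_test_score = phase_score(28, 0)      # 28 targets in the test phase
```

Under this rule, the highest attainable score is 70 points in the training phase and 140 points in the test phase.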

2.3 Procedure
The search-and-retrieve environment was developed to collect human data in a
cognitively challenging and complex search-and-retrieve task. Human participants
were then recruited to perform as human agents in the search-and-retrieve
environment. The data collected during the training phase, along with the recorded
gameplay video, were used to train the machine learning models, and the data collected in the
test phase were used to evaluate the performance of the trained models.

2.4 Dataset

Upon completion of data collection, a dataset of the gameplay of all participants was
prepared, consisting of timestamps, actions, coordinates, and 1000 principal component
values of each screen grab. The recorded gameplay videos of each participant were divided
into frames corresponding to the timestamp of each action. Next, each captured frame
was converted into its vectorized form, and its dimensionality was reduced by keeping the 1000
major components obtained with Principal Component Analysis (PCA) [19]. Thus,
the final training and test datasets had 1005 features, namely, timestamp, x-coordinate,
y-coordinate, z-coordinate, and 1000 PCA values against each timestamp. The training
dataset corresponds to data collected during the training session of the gameplay for all
the participants. Similarly, the test dataset corresponds to data collected during the test
session of the gameplay for all the participants.
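The frame-reduction step can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the frames are random toy data, the sizes are far smaller than in the study, PCA is computed with NumPy's SVD, and the helper name `pca_reduce` is hypothetical.

```python
import numpy as np


def pca_reduce(frames: np.ndarray, n_components: int) -> np.ndarray:
    """Project flattened frames (one per row) onto their top principal components."""
    centered = frames - frames.mean(axis=0)  # PCA operates on mean-centered data
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
n_frames, height, width = 20, 8, 8                # toy sizes, not the study's
frames = rng.random((n_frames, height * width))   # each row: one vectorized frame
timestamps = np.arange(n_frames, dtype=float).reshape(-1, 1)
coords = rng.random((n_frames, 3))                # x, y, z position per timestamp

components = pca_reduce(frames, n_components=5)   # the paper keeps 1000 components
features = np.hstack([timestamps, coords, components])
print(features.shape)
```

Each row of `features` then holds the timestamp, the three coordinates, and the retained PCA values for one frame, mirroring the layout described above.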

2.5 Evaluation Metrics


The purpose of this research was to develop machine learning algorithms to predict
the actions (Left, Right, or Forward) of an agent in a simulated search-and-retrieve
environment using the human activity dataset recorded during the empirical study. Thus,
the research problem reduces to three-class classification using different machine
learning techniques. To evaluate the performance of each model, we trained
the models using the training dataset, i.e., the dataset recorded during the training phase
of the gameplay, and evaluated their performance using the test dataset, i.e., the dataset
recorded during the test phase of the gameplay. The accuracy of each model was
calculated using Eq. 1.

$$\text{Accuracy} = \frac{\text{Correctly Predicted Classes}}{\text{Total Number of Classes}} \times 100 \tag{1}$$
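Eq. 1 can be computed directly from paired lists of predicted and recorded actions. The sketch below assumes that the denominator of Eq. 1 is the number of predictions made; the function name and the sample action sequences are illustrative only.

```python
def accuracy(predicted, actual):
    """Percentage of predictions that match the recorded human actions (Eq. 1)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100 * correct / len(actual)


human = ["Left", "Forward", "Forward", "Right", "Forward"]  # made-up recording
model = ["Left", "Forward", "Right", "Right", "Left"]       # made-up predictions
print(accuracy(model, human))  # three of the five predictions match
```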

3 Models

3.1 Decision Tree

The decision tree algorithm is a supervised machine learning technique that builds a
hierarchical model using a tree structure [3]. The tree contains decision nodes, each with
an associated attribute (or feature). Each decision node can have two or
more branches, each associated with a value or a range of values of the attribute [3]. A
leaf node, in contrast, has no branches and contains the target value. The training data are split
into smaller subsets to maximize the homogeneity at each decision node [3]. The
execution of the algorithm starts from the root node, and the model traverses down the
tree based on the decision rules at each node [3]. The accuracy of the model may vary
with the following hyperparameters: the criterion of the split, the maximum depth of the tree,
and the minimum number of samples for a split at each decision node. The splitting criterion
can be either Gini impurity or entropy [20]. The maximum depth of the tree is
the number of edges from the root node to the deepest leaf [3].
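The two splitting criteria can be written out in a few lines. This is a minimal sketch using the standard definitions of Gini impurity and Shannon entropy; the helper names and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def gini_impurity(labels):
    """1 minus the sum of squared class proportions; 0 for a pure node."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits of the class distribution at a node."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def weighted_impurity(left, right, measure):
    """Impurity of a binary split, weighted by the size of each child."""
    n = len(left) + len(right)
    return len(left) / n * measure(left) + len(right) / n * measure(right)


node = ["Left", "Left", "Right", "Right"]          # evenly mixed parent node
left, right = ["Left", "Left"], ["Right", "Right"]  # a split into pure children
print(gini_impurity(node), entropy(node))
print(weighted_impurity(left, right, gini_impurity))
```

A split is preferred when the weighted impurity of the children is lower than the impurity of the parent, which is how homogeneity is maximized at each decision node.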

3.2 Random Forest

The random forest algorithm is also a supervised machine learning technique [4]. It
extends the decision tree algorithm by building an ensemble of
many individual decision trees [4]. The main idea behind using an ensemble of
different trees is that relatively uncorrelated decision trees combined
as an ensemble can outperform any individual tree [4]. Bootstrap aggregation
is employed to ensure low correlation between individual trees by making each tree
sample from the dataset with replacement [4]. Another major difference between a
decision tree and a random forest is that in a decision tree all features are considered
before making a decision node [4], whereas in a random forest each individual tree
picks only a random subset of the available features. A random forest shares the
hyperparameters of a decision tree and adds one more, i.e., the number of individual trees in the
forest.
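The three ingredients described above (sampling with replacement, a random feature subset, and combining the trees' outputs) can be sketched independently of any tree implementation. The helper names are hypothetical, the square-root subset size is a common convention rather than a value reported in the paper, and majority voting is assumed as the way the ensemble combines predictions.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (bootstrap aggregation)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, subset_size, rng):
    """Pick the feature indices that one tree is allowed to use."""
    return sorted(rng.sample(range(n_features), subset_size))


def majority_vote(predictions):
    """Combine the individual trees' predictions into one ensemble prediction."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
rows = list(range(10))                 # stand-ins for ten training rows
sample = bootstrap_sample(rows, rng)   # some rows repeat, others are left out
subset = random_feature_subset(1005, int(1005 ** 0.5), rng)  # assumed sqrt rule
vote = majority_vote(["Forward", "Left", "Forward"])
```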

3.3 Support Vector Machine

Support Vector Classifier (SVC) is also a supervised machine learning algorithm based
on support vector machines [5]. The objective of the SVC algorithm is to classify
the dataset by mapping the input to a higher-dimensional feature space and fitting a
hyperplane in that feature space such that the margin between the classes is
maximized [5]. The objective function penalizes the model for each misclassification
[5]. The accuracy of this model may vary with the following hyperparameters: kernel,
gamma, and degree of regularization [5]. The shape of the hyperplane is determined
by the kernel, and gamma is the kernel coefficient [5]. Overfitting is controlled
by the degree of regularization.
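For reference, the margin-maximizing objective with a misclassification penalty is commonly written in the soft-margin form below. This is the standard textbook formulation for two classes with labels $y_i \in \{-1, +1\}$, not an equation taken from the paper; the symbols $w$, $b$, $\xi_i$, $\phi$, $C$, and $\gamma$ are introduced here for illustration. The second line shows the radial basis function kernel, in which gamma appears as the kernel coefficient.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0

K(x, x') = \exp\bigl(-\gamma \lVert x - x' \rVert^{2}\bigr)
```

In this form, a larger $C$ penalizes margin violations more heavily and therefore corresponds to weaker regularization. Problems with more than two classes are typically handled by combining several such binary classifiers.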

3.4 Multilayer Perceptron

A multilayer perceptron (MLP) is another supervised machine learning technique, which is
inspired by the biological neural networks responsible for mammalian intelligence [6]. It is a
fully connected feed-forward neural network consisting of a minimum of three layers,
namely, the input layer, the hidden layer, and the output layer. Each layer contains a collection
of computational nodes (or perceptrons) that perform a simple weighted
aggregation operation [6]. Also, a non-linear activation function is applied at the output of
each neuron to produce the response and forward it to the next layer. All the weights are
initially assigned a random value between 0 and 1 and later updated using the
backpropagation algorithm [21] to minimize the prediction error. Thus, the number of
layers, the number of units in each layer, and the activation function are the
hyperparameters of the MLP to be calibrated [6].
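The forward pass described above can be sketched with NumPy. The layer sizes and activations follow the best configuration reported in Table 1; the zero-initialized biases, the order of the action labels, and the random inputs are assumptions for illustration, and no training (backpropagation) is performed here.

```python
import numpy as np

LAYER_SIZES = [1004, 10, 30, 10, 3]     # input, three hidden layers, output (Table 1)
ACTIONS = ["Left", "Right", "Forward"]  # assumed ordering of the output units


def init_weights(sizes, rng):
    """Random weights in [0, 1), as described in the text; biases start at zero."""
    return [(rng.random((n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(x, params):
    """Weighted sum plus activation at every layer: tanh hidden, ReLU output."""
    *hidden, output = params
    for w, b in hidden:
        x = np.tanh(x @ w + b)
    w, b = output
    return np.maximum(x @ w + b, 0.0)


rng = np.random.default_rng(0)
params = init_weights(LAYER_SIZES, rng)
batch = rng.random((4, LAYER_SIZES[0]))  # four synthetic feature vectors
scores = forward(batch, params)          # one score per action for each input
predicted = [ACTIONS[i] for i in scores.argmax(axis=1)]
```

Because the weights are untrained, the predicted actions carry no meaning; the sketch only shows how an input vector flows through the layers to a three-way output.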

3.5 K-Nearest Neighbor

K-nearest neighbor (KNN) is an unsupervised machine learning algorithm which works
by storing all the training data and making a prediction for each test data point from its
"k" nearest neighbors in the training data [7]. The nearest neighbors in
the training dataset are determined by their distance from the test data point [7]. Any one of
several distance metrics, such as Euclidean distance, Manhattan
distance, and Minkowski distance, can be used for this algorithm [7]. Thus, the distance
metric and the number of nearest neighbors are the two hyperparameters of this
algorithm [7].
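A minimal sketch of the prediction step is given below, using only the Python standard library. The two-dimensional toy points and the function names are illustrative; the sketch assumes that each stored training point carries its recorded action and that the prediction is the most common action among the k nearest points.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_points, train_labels, query, k, distance=dist):
    """Return the most common label among the k stored points nearest to query."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: distance(pair[0], query))
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["Left", "Left", "Left", "Right", "Right", "Right"]

near_origin = knn_predict(points, labels, (0.5, 0.5), k=3)
far_corner = knn_predict(points, labels, (5.5, 5.5), k=3, distance=manhattan)
```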

3.6 Model Evaluation

All the models, supervised and unsupervised, were trained using the training dataset,
and their performance was evaluated using the test dataset. During model training,
the hyperparameters associated with the respective models were calibrated using the
grid search algorithm [22]. Upon calibration using the training dataset, the calibrated
parameters were fixed, and the test data were used to evaluate the accuracy of each model
using Eq. 1.
For the decision tree algorithm, the Gini impurity and entropy criteria were used while
varying the maximum depth of the decision tree between 1 and 50. Also, the minimum
number of samples for a split at each decision node was varied between 2 and 10.
Similarly, for the random forest algorithm, both the Gini impurity and entropy criteria
were used, while the number of estimators ranged between 2 and 512 (in steps
of 2^i, where i = 1, 2, 3, ..., 9). Moreover, for calibrating the support vector classifier,
linear, poly, and rbf kernels were used with different values of gamma and degree of
regularization. Also, we calibrated the MLP for different numbers of hidden layers and
nodes in each layer, along with different activation functions such as sigmoid,
tanh, and ReLU. Furthermore, the KNN algorithm was calibrated using different numbers
of nearest neighbors, ranging from 1 to 100, and different distance metrics.
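The calibrate-then-evaluate procedure can be sketched as an exhaustive loop over a hyperparameter grid. The example below is a toy illustration for KNN only: the data points, the small grid, and the use of a separate validation split for scoring each combination are assumptions, since the paper states only that calibration used the training dataset. The last line reproduces the random forest estimator grid described above.

```python
from collections import Counter
from itertools import product
from math import dist


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k, distance):
    ranked = sorted(train, key=lambda row: distance(row[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def accuracy(train, held_out, k, distance):
    hits = sum(knn_predict(train, x, k, distance) == y for x, y in held_out)
    return 100 * hits / len(held_out)


def grid_search(train, validation, grid):
    """Try every combination in the grid and keep the most accurate one."""
    best_params, best_score = None, -1.0
    for k, (name, distance) in product(grid["k"], grid["distance"].items()):
        score = accuracy(train, validation, k, distance)
        if score > best_score:  # ties keep the earlier combination
            best_params, best_score = {"k": k, "distance": name}, score
    return best_params, best_score


# Toy (features, action) pairs; not data from the study.
train = [((0, 0), "Left"), ((0, 1), "Left"), ((1, 0), "Left"),
         ((5, 5), "Right"), ((5, 6), "Right"), ((6, 5), "Right")]
validation = [((0.5, 0.5), "Left"), ((5.5, 5.5), "Right")]
test = [((1, 1), "Left"), ((6, 6), "Right")]

grid = {"k": [1, 3, 5], "distance": {"euclidean": dist, "manhattan": manhattan}}
best_params, validation_score = grid_search(train, validation, grid)

# Calibrated parameters are fixed before the test data are touched.
distance_fn = grid["distance"][best_params["distance"]]
test_score = accuracy(train, test, best_params["k"], distance_fn)

forest_estimators = [2 ** i for i in range(1, 10)]  # 2, 4, 8, ..., 512
```

Libraries such as scikit-learn provide the same exhaustive search with cross-validation built in; the explicit loop is shown here only to make the procedure visible.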

4 Results

All the machine learning models (decision tree, random forest, SVC, MLP, and KNN)
were trained using the training dataset and calibrated over the hyperparameters
discussed above. Table 1 shows the performance (accuracy) of each model along with
its best calibrated hyperparameters. In the decision tree model, the entropy criterion
produced better results than Gini impurity, and the accuracy stagnated at a
maximum depth of 6. Similarly, for the random forest algorithm, the entropy criterion
produced higher accuracy with 256 estimators. The random forest model (61.28%) performed
better than the decision tree model (60.98%). However, the support vector classifier
(48.20%) was less accurate than both the decision tree and random forest models. Moreover,
the best results produced by the MLP model had an accuracy of 60.02% with 3 hidden
layers and the tanh activation function. The first, second, and third hidden layers had 10, 30, and
10 neuron units, respectively. The best performance was shown by the KNN algorithm
using the Euclidean distance metric and K = 49. The KNN algorithm produced an accuracy
of 61.43% in predicting the actions using the calibrated hyperparameters.

Table 1. Performance of different models and their optimal hyperparameters.

| Model | Optimal hyperparameters | Accuracy |
|---|---|---|
| Decision Tree | Criterion = Entropy, maximum depth = 6, minimum sample split = 6 | 60.98% |
| Random Forest | Criterion = Entropy, number of estimators = 256 | 61.28% |
| Support Vector Classifier | Kernel = linear, gamma = "auto", C = 2000 | 48.20% |
| Multilayer Perceptron | 5 layers = [1004, none], [10, tanh], [30, tanh], [10, tanh], [3, ReLU] | 60.02% |
| K-nearest neighbor | Distance = Euclidean, K = 49 | 61.43% |

Therefore, the ranking of the machine learning models by performance is KNN,
random forest, decision tree, MLP, and SVC.
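The ranking follows directly from sorting the accuracies reported in Table 1, as the short sketch below shows. The dictionary simply restates the table; the variable names are illustrative.

```python
# Test accuracies (%) as reported in Table 1.
accuracies = {
    "K-nearest neighbor": 61.43,
    "Random Forest": 61.28,
    "Decision Tree": 60.98,
    "Multilayer Perceptron": 60.02,
    "Support Vector Classifier": 48.20,
}

ranking = sorted(accuracies, key=accuracies.get, reverse=True)
top_gap = accuracies[ranking[0]] - accuracies[ranking[1]]  # percentage points

for position, model in enumerate(ranking, start=1):
    print(position, model, accuracies[model])
```

The difference between the two best models, KNN and random forest, is 0.15 percentage points.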

5 Discussion and Conclusion

Although prior research has evaluated human performance in complex
search-and-retrieve tasks using simulated environments, little was known about how
machine learning algorithms (supervised and unsupervised) would account for human
actions in such complex and cognitively demanding tasks. The primary objective of this
research was to develop machine learning models using the decision tree [3], random forest
[4], SVC [5], MLP [6], and KNN [7] algorithms to predict human actions in complex
search-and-retrieve simulated environments. The human data collected from fifty
participants during the training phase were used to train the different machine learning
algorithms and calibrate their hyperparameters using grid search. Subsequently, the data
collected during the test phase were used to evaluate the performance of the
machine learning models with their calibrated parameters.
Results show that KNN, an unsupervised machine learning algorithm, outperformed all
the supervised machine learning algorithms. Also, algorithms like decision tree and
random forest performed slightly better than MLP. However, SVC could not account
for human actions in the test scenario as well as the other machine learning models. One
reason for the poor performance of SVC might be its approach of projecting the
original dataset onto a higher-dimensional feature space. Since our dataset already
consisted of 1005 features, SVC may have failed to find a hyperplane that properly
separates the human actions.
This paper contributes to the computational modeling community by developing
supervised and unsupervised machine learning models to predict human actions in
complex search-and-retrieve environments. These models could be deployed in physical
robots to mimic human strategies and decision-making skills in an on-ground military
operation.
Although the machine learning models have produced promising results that can be
translated to real-world implementations, our research has a few limitations.
Being a lab-based simulated study, the data collection was done on computer
systems using simulated games and not in real arenas. However, the simulation was
developed using a professional game development engine, Unity 3D [18], to give
an immersive and real-world experience to the participants. All the recruited participants
were students at the Indian Institute of Technology Mandi with no real experience of
search-and-retrieve missions. To overcome this limitation, we provided a 15-minute
training session of the game to participants to make them accustomed to the simulated
environment.
Numerous ideas can be taken forward as the future scope of this research. First,
computational cognitive models such as instance-based learning [23] can be developed
to account for human actions in complex search-and-retrieve environments. Second, an
ensemble of cognitive and machine learning models can be developed to combine the
best of both worlds and account for human actions in such scenarios. Also, these models
can be used as a second agent to understand how humans perform as a team with
machines [24] in cognitively demanding tasks such as search-and-retrieve scenarios.

References
1. Fong, R. C., Scheirer, W. J. and Cox, D. D. Using human brain activity to guide machine
learning. Scientific Reports, 8, 1 (2018), 5397.
2. Vrigkas, M., Nikou, C. and Kakadiaris, I. A. A Review of Human Activity Recognition
Methods. Frontiers in Robotics and AI, 2 (2015).
3. Quinlan, J. R. Learning decision tree classifiers. ACM Computing Surveys (CSUR), 28, 1
(1996), 71-72.
4. Pal, M. Random Forest classifier for remote sensing classification. International journal of
remote sensing, 26, 1 (2005), 217-222.
5. Lau, K. and Wu, Q. Online training of support vector classifier. Pattern Recognition, 36, 8
(2003), 1913-1920.
6. Rosenblatt, F. The perceptron: a probabilistic model for information storage and
organization in the brain. Psychological review, 65, 6 (1958), 386.
7. Keller, J. M., Gray, M. R. and Givens, J. A. A fuzzy k-nearest neighbor algorithm. IEEE
transactions on systems, man, and cybernetics, 4 (1985), 580-585.
8. Fan, L., Wang, Z. and Wang, H. Human Activity Recognition Model Based on Decision
Tree. 2013.
9. Chathuramali, K. G. M. and Rodrigo, R. Faster human activity recognition with SVM.
2012.
10. Vohra, I., Uttrani, S., Rao, A. K. and Dutt, V. Evaluating the Efficacy of Different Neural
Network Deep Reinforcement Algorithms in Complex Search-and-Retrieve Virtual
Simulations. Springer International Publishing, 2022.
11. Hester, T., Vecerik, M., Pietquin, O., Lanctot, M., Schaul, T., Piot, B., Horgan, D., Quan, J.,
Sendonaris, A. and Osband, I. Deep q-learning from demonstrations. 2018.
12. Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D. and
Riedmiller, M. Playing atari with deep reinforcement learning. arXiv preprint
arXiv:1312.5602 (2013).
13. Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H.,
Gupta, A. and Abbeel, P. Soft actor-critic algorithms and applications. arXiv preprint
arXiv:1812.05905 (2018).
14. Berner, C., Brockman, G., Chan, B., Cheung, V., Dębiak, P., Dennison, C., Farhi, D.,
Fischer, Q., Hashme, S. and Hesse, C. Dota 2 with large scale deep reinforcement learning.
arXiv preprint arXiv:1912.06680 (2019).
15. Serafim, P. B. S., Nogueira, Y. L. B., Vidal, C. and Cavalcante-Neto, J. On the development
of an autonomous agent for a 3d first-person shooter game using deep reinforcement
learning. IEEE, 2017.
16. Anderson, J. R., Matessa, M. and Lebiere, C. ACT-R: A theory of higher level cognition
and its relation to visual attention. Human–Computer Interaction, 12, 4 (1997), 439-462.
17. Stevens, C., Fisher, C. R. and Morris, M. B. Toward Modeling Pilot Workload in a Cognitive
Architecture. 2021.
18. Xie, J. Research on key technologies base Unity3D game engine. 2012.
19. Abdi, H. and Williams, L. J. Principal component analysis. Wiley interdisciplinary reviews:
computational statistics, 2, 4 (2010), 433-459.
20. Grabmeier, J. L. and Lambe, L. A. Decision trees for binary classification variables grow
equally with the Gini impurity measure and Pearson's chi-square test. International journal
of business intelligence and data mining, 2, 2 (2007), 213-226.
21. Van Ooyen, A. and Nienhuis, B. Improving the convergence of the back-propagation
algorithm. Neural networks, 5, 3 (1992), 465-471.
22. Liashchynskyi, P. and Liashchynskyi, P. Grid search, random search, genetic algorithm: a
big comparison for NAS. arXiv preprint arXiv:1912.06059 (2019).
23. Gonzalez, C. and Dutt, V. Instance-based learning: Integrating sampling and repeated
decisions from experience. Psychological Review, 118, 4 (2011), 523-551.
24. Lyons, J. B., Wynne, K. T., Mahoney, S. and Roebke, M. A. Trust and human-machine
teaming: A qualitative study. Elsevier, 2019.
