
Received: 26 November 2020 Revised: 16 March 2021 Accepted: 1 July 2021

DOI: 10.4218/etrij.2020-0446

ORIGINAL ARTICLE

Real-world multimodal lifelog dataset for human behavior study

Seungeun Chung, Chi Yoon Jeong, Jeong Mook Lim, Jiyoun Lim, Kyoung Ju Noh, Gague Kim, Hyuntae Jeong

Artificial Intelligence Laboratory, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea

Correspondence
Chi Yoon Jeong, Artificial Intelligence Laboratory, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea.
Email: iamready@etri.re.kr

Funding information
Electronics and Telecommunications Research Institute, Grant/Award Number: 21ZS1100

Abstract
To understand the multilateral characteristics of human behavior and physiological markers related to physical, emotional, and environmental states, extensive lifelog data collection in a real-world environment is essential. Here, we propose a data collection method using multimodal mobile sensing and present a long-term dataset from 22 subjects and 616 days of experimental sessions. The dataset contains over 10 000 hours of data, including physiological data such as photoplethysmography, electrodermal activity, and skin temperature, in addition to the multivariate behavioral data. Furthermore, it consists of 10 372 user labels with emotional states and 590 days of sleep quality data. To demonstrate feasibility, human activity recognition was applied on the sensor data using a convolutional neural network-based deep learning model with 92.78% recognition accuracy. From the activity recognition result, we extracted the daily behavior patterns and discovered five representative models by applying spectral clustering. This demonstrates that the dataset contributes toward understanding human behavior using multimodal data accumulated throughout daily lives under natural conditions.

KEYWORDS
data collection, human behavior pattern, lifelog, real-world dataset

1 INTRODUCTION

With recent advancements in mobile devices and wearables, including smartwatches, health trackers, and fitness bands, collecting continuous sensor data at high resolution has become possible through off-the-shelf devices. Since such devices sense various physiological and behavioral information continuously over the long term, many researchers have focused on improving human health and well-being by analyzing lifelog data. Extensive real-world lifelog datasets are an essential prerequisite for understanding the multilateral characteristics of human behavior and physiological markers correlating with the individuals' physical, emotional, and environmental states. However, only a few lifelog datasets collected in the wild reflect the multifarious aspects of human life.

Many publicly available activity datasets [1–5] contain behavioral data measured using an inertial measurement unit (IMU) and target human activity recognition (HAR) through sensor data analysis [6,7]. Although vision-based HAR [8] is another interesting research topic, here, we only focus on on-body sensor data. Designing these activity

This is an open access article under the terms of the Korea Open Government License (KOGL) Type 4: Source Indication + Commercial Use Prohibition + Change Prohibition
(http://www.kogl.or.kr/info/licenseTypeEn.do).
1225–6463/$ © 2021 ETRI

426 wileyonlinelibrary.com/journal/etrij ETRI Journal. 2022;44(3):426–437.


22337326, 2022, 3, Downloaded from https://onlinelibrary.wiley.com/doi/10.4218/etrij.2020-0446 by Air University, Wiley Online Library on [22/08/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
CHUNG ET AL. 427

datasets follows from identifying activities of daily living (ADL), recorded by following a dedicated protocol in a controlled environment with a restricted number of subjects. Because the dataset size is limited, data augmentation is applied to expand the usable dataset [9]. Datasets collected under laboratory conditions are insufficient for representing natural behavioral traits during everyday life.

Recently, research has shifted focus to lifelogging [10,11] personal information, such as activities, locations, and environments, and lifelog information retrieval [12,13]. Several lifelog datasets [14,15] are predominantly multimedia data and semantic contexts, whereas the available raw sensor data are quantitatively small. Additionally, subject numbers are significantly limited because of the privacy concerns incurred by visual data collection. The ExtraSensory [16–18] dataset is the most exhaustive study conducted in a free-living condition and focused mainly on accumulating behavioral sensor data.

A review [19] on mobile sensing for behavior studies categorized assessable behavioral features into three: physical movements (physical activity and mobility patterns), social interactions (face-to-face encounters and computer-mediated communication), and daily activities (mediated and non-mediated activities). Many public datasets contain some aspects of the aforementioned features; however, deficiencies exist in measuring multilateral perspectives of daily behavioral patterns, emotional state, and sleep quality.

We focus on the emotional state as one essential determinant associated with behavioral phases of everyday life [20,21] and quality of life [22,23]. Since physiological responses are widely accepted vital markers corresponding to emotional states [24,25], we actively accumulate multiple physiological signals, including photoplethysmography (PPG), electrodermal activity (EDA), and skin temperature, naturally with self-reported emotional states. Although some physiological emotion datasets exist [26–28], their emotions were provoked by artificial stimuli, such as video clips and music recordings, specifically targeting the emotion recognition domain.

Following clinically investigated studies on the relationship between quality of life and sleep quality [29,30], which correlates notably with non-sleep daily phenomena [31], such as mood [32] and physical activities [33], sleep quality is also considered in our lifelog dataset. Although it is subjective, several efforts have tried to evaluate sleep quality using objective measures [34,35] together with subjective questionnaires [36,37]. We expand the lifelog dataset by collecting comprehensive data, including objective and subjective sleep qualities measured using a sleep-tracking sensor and ecological momentary assessments [38] on sleep quality, respectively.

The contributions of this paper are as follows:

• We propose a lifelog data collection method that extensively observes multilateral characteristics of behavioral, emotional, and environmental states of human life using multimodal mobile sensing.
• We present a long-term dataset including 28 consecutive days of experimental sessions obtained from 22 subjects each, to obtain 616 days. The dataset contains more than 2.26 TB of data including 10 372 user labels with emotional states, over 10 000 h of various sensor data including physiological data, and 590 days of sleep quality data.
• Finally, we demonstrate the feasibility of the dataset through applications including HAR and human behavior pattern extraction for practical use.

2 RELATED WORK

This section reviews previous works on activity and lifelog datasets. Table 1 presents a summary of the datasets reviewed in this section.

TABLE 1 Summary of activity and lifelog datasets

Dataset  Subjects  Duration  Labeling  On-body sensors  Behavioral data  Physiological data
Opportunity [5]  4  25.0 h  Protocol  7 IMUs  A, G, M  -
PAMAP2 [4]  9  10.0 h  Protocol  3 IMUs, HR  A, G, M  HR
UCI-HAR [1]  30  1.5 h  Protocol  P  A, G  -
mHealth [2]  10  2.0 h  Video  3 IMUs, ECG  A, G, M  ECG
NTCIR-14 [15]  2  30 days  Metadata^a  P, W, camera  Image, semantic context^b  HR, blood glucose
ImageCLEF [14]  2  30 days  Metadata^a  P, camera  Image, semantic context^b  -
ExtraSensory [16]  60  7 days  Self-report  P, W  A, G, M, GPS, audio^c, phone state  -
ETRI (Ours)  22  28 days  Self-report  P, W  A, G, M, GPS, audio^c, phone state  PPG, EDA, temperature

Smartphone (P), Smartwatch (W), Accelerometer (A), Gyroscope (G), Magnetometer (M), Heart Rate (HR)
a Attribute and category of the place and objects detected using image processing.
b Semantic locations, physical activities, or transportation media captured by an application.
c Periodically recorded background noise.

2.1 Activity datasets

The Opportunity [5] dataset collected multiple on-body IMU and environmental sensor data from four subjects in a living lab, with labels representing modes of locomotion, actions, and objects. The PAMAP2 [4] dataset used three IMUs on the wrist, chest, and ankle, together with a heart rate monitor, worn by nine subjects, containing 18 labels on diverse activities, such as walking and playing soccer. The IMU data contained in the UCI-HAR [1] dataset follow from a smartphone mounted on the left side of the waist, where 30 subjects performed six activity protocols. Since these datasets were collected in a controlled environment from experimental protocols of simple and low-level activities, they are inadequate for analyzing the complex, multilateral characteristics of human behavior.

The mHealth [2] dataset gathered 12 common ADLs in an out-of-lab environment from ten subjects, where the smartphone app self-recorded the activity protocol execution. Afterward, the video-recorded data were used as the ground-truth activity label. The experiment employed three on-body IMUs on the chest, right wrist, and left ankle, and a 2-lead ECG sensor on the chest. Although the data collection protocol was unconstrained, the dataset covers only two hours of activity data.

The ExtraSensory [16] dataset obtained labeled sensor data from smartphones and smartwatches unconstrained for approximately one week (7.6 days on average) from 60 subjects, where the participation duration varied from 2.9 to 28.1 days. The participants self-reported their activity and context by selecting relevant labels from the app, which offers more than 100 labels. Although data were collected in a real-world scenario, neither the number of participants nor the experiment duration was sufficient to completely reflect the long-term aspects of human behavior under natural conditions. Existing datasets focused on the activity or context and lacked sensor data that indicated the psychological or physiological states.

2.2 Lifelog datasets

The NII testbeds and community for information access research (NTCIR) lifelog dataset [15] is provided by the information retrieval research community, which treats lifelogging as an application of information retrieval. These datasets contain 30 days of data logged by two subjects with approximately 26 GB of image, location, and biometric information. Similarly, the cross-language evaluation forum (CLEF) group supports the ImageCLEF lifelog dataset [14] to encourage cross-language annotation and image retrieval. The ImageCLEF lifelog dataset contains 1.5 months of data from two participants, including visual data, biometric information, and semantic contents.

Since participants are highly limited (two or three subjects) in existing lifelogging datasets [15,39,40], they lack sufficient data to derive a general lifelog model. Additionally, existing datasets concentrate on the surroundings [14,41,42] rather than the participants' behavioral or emotional characteristics, thus falling short of understanding compound aspects of one's actual life.

3 ETRI LIFELOG DATASET

This section first introduces our system design considerations for collecting lifelog data under natural conditions. Then, the system architecture and data collection protocol are explained, to gather an extensive dataset representing the participants' behavioral, emotional, and environmental states using multimodal mobile sensing. Finally, we discuss the details and statistics of the dataset. The dataset is publicly available at https://nanum.etri.re.kr/share/schung/ETRILifelogDataset2020?lang=En_us.

3.1 System design

The goal of our lifelogging system is to accumulate large-scale lifelog data for the long term that offer data-driven descriptions of human life from various perspectives. Therefore, we are primarily concerned with a user-friendly interface for our system, thereby enhancing experimental participation and guaranteeing the reliability of the data. Three requisite aspects of lifelog data, namely the behavioral, environmental and social, and physiological traits, are defined, and the most suitable hardware device combinations that meet our requirements are discussed. Furthermore, we organize the system with a minimum number of devices to reduce inconveniences that occur during experimental sessions. The system is composed of a smartphone, a wrist-worn health tracker, and a sleep-quality monitoring sensor.

Repeatedly labeling activity variations in everyday life is challenging, and it is almost impossible to log each activity without exception. Since labeling rates are most crucial for determining the accessibility and feasibility of the experiment, frequently changing low-level activities (for example, sitting, walking, standing, lying, and running) were omitted. We designed labels only for high-level behaviors, including semantic contexts. Since long-term experiments can deteriorate participation over time, maintaining the fidelity of the experiment for a prolonged period was considered. A user statistics page was embedded in the lifelogging application that delivers the participant's daily and weekly statistics and compares it with that

of other participants. First, the statistics are computed in a central server that integrates lifelog data from participants, and then shared on the client device.

3.2 Lifelogging system architecture

Our proposed lifelogging system consists of a mobile application on an Android smartphone, a wrist-worn health tracker, and a sleep-quality monitoring sensor, as shown in Figure 1. The lifelogging server collects data from our smartphone application and cloud servers, providing access to the collected data. To efficiently manage large amounts of lifelog data, the server uses the non-relational MongoDB DBMS, widely used in big data and real-time web applications. The server web application is implemented using the Spring framework. The smartphone application mainly comprises the labeling user interface (UI), data collection, user statistics, and data transfer modules.

FIGURE 1 Lifelogging system architecture

Labeling UI: User labels comprehend a broad range of everyday activities and semantic contexts from the time use survey published by Statistics Korea. Table 2 summarizes the categories collected for activity, social state, and place labels. To reduce the number of interactions required for each label input, we designed the labeling UI to integrate all elements in a single page, as shown in Figure 2A. The labeling UI displays the 16 high-level activity options hierarchically. The user first selects a category and then chooses the final label from the sub-category options. The context labels are designed to indicate a companion, if any, the intensity of social interaction, and the semantic place information. Emotion labels follow Russell's circumplex model of affect [43], which presents emotion along two axes: the arousal and valence axes. Our application offers two 7-point Likert scales corresponding to these axes. An interface for entering repeated routines or prearranged schedules in advance was incorporated, which requests confirmation for each event to mitigate iterated label input. A summary of the selected labels, composed of a user's daily routine, is visualized in chronological order on the timeline screen, as shown in Figure 2B. User labels appear as icons to represent the context information, while emotions are mapped as colors to display mood changes in a simple, definite way. Our application allows the user to interactively insert new labels or modify existing ones by reviewing the timeline to improve user convenience.

TABLE 2 Categories for activity, social state, and place labels

Activity (16): sleep, personal care, work, study, housework, caregiving, media, entertainment, sports, hobby, free time, shopping, regular activity; transport (mode of transportation: on foot, by bus, by car as a driver, by car as a passenger, by subway or train, by personal mobility); meal (amount of intake: large, moderate, small); social (conversation: active, moderate, passive)
Social state (3): alone; two (with a family member, a friend, a colleague, an acquaintance, or a stranger); group (with family members, friends, colleagues, acquaintances, or strangers)
Place (5): home, workplace, restaurant, other indoor, outdoor

Data collection: The 3-axis inertial sensors were sampled at 50 Hz, while the GPS coordinates were recorded every five seconds. Audio input from a microphone was recorded at 22 050 Hz every 30 min, capturing background noise and the ambient environment. Recorded audio data were processed on the smartphone to extract audio features composed of 13 Mel frequency cepstral coefficients (MFCCs) for every 40-ms frame with 30-ms overlap. Phone state information includes activity types and application usage statistics. The activity type was retrieved from the Google Awareness application programming interface (API) every minute, indicating the device state, such as in a vehicle, on a bicycle, on foot, still, or unknown. The application usage statistics captured the duration of an application running in the foreground every 30 min.
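The framing arithmetic above (40-ms frames with 30-ms overlap, that is, a hop of roughly 10 ms at 22 050 Hz) can be sketched in NumPy; `frame_signal` is an illustrative helper, not part of the paper's software, and each resulting frame would subsequently be reduced to 13 MFCCs:

```python
import numpy as np

SR = 22_050                # audio sampling rate (Hz) used in the experiment
FRAME = int(0.040 * SR)    # 40-ms frame -> 882 samples
HOP = int(0.010 * SR)      # 30-ms overlap -> ~10-ms hop (220 samples)

def frame_signal(x: np.ndarray, frame: int = FRAME, hop: int = HOP) -> np.ndarray:
    """Split a 1-D signal into overlapping frames (trailing samples dropped)."""
    n_frames = 1 + (len(x) - frame) // hop
    idx = np.arange(frame)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

# one second of synthetic audio -> 97 overlapping 882-sample frames
x = np.random.default_rng(0).standard_normal(SR)
frames = frame_signal(x)
print(frames.shape)  # (97, 882)
```

Consecutive frames share `FRAME - HOP` = 662 samples, that is, about 30 ms of overlap.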

FIGURE 2 Lifelogging application screens. (A) Labeling UI, (B) Timeline, (C) User Statistics, (D) Data upload

Finally, the weather information, including temperature, humidity, and air pollutant concentrations (that is, PM 10 and PM 2.5), at the GPS location was acquired using the AirVisual API.

User statistics: Our application visualizes various aspects of statistical data to offer useful information related to one's daily routine. In addition to the number of daily labels collected by the user, the distributions of aggregated activity and emotion labels are presented by week to deliver the user's lifelog trend over the longer term, as shown in Figure 2C. The average of all participants is also shown to compare statistics with others, which informs data collection in the experiment.

Data transfer: User labels and sensor data are accumulated in the local storage during the experimental session and transferred to our data collection server via the WiFi network whenever the user terminates the experiment at day's end. Figure 2D illustrates the data transfer UI, which also shows a summary of daily labels before uploading the data to the server.

We adopted an Empatica E4 [44] wristband to obtain medical-grade physiological signals and behavioral data. The 3-axis accelerometer data were sampled at 32 Hz, while blood volume pulse was measured from a PPG sensor at 64 Hz. Skin conductance values from an EDA sensor and peripheral skin temperature from an infrared thermopile were both recorded at a sampling rate of 4 Hz. To retrieve the sensor data stored in the local memory of the E4 device, the user connects the E4 device to a PC via a USB interface. Then, Empatica's preinstalled E4 manager software automatically uploads data to Empatica's cloud server. Finally, the experiment operator manually fetches data from the cloud to our lifelogging server.

To investigate the possible relationship between an individual's daily routine and sleep quality, we collected data during sleep at night. We adopted the Withings sleep-tracking mat [45], which is placed under the mattress once installed. Using a network connection via the Withings Health Mate application installed on the smartphone, sleep data were automatically uploaded to the Withings cloud server, from which our lifelogging server periodically fetches data through the Withings data API. Sleep data provide various sleep quality measures, such as the durations of wakefulness, light sleep, deep sleep, and REM sleep. They also provide general information, including the start and end times of sleep and the time taken to fall asleep and awaken.

3.3 Data collection protocol

The experiment ran from August 30, 2020, to October 8, 2020, with 13 male and 9 female participants from a metropolitan city in South Korea. Their ages ranged from 20 to 35 years, and most were office workers who commute periodically during weekdays. Reportedly, all participants work more than three days a week. The demographic information of the participants is summarized in Table 3. The experiment was performed with institutional review board approval (P01-202009-22-002). We deployed our lifelogging application in person during the orientation of our study. The participants were rewarded daily, accumulating weekly incentives for completing each week's experiment without missing any day.

TABLE 3 Demographic information of participants

Participant  Age (yr)  Gender  Height (cm)  BMI (kg/m2)  Dominant hand
Participant 1  27  M  166  27.6  L
Participant 2  25  F  168  29.1  L
Participant 3  32  F  166  22.5  L
Participant 4  28  M  171  23.9  R
Participant 5  31  M  177  23.9  R
Participant 6  22  F  164  26.0  R
Participant 7  30  M  173  28.1  R
Participant 8  32  M  180  26.2  R
Participant 9  20  M  170  27.7  R
Participant 10  25  M  184  23.1  R
Participant 11  32  M  175  25.5  R
Participant 12  30  F  160  21.1  R
Participant 13  35  M  177  27.1  R
Participant 14  26  F  165  24.8  R
Participant 15  27  M  182  21.7  R
Participant 16  27  F  160  18.8  R
Participant 17  28  F  170  20.8  R
Participant 18  31  M  169  27.7  R
Participant 19  29  M  169  22.1  R
Participant 20  26  M  174  19.2  R
Participant 21  27  F  166  20.7  R
Participant 22  30  F  163  25.6  R

Thus, most participants completed the 28 consecutive days of the experiment period.

The participants started the experiment in the morning with their everyday routine and continued collecting data for at least 12 h daily. The sleep-quality questionnaires were first self-reported in the morning. The participants were instructed to insert a label for contextual changes, including physical activity, semantic place, social status, and emotion. At the end of the 12-h experiment, participants sent the data to the server. An experiment supervisor managed the data quality daily and encouraged the participants to actively engage in the experiment. Finally, participants answered a second sleep-related survey that recorded the stress perception level (degree of physical and emotional stress) and the amount of caffeine and alcoholic beverage intake during the day before sleep.

After the experiment, we surveyed the participants about the experimental protocol and the devices used. The results are summarized in Figure 3. Among the 22 participants, 81.8% affirmed that they maintained their daily routine and that the number of devices managed during the experiment was reasonable, whereas 31.8% opined that the labeling method caused some nuisance in their everyday life. The E4 wristband was reported by 63.6% of respondents as the most disturbing device because of the limited user interface of the research-purpose device. However, 90.9% of the participants responded that the statistical labels were informative and useful for organizing their daily experiences. Additionally, 95.5% of the participants wanted to join additional experiments, and 59.1% affirmed willingness to continue the experiment over four weeks with the same protocol. This implies that our experimental design was acceptable for lifelogging in a real-world scenario. Overall, the survey results reflect the necessity of automatic lifelogging and self-quantification technologies for organizing daily experiences.

FIGURE 3 Post-experiment survey results

3.4 Dataset statistics

We accumulated over 2.26 TB of data with 10 372 user labels, representing the semantic activity, place, social status, and emotion at the time of labeling. Data representing behavioral traits consist of IMU data, GPS coordinates, activity types, and phone states indicating the app usage statistics, while the environmental data contain audio features reflecting ambient noise and weather information. As shown in Table 4, our dataset contains more than twice as much data as the ExtraSensory dataset. Our dataset also contains 8141 h of physiological data and 590 days of sleep-quality measurements during night sleep. We therefore assert that this is the largest dataset containing physiological data in addition to activity and lifelog data collected in a real-world scenario.

Figure 4 summarizes the occurrences and distributions of activity, place, social state, and emotion labels. Among the 16 labels listed as activity options, eight activities, including transport (20.1%), work (17.9%), meal (14.0%), media (11.8%), personal care (10.9%), free time (5.9%), housework (5.1%), and social (3.5%), comprised a large proportion (89.2%) of the activity labels. Responding to the coronavirus (COVID-19) pandemic, many participants unexpectedly worked from home during the experiments. Consequently, 43.8% of the place labels were home, while only 18.8% were denoted as the workplace. The social state distribution showed that the vast majority of labels (58.0%) were tagged when participants were alone with little social interaction.

Figure 4D shows the occurrence of emotion labels in 2D according to Russell's circumplex model of affect. The 7-point valence and arousal scales range from category 1 (negative) to category 7 (positive) and from category 1 (relaxed) to category 7 (activated) on the x- and y-axes, respectively. Category 4 denotes a neutral state for both the arousal and valence scales and occurs the most (9.9%)

TABLE 4 Overview of the dataset and details of the data


ExtraSensory [16] ETRI Dataset (Ours)
Sensor Data details (Sampling frequency) Hours Data details (Sampling frequency) Hours
Accelerometer 3-axis (40 Hz) 5138 3-axis (50 Hz) 9824
Gyroscope 3-axis (40 Hz) 4864 3-axis (50 Hz) 10 735
Magnetometer 3-axis (40 Hz) 4708 3-axis (50 Hz) 10 660
Watch Accelerometer 3-axis (25 Hz) 3512 3-axis (32 Hz) 8141
GPS Location Longitude, latitude, altitude (variable) 3512 Longitude, latitude, accuracy (5 s) 11 638
Audio 42-ms frame 13 MFCC (per min) 5036 40-ms frame 13 MFCC (30 min) 12 149
Phone State Physical phone state (per min) 5138 Foreground app usage time (30 min) 12 223
Activity type - - Transportation mode, activity (per min) 10 563
Weather - - Temperature, humidity, PM 10/2.5 (per h) 12 485
PPG - - Blood volume pulse (64 Hz) 8141
EDA - - Skin conductance (4 Hz) 8141
Infrared Thermopile - - Skin temperature (4 Hz) 8141
Sleep Sensor - - Time in bed, total sleep time, score (per day) 590 days
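The physiological streams in Table 4 arrive at different native rates (64-Hz PPG, 32-Hz watch accelerometer, 4-Hz EDA and skin temperature), so downstream analysis typically resamples them onto a common timeline. A minimal sketch of one common approach, per-epoch averaging; the helper name `to_epochs` and the 1-s epoch length are assumptions for illustration, not the paper's method:

```python
import numpy as np

def to_epochs(x: np.ndarray, fs: int, epoch_s: float = 1.0) -> np.ndarray:
    """Reduce a 1-D signal to one mean value per fixed-length epoch."""
    n = int(fs * epoch_s)
    usable = (len(x) // n) * n          # drop a partial trailing epoch
    return x[:usable].reshape(-1, n).mean(axis=1)

# 10 s of synthetic wristband streams at their native rates
t = 10
ppg = np.ones(64 * t)   # 64-Hz PPG (blood volume pulse)
acc = np.ones(32 * t)   # 32-Hz accelerometer (one axis)
eda = np.ones(4 * t)    # 4-Hz skin conductance

aligned = np.stack([to_epochs(ppg, 64), to_epochs(acc, 32), to_epochs(eda, 4)])
print(aligned.shape)  # (3, 10): three modalities on a shared 1-Hz timeline
```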

FIGURE 4 Histograms for user label occurrence and their distributions. (A) Activity labels, (B) Place labels, (C) Social labels, (D) Emotion
labels (Color bar presents the number of occurrences while the number stands for their distribution)

among the 49 combinations of the 7-point A-V scale. The low-arousal positive-valence state (quadrant IV) composed the largest proportion (37.3%) of emotion labels, followed by the high-arousal positive-valence state (quadrant I, 19.4%), the low-arousal negative-valence state (quadrant III, 6.0%), and the high-arousal negative-valence state (quadrant II, 4.5%), in that order.

FIGURE 5 Deep learning network architecture for HAR

4 APPLICATIONS OF THE DATASET

To demonstrate the applicability of the dataset for practical use, we used the dataset for two applications: HAR and human behavior pattern extraction.

4.1 HAR

We focus on HAR, which predicts low-level activities (walking or sitting) by analyzing the sensor data from on-body sensors. Since our experiment accumulated data of real-world scenarios over the long term, the user labels correspond more closely with high-level activities or semantic contexts (commuting to work) and cannot track phase changes in low-level activities with high precision. For example, walking and sitting can occur alternately during commuting, but it is almost impossible to label the low-level activity promptly while in action. Therefore, activity type data from the Google Awareness API serve as the ground truth of low-level activity with four classes: on foot, still, in vehicle, and unknown.

The activity type data were highly imbalanced, reflecting natural human activity. Among 10 563 h of activity type data, only 3% corresponded to "in vehicle" and "on foot," respectively, while 85% represented the "still" state. To account for the label imbalance, the performance and accuracy are presented using the F1-score metric.
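The paper does not state which F1 averaging it uses; macro averaging (an assumption here) weights the rare "on foot" and "in vehicle" classes equally with the dominant "still" class. A minimal NumPy sketch with toy labels:

```python
import numpy as np

def macro_f1(y_true: np.ndarray, y_pred: np.ndarray, n_classes: int) -> float:
    """Macro-averaged F1: unweighted mean of per-class F1 scores."""
    f1s = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return float(np.mean(f1s))

# toy 4-class example: 0=still, 1=on foot, 2=in vehicle, 3=unknown
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 3])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2, 3])
print(round(macro_f1(y_true, y_pred, 4), 3))  # 0.927
```

Here one "still" sample is misclassified as "on foot"; the macro F1 penalizes this on both affected classes, whereas plain accuracy would still read 90%.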

FIGURE 6 Cumulative time of experiment days of each user (solid: weekdays, dotted: weekends)

FIGURE 7 (A) Silhouette coefficients according to the number of clusters. (B) Cumulative time distribution of the five clusters

FIGURE 8 Axes of the radar chart represent the experiment day count corresponding to each cluster

We performed activity recognition using accelerometer data collected from smartphones and wristbands, respectively. Five seconds of sensor data were processed before the timestamp of each activity type label, and we segmented the sensor data using a 2.5-s window with 50% overlap. The 3-axis sensor data were transformed into a 15 × 15 matrix and fed into the convolutional neural network architecture. A network with four convolutional layers and two fully connected layers was used for model training, as shown in Figure 5.

After 5-fold cross-validation, the average recognition accuracy using the smartphone accelerometer data was 92.78% with an F1-score of 0.92, while that using the wristband accelerometer data was 82.9% with an F1-score of 0.78. Because Google's activity type is determined by the sensors embedded in the smartphone, its accelerometer data showed better performance than the wristband data, as expected. Since the accuracy of the smartphone data is reliable, the discrepancy could be resolved in future work by learning the feature differences of the wristband data from the smartphone perspective and applying the model to the wristband data via transfer learning.

4.2 Human behavior pattern extraction

From the activity recognition results, we further investigated the daily behavioral patterns of each user. We extracted the activity recognition results from the sensor data stream and accumulated the occurrences of each activity every 10 min, representing the distribution of activities per unit time. To describe daily behavior patterns, we consider the active portion, namely the cumulative time recognized as "walking," as a feature vector. The feature vector consists of 144 slots, representing 24 h in 10-min units, starting from 5:00 AM.

Figure 6 shows the cumulative walking time on a log scale for each user. Each line represents a single experiment day; weekdays are drawn as solid lines and weekends as dotted lines. Many users have a consistent behavioral pattern, as most experiment days overlap with the others in each figure. As expected, the behavioral patterns during weekends differ from those of weekdays. Consequently, we clustered the feature vectors to discover the differences in behavior patterns between weekdays and weekends, and among users.

We used spectral clustering to group the feature vectors. To determine the number of clusters, we computed the silhouette coefficient, which measures the separation distance between the resulting clusters. The coefficient is nearly one when samples are far from the neighboring clusters, while a negative value indicates that samples are assigned to the wrong cluster. In Figure 7A, we plot the average silhouette coefficient (in red) and the average of the silhouette coefficients below zero (in blue) against the number of clusters. From the result, k = 5 is a reasonable number of clusters.

Figure 7B presents the clustering result, showing the cumulative time of the five representative clusters with diverse behavioral patterns. The first cluster (solid line) represents experiment days that start in the early morning and continue vigorously until the end of the experiment. The second cluster (blue line) covers the majority of experiment days, which start at about 8:00 AM and remain active during working hours. The third cluster shows a similar trend at the beginning of the day but only moderate activity during working hours. The fourth cluster starts the experiment late in the morning, while the last cluster presents a behavior pattern that is inactive throughout the day.

Finally, we visualized the experiment day counts of each cluster in radar charts for each user, as shown in Figure 8, where the axes indicate the five representative clusters. A sharp spike toward a specific cluster is interpreted as the primary behavior pattern of an individual. For example, user 02 tends to show a moderate-to-inactive behavior pattern during the experiment, while users 26, 27, and 30 show distinct behavior patterns that belong to a specific cluster (idle, inactive, and vigorous, respectively). In this way, users' daily behavior patterns can be recognized and classified by analyzing the sensor data stream.

5 CONCLUSIONS AND FUTURE WORK

In this paper, we proposed a long-term lifelog data collection method that minimizes data collection disturbance during everyday life. We presented a lifelog dataset covering 616 days of experimental sessions obtained from 22 subjects. The dataset contains more than 2.26 TB of data, including 10 372 user labels with emotional states, over 10 000 h of various sensor data, and 590 days of sleep quality data. To date, our dataset is the largest that covers multiple aspects of human life using behavioral, physiological, emotional, and environmental lifelog data collected in a real-world scenario. Among the 22 participants, 81.8% maintained their daily routine and were comfortable with the number of devices handled during the experiment. Also, 90.9% of the participants responded that the statistics of the labels were informative and useful in organizing their daily experiences. These survey results signify that our experimental design was satisfactory for lifelogging under natural conditions.

To illustrate the practicability of our dataset, HAR was applied to it, especially to the sensor data from smartphones and wristbands. Our convolutional neural network-based deep learning model recognized the activities with 92.78% accuracy using smartphone accelerometer data and 82.9% accuracy using wristband data. From the activity recognition results, we extracted the daily behavior patterns as feature vectors and discovered five representative patterns by applying spectral clustering. We confirmed that users' daily behavior patterns can be categorized by analyzing the sensor data stream.

Since our dataset contains several types of sensor data, we propose that the dataset can be used to understand

multilateral characteristics of the behavioral, emotional, and environmental states of human life. As a behavioral trait, the application usage time can be further processed to extract information about online activity and social interaction from the application categories and usage patterns. Additionally, transportation mode classification and the extraction of points of interest from GPS data can assist in understanding mobility patterns and modeling spatio-temporal routines [46]. Environmental traits such as background audio recordings can be used to understand the ambiance of a venue, while weather information can contribute to finding relevant cues affecting behavioral or emotional tendencies. From the physiological data, significant markers that correspond to specific behavioral or emotional events can be discovered. The sleep data reveal unusual daily traits or anomalies [47,48] that substantially influence sleep quality. Furthermore, our dataset can be exploited to develop personalized classifiers that capture the unique character of each participant and differentiate one person from another. We leave the aforementioned research topics as future work.

ACKNOWLEDGEMENT
This work was supported by an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government [21ZS1100, Core Technology Research for Self-Improving Integrated Artificial Intelligence System].

CONFLICT OF INTEREST
The authors declare that there are no conflicts of interest.

ORCID
Seungeun Chung https://orcid.org/0000-0001-9815-3985
Chi Yoon Jeong https://orcid.org/0000-0001-7089-2516
Jeong Mook Lim https://orcid.org/0000-0002-5535-0417
Jiyoun Lim https://orcid.org/0000-0003-4246-3081
Kyoung Ju Noh https://orcid.org/0000-0001-8492-8612
Hyuntae Jeong https://orcid.org/0000-0003-4339-1673

REFERENCES
1. D. Anguita et al., A public domain dataset for human activity recognition using smartphones, in Proc. Eur. Symp. Artif. Neural Netw. (ESANN), 2013.
2. O. Banos et al., Design, implementation and validation of a novel open framework for agile development of mobile health applications, Biomed. Eng. Online 14 (2015), no. S2, S6.
3. D. Micucci, M. Mobilio, and P. Napoletano, UniMiB SHAR: A dataset for human activity recognition using acceleration data from smartphones, Appl. Sci. 7 (2017), no. 10, 1101.
4. A. Reiss and D. Stricker, Introducing a new benchmarked dataset for activity monitoring, in Proc. Int. Symp. Wearable Comput. (Newcastle, UK), June 2012, pp. 108–109.
5. D. Roggen et al., Collecting complex activity datasets in highly rich networked sensor environments, in Proc. Int. Conf. Netw. Sens. Syst. (INSS), (Kassel, Germany), June 2010, pp. 233–240.
6. S. Chung et al., Sensor data acquisition and multimodal sensor fusion for human activity recognition using deep learning, Sensors 19 (2019), no. 7, 1716.
7. C. Y. Jeong and M. Kim, An energy-efficient method for human activity recognition with segment-level change detection and deep learning, Sensors 19 (2019), no. 17, article no. 3688.
8. O. F. İnce et al., Human activity recognition with analysis of angles between skeletal joints using an RGB-depth sensor, ETRI J. 42 (2020), no. 1, 78–89.
9. M. Kim and C. Y. Jeong, Label-preserving data augmentation for mobile sensor data, Multidimens. Syst. Signal Process. 32 (2021), no. 1, 115–129.
10. C. Gurrin, A. F. Smeaton, and A. R. Doherty, Lifelogging: Personal big data, Found. Trends Inf. Retr. 8 (2014), no. 1, 1–125, doi:10.1561/1500000033.
11. A. J. Sellen and S. Whittaker, Beyond total capture: A constructive critique of lifelogging, Commun. ACM 53 (2010), no. 5, 70–77, doi:10.1145/1735223.1735243.
12. SIGMM, LSC '18: Proceedings of the 2018 ACM Workshop on the Lifelog Search Challenge, ACM, New York, NY, USA, 2018.
13. C. Gurrin et al., LTA 2016: The first workshop on lifelogging tools and applications, in Proc. ACM Int. Conf. Multimed. (New York, NY, USA), Oct. 2016, pp. 1487–1488.
14. D.-T. Dang-Nguyen et al., Overview of imageCLEFlifelog 2019: Solve my life puzzle and lifelog moment retrieval, in Proc. Conf. Labs Eval. Forum (Lugano, Switzerland), Sept. 2019.
15. C. Gurrin et al., Overview of the NTCIR-14 lifelog-3 task, in Proc. NTCIR Conf. (Tokyo, Japan), June 2019, pp. 14–26.
16. Y. Vaizman, K. Ellis, and G. Lanckriet, Recognizing detailed human context in the wild from smartphones and smartwatches, IEEE Pervasive Comput. 16 (2017), no. 4, 62–74.
17. Y. Vaizman et al., ExtraSensory app: Data collection in-the-wild with rich user interface to self-report behavior, in Proc. CHI Conf. Hum. Factors Comput. Syst. (New York, NY, USA), Apr. 2018, pp. 1–12.
18. Y. Vaizman, N. Weibel, and G. Lanckriet, Context recognition in-the-wild: Unified model for multi-modal sensors and multi-label classification, Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1 (2018), no. 4, 1–22.
19. G. M. Harari et al., Smartphone sensing methods for studying behavior in everyday life, Curr. Opin. Behav. Sci. 18 (2017), 83–90.
20. L. A. Clark and D. Watson, Mood and the mundane: Relations between daily life events and self-reported mood, J. Pers. Soc. Psychol. 54 (1988), no. 2, 296–308.
21. P. R. Giacobbi, H. A. Hausenblas, and N. Frye, A naturalistic assessment of the relationship between personality, daily life events, leisure-time exercise, and mood, Psychol. Sport Exerc. 6 (2005), no. 1, 67–81.
22. S. N. Rogers et al., The addition of mood and anxiety domains to the University of Washington quality of life scale, Head & Neck 24 (2002), no. 6, 521–529.
23. K. J. Stewart et al., Are fitness, activity, and fatness associated with health-related quality of life and mood in older persons?, J. Cardiopulm. Rehabil. Prev. 23 (2003), no. 2, 115–121.

24. K. H. Kim, S. W. Bang, and S. R. Kim, Emotion recognition system using short-term monitoring of physiological signals, Med. Biol. Eng. Comput. 42 (2004), no. 3, 419–427.
25. L. Shu et al., A review of emotion recognition using physiological signals, Sensors 18 (2018), no. 7, 2074.
26. J. Kim and E. André, Emotion recognition based on physiological changes in music listening, IEEE Trans. Pattern Anal. Mach. Intell. 30 (2008), no. 12, 2067–2083.
27. S. Koelstra et al., DEAP: A database for emotion analysis using physiological signals, IEEE Trans. Affect. Comput. 3 (2011), no. 1, 18–31.
28. T. Song et al., MPED: A multi-modal physiological emotion database for discrete emotion recognition, IEEE Access 7 (2019), 12177–12191.
29. M.-F. Shao et al., Sleep quality and quality of life in female shift-working nurses, J. Adv. Nurs. 66 (2010), no. 7, 1565–1572.
30. J. Zeitlhofer et al., Sleep and quality of life in the Austrian population, Acta Neurol. Scand. 102 (2000), no. 4, 249–257.
31. B. Bower et al., Poor reported sleep quality predicts low positive affect in daily life among healthy and mood-disordered persons, J. Sleep Res. 19 (2010), no. 2, 323–332.
32. D. K. Thomsen et al., Rumination: Relationship with negative mood and sleep quality, Pers. Individ. Differ. 34 (2003), no. 7, 1293–1301.
33. K. J. Reid et al., Aerobic exercise improves self-reported sleep and quality of life in older adults with insomnia, Sleep Med. 11 (2010), no. 9, 934–940.
34. S. Ancoli-Israel et al., The role of actigraphy in the study of sleep and circadian rhythms, Sleep 26 (2003), no. 3, 342–392.
35. A. D. Krystal and J. D. Edinger, Measuring sleep quality, Sleep Med. 9 (2008), no. 1, S10–S17.
36. D. J. Buysse et al., The Pittsburgh sleep quality index: A new instrument for psychiatric practice and research, Psychiatry Res. 28 (1989), no. 2, 193–213.
37. G. Landry, J. Best, and T. Liu-Ambrose, Measuring sleep quality in older adults: A comparison using subjective and objective methods, Front. Aging Neurosci. 7 (2015), article no. 166.
38. S. Shiffman, A. A. Stone, and M. R. Hufford, Ecological momentary assessment, Annu. Rev. Clin. Psychol. 4 (2008), 1–32.
39. D.-T. Dang-Nguyen et al., in Proc. Int. Workshop Content-based Multimed. Indexing (New York, NY, USA), June 2017, pp. 1–6.
40. C. Gurrin et al., NTCIR lifelog: The first test collection for lifelog research, in Proc. Int. ACM SIGIR Conf. Res. Dev. Inf. Retr. (New York, NY, USA), July 2016, pp. 705–708.
41. D.-T. Dang-Nguyen et al., Overview of imageCLEFlifelog 2017: Lifelog retrieval and summarization, in Proc. ImageCLEF 2017 (Dublin, Ireland), Sept. 2017.
42. D.-T. Dang-Nguyen et al., Overview of imageCLEFlifelog 2018: Daily living understanding and lifelog moment retrieval, in Proc. Conf. Labs Eval. Forum (Avignon, France), Sept. 2018.
43. J. A. Russell, A circumplex model of affect, J. Pers. Soc. Psychol. 39 (1980), no. 6, article no. 1161.
44. Empatica, E4 wristband: Real-time physiological signals, available from https://empatica.com/research/e4 [last accessed March 2021].
45. Withings, Sleep tracking mat, available from https://www.withings.com/kr/en/sleep [last accessed March 2021].
46. M. Barzegar, A. Sadeghi-Niaraki, and M. Shakeri, Spatial experience based route finding using ontologies, ETRI J. 42 (2020), no. 2, 247–257.
47. F. Bergadano, Keyed learning: An adversarial learning framework–formalization, challenges, and anomaly detection applications, ETRI J. 41 (2019), no. 5, 608–618.
48. J. K. Bii, R. Rimiru, and R. W. Mwangi, Adaptive boosting in ensembles for outlier detection: Base learner selection and fusion via local domain competence, ETRI J. 42 (2020), no. 6, 886–898.

AUTHOR BIOGRAPHIES

Seungeun Chung received her BS and MS degrees in computer science from Seoul National University, Seoul, Republic of Korea, in 2007 and 2009, respectively, and her Ph.D. degree in computer science from North Carolina State University, NC, USA, in 2016. From 2009 to 2010, she worked as a junior researcher at Tmax R&D Center, Seongnam, Republic of Korea. She is a senior researcher in the artificial intelligence laboratory at Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea. Her research interests are in the areas of machine learning, with expertise in human behavior and context recognition.

Chi Yoon Jeong received his BS and MS degrees in electronic and electrical engineering from Pohang University of Science and Technology, Pohang, Republic of Korea, in 2002 and 2004, respectively, and his Ph.D. degree in computer science from the Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea, in 2018. He is currently a principal researcher in the artificial intelligence laboratory at Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea. His current research interests include computer vision, pattern recognition, machine learning, and emotion recognition.

Jeong Mook Lim received his BS and MS degrees in computer science from Chungnam National University, Daejeon, Republic of Korea, in 1998 and 2000, respectively. He is currently a principal research scientist in the Human Enhancement & Assistive

Technology research section at Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea. His primary research interests include understanding human behavioral patterns and psychological states with smartphones.

Jiyoun Lim received her BS, MS, and Ph.D. degrees in industrial and systems engineering from Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea, in 2005, 2007, and 2017, respectively. She worked at Korea University of Technology and Education, Cheonan, from 2013 to 2017. She is a senior researcher in the artificial intelligence laboratory at Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea. Her research interests include data engineering and human context recognition.

Kyoung Ju Noh is currently a principal researcher in the artificial intelligence laboratory at Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea. She received her BS and MS degrees in computer science from Chonbuk National University, Jeonju, Republic of Korea. She has been involved in communications and personal computing science-related projects for over a decade. She is currently developing affective computing technology for human understanding. Her research interests include human understanding, affective computing, artificial intelligence, and emotion recognition.

Gague Kim is a principal researcher in the artificial intelligence laboratory at Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea. He received his Ph.D. in electronic engineering from Kyungpook National University, Daegu, Republic of Korea, in 2000. His research interests include emotion recognition based on physiological signals, classification with imbalanced datasets, affective computing and interaction, gesture recognition, machine learning algorithms for emotion recognition, and embedded software and systems.

Hyuntae Jeong received his BS and MS degrees in electronic engineering from Chungnam National University, Daejeon, Republic of Korea, in 1993 and 1995. From 1995 to 2000, he worked at Samsung Heavy Industries R&D Center, Daejeon, Republic of Korea, developing control systems for ship experiments. Since 2001, he has worked at Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea, where he has been involved in the development of mobile and wearable computing technologies. His current research interests include cognitive computing, artificial intelligence, wearable computing, and HCI.

How to cite this article: S. Chung, J.M. Lim, J. Lim, K. Ju Noh, G. Kim, H. Jeong, Real-world multimodal lifelog dataset for human behavior study, ETRI Journal 44 (2022), 426–437. https://doi.org/10.4218/etrij.2020-0446