
Final Project: Data Engineering, 2019 SPRING

Frans Prathama (201930205)


Industrial and Management Engineering, Hankuk University of Foreign Studies

1. Introduction
In this project, we performed data preprocessing, transformation, and feature extraction on a human
activity dataset. We used the PAMAP2 dataset, which mainly contains accelerometer, gyroscope, and
magnetometer readings in the x, y, and z directions. The dataset is provided online by the UCI Machine
Learning Repository. We extracted various features, applied a classification technique (Gaussian Naïve
Bayes), and analyzed the result.

2. Dataset
We used the PAMAP2 dataset, which contains recordings of physical activities. The complete data and
related papers can be accessed at http://archive.ics.uci.edu/ml/datasets/pamap2+physical+activity+monitoring.
The data was collected from 9 subjects (8 males and 1 female) aged between 27 and 31 years, wearing 3 IMU
sensors and a heart rate monitor, and performing 12 different activities. The raw data is stored in
space-separated text files (.dat), with 1 data file per subject per session (protocol or optional). Missing
values are indicated with NaN. One line in a data file corresponds to one timestamped and labeled instance
of sensory data. According to the dataset manual, each line consists of 54 columns: a timestamp, an activity
label (the ground truth), and 52 attributes of raw sensory data. The columns contain the following data:
Column No.    Attribute
1             Timestamp
2             Activity ID
3             Heart Rate (bpm)
4-20          IMU Hand (Acc, Gyro, and Magnetometer for Hand)
21-37         IMU Chest (Acc, Gyro, and Magnetometer for Chest)
38-54         IMU Ankle (Acc, Gyro, and Magnetometer for Ankle)

The IMU sensory data contains the following items:


Column No. (within IMU block)    Attribute
1                                temperature (°C)
2-4                              3D-acceleration data (m/s²), scale: ±16g, resolution: 13-bit
5-7                              3D-acceleration data (m/s²), scale: ±6g, resolution: 13-bit
8-10                             3D-gyroscope data (rad/s)
11-13                            3D-magnetometer data (μT)
14-17                            orientation (invalid in this data collection)
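
The two column tables above can be combined into one list of 54 column names. The following is a minimal sketch using only the Python standard library; the column names (for example `hand_acc16_x`) and the helper `parse_line` are illustrative assumptions and are not defined by the dataset manual.

```python
# Field names inside one 17-column IMU block (names are illustrative assumptions).
IMU_FIELDS = (
    ["temperature"]
    + [f"acc16_{axis}" for axis in "xyz"]    # accelerometer with ±16g scale
    + [f"acc6_{axis}" for axis in "xyz"]     # accelerometer with ±6g scale
    + [f"gyro_{axis}" for axis in "xyz"]
    + [f"mag_{axis}" for axis in "xyz"]
    + [f"orient_{i}" for i in range(1, 5)]   # invalid in this data collection
)

# Columns 1-3, followed by the hand, chest, and ankle IMU blocks (17 columns each).
COLUMNS = ["timestamp", "activity_id", "heart_rate"] + [
    f"{sensor}_{field}"
    for sensor in ("hand", "chest", "ankle")
    for field in IMU_FIELDS
]


def parse_line(line):
    """Turn one space-separated line into a {column name: float} dict.

    float("NaN") returns a float NaN, so missing values survive parsing.
    """
    values = [float(token) for token in line.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return dict(zip(COLUMNS, values))
```

With this layout, `COLUMNS[3]` is the hand temperature (column 4 in the table) and `COLUMNS[53]` is the last ankle orientation value (column 54).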

The dataset contains the following activities:


Act ID    Activity           Act ID    Activity
1         lying              12        ascending stairs
2         sitting            13        descending stairs
3         standing           16        vacuum cleaning
4         walking            17        ironing
5         running            18        folding laundry
6         cycling            19        house cleaning
7         Nordic walking     20        playing soccer
9         watching TV        24        rope jumping
10        computer work      0         other (transient activities)
11        car driving
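
For later steps it can be convenient to have the activity table as a lookup. The sketch below is one possible encoding; the helper `drop_transient` and the row key `activity_id` are hypothetical, and discarding rows with activity ID 0 is an assumption rather than a step stated in this report.

```python
# Activity ID -> activity name, copied from the table above.
ACTIVITIES = {
    0: "other (transient activities)",
    1: "lying", 2: "sitting", 3: "standing", 4: "walking",
    5: "running", 6: "cycling", 7: "Nordic walking",
    9: "watching TV", 10: "computer work", 11: "car driving",
    12: "ascending stairs", 13: "descending stairs",
    16: "vacuum cleaning", 17: "ironing", 18: "folding laundry",
    19: "house cleaning", 20: "playing soccer", 24: "rope jumping",
}


def drop_transient(rows):
    """Keep only rows whose activity ID is not 0.

    Assumption: transient data is not used for classification.
    """
    return [row for row in rows if int(row["activity_id"]) != 0]
```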

3. Procedure
3.1 Data Preparation
The data preparation involved the following tasks:
a. Read all data files
b. Pre-process the dataset by handling missing values
c. Add a signal magnitude attribute for each accelerometer, gyroscope, and magnetometer for the
feature extraction process
d. Normalize the data
e. Divide the dataset into training and testing data
f. Test the data using a classification algorithm
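
Steps b to d can be sketched as follows. The report does not state which missing-value strategy or which normalization was used, so forward filling and min-max scaling are assumptions chosen for illustration, and all function names are hypothetical. The signal magnitude is taken as the Euclidean norm of the three axes.

```python
import math


def forward_fill(values):
    """Replace each NaN with the last valid value seen (leading NaNs stay NaN)."""
    filled, last = [], math.nan
    for value in values:
        if not math.isnan(value):
            last = value
        filled.append(last)
    return filled


def signal_magnitude(x, y, z):
    """Per-sample Euclidean norm sqrt(x^2 + y^2 + z^2) of a 3-axis signal."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]


def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; a constant column maps to all zeros.

    The input is expected to be free of NaN (apply this after missing values are handled).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(value - lo) / (hi - lo) for value in values]


# Tiny usage example with made-up numbers.
magnitude = signal_magnitude([3.0, 0.0], [4.0, 0.0], [0.0, 2.0])
print(magnitude)                     # [5.0, 2.0]
print(min_max_normalize(magnitude))  # [1.0, 0.0]
```

Forward filling suits a signal that is sampled less often than the others, because the last reading is carried forward until a new one arrives. Other choices, such as interpolation or dropping incomplete rows, fit the same step.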

3.2 Feature Extraction


The features that we extracted from the dataset are Mean, Median, Standard Deviation, Skewness,
and XXX.
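
The four named features can be computed as in the sketch below. It assumes that features are calculated over fixed-length sliding windows of a signal, which this report does not state; the window size, the step, and the function names are hypothetical. Skewness is computed as the third standardized moment using the population standard deviation.

```python
import statistics


def window_features(window):
    """Mean, median, standard deviation, and skewness of one window of samples."""
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)  # population standard deviation
    # Skewness as the third standardized moment; set to 0 for a constant window.
    if std == 0:
        skewness = 0.0
    else:
        skewness = sum(((v - mean) / std) ** 3 for v in window) / len(window)
    return {
        "mean": mean,
        "median": statistics.median(window),
        "std": std,
        "skewness": skewness,
    }


def sliding_windows(values, size, step):
    """Yield consecutive windows of `size` samples, advancing by `step` samples."""
    for start in range(0, len(values) - size + 1, step):
        yield values[start:start + size]
```

For example, `[window_features(w) for w in sliding_windows(signal, 4, 2)]` produces one feature dict per window of 4 samples with 50% overlap, where `signal` is any list of numbers.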

3.3 Classification
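
The report names Gaussian Naïve Bayes as the classifier but gives no further detail, so the following is only an illustrative sketch of the technique and not the setup used in this project. The split ratio, the random seed, the variance floor `var_floor`, and all function names are assumptions. Each feature is modeled per class as an independent Gaussian, and the predicted class is the one with the highest log posterior.

```python
import math
import random
from collections import defaultdict


def train_test_split(X, y, test_ratio=0.3, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_ratio)
    test, train = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate the log prior, feature means, and feature variances per class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance positive when a feature is constant in a class.
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature row."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(row, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))
```

Accuracy on the testing data is then the fraction of rows for which `predict` returns the true activity ID. A library implementation such as scikit-learn's `GaussianNB` follows the same idea and would be the more usual choice in practice.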

Result

References
http://archive.ics.uci.edu/ml/datasets/pamap2+physical+activity+monitoring
