Nagel Ben Project Report


Final Report

Introduction

Methods

Results

Conclusion

Nagel_Ben_Project_Report Code

Ben Nagel
Control Options throughout the whole document:

Load Libraries:

library(readr)
library(knitr)
library(tidyverse)

Activities <- read_csv("Activities.csv")

## Warning: Missing column names filled in: 'X35' [35], 'X36' [36]

##
## -- Column specification --------------------------------------------------------
## cols(
## .default = col_double(),
## ActivityType = col_character(),
## Date = col_character(),
## Favorite = col_logical(),
## Title = col_character(),
## Calories = col_number(),
## Time = col_time(format = ""),
## AvgPace = col_time(format = ""),
## BestPace = col_time(format = ""),
## ElevGain = col_number(),
## ElevLoss = col_character(),
## ClimbTime = col_time(format = ""),
## BottomTime = col_time(format = ""),
## SurfaceInterval = col_time(format = ""),
## Decompression = col_character(),
## BestLapTime = col_time(format = ""),
## X35 = col_logical(),
## X36 = col_logical()
## )
## i Use `spec()` for the full column specifications.

attach(Activities)

Final Report
Introduction
I have been a competitive runner for a few years now and have been using a Garmin Forerunner 645 watch
to track my activities for the last 20 months. I often want to go back and find a specific workout or
run that was memorable and look at its data, which can be a problem when I do not remember the exact
date. However, I can often remember how far I ran or at what pace, since that is usually why I remember
the run or want to go back and find it. My solution is to create a shiny app in R that lets me enter
ranges for the different measurements recorded by my watch and filters my activities down to the run I
am seeking.
The data for this project was easy to get, especially given that it is not an existing dataset. After
I save runs from my watch, they are automatically uploaded to the Garmin Connect app. From there, I
found that if I scroll down the page far enough to load my very first runs with the watch, there is an
option to export all of the data as a csv file for Excel. The data is exported with 30 columns of
recorded measurements, from basics like distance, calories, and time to measures like location,
minimum temperature, and average stride length. Some examples of exploring the data are shown below.
From my exploratory analysis, there were a few problems in the data, which I fixed. First, some entries
contained "--" where numbers should go. R could not treat these as numbers, so I entered 0 in these
cells in Excel. Second, when I plotted distance versus time, there were two separate linear paths on
the graph, with one following the x-axis. I solved this by adding a few columns that re-express the
“Time” column as plain numbers. To convert the times to seconds, I created a new column in Excel,
formatted it as a general number, and set each cell equal to the original time cell times 86,400,
which is the number of seconds in a day. Using this new file in the graph gave one linear path, which
is how it should be. One other small thing that I fixed in Excel was removing spaces from the column
headers so that they were easier to call in R.
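The same cleanup could also be done in R rather than by hand in Excel. The sketch below is only an illustration under assumptions: the two-row CSV text, the spaced header names, and the helper `hms_to_seconds` are made up for the example and are not taken from the project files. It reads "--" as a missing value (NA) instead of 0, strips spaces from the column names, and converts an HH:MM:SS time to seconds.

```r
# Two-row stand-in for the Garmin export (header names and values are assumptions).
csv_text <- 'Activity Type,Distance,Time,Avg HR
Running,2.01,00:15:48,139
Running,3.11,00:16:11,--'

# Treat "--" as missing (NA) instead of overwriting it with 0.
activities <- read.csv(text = csv_text, na.strings = "--",
                       check.names = FALSE, stringsAsFactors = FALSE)

# Drop spaces from the column names so they are easy to call.
names(activities) <- gsub(" ", "", names(activities), fixed = TRUE)

# Convert "HH:MM:SS" strings to seconds; assumes all three parts are present.
hms_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}
activities$TimeSec <- hms_to_seconds(activities$Time)
```

Reading "--" as NA keeps placeholder zeros out of summaries such as minimums, at the cost of needing `na.rm = TRUE` in those calls.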
Another problem, which cannot be fixed, is the inaccuracy of the wrist-based device, especially for
heart rate and elevation. A graph of heart rate plotted against the average pace of the run is shown
below; if wrist-based heart rate readings were accurate, there should be a relationship between the
two. I can still use this data to filter through my runs, but it is not something that I can fully
trust. One final problem that I had with the data is the presence of large outliers. These come from
certain challenges or silly runs; for example, I did an Everest challenge where I ran up a hill for 21
hours straight, amassing over 100 miles of distance and nearly 30,000 feet of elevation gain. Another
example is when I tried to run a mile as slowly as possible. The Everesting outlier can be seen on the
first two plots below. I have removed these points from the data to make my sliders nicer. I also took
out some data points from very short activities (less than 15 seconds or 0.05 miles), which skewed the
data and were not representative of anything.
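As a rough sketch of how such points could be dropped in R, the example below filters a toy data frame, reading the short-activity rule as under 15 seconds or under 0.05 miles. The data values, the variable names, and the `max_miles` cap are assumptions made for illustration and are not the cutoffs used in the project.

```r
# Toy activities (made-up values), including one tiny activity and one extreme outlier.
runs <- data.frame(
  Distance = c(0.03, 2.01, 14.21, 100.50),  # miles
  TimeSec  = c(10, 948, 6173, 75600)        # seconds
)

# Short-activity limits follow the text above; max_miles is a hypothetical cap.
min_seconds <- 15
min_miles   <- 0.05
max_miles   <- 50

# Keep only activities that are long enough and not extreme outliers.
keep <- runs$TimeSec >= min_seconds &
  runs$Distance >= min_miles &
  runs$Distance <= max_miles
cleaned <- runs[keep, ]
```

Keeping the cutoffs in named variables makes it easy to adjust them and rerun the filter when new activities are exported.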

Exploratory Analysis
Here begins the exploratory analysis. First, I look at the types of data that were recorded for each run.

#all of the different recorded elements of the data


names(Activities)

## [1] "ActivityType" "Date" "Favorite"


## [4] "Title" "Distance" "Calories"
## [7] "Time" "TimeSec" "TimeMin"
## [10] "TimeHrs" "AvgHR" "MaxHR"
## [13] "AerobicTE" "AvgRunCadence" "MaxRunCadence"
## [16] "AvgPace" "Pace" "BestPace"

## [19] "ElevGain" "ElevLoss" "AvgStrideLength"


## [22] "AvgVerticalRatio" "AvgVerticalOscillation" "TrainingStressScore"
## [25] "Grit" "Flow" "ClimbTime"
## [28] "BottomTime" "MinTemp" "SurfaceInterval"
## [31] "Decompression" "BestLapTime" "NumberofLaps"
## [34] "MaxTemp" "X35" "X36"

Next, I make basic plots of time against distance and of calories burned against distance.
plot(Distance, TimeSec)

plot(Calories ~ Distance)

The following R code gives the dates and times of my last 10 runs by extracting the second column.

#gives dates and times of my last 10 runs


kable(Activities[1:10, 2])
Date

5/4/2021 11:09

5/3/2021 15:46

5/2/2021 16:41

5/2/2021 12:01

5/1/2021 20:17

4/30/2021 19:31

4/30/2021 16:25

4/29/2021 15:53

4/28/2021 16:32

4/27/2021 15:21

Using a similar method, I can see some of the data measurements from my last run.

#gives first 12 measurements from my last activity


kable(Activities[1, 1:12])

ActivityType   Date              Favorite  Title                       Distance  Calories  Time      TimeSec  TimeMin  TimeHrs  AvgHR  MaxHR
Running        5/4/2021 11:09    FALSE     Indianapolis Running        2.01      225       00:15:48  948      15.8     0.26     139    152

I was curious if there was a relationship between my heart rate and how fast I ran. Turns out, there was not.

#average heart rate vs. average pace, no real relationship


plot(AvgHR ~ AvgPace)

The following shows the distribution of calories burned on my runs as a boxplot.


#distribution of calories burned on my runs
boxplot(Calories)

I checked some minimums and maximums for some data, which will come in handy on my app.

#minimum average stride length (m)


min(AvgStrideLength)

## [1] 0

#maximum elevation gain


max(ElevGain)

## [1] 3432

I can use this to find when I maxed out some measurements (including just the first 12 data columns because of the extreme width
of all columns).

#shows activities with my all time max heart rate


kable(Activities[MaxHR == max(MaxHR), 1:12])

ActivityType   Date              Favorite  Title                       Distance  Calories  Time      TimeSec  TimeMin  TimeHrs  AvgHR  MaxHR
Running        10/10/2020 14:59  FALSE     Springfield Running         4.97      549       00:26:43  1603     26.72    0.45     184    199
Running        9/13/2019 18:19   FALSE     Vanderburgh County Running  3.77      281       00:20:52  1252     20.87    0.35     142    199

The following code extracts all of my runs that were between a half and a full marathon in distance, and then all of my 5K runs.
This is basically what I want my app to do: enter a range and show me the runs that fit the criterion. Again, I only include the
first 12 columns.

#displays all activities between a half marathon and full marathon distance
kable(Activities[Distance >= 13.10 & Distance <= 26.23, 1:12])

ActivityType   Date              Favorite  Title                       Distance  Calories  Time      TimeSec  TimeMin  TimeHrs  AvgHR  MaxHR
Running        4/11/2021 9:41    FALSE     Indianapolis Running        14.21     1728      01:42:53  6173     102.88   1.71     160    178
Running        3/28/2021 9:12    FALSE     Indianapolis Running        14.12     1504      01:40:01  6001     100.02   1.67     158    180
Running        3/21/2021 9:34    FALSE     Indianapolis Running        14.05     1662      01:36:37  5797     96.62    1.61     161    184
Running        3/2/2021 15:47    FALSE     Indianapolis Running        13.12     1222      01:34:32  5672     94.53    1.58     146    165
Running        2/14/2021 8:33    FALSE     Indianapolis Running        15.20     1476      01:52:25  6745     112.42   1.87     149    173
Running        1/31/2021 8:39    FALSE     Indianapolis Running        13.51     1416      01:36:27  5787     96.45    1.61     150    169
Running        1/24/2021 9:03    FALSE     Indianapolis Running        13.12     1405      01:30:49  5449     90.82    1.51     155    175
Running        1/17/2021 17:12   FALSE     Brownsburg Running          13.23     1319      01:32:02  5522     92.03    1.53     149    168
Running        1/13/2021 15:48   FALSE     Morgan County Running       13.13     1521      01:36:56  5816     96.93    1.62     160    181
Running        1/10/2021 15:26   FALSE     Hendricks County Running    15.61     1438      01:39:24  5964     99.40    1.66     153    180
Running        1/3/2021 13:56    FALSE     Delaware County Running     16.03     1566      01:48:47  6527     108.78   1.81     149    178
Running        12/27/2020 15:10  FALSE     Mooresville Running         14.52     1613      01:34:48  5688     94.80    1.58     161    192
Running        12/20/2020 12:36  FALSE     Plainfield Running          14.12     1387      01:37:21  5841     97.35    1.62     149    169
Running        12/13/2020 16:07  FALSE     Mooresville Running         16.05     1684      01:47:43  6463     107.72   1.80     160    185
Running        12/6/2020 16:45   FALSE     Morgan County Running       13.12     1305      01:27:07  5227     87.12    1.45     154    168
Running        10/30/2020 21:53  FALSE     Indianapolis Running        24.01     2453      03:07:06  11226    187.10   3.12     139    157
Running        10/25/2020 19:41  FALSE     Brownsburg Running          13.24     1276      01:30:07  5407     90.12    1.50     149    164
Trail Running  10/18/2020 8:33   FALSE     Indianapolis Trail Running  13.13     1401      01:46:55  6415     106.92   1.78     145    163
Running        10/4/2020 15:07   FALSE     Vanderburgh County Running  13.12     1229      01:37:43  5863     97.72    1.63     144    160
Running        9/27/2020 7:37    FALSE     Indianapolis Running        13.26     1287      01:50:02  6602     110.03   1.83     141    162
Running        9/20/2020 17:08   FALSE     Indianapolis Running        14.15     1339      01:44:46  6286     104.77   1.75     141    158
Trail Running  9/13/2020 9:13    FALSE     Indianapolis Trail Running  13.12     1389      01:45:43  6343     105.72   1.76     150    175
Running        9/6/2020 16:23    FALSE     Indianapolis Running        14.15     1290      01:46:52  6412     106.87   1.78     140    155
Running        7/22/2020 7:40    FALSE     Indianapolis Running        14.32     1100      01:45:18  6318     105.30   1.76     134    150
Running        7/19/2020 19:43   FALSE     Morgan County Running       17.50     1607      02:14:49  8089     134.82   2.25     144    162
Running        7/12/2020 15:17   FALSE     Morgan County Running       17.02     1493      02:07:14  7634     127.23   2.12     142    160
Running        7/5/2020 7:09     FALSE     Indianapolis Running        16.56     1498      01:58:06  7086     118.10   1.97     145    171
Running        6/28/2020 12:26   FALSE     Plainfield Running          16.01     1818      01:47:03  6423     107.05   1.78     168    188
Running        6/25/2020 9:39    FALSE     Grand Haven Running         13.73     1189      01:40:39  6039     100.65   1.68     144    179
Running        6/21/2020 19:23   FALSE     Grand Haven Running         15.22     1423      01:49:09  6549     109.15   1.82     148    177
Running        6/14/2020 16:44   FALSE     Morgan County Running       15.13     1510      01:51:07  6667     111.12   1.85     150    164
Running        5/31/2020 9:40    FALSE     Morgan County Running       14.01     1444      01:45:27  6327     105.45   1.76     151    160
Running        4/27/2020 9:03    FALSE     Morgan County Running       20.01     2413      02:45:38  9938     165.63   2.76     154    185
Running        4/18/2020 15:11   FALSE     Morgan County Running       13.28     1384      01:33:11  5591     93.18    1.55     156    184
Running        4/12/2020 8:30    FALSE     Morgan County Running       18.05     1705      02:16:52  8212     136.87   2.28     149    171
Running        4/5/2020 15:33    FALSE     Morgan County Running       17.01     2030      02:28:05  8885     148.08   2.47     159    174
Running        3/22/2020 9:05    FALSE     Plainfield Running          14.07     1616      01:38:12  5892     98.20    1.64     161    184
Running        3/15/2020 9:31    FALSE     St Petersburg Running       13.15     1401      01:45:25  6325     105.42   1.76     149    159
Running        3/8/2020 10:10    FALSE     St Pete Beach Running       13.19     1281      01:27:22  5242     87.37    1.46     154    182
Running        2/9/2020 9:01     FALSE     Indianapolis Running        13.25     1367      01:32:07  5527     92.12    1.54     151    176
Running        2/2/2020 14:10    FALSE     Indianapolis Running        13.14     1261      01:32:54  5574     92.90    1.55     149    172
Running        12/22/2019 8:17   FALSE     Plainfield Running          14.03     1453      01:36:16  5776     96.27    1.60     156    177
Running        12/8/2019 15:20   FALSE     Indianapolis Running        13.25     1192      01:32:23  5543     92.38    1.54     146    175
Running        11/24/2019 9:11   FALSE     Morgan County Running       14.52     1674      01:42:57  6177     102.95   1.72     162    192
Trail Running  10/19/2019 16:03  FALSE     Brown County Trail Running  14.10     1413      02:00:35  7235     120.58   2.01     143    166
Running        10/14/2019 9:34   FALSE     Putnam County Running       15.52     1427      01:52:25  6745     112.42   1.87     147    172
Running        9/29/2019 8:54    FALSE     Loogootee Running           14.06     1545      01:47:05  6425     107.08   1.78     155    171
Running        9/22/2019 16:19   FALSE     Indianapolis Running        15.01     1556      01:43:00  6180     103.00   1.72     156    176
Running        9/15/2019 10:12   FALSE     Indianapolis Running        14.37     1587      01:39:28  5968     99.47    1.66     159    174
Running        9/8/2019 7:59     FALSE     Indianapolis Running        14.11     1430      01:39:04  5944     99.07    1.65     154    175

#5K's
kable(Activities[Distance > 3.09 & Distance < 3.13, 1:12])

ActivityType   Date              Favorite  Title                       Distance  Calories  Time      TimeSec  TimeMin  TimeHrs  AvgHR  MaxHR
Running        9/4/2020 17:26    FALSE     Indianapolis Running        3.11      297       00:16:11  971      16.18    0.27     167    195
Running        4/16/2020 18:05   FALSE     Monrovia Running            3.11      239       00:15:54  954      15.90    0.27     146    166
Running        2/11/2020 14:52   FALSE     Indianapolis Running        3.11      315       00:21:28  1288     21.47    0.36     145    175
Running        11/22/2019 16:14  FALSE     Indianapolis Running        3.12      265       00:17:11  1031     17.18    0.29     148    162
Running        11/15/2019 15:55  FALSE     Indianapolis Running        3.11      185       00:15:07  907      15.12    0.25     135    157
Running        10/1/2019 16:37   FALSE     Morgan County Running       3.12      352       00:19:25  1165     19.42    0.32     166    180
Running        9/13/2019 18:56   FALSE     Vanderburgh County Running  3.12      366       00:22:53  1373     22.88    0.38     152    167

I want to be able to search for the different places that I’ve run, so this shows a list of everywhere I have run, and then how many
times I have run in each place. I do not believe that the counts of where I have run show up in the html file for some reason, but I
am keeping this in here anyway.

#shows all of the different places I have run, and what type of run it was (regular/trail/treadmill)
unique(Title)

## [1] "Indianapolis Running" "Allendale Running"


## [3] "South Bend Running" "Greentown Running"
## [5] "Marion Running" "Louisville Running"
## [7] "Morgan County Running" "Putnam County Running"
## [9] "Birmingham Running" "Hoover Running"
## [11] "Grandville Running" "Zeeland Running"
## [13] "Fulton County Running" "Shelbyville Running"
## [15] "Brownsburg Running" "Mooresville Running"
## [17] "Hamilton County Running" "Monrovia Running"
## [19] "Hendricks County Running" "Plainfield Running"
## [21] "Delaware County Running" "Crawfordsville Running"
## [23] "Hendricks County Trail Running" "Indianapolis Trail Running"
## [25] "Fayette County Running" "Jersey County Running"
## [27] "Effingham Running" "Springfield Running"
## [29] "Vanderburgh County Running" "Carmel Running"
## [31] "Shelby County Running" "Kokomo Running"
## [33] "Greenwood Running" "Porter Running"
## [35] "Lawrence County Running" "Coatesville Running"
## [37] "Grand Haven Running" "Ottawa County Running"
## [39] "Norton Shores Trail Running" "Nashville Running"
## [41] "Brooklyn Running" "Avon Running"
## [43] "Martinsville Trail Running" "St Petersburg Running"
## [45] "St Pete Beach Running" "Madeira Beach Running"
## [47] "Beech Grove Running" "Rochester Running"
## [49] "Brown County Running" "Vigo County Running"
## [51] "Brown County Trail Running" "Romeoville Running"
## [53] "Bolingbrook Running" "Loogootee Running"
## [55] "Upland Running" "Treadmill Running"
## [57] "Morgan County Trail Running" "Parke County Running"

#shows all of the different places I have run, what type of run it was (regular/trail/treadmill),
#and how many times I ran there
kable(aggregate(data.frame(count = Title), list(value = Title), length))

value                           count
Allendale Running               7
Avon Running                    5
Beech Grove Running             1
Birmingham Running              4
Bolingbrook Running             1
Brooklyn Running                1
Brown County Running            1
Brown County Trail Running      1
Brownsburg Running              2
Carmel Running                  9
Coatesville Running             1
Crawfordsville Running          1
Delaware County Running         6
Effingham Running               1
Fayette County Running          1
Fulton County Running           1
Grand Haven Running             5
Grandville Running              3
Greentown Running               1
Greenwood Running               1
Hamilton County Running         1
Hendricks County Running        11
Hendricks County Trail Running  3
Hoover Running                  2
Indianapolis Running            635
Indianapolis Trail Running      14
Jersey County Running           3
Kokomo Running                  1
Lawrence County Running         1
Loogootee Running               1
Louisville Running              4
Madeira Beach Running           1
Marion Running                  3
Martinsville Trail Running      1
Monrovia Running                30
Mooresville Running             41
Morgan County Running           178
Morgan County Trail Running     1
Nashville Running               2
Norton Shores Trail Running     1
Ottawa County Running           1
Parke County Running            2
Plainfield Running              32
Porter Running                  1
Putnam County Running           4
Rochester Running               1
Romeoville Running              4
Shelby County Running           3
Shelbyville Running             1
South Bend Running              1
Springfield Running             3
St Pete Beach Running           12
St Petersburg Running           10
Treadmill Running               1
Upland Running                  3
Vanderburgh County Running      13
Vigo County Running             2
Zeeland Running                 1
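
Counts like these, and the place search I have in mind for the app, could also be sketched with base R's `table()` and `grepl()`. The vector `titles` below is a made-up stand-in for the Title column and `find_places` is a hypothetical helper, so this is only an illustration and not the code used in the app.

```r
# Toy stand-in for the Title column (a made-up subset of the place names above).
titles <- c("Indianapolis Running", "Morgan County Running",
            "Indianapolis Trail Running", "Indianapolis Running",
            "Morgan County Running", "Indianapolis Running")

# Count runs per title, most frequent first.
place_counts <- sort(table(titles), decreasing = TRUE)

# Case-insensitive substring search over the distinct titles.
find_places <- function(x, pattern) {
  unique(x[grepl(pattern, x, ignore.case = TRUE)])
}
trail_titles <- find_places(titles, "trail")
```

Sorting the counts puts the most frequently run places at the top, which is easier to scan than alphabetical order.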

detach(Activities)

This is the end of exploratory analysis and I will now go into methods.

Methods
In terms of analyzing my data, there are two parts to what I wanted to accomplish in the app. First, I wanted a panel of sliders.
Second, I wanted a data table output. For the sliders, I wanted two movable handles on each one to set a minimum and a maximum.
My approach to coding these sliders started with the example code from the Investment Calculator from class. This code came
from Dr. Rafique. I set the slider limits to the minimum and maximum values in the data, so the app adapts to whatever data is
read into R. The second part of my goal for the app is the display of the filtered data table. When the sliders are changed, the
data table should update to show only the activities that meet the criteria. I was really struggling with this part at first, but
then I found the interactive McDonald’s menu shiny app, which was instrumental in my project since its slider and data table idea
is just what I had in mind. Borrowing code from this project allowed me to get my app working just how I wanted it to. This code
came from Borong Zhang.
There are a few packages that I included in the app. First is the shiny package, which allows the app to be created. The readr
package allows R to easily read my csv file. The dplyr package provides a grammar of data manipulation, basically making it easy
to work with data frames. The tidyr package helps with working with my data the way it is set up. The DT package is an R
interface to the DataTables library. The shinythemes and shinyWidgets packages allowed me to customize the app aesthetics.
A lot of the data cleaning and analysis can be found in my introduction section. Because the app format was easy to translate
between datasets, I repeated the above processes for data from Garmin Connect for my cycling and hiking/walking.
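
A minimal sketch of the slider-plus-table idea described above, assuming the shiny and DT packages are installed. It is not the app from this project: the names `filter_runs`, `run_filter_app`, and `toy_runs` are hypothetical, the toy values are made up, and only two range sliders are shown. The slider limits are taken from the data, and the table is re-filtered whenever a slider moves.

```r
# Pure filtering step: keep rows whose Distance and AvgHR fall inside the
# selected slider ranges (each range is c(min, max)).
filter_runs <- function(data, distance_range, hr_range) {
  keep <- data$Distance >= distance_range[1] & data$Distance <= distance_range[2] &
    data$AvgHR >= hr_range[1] & data$AvgHR <= hr_range[2]
  data[which(keep), , drop = FALSE]
}

# Hypothetical app wrapper; needs the shiny and DT packages when it is called.
run_filter_app <- function(data) {
  ui <- shiny::fluidPage(
    shiny::titlePanel("Run finder (sketch)"),
    shiny::sidebarLayout(
      shiny::sidebarPanel(
        # A length-2 value gives a slider with two handles (min and max).
        shiny::sliderInput("distance", "Distance (miles)",
                           min = min(data$Distance), max = max(data$Distance),
                           value = range(data$Distance)),
        shiny::sliderInput("avg_hr", "Average heart rate (bpm)",
                           min = min(data$AvgHR), max = max(data$AvgHR),
                           value = range(data$AvgHR))
      ),
      shiny::mainPanel(DT::DTOutput("runs"))
    )
  )
  server <- function(input, output, session) {
    # Re-renders automatically whenever either slider changes.
    output$runs <- DT::renderDT({
      DT::datatable(filter_runs(data, input$distance, input$avg_hr))
    })
  }
  shiny::shinyApp(ui, server)
}

# Toy data standing in for the Garmin export (values are made up).
toy_runs <- data.frame(
  Title = c("A", "B", "C"),
  Distance = c(3.11, 13.12, 24.01),
  AvgHR = c(167, 146, 139)
)
half_to_full <- filter_runs(toy_runs, c(13.10, 26.23), c(0, 250))
```

Calling `run_filter_app(toy_runs)` in an interactive session should launch the app. Keeping the filter in its own function means it can be checked without starting shiny.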

Results
Running App
The following are screenshots of the shiny apps that I created. First is the running app, which was my main focus for the project.
After that are my cycling and hiking/walking apps, which were my secondary focus.

knitr::include_graphics('/Users/Ben42/OneDrive/Desktop/RunningApp.jpg')
Cycling App
knitr::include_graphics('/Users/Ben42/OneDrive/Desktop/CyclingApp.jpg')

Walking/Hiking App
knitr::include_graphics('/Users/Ben42/OneDrive/Desktop/WalkingHikingApp.jpg')
Conclusion
I am very satisfied with my app. I wanted something that allowed me to sort and filter through my data, and I built just that. I
especially like the search bar, which makes this process even easier, and the fact that I was able to make apps for my cycling and
walking/hiking activities as well. With these apps, I can give ranges for any data points in order to sort or filter my activities.
This allows me to go back and find certain activities, or even find activities where I achieved my maximums or minimums of certain
measurements or some combination of ranges. I can also search for places that I have run, which is pretty cool. Being able to
create shiny apps is another skill that I am very happy to have developed this year and would love to use in the future.
