Nagel Ben Project Report
Introduction
Methods
Results
Conclusion
Nagel_Ben_Project_Report Code
Ben Nagel
Control Options throughout the whole document:
Load Libraries:
library(readr)
library(knitr)
library(tidyverse)
## Warning: Missing column names filled in: 'X35' [35], 'X36' [36]
##
## -- Column specification --------------------------------------------------------
## cols(
## .default = col_double(),
## ActivityType = col_character(),
## Date = col_character(),
## Favorite = col_logical(),
## Title = col_character(),
## Calories = col_number(),
## Time = col_time(format = ""),
## AvgPace = col_time(format = ""),
## BestPace = col_time(format = ""),
## ElevGain = col_number(),
## ElevLoss = col_character(),
## ClimbTime = col_time(format = ""),
## BottomTime = col_time(format = ""),
## SurfaceInterval = col_time(format = ""),
## Decompression = col_character(),
## BestLapTime = col_time(format = ""),
## X35 = col_logical(),
## X36 = col_logical()
## )
## i Use `spec()` for the full column specifications.
attach(Activities)
Final Report
Introduction
I have been a competitive runner for a few years now and have been using a Garmin Forerunner 645 watch to track my activities for the last 20 months. I often want to go back to a specific memorable workout or run and look at its data, which is a problem when I do not remember the exact date. However, I can often remember how far I ran or at what pace, since that is usually why the run was memorable in the first place. My solution is to create a Shiny app in R that lets me enter ranges for the different measurements recorded by my watch and filter my activities down to the run I am looking for.
The data for this project was easy to get, especially considering that it is not an existing dataset. After I save runs from my watch, they are automatically uploaded to the Garmin Connect app. From there, I found that if I scroll down the page far enough to load my very first runs with the watch, there is an option to export all the data as a CSV file that opens in Excel. The data is exported with 30 columns of recorded measurements, ranging from basics like distance, calories, and time to measures like location, minimum temperature, and average stride length. Some examples of exploring the data are shown below.
My exploratory analysis turned up a few problems in the data, which I fixed. First, some entries contained "--" where numbers should go. R would not treat these columns as numeric, so I entered 0 in these cells in Excel. Second, when I plotted distance versus time, there were two separate linear paths on the graph, with one following the x-axis. I solved this by adding a few columns derived from the "Time" column. For each one I created a new column, formatted it as a general number, and set each cell equal to the original time cell times 86,400, the number of seconds in a day (Excel stores a time as a fraction of a day). Once the times were converted to seconds this way and the new file was used for the graph, it showed a single linear path, as it should. One other small fix in Excel was removing spaces from the column titles so they were easier to call in R.
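The two Excel fixes described above could also be done in R. The following is only a sketch of an alternative, not the method used for this report: the three-row `csv_text` sample and the `to_seconds` helper are hypothetical, and "--" is read as a missing value (NA) rather than replaced with 0. It assumes every time is written as hh:mm:ss.

```r
# Hypothetical three-row extract shaped like the Garmin export
csv_text <- "Distance,Time,AvgHR
2.01,00:15:48,139
14.21,01:42:53,160
3.11,00:15:54,--"

# Read "--" as a missing value (NA) instead of overwriting it with 0
runs <- read.csv(text = csv_text, na.strings = c("--", "NA"),
                 stringsAsFactors = FALSE)

# Convert "hh:mm:ss" strings to seconds (assumes all three parts are present)
to_seconds <- function(hms) {
  parts <- strsplit(hms, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

runs$TimeSec <- to_seconds(runs$Time)
print(runs)
```

Reading "--" as NA keeps missing readings out of minimums and averages, whereas a 0 entered by hand would show up as the minimum of that column.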
Another problem, which I cannot fix, is the inaccuracy of the wrist-based device, especially for heart rate and elevation. A graph of heart rate plotted against the average pace of the run is shown below; if wrist-based heart rate readings were accurate, there should be a relationship between the two. I can still use this data to filter through my runs, but it is not something that I can fully trust. One final problem with the data is the presence of large outliers. These come from challenges or silly runs; for example, I did an Everest challenge where I ran up a hill for 21 hours straight, amassing over 100 miles of distance and nearly 30,000 feet of elevation gain. Another example is when I tried to run a mile as slowly as possible. The Everesting outlier can be seen on the first two plots below. I have removed these points from the data so that the slider ranges are more useful. I also took out some very short activities (less than 15 seconds or 0.05 miles), which skewed the data and were not representative of anything.
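A minimal sketch of how this kind of cleanup could be done in R is given here. The sample rows, the titles, and the `max_miles` cutoff are made-up values for illustration; only the 15-second and 0.05-mile lower limits come from the text above.

```r
# Hypothetical sample: two ordinary runs, one extreme outlier, one accidental recording
runs <- data.frame(
  Title    = c("Indianapolis Running", "Everest challenge", "Watch test", "Monrovia Running"),
  Distance = c(2.01, 100.4, 0.02, 3.11),
  TimeSec  = c(948, 75600, 10, 954),
  stringsAsFactors = FALSE
)

min_seconds <- 15    # lower limit from the text
min_miles   <- 0.05  # lower limit from the text
max_miles   <- 50    # assumed upper limit, chosen only for this example

# Keep activities that are long enough and not extreme outliers
cleaned <- subset(runs, TimeSec >= min_seconds &
                        Distance >= min_miles &
                        Distance <= max_miles)
print(cleaned)
```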
Exploratory Analysis
Here begins the exploratory analysis. First, I look at the types of data that were recorded for each run.
Next, I make basic plots of distance against time and of calories burned by distance.
plot(Distance, TimeSec)
plot(Calories ~ Distance)
The following R code gives the dates and times of my last 10 runs by extracting the second column.
5/4/2021 11:09
5/3/2021 15:46
5/2/2021 16:41
5/2/2021 12:01
5/1/2021 20:17
4/30/2021 19:31
4/30/2021 16:25
4/29/2021 15:53
4/28/2021 16:32
4/27/2021 15:21
Using a similar method, I can see some of the data measurements from my last run.
ActivityType   Date              Favorite  Title                       Distance  Calories  Time      TimeSec  TimeMin  TimeHrs  AvgHR  MaxHR
Running        5/4/2021 11:09    FALSE     Indianapolis Running        2.01      225       00:15:48  948      15.8     0.26     139    152
I was curious whether there was a relationship between my heart rate and how fast I ran. It turns out there was not.
I checked the minimums and maximums of some measurements, which will come in handy in my app.
## [1] 0
## [1] 3432
I can use this to find when I maxed out some measurements (just including the first 12 data columns due to the extreme width of
all columns).
ActivityType   Date              Favorite  Title                       Distance  Calories  Time      TimeSec  TimeMin  TimeHrs  AvgHR  MaxHR
Running        10/10/2020 14:59  FALSE     Springfield Running         4.97      549       00:26:43  1603     26.72    0.45     184    199
Running        9/13/2019 18:19   FALSE     Vanderburgh County Running  3.77      281       00:20:52  1252     20.87    0.35     142    199
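The code for these lookups is not shown in the report. One way they could be written is sketched below, assuming the maximum being looked up is MaxHR; the three-row `Activities` sample is a hypothetical stand-in for the full data.

```r
# Hypothetical sample standing in for the full Activities data frame
Activities <- data.frame(
  Date  = c("10/10/2020 14:59", "9/13/2019 18:19", "5/4/2021 11:09"),
  Title = c("Springfield Running", "Vanderburgh County Running", "Indianapolis Running"),
  MaxHR = c(199, 199, 152),
  stringsAsFactors = FALSE
)

# Smallest and largest recorded values for one measurement
hr_range <- range(Activities$MaxHR)

# Every row that ties for the maximum (which.max() would return only the first)
max_rows <- Activities[Activities$MaxHR == max(Activities$MaxHR), ]
print(max_rows)
```

Comparing against `max()` rather than using `which.max()` matters here because two runs share the same maximum.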
The following code extracts all of my runs that were between a half and full marathon distance, and then all of my 5K runs. This
is basically what I would want my app to do: enter a range and show me the runs that fit the criteria. Again, I only include the
first 12 columns.
#displays all activities between a half marathon and full marathon distance
kable(Activities[Distance >= 13.10 & Distance <= 26.23, 1:12])
ActivityType   Date              Favorite  Title                       Distance  Calories  Time      TimeSec  TimeMin  TimeHrs  AvgHR  MaxHR
Running        4/11/2021 9:41    FALSE     Indianapolis Running        14.21     1728      01:42:53  6173     102.88   1.71     160    178
Running        3/28/2021 9:12    FALSE     Indianapolis Running        14.12     1504      01:40:01  6001     100.02   1.67     158    180
Running        3/21/2021 9:34    FALSE     Indianapolis Running        14.05     1662      01:36:37  5797     96.62    1.61     161    184
Running        3/2/2021 15:47    FALSE     Indianapolis Running        13.12     1222      01:34:32  5672     94.53    1.58     146    165
Running        2/14/2021 8:33    FALSE     Indianapolis Running        15.20     1476      01:52:25  6745     112.42   1.87     149    173
Running        1/31/2021 8:39    FALSE     Indianapolis Running        13.51     1416      01:36:27  5787     96.45    1.61     150    169
Running        1/24/2021 9:03    FALSE     Indianapolis Running        13.12     1405      01:30:49  5449     90.82    1.51     155    175
Running        1/17/2021 17:12   FALSE     Brownsburg Running          13.23     1319      01:32:02  5522     92.03    1.53     149    168
Running        1/13/2021 15:48   FALSE     Morgan County Running       13.13     1521      01:36:56  5816     96.93    1.62     160    181
Running        1/10/2021 15:26   FALSE     Hendricks County Running    15.61     1438      01:39:24  5964     99.40    1.66     153    180
Running        1/3/2021 13:56    FALSE     Delaware County Running     16.03     1566      01:48:47  6527     108.78   1.81     149    178
Running        12/27/2020 15:10  FALSE     Mooresville Running         14.52     1613      01:34:48  5688     94.80    1.58     161    192
Running        12/20/2020 12:36  FALSE     Plainfield Running          14.12     1387      01:37:21  5841     97.35    1.62     149    169
Running        12/13/2020 16:07  FALSE     Mooresville Running         16.05     1684      01:47:43  6463     107.72   1.80     160    185
Running        12/6/2020 16:45   FALSE     Morgan County Running       13.12     1305      01:27:07  5227     87.12    1.45     154    168
Running        10/30/2020 21:53  FALSE     Indianapolis Running        24.01     2453      03:07:06  11226    187.10   3.12     139    157
Running        10/25/2020 19:41  FALSE     Brownsburg Running          13.24     1276      01:30:07  5407     90.12    1.50     149    164
Trail Running  10/18/2020 8:33   FALSE     Indianapolis Trail Running  13.13     1401      01:46:55  6415     106.92   1.78     145    163
Running        10/4/2020 15:07   FALSE     Vanderburgh County Running  13.12     1229      01:37:43  5863     97.72    1.63     144    160
Running        9/27/2020 7:37    FALSE     Indianapolis Running        13.26     1287      01:50:02  6602     110.03   1.83     141    162
Running        9/20/2020 17:08   FALSE     Indianapolis Running        14.15     1339      01:44:46  6286     104.77   1.75     141    158
Trail Running  9/13/2020 9:13    FALSE     Indianapolis Trail Running  13.12     1389      01:45:43  6343     105.72   1.76     150    175
Running        9/6/2020 16:23    FALSE     Indianapolis Running        14.15     1290      01:46:52  6412     106.87   1.78     140    155
Running        7/22/2020 7:40    FALSE     Indianapolis Running        14.32     1100      01:45:18  6318     105.30   1.76     134    150
Running        7/19/2020 19:43   FALSE     Morgan County Running       17.50     1607      02:14:49  8089     134.82   2.25     144    162
Running        7/12/2020 15:17   FALSE     Morgan County Running       17.02     1493      02:07:14  7634     127.23   2.12     142    160
Running        7/5/2020 7:09     FALSE     Indianapolis Running        16.56     1498      01:58:06  7086     118.10   1.97     145    171
Running        6/28/2020 12:26   FALSE     Plainfield Running          16.01     1818      01:47:03  6423     107.05   1.78     168    188
Running        6/25/2020 9:39    FALSE     Grand Haven Running         13.73     1189      01:40:39  6039     100.65   1.68     144    179
Running        6/21/2020 19:23   FALSE     Grand Haven Running         15.22     1423      01:49:09  6549     109.15   1.82     148    177
Running        6/14/2020 16:44   FALSE     Morgan County Running       15.13     1510      01:51:07  6667     111.12   1.85     150    164
Running        5/31/2020 9:40    FALSE     Morgan County Running       14.01     1444      01:45:27  6327     105.45   1.76     151    160
Running        4/27/2020 9:03    FALSE     Morgan County Running       20.01     2413      02:45:38  9938     165.63   2.76     154    185
Running        4/18/2020 15:11   FALSE     Morgan County Running       13.28     1384      01:33:11  5591     93.18    1.55     156    184
Running        4/12/2020 8:30    FALSE     Morgan County Running       18.05     1705      02:16:52  8212     136.87   2.28     149    171
Running        4/5/2020 15:33    FALSE     Morgan County Running       17.01     2030      02:28:05  8885     148.08   2.47     159    174
Running        3/22/2020 9:05    FALSE     Plainfield Running          14.07     1616      01:38:12  5892     98.20    1.64     161    184
Running        3/15/2020 9:31    FALSE     St Petersburg Running       13.15     1401      01:45:25  6325     105.42   1.76     149    159
Running        3/8/2020 10:10    FALSE     St Pete Beach Running       13.19     1281      01:27:22  5242     87.37    1.46     154    182
Running        2/9/2020 9:01     FALSE     Indianapolis Running        13.25     1367      01:32:07  5527     92.12    1.54     151    176
Running        2/2/2020 14:10    FALSE     Indianapolis Running        13.14     1261      01:32:54  5574     92.90    1.55     149    172
Running        12/22/2019 8:17   FALSE     Plainfield Running          14.03     1453      01:36:16  5776     96.27    1.60     156    177
Running        12/8/2019 15:20   FALSE     Indianapolis Running        13.25     1192      01:32:23  5543     92.38    1.54     146    175
Running        11/24/2019 9:11   FALSE     Morgan County Running       14.52     1674      01:42:57  6177     102.95   1.72     162    192
Trail Running  10/19/2019 16:03  FALSE     Brown County Trail Running  14.10     1413      02:00:35  7235     120.58   2.01     143    166
Running        10/14/2019 9:34   FALSE     Putnam County Running       15.52     1427      01:52:25  6745     112.42   1.87     147    172
Running        9/29/2019 8:54    FALSE     Loogootee Running           14.06     1545      01:47:05  6425     107.08   1.78     155    171
Running        9/22/2019 16:19   FALSE     Indianapolis Running        15.01     1556      01:43:00  6180     103.00   1.72     156    176
Running        9/15/2019 10:12   FALSE     Indianapolis Running        14.37     1587      01:39:28  5968     99.47    1.66     159    174
Running        9/8/2019 7:59     FALSE     Indianapolis Running        14.11     1430      01:39:04  5944     99.07    1.65     154    175
#5K's
kable(Activities[Distance > 3.09 & Distance < 3.13, 1:12])
ActivityType   Date              Favorite  Title                       Distance  Calories  Time      TimeSec  TimeMin  TimeHrs  AvgHR  MaxHR
Running        9/4/2020 17:26    FALSE     Indianapolis Running        3.11      297       00:16:11  971      16.18    0.27     167    195
Running        4/16/2020 18:05   FALSE     Monrovia Running            3.11      239       00:15:54  954      15.90    0.27     146    166
Running        2/11/2020 14:52   FALSE     Indianapolis Running        3.11      315       00:21:28  1288     21.47    0.36     145    175
Running        11/22/2019 16:14  FALSE     Indianapolis Running        3.12      265       00:17:11  1031     17.18    0.29     148    162
Running        11/15/2019 15:55  FALSE     Indianapolis Running        3.11      185       00:15:07  907      15.12    0.25     135    157
Running        10/1/2019 16:37   FALSE     Morgan County Running       3.12      352       00:19:25  1165     19.42    0.32     166    180
Running        9/13/2019 18:56   FALSE     Vanderburgh County Running  3.12      366       00:22:53  1373     22.88    0.38     152    167
I would want to be able to search for the different places that I have run, so this shows a list of everywhere I have run, and then how many
times I have run there. I do not believe that the counts of where I have run show up on the html file for some reason, but I am
keeping this in here anyway.
#shows all of the different places I have run, and what type of run it was (regular/trail/treadmill)
unique(Title)
#shows all of the different places I have run, what type of run it was (regular/trail/treadmill), and how many times I ran there
kable(aggregate(data.frame(count = Title), list(value = Title), length))
value count
Allendale Running 7
Avon Running 5
Birmingham Running 4
Bolingbrook Running 1
Brooklyn Running 1
Brownsburg Running 2
Carmel Running 9
Coatesville Running 1
Crawfordsville Running 1
Effingham Running 1
Grandville Running 3
Greentown Running 1
Greenwood Running 1
Hoover Running 2
Kokomo Running 1
Loogootee Running 1
Louisville Running 4
Marion Running 3
Monrovia Running 30
Mooresville Running 41
Nashville Running 2
Porter Running 1
Rochester Running 1
Romeoville Running 4
Shelbyville Running 1
St Petersburg Running 10
Treadmill Running 1
Upland Running 3
Zeeland Running 1
detach(Activities)
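The place counts above can also be produced with `table()`, which avoids building a temporary data frame for `aggregate()`. This is an alternative sketch, not the code used in the report; the four-element `Title` vector is a hypothetical sample.

```r
# Hypothetical sample of activity titles
Title <- c("Monrovia Running", "Mooresville Running", "Monrovia Running", "Carmel Running")

# Count how often each title occurs, using the same column names as the table above
counts <- as.data.frame(table(value = Title), responseName = "count",
                        stringsAsFactors = FALSE)

# Put the most frequently visited places first
counts <- counts[order(-counts$count), ]
print(counts)
```

Passing `counts` to `kable()` would render it the same way as the other tables in this report.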
This is the end of exploratory analysis and I will now go into methods.
Methods
There are two parts to what I wanted to accomplish in the app. First, I wanted a panel of sliders. Second, I wanted a data table output. For the sliders, I wanted two movable handles on each one to set a minimum and a maximum. My approach to coding these sliders started with the example code for the Investment Calculator from class, which came from Dr. Rafique. I set each slider's minimum and maximum to the minimum and maximum in the data, so the sliders adapt to whatever data is read into R. The second part of my goal for the app is the display of the filtered data table. When the sliders are changed, the data table should update to show only the activities that meet the criteria. I struggled with this part at first, but then I found the interactive McDonald's menu Shiny app by Borong Zhang, which was instrumental in my project since its slider and data table design is just what I had in mind. Borrowing code from that project allowed me to get my app working just how I wanted it to.
I included several packages in the app. The shiny package allows the app to be created. The readr package allows R to easily read my CSV file. The dplyr package provides a grammar of data manipulation, which makes it easy to work with data frames. The tidyr package helps tidy the data into a consistent layout. The DT package is an R interface to the DataTables JavaScript library. The shinythemes and shinyWidgets packages allowed me to customize the app's appearance. Much of the data cleaning and analysis is described in my introduction section. Because the app format was easy to carry over between datasets, I repeated the above process for my cycling and hiking/walking data from Garmin Connect.
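The slider-plus-table design described above can be sketched in a few lines. This is a minimal illustration, not the code of the actual app: the `runs` sample data and the `filter_runs` helper are hypothetical, and only the shiny and DT packages are used.

```r
library(shiny)
library(DT)

# Hypothetical stand-in for the cleaned Garmin export
runs <- data.frame(
  Title    = c("Indianapolis Running", "Monrovia Running", "Morgan County Running"),
  Distance = c(2.01, 3.11, 13.13),
  TimeSec  = c(948, 954, 5816),
  stringsAsFactors = FALSE
)

# Keep rows whose Distance and TimeSec fall inside both slider ranges
filter_runs <- function(df, dist_range, time_range) {
  keep <- df$Distance >= dist_range[1] & df$Distance <= dist_range[2] &
          df$TimeSec  >= time_range[1] & df$TimeSec  <= time_range[2]
  df[keep, , drop = FALSE]
}

ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      # A length-2 `value` gives a slider with two handles (minimum and maximum);
      # the limits come from the data, so they adapt to whatever file is loaded
      sliderInput("dist", "Distance (miles)",
                  min = min(runs$Distance), max = max(runs$Distance),
                  value = range(runs$Distance)),
      sliderInput("time", "Time (seconds)",
                  min = min(runs$TimeSec), max = max(runs$TimeSec),
                  value = range(runs$TimeSec))
    ),
    mainPanel(DTOutput("table"))
  )
)

server <- function(input, output, session) {
  # Re-renders automatically whenever either slider moves
  output$table <- renderDT({
    filter_runs(runs, input$dist, input$time)
  })
}

app <- shinyApp(ui, server)
# In an interactive session, runApp(app) launches the app.
```

Keeping the filtering in a plain function such as `filter_runs` means it can be checked without starting the app, and the table rendered by DT includes its own search box.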
Results
Running App
The following are screenshots of the shiny apps that I created. First is the running app, which was my main focus for the project.
After that are my cycling and hiking/walking apps, which were my secondary focus.
knitr::include_graphics('/Users/Ben42/OneDrive/Desktop/RunningApp.jpg')
Cycling App
knitr::include_graphics('/Users/Ben42/OneDrive/Desktop/CyclingApp.jpg')
Walking/Hiking App
knitr::include_graphics('/Users/Ben42/OneDrive/Desktop/WalkingHikingApp.jpg')
Conclusion
I am very satisfied with my app. I wanted something that allowed me to sort and filter through my data, and I built just that. I especially like the search bar, which makes this process even easier, and the fact that I was able to make apps for my cycling and walking/hiking activities as well. With these apps, I can give ranges for any measurement in order to sort or filter my activities. This allows me to go back and find certain activities, or even find activities where I reached the maximum or minimum of certain measurements, or some combination of ranges. I can also search for places where I have run, which is pretty cool. Being able to create Shiny apps is another skill that I am very happy to have developed this year and would love to use in the future.