2021 Cartón-Llorente Absolute Reliability and Agreement Between Stryd and RunScribe Systems For The Assessment of Running Power


Original Article

Proc IMechE Part P: J Sports Engineering and Technology
1–6
© IMechE 2020
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/1754337120984644
journals.sagepub.com/home/pip

Absolute reliability and agreement between Stryd and RunScribe systems for the assessment of running power

Antonio Cartón-Llorente¹, Luis E Roche-Seruendo¹, Diego Jaén-Carrillo¹, Noel Marcen-Cinca¹ and Felipe García-Pinillos²,³

Abstract
The advent of portable power meters has revolutionized training in cycling, allowing an accurate field-based assessment of athletes. In a similar way, researchers have recently developed low-cost gait analysis equipment to assess running power in a more natural environment. The purpose of this study was to evaluate the absolute reliability of two different power meters and the agreement between these two wearable devices (i.e., Stryd™ and RunScribe™) for measuring power during treadmill running. Forty-nine endurance runners performed a running protocol on a treadmill at self-selected comfortable velocities. Power output (W) was measured using the Stryd™ and RunScribe™ systems, which were attached to the same shoe. The absolute reliability, based on the coefficient of variation, was 0.32 ± 0.29% for Stryd™ and 1.68 ± 1.49% for RunScribe™, while the standard error of the mean was 0.3 ± 0.2 W and 2.6 ± 2.5 W for Stryd™ and RunScribe™, respectively. Data from both devices showed significant correlations (r = 0.783, p < 0.001) and the ICC (0.855) indicated almost perfect reliability. Bland–Altman plots revealed no heteroscedasticity of error (r² = 0.030), although a moderate systematic bias (−12.3 ± 26.6 W) and wide limits of agreement (39.8 to −64.3 W) were found. Considering the increased popularity of using power meter devices in running, scientists, coaches, athletes, and general users should be aware that data from these devices are reliable, but not interchangeable, due to the variation shown for running power output data.
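
As a quick consistency check on the agreement figures reported in this abstract, the 95% limits of agreement can be recomputed from the systematic bias and random error using the Bland-Altman definition given later under Statistical analysis (mean difference ± 1.96 SD). This is only a sketch based on the rounded values quoted here, so the last digit may differ slightly from the published limits; the symbols \bar{d} and SD_d are notation introduced for this illustration.

```latex
\[
\text{LoA} = \bar{d} \pm 1.96\,\mathrm{SD}_d
           = -12.3 \pm 1.96 \times 26.6
           \approx -12.3 \pm 52.1\ \text{W}
\]
\[
\text{upper limit} \approx 39.8\ \text{W}, \qquad
\text{lower limit} \approx -64.4\ \text{W}
\]
```

An interval roughly 104 W wide, set against mean power outputs of about 240 to 250 W, is what motivates the conclusion that the two devices should not be used interchangeably.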

Keywords
Endurance, runners, sensors, wearables, power meters

Date received: 3 September 2020; accepted: 9 December 2020

¹Health Sciences Faculty, Universidad San Jorge, Zaragoza, Spain
²Department of Sports and Physical Education, University of Granada, Granada, Spain
³Department of Physical Education, Sports and Recreation, Universidad de La Frontera, Temuco, Chile

Corresponding author:
Antonio Cartón-Llorente, Universidad San Jorge, Campus Universitario, Villanueva de Gállego, Autov A23 km 299, Zaragoza 50830, Spain.
Email: acarton@usj.es

Introduction

Performance in endurance sports is underpinned by a range of physical and physiological conditions which adapt in response to training. Efficient monitoring of these responses is a key factor for applied researchers and practitioners to detect individual adaptations to the training programs.¹⁻³ The biological effect that a certain training or competition causes in the organism is called internal load, whereas the objective workload that an athlete completes is known as external load.⁴ A wide variety of internal load monitoring methods have been used in endurance sports, such as ratings of perceived exertion (RPE), blood lactate or heart rate and its derivatives. However, given that there is no single, definitive marker that can accurately quantify the fitness and fatigue responses to training,⁵ concurrent assessment of internal and external load is needed to determine relationships between these loads and performance.⁶ As a result, wearable devices have revolutionized training and racing by including Global Positioning System (GPS) technologies to instantly track external indicators like distance, speed or pace. However, these metrics have proven to be highly dependent on environmental factors, such as wind force or slope gradient, so a more accurate and
repeatable method was needed to assess current training stress. The inclusion of accelerometers and inertial measurement units (IMUs) in tiny wearable devices meant a great step forward for quantifying intensity in the field, as it has allowed for real-time assessment of mechanical power. Indeed, power output data proved to be more reliable and sensitive than other previous workload indicators when determining slight changes in exercise intensity.⁷ Therefore, coaches and athletes have opted to use power output data to track their current performance and manage intensity during training and racing.

In the last few years, the advent of portable power meters has drastically changed training and racing in cycling, allowing for an accurate field-based assessment of athletes.⁸ The data obtained can also be used to aid in decisions relating to cycling position and technique, competition demands, or equipment selection.⁹ In a similar way, researchers have recently developed affordable, portable gait analysis equipment (e.g., Stryd™ and RunScribe™ systems) to assess running power output in natural environments.¹⁰ Through algorithmic computations and inclusion of the runner's body mass, those wearables are able to estimate the forces generated based on temporal patterns in triaxial accelerometry and GPS tracking, considering the existing changes in running spatiotemporal parameters due to speed and slope.¹¹ Although the Stryd™ and RunScribe™ systems are relatively new tools, the validity and reliability of these systems for the assessment of power output¹² and spatiotemporal parameters¹³⁻¹⁵ have recently been evaluated. Thus, they may be useful for monitoring individuals and quantifying changes in functional performance over time. In this context, the running power data from Stryd™ have been successfully used to establish a linear power-velocity relationship to predict the power output at different submaximal running velocities,¹⁶ showing the great potential of this portable equipment. In addition, a few studies found a positive correlation between Stryd™ power data and running economy¹⁷ or metabolic demands.¹⁸ Recently, a study by Cerezuela-Espejo et al.¹² investigated the correlation between these power meters and oxygen consumption (r ≥ 0.911 and r ≥ 0.582 for Stryd™ and RunScribe™, respectively). Similarly, a study by van Dijk and van Megen¹⁹ indicated that the external mechanical power (W/kg) reported by the Stryd™ system highly correlated (r² = 0.96) with metabolic cost (VO2 in ml/kg/min). However, to the best of the authors' knowledge, the absolute reliability of each system and their level of agreement related to the power output data have not been determined.

Therefore, the aims of the current study are to determine the absolute reliability (within-subject variation) and evaluate the level of agreement between the Stryd™ and RunScribe™ systems for measuring power output during running at comfortable self-selected speeds on a motorized treadmill.

Methods

Participants
Forty-four men and five women, all amateur endurance runners (age: 26 ± 8 years; height: 1.74 ± 0.07 m; body mass: 71 ± 10 kg) volunteered to participate in the present study. Every participant was older than 18, able to run 10 km in less than 50 min, free from any injury in the lower limb for the last 6 months prior to data collection, and reported a weekly mileage greater than 12 km for the same time period. Informed consent forms were collected from all participants before testing trials after receiving detailed information of the study. This study complied with the ethical standards of the World Medical Association's Declaration of Helsinki (2013) and was approved by the Ethics Committee of the corresponding institution. It was ensured that participants understood they could abandon the study at any time.

Measures
Body height (m) and mass (kg) were determined using a precision stadiometer and weighing scale (SECA 222 and 634, respectively, SECA Corp., Hamburg, Germany) for descriptive purposes.

Power output in W was acquired using the Stryd™ power meter (Stryd power meter, Stryd Inc., Boulder, CO, USA) and RunScribe™ system (Scribe Lab Inc., San Francisco, CA, USA) placed on the laces of the same running shoe (Figure 1).

Figure 1. Stryd™ and RunScribe™ footpods placement.

Stryd™ is a carbon fiber-reinforced footpod based on a 6-axis inertial motion sensor (3-axis gyroscope, 3-axis accelerometer). Data were retrieved from the manufacturer's website (https://www.stryd.com/powercenter/analysis) into the .fit file. Then, data were analyzed using the free license software (Golden Cheetah, version 3.4) and exported as a .csv file. RunScribe™, launched in 2015, is based on a 9-axis sensor (3-axis gyroscope, 3-axis accelerometer, 3-axis magnetometer). Data from RunScribe™ were also collected from the
developer's website (https://dashboard.runscribe.com/runs) into the .csv file. Then, data from both systems were imported into Excel® (v. 2016, Microsoft, Inc., Redmond, WA, USA) and further analyzed.

Design and procedures
The testing protocol was designed to be carried out on one specific day. Before testing, intense physical activity and any food intake were avoided for 48 and 3 h, respectively. Participants wore their own running clothes and shoes in order to maintain their usual performance. A motorized treadmill (WOODWAY Pro XL, Woodway, Inc., Waukesha, WI, USA) was used for the running protocol, set at an initial velocity of 8 km/h with an increase of 1 km/h each minute until participants felt a comfortable speed was reached. Once participants felt comfortable, the running velocity was set (i.e. self-selected comfortable running velocity: 11.7 ± 1.3 km/h) and maintained for the entire protocol. An 8-min accommodation program was performed at the self-selected velocity.²⁰,²¹ After that, data were recorded for 3 min. A slope gradient of 0% was adopted during the entire protocol.

Statistical analysis
Descriptive statistics are represented as mean (±SD). Normal distribution and homogeneity tests were determined by the Shapiro-Wilk and Levene's test, respectively, before data analysis. Coefficient of variation (CV) and standard error of the mean (SEM) were calculated as an absolute reliability measurement. The CV (%) represents the within-subject variation [CV = SD/mean × 100] and it was calculated on an individual basis by considering the first-, second- and third-minute power data. The SEM indicated the standard deviation of a sampling distribution (SEM = SD/√n)²²⁻²⁴ and was also calculated on an individual basis. To determine concurrent validity, a Pearson correlation analysis was performed between Stryd™ and RunScribe™ power data. The following criteria were adopted to interpret the magnitude of correlations between measurement variables: < 0.1 (trivial), 0.1–0.3 (small), 0.3–0.5 (moderate), 0.5–0.7 (large), 0.7–0.9 (very large), and 0.9–1.0 (almost perfect), as suggested by Hopkins.²⁴ Intraclass correlation coefficients (ICC), as measures of relative reliability, were also calculated between Stryd™ and RunScribe™ for power output during running. Based on the characteristics of this experimental design and following the guidelines reported by Koo and Li,²⁵ a "two-way random-effects" model (ICC [2,k]), "mean of measurements" type, and "absolute" definition were used for the ICC measurement. The interpretation of the ICC was based on the benchmarks reported by a previous study²⁶: ICC < 0 (poor), 0–0.20 (slight), 0.21–0.40 (fair), 0.41–0.60 (moderate), 0.61–0.80 (substantial), and > 0.81 (almost perfect). Pairwise comparisons of means (paired t-test) were also conducted between data from the two devices. Ultimately, Bland-Altman plots (i.e., limits of agreement method, mean difference ± 1.96 SD)²¹ were constructed to assess systematic and proportional bias between the measured and estimated values of mechanical variables for the Stryd™ vs. RunScribe™ systems. Heteroscedasticity of error was defined as an r² > 0.1.²³ The level of significance used was p < 0.05. Data analysis was performed using SPSS (version 23, SPSS Inc., Chicago, IL, USA).

Results
A description of the participants' characteristics is shown in Table 1.

Table 1. Descriptive characteristics of the participants (mean (SD); n (%)).

Variables
Sex                Male      44 (89.8%)
                   Female    5 (10.2%)
Age (years)                  26.16 (8.35)
Height (m)                   1.74 (0.07)
Body mass (kg)               70.9 (9.5)
Velocity (km/h)              11.7 (1.3)

Reliability
Table 2 shows the CV and SEM (as measures of absolute reliability) of power output during running from both systems.

Table 2. Coefficient of variation (CV) and standard error of the mean (SEM) of power data from the Stryd™ and RunScribe™ systems.

Device        CV (%)         SEM (W)
Stryd™        0.32 ± 0.29    0.3 ± 0.2
RunScribe™    1.68 ± 1.49    2.6 ± 2.5

Comparative analysis
The power output reported by the Stryd™ and RunScribe™ systems is presented in Table 3. The paired t-test demonstrated significant differences (p = 0.002) between devices, and the Pearson correlation analysis conducted (Figure 2) showed very large correlations (r = 0.783, p < 0.001, SEE = 29.6 W) and an almost perfect relative reliability (ICC = 0.855).

Through Bland-Altman plots, Figure 3 shows the differences between the two devices (systematic bias and random error) and the degree of agreement between the two systems (95% limits of agreement). Even though heteroscedasticity of error was not found (r² = 0.030), these plots revealed a moderate systematic bias and a large random error (−12.3 ± 26.6 W) for the
power output during running at comfortable self-selected speeds obtained from the Stryd™ system as compared to the RunScribe™ system. Limits of agreement for power data during running between both systems varied widely (39.8 to −64.3 W).

Table 3. Descriptive values and comparative analysis of power output during running from the Stryd™ and RunScribe™ systems.

                    Mean (SD)                       t-test     Correlation       ICC (2,k)
                    Stryd™          RunScribe™      p-value    r (p-value)       (95% CI)
Power output (W)    238.9 ± 37.7    251.2 ± 42.0    0.002      0.783 (<0.001)    0.855 (0.707–0.924)

ICC: intraclass correlation coefficient; 95% CI: 95% confidence interval.

Figure 2. Pearson's correlation analysis between RunScribe and Stryd's power output data achieved during a 3-min treadmill running protocol. SEE: standard error of estimate. Circles indicate individual data of the crossing point between average values for each device.

Figure 3. Bland-Altman plots for the measurement of power output during running at a self-selected comfortable speed for the Stryd™ and RunScribe™ systems. The plot includes the mean difference (dotted line) and 95% limits of agreement (dashed lines), along with the regression line (solid line).

Discussion
This study aimed at determining the absolute reliability and evaluating the agreement between the Stryd™ and RunScribe™ systems for measuring power during running on a treadmill at a comfortable self-selected velocity (11.7 ± 1.3 km/h). The major findings of the current work were twofold: (i) both systems provided reliable power output data during running and, (ii) despite power data from both devices showing an almost perfect association and no heteroscedasticity of error, using the systems interchangeably may not be recommended due to the inconsistency found between systems.

Both the Stryd™ and RunScribe™ systems showed a high absolute reliability with a mean CV lower than 3% and a SEM less than 3 W, especially the Stryd™ system with a CV ranging between 0.03% and 0.6% across participants and a SEM of 0.3 ± 0.2 W. For the RunScribe™ system, the CV ranged between 0.1% and 3.2% and the SEM was 2.6 ± 2.5 W. These results seem to be supported by a recent study by Cerezuela-Espejo et al.,¹² which also found that the Stryd™ system is more reliable than RunScribe™. However, the high variability found for RunScribe™ in the aforementioned study did not match the results of the current study, likely due to their comparison between varying conditions (i.e., treadmill and track). Despite these methodological differences, the current results for both devices are consistent, as they are in line with results from several previous studies²⁷⁻²⁹ which assessed the reliability of analogous cycling power meters²⁷,²⁸ (CV 1%–3.05% and SEM ~1 ± 5 W) or PowerTap Hub²⁹ (CV 1.7%–2.7% and SEM 2.9 ± 3.3 W). Furthermore, the CV and SEM found in these studies were similar to results from the current study, highlighting the absolute reliability of the Stryd™ and RunScribe™ systems to assess power output during running on a treadmill at a comfortable velocity.

Regarding the comparative analysis between Stryd™ and RunScribe™ systems for power assessment, an almost perfect association was observed between data. Furthermore, heteroscedasticity of error was not found (r² = 0.030), thus the concurrence between both systems was strong. Despite these results, the limits of agreement between the two devices varied widely, and a large random error (26.55 W) was detected, so the comparison of
data should be made with caution. Since previous studies on spatiotemporal parameters during running have shown that longer recording intervals resulted in smaller systematic bias, smaller random errors, and narrower limits of agreement,³⁰ the differences in power assessment are also expected to decrease for longer recording periods. However, in the aforementioned work,³⁰ it was stated that the variability in the spatiotemporal parameters of running can be accurately calculated within 25–30 steps (i.e., approximately 10 s), so the 1-min intervals selected should fit the purpose of this study.

In addition, some methodological differences between systems could also play a role in this discrepancy. The Stryd™ system uses a sampling rate of 1000 Hz, whereas the RunScribe™ system's rate is only 500 Hz. In addition, although both devices base their power output calculations on kinematic data, the algorithms applied by each system may not be exactly the same. Given that there is not a unique definition of running power,³¹ different methods for calculating power output in running exist.³²,³³ Therefore, the power values obtained with each device may vary based on the assumptions made by each system. Unfortunately, the specific algorithms have not been disclosed by the companies, thus making their comparisons challenging.

Some limitations need to be considered for a proper interpretation of these results. The reliability was established based on within-subject variation (CV), so the current analysis might not be able to explain the slight inherent differences between runs performed by the same subject on different days. Notwithstanding these limitations, the current study provides positive outcomes for the absolute reliability of the Stryd™ and RunScribe™ devices to measure power output during running on a treadmill at a self-selected velocity, with the Stryd™ system reporting more reliable power data compared to the RunScribe™ device. Furthermore, a good level of agreement between devices was observed with no heteroscedasticity of error, but the large random error detected prevents their interchangeable use for running power output assessment.

Considering that the use of power meter devices for competing and training is becoming more popular, users like scientists, athletes, and coaches should know that the Stryd™ and RunScribe™ systems showed an adequate reliability for running power assessments, with the Stryd™ system exhibiting more reliable data for running power evaluation.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the University of San Jorge (Universidad San Jorge. Villanueva de Gállego, Zaragoza, Spain).

ORCID iDs
Antonio Cartón-Llorente https://orcid.org/0000-0001-5551-6037
Diego Jaén-Carrillo https://orcid.org/0000-0003-0588-0871
Noel Marcen-Cinca https://orcid.org/0000-0002-2840-9882
Felipe García-Pinillos https://orcid.org/0000-0002-7518-8234

References
1. Bourdon PC, Cardinale M, Murray A, et al. Monitoring athlete training loads: consensus statement. Int J Sports Physiol Perform 2017; 12(Suppl 2): S2-161–S2-170.
2. Barnes KR and Kilding AE. Running economy: measurement, norms, and determining factors. Sports Med Open 2015; 1(1): 8.
3. Luedke LE, Heiderscheit BC, Williams DS, et al. Influence of step rate on shin injury and anterior knee pain in high school runners. Med Sci Sports Exerc 2016; 48(7): 1244–1250.
4. Mujika I. The alphabet of sport science research starts with Q. Int J Sports Physiol Perform 2013; 8(5): 465–466.
5. Borresen J and Lambert MI. The quantification of training load, the training response and the effect on performance. Sports Med 2009; 39(9): 779–795.
6. Mujika I. Quantification of training and competition loads in endurance sports: methods and applications. Int J Sports Physiol Perform 2017; 12(Suppl 2): S2-9–S2-17.
7. Sanders D, Myers T and Akubat I. Training-intensity distribution in road cyclists: objective versus subjective measures. Int J Sports Physiol Perform 2017; 12(9): 1232–1237.
8. Sanders D, Taylor RJ, Myers T, et al. A field-based cycling test to assess predictors of endurance performance and establishing training zones. J Strength Cond Res 2020; 34(12): 3482–3488.
9. Passfield L, Hopker JG, Jobson S, et al. Knowledge is power: issues of measuring training and performance in cycling. J Sports Sci 2017; 35(14): 1426–1434.
10. Norris M, Anderson R and Kenny IC. Method analysis of accelerometers and gyroscopes in running gait: a systematic review. Proc IMechE Part P: J Sports Engineering and Technology 2014; 228(1): 3–15.
11. Garcia-Pinillos F, Latorre-Roman PA, Ramirez-Campillo R, et al. How does the slope gradient affect spatiotemporal parameters during running? Influence of athletic level and vertical and leg stiffness. Gait Posture 2019; 68: 72–77.
12. Cerezuela-Espejo V, Hernandez-Belmonte A, Courel-Ibanez J, et al. Are we ready to measure running power? Repeatability and concurrent validity of five commercial technologies. Eur J Sport Sci. Epub ahead of print April 2020. DOI: 10.1080/17461391.2020.1748117.
13. Hollis CKR, Resch JE and Hertel J. Gait mechanics as measured by a wearable sensor while running at two speeds on different surfaces. Sports Biomech 2019; 7: 1–11.
14. Garcia-Pinillos F, Roche-Seruendo LE, Marcen-Cinca N, et al. Absolute reliability and concurrent validity of the stryd system for the assessment of running stride kinematics at different velocities. J Strength Cond Res. Epub ahead of print May 2018. DOI: 10.1519/jsc.0000000000002595.
15. Koldenhoven RM and Hertel J. Validation of a wearable sensor for measuring running biomechanics. Digit Biomark 2018; 2(2): 74–78.
16. Garcia-Pinillos F, Latorre-Roman PA, Roche-Seruendo LE, et al. Prediction of power output at different running velocities through the two-point method with the Stryd™ power meter. Gait Posture 2019; 68: 238–243.
17. Austin CL, Hokanson JF, McGinnis PM, et al. The relationship between running power and running economy in well-trained distance runners. Sports 2018; 6(4): 142.
18. Aubry RL, Power GA and Burr JF. An assessment of running power as a training metric for elite and recreational runners. J Strength Cond Res 2018; 32(8): 2258–2264.
19. van Dijk H and van Megen R. The secret of running: maximum performance gains through effective power metering and training analysis. Maidenhead: Meyer & Meyer Sports, 2017.
20. Schieb DA. Kinematic accommodation of novice treadmill runners. Res Q Exerc Sport 1986; 57(1): 1–7.
21. Lavcanska V, Taylor NF and Schache AG. Familiarization to treadmill running in young unimpaired adults. Hum Mov Sci 2005; 24(4): 544–557.
22. Hopkins WG. Measures of reliability in sports medicine and science. Sports Med 2000; 30(1): 1–15.
23. Atkinson G and Nevill AM. Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Med 1998; 26(4): 217–238.
24. Hopkins WG, Marshall SW, Batterham AM, et al. Progressive statistics for studies in sports medicine and exercise science. Med Sci Sports Exerc 2009; 41(1): 3–13.
25. Koo TK and Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 2016; 15(2): 155–163.
26. Landis JR and Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33(1): 159–174.
27. Bouillod A, Pinot J, Soto-Romero G, et al. Validity, sensitivity, reproducibility, and robustness of the powertap, stages, and garmin vector power meters in comparison with the SRM device. Int J Sports Physiol Perform 2017; 12(8): 1023–1030.
28. Nimmerichter A, Schnitzer L, Prinz B, et al. Validity and reliability of the garmin vector power meter in laboratory and field cycling. Int J Sports Med 2017; 38(6): 439–446.
29. Bertucci W, Duc S, Villerius V, et al. Validity and reliability of the PowerTap mobile cycling powermeter when compared with the SRM Device. Int J Sports Med 2005; 26(10): 868–873.
30. Garcia-Pinillos F, Latorre-Roman PA, Ramirez-Campillo R, et al. Minimum time required for assessing step variability during running at submaximal velocities. J Biomech 2018; 80: 186–195.
31. Williams KR and Cavanagh PR. A model for the calculation of mechanical power during distance running. J Biomech 1983; 16(2): 115–128.
32. Fukunaga T and Matsuo A. Effect of running velocity on external mechanical power output. Ergonomics 1980; 23(2): 123–136.
33. Schepens B, Willems PA, Cavagna GA, et al. Mechanical power and efficiency in running children. Pflugers Arch 2001; 442(1): 107–116.
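
Supplementary sketch

The absolute reliability and agreement statistics defined under Statistical analysis (CV = SD/mean × 100, SEM = SD/√n, and Bland-Altman limits of agreement as mean difference ± 1.96 SD) can be illustrated with a few lines of Python. This is a minimal sketch, not the SPSS procedure used in the study. It assumes the sample standard deviation (n − 1 denominator), which the article does not specify, and the function names `cv_and_sem` and `bland_altman` as well as all numeric inputs are hypothetical values chosen for illustration.

```python
import math
import statistics


def cv_and_sem(minute_power):
    """Return (CV in %, SEM in W) for one runner's per-minute power values."""
    sd = statistics.stdev(minute_power)      # sample SD (assumed n - 1 denominator)
    mean = statistics.mean(minute_power)
    cv = sd / mean * 100.0                   # CV = SD / mean x 100
    sem = sd / math.sqrt(len(minute_power))  # SEM = SD / sqrt(n)
    return cv, sem


def bland_altman(device_a, device_b):
    """Return (bias, SD of differences, lower limit, upper limit) for A - B."""
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = statistics.mean(diffs)            # systematic bias
    sd = statistics.stdev(diffs)             # random error
    return bias, sd, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical first-, second- and third-minute power (W) for one runner.
cv, sem = cv_and_sem([240.0, 241.0, 239.0])
print(f"CV = {cv:.2f}%, SEM = {sem:.2f} W")

# Hypothetical mean power (W) per runner for two devices, not study data.
stryd = [230.0, 250.0, 210.0, 265.0]
runscribe = [245.0, 255.0, 230.0, 270.0]
bias, sd, lower, upper = bland_altman(stryd, runscribe)
print(f"bias = {bias:.2f} W, SD = {sd:.2f} W, limits = {lower:.2f} to {upper:.2f} W")
```

In the study these quantities were computed per participant and then summarized as mean ± SD across the 49 runners; the sketch stops at the per-runner and per-pair level to keep the formulas visible.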
