
1 THE TEST OF GROSS MOTOR DEVELOPMENT-THIRD EDITION: A BIFACTOR MODEL

THE TEST OF GROSS MOTOR DEVELOPMENT-THIRD EDITION: A BIFACTOR MODEL, DIMENSIONALITY, AND MEASUREMENT INVARIANCE

Sedigheh Salami1; Paulo Felipe Ribeiro Bandeira2; Cristiano Mauro Assis Gomes3 and Parvaneh Shamsipour Dehkordi4
1 Alzahra University, Tehran, Iran; sed60sal@gmail.com
2 Universidade Regional do Cariri – Urca Grupo de Estudo, Aplicação e Pesquisa em Avaliação Motora; Crato, Brasil; paulo.bandeira@urca.br
3 Universidade Federal de Minas Gerais UFMG; Belo Horizonte, Brasil; cristianomaurogomes@gmail.com
4 Alzahra University; Tehran, Iran; p.shamsipour@alzahra.ac.ir
Address correspondence to: Sedigheh Salami; sed60sal@gmail.com

Aim: To examine the latent structure of the Test of Gross Motor Development, 3rd Edition (TGMD-3) with a bifactor modeling approach. The study also examines the dimensionality of the test, the model-based reliability of the general and specific contributions of its subscales, and its measurement invariance. Methods: Using a sample of 496 Iranian children (M age = 7.23 ± 2.03 years; 53.8% female) from the five main geographic regions of Tehran city, three alternative measurement models were tested: (a) a unidimensional model, (b) a correlated 2-factor model, and (c) a bifactor model. Results: The totality of results, including item loadings, goodness-of-fit indexes and reliability estimates, supported the bifactor model and provided strong evidence of a general fundamental movement factor. The reliability of the subscale scores was poor; scoring, reporting and interpreting the subscale scores is therefore probably not justifiable. This suggests that the 2 traditionally hypothesized factors are better understood as “grouping” factors than as representations of latent constructs. Furthermore, the bifactor model appears invariant across gender. Conclusion: This study is the first to apply the bifactor model to the TGMD-3 and offers new insights regarding the application and interpretation of the test battery most widely used with children.
Keywords: fundamental movement skills, bifactor analysis, measurement invariance

Introduction

Fundamental motor skills (FMS), also known as gross motor skills, refer to basic, goal-directed movement patterns (Burton & Miller, 1998). FMS play a crucial role in more advanced, specialized and sport-specific motor skills, enabling youth to participate in a wide range of organized and non-organized physical activities (Clark & Metcalfe, 2002; Gallahue, Ozmun & Goodway, 2012). From a dimensional point of view, these skills are globally categorized (Clark & Metcalfe, 2002; Stodden et al., 2008) as object control (e.g., kick, catch), locomotor (e.g., run, hop) and stability (dynamic and static balance) skills (Gallahue, Ozmun & Goodway, 2012). FMS are an important factor in the development of physical activity, health-related fitness and perceived motor competence (Robinson et al., 2015; Morgan et al., 2013).

It has been demonstrated that inadequate levels of gross motor skills result in lower levels of perceived motor competence (Robinson, Rudisill, & Goodway, 2009), self-esteem, and social acceptance (Skinner & Piek, 2001; Valentini, Zanella, & Webster, 2017). Additionally, motor delays in children are associated with poor performance in ball skills (Pienaar, Visagie, & Leonard, 2015) and locomotor skills (Robinson et al., 2011). Early assessment of gross motor skills in children with and without disabilities during the preschool and elementary school years can aid in the detection and continuous monitoring of movement delays that could affect other aspects of development, including the cognitive and affective domains (Piek, Dawson, Smith, & Gasson, 2008). Consequently, accurate evaluation and measurement of motor skills is an important step towards support and intervention for children showing delays (Burton & Miller, 1998; Ulrich, 2017).

There are several assessment tools for evaluating motor skills in early childhood (Cools, Martelaer, Samaey, & Andries, 2009). The Test of Gross Motor Development (TGMD; Ulrich, 1985, 2000, 2013) is one of the most widely used instruments in clinical, educational and research settings for establishing the current level of gross motor skill development of children with and without disabilities (Ulrich, 2017). The third edition of the Test of Gross Motor Development (TGMD-3; Ulrich, 2013) is one of the few standardized behavioral assessment tools that is both norm- and criterion-referenced and both process- and product-oriented. It evaluates the qualitative aspects of gross movement skills in children aged three to ten. The TGMD-3 assesses two dimensions: locomotor skills (transporting one’s body through space) and ball skills (manipulation or projection of objects; Clark & Metcalfe, 2002; Gallahue, Ozmun, & Goodway, 2012). The TGMD-3 contains 13 different items/tasks, six measuring locomotor skills and seven measuring ball skills, which can be summarized in two separate subscale scores (Webster & Ulrich, 2017).

Since the publication of the third edition of the TGMD (Ulrich, 2013), several studies have examined its reliability and construct validity in different populations around the world (Webster & Ulrich, 2017; Estevan et al., 2017; Valentini, Zanella & Webster, 2017; Wagner et al., 2017). The TGMD-3 has been shown to have acceptable test–retest reliability (Webster & Ulrich, 2017; Estevan et al., 2017; Valentini, Zanella & Webster, 2017). Weak to moderate correlations between the subscales of the TGMD-3 (German translation) and the M-ABC 2 (German version) have been found (r ranging from .22 to .33; Wagner, Webster & Ulrich, 2017), indicating divergent validity. The validity and reliability of the TGMD-3 with visual support were confirmed in children with autism spectrum disorder (Allen et al., 2017). Concurrent validity between the TGMD-2 and TGMD-3 has been established in children with Down syndrome (Bouquet, 2015) and in children with visual impairments (Brian et al., 2018). In a longitudinal study, Temple & Foley (2017) confirmed the developmental validity of the TGMD-3 for both subscale and total scores in a sample of Canadian students followed from grade 3 to grade 4, in both genders. The total score and locomotor score of the TGMD-3 were positively related to vigorous physical activity (Webster, Martin & Staiano, 2018), whereas the total and subscale scores were negatively related to low socioeconomic status (Burns & Fu, 2018). In short, the TGMD-3 shows good construct validity.

Studies of the internal consistency of the TGMD-3 based on Cronbach’s alpha have reported acceptable to high alpha values, indicating that the total score and dimension scores are internally consistent (Webster & Ulrich, 2017; Estevan et al., 2017; Valentini, Zanella & Webster, 2017; Wagner et al., 2017; Brian et al., 2018). Various studies have also investigated the factor structure of the TGMD-3. Initially, Webster and Ulrich (2017) conducted both exploratory and confirmatory factor analysis (EFA/CFA) on the TGMD-3 using an ethnically diverse sample from the United States. Unlike the previous editions of the TGMD, which generated two latent factors, namely locomotor and object control skills (Ulrich, 1985; Ulrich, 2000), the EFA supported a single factor (i.e., gross motor skill) that explained approximately 74% of the variance. The CFA further supported this result, indicating acceptable fit for a one-factor, unidimensional solution with all items loading on a single factor. An alternative two-factor model, which is theoretically postulated for the TGMD, was also tested. The proposed two-factor model fitted the data adequately but, because of a very high interfactor correlation (r = .96), failed to support two distinct factors. Consequently, the one-factor representation of the TGMD-3 fitted the data significantly better than did the two-factor model.

Valentini, Zanella & Webster (2017) conducted a confirmatory factor analysis to investigate the factor structure of the TGMD-3 in Brazilian children (ages 3–10) from five main geographic regions of Brazil. The two-factor model was endorsed, although the correlation between the two latent factors of locomotor and ball skills was large (r = .89) and alternative measurement models were not tested. A pilot study on the psychometric properties of the TGMD-3 (German translation) conducted by Wagner, Webster & Ulrich (2017) in typically developing German children (ages 3–10) also reported support for the hypothesized two-factor model, but with a high correlation between the locomotor and ball skills latent factors (r = .82). Estevan et al. (2017) examined the factor structure of the TGMD-3 (Spanish version) via CFA in typically developing Spanish children (ages 3–11). Two alternative measurement models were tested (i.e., a unidimensional model and a bidimensional model). The results supported both models: a two-factor solution representing locomotor and ball skills with a very high interfactor correlation (r = .91), and a one-factor, unidimensional solution labeled FMS. Finally, Brian et al. (2018) evaluated the factor structure of the TGMD-3 via CFA in a multiethnic sample of children and adolescents with visual impairments (ages 9–18). The results supported a two-factor solution representing locomotor and ball skills, which were strongly correlated (r = .90).

A current debate regarding the TGMD-3 centers on whether bidimensional findings demonstrate that the TGMD assesses two substantively distinct elements of FMS. Specifically, the debate concerns whether the scale assesses gross motor skills as one factor based on a mixture of 13 items, or as two distinct constructs representing the locomotor and ball skills components of FMS (Webster & Ulrich, 2017; Estevan et al., 2017; Garn & Webster, 2018; Wagner et al., 2017). Because FMS theory and assessment should be aligned, most studies of the factor structure of the different versions of the TGMD rely on the originally proposed two-factor model (Valentini, Zanella & Webster, 2017; Wagner et al., 2017; Estevan et al., 2017; Garn & Webster, 2018). These authors reported a two-factor solution for the TGMD-3 and TGMD-2, although the considerable overlap between the two latent constructs undermines discriminant validity. Poor discriminant validity of the traditional two-factor TGMD model can lead to biased results in studies that examine associations of these factors with relevant external variables such as physical activity and obesity (Marsh, Morin, Parker, & Kaur, 2014; Garn & Webster, 2018; Robinson et al., 2015).

The majority of research to date on the factor structure of the TGMD has relied on conventional CFA techniques, in which several zero factor loading restrictions are imposed to represent the hypothesis that only specific latent factors (in this case, locomotor and ball skills) influence specific manifest indicators. For the TGMD, those proposing a bidimensional solution conclude that seven of the scale items (manifest variables) measure one latent construct, ball skills, and the other six measure a second latent construct, locomotor skills. However, the restrictive measurement model of standard CFA is often not well aligned with the analysis of assessment instruments composed of indicators with many cross-loadings (Asparouhov & Muthen, 2009). This has likely contributed to various questionable practices in much applied CFA research (see Asparouhov & Muthen, 2009; Marsh et al., 2009).

This was the case in a recent study by Garn & Webster (2018), who used exploratory structural equation modeling (ESEM) to reexamine the factor structure of the TGMD-2 using the normative dataset from the TGMD-2 manual. They tested three alternative measurement models of the TGMD-2: a one-factor model, a two-factor CFA model and a two-factor ESEM model. Although all three models produced acceptable fit, the two-factor ESEM produced better fit statistics across indices than the one-factor and two-factor CFA models. However, the findings illustrated the complexities associated with the two-factor ESEM and CFA models, such as poor simple structure and high interfactor correlations. The authors therefore concluded that the one-factor solution is a more parsimonious and reproducible representation of the TGMD-2 factor structure, which is in line with the initial pilot work on the third edition of the TGMD (Webster & Ulrich, 2017). This creates a dilemma because, on a theoretical and conceptual level, the two TGMD constructs are described as separate FMS factors, but in reality they share much common variance.

Within the CFA framework, two important alternative (and less restrictive) modeling approaches, the bifactor model and the higher-order model, are available that appear well suited to examining the underlying factor structure of scales composed of indicators with many cross-loadings (Wiesner & Schanding, 2013). However, the latter approach is questionable for the TGMD, because a single higher-order factor cannot be specified with only two primary factors; such a model would be under-identified (Brown, 2015). An alternative that has not yet been examined for the TGMD is the bifactor model. A bifactor model, also referred to as a nested-factor, direct hierarchical or general-specific model, consists of a general factor posited to account for the commonality among all of the scale items and several orthogonal (i.e., uncorrelated) specific or group factors, each of which is specific to a subset of the items and represents item response covariation not explained by the general factor (Gustafsson & Balke, 1993; Holzinger & Swineford, 1937). In other words, each scale item is a reflective indicator of both a general factor and a more narrowly defined specific factor that is uncorrelated with the general factor. The variance of each scale item is therefore decomposed into separate, unique contributions of a broad general construct and several specific constructs (Reise, Moore, & Haviland, 2010). The general factor is the main focus of the scale and represents the conceptually broad construct that the test is intended to measure (here, a General Fundamental Movement factor), while the group factors are restricted in scope to narrow subdomain constructs (i.e., the locomotor and ball skills dimensions; Reise, 2012).
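
As a notational sketch of the decomposition described in this paragraph, the bifactor model can be written for a standardized item i that belongs to group k as follows. The symbols are our own shorthand and are not taken from the TGMD-3 literature: x_i is the item score, G the general factor, S_k the group factor (locomotor or ball skills), the lambdas are standardized loadings, and epsilon_i is the item residual with variance theta_i. All factors are assumed to have unit variance and to be mutually uncorrelated.

```latex
\begin{aligned}
x_i &= \lambda_{i,G}\,G + \lambda_{i,k}\,S_k + \varepsilon_i,
  \qquad \operatorname{Cov}(G, S_k) = 0,
  \quad \operatorname{Cov}(S_k, S_{k'}) = 0 \;\; (k \neq k'), \\
\operatorname{Var}(x_i) &= \lambda_{i,G}^{2} + \lambda_{i,k}^{2} + \theta_i = 1 .
\end{aligned}
```

The second line makes the variance decomposition explicit: the share of item variance due to the general factor, the share due to the group factor, and the residual share add up to the total variance of the standardized item.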

Although bifactor models were introduced over 70 years ago (Holzinger & Swineford, 1937), they remain relatively uncommon (Reise, Scheines, Widaman, & Haviland, 2012), especially in motor skill assessment. In a bifactor model, the general and group factors are uncorrelated, both have direct relations with the observed variables (Gustafsson & Balke, 1993), and each item simultaneously loads onto one general factor and one or more group factors. In this way, the predictive validity of the group factors can be examined independently of the predictive validity of the general factor, and the strength of the relation between group factors and scale items can be directly distinguished (Chen et al., 2012). Applying such a model allows researchers to examine measurement invariance across different groups, as well as latent mean differences for both the general and group factors.

Additionally, the bifactor model can be practically useful for testing whether a subset of the domain-specific factors predicts relevant external variables over and above the general factor (Chen, West, & Sousa, 2006). To date, studies have treated the two components of the TGMD as separate dimensions, even though virtually all past CFA studies have consistently found a very high correlation between the locomotor and ball skills latent factors. Within a bifactor framework, it is still unclear how much unique variance is explained by the locomotor and ball skills subdomains once the general factor is accounted for. This is an important question because it may shed light on whether it is useful to devote further research to the discriminant validity of these latent components. The bifactor model can therefore be regarded as a helpful approach for assessing the (uni)dimensionality of the TGMD (Reise, Morizot, & Hays, 2007; Reise, Moore, & Haviland, 2010). Testing the bifactor model makes it possible to decide whether the TGMD is essentially unidimensional and should not be broken up into dimension (subscale) scores, or whether the items are multidimensional, reflecting the complexity of the factor structure of the TGMD. Thus, the bifactor conceptualization of the TGMD gives a broad sense of the extent to which items reflect a single common target trait and the extent to which they reflect a primary trait or subtrait (Reise, Moore, & Haviland, 2010).

To explore the dimensionality of the TGMD, it is imperative to examine not only standard CFA goodness-of-fit indices but also several psychometrically informative bifactor-derived indices (i.e., factor strength indices), because adequate fit does not imply parameter accuracy (Reise, Scheines, Widaman, & Haviland, 2012; Rodriguez, Reise & Haviland, 2016). Fit indices alone are insufficient to indicate whether the total score, referring to the core construct, suffices as a reliable index or whether the subscale scores provide additional reliable information beyond the total score (Reise, Bonifay, & Haviland, 2013). Coefficient alpha, which is commonly reported as a measure of the internal consistency reliability of the subscales and total score, combines multiple sources of systematic variance explained by both the general and group factors (Zinbarg, Revelle, Yovel, & Li, 2005). Cronbach’s alpha assumes a unidimensional solution; when the data are better described by a multidimensional solution, alpha tends to overestimate reliability (Gignac & Watkins, 2013; McDonald, 1999; Reise, Moore, & Haviland, 2010).

Rodriguez, Reise & Haviland (2016) introduced several bifactor model-based psychometric indices that make it possible, in addition to fit indices, to estimate strength indices such as omega reliability coefficients (i.e., omega coefficients for both total composite scores and subscale scores; McDonald, 1999; Reise, 2012; Revelle & Zinbarg, 2009; Zinbarg, Revelle, Yovel, & Li, 2005), explained common variance (ECV; Sijtsma, 2009), and the percentage of uncontaminated correlations (PUC; Bonifay, Reise, Scheines, & Meijer, 2015; Reise, Scheines, Widaman, & Haviland, 2012). When multidimensional data are fit to a bifactor model, these bifactor-derived indices indicate the strength of the factors and may shed light on whether to continue to focus on a single common factor or also to devote further research to the group factors (Reise, 2012).

Additionally, with respect to the psychometric properties of the TGMD, testing measurement invariance makes it possible to evaluate model equality across different groups (e.g., sex, age). Establishing the psychometric equivalence of constructs is a prerequisite for appropriately comparing group means and testing structural relations with important covariates. In other words, researchers need to ensure that the instrument measures the same construct in all groups before making factor-level comparisons with relevant external variables (Little, 2013). To date, at least three studies have used CFA models to examine various levels of invariance of the TGMD-3 across gender and across children of different ages (Valentini, Zanella & Webster, 2017; Wagner, Webster & Ulrich, 2017; Magistro et al., 2018). All of them used the two-factor model.

Valentini, Zanella, & Webster (2017) examined the measurement invariance of the Brazilian version of the TGMD-3 using multigroup CFA across gender (males and females) and across age groups (3–6 years old and 7–10 years old). They found no invariance for the structure, factor loadings and item intercepts across gender and age groups. In a pilot study of German children, Wagner et al. (2017) found support for configural (same form) and metric (same factor loadings) invariance across gender, with both groups (boys and girls) showing equivalence for the two-factor structure. However, full invariance of the intercepts was not supported. Furthermore, Magistro et al. (2018) tested the measurement invariance of the TGMD-3 across children with and without mental and behavioral disorders. Based on the magnitude of changes in the Root Mean Square Error of Approximation and Comparative Fit Index between nested models, they found support for configural, weak, scalar and strict invariance across the two samples. Despite such findings, there are no data on the measurement invariance of the TGMD-3 bifactor model across gender or age groups. Thus, it would be useful to explore the measurement invariance of the bifactor model across gender (boys and girls).

The central aim of the current study is to examine the factorial validity of the TGMD-3 by comparing three alternative measurement models: (a) a unidimensional model, (b) a correlated 2-factor model, and (c) a bifactor model. Applying the bifactor model to the TGMD-3 is important because it may have implications for the way TGMD-3 measures should be applied in research and in practice settings. We hypothesize that the bifactor model will be superior to the other models.

In light of the recommendation of Reise, Bonifay, & Haviland (2013), this study examines the degree of unidimensionality of the TGMD-3 and whether the dimensions of the TGMD-3 remain reliable after accounting for the shared variance explained by the general factor. The dimensionality of the TGMD-3 within the bifactor framework has not been considered in previous validation studies, yet it provides a better understanding of how the measure should be used. The TGMD-3 was developed to measure two separate dimensions of FMS, but in reality these dimensions share much common variance. Therefore, it is hypothesized that the TGMD-3 is primarily a unidimensional measure.

Additionally, as bifactor models are particularly amenable to the estimation of model-based reliabilities, this study examines the model-based reliability of both the general/overall composite score and the subscale/index scores of the TGMD-3, thus providing information on the unique internal consistency reliabilities of the general and subscale scores. Furthermore, this study examines the measurement invariance of the superior TGMD-3 model between sexes. We expect support for measurement invariance of the TGMD-3 between sexes. The final aim of this study is to explore the relationship between age and the TGMD-3 construct. Applying the multiple-indicators multiple-causes (MIMIC; Morin, Arens, & Marsh, 2016) model framework, we test age as a covariate in the superior solution. It is hypothesized that age will have a strong association with the general factor.

Methods

Participants

The sample included 496 typically developing children aged 3–10.9 years (M age = 7.23 ± 2.03 years; 53.8% female) from the five main geographic regions of Tehran (North, Northwest, Central, West, Southeast, and South). Participants were recruited through five elementary schools and six preschools and kindergartens across these regions of Tehran city. All participants agreed to participate, and their parents signed informed consent forms approved by the Institutional Review Board before data collection. Children also had the right to refuse participation and to withdraw from testing at any time. All children were assessed with the full version of the TGMD-3 in a single 20-minute session, administered by a trained assessor in schools, kindergartens and playgrounds during physical education classes.

Measures

The Test of Gross Motor Development, 3rd Edition (TGMD-3; Ulrich, 2013) was originally validated in 1985 and 2000 (as the TGMD and TGMD-2; Ulrich, 1985, 2000), with norms based on the performance of American children from different states. The previous version of the test, the TGMD-2, showed appropriate psychometric properties for American children (Ulrich, 2000), with adequate internal consistency (α values > .87) and reliability (r values > .86) as well as appropriate fit indexes for CFA (GFI = .96, AGFI = .95, TLI = .90; Ulrich, 2000). Existing data for the most recent version of the test show very high internal consistency for both subscales (Cronbach’s α of .95) and the total composite score (Cronbach’s α of .97; Webster & Ulrich, 2017). Total and subscale scores have been found to be structurally valid and internally consistent (α > .80) in Spanish and Brazilian populations (Estevan et al., 2017; Valentini, Zanella & Webster, 2017).

In the new version of the TGMD, the following changes were made: (1) the TGMD-3 measures 13 different FMS activities; (2) the object control subtest was renamed the ball skills subtest, and one of the ball skills items was changed from underhand roll to underhand throw; (3) the one-hand strike was added to the ball skills subtest; (4) the skip was reinstated from the original TGMD and the leap was omitted; and (5) the criteria for some specific items were adjusted. The TGMD-3 is organized into two subtests: locomotor skills (running, galloping, hopping, skipping, jumping, and sliding) and ball skills (striking with one hand and with two hands, dribbling, catching, kicking, and overhand and underhand throwing). Each skill is evaluated on three to five performance criteria that represent the appropriate movement pattern of the skill. The test requires systematic observation of the performance criteria and takes approximately 15 to 20 min per child to administer. Each participant had one practice trial before the formal trials. If the child appeared not to fully understand the task, he or she was allowed an additional practice trial. There are then two formal test trials for each skill. If the child demonstrated a performance criterion properly, he or she was awarded a score of 1 for that criterion on that trial; if not, a score of 0 was recorded. Performance criterion scores are calculated by summing the scores on trial one and trial two for each criterion. Skill scores are calculated by summing all of the performance criterion scores for each skill to form a raw skill score. The raw skill scores are summed to provide a total raw score for the locomotor or ball skills subscale, and the two subscale scores are combined to provide a total TGMD-3 score. The maximum total raw score is 46 points for the locomotor subscale and 54 points for the ball skills subscale, giving a maximum total raw score of 100 points. A higher TGMD-3 score therefore indicates greater motor competence in FMS.
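
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the data layout and the ratings are hypothetical, and the number of criteria shown per skill is made up rather than taken from the TGMD-3 record form.

```python
from typing import Dict, List, Tuple

# One 0/1 entry per performance criterion for a single formal trial.
Trial = List[int]


def skill_score(trial_one: Trial, trial_two: Trial) -> int:
    """Raw skill score: criterion scores summed over both formal trials."""
    if len(trial_one) != len(trial_two):
        raise ValueError("both trials must rate the same performance criteria")
    if any(s not in (0, 1) for s in trial_one + trial_two):
        raise ValueError("each criterion is scored 0 or 1")
    return sum(trial_one) + sum(trial_two)


def subscale_score(skills: Dict[str, Tuple[Trial, Trial]]) -> int:
    """Raw subscale score: sum of the raw skill scores in that subscale."""
    return sum(skill_score(t1, t2) for t1, t2 in skills.values())


# Hypothetical ratings for a few skills only (illustrative, not real data).
locomotor = {
    "run": ([1, 1, 0, 1], [1, 1, 1, 1]),
    "hop": ([1, 0, 1], [1, 1, 1]),
}
ball = {
    "catch": ([1, 1, 1], [0, 1, 1]),
}

locomotor_raw = subscale_score(locomotor)
ball_raw = subscale_score(ball)
total_raw = locomotor_raw + ball_raw
print(locomotor_raw, ball_raw, total_raw)
```

With the full set of 13 skills entered in this way, the locomotor and ball skills raw scores would be bounded by the maxima of 46 and 54 points reported above.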

Statistical procedure

The results on the reliability, validity and measurement invariance across gender of the TGMD-3 reported below were calculated using the R packages dplyr (Wickham & Wickham, 2020), semPlot (Epskamp, 2019) and semTools (Jorgensen et al., 2018). The significance level for all statistical tests was set a priori at α = .05. There were no missing data in this data set. CFAs were first performed on the total sample to identify the best-fitting model for the TGMD-3. Three alternative measurement models were specified and estimated: (a) a one-factor model, with all 13 items of the TGMD-3 loading on a single latent variable (i.e., Gross Motor Skill) that explains the item variance of the test; (b) a two-factor model, with the 13 items loading on two factors, locomotor and ball skills, which were allowed to correlate; and (c) a bifactor model, with each item loading on one of the two factors as well as on a general factor, and with all latent factors orthogonal.

Prior to the confirmatory factor analyses, Mardia’s test was used to assess multivariate normality. Because Mardia’s test showed that the items had a strongly non-normal multivariate distribution (kurtosis: p = 7.37e-206; skewness: p = 1.10e-243), all models were estimated with the robust maximum likelihood estimator (MLR). The overall fit of each alternative model was evaluated using model fit indexes.

Model fit was examined using the chi-square statistic and its associated degrees of freedom. Because the chi-square statistic is strongly sensitive to sample size and tends to reject reasonable models when the sample is large, we focused on other fit indices: the comparative fit index (CFI; Bentler, 1990), the Tucker–Lewis Index (TLI; Tucker & Lewis, 1973), and the root mean square error of approximation (RMSEA; Steiger, 1990). CFI and TLI values > .90 were considered to indicate adequate fit, and values > .95 good fit (Hu & Bentler, 1999). RMSEA values between .06 and .08 indicated acceptable fit, while values < .05 indicated good fit (Hu & Bentler, 1999). Model comparisons were conducted using the Satorra-Bentler scaled chi-square (S-B χ2) difference test with its associated difference in degrees of freedom (Satorra & Bentler, 2001), as well as the ΔCFI. A model was preferred over another if it showed a smaller chi-square, with a p-value < .05 for the difference test, and ΔCFI > .002 (Satorra & Bentler, 2001).
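
For readers unfamiliar with the scaled difference test mentioned above, the following is a minimal sketch, assuming the standard formula given by Satorra and Bentler (2001) for statistics reported with a scaling correction factor. It is not the code used in this study, which relied on R. The function name and all numeric inputs are hypothetical.

```python
from typing import Tuple


def sb_scaled_difference(t0: float, c0: float, d0: int,
                         t1: float, c1: float, d1: int) -> Tuple[float, int]:
    """Satorra-Bentler scaled chi-square difference between nested models.

    Model 0 is the more restricted (nested) model, model 1 the less
    restricted one. t*: scaled chi-square, c*: scaling correction factor,
    d*: degrees of freedom.
    """
    if d0 <= d1:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction that applies to the difference test itself.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    if cd <= 0:
        raise ValueError("non-positive difference correction; test undefined")
    # Each scaled statistic times its correction factor gives the unscaled
    # statistic; the unscaled difference is then rescaled by cd.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1


# Hypothetical values for a restricted model versus a less restricted model.
stat, df = sb_scaled_difference(t0=120.0, c0=1.2, d0=65,
                                t1=90.0, c1=1.1, d1=52)
print(round(stat, 3), df)
```

The returned statistic is referred to a chi-square distribution with the returned degrees of freedom to obtain the p-value used in the decision rule above.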

To remain consistent with previous research on the TGMD, internal consistency was also examined with Cronbach's alpha (Cronbach, 1951). Alphas of .6 or lower were considered low, between .6 and .7 acceptable, and above .7 high (Leary, 2008). The model-based reliability index omega (ω) was calculated for the general factor and the dimension factors (McDonald, 1999). Omega for the general factor is a reliability estimate based on the factor model that estimates the proportion of observed variance in the total score attributable to all sources of common variance (Raykov, 2001; Bollen, 1989). Omega for a dimension is the reliability of that dimension based on all sources of common variance across its items. Coefficient omega hierarchical (ωh) was also calculated. This reliability index is based on the bifactor model and reflects the proportion of systematic variance in total scores attributable to the single general factor, with the variance of the specific factors removed (McDonald, 1999). Coefficient omega hierarchical is a direct index of general factor strength (Reise, Scheines, Widaman, & Haviland, 2012).

As a unidimensionality index superior to omega hierarchical, the explained common variance (ECV) was calculated. ECV is the proportion of common variance explained by the general factor in the bifactor model. The ECV is easy to interpret: higher values indicate little common variance beyond that accounted for by the general factor (Reise, Scheines, Widaman, & Haviland, 2012). The percentage of uncontaminated correlations (PUC) was also calculated. PUC is an important indicator of unidimensionality that moderates the biasing effects of forcing a unidimensional model onto multidimensional data. Coefficient omega (ω), coefficient omega hierarchical (ωh), and coefficient omega subscale (ωs) values > 0.8 indicate a strong relationship between the latent variable and item scores. ECV and PUC values > 0.70 indicate that the instrument should be regarded as essentially unidimensional; thus, these indices indicate general factor strength (Rodriguez, Reise, & Haviland, 2016).
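To make these definitions concrete, the sketch below computes the indices from a set of standardized bifactor loadings. It assumes orthogonal factors and standardized items, so that each residual variance equals one minus the squared loadings. The function name and the loadings are hypothetical and are not the TGMD-3 estimates. Two subscale variants are returned because the literature uses both: `omega_s` keeps the general factor in the numerator, whereas `omega_hs` removes it.

```python
def bifactor_indices(general, specific, groups):
    """Model-based reliability indices from standardized bifactor loadings.

    general  : loading of each item on the general factor
    specific : loading of each item on its own specific factor (0.0 if none)
    groups   : specific-factor label for each item
    Assumes orthogonal factors and standardized items, so each residual
    variance is 1 - general**2 - specific**2.
    """
    labels = sorted(set(groups))
    resid = [1.0 - g * g - s * s for g, s in zip(general, specific)]
    gen_part = sum(general) ** 2
    spec_part = sum(
        sum(s for s, k in zip(specific, groups) if k == lab) ** 2
        for lab in labels
    )
    total_var = gen_part + spec_part + sum(resid)

    omega_s, omega_hs = {}, {}
    for lab in labels:
        idx = [i for i, k in enumerate(groups) if k == lab]
        g_k = sum(general[i] for i in idx) ** 2
        s_k = sum(specific[i] for i in idx) ** 2
        r_k = sum(resid[i] for i in idx)
        omega_s[lab] = (g_k + s_k) / (g_k + s_k + r_k)
        omega_hs[lab] = s_k / (g_k + s_k + r_k)

    g_sq = sum(g * g for g in general)
    s_sq = sum(s * s for s in specific)
    return {
        "omega": (gen_part + spec_part) / total_var,
        "omega_h": gen_part / total_var,
        "omega_s": omega_s,
        "omega_hs": omega_hs,
        "ecv": g_sq / (g_sq + s_sq),
    }


# Hypothetical loadings: six items, two specific factors of three items each.
example = bifactor_indices(
    general=[0.6] * 6,
    specific=[0.3] * 6,
    groups=["A", "A", "A", "B", "B", "B"],
)
print(example)
```

In this toy case the general factor carries 80% of the common variance (ECV = .80), and the subscale coefficients drop sharply once the general factor is removed, which is the pattern the indices are designed to detect.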

Multiple-group CFA measurement invariance across gender was tested for the best measurement model among the three tested models (unidimensional, two-factor, and bifactor), using the robust maximum likelihood estimator. In brief, this procedure compares increasingly restrictive models that test the assumption of TGMD-3 measurement invariance across groups: configural invariance (equality of form), weak or metric invariance (equality of factor loadings), strong invariance (equality of item intercepts), and strict invariance (equality of residual variances or uniquenesses). The fit of the configural model was evaluated through the CFI and RMSEA. The configural model was rejected if it showed CFI < .90 or RMSEA ≥ .10. Weak, strong, and strict invariance were tested only if the configural model showed acceptable fit. The weak, strong, and strict invariance models were rejected if, in comparison to the configural model, they showed ΔCFI > .002 and a Satorra-Bentler chi-square difference test p-value < .01 (Satorra & Bentler, 2001).
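The sequence of decisions described above can be written out as a short routine. This is only a sketch of the stated rules: the function name, the data layout, the sign convention for ΔCFI (configural CFI minus constrained CFI), and the fit values in the usage lines are assumptions made for illustration and do not come from Table 3.

```python
def highest_invariance_level(configural, constrained,
                             delta_cfi_cut=0.002, p_cut=0.01):
    """Return the most restrictive invariance level that is retained.

    configural  : dict with "cfi" and "rmsea" for the configural model
    constrained : list of (name, cfi, p_diff) tuples ordered weak, strong,
                  strict, where p_diff is the scaled chi-square difference
                  test p-value against the configural model
    """
    if configural["cfi"] < 0.90 or configural["rmsea"] >= 0.10:
        return None  # configural model rejected; no further steps are run
    retained = "configural"
    for name, cfi, p_diff in constrained:
        delta_cfi = configural["cfi"] - cfi  # drop in CFI (assumed sign)
        if delta_cfi > delta_cfi_cut and p_diff < p_cut:
            break  # this level and all stricter ones are not supported
        retained = name
    return retained


# Hypothetical fit values, for illustration only.
level = highest_invariance_level(
    {"cfi": 0.980, "rmsea": 0.045},
    [("weak", 0.9795, 0.20), ("strong", 0.9790, 0.08), ("strict", 0.9785, 0.03)],
)
print(level)
```

Note that, as stated, a constrained model is rejected only when both criteria are met, so a significant difference test accompanied by a negligible change in CFI does not by itself lead to rejection.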

Finally, a MIMIC (multiple indicators, multiple causes) model was fitted to the best model, adding age as a predictor of the latent variables (Brown, 2015). The fit of this model was evaluated using the comparative fit index (CFI; Bentler, 1990) and the root mean square error of approximation (RMSEA; Steiger, 1990). Fit was considered adequate when CFI values were > .90 and good when they were > .95 (Hu & Bentler, 1999). RMSEA values between .06 and .08 indicated acceptable fit, while values < .05 indicated good fit (Hu & Bentler, 1999).

Results

Fit for the One-Factor, Two-Factor, and Bifactor Models

Table 1 displays the fit indexes of the three alternative measurement models of the TGMD-3. In terms of model evaluation, we found an acceptable fit under MLR estimation for all three models; TLI and CFI were higher than .95, and RMSEA was close to .04, for the two-factor and bifactor models. The standard χ² statistic decreased from the one-factor model to the two-factor model to the bifactor model. Specifically, the two-factor CFA and the bifactor model fit the data considerably better than the one-factor CFA. However, the correlation between the two factors in the two-factor model was very high (r = 0.88), indicating considerable overlap between the locomotor and ball skills indicators. With respect to both the Δ chi-square and the ΔCFI criteria, the bifactor model showed the best fit to the data, with satisfactory values. The adequacy of the superior model can also be judged from its parameter estimates. The standardized factor loadings for the bifactor model, as well as for the one-factor and two-factor models, are presented in Table 2.

*** Insert Table 1 ***

*** Insert Table 2 ***

Dimensionality and Internal Consistency

Table 2 shows that all items in the one-factor and two-factor solutions have statistically significant (p < 0.05) factor loadings. A small modification was made to the original bifactor model. Although all items loaded significantly on the general factor, the Dribble and Catch items did not contribute meaningful variance to their specific factor. Because of their negative variance in the bifactor model, the loadings of these items on the specific latent variable were constrained to zero. In the modified bifactor model, these items therefore load only on the general factor and not on the specific factor.

Comparing the item-level factor loadings of the one-factor model with the general factor loadings in the bifactor model revealed that the loadings are fairly similar (see Table 2). On average, the factor loadings differed by .03. The loadings of the items on the general factor ranged from .52 to .81 (see Figure 1). Furthermore, compared with the factor loadings of the two correlated factors at the subscale level, the factor loadings of the dimensions in the bifactor model, which control for the general factor, are substantially lower (ranging from .12 to .43). This is a first indication of a strong general factor in the data. As Reise, Moore, and Haviland (2010) suggested, when items load strongly on a general factor and comparatively weakly on each of the specific factors, this can be considered support for a unidimensional scoring scheme.

*** Insert Figure 1 ***

This view was confirmed by the ECV, which was high (ECV = .84; see Table 2), indicating a strong general factor. The general factor explained a large proportion of the common variance, and collectively the dimensions accounted for nearly 16% of the common variance above and beyond the general factor. Reise et al. (2012) concluded that "there will be biasing effects of forcing a unidimensional model to multidimensional data. In this case the important diagnostic information can be derived from examining both ECV and PUC". Rodriguez, Reise, and Haviland (2016) recommended that when ECV is > .70 and PUC is > .70, relative bias will be slight and the common variance can be regarded as essentially unidimensional. We therefore examined the PUC, which was close to this benchmark (PUC = .679; see Table 2). Thus, we further confirmed that the TGMD-3 is essentially unidimensional and should be specified as a single latent construct.
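The PUC depends only on how items are assigned to specific factors, so the reported value can be checked by counting item pairs. The sketch below assumes the 13-skill TGMD-3 layout (6 locomotor and 7 ball skills items) and assumes that Dribble and Catch, whose specific loadings were fixed to zero, were not counted as members of the ball skills specific factor. Both assumptions are our reading of the modified model; the function name is illustrative.

```python
from math import comb


def puc(n_items, specific_sizes):
    """Share of item pairs that do not belong to the same specific factor."""
    total = comb(n_items, 2)
    within = sum(comb(n, 2) for n in specific_sizes)
    return (total - within) / total


# 6 locomotor items and 5 ball skills items with a specific loading
# (assumed counts after Dribble and Catch load on the general factor only).
print(round(puc(13, [6, 5]), 3))  # 0.679

# Same count if all 7 ball skills items kept their specific loading.
print(round(puc(13, [6, 7]), 3))  # 0.538
```

Under these assumptions, 53 of the 78 item correlations are informed by the general factor alone, which agrees with the PUC reported in Table 2.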

The internal consistency of the TGMD-3 in the present study, as measured with Cronbach's alpha for the total score, was good (α = .91). Cronbach's alphas of the two subscales were also acceptable (see Table 2). Because of coefficient alpha's limitations with multidimensional models (see Raykov, 1998), this study also examined the composite reliability (CR) of the TGMD-3 as a more rigorous assessment of internal reliability. Values greater than .60 are generally considered acceptable (Bagozzi & Yi, 1988). Results indicated that the total TGMD-3 possesses satisfactory internal consistency (CR = .894). The omega for the general factor also showed high reliability, with McDonald's omega equal to .86, while the omegas for the specific factors were low: .11 and .18 for the ball skills and locomotor skills dimensions, respectively. After accounting for the general factor, as represented by omega hierarchical, the reliability of the dimensions was also low, while the omega hierarchical remained high (ωh = .856). Therefore, the results supported the presence of a strong general motor factor and indicated poor viability of the subscales.

Multiple-Group CFA for Invariance across Gender for the Bifactor Model

Table 3 shows the results of the invariance testing across gender for the bifactor model. As shown, the RMSEA and CFI values showed good fit for the configural model (M1), indicating that the parameter configuration in the bifactor solution was similar between sexes. The second step tested weak invariance, in which factor loadings were set equal across gender. The results displayed in Table 3 supported weak or metric invariance (M2): the model still showed satisfactory fit according to the ΔCFI and the Satorra-Bentler (2001) chi-square difference test (see Table 3). Model fit also remained intact for the strong invariance model (M3). Finally, adding equality constraints on the residual variances in the strict invariance model (M4) did not undermine model fit compared to the strong invariance model. These results support measurement invariance for the bifactor model and indicate that group comparisons can be meaningfully made for the TGMD-3.

*** Insert Table 3 ***

MIMIC Model

Results of the MIMIC model examining age as a predictor of the latent variables indicated good fit to the data (χ2 = 118.097, df = 64, p < .001; RMSEA = .041, 90% CI [.029, .053]; CFI = .986; TLI = .980). Age showed a strong correlation with the general factor (r = .890; p < .001), but non-significant (p > 0.05), negative correlations with ball skills (r = -.404; p = .161) and locomotor skills (r = -.335; p = .283). This suggested that age was a significant predictor of the general factor but not of the specific factors. Because the loading of age on the general factor ranged from .85 to .93 (90% confidence interval), we can affirm that age explains at least 72.25% of the general factor variance measured by the TGMD.
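The lower limit of explained variance follows from squaring the lower bound of the confidence interval for the standardized loading. The short check below uses only the values reported above; the function name is illustrative.

```python
def shared_variance(r):
    """Proportion of variance shared, given a standardized coefficient r."""
    return r * r


for label, r in [("lower bound", 0.85), ("estimate", 0.89), ("upper bound", 0.93)]:
    print(f"{label}: {shared_variance(r):.4f}")
```

Squaring .85 gives .7225, which is the 72.25% stated in the text; the point estimate of .89 corresponds to roughly 79% of the general factor variance.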

Discussion

The general purpose of this study was to use CFA to examine support for the bifactor model of the TGMD-3. Three competing models were specified and tested: a one-factor model, a two-factor model, and a bifactor model. Based on the fit indices, the bifactor solution was an adequately fitting model and provided a better fit to the data than the alternative measurement models. In comparison with the bifactor model, the fit of the one-factor model was inadequate insofar as its CFI value was < .95 and its RMSEA value was > .05 (Bentler, 1990; Hu & Bentler, 1999). The two-factor model also showed good fit, and its fit was better than that of the one-factor model. As in previous studies (Webster & Ulrich, 2017; Estevan et al., 2017), the correlation between the TGMD-3 factors in the two-factor model was very high (r = 0.88), indicating that there was no evidence of divergent validity between the factors. However, locomotor and ball skills are described as distinct and unique constructs in motor development theory. This calls into question the appropriateness of the two-factor model, which contradicted FMS theoretical models (Clark & Metcalfe, 2002; Gallahue, Ozman, & Goodway, 2012).

As Brown (2015) suggested, when an interfactor correlation is > .85, it may be possible to collapse factors to reduce the number of dimensions and thereby attain the most parsimonious set of items that informs the underlying factorial structure. In this context, the bifactor model can resolve these potentially problematic dimensionality issues (Reise, 2012). Because previous TGMD-3 validation studies did not include a bifactor model, they would have pointed to a two-factor or a one-factor solution as the more parsimonious model. In the present study, our results support a bifactor model for the TGMD-3. The good fit of the bifactor model, and its better fit relative to the two competing models, indicate that it is an appropriate and better structural model for the new edition of the TGMD than the alternatives. In the bifactor model, all tasks are predicted by a general factor, which is closely connected to the generality perspective of a general ability underlying performance on various motor skills (Brace, 1927). Two items (dribble and catch) were not valid indicators of their specific factor because they had negative variance on that factor in the bifactor model.

According to the magnitude of changes in the RMSEA and CFI between nested models, the assumption of measurement invariance across gender held. Our results revealed that the bifactor model is invariant across gender, indicating that the scores measure the same construct in boys and girls. This can be interpreted as additional support for the bifactor model. Additionally, the internal consistencies of the total score and the dimensions, as measured with Cronbach's alpha, are consistent with previously reported alphas, demonstrating high reliability estimates (Webster & Ulrich, 2017; Estevan et al., 2017; Valentini, Zanella, & Webster, 2017). The model-based McDonald's omega coefficients, used as an alternative estimate of reliability, also indicate high reliability for the total score but not for the dimension scores. The omega values are lower than the alphas, reflecting a limitation of coefficient alpha: it tends to combine multiple sources of systematic variance when data are associated with multidimensional models, and thus overestimates the reliability of the TGMD-3 (Gignac & Watkins, 2013). We further inspected the strength indices to examine the multidimensionality of the TGMD-3; the omega hierarchical of both dimensions dropped considerably when accounting for the general motor factor.

Moreover, the reliability of the general factor remained high, indicating that the variance of the scale is primarily explained by the general motor factor. This result is supported by the small differences in factor loadings between the general factor of the bifactor solution and the one-factor solution, by an acceptable PUC value (i.e., 68%), and by a relatively high ECV value for the general motor factor (i.e., 85% of the common variance was explained by the general factor). Consequently, the dimensions explain little variance over and above the general factor. Although the TGMD-3 is conceived as a multidimensional construct, the general motor factor is robustly reliable, and the specific factors showed weak viability beyond the general motor factor. From a clinical perspective, these findings indicate that the gross motor skills assessed by the TGMD-3 reflect a general latent trait and that the use of observed dimension scores is probably not justifiable. Thus, it is contended that reporting and interpretation of the TGMD-3 should be restricted to the total scale's composite score, which is in accordance with the general motor ability hypothesis (Brace, 1927; Burton & Rodgerson, 2001; Utesch et al., 2016).

Recent research with other instruments for motor competence assessment corroborates our results. For example, Bardid, Utesch, and Lenoir (2019) investigated mid-childhood motor competence using the Bruininks-Oseretsky Test of Motor Proficiency-2 Short Form (BOT-2 Short Form; Bruininks & Bruininks, 2005) through item response theory; they reported a unidimensional construct and supported the use of composite scores. Another study, by Utesch et al. (2018), revealed a unidimensional factorial solution and supported the use of validated composite scores in 6- to 9-year-old children using the German motor ability test. These studies used item response theory to assess the hypothesis of a general motor skill in children. Additionally, the present findings are consistent with those of bifactor models applied to scales assessing various aspects of psychopathology and personality and to motor assessment tests (Rodriguez, Reise, & Haviland, 2016; McKay, Boduszek, & Harvey, 2014; Okuda et al., 2019).

As expected, the MIMIC findings showed age to be a strong covariate of the general motor factor (r = .890). However, ball skills (r = -0.40) and locomotor skills (r = -0.33) showed non-significant (p > 0.05), negative correlations with age. This result is not in line with a previous study reexamining the TGMD-2, which confirmed age as a strong covariate of both latent factors, indicating that age explains substantial variance in the latent factors (Garn & Webster, 2018).

Conclusions

Given the vital role of motor development in children's overall health (Robinson et al., 2015), it is imperative to assess and monitor motor competence with a valid assessment tool so that appropriate interpretations can be made during childhood. This study provides evidence, based on tests of model fit, item loadings, reliability, and correlations with an external variable, that the TGMD-3 is a unidimensional motor assessment. The bifactor modeling approach contributes to a better understanding of the latent trait(s) underlying the TGMD-3. In view of the limited number of bifactor studies in the field of motor assessment, the present study examined motor competence across childhood using the bifactor approach and provides evidence for the general motor ability hypothesis (Brace, 1927).

The bifactor model fits the TGMD-3 properly and may be used to compare different groups (e.g., boys vs. girls). The support for full cross-gender measurement invariance of form, factor loadings, intercepts, and uniquenesses in this study means that there are no differences in the measurement and scaling properties of the TGMD-3 between boys and girls. Furthermore, the omega hierarchical and ECV show that FMS as measured with the TGMD-3 is primarily a unidimensional construct, indicating that, when interpreting scores, the focus should be on the total scale's composite score rather than on dimensional scores, because most of the reliable variance derives from the general factor.

Although this study has provided useful new psychometric information on the TGMD-3, it also has limitations. First, the support for the bifactor model needs to be viewed in the context that two ball skills items (i.e., dribble and catch) had negative factor loadings on their specific factor. These items therefore cannot be considered measures of that factor. Second, because this study examined typically developing children, it is uncertain whether the results can be generalized to other groups, such as clinical samples and specific cultural groups. Demonstrating support for this model across a range of diverse groups would add evidence of its robustness. Third, this study examined the correlates of the general and specific factors of the bifactor model for only age and gender, out of many possible variables.

Examining the relations of the factors in the bifactor model with a range of health-related outcome variables, such as physical activity and obesity, would provide a more comprehensive understanding of how FMS helps protect children from motor difficulties and improve their well-being. Therefore, future research should use latent variable modeling techniques, such as structural equation modeling, to examine the relations of the general motor factor score, as well as the locomotor and ball skill scores, with outcome variables. This would provide insight into what the general motor factor and the specific dimension factors represent. Fourth, because this was a cross-sectional study and our data were not representative of the entire country, the relations reported here do not imply causal relations. In this respect, longitudinal studies involving the bifactor model would be useful. Such studies could evaluate the stability of the bifactor model, or the role of its factors in influencing critical outcome variables, in a large and diverse sample of children. This study is the first to apply the bifactor model to the TGMD-3 and offers new insights regarding the application and interpretation of the test battery most widely used with children.

Acknowledgments

Competing interests

The authors declare that they have no competing interests.

References

Allen, K. A., Bredero, B., Van Damme, T., Ulrich, D. A., & Simons, J. (2017). Test of Gross Motor Development-3 (TGMD-3) with the Use of Visual Supports for Children with Autism Spectrum Disorder: Validity and Reliability. Journal of Autism and Developmental Disorders, 47(3), 813–833. doi:10.1007/s10803-016-3005-0

Asparouhov, T., & Muthén, B. (2009). Exploratory structural equation modeling. Structural Equation Modeling: A Multidisciplinary Journal, 16(3), 397-438.

Bagozzi, R. P., & Yi, Y. (1988). On the evaluation of structural equation models. Journal of the Academy of Marketing Science, 16(1), 74–94. doi:10.1007/BF02723327

Bardid, F., Utesch, T., & Lenoir, M. (2019). Investigating the construct of motor competence in middle childhood using the BOT-2 Short Form: an item response theory perspective. Scandinavian Journal of Medicine and Science in Sports, 29(12), 1980-1987. doi:10.1111/sms.13527

Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107(2), 238–246. doi:10.1037/0033-2909.107.2.238

Bollen, K. A. (1989). Structural equations with latent variables. New York, NY: Wiley.

Bonifay, W. E., Reise, S. P., Scheines, R., & Meijer, R. R. (2015). When are multidimensional data unidimensional enough for structural equation modeling? An evaluation of the DETECT multidimensionality index. Structural Equation Modeling: A Multidisciplinary Journal, 22(4), 504-516. doi:10.1080/10705511.2014.938596

Bouquet, J. (2015). Concurrent validity of TGMD-2 and TGMD-3 in children with Down syndrome.

Brace, D. K. (1927). Measuring motor ability: A scale of motor ability tests. New York: A. S. Barnes.

Brian, A., Taunton, S., Lieberman, L. J., Haibach-Beach, P., Foley, J., & Santarossa, S. (2018). Psychometric Properties of the Test of Gross Motor Development-3 for children with visual impairments. Adapted Physical Activity Quarterly, 35(2), 145–158. doi:10.1123/apaq.2017-0061

Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd Ed.). New York, NY: The Guilford Press.

Bruininks, R., & Bruininks, B. (2005). Bruininks-Oseretsky Test of Motor Proficiency (2nd ed.). Minneapolis, MN: Pearson Assessment.

Burns, R. D., & Fu, Y. (2018). Testing the motor competence and health-related variable conceptual model: A path analysis. Functional Morphology and Kinesiology, 3(4), 61. doi:10.3390/jfmk3040061

Burton, A. W., & Miller, D. E. (1998). Movement skill assessment. Champaign, IL: Human Kinetics.

Burton, A. W., & Rodgerson, R. W. (2001). New perspectives on the assessment of movement skills and motor abilities. Adapted Physical Activity Quarterly, 18(4), 347-365. doi:10.1123/apaq.18.4.347

Chen, F. F., West, S. G., & Sousa, K. H. (2006). A comparison of bifactor and second-order models of quality of life. Multivariate Behavioral Research, 41(2), 189-225. doi:10.1207/s15327906mbr4102_5

Chen, F. F., Hayes, A., Carver, C. S., Laurenceau, J.-P., & Zhang, Z. (2012). Modeling general and specific variance in multifaceted constructs: A comparison of the bifactor model to other approaches. Journal of Personality, 80(1), 219-251. doi:10.1111/j.1467-6494.2011.00739.x

Clark, J. E., & Metcalfe, J. S. (2002). The mountain of motor development: A metaphor. In J. E. Clark & J. H. Humphrey (Eds.), Motor development: Research and reviews (Vol. 2, pp. 163–190). Reston, VA: National Association of Sport and Physical Education.

Cools, W., De Martelaer, K., Samaey, C., & Andries, C. (2009). Movement skill assessment of typically developing preschool children: A review of seven movement skill assessment tools. Journal of Sports Science and Medicine, 8(2), 154–168.

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334. doi:10.1007/BF02310555

Epskamp, S. (2019). SemPlot: Path Diagrams and Visual Analysis of Various SEM Packages' Output. R package version 1.1.1. https://CRAN.R-project.org/package=semPlot

Estevan, I., Molina-García, J., Queralt, A., Álvarez, O., Castillo, I., & Barnett, L. (2017). Validity and Reliability of the Spanish Version of the Test of Gross Motor Development-3. Journal of Motor Learning and Development, 5(1), 69-81. doi:10.1123/jmld.2016-0045

Gallahue, D. L., Ozman, J. C., & Goodway, J. D. (2012). Understanding motor development: Infants, children, adolescents, adults (7th Ed.). New York, NY: McGraw-Hill.

Garn, A. C., & Webster, E. K. (2018). Reexamining the factor structure of the test of gross motor development–second edition: Application of exploratory structural equation modeling. Measurement in Physical Education and Exercise Science, 22(3), 200–212. doi:10.1080/1091367X.2017.1413373

Gignac, G. E., & Watkins, M. W. (2013). Bifactor modeling and the estimation of model-based reliability in the WAIS-IV. Multivariate Behavioral Research, 48(5), 639–662. doi:10.1080/00273171.2013.804398

Gustafsson, J.-E., & Balke, G. (1993). General and specific abilities as predictor of school achievement. Multivariate Behavioral Research, 28(4), 407-434. doi:10.1207/s15327906mbr2804_2

Holzinger, K. J., & Swineford, F. (1937). The bi-factor method. Psychometrika, 2, 41–54. doi:10.1007/BF02287965



Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. doi:10.1080/10705519909540118

Jorgensen, T. D., Pornprasertmanit, S., Schoemann, A. M., & Rosseel, Y. (2018). SemTools: Useful tools for structural equation modeling. R package version 0.5-1. Retrieved from https://CRAN.R-project.org/package=semTools

Leary, M. R. (2008). Introduction to behavioral research methods (5th Ed.). Boston, MA: Pearson Education.

Little, T. D. (2013). Longitudinal structural equation modeling. New York, NY: The Guilford Press.

Magistro, D., Piumatti, D., Calevaro, F., Sherar, L. B., Esliger, D. W., Bardaglio, G., Magno, F., Musella, G., & Zecca, M. (2018). Measurement invariance of TGMD-3 in children with and without mental and behavioral disorders. Psychological Assessment, 30(11), 1421-1429. doi:10.1037/pas0000587

Marsh, H. W., Morin, A. J. S., Parker, P. D., & Kaur, G. (2014). Exploratory structural equation modeling: An integration of the best features of exploratory and confirmatory factor analysis. Annual Review of Clinical Psychology, 10, 85–110. doi:10.1146/annurev-clinpsy-032813-153700

Marsh, H. W., Muthén, B., Asparouhov, T., Lüdtke, O., Robitzsch, A., Morin, A. J. S., & Trautwein, U. (2009). Exploratory structural equation modeling, integrating CFA and EFA: Application to students’ evaluations of university teaching. Structural Equation Modeling: A Multidisciplinary Journal, 16(3), 439–476. doi:10.1080/10705510903008220

McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Lawrence Erlbaum.

McKay, M. T., Boduszek, D., & Harvey, S. A. (2014). The Rosenberg Self-Esteem Scale: A bifactor answer to the two-factor question? Journal of Personality Assessment, 96(6), 654-660. doi:10.1080/00223891.2014.923436

Morin, A. J. S., Arens, A. K., & Marsh, H. W. (2016). A bifactor exploratory structural equation modeling framework for the identification of distinct sources of construct-relevant psychometric multidimensionality. Structural Equation Modeling: A Multidisciplinary Journal, 23(1), 116–139. doi:10.1080/10705511.2014.961800

Morgan, P. J., Barnett, L. M., Cliff, D. P., Okely, A. D., Scott, H. A., Cohen, K. E., & Lubans, D. L. (2013). Fundamental movement skill interventions in youth: A systematic review and meta-analysis. Pediatrics, 132(5), e1361–e1383. doi:10.1542/peds.2013-1167

Okuda, P. M. M., Pangelinan, M., Capellini, S. A., & Cogo-Moreira, H. (2019). Motor skills assessments: support for a general motor factor for the Movement Assessment Battery for Children-2 and the Bruininks-Oseretsky Test of Motor Proficiency-2. Trends in Psychiatry and Psychotherapy, (ahead).

Piek, J. P., Dawson, L., Smith, L. M., & Gasson, N. (2008). The role of early fine and gross motor development on later motor and cognitive ability. Human Movement Science, 27(5), 668–681. doi:10.1016/j.humov.2007.11.002

Pienaar, A. E., Visagie, M., & Leonard, A. (2015). Proficiency at object control skills by nine-to ten-year-old children in South Africa: The NW-child study. Perceptual and Motor Skills, 121(1), 309–332.

Raykov, T. (1998). Coefficient alpha and composite reliability with interrelated nonhomogeneous items. Applied Psychological Measurement, 22, 375–385.

Raykov, T. (2001). Estimation of congeneric scale reliability using covariance structure analysis with nonlinear constraints. British Journal of Mathematical and Statistical Psychology, 54(2), 315–323. doi:10.1348/000711001159582

Reise, S. P., Morizot, J., & Hays, R. D. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(1), 19–31. doi:10.1007/s11136-007-9183-7

Reise, S. P., Moore, T. M., & Haviland, M. G. (2010). Bifactor models and rotations: Exploring the extent to which multidimensional data yield univocal scale scores. Journal of Personality Assessment, 92(6), 544–559. doi:10.1080/00223891.2010.496477

Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multivariate Behavioral Research, 47(5), 667–696. doi:10.1080/00273171.2012.715555



Reise, S. P., Bonifay, W. E., & Haviland, M. G. (2013). Scoring and modeling psychological measures in the presence of multidimensionality. Journal of Personality Assessment, 95(2), 129–140. doi:10.1080/00223891.2012.725437

Reise, S. P., Scheines, R., Widaman, K. F., & Haviland, M. G. (2012). Multidimensionality and structural coefficient bias in structural equation modeling: A bifactor perspective. Educational and Psychological Measurement, 73(1), 5–26. doi:10.1177/0013164412449831

Robinson, L. E., Rudisill, M. E., & Goodway, J. D. (2009). Instructional climates in preschool children who are at risk. Part II: Perceived physical competence. Research Quarterly for Exercise and Sport, 80(3), 543–551. doi:10.1080/02701367.2009.10599592

Robinson, L. E., Rudisill, M. E., Weimar, W. H., Breslin, C. M., Shroyer, J. F., & Morera, M. (2011). Footwear and locomotor skill performance in preschoolers. Perceptual and Motor Skills, 113(2), 534–538. doi:10.2466/05.06.10.26.PMS.113.5.534-538

Robinson, L. E., Stodden, D. F., Barnett, L. M., Lopes, V. P., Logan, S. W., Rodrigues, L. P., & D’Hondt, E. (2015). Motor competence and its effect on positive developmental trajectories of health. Sports Medicine (Auckland, N.Z.), 45(9), 1273–1284. doi:10.1007/s40279-015-0351-6

Rodriguez, A., Reise, S. P., & Haviland, M. G. (2016). Applying bifactor statistical indices in the evaluation of psychological measures. Journal of Personality Assessment, 98(3), 223–237. doi:10.1080/00223891.2015.1089249

Revelle, W., & Zinbarg, R. E. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74(1), 145–154. doi:10.1007/s11336-008-9102-z

Satorra, A., & Bentler, P. M. (2001). A scaled difference chi-square test statistic for moment structure analysis. Psychometrika, 66(4), 507–514. doi:10.1007/BF02296192

Skinner, R. A., & Piek, J. P. (2001). Psychosocial implications of poor motor coordination in children and adolescents. Human Movement Science, 20(1-2), 73–94. doi:10.1016/s0167-9457(01)00029-x

Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25(2), 173–180. doi:10.1207/s15327906mbr2502_4

Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74(1), 107–120. doi:10.1007/s11336-008-9101-0

Stodden, D. F., Goodway, J. D., Langendorfer, S. J., Roberton, M. A., Rudisill, M. E., Garcia, C., & Garcia, L. E. (2008). A developmental perspective on the role of motor skill competence in physical activity: An emergent relationship. Quest, 60(2), 290–306. doi:10.1080/00336297.2008.10483582

Temple, V. A., & Foley, J. T. (2017). A peek at the developmental validity of the Test of Gross Motor Development–3. Journal of Motor Learning and Development, 5(1), 5–14. doi:10.1123/jmld.2016-0005

Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38(1), 1–10. doi:10.1007/BF02291170

Ulrich, D. A. (1985). Test of Gross Motor Development. Austin, TX: Pro-Ed.

Ulrich, D. A. (2000). The Test of Gross Motor Development (2nd ed.). Austin, TX: Pro-Ed.

Ulrich, D. A. (2013). The Test of Gross Motor Development–3 (TGMD-3): Administration, scoring, and international norms. Spor Bilimleri Dergisi, 24(2), 27–33.

Ulrich, D. A. (2017). Introduction to the special section: Evaluation of the psychometric properties of the TGMD-3. Journal of Motor Learning and Development, 5(1), 1–4. doi:10.1123/jmld.2017-0020

Utesch, T., Bardid, F., Huyben, F., Strauss, B., Tietjens, M., De Martelaer, K., ... & Lenoir, M. (2016). Using Rasch modeling to investigate the construct of motor competence in early childhood. Psychology of Sport and Exercise, 24, 179–187. doi:10.1016/j.psychsport.2016.03.001

Utesch, T., Dreiskämper, D., Strauss, B., & Naul, R. (2018). The development of the physical fitness construct across childhood. Scandinavian Journal of Medicine & Science in Sports, 28(1), 212–219. doi:10.1111/sms.12889



Valentini, N. C., Zanella, L. W., & Webster, E. K. (2017). Test of Gross Motor Development-Third edition: Establishing content and construct validity for Brazilian children. Journal of Motor Learning and Development, 5(1), 15–28. doi:10.1123/jmld.2016-0002

Wagner, M. O., Webster, E. K., & Ulrich, D. A. (2017). Psychometric properties of the Test of Gross Motor Development, Third Edition (German translation): Results of a pilot study. Journal of Motor Learning and Development, 5(1), 29–44. doi:10.1123/jmld.2016-0006

Webster, E. K., & Ulrich, D. A. (2017). Evaluation of the psychometric properties of the Test of Gross Motor Development–Third Edition. Journal of Motor Learning and Development, 5(1), 45–58. doi:10.1123/jmld.2016-0003

Webster, E. K., Martin, C. K., & Staiano, A. E. (2018). Fundamental motor skills, screen-time, and physical activity in preschoolers. Journal of Sport and Health Science, 8(2), 114–121. doi:10.1016/j.jshs.2018.11.006

Wickham, H., & Wickham, M. H. (2020). Package ‘plyr’. https://cran.r-project.org/web/packages/dplyr/dplyr.pdf

Wiesner, M., & Schanding, G. T. (2013). Exploratory structural equation modeling, bifactor models, and standard confirmatory factor analysis models: Application to the BASC–2 Behavioral and Emotional Screening System Teacher Form. Journal of School Psychology, 51(6), 751–763. doi:10.1016/j.jsp.2013.09.001

Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach’s α, Revelle’s β, and McDonald’s ωH: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70(1), 123–133. doi:10.1007/s11336-003-0974-7


Table 1
Fit Indices of Models Tested for TGMD-3

Model            χ²(df)         CFI    TLI    RMSEA   90% CI RMSEA   Δχ²(df); p-value     ΔCFI
Unidimensional   236.390(65)*   .947   .936   .073    .063-.083
Bidimensional    125.916(64)*   .981   .977   .044    .033-.055      55.40(1); * 9.8      .034 *
Bifactor         98.017(54)*    .986   .980   .041    .027-.053      19.85(10); ** 0.03   .005 **

Note. * = comparing the bidimensional versus the unidimensional model; ** = comparing the bifactor versus the bidimensional model. χ² = chi-square statistic; df = degrees of freedom; CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; CI = confidence interval.

Table 2
Standardized Factor Loadings and Error Variance of the One-, Two-, and Bifactor Models

                    Bifactor model          One-factor model   Two-factor model
Skill indicators    GFMS    BS      LS      GMS                BS      LS
Two-hand strike     .52     .29             .54                .56
One-hand strike     .59     .37             .60                .63
Dribble             .80     -               .75                .79
Catch               .77     -               .74                .76
Kick                .66     .12             .64                .67
Overhand throw      .68     .22             .68                .71
Underhand throw     .64     .22             .65                .67
Run                 .54             .28     .60                        .61
Gallop              .64             .36     .71                        .73
Hop                 .77             .32     .82                        .84
Skip                .64             .30     .70                        .72
Jump                .66             .34     .72                        .74
Slide               .69             .42     .78                        .80
α                   .91     .80     .87
ω                   .86     .11     .18
ωh                  .85     .11     .18
ECV                 .84
PUC                 .67
CR                  .89
AVE                 .54

Note. GFMS = general fundamental motor skills; BS = ball skills; LS = locomotor skills; GMS = gross motor skills; α = Cronbach’s alpha; ω = omega; ωh = omega hierarchical; ECV = explained common variance; PUC = percentage of uncontaminated correlations; CR = composite reliability; AVE = average variance extracted. All the factor loadings were significant at p < .05.
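
As a rough check on the ECV entry in Table 2, the index can be recomputed from the tabled bifactor loadings. The sketch below assumes the usual definition of explained common variance (sum of squared general-factor loadings divided by the sum of squared general and specific loadings) and assumes that the "-" cells for Dribble and Catch contribute no specific-factor variance. The function and variable names are illustrative and are not taken from the authors' analysis.

```python
# Standardized bifactor loadings transcribed from Table 2 (13 skills, in table order).
general = [.52, .59, .80, .77, .66, .68, .64, .54, .64, .77, .64, .66, .69]

# Specific-factor loadings (ball skills for the first 7 items, locomotor for the last 6).
# None marks the "-" cells; treating them as zero contribution is an assumption.
specific = [.29, .37, None, None, .12, .22, .22, .28, .36, .32, .30, .34, .42]


def explained_common_variance(general_loadings, specific_loadings):
    """ECV = sum(general^2) / (sum(general^2) + sum(specific^2))."""
    g = sum(l ** 2 for l in general_loadings)
    s = sum(l ** 2 for l in specific_loadings if l is not None)
    return g / (g + s)


ecv = explained_common_variance(general, specific)
print(f"ECV from rounded loadings: {ecv:.3f}")  # about 0.849
```

Because the tabled loadings are rounded to two decimals, the recomputed value (about .85) can differ slightly from the reported .84.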


Table 3
Fit Indices of the Measurement Invariance Models for Gender

                     Model fit                                      Model difference
Model                S-B χ²(df)     CFI    RMSEA [90% CI]       Δχ²(Δdf); p-value   ΔCFI
M1: Sex-Configural   162.072(108)   .983   .045 [.030, .059]    -                   -
M2: Sex-Weak         181.888(129)   .984   .041 [.026, .054]    16.54(21); .74*     .001
M3: Sex-Strong       186.007(139)   .985   .037 [.021, .050]    20.24(31); .93*     .002
M4: Sex-Strict       207.536(152)   .983   .038 [.024, .051]    34.48(44); .86*     .000


Figure 1
Bifactor model of the Test of Gross Motor Development-Third Edition.

Note. Fc1 = Ball skills; Fc2 = Locomotor skills; Gnr = General FMS.
