J. Nutr.-2013-Schoenaker-392-8
http://jn.nutrition.org/content/suppl/2013/02/10/jn.112.169011.DCSupplemental.html
The Journal of Nutrition
Methodology and Mathematical Modeling
Abstract
Treelet transform (TT) is a proposed alternative to factor analysis for deriving dietary patterns. Before applying this method
1 The Australian Longitudinal Study on Women's Health, which was conceived and developed by groups of interdisciplinary researchers at the Universities of Newcastle and Queensland, is funded by the Australian Government Department of Health and Ageing. Gita D. Mishra is supported by the Australian National Health and Medical Research Council Centre for Research Excellence in Women's Health.
2 Author disclosures: D. A. J. M. Schoenaker, A. J. Dobson, S. S. Soedamah-Muthu, and G. D. Mishra, no conflicts of interest.
3 Supplemental Table 1 is available from the "Online Supporting Material" link in the online posting of the article and from the same link in the online table of contents at http://jn.nutrition.org.
* To whom correspondence should be addressed. E-mail: g.mishra@sph.uq.edu.au.
© 2013 American Society for Nutrition.
Manuscript received September 4, 2012. Initial review completed October 9, 2012. Revision accepted November 30, 2012.
First published online January 23, 2013; doi:10.3945/jn.112.169011.
Introduction
A diet consists of a variety of foods with complex combinations of nutrients that are likely to interact. A way to examine the joint effect of food intakes and capture overall diet is to derive dietary patterns using appropriate statistical methods. The identification of dietary patterns offers a comprehensive approach to study eating habits and makes it possible to examine the relations with disease risk in order to propose well-grounded dietary guidelines.
Type 2 diabetes is a growing public health problem; nevertheless, results from the Nurses' Health Study (1) show that the majority of cases could be avoided by behavior modification, including maintaining a diet high in fiber and low in saturated and trans fat and glycemic load. A great deal of epidemiological and clinical research on the role of diet and diabetes has resulted in a considerable body of evidence relating specific dietary patterns to the risk of diabetes (2–4). However, studies have used different approaches for identifying dietary patterns. Two general approaches have been used in observational studies: a priori methods, where nutritional variables are grouped according to prior knowledge or theory of a healthy diet (5,6), and a posteriori methods, where dietary patterns are derived from statistical modeling of dietary data allowing for hypothesis-generating analyses.
Factor analysis is a widely used a posteriori method to identify dietary patterns (7). The use of factor analysis, however, remains controversial in the field of nutritional epidemiology because of subjective choices made throughout the analytical process. Examples are pregrouping of original food items prior to analysis, choice of the number of factors to extract, the
TABLE 1 Factor analysis and TT for identifying dietary patterns: features/aims, assumptions, and decisions associated with each method

| | Factor analysis | TT |
|---|---|---|
| Features/aims | Uses the correlation matrix of food items to derive dietary patterns | Uses the correlation matrix of food items to derive dietary patterns |
| | Factor loadings on each food item are used to identify important foods contributing to each dietary pattern to extract maximum variance | Factor loadings generally on only a few food items are used to identify important foods contributing to each dietary pattern |
| | | Provides a hierarchical cluster tree to visually identify dietary patterns |
| Assumptions | Each factor is a linear combination of all food items to capture overall diet | Sparsity: factors contain only a few food items (omitting other food items by giving them zero loadings) |
| Decisions | Pregrouping of original food items prior to analysis | Pregrouping of original food items prior to analysis |
| | The number of factors to extract | The number of factors to extract |
| | The method of rotation | Determining the optimal cut-level for the cluster tree |
| | Labeling of factors | Labeling of factors |
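The factor analysis decisions listed in Table 1 (how many factors to extract, which rotation to use) can be made concrete with a minimal sketch. The code below is an illustration only, not the authors' procedure: it assumes principal-component extraction from the correlation matrix followed by varimax rotation, and it runs on synthetic data for seven hypothetical food items. All function and variable names are invented for the example.

```python
import numpy as np


def principal_factor_loadings(corr, n_factors):
    """Unrotated loadings: leading eigenvectors scaled by sqrt(eigenvalue)."""
    eigvals, eigvecs = np.linalg.eigh(corr)            # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_factors]      # keep the largest
    return eigvecs[:, order] * np.sqrt(eigvals[order])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation using the usual SVD-based iteration."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = (rotated ** 2).sum(axis=0)             # column sums of squares
        target = rotated ** 3 - rotated @ np.diag(col_ss) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() - criterion < tol:
            break
        criterion = s.sum()
    return loadings @ rotation


# Synthetic intake data (assumption): 4 items driven by one latent habit,
# 3 items driven by another, plus independent noise.
rng = np.random.default_rng(0)
n = 500
habit_a = rng.normal(size=n)
habit_b = rng.normal(size=n)
signal = np.column_stack([habit_a] * 4 + [habit_b] * 3)
items = signal + rng.normal(scale=0.7, size=(n, 7))

corr = np.corrcoef(items, rowvar=False)
unrotated = principal_factor_loadings(corr, n_factors=2)
rotated = varimax(unrotated)
print(np.round(rotated, 2))
```

Changing `n_factors` or skipping `varimax` changes the loadings, which is the kind of analyst choice the table refers to. In general every item receives a non-zero loading on every factor, in line with the assumption stated for factor analysis.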
method of rotation, and labeling of factors (8,9). Furthermore, the fact that each factor is a linear combination of all original food items may make interpretation complicated.
New estimation methods have been proposed to overcome these limitations and provide better insight into diet and disease etiology. Recently, the Treelet transform (TT), developed by Lee et al. (10), was suggested to overcome some of the limitations of factor analysis mainly by improving the interpretation of factors (11). Gorst-Rasmussen et al. (11) compared TT and Procrustes-rotated principal component analysis (PCA) as explanatory methods to study dietary patterns and the risk of myocardial infarction in middle-aged men in a Danish prospective cohort study. Risk estimates were not comparable with those obtained using PCA, even though they found that TT factors were easier to interpret due to the graphical representation of the clustering of food items and the limited number of food items with a factor loading. The authors concluded that TT may be a useful alternative to factor analysis (11). To further assess the validity of this approach in nutritional epidemiology, however, comparisons are needed with other methods of determining dietary patterns and with the resultant associations for a range of health outcomes.
Therefore, the aim of the present study was to compare dietary patterns derived by factor analysis, a widely used method, and the proposed alternative, TT. Our second aim was to compare the associations between these dietary patterns and incidence of diabetes. Associations between dietary patterns and diabetes have been extensively studied in the literature (2–4), which is ideal for critical evaluation of a newly proposed analysis method.
Participants and Methods
The Australian Longitudinal Study on Women's Health. The Australian Longitudinal Study on Women's Health is a prospective study of factors affecting the health and well-being of 3 cohorts of Australian women born in 1973–1978 ("young"), 1946–1951 ("mid-age"), and 1921–1926 ("older"). Women were randomly selected from the national Medicare health insurance database, which includes all Australian citizens and permanent residents. Women from rural and remote areas were intentionally oversampled (12). Since 1996 surveys have been administered to each cohort every 2–4 y on a rolling basis. Further details of the recruitment methods and response have been described elsewhere (13). Informed consent was obtained from all participants at each survey, with ethical clearance obtained from the Human Research Ethics Committees of the University of Newcastle and the University of Queensland.
Participants and surveys. The present study focuses on women in the mid-age cohort. In 1996, 13,715 women aged 45–50 y participated in the baseline survey (survey 1). This was estimated to be a 53–56% response rate for this age cohort (12). Diabetes was assessed at every survey and dietary intake was first assessed at the third survey (S3). From the initial mid-age cohort, 11,226 women aged 50–55 y in 2001 completed S3. This study further includes women during follow-up who responded to the fourth, fifth, and sixth surveys in 2004 (S4, n = 10,905), 2007 (S5, n = 10,638), and 2010 (S6, n = 9748), respectively. Attrition occurred mainly due to participants not returning the survey or inability to contact the participant (14). Percentages of women deceased between surveys were 0.4% at S2, 0.5% at S3, 0.8% at S4, 0.7% at S5, and 0.8% at S6. Women with a history of type 1 or 2 diabetes or impaired glucose tolerance (n = 745) or a history of cardiovascular disease (n = 703) before or at S3, or with incomplete dietary data at S3 (n = 1627) were excluded; the data of 8065 participants were used for obtaining dietary patterns. Those with missing data on covariates (n = 716) were then excluded, leaving complete data of 7349 women for analysis of the associations between dietary patterns and incident diabetes.
Dietary intake. At S3, diet was assessed using an FFQ: the Dietary Questionnaire for Epidemiological Studies version 2. The development of the questionnaire (15) and its validation were previously reported (16). A total of 63 women completed 7-d weighed food records next to the FFQ. Nutrient intakes were compared and deattenuated correlations corrected for daily variation in nutrient intake ranged between 0.28 for total vitamin A and 0.78 for carbohydrate after energy adjustment, indicating that the FFQ was useful for assessing habitual intake (16). Participants were asked to report their usual frequency of consumption
6 Abbreviations used: MET, total metabolic equivalent; PCA, principal component analysis; S1, baseline survey; S2, S3, etc., second, third, etc. survey; TT, Treelet transform.
FIGURE 1 Factor loadings for factors derived by factor analysis and TT for prudent patterns (A) and Western patterns (B) for participants in The Australian Longitudinal Study on Women's Health (n = 8065). TT, Treelet transform.
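To make the sparsity of TT factors and the role of the cut-level concrete, the following is a minimal sketch of a treelet-style construction: the two most correlated active variables are repeatedly merged by a local PCA (a Jacobi rotation), and only the higher-variance "sum" variable stays eligible for further merging. It is a simplified illustration of the idea attributed to Lee et al. (10), not the authors' software. The 6-item correlation matrix, the function names, and the rule of reading a factor off as the highest-variance basis vector are assumptions made for the example.

```python
import numpy as np


def treelet_basis(corr, cut_level):
    """Run `cut_level` merges; return (basis, variances).

    Columns of `basis` are loading vectors over the original items;
    `variances` holds the variance of each transformed coordinate.
    """
    p = corr.shape[0]
    if not 0 <= cut_level <= p - 1:
        raise ValueError("cut_level must lie between 0 and p - 1")
    sigma = np.array(corr, dtype=float)
    basis = np.eye(p)
    active = list(range(p))  # coordinates still eligible for merging
    for _ in range(cut_level):
        sd = np.sqrt(np.diag(sigma))
        similarity = np.abs(sigma / np.outer(sd, sd))
        # Most correlated pair among the active coordinates.
        a, b = max(
            ((i, j) for k, i in enumerate(active) for j in active[k + 1:]),
            key=lambda pair: similarity[pair],
        )
        # Local PCA: Jacobi rotation that zeroes the covariance of (a, b).
        theta = 0.5 * np.arctan2(2.0 * sigma[a, b], sigma[a, a] - sigma[b, b])
        c, s = np.cos(theta), np.sin(theta)
        rotation = np.eye(p)
        rotation[a, a], rotation[a, b] = c, -s
        rotation[b, a], rotation[b, b] = s, c
        sigma = rotation.T @ sigma @ rotation
        basis = basis @ rotation
        # The higher-variance coordinate (the "sum") stays active; the
        # other (the "difference") is retired from further merging.
        active.remove(b if sigma[a, a] >= sigma[b, b] else a)
    return basis, np.diag(sigma).copy()


def top_factor(corr, cut_level):
    """Loading vector of the highest-variance coordinate at this cut-level."""
    basis, variances = treelet_basis(corr, cut_level)
    return basis[:, np.argmax(variances)]


# Made-up correlation matrix: two groups of three items, weakly linked.
corr = np.array([
    [1.00, 0.70, 0.60, 0.10, 0.10, 0.10],
    [0.70, 1.00, 0.50, 0.10, 0.10, 0.10],
    [0.60, 0.50, 1.00, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 1.00, 0.65, 0.55],
    [0.10, 0.10, 0.10, 0.65, 1.00, 0.45],
    [0.10, 0.10, 0.10, 0.55, 0.45, 1.00],
])

# Number of items with a non-zero loading on the top factor per cut-level.
counts = [int(np.sum(np.abs(top_factor(corr, level)) > 1e-12))
          for level in (1, 3, 5)]
print(counts)  # [2, 3, 6]
```

In this toy example a low cut-level gives a factor that involves only two items, whereas cutting at the root (level 5 for 6 items) gives a factor with contributions from every item, much like a factor from factor analysis.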
In contrast to factor analysis, TT does not automatically produce factors with high variance. A subjective decision is made when selecting the cut-level for the cluster tree using cross-validation before high variance factors can be extracted. The cut-level influences both the sparsity and the grouping of the factors and might therefore affect the results when looking at associations with disease incidence. Lowering the cut-level results in increased sparsity, whereas increasing the cut-level decreases sparsity, showing contributions from all food items to each factor, comparable to factor analysis. Increasing the sparsity improves interpretability, but at the same time, the factor variances increase, which might result in unstable results (28). Performing TT with different cut-levels reveals the instabilities and helps determine the optimal level (11). Pattern structures remained comparable in our study when obtained using cut-levels of ±3; however, further decreasing or increasing the cut-level would most likely have a larger influence on the structure of the patterns. Instead of cutting the tree at a single height, another approach could be to start near the root of the tree and descend deeper into the tree, looking for optimal identification of patterns regarding number of food items with a non-zero loading, interpretability, and public health relevance (28).
A major concern when applying TT to nutritional data is whether it is in line with the original aim of dietary pattern analysis: to derive dietary patterns that represent the frequency and amount of all foods consumed to capture overall diet (29). The combined role of all foods is essential in the biologic influence of diet on disease as well as for dietary interventions and public health messages (30). Where factors from factor analysis comprise all food items, the sparsity feature of TT results in patterns ignoring foods with zero loading, as in the case of white bread and potatoes with fat. This may be due to the fact that these food items were not correlated with the newly formed variables from the local PCA and hence were subsequently disregarded (11). These foods, however, show a high loading on the Western pattern from factor analysis and have strong individual relationships with diabetes incidence [white bread: OR = 1.21 (95% CI: 1.12, 1.30); potatoes with fat: OR = 1.69 (95% CI: 1.01, 2.75)]. These different pattern structures, therefore, result in different conclusions regarding their relationship with diabetes incidence.
In summary, we demonstrated that the proposal of a new approach to derive dietary patterns and comparison of methodologies gives insight into the importance of aims and assumptions in such analyses. Both factor analysis and TT involve subjective decisions that should be explored in sensitivity analyses and taken into account when interpreting results and conclusions for public health messages. Sensitivity analyses on, e.g., pregrouping of food items and number of factors to extract can indicate and optimize robustness of results. TT produces clearly interpretable factors that account for almost as much variation as factors from factor analysis, but the sparse factors do not represent an overall dietary pattern. Besides, results on the relation between dietary patterns from TT and incidence of diabetes are not in line with consistent findings from the literature. Results from this study indicate that factor analysis might be a more appropriate method for identifying overall dietary patterns associated with diabetes compared with TT.
Acknowledgments
The authors thank Professor Graham Giles of the Cancer Epidemiology Centre of The Cancer Council Victoria for permission to use the Dietary Questionnaire for Epidemiological Studies (version 2), Melbourne: The Cancer Council Victoria, 1996. D.A.J.M.S., G.D.M., and A.J.D. designed research; D.A.J.M.S. analyzed data and had primary responsibility for final content; G.D.M. and A.J.D. contributed to statistical analysis and interpretation of results and critical revision of the
manuscript; and S.S.S-M. contributed by critical revision of the manuscript for important intellectual content. All authors read and approved the final manuscript.
Literature Cited
1. Hu FB, Manson JE, Stampfer MJ, Colditz G, Liu S, Solomon CG, Willett WC. Diet, lifestyle, and the risk of type 2 diabetes mellitus in women. N Engl J Med. 2001;345:790–7.
2. Esposito K, Kastorini CM, Panagiotakos DB, Giugliano D. Prevention of type 2 diabetes by dietary patterns: a systematic review of prospective studies and meta-analysis. Metab Syndr Relat Disord. 2010;8:471–6.
3. Kastorini CM, Panagiotakos DB. Dietary patterns and prevention of type 2 diabetes: from research to clinical practice; a systematic review. Curr Diabetes Rev. 2009;5:221–7.
4. Salas-Salvadó J, Martinez-González M, Bulló M, Ros E. The role of diet in the prevention of type 2 diabetes. Nutr Metab Cardiovasc Dis. 2011;21:B32–48.
5. Haines PS, Siega-Riz AM, Popkin BM. The Diet Quality Index revised: a measurement instrument for populations. J Am Diet Assoc. 1999;99:697–704.
6. Kennedy ET, Ohls J, Carlson S, Fleming K. The Healthy Eating Index: design and applications. J Am Diet Assoc. 1995;95:1103–8.
7. Newby PK, Tucker KL. Empirically derived eating patterns using factor
15. Ireland P, Jolley D, Giles G, O'Dea K, Powles J, Rutishauser I, Wahlqvist ML, Williams J. Development of the Melbourne FFQ: a food frequency questionnaire for use in an Australian prospective study involving an ethnically diverse cohort. Asia Pac J Clin Nutr. 1994;3:19–31.
16. Hodge A, Patterson AJ, Brown WJ, Ireland P, Giles G. The Anti Cancer Council of Victoria FFQ: relative validity of nutrient intakes compared with weighed food records in young to middle-aged women in a study of iron supplementation. Aust N Z J Public Health. 2000;24:576–83.
17. Lewis J, Milligan G, Hunt A. NUTTAB95 Nutrient Data Table for use in Australia. Canberra: Australian Government Publishing Service; 1995.
18. National Health and Medical Research Council. Australian Alcohol Guidelines: health risks and benefits. Canberra: ACT, Commonwealth of Australia; 2001.
19. Brown WJ, Burton NW, Marshall AL, Miller YD. Reliability and validity of a modified self-administered version of the Active Australia physical activity survey in a sample of mid-age women. Aust N Z J Public Health. 2008;32:535–41.
20. Brown WJ, Bauman AE. Comparison of estimates of population levels of physical activity using two measures. Aust N Z J Public Health. 2000;24:520–5.
21. Armstrong T, Bauman A, Davis J. Physical activity patterns of Australian adults. Canberra: Australian Institute of Health and Welfare; 2000.
22. WHO. Obesity: preventing and managing the global epidemic. Geneva: WHO; 2000.