Multilevel and Longitudinal Modeling Using Stata, Fourth Edition, Volumes I and II
Sophia Rabe-Hesketh and Anders Skrondal
Multilevel and Longitudinal
Modeling Using Stata
Fourth Edition

SOPHIA RABE-HESKETH
University of California–Berkeley

ANDERS SKRONDAL
Norwegian Institute of Public Health
University of Oslo
University of California–Berkeley

A Stata Press Publication


StataCorp LLC
College Station, Texas

Copyright © 2005, 2008, 2012, 2022 by StataCorp LLC
All rights reserved.

First edition 2005
Second edition 2008
Third edition 2012
Fourth edition 2022

Published by Stata Press, 4905 Lakeway Drive, College Station, Texas 77845

Typeset in LaTeX 2ε

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

Print ISBN-10: 1-59718-136-6 (volumes I and II)

Print ISBN-10: 1-59718-137-4 (volume I)

Print ISBN-10: 1-59718-138-2 (volume II)

Print ISBN-13: 978-1-59718-136-5 (volumes I and II)

Print ISBN-13: 978-1-59718-137-2 (volume I)

Print ISBN-13: 978-1-59718-138-9 (volume II)

ePub ISBN-10: 1-59718-309-1 (volumes I and II)

ePub ISBN-10: 1-59718-310-5 (volume I)

ePub ISBN-10: 1-59718-311-3 (volume II)

ePub ISBN-13: 978-1-59718-309-3 (volumes I and II)

ePub ISBN-13: 978-1-59718-310-9 (volume I)

ePub ISBN-13: 978-1-59718-311-6 (volume II)

Mobi ISBN-10: 1-59718-312-1 (volumes I and II)

Mobi ISBN-10: 1-59718-313-X (volume I)

Mobi ISBN-10: 1-59718-314-8 (volume II)

Mobi ISBN-13: 978-1-59718-312-3 (volumes I and II)

Mobi ISBN-13: 978-1-59718-313-0 (volume I)

Mobi ISBN-13: 978-1-59718-314-7 (volume II)

Library of Congress Control Number: 2021944297

No part of this book may be reproduced, stored in a retrieval system, or
transcribed, in any form or by any means—electronic, mechanical, photocopy,
recording, or otherwise—without the prior written permission of StataCorp LLC.

Stata, Stata Press, Mata, and NetCourse are registered trademarks of
StataCorp LLC.

Stata and Stata Press are registered trademarks with the World Intellectual
Property Organization of the United Nations.

NetCourseNow is a trademark of StataCorp LLC.

LaTeX 2ε is a trademark of the American Mathematical Society.

Other brand and product names are registered trademarks or trademarks of their
respective companies.
Contents
Displays

Preface
Acknowledgments

I Preliminaries
1 Review of linear regression
1.1 Introduction
1.2 Is there gender discrimination in faculty salaries?
1.3 Independent-samples t test
1.4 One-way analysis of variance
1.5 Simple linear regression
1.6 Dummy variables
1.7 Multiple linear regression
1.8 Interactions
1.9 Dummy variables for more than two groups
1.10 Other types of interactions
1.10.1 Interaction between dummy variables
1.10.2 Interaction between continuous covariates
1.11 Nonlinear effects
1.12 Residual diagnostics
1.13 ❖ Causal and noncausal interpretations of regression
coefficients
1.13.1 Regression as conditional expectation
1.13.2 Regression as structural model
1.14 Summary and further reading
1.15 Exercises
II Two-level models

2 Variance-components models
2.1 Introduction
2.2 How reliable are peak-expiratory-flow measurements?
2.3 Inspecting within-subject dependence
2.4 The variance-components model
2.4.1 Model specification
2.4.2 Path diagram
2.4.3 Between-subject heterogeneity
2.4.4 Within-subject dependence
Intraclass correlation
Intraclass correlation versus Pearson correlation
2.5 Estimation using Stata
2.5.1 Data preparation: Reshaping from wide form to long form
2.5.2 Using xtreg
2.5.3 Using mixed
2.6 Hypothesis tests and confidence intervals
2.6.1 Hypothesis test and confidence interval for the population
mean
2.6.2 Hypothesis test and confidence interval for the between-
cluster variance
Likelihood-ratio test
❖ Score test
F test
Confidence intervals
2.7 ❖ Model as data-generating mechanism
2.8 Fixed versus random effects
2.9 Crossed versus nested effects
2.10 Parameter estimation
2.10.1 Model assumptions
Mean structure and covariance structure
Distributional assumptions
2.10.2 Different estimation methods
2.10.3 Inference for β
Estimate and standard error: Balanced case
Estimate: Unbalanced case
2.11 Assigning values to the random intercepts
2.11.1 Maximum “likelihood” estimation
Implementation via OLS
Implementation via the mean total residual
2.11.2 Empirical Bayes prediction
2.11.3 Empirical Bayes standard errors
Posterior and comparative standard errors
Diagnostic standard errors
Accounting for uncertainty in β
2.11.4 ❖ Bayesian interpretation of REML estimation and prediction
2.12 Summary and further reading
2.13 Exercises
3 Random-intercept models with covariates
3.1 Introduction
3.2 Does smoking during pregnancy affect birthweight?
3.2.1 Data structure and descriptive statistics
3.3 The linear random-intercept model with covariates
3.3.1 Model specification
3.3.2 Model assumptions
3.3.3 Mean structure
3.3.4 Residual covariance structure
3.3.5 Graphical illustration of random-intercept model
3.4 Estimation using Stata
3.4.1 Using xtreg
3.4.2 Using mixed
3.5 Coefficients of determination or variance explained
3.6 Hypothesis tests and confidence intervals
3.6.1 Hypothesis tests for individual regression coefficients
3.6.2 Joint hypothesis tests for several regression coefficients
3.6.3 Predicted means and confidence intervals
3.6.4 Hypothesis test for random-intercept variance
3.7 Between and within effects of level-1 covariates
3.7.1 Between-mother effects
3.7.2 Within-mother effects
3.7.3 ❖ Relations among within estimator, between estimator, and
estimator for random-intercept model
3.7.4 Level-2 endogeneity and cluster-level confounding
3.7.5 Conventional Hausman test
3.7.6 Allowing for different within and between effects
3.7.7 Robust Hausman test
3.8 Fixed versus random effects revisited
3.9 Assigning values to random effects: Residual diagnostics
3.10 More on statistical inference
3.10.1 ❖ Overview of estimation methods
Pooled OLS
Feasible generalized least squares (FGLS)
ML by iterative GLS (IGLS)
ML by Newton–Raphson and Fisher scoring
ML by the expectation-maximization (EM) algorithm
REML
3.10.2 Consequences of using standard regression modeling for
clustered data
Purely between-cluster covariate
Purely within-cluster covariate
3.10.3 ❖ Power and sample-size determination
Purely between-cluster covariate
Purely within-cluster covariate
3.11 Summary and further reading
3.12 Exercises
4 Random-coefficient models
4.1 Introduction
4.2 How effective are different schools?
4.3 Separate linear regressions for each school
4.4 Specification and interpretation of a random-coefficient model
4.4.1 Specification of a random-coefficient model
4.4.2 Interpretation of the random-effects variances and
covariances
4.5 Estimation using mixed
4.5.1 Random-intercept model
4.5.2 Random-coefficient model
4.6 Testing the slope variance
4.7 Interpretation of estimates
4.8 Assigning values to the random intercepts and slopes
4.8.1 Maximum “likelihood” estimation
4.8.2 Empirical Bayes prediction
4.8.3 Model visualization
4.8.4 Residual diagnostics
4.8.5 Inferences for individual schools
4.9 Two-stage model formulation
4.10 Some warnings about random-coefficient models
4.10.1 Meaningful specification
4.10.2 Many random coefficients
4.10.3 Convergence problems
4.10.4 Lack of identification
4.11 Summary and further reading
4.12 Exercises
III Models for longitudinal and panel data
5 Subject-specific effects, endogeneity, and unobserved
confounding
5.1 Introduction
5.2 Random-effects approach: No endogeneity
5.3 Fixed-effects approach: Level-2 endogeneity
5.3.1 De-meaning and subject dummies
De-meaning
Subject dummies
5.3.2 Hausman test
5.3.3 Mundlak approach and robust Hausman test
5.3.4 First-differencing
5.4 Difference-in-differences and repeated-measures ANOVA
5.4.1 Does raising the minimum wage reduce employment?
5.4.2 ❖ Repeated-measures ANOVA
5.5 Subject-specific coefficients
5.5.1 Random-coefficient model: No endogeneity
5.5.2 Fixed-coefficient model: Level-2 endogeneity
5.6 Hausman–Taylor: Level-2 endogeneity for level-1 and level-2
covariates
5.7 Instrumental-variable methods: Level-1 (and level-2)
endogeneity
5.7.1 Do deterrents decrease crime rates?
5.7.2 Conventional fixed-effects approach
5.7.3 Fixed-effects IV estimator
5.7.4 Random-effects IV estimator
5.7.5 More Hausman tests
5.8 Dynamic models
5.8.1 Dynamic model without subject-specific intercepts
5.8.2 Dynamic model with subject-specific intercepts
5.9 Missing data and dropout
5.9.1 ❖ Maximum likelihood estimation under MAR: A simulation
5.10 Summary and further reading
5.11 Exercises
6 Marginal models
6.1 Introduction
6.2 Mean structure
6.3 Covariance structures
6.3.1 Unstructured covariance matrix
6.3.2 Random-intercept or compound symmetric/exchangeable
structure
6.3.3 Random-coefficient structure
6.3.4 Autoregressive and exponential structures
6.3.5 Moving-average residual structure
6.3.6 Banded and Toeplitz structures
6.4 Hybrid and complex marginal models
6.4.1 Random effects and correlated level-1 residuals
6.4.2 Heteroskedastic level-1 residuals over occasions
6.4.3 Heteroskedastic level-1 residuals over groups
6.4.4 Different covariance matrices over groups
6.5 Comparing the fit of marginal models
6.6 Generalized estimating equations (GEE)
6.7 Marginal modeling with few units and many occasions
6.7.1 Is a highly organized labor market beneficial for economic
growth?
6.7.2 Marginal modeling for long panels
6.7.3 Fitting marginal models for long panels in Stata
6.8 Summary and further reading
6.9 Exercises
7 Growth-curve models
7.1 Introduction
7.2 How do children grow?
7.2.1 Observed growth trajectories
7.3 Models for nonlinear growth
7.3.1 Polynomial models
Estimation using mixed
Predicting the mean trajectory
Predicting trajectories for individual children
7.3.2 Piecewise linear models
Estimation using mixed
Predicting the mean trajectory
7.4 Two-stage model formulation and cross-level interaction
7.5 Heteroskedasticity
7.5.1 Heteroskedasticity at level 1
7.5.2 Heteroskedasticity at level 2
7.6 How does reading improve from kindergarten through third
grade?
7.7 Growth-curve model as a structural equation model
7.7.1 Estimation using sem
7.7.2 Estimation using mixed
7.8 Summary and further reading
7.9 Exercises
IV Models with nested and crossed random effects

8 Higher-level models with nested random effects
8.1 Introduction
8.2 Do peak-expiratory-flow measurements vary between methods
within subjects?
8.3 Inspecting sources of variability
8.4 Three-level variance-components models
8.5 Different types of intraclass correlation
8.6 Estimation using mixed
8.7 Empirical Bayes prediction
8.8 Testing variance components
8.9 Crossed versus nested random effects revisited
8.10 Does nutrition affect cognitive development of Kenyan
children?
8.11 Describing and plotting three-level data
8.11.1 Data structure and missing data
8.11.2 Level-1 variables
8.11.3 Level-2 variables
8.11.4 Level-3 variables
8.11.5 Plotting growth trajectories
8.12 Three-level random-intercept model
8.12.1 Model specification: Reduced form
8.12.2 Model specification: Three-stage formulation
8.12.3 Estimation using mixed
8.13 Three-level random-coefficient models
8.13.1 Random coefficient at the child level
Estimation using mixed
8.13.2 Random coefficient at the child and school levels
Estimation using mixed
8.14 Residual diagnostics and predictions
8.15 Summary and further reading
8.16 Exercises
9 Crossed random effects
9.1 Introduction
9.2 How does investment depend on expected profit and capital
stock?
9.3 A two-way error-components model
9.3.1 Model specification
9.3.2 Residual variances, covariances, and intraclass correlations
Longitudinal correlations
Cross-sectional correlations
9.3.3 Estimation using mixed
9.3.4 Prediction
9.4 How much do primary and secondary schools affect attainment
at age 16?
9.5 Data structure
9.6 Additive crossed random-effects model
9.6.1 Specification
9.6.2 Intraclass correlations
9.6.3 Estimation using mixed
9.7 Crossed random-effects model with random interaction
9.7.1 Model specification
9.7.2 Intraclass correlations
9.7.3 Estimation using mixed
9.7.4 Testing variance components
9.7.5 Some diagnostics
9.8 ❖ A trick requiring fewer random effects
9.9 Summary and further reading
9.10 Exercises
A Useful Stata commands
V Models for categorical responses

10 Dichotomous or binary responses


10.1 Introduction
10.2 Single-level logit and probit regression models for dichotomous
responses
10.2.1 Generalized linear model formulation
Labor-participation data
Estimation using logit
Estimation using glm
10.2.2 Latent-response formulation
Logistic regression
Probit regression
Estimation using probit
10.3 Which treatment is best for toenail infection?
10.4 Longitudinal data structure
10.5 Proportions and fitted population-averaged or marginal
probabilities
Estimation using logit
10.6 Random-intercept logistic regression
10.6.1 Model specification
Reduced-form specification
Two-stage formulation
10.6.2 Model assumptions
10.6.3 Estimation
Using xtlogit
Using melogit
Using gllamm
10.7 Subject-specific or conditional versus population-averaged or
marginal relationships
10.8 Measures of dependence and heterogeneity
10.8.1 Conditional or residual intraclass correlation of the latent
responses
10.8.2 Median odds ratio
10.8.3 ❖ Measures of association for observed responses at
median fixed part of the model
10.9 Inference for random-intercept logistic models
10.9.1 Tests and confidence intervals for odds ratios
10.9.2 Tests of variance components
10.10 Maximum likelihood estimation
10.10.1 ❖ Adaptive quadrature
10.10.2 Some speed and accuracy considerations
Integration methods and number of quadrature points
Starting values
Using melogit and gllamm for collapsible data
Spherical quadrature in gllamm
10.11 Assigning values to random effects
10.11.1 Maximum “likelihood” estimation
10.11.2 Empirical Bayes prediction
10.11.3 Empirical Bayes modal prediction
10.12 Different kinds of predicted probabilities
10.12.1 Predicted population-averaged or marginal probabilities
10.12.2 Predicted subject-specific probabilities
Predictions for hypothetical subjects: Conditional probabilities
Predictions for the subjects in the sample: Posterior mean
probabilities
10.13 Other approaches to clustered dichotomous data
10.13.1 Conditional logistic regression
Estimation using clogit
10.13.2 Generalized estimating equations (GEE)
Estimation using xtgee
10.14 Summary and further reading
10.15 Exercises
11 Ordinal responses
11.1 Introduction
11.2 Single-level cumulative models for ordinal responses
11.2.1 Generalized linear model formulation
11.2.2 Latent-response formulation
11.2.3 Proportional odds
11.2.4 ❖ Identification
11.3 Longitudinal data structure and graphs
11.3.1 Longitudinal data structure
11.3.2 Plotting cumulative proportions
11.3.3 Plotting cumulative sample logits and transforming the time
scale
11.4 Single-level proportional-odds model
11.4.1 Model specification
Estimation using ologit
11.5 Random-intercept proportional-odds model
11.5.1 Model specification
Estimation using meologit
Estimation using gllamm
11.5.2 Measures of dependence and heterogeneity
Residual intraclass correlation of latent responses
Median odds ratio
11.6 Random-coefficient proportional-odds model
11.6.1 Model specification
Estimation using meologit
Estimation using gllamm
11.7 Different kinds of predicted probabilities
11.7.1 Predicted population-averaged or marginal probabilities
11.7.2 Predicted subject-specific probabilities: Posterior mean
11.8 Do experts differ in their grading of student essays?
11.9 A random-intercept probit model with grader bias
11.9.1 Model specification
Estimation using gllamm
11.10 ❖ Including grader-specific measurement-error variances
11.10.1 Model specification
Estimation using gllamm
11.11 ❖ Including grader-specific thresholds
11.11.1 Model specification
Estimation using gllamm
11.12 ❖ Other link functions
Cumulative complementary log–log model
Continuation-ratio logit model
Adjacent-category logit model
Baseline-category logit and stereotype models
11.13 Summary and further reading
11.14 Exercises
12 Nominal responses and discrete choice
12.1 Introduction
12.2 Single-level models for nominal responses
12.2.1 Multinomial logit models
Transport data version 1
Estimation using mlogit
12.2.2 Conditional logit models with alternative-specific covariates
Transport data version 2: Expanded form
Estimation using clogit
Estimation using cmclogit
12.2.3 Conditional logit models with alternative- and unit-specific
covariates
Estimation using clogit
Estimation using cmclogit
12.3 Independence from irrelevant alternatives
12.4 Utility-maximization formulation
12.5 Does marketing affect choice of yogurt?
12.6 Single-level conditional logit models
12.6.1 Conditional logit models with alternative-specific intercepts
Estimation using clogit
Estimation using cmclogit
12.7 Multilevel conditional logit models
12.7.1 Preference heterogeneity: Brand-specific random intercepts
Estimation using cmxtmixlogit
Estimation using gllamm
12.7.2 Response heterogeneity: Marketing variables with random
coefficients
Estimation using cmxtmixlogit
Estimation using gllamm
12.7.3 ❖ Preference and response heterogeneity
Estimation using cmxtmixlogit
Estimation using gllamm
12.8 Prediction of marginal choice probabilities
12.9 Prediction of random effects and household-specific choice
probabilities
12.10 Summary and further reading
12.11 Exercises
VI Models for counts
13 Counts
13.1 Introduction
13.2 What are counts?
13.2.1 Counts versus proportions
13.2.2 Counts as aggregated event-history data
13.3 Single-level Poisson models for counts
13.4 Did the German healthcare reform reduce the number of
doctor visits?
13.5 Longitudinal data structure
13.6 Single-level Poisson regression
13.6.1 Model specification
Estimation using poisson
Estimation using glm
13.7 Random-intercept Poisson regression
13.7.1 Model specification
13.7.2 Measures of dependence and heterogeneity
13.7.3 Estimation
Using xtpoisson
Using mepoisson
Using gllamm
13.8 Random-coefficient Poisson regression
13.8.1 Model specification
Estimation using mepoisson
Estimation using gllamm
13.9 Overdispersion in single-level models
13.9.1 Normally distributed random intercept
Estimation using xtpoisson
13.9.2 Negative binomial models
Mean dispersion or NB2
Constant dispersion or NB1
13.9.3 Quasilikelihood
Estimation using glm
13.10 Level-1 overdispersion in two-level models
13.10.1 Random-intercept Poisson model with robust standard
errors
Estimation using mepoisson
13.10.2 Three-level random-intercept model
13.10.3 Negative binomial models with random intercepts
Estimation using menbreg
13.10.4 The HHG model
13.11 Other approaches to two-level count data
13.11.1 Conditional Poisson regression
Estimation using xtpoisson, fe
Estimation using Poisson regression with dummy variables for
clusters
13.11.2 Conditional negative binomial regression
13.11.3 Generalized estimating equations
Estimation using xtgee
13.12 Estimating marginal and conditional effects when responses
are missing at random
❖ Simulation
13.13 Which Scottish counties have a high risk of lip cancer?
13.14 Standardized mortality ratios
13.15 Random-intercept Poisson regression
13.15.1 Model specification
Estimation using gllamm
13.15.2 Prediction of standardized mortality ratios
13.16 ❖ Nonparametric maximum likelihood estimation
13.16.1 Specification
Estimation using gllamm
13.16.2 Prediction
13.17 Summary and further reading
13.18 Exercises
VII Models for survival or duration data
14 Discrete-time survival
14.1 Introduction
14.2 Single-level models for discrete-time survival data
14.2.1 Discrete-time hazard and discrete-time survival
Promotions data
14.2.2 Data expansion for discrete-time survival analysis
14.2.3 Estimation via regression models for dichotomous responses
Estimation using logit
14.2.4 Including time-constant covariates
Estimation using logit
14.2.5 Including time-varying covariates
Estimation using logit
14.2.6 Multiple absorbing events and competing risks
Estimation using mlogit
14.2.7 Handling left-truncated data
14.3 How does mother’s birth history affect child mortality?
14.4 Data expansion
14.5 ❖ Proportional hazards and interval-censoring
14.6 Complementary log–log models
14.6.1 Marginal baseline hazard
Estimation using cloglog
14.6.2 Including covariates
Estimation using cloglog
14.7 Random-intercept complementary log–log model
14.7.1 Model specification
Estimation using mecloglog
14.8 ❖ Population-averaged or marginal vs. cluster-specific or
conditional survival probabilities
14.9 Summary and further reading
14.10 Exercises
15 Continuous-time survival
15.1 Introduction
15.2 What makes marriages fail?
15.3 Hazards and survival
15.4 Proportional hazards models
15.4.1 Piecewise exponential model
Estimation using streg
Estimation using poisson
15.4.2 Cox regression model
Estimation using stcox
15.4.3 Cox regression via Poisson regression for expanded data
Estimation using xtpoisson, fe
15.4.4 Approximate Cox regression: Poisson regression with
smooth baseline hazard
Estimation using poisson
15.5 Accelerated failure-time models
15.5.1 Log-normal model
Estimation using streg
Estimation using stintreg
15.6 Time-varying covariates
Estimation using streg
15.7 Does nitrate reduce the risk of angina pectoris?
15.8 Marginal modeling
15.8.1 Cox regression with occasion-specific dummy variables
Estimation using stcox
15.8.2 Cox regression with occasion-specific baseline hazards
Estimation using stcox, strata
15.8.3 Approximate Cox regression
Estimation using poisson
15.9 Multilevel proportional hazards models
15.9.1 Cox regression with gamma shared frailty
Estimation using stcox, shared
15.9.2 Approximate Cox regression with log-normal shared frailty
Estimation using mepoisson
15.9.3 Approximate Cox regression with normal random intercept
and random coefficient
Estimation using mepoisson
15.10 Multilevel accelerated failure-time models
15.10.1 Log-normal model with gamma shared frailty
Estimation using streg
15.10.2 Log-normal model with log-normal shared frailty
Estimation using mestreg
15.10.3 Log-normal model with normal random intercept and
random coefficient
Estimation using mestreg
15.11 Fixed-effects approach
15.11.1 Stratified Cox regression with subject-specific baseline
hazards
Estimation using stcox, strata
15.12 ❖ Different approaches to recurrent-event data
15.12.1 Total-time risk interval
15.12.2 Counting-process risk interval
15.12.3 Gap-time risk interval
15.13 Summary and further reading
15.14 Exercises
VIII Models with nested and crossed random effects

16 Models with nested and crossed random effects


16.1 Introduction
16.2 Did the Guatemalan-immunization campaign work?
16.3 A three-level random-intercept logistic regression model
16.3.1 Model specification
16.3.2 Measures of dependence and heterogeneity
Types of residual intraclass correlations of the latent responses
Types of median odds ratios
16.3.3 Three-stage formulation
16.3.4 Estimation
Using melogit
Using gllamm
16.4 A three-level random-coefficient logistic regression model
16.4.1 Estimation
Using melogit
Using gllamm
16.5 Prediction of random effects
16.5.1 Empirical Bayes prediction
16.5.2 Empirical Bayes modal prediction
16.6 Different kinds of predicted probabilities
16.6.1 Predicted population-averaged or marginal probabilities:
New clusters
16.6.2 Predicted median or conditional probabilities
16.6.3 Predicted posterior mean probabilities: Existing clusters
16.7 Do salamanders from different populations mate successfully?
16.8 Crossed random-effects logistic regression
16.8.1 Setup for estimating crossed random-effects model using
melogit
16.8.2 Approximate maximum likelihood estimation
Estimation using melogit
16.8.3 Bayesian estimation
Brief introduction to Bayesian inference
Priors for the salamander data
Estimation using bayes: melogit
16.8.4 Estimates compared
16.8.5 Fully Bayesian versus empirical Bayesian inference for
random effects
16.9 Summary and further reading
16.10 Exercises
B Syntax for gllamm, eq, and gllapred: The bare essentials

C Syntax for gllamm


D Syntax for gllapred
E Syntax for gllasim

References
Author index

Subject index
Tables
1.1 Sums of squares (SS) and mean squares (MS) for one-way ANOVA
1.2 OLS estimates for salary data (in U.S. dollars)
2.1 Peak-expiratory-flow rate measured on two occasions using both
the Wright and the Mini Wright peak-flow meters
2.2 Maximum likelihood and restricted maximum likelihood estimates
for Mini Wright peak-flow meter
2.3 GHQ scores for 12 students tested on two occasions
2.4 Estimates for hypothetical test–retest study
3.1 Maximum likelihood estimates for smoking data with robust
standard errors (all estimates in grams)
3.2 Random-, between-, and within-effects estimates for smoking
data (in grams); MLE of random-intercept model (3.2), OLS of (3.11),
OLS of (3.12), and MLE of random-intercept model including all
cluster means
3.3 Overview of distinguishing features of fixed- and random-effects
approaches for linear models that include covariates
4.1 Maximum likelihood estimates for inner-London-schools data
with robust standard errors
III.1 Illustration of longitudinal data in long form
5.1 Violations of exogeneity classified by which types of errors are
correlated with which types of covariates
5.2 Estimates for subject-specific models for wage-panel data
5.3 Prefixes for differences, lags, and lagged differences in Stata’s
time-series operators
5.4 Fixed-effects estimates for minimum-wage and employment data
5.5 Fixed-effects (FE), fixed-effects instrumental-variable (FE-IV), and
random-effects instrumental-variable (RE-IV) estimates for deterrent
and crime data
5.6 Estimates for AR(1) dynamic models for wage-panel data
6.1 Common marginal covariance structures for longitudinal data (
). The number of parameters and requirements for timing of
occasions are also given. (In Stata, missing data are allowed for all
structures.) Whenever the variance is constant, it is denoted and
factored out.
6.2 Conditional and marginal variances and covariances of total
residuals for random-intercept model
6.3 Conditional and marginal variances and covariances of total
residuals for random-coefficient model
6.4 Pooled OLS estimates with no within-unit correlations and with
AR(1) correlations
7.1 Maximum likelihood estimates of random-coefficient models for
children’s-growth data with robust standard errors
7.2 Maximum likelihood estimates with robust standard errors for
quadratic models for children’s-growth data. “Model 3” includes
cross-level interaction. “Model 4” and “Model 5” allow the random
part of the model at level 1 and level 2, respectively, to differ
between boys (B) and girls (G).
7.3 Maximum likelihood estimates for reading data with robust
standard errors
8.1 Restricted maximum likelihood estimates for three-level variance-
components (VC) and random-intercept (RI) models for peak-
expiratory-flow data
8.2 Restricted maximum likelihood estimates for Kenyan-nutrition
data. Models with random intercept at both child and school levels
[RI(2) & RI(3)], random coefficient at child level and random intercept
at school level [RC(2) & RI(3)], and random coefficients at both child
and school levels [RC(2) & RC(3)].
9.1 REML estimates of two-way error-components model for
Grunfeld (1958) data
9.2 Restricted maximum-likelihood estimates for crossed random-
effects models for Fife data
9.3 Estimated intraclass correlations for Fife data
9.4 Rating data for 16 cases in incomplete block design
9.5 Latin-square design for nitrogen fertilization experiment
9.6 Ratings of seven skating pairs by seven judges using two criteria
(program and performance) in the 1932 Winter Olympics
10.1 ML estimates for logistic regression model for women’s labor
force participation
10.2 Estimates for toenail data
10.3 ML estimates for bitterness model
11.1 Maximum likelihood estimates and 95% CIs for proportional
odds model (POM), random-intercept proportional-odds model (RI-
POM), and random-coefficient proportional-odds model (RC-POM)
11.2 Maximum likelihood estimates for essay-grading data (for
models 1 and 2, )
12.1 Estimates for nominal regression models for choice of transport
12.2 Estimates for nominal regression models for choice of yogurt
13.1 Estimates for different kinds of Poisson regression: Ordinary,
GEE, random-intercept (RI), and fixed-intercept (FI)
13.2 Estimates for different kinds of random-effects Poisson
regression: random-intercept (RI) and random-coefficient (RC)
models
13.3 Two approaches for allowing for level-1 overdispersion in
random-intercept models for counts
13.4 Observed and expected numbers of lip-cancer cases and
various SMR estimates (in percentages) for Scottish counties
13.5 Estimates for random-intercept models for Scottish lip-cancer
data
14.1 Expanded data with time-constant and time-varying covariates
for first two assistant professors
14.2 Maximum likelihood estimates for logistic discrete-time hazards
model for promotions of assistant professors
14.3 Maximum likelihood estimates for complementary log–log
models with and without random intercept for Guatemalan-child-
mortality data
15.1 Parametric PH models: Name of density , form of baseline
hazard function and baseline survival function , and
parameters. Parameterization .
15.2 Estimated hazard ratios (HR) for PH models and time ratio (TR)
for accelerated failure-time (AFT) model with associated 95%
confidence intervals
15.3 Estimated hazard ratios for combinations of the spouses’ race
(both spouses White as reference category and adjusted for other
covariates)
15.4 AFT models: Name of density , form of baseline hazard
function and baseline survival function , parameterization,
and parameters
15.5 Hazards implied by Cox models: “Occ.spec. dummies” refers to
model (15.11) with occasion-specific dummy variables, and
“Occ.spec. baselines” refers to model (15.12) with occasion-specific
baseline hazards
15.6 PH models for angina data. Estimated hazard ratios for
treatment ISDN versus placebo with corresponding 95% confidence
intervals reported under “Fixed part” (other estimated regression
coefficients not shown). Estimated variance parameters reported
under “Random part”.
15.7 Conditional or subject-specific hazards implied by Cox model
with random intercept and random treatment effect
15.8 Conditional or subject-specific hazards implied by stratified Cox
model with subject-specific baseline hazards
16.1 Maximum likelihood estimates for three-level random-intercept
logistic model (using 15-point adaptive quadrature in melogit)
16.2 Maximum likelihood estimates for three-level random-intercept
and random-coefficient logistic models
16.3 Salamander-mating data (layout adapted from Vaida and Meng
[2005])
16.4 Different estimates for the salamander-mating data
Figures
1.1 Box plots of salary and log salary by gender
1.2 Histograms of salary and log salary by gender
1.3 Illustration of deviations contributing to total sum of squares
(TSS), model sum of squares (MSS), and sum of squared errors (SSE)
1.4 Scatterplot with LOWESS curve
1.5 Illustration of simple linear regression model
1.6 Illustration of sums of squares for simple linear regression
1.7 Scatterplot with predicted line from simple regression
1.8 Illustration of simple linear regression with a dummy variable
1.9 Illustration of multiple regression with a dummy variable for male
( ) and a continuous covariate, marketc ( )
1.10 Scatterplot with predicted lines from multiple regression
1.11 Estimated densities of marketc for men and women
1.12 Illustration of confounding: Top panel shows conditional
population means for the treatment groups, given (true, data-
generating model); bottom panel shows population means for the
treatment groups, not conditioning on
1.13 Illustration of interaction between male ( ) and yearsdg ( )
for marketc ( ) equal to 0 (not to scale)
1.14 Estimated effect of gender and time since degree on mean
salary for disciplines with mean marketability
1.15 Illustration: Interpretations of coefficients of dummy variables
and for associate and full professors, with assistant professors
as the reference category
1.16 Estimated effects of gender and time since degree on mean
salary for assistant professors in disciplines with mean marketability
1.17 Predicted residuals with overlaid normal distribution
1.18 Illustration of violation of exogeneity: Top panel shows the
structural model, where the errors are correlated with the
treatments; bottom panel shows the estimated regression model
when is omitted
2.1 Examples of clustered data
2.2 First and second measurements of peak-expiratory-flow using
Mini Wright meter versus subject number (the horizontal line
represents the overall mean)
2.3 Illustration of variance-components model for a subject
2.4 Path diagram of random part of random-intercept model
2.5 Illustration of lower intraclass correlation (top) and higher
intraclass correlation (bottom)
2.6 First recording of Mini Wright meter and second recording plus
100 versus subject number (the horizontal line represents the overall
mean)
2.7 Illustration of hierarchical sampling in variance-components
model
2.8 Illustration of nested and crossed factors
2.9 Prior distribution, “likelihood” (normalized), and posterior
distribution for a hypothetical subject with responses with
total residuals and [the vertical lines represent modes
(and means) of the distributions]
3.1 Illustration of random-intercept model for one mother
3.2 Illustration of random-intercept model for one mother
3.3 Illustration of different , , and the corresponding residual
intraclass correlations
3.4 Predictive margins and confidence intervals for birthweight data
3.5 Illustration of different within-cluster and between-cluster effects
of a covariate
3.6 Illustration of different within and between effects for two
clusters having the same value of ( is the within effect and
is the between effect); in the top panel, , whereas
in the bottom panel
3.7 Illustration of assuming 0 between effect for two clusters having
the same value of ( is the within effect and is the between
effect). The top panel is the same as the bottom panel of figure 3.6
with whereas in the bottom panel .
3.8 Histogram of standardized level-1 residuals
3.9 Histogram of standardized level-2 residuals
4.1 Scatterplot of gcse versus lrt for school 1 with ordinary least-
squares regression line
4.2 Trellis of scatterplots of gcse versus lrt with fitted regression
lines for all 65 schools
4.3 Scatterplot of estimated intercepts and slopes for all schools with
at least five students
4.4 Spaghetti plot of ordinary least-squares regression lines for all
schools with at least five students
4.5 Illustration of random-intercept and random-coefficient models
4.6 Perspective plot of bivariate normal distribution
4.7 Cluster-specific regression lines for random-coefficient model,
illustrating lack of invariance under translation of covariate (Source:
Skrondal and Rabe-Hesketh 2004)
4.8 Heteroskedasticity of total residual as function of lrt
4.9 Scatterplots of empirical Bayes (EB) predictions versus maximum
likelihood (ML) estimates of school-specific intercepts (top) and
slopes (bottom); equality of EB and ML shown as dashed reference
lines and ML estimates of 0 shown as solid reference lines
4.10 Spaghetti plots of empirical Bayes (EB) predictions of school-
specific regression lines for the random-intercept model (top) and
the random-coefficient model (bottom)
4.11 Histograms of predicted random intercepts and slopes
4.12 Scatterplot and histograms of predicted random intercepts and
slopes
4.13 Histogram of predicted level-1 residuals
4.14 Caterpillar plot of random-intercept predictions and
approximate 95% confidence intervals versus ranking (school
identifiers shown on top of confidence intervals)
4.15 Stretched caterpillar plot of random-intercept predictions and
approximate 95% confidence intervals versus ranking (school
identifiers shown on top of confidence intervals)
III.1 Box plots of log hourly wages at each occasion
III.2 Trellis graph of trajectories for log hourly wage for 12 randomly
chosen subjects
III.3 Scatterplot of log hourly wages versus occasions; individual
trajectories for 12 randomly chosen subjects (thin solid lines) and
mean trajectory (thick dashed line)
III.4 Lexis diagram for the relationship between age, period, and
cohort
5.1 Path diagram of specified AR(1) dynamic model
5.2 Dynamic model with subject-specific random effect with
responses at four occasions. Data-generating model (left) and naïve
specified model (right) [adapted from Skrondal and Rabe-
Hesketh (2014b)]
6.1 Relationships between covariance structures assuming fixed and
equally spaced occasions; arrows point from a more general model
to a model nested within it
6.2 Estimated residual standard deviations and correlation matrices
from mixed
6.3 Illustration of marginal variances and correlations induced by
random-coefficient models ( , , , , )
6.4 Path diagram of AR(1) process for residuals
6.5 Simulated AR(1) process (top panel) and white noise (bottom
panel) where both processes have the same mean and variance
6.6 Path diagram of MA(1) process for residuals
7.1 Observed growth trajectories for boys and girls
7.2 Illustration of different polynomial functions
7.3 Mean trajectory for boys from quadratic model
7.4 Mean trajectory and 95% range of subject-specific trajectories
for boys from quadratic model
7.5 Trellis graph of observed responses (dots) and fitted trajectories
(dashed lines) for boys
7.6 Trellis graph of observed responses (dots) and predicted
trajectories (dashed lines) from quadratic model for girls
7.7 Illustration of piecewise linear function
with knots at 2 and 6 (top) and corresponding spline basis functions
(bottom)
7.8 Spline basis functions for piecewise-linear model for children’s-
growth data
7.9 Mean trajectory and 95% range of subject-specific trajectories
for boys from piecewise-linear model
7.10 Path diagram of linear growth-curve model with four time
points
7.11 Box plots of reading scores for each grade
7.12 Sample mean growth trajectory for reading score
7.13 Fitted mean trajectory and sample mean trajectory for reading
scores
8.1 Illustration of three-level design
8.2 Scatterplot of peak expiratory flow measured by two methods
versus subject
8.3 Illustration of error components for the three-level variance-
components model for a subject
8.4 Path diagram of random part of three-level model
8.5 Trellis of spaghetti plots for schools in Kenyan-nutrition study,
showing observed growth trajectories
8.6 Box plots of empirical Bayes predictions for random intercepts at
the school level , random intercepts at the child level , and
level-1 residuals at the occasion level
8.7 Bivariate and univariate distributions of empirical Bayes
predictions for random intercepts and random slopes at the
child level
8.8 Trellis of spaghetti plots for schools in Kenyan-nutrition study,
showing predicted growth trajectories for children
8.9 Predicted mean Raven’s scores over time for the four
interventions among boys whose age at baseline was equal to the
average baseline age across both genders
8.10 Path diagrams of equivalent models (left panel: three-stage
formulation of three-level model; right panel: correlated random
effects)
9.1 Sum of the predicted random effects versus time for 10
firms
9.2 Predicted random effect of year
9.3 Normal Q–Q plot for secondary school predictions
9.4 Normal Q–Q plot for primary school predictions
9.5 Model structure and data structure for students in primary
schools crossed with secondary schools (source: Skrondal and Rabe-
Hesketh [2004])
10.1 Predicted probability of working from logistic regression model
(for range of husbinc in dataset)
10.2 Illustration: Predicted probability of working from logistic
regression model, extrapolated beyond the range of husbinc in the
data
10.3 Illustration of equivalence of latent-response and generalized
linear model formulations for logistic regression
10.4 Illustration of equivalence between probit models with change
in residual standard deviation counteracted by change in slope
10.5 Predicted probabilities of working from logistic and probit
regression models for women without children at home
10.6 Bar plot of proportion of patients with toenail infection by visit
and treatment group
10.7 Line plot of proportion of patients with toenail infection by
average time at visit and treatment group
10.8 Proportions and fitted probabilities using ordinary logistic
regression
10.9 Subject-specific probabilities (thin, dashed curves), population-
averaged probabilities (thick, solid curve), and population median
probabilities (thick, dashed curve) for random-intercept logistic
regression
10.10 Gauss–Hermite quadrature: Approximating continuous density
(dashed curve) by discrete distribution (bars)
10.11 Density of (dashed curve), normalized integrand (solid
curve), and quadrature weights (bars) for ordinary quadrature and
adaptive quadrature (source: Rabe-Hesketh, Skrondal, and
Pickles 2002)
10.12 Empirical Bayes modal predictions (circles) and nonmissing
maximum “likelihood” estimates (triangles) versus empirical Bayes
predictions
10.13 Fitted marginal probabilities using ordinary and random-
intercept logistic regression
10.14 Conditional and marginal predicted probabilities for random-
intercept logistic regression model
10.15 Posterior mean probabilities against time for 16 patients in the
control group (a) and treatment group (b) with predictions for
missing responses shown as diamonds
11.1 Illustration of threshold model for categories
11.2 Illustration of three-category ordinal probit model without
covariates
11.3 Illustration of equivalence of latent-response and generalized
linear model formulation for ordinal logistic regression
11.4 Illustration of cumulative and category-specific response
probabilities
11.5 Relevant odds for in a
proportional odds model with four categories. Odds is a ratio of
probabilities of events. Events included in the numerator probability
are in thick frames and events included in the denominator
probability are in thin frames [adapted from Brendan Halpin’s web
notes on “Models for ordered categories (ii)”].
11.6 Illustration of translation and scale invariance in cumulative
probit model
11.7 Cumulative sample proportions versus week
11.8 Cumulative sample logits versus week
11.9 Cumulative sample logits versus square root of week
11.10 Cumulative sample proportions and predicted cumulative
probabilities from ordinal logistic regression versus week
11.11 Marginal category probabilities from random-coefficient
proportional-odds model versus week
11.12 Cumulative sample proportions and cumulative predicted
marginal probabilities from random-coefficient proportional-odds
model versus week
11.13 Area graph analogous to stacked bar chart for marginal
predicted probabilities from random-coefficient proportional-odds
model
11.14 Posterior mean cumulative probabilities for 12 patients in
control group and 12 patients in treatment group versus week
11.15 Relevant odds for different logit link models for ordinal
responses. Events corresponding to “success” are in thick frames,
and events corresponding to “failure” are in thin frames—when not
all categories are shown, the odds are conditional on the response
being in one of the categories shown [adapted from Brendan
Halpin’s web notes on “Models for ordered categories (ii)”].
12.1 Illustration of category probabilities for multinomial logit model
with four categories
12.2 Relevant odds for in a
baseline category logit model with four categories. The outcomes in
the numerator probability are in thick frames, and outcomes included
in the denominator probability are in thin frames [adapted from
Brendan Halpin’s web notes on “Models for ordered categories (ii)”].
12.3 Empirical Bayes (EB) predictions of household-specific
coefficients of pricec and feature
12.4 Posterior mean choice probabilities versus price in cents/oz of
Yoplait for six households. Based on conditional logit model with
response heterogeneity when there is no feature advertising and
when the price of Weight Watchers and Dannon is held constant at 8
cents/oz.
13.1 Map of crude SMR as percentage (Source: Skrondal and Rabe-
Hesketh 2004)
13.2 Map of SMRs assuming normally distributed random intercept
(no covariate) (Source: Skrondal and Rabe-Hesketh 2004)
13.3 Empirical Bayes SMRs versus crude SMRs
13.4 Lip cancer in Scotland: SMRs (locations) and probabilities for
nonparametric maximum likelihood estimate of random-intercept
distribution
VII.1 Illustration of different types of censoring and truncation in
calendar time (top panel) and analysis time (bottom panel). Dots
represent events and arrowheads represent censoring.
14.1 Expansion of original data to person–year data for first two
assistant professors (first one right-censored in year 10, second one
promoted in year 4)
14.2 Discrete-time hazard (conditional probability of promotion given
that promotion has not yet occurred)
14.3 Predicted log odds of promotion given that promotion has not
yet occurred for assistant professors 1 (solid) and 4 (dashed)
14.4 Relevant odds for
in a continuation-ratio logit model with four time intervals. Odds is
the ratio of probabilities of events; events included in numerator
probability are in thick frames, and events included in denominator
probability are in thin frames. [This diagram is very similar to those
in Brendan Halpin’s web notes on “Models for ordered
categories (ii)”.]
14.5 Predicted probability of remaining an assistant professor for
assistant professors 1 (solid) and 4 (dashed)
14.6 Predicted cluster-specific or conditional (median) survival
function and population-averaged or marginal survival function
15.1 Kaplan–Meier survival plot of for divorce data
15.2 Smoothed hazard estimate for divorce data
15.3 Piecewise constant baseline hazard curve
15.4 Kernel-smoothed hazard curves from Cox regression
15.5 Estimated baseline hazard curve from piecewise exponential
model with cubic spline and Cox model
15.6 Hazards for log-normal survival model according to value taken
by sheolder
15.7 Smoothed estimated hazard functions at second exercise test
occasion for treatment and placebo groups
15.8 Estimated baseline hazard functions from Poisson regression
with orthogonal polynomials and Cox regression
15.9 Smoothed estimated conditional hazard functions for the
second exercise test occasion from Cox regression with gamma
frailty evaluated at mean
15.10 Subject-specific or conditional hazard functions (top) and
population-averaged or marginal hazard functions (bottom) at
second exercise test for treatment and placebo groups
15.11 Illustration of risk intervals for total time, counting process,
and gap time. Unrestricted risk sets shown as intersections between
vertical and horizontal lines and restricted risk sets shown as circles
(adapted from Kelly and Lim [2000]).
16.1 Three-level structure of Guatemalan-immunization data
16.2 Empirical Bayes predictions of community-level random slopes
versus community-level random intercepts; based on three-level
random-coefficient logistic regression model
16.3 Predicted median or conditional probabilities of immunization
with random effects set to 0 (solid curves) and marginal probabilities
of immunization (dashed curves). Curves higher up in each graph
correspond to and curves lower down to .
Based on three-level random-coefficient logistic regression model.
16.4 Prior distribution, “likelihood” (normalized), and posterior
distribution for a hypothetical cluster with units with total
residuals and [the vertical lines represent modes (and
means) of the distributions]
16.5 Inverse-gamma prior for with shape and scale parameters
equal to 1;
16.6 Posterior distributions of random effects for three males (top
row) and three females (bottom row) in group 1
16.7 Kernel density estimate of posterior density of random effect for
female 2 in group 1 with normal approximation
16.8 Normal approximations to the conditional posterior (with
Bayesian estimates plugged in for model parameters) and marginal
posteriors of the random effect of female 2 in group 1
Displays
1.1 Log-linear models and multiplicative effects
2.1 How many clusters are needed?
2.2 Wald and score statistics as approximations to likelihood-ratio
statistic
3.1 Matrix expressions for OLS and GLS estimators
3.2 Residual covariance structure for random-intercept model
3.3 Approximate relationship between power, significance level,
effect size, and standard error for two-sided test
III.1 Fixed versus varying occasions and equal versus unequal
spacing of occasions
5.1 Instrumental-variables estimation
5.2 Elasticities
6.1 Covariance structure induced by random-coefficient model
8.1 Asymptotic null distributions for likelihood-ratio testing of
variance components when random effects are uncorrelated
9.1 Brief summary of Grunfeld’s (1958) investment theory
10.1 Partial effects at the average (PEA) and average partial effects
(APE) for the logistic regression model,
, where is
continuous and is binary
12.1 Odds ratio used in this book equals Stata’s relative-risk ratio
12.2 Conditional probability of choosing an alternative given that
exactly one alternative is chosen
12.3 Correlations between utility differences in conditional logit
model with alternative-specific random intercepts
15.1 Demonstration of two central relations in survival analysis:
and
15.2 Kaplan–Meier estimator of survival function
15.3 Demonstration of for PH model
15.4 Partial likelihood and ties in Cox regression
Preface
This book is about applied multilevel and longitudinal modeling.
Other terms for multilevel models include hierarchical models,
random-effects or random-coefficient models, mixed-effects models,
or simply mixed models. Longitudinal data are also referred to as
panel data, repeated measures, or cross-sectional time series. A
popular type of multilevel model for longitudinal data is the growth-
curve model.

The common theme of this book is regression modeling when
data are clustered in some way. In cross-sectional settings, students
may be nested in schools, people in neighborhoods, employees in
firms, or twins in twin-pairs. Longitudinal data are by definition
clustered because multiple observations over time are nested within
units, typically subjects.

Such clustered designs often provide rich information on
processes operating at different levels, for instance, people’s
characteristics interacting with institutional characteristics.
Importantly, the standard assumption of independent observations is
likely to be violated because of dependence among observations
within the same cluster. The multilevel and longitudinal methods
discussed in this book extend conventional regression to handle such
dependence and exploit the richness of the data.

Volume 1 is on multilevel and longitudinal modeling of continuous
responses using linear models. The volume consists of four parts: I.
Preliminaries (a review of linear regression modeling, preparing the
reader for the rest of the book), II. Two-level models, III. Models for
longitudinal and panel data, and IV. Models with nested and crossed
random effects. For readers who are new to multilevel and
longitudinal modeling, the chapters in part II should be read
sequentially and can form the basis of an introductory course on this
topic. A one-semester course on multilevel and longitudinal modeling
can be based on most of the chapters in volume 1 plus chapter 10
on binary or dichotomous responses from volume 2. For this
purpose, we have made chapter 10 freely downloadable from
https://www.stata-press.com/books/mlmus4_ch10.pdf.

Volume 2 is on multilevel and longitudinal modeling of categorical
responses, counts, and survival data. This volume also consists of
four parts: I. Categorical responses (binary or dichotomous
responses, ordinal responses, and nominal responses or discrete
choice), II. Counts, III. Survival (in both discrete and continuous
time), and IV. Models with nested and crossed random effects. Each
chapter starts by introducing models for nonclustered data (for
example, logistic and Poisson regression) and then extends the
models to clustered data by introducing random effects, leading to
generalized linear mixed models. Subsequently, alternatives such as
generalized estimating equations (GEE) and fixed-effects approaches
are discussed. Chapter 10 on binary or dichotomous responses is a
core chapter of this volume and should be read before embarking on
the other chapters. It is also a good idea to read chapter 14 on
discrete-time survival before reading chapter 15 on continuous-time
survival.

Our emphasis is on explaining the models and their assumptions,
applying the methods to real data, and interpreting results. Many of
the issues are conceptually demanding but do not require that you
understand complex mathematics. Therefore, wherever possible, we
introduce ideas through examples and graphical illustrations, keeping
the technical descriptions as simple as possible. Some sections that
go beyond an introductory course on multilevel and longitudinal
modeling are tagged with the ❖ symbol. Derivations that can be
skipped by the reader are given in displays. For an advanced
treatment, placing multilevel modeling within a general latent-
variable framework, we refer the reader to Skrondal and Rabe-
Hesketh (2004), which uses the same notation as this book.
This book shows how all the analyses described can be
performed using Stata. There are many advantages of using a
general-purpose statistical package such as Stata. First, for those
already familiar with Stata, it is convenient not having to learn a new
stand-alone package. Second, conducting multilevel analysis within a
powerful package has the advantage that it allows complex data
manipulation to be performed, alternative estimation methods to be
used, and publication-quality graphics to be produced, all without
having to switch packages. Finally, Stata is a natural choice for
multilevel and longitudinal modeling because it has gradually
become perhaps the most powerful general-purpose statistics
package for such models.

Each chapter is based on one or more research problems and
real datasets. After describing the models, we walk through the
analysis using Stata, pausing to address statistical issues that need
further explanation. Do-files for each chapter can be downloaded
from https://www.stata-press.com/data/mlmus4.html. Some readers
may find it useful to perform the analyses while reading the book.

Stata can be used either via a graphical user interface (GUI) or
through commands. We recommend using commands interactively—
or preferably in do-files—for serious analysis in Stata. For this
reason, and because the GUI is fairly self-explanatory, we use
commands exclusively in this book. However, the GUI can be useful
for learning the Stata syntax. Generally, we use the typewriter font
to refer to Stata commands, syntax, and variables. A “dot” prompt
followed by a command indicates that you can type verbatim what is
displayed after the dot (in context) to replicate the results in the
book. Some readers may find it useful to intersperse reading with
running these commands. We encourage readers to write do-files for
solving the data analysis exercises because this is standard practice
for professional data analysis.

The commands used for data manipulation and graphics are
explained to some extent, but the purpose of this book is not to
teach Stata from scratch. For a basic introduction to Stata, we refer
the reader to Acock (2018). Other books and further resources for
learning Stata are listed at the Stata website.

If you are new to Stata, we recommend running all the
commands given in chapter 1 of volume 1. A list of commands that
are particularly useful for manipulating, describing, and plotting
multilevel and longitudinal data is given in the appendix of volume 1.
Examples using these and other commands can easily be found by
referring to the “commands” entry in the subject index.

We have included applications from a wide range of disciplines,
including medicine, economics, education, sociology, and psychology.
The interdisciplinary nature of this book is also reflected in the
choice of models and topics covered. If a chapter is primarily based
on an application from one discipline, we try to balance this by
including exercises with real data from other disciplines. The two
volumes contain over 140 exercises based on over 100 different real
datasets. Exercises for which solutions are available to readers are
marked with , and the solutions can be downloaded from
https://www.stata-press.com/books/mlmus4-answers.html.
Instructors can obtain solutions to all exercises from Stata Press.

All datasets used in this book are freely available for download;
for details, see https://www.stata-press.com/data/mlmus4.html.
These datasets can be downloaded into a local directory on your
computer. Alternatively, individual datasets can be loaded directly
into net-aware Stata by specifying the complete URL. For example,

If you have stored the datasets in the working directory, omit the
path and just type
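
As a hedged sketch of the two forms just described: the dataset name pefr and the directory https://www.stata-press.com/data/mlmus4/ are assumptions made here for illustration, so check the download page cited above for the actual file names and location.

```stata
* Load a dataset directly over the web by giving the complete URL.
* The directory and the name "pefr" are assumed for illustration only.
use https://www.stata-press.com/data/mlmus4/pefr, clear

* If the datasets were saved in the current working directory,
* the path can be omitted.
use pefr, clear
```

The clear option discards any data already in memory before the new dataset is loaded.
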
We will generally describe all Stata commands that can be used
to fit a given model, discussing their advantages and disadvantages.
An exception to this rule is that we do not discuss our own gllamm
command in volume 1 (see the gllamm companion, downloadable
from http://www.gllamm.org, for how to fit the models of volume 1
in gllamm). In volume 1, we extensively use the Stata commands
xtreg and mixed, and we introduce several more specialized
commands for longitudinal modeling, such as xthtaylor, xtivreg,
and xtabond. The sem command for structural equation modeling is
used for growth-curve modeling.
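
To give a rough idea of the syntax, a random-intercept model could be fit with either xtreg or mixed along the following lines. The variable names y, x, and id are hypothetical placeholders and do not come from the book's datasets.

```stata
* Hypothetical variables: y (continuous response), x (covariate),
* id (cluster identifier).

* Declare the cluster identifier, then fit a random-intercept model
* by maximum likelihood with xtreg.
xtset id
xtreg y x, mle

* The same model with mixed. The part after || names the cluster
* variable; a colon with nothing after it requests a random
* intercept only.
mixed y x || id:, mle
```
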

In volume 2, we use Stata’s xt and me commands for different
response types. For example, we use xtlogit and melogit for binary
responses, meologit for ordinal responses, xtpoisson and mepoisson
for counts, and mestreg for multilevel continuous-time survival
modeling with shared frailties. In chapter 12 on nominal responses,
we use Stata’s new cm (for “choice model”) suite of commands, such
as cmxtmixlogit. gllamm is also used throughout volume 2. We also
discuss commands for marginal models and fixed-effects models,
such as xtgee and clogit. The online reference manuals available
through the help command within Stata provide detailed information
on all the official Stata commands for multilevel and longitudinal
modeling.

The nolog option has been used to suppress the iteration logs
showing the progress of the log likelihood. This option is not shown
in the command line because we do not recommend it to users; we
are using it only to save space.
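
As a sketch of how the xt and me commands and the nolog option look in practice, assume a hypothetical binary response y, covariate x, and cluster identifier id (none of these names are taken from the book's datasets):

```stata
* Hypothetical variables: y (0/1 response), x (covariate),
* id (cluster identifier).

* Random-intercept logistic regression with the xt command.
xtset id
xtlogit y x, re

* A random-intercept logistic model with the me command; nolog
* suppresses the iteration log of the log likelihood.
melogit y x || id:, nolog
```

Leaving nolog off displays the iteration log, which is the usage the text recommends to readers.
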

We assume that readers have a good knowledge of linear
regression modeling, in particular, the use and interpretation of
dummy variables and interactions. However, the first chapter in
volume 1 reviews linear regression and can serve as a refresher.

Errata for different editions and printings of the book can be
downloaded from https://www.stata-
Another random document with
no related content on Scribd:
“Yes; two seasons ago. I can get a new one for three dollars.”
“Not like that.”
“Well, maybe not, but good enough.”
“I’ll let you have it for a dollar and a half,” went on Jack. “That’s
cheap enough.”
“Give you a dollar,” replied Tom quickly, who knew how to bargain.
“All right,” and Jack sighed a little. He had hoped to get enough to
put aside some cash for future emergencies.
Tom passed over the dollar. Then he tried on the glove. It certainly
was a good one.
“Come on in and I’ll treat you to a soda,” he proposed generously, for
he decided that he had obtained a bargain, and could afford to treat.
“Going to the show?” asked Tom, as the two came out of the drug
store.
“Sure. That’s what I sold the glove for.”
“What’s the matter? Don’t your dad send you any money?”
“Yes, he left some for me, but it’s like pulling teeth to get it from old
Klopper. He wouldn’t give me even fifty cents to-night, and he sent
me to my room. But I sneaked out, and I’m going to have some fun.”
“That’s the way to talk! He’s a regular hard-shell, ain’t he?”
“I should say yes! But come on, or maybe we won’t get a good seat.”
“Oh, I got my ticket,” replied Tom. “Besides, I want to take this glove
home. I’ll see you there.”
Jack hastened to the town auditorium, where, occasionally, traveling
theatrical shows played a one-night stand. There was quite a throng
in front of the box office, and Jack was afraid he would not get a
seat, but he managed to secure one well down in front.
The auditorium began to fill up rapidly. Jack saw many of his chums,
and nodded to them. Then he began to study the program. An
announcement on it caught his eye. It was to the effect that during
the entertainment a chance would be given to any amateur
performers in the audience to come upon the stage, and show what
they could do in the way of singing, dancing or in other lines of public
entertaining. Prizes would be given for the best act, it was stated;
five dollars for the first, three for the second, and one for the third.
“Say,” Jack whispered to Tom, who came in just then, “going to try
for any of those prizes?”
“Naw,” replied Tom, vigorously chewing gum. “I can’t do nothin’.
Some of the fellows are, though. Arthur Little is going to recite, and
Sam Parsons is going to do some contortions. Why, do you want to
try?”
“I’d like to.”
“What can you do?”
“My clown act,” replied Jack. “I’ve got some new dancing steps, and
maybe I could win a prize.”
“Sure you could,” replied Tom generously. “Go ahead. I’ll clap real
loud for you.”
“Guess I will,” said Jack, breathing a little faster under the exciting
thought of appearing on a real stage. He had often taken the part of
a clown in shows the boys arranged among themselves, but this
would be different.
“Ah, there goes the curtain!” exclaimed Tom, as the orchestra
finished playing the introduction, and there was a murmur all over
the auditorium, as the first number of the vaudeville performance
started.
CHAPTER III
JACK IS PUNISHED

The show was a fairly good one, and Jack and the other boys, as
well as older persons in the audience, enjoyed the various numbers,
from the singing and dancing, to a one-act sketch.
More than one was anxious, however, for the time to come when the
amateurs would be given a chance. At length the manager came
before the curtain, and announced that those who wished might try
their talents on the audience.
Several of the boys began to call for this or that chum, who, they
knew, could do some specialty.
“Give us that whistling stunt, Jimmy!” was one cry.
“Hey, Sim; here’s a chance to show how far you can jump!” cried
another.
“Speak about the boy on the burning deck!” suggested a third.
“Now we must have quietness,” declared the manager. “Those who
wish to perform may come up here, give me their names, and I will
announce them in turn.”
Several lads started for the stage, Jack included. His chums called
good-naturedly after him as he walked up the aisle.
“I might as well have all the fun I can to-night,” thought our hero.
“When Professor Klopper finds out what I’ve done, if he hasn’t
already, he’ll be as mad as two hornets.”
The boys, and one or two girls, who had stage aspirations, crowded
around the manager, eager to give in their names.
“Now, one at a time, please,” advised the theatrical man. “You’ll each
be given a chance. I may add,” he went on, turning to the audience,
“that the prizes will be awarded by a popular vote, as manifested by
applause. The performer getting the most applause will be
considered to have won the five dollars, and so with the other two
prizes.”
The amateurs began. Some of them did very well, while others only
made laughing stocks of themselves. One of the girls did remarkably
well in reciting a scene from Shakespeare.
At last it came Jack’s turn. He was a little nervous as he faced the
footlights, and saw such a large crowd before him. A thousand eyes
seemed focused on him. But he calmed himself with the thought that
it was no worse than doing as he had often done when taking part in
shows that he and his chums arranged.
While waiting for his turn Jack had made an appeal to the property
man of the auditorium, whom he knew quite well. The man, on
Jack’s request, had provided the lad with some white and red face
paint, and Jack had hurriedly made up as much like a clown as
possible, using one of the dressing-rooms back of the stage for this
purpose. So, when it came his turn to go out, his appearance was
greeted with a burst of applause. He was the first amateur to “make-
up.”
Jack was, naturally, a rather droll lad, and he was quite nimble on his
feet. He had once been much impressed by what a clown did in a
small circus, and he had practiced on variations of that entertainer’s
act, until he had a rather queer mixture of songs, jokes, nimble
dancing and acrobatic steps.
This he now essayed, with such good effect that he soon had the
audience laughing, and, once that is accomplished, the rest is
comparatively easy for this class of work on the stage.
Jack did his best. He went through a lot of queer evolutions, leaped
and danced as if his feet were on springs, and ended with an odd
little verse and a backward summersault, which brought him
considerable applause.
“Jack’ll get first prize,” remarked Tom Berwick to his chums, when
they had done applauding their friend.
But he did not. The performer after him, a young lady, who had
undoubted talent, by her manner of singing comic songs, to the
accompaniment of the orchestra, was adjudged to have won first
prize. Jack got second, and he was almost as well pleased, for the
young lady, Miss Mab Fordworth, was quite a friend of his.
“Well,” thought Jack, as the manager handed him the three dollars,
“here is where I have spending money for a week, anyhow. I won’t
have to see the boys turning up their noses because I don’t treat.”
The amateur efforts closed the performance, and, after Jack had
washed off the white and red paint, he joined his chums.
“Say, Jack,” remarked Tom, “I didn’t know you could do as well as
that.”
“I didn’t, either,” replied Jack. “It was easy after I got my wind. But I
was a bit frightened at first.”
“I’d like to be on the stage,” observed Tom, with something of a sigh.
“But I can’t do anything except catch balls. I don’t s’pose that would
take; would it?”
“It might,” replied Jack good-naturedly.
“Well, come on, let’s get some sodas,” proposed Tom. “It was hot in
there. I’ll stand treat.”
“Seems to me you’re always standing treat,” spoke Jack, quickly. “I
guess it’s my turn, fellows.”
“Jack’s spending some of his prize money,” remarked Charlie
Andrews.
“It’s the first I have had to spend in quite a while,” was his answer.
“Old Klopper holds me down as close as if he was a miser. I’ll be
glad when my dad comes back.”
“Where is he now?” asked Tom.
“Somewhere in China. We can’t find out exactly. I’m getting a bit
worried.”
“Oh, I guess he’s all right,” observed Charlie. “But if you’re going to
stand treat, come on; I’m dry.”
The boys were soon enjoying the sodas, and Jack was glad that he
had the chance to play host, for it galled him to have to accept the
hospitality of his chums, and not do his share. Now, thanks to his
abilities as a clown, he was able to repay the favors.
“Well, I suppose I might as well go in the front door as to crawl in the
window,” thought Jack, as he neared the professor’s house. “He
knows I’m out, for that old maid told him, and he’ll be waiting for me.
I’m in for a lecture, and the sooner it’s over the better. Oh, dear, but I
wish dad and mom were home!”
“Well, young man, give an account of yourself,” said the professor
sharply, when Jack came in. Mr. Klopper could never forget that he
had been a teacher, and a severe one at that. His manner always
savored of the classroom, especially when about to administer a
rebuke.
“I went to the show,” said Jack shortly. “I told you I was going.”
“In other words you defied and disobeyed me.”
“I felt that I had a right to go. I’m not a baby.”
“That is no excuse. I shall report your conduct to your parents. Now
another matter. Where did you get the money to go with?”
“I—I got it.”
“Evidently; but I asked you where. The idea of wasting fifty cents for
a silly show! Did you stop to realize that fifty cents would pay the
interest on ten dollars for a year, at five per cent?”
“I didn’t stop to figure it out, professor.”
“Of course not. Nor did you stop to think that for fifty cents you might
have bought some useful book. And you did not stop to consider that
you were disobeying me. I shall attend to your case. Do you still
refuse to tell me where you got that money?”
“I—I’d rather not.”
“Very well, I shall make some inquiries. You may retire now. I never
make up my mind when I am the least bit angry, and I find myself
somewhat displeased with you at this moment.”
“Displeased” was a mild way of putting it, Jack thought.
“I shall see you in the morning,” went on the professor. “It is
Saturday, and there is no school. Remain in your room until I come
up. I wish to have a serious talk with you.”
Jack had no relish for this. It would not be the first time the professor
had had a “serious talk” with him, for, of late, the old teacher was
getting more and more strict in his treatment of the boy. Jack was
sure his father would not approve of the professor’s method. But Mr.
Allen was far away, and his son was not likely to see him for some
time.
But, in spite of what he knew was in store for him the next morning,
Jack slept well, for he was a healthy youth.
“I suppose he’ll punish me in some way,” he said, as he arose, “but
he won’t dare do very much, though he’s been pretty stiff of late.”
The professor was “pretty stiff” when he came to Jack’s room to
remonstrate with his ward on what he had done. Jack never
remembered such a lecture as he got that day. Then the former
college instructor ended up with:
“And, as a punishment, you will keep to your room to-day and to-
morrow. I forbid you to stir from it, and if I find you trying to sneak
out, as you did last night, I shall take stringent measures to prevent
you.”
The professor was a powerful man, and there was more than one
story of the corporal punishment he had inflicted on rebellious
students.
“But, professor,” said Jack. “I was going to have a practice game of
baseball with the boys to-day. The season opens next week, and I’m
playing in a new position. I’ll have to practice!”
“You will remain in your room all of to-day and to-morrow,” was all
the reply the professor made, as he strode from Jack’s apartment.
CHAPTER IV
DISQUIETING NEWS

“Well, if this ain’t the meanest thing he’s done to me yet!” exclaimed
Jack, as the door closed on the retreating form of his crusty
guardian. “This is the limit! The boys expect me to the ball game,
and I can’t get there. That means they’ll put somebody else in my
place, and maybe I’ll have to be a substitute for the rest of the
season. I’ve a good notion——”
But so many daring thoughts came into Jack’s mind that he did not
know which one to give utterance to first.
“I’ll not stand it,” he declared. “He hasn’t any right to punish me like
this, for what I did. He had no right to keep me in. I’ll get out the
same way I did before.”
Jack looked from the window of his room. Below it, seated on a
bench, in the shade of a tree, was the professor, reading a large
book.
“That way’s blocked,” remarked the boy. “He’ll stay there all day,
working out problems about how much a dollar will amount to if put
out at interest for a thousand years, or else figuring how long it will
take a man to get to Mars if he traveled at the rate of a thousand
miles a minute, though what in the world good such knowledge is I
can’t see.
“But I can’t get out while he’s on guard, for he wouldn’t hesitate to
wallop me. And when he comes in to breakfast his sister will relieve
him. I am certainly up against it!
“Hold on, though! Maybe he forgot to bolt the door!”
It was a vain hope. Though Jack had not heard him do it, the
professor had softly slid the bolt across as he went out of the boy’s
room, and our hero was practically a prisoner in his own apartment.
And this on a beautiful Saturday, when there was no school and
when the first practice baseball game of the season was to be
played. Is it any wonder that Jack was indignant?
“It’s about time they brought me something to eat,” he thought, as he
heard a clock somewhere in the house strike nine. “I’m getting
hungry.”
He had little fear on the score that the professor would starve him,
for the old college instructor was not quite as mean as that, and, in a
short time, Miss Klopper appeared with a tray containing Jack’s
breakfast.
“I should think you would be ashamed of yourself,” she said. “The
idea of repaying my brother’s kindness by such acts! You are a
wicked boy!”
Jack wondered where any special kindness on the part of the
professor came in, but he did not say anything to the old maid whose
temper was even more sour than her brother’s. Since his parents
had left him with the professor, Jack had never been treated with real
kindness. Perhaps Mr. Klopper did not intend to be mean, but he
was such a deep student that all who did not devote most of their
time to study and research earned his profound contempt. While
Jack was a good boy, and a fairly good student, he liked sports and
fun, and these the professor detested. So, when he found that his
ward did not intend to apply himself closely to his books, Professor
Klopper began “putting the screws on,” as Jack termed it.
Matters had gone from bad to worse, until the boy was now in a
really desperate state. His naturally good temper had been spoiled
by a series of petty fault-findings, and he had been so hedged about
by the professor and his sister that he was ripe for almost anything.
All that day he remained in his room, becoming more and more
angry at his imprisonment as the hours passed.
“The boys are on the diamond now,” he said, as he heard a clock
strike three. “They’re practicing, and soon the game will start. Gee,
but I wish I was there! But it’s no use.”
Another try at the door, and a look out of his window convinced him
of this. The professor was still on guard, reading his big book.
Toward dusk the professor went in, as he could see no longer. But,
by that time Jack had lost all desire to escape. He resolved to go to
bed, to make the time pass more quickly, though he knew he had
another day of imprisonment before him. Sunday was the occasion
for long rambles in the woods and fields with his chums, but he knew
he would have to forego that pleasure now. He almost hoped it
would rain.
As he was undressing there came a hurried knock on his door.
“What is it?” he asked.
“My brother wants to see you at once, in his study,” said Miss
Klopper.
“Oh, dear,” thought Jack. “Here’s for another lecture.”
There was no choice but to obey, however, for Mr. Allen, in his last
injunction to his son, had urged him to give every heed to his
guardian’s requests.
He found the professor in his study, with open books piled all about
on a table before which he sat. In his hand Mr. Klopper held a white
slip of paper.
“Jack,” he said, more kindly than he had spoken since the trouble
between them, “I have here a telegram concerning your father and
mother.”
“Is it—is it bad news?” asked the boy quickly, for something in the
professor’s tone and manner indicated it.
“Well, I—er—I’m sorry to say it is not good news. It is rather
disquieting. You remember I told you I cabled to the United States
Consul in Hong Kong concerning your parents, when several days
went by without either of us hearing from them.”
“What does he say?”
“His cablegram states that your parents went on an excursion
outside of Hong Kong about two weeks ago, and no word has been
received from them since.”
“Are they—are they killed?”
“No; I do not think so. The consul adds that as there have been
disturbances in China, it is very likely that Mr. and Mrs. Allen,
together with some other Americans, have been detained in a
friendly province, until the trouble is over. I thought you had better
know this.”
“Do you suppose there is any danger?”
“I do not think so. There is no use worrying, though I was a little
anxious when I had no word from them. We will hope for the best. I
will cable the consul to send me word as soon as he has any
additional news.”
“Poor mother!” said Jack. “She’s nervous, and if she gets frightened
it may have a bad effect on her heart.”
“Um,” remarked the professor. He had little sympathy for ailing
women. “In view of this news I have decided to mitigate your
punishment,” he added to Jack. “You may consider yourself at liberty
to-morrow, though I shall expect you to spend at least three hours in
reading some good and helpful book. I will pick one out for you. It is
well to train our minds to deep reading, for there is so much of the
frivolous in life now-a-days, that the young are very likely to form
improper thinking habits. I would recommend that you spend an hour
before you retire to-night, in improving yourself in Latin. Your
conjugation of verbs was very weak the last time I examined you.”
“I—I don’t think I could study to-night,” said Jack, who felt quite
miserable with his enforced detention in the house, and the
unpleasant news concerning his parents. “I’d be thinking so much
about my father and mother that I couldn’t keep my attention on the
verbs,” he said.
“That indicates a weak intellect,” returned the professor. “You should
labor to overcome it. However, perhaps it would be useless to have
you do any Latin to-night. But I must insist on you improving in your
studies. Your last report from the academy was very poor.”
Jack did not answer. With a heavy heart he went to his room, where
he sat for some time in the dark, thinking of his parents in far-off
China.
“I wish I could go and find them,” he said. “Maybe they need help. I
wonder if the professor’d let me go?”
But, even as that idea came to him, he knew it would be useless to
propose it to Mr. Klopper.
“He’s got enough of money that dad left for my keep, to pay my
passage,” the boy mused on. “But if I asked for some for a
steamship ticket he’d begin to figure what the interest on it for a
hundred years would be, and then he’d lecture me about being a
spendthrift. No, I’ll have to let it go, though I do wish I could make a
trip abroad. If I could only earn money enough, some way, I’d go to
China and find dad and mom.”
But even disquieting and sad thoughts can not long keep awake a
healthy lad, and soon Jack was slumbering. He was up early the
next morning, and, as usual, accompanied the professor to church.
The best part of the afternoon he was forced to spend in reading a
book on what boys ought to do, written by an old man who, if ever he
was a healthy, sport-loving lad, must have been one so many years
ago that he forgot that he ever liked to have fun once in a while.
Jack was glad when night came, so he could go to bed again.
“To-morrow I’ll see the boys,” he thought to himself. “They’ll want to
know why I didn’t come to play ball, and I’ll have to tell them the real
reason. I’m getting so I hate Professor Klopper!”
If Jack had known what was to happen the next day, he probably
would not have slept so soundly.
CHAPTER V
A SERIOUS ACCUSATION

“Hey, Jack, where were you Saturday?” asked Tom Berwick, as our
hero came into the school yard Monday morning. “We had a dandy
game,” he went on. “Your catching glove is nifty!”
“Yes, Fred Walton played short,” added Sam Morton. “We waited as
long as we could for you. What was the matter?”
“The professor made me stay home because I skipped out the night
before to go to the show.”
“Say, he’s a mean old codger,” was Tom’s opinion, which was
echoed by several other lads.
“Is Fred going to play shortstop regularly?” asked Jack, of Tom
Berwick, who was captain of the Academy nine.
“I don’t know. He wants to, but I’d like to have you play there, Jack.
Still, if you can’t come Saturdays——”
“Oh, I’ll come next Saturday all right. Can’t we have a little practice
this afternoon?”
“Sure. You can play then, if you want to. Fred has to go away, he
said.”
The boys had a lively impromptu contest on the diamond when
school closed that afternoon, and Jack proved himself an efficient
player at shortstop. It was getting dusk when he reached the
professor’s house, and the doughty old college instructor was
waiting for him.
“Did I not tell you to come home early, in order that I might test you in
algebra?” he asked Jack.
“Yes, sir. But I forgot about it,” which was the truth for, in the
excitement over the game, Jack had no mind for anything but
baseball.
“Where were you?” went on Mr. Klopper.
“Playing ball.”
“Playing ball! An idle, frivolous amusement. It tends to no good, and
does positive harm. I have no sympathy with that game. It gives no
time for reflection. I once watched a game at the college where I
used to teach. I saw several men standing at quite some distance
from the bare spot where one man was throwing a ball at another,
with a stick in his hand.”
“That was the diamond,” volunteered Jack, hoping the professor
might get interested in hearing about the game, and so forego the
lecture that was in prospect.
“Ah, a very inappropriate name. Such an utterly valueless game
should not be designated by any such expensive stone as a
diamond. But what I was going to say was that I saw some of the
players standing quite some distance from the bare spot——”
“They were in the outfield, professor. Right field, left field and centre.”
“One moment; I care nothing about the names of the contestants. I
was about to remark that those distant players seemed to have little
to do with the game. They might, most profitably, have had a book
with them, to study while they were standing there, but they did not.
Instead they remained idle—wasting their time.”
“But they might have had to catch a ball any moment.”
“Nonsense!” exclaimed the professor. “It is an idle frivolous
amusement, and I regret very much that you wasted your valuable
time over it. After supper I want to hear you read some Virgil, and
also do some problems in geometry. I was instructed by your father
to see that your education was not neglected, and I must do my duty,
no matter how disagreeable it is.”
Jack sighed. He had studied hard in class that day, and now to be
made to put in the evening over his books he thought was very
unfair.
But there was no escape from the professor, and the boy had to put
in two hours at his Latin and mathematics, which studies, though
they undoubtedly did him good, were very distasteful to him.
“You are making scarcely any progress,” said the professor, when
Jack had failed to properly answer several of his questions. “I want
you to come home early from school to-morrow afternoon, and I will
give you my undivided attention until bedtime. I am determined that
you shall learn.”
Jack said nothing, but he did not think it would be wise to go off
playing ball the next afternoon, though the boys urged him strongly.
“Why don’t you write and tell your dad how mean old Klopper is
treating you?” suggested Tom, when Jack explained the reason for
going straight home from his classes.
“I would if I knew how to reach him. But I don’t know where he is,”
and Jack sighed, for he was becoming more and more alarmed at
the long delay in hearing from his father.
But Jack was destined to do no studying that afternoon under the
watchful eye of Professor Klopper. He had no sooner entered the
house than he was made aware that something unusual had
happened.
“My brother is waiting for you in the library,” said Miss Klopper, and
Jack noticed that she was excited over something.
“Maybe it’s bad news about the folks,” the boy thought, but when he
saw that the professor had no cablegram, he decided it could not be
that.
“Jack,” began the aged teacher, “I have a very serious matter to
speak about.”
“I wonder what’s coming now?” thought the boy.
“Do you recall the night you disobeyed me, and, sneaking out of your
window like a thief, you went to a—er—a theatrical performance
without my permission?” asked the professor.
“Yes, sir,” replied Jack, wondering if his guardian thought he was
likely to forget it so soon.
“Do you also recollect me asking you where you got the money
wherewith to go?”
“Yes, sir.”
“I now, once more, demand that you tell me where you obtained it,
and, let me warn you that it is serious. I insist that you answer me.
Where did you get that money?”
“I—I don’t want to tell you, Professor Klopper.”
“Are you afraid?”
“No, sir,” came the indignant answer, for there were few things of
which Jack Allen was afraid.
“Then why don’t you tell me?”
“Because I don’t think you have a right to know everything that I do. I
am not a baby. I assure you I got that money in a perfectly legitimate
way.”
“Oh, you did?” sneered the professor. “We shall see about that.
Come in,” he called, and, to Jack’s surprise the door opened and
Miss Klopper entered the library.
“I believe you have something to say on a subject that interests all
present,” went on the professor, in icy tones.
“She knows nothing of where I got the money,” said Jack.
“We shall see,” remarked Mr. Klopper. “You may tell what you know,”
he added to his sister.
“I saw Jack just as he got down out of his window,” Miss Klopper
stated, as if she was reciting a lesson. “He had a bundle with him. I
asked what it was and he would not tell me.”
“Is that correct?” inquired the former teacher.
“Yes, sir,” replied Jack, wondering how the professor could be
interested in his catching glove, which was what the bundle had
contained.
“What was in that package?” went on the professor.
“I—I don’t care to tell, sir.”
“I insist that you shall. Once again, I warn you that it is a very serious
matter.”
Jack could not quite understand why, so he kept silent.
“Well, are you going to tell me?”
“No, sir.”
Jack had no particular reason for not telling, but he had made up his
mind that the professor had no right to know, and he was not going
to give in to him.
“This is your last chance,” warned his guardian. “Are you going to tell
me?”
“No, sir.”
“Then I will tell you what was in that package. It was my gold loving
cup, that the teachers of Underhill College presented to me on the
occasion of my retirement from the faculty of that institution!”
“Your loving cup?” repeated Jack in amazement, for that cup was
one of the professor’s choicest possessions, and quite valuable.
“Yes, my loving cup. You had it in that bundle, and you took it out to
pawn it, in order to get money to go to that show.”
“That’s not true!” cried Jack indignantly. “All I had in that bundle was
my catching glove, which I sold to Tom Berwick.”
“I don’t believe you,” said the professor stiffly. “I say you stole my
loving cup and pawned it. The cup is gone from its accustomed
place on my dresser. I did not miss it until this afternoon, and, when I
asked my sister about it, she said she had not seen it. Then she
recalled your sneaking away from the house with a bundle, and I at
once knew what had become of it.”
“You couldn’t know, for there is absolutely no truth in this
accusation,” replied Jack hotly.
“Do you mean to say that I am telling an untruth?” asked the
professor sharply. “I say that you took my cup.”
“And I say that I didn’t! I never touched your cup! If it’s gone some
one else took it!”
Jack spoke in loud and excited tones.
“Don’t you dare contradict me, young man!” thundered the former
teacher. “I will not permit it. I say you took that cup! I know you did!”
“I didn’t!” cried Jack.
The professor was so angry that he took a step toward the lad. He
raised his hand, probably unconsciously, as though to deal Jack a
box on the ear, for this was the old teacher’s favorite method of
correcting a refractory student.
Jack, with the instinct of a lad who will assume a defensive attitude
on the first sign of an attack, doubled up his fists.
“What! You dare attempt to strike me?” cried the professor. “You
dare?”
“I’m not going to have you hit me,” murmured Jack. “You are making
an unjust charge. I never took that cup. I can prove what I had in that
package by Tom Berwick.”
“I do not believe you,” went on the professor. “I know you pawned
that cup to get spending money, because I refused to give you any to
waste. I will give you a chance to confess, and tell me where you
disposed of it, before I take harsh measures.”
Jack started. What did the professor mean by harsh measures?
“I can’t confess what I did not do,” he said, more quietly. “I never took
the loving cup.”
“And I say you did!” cried the old teacher, seeming to lose control of
himself. “I say you stole it, and I’ll have you arrested, you young
rascal! Go to your room at once, and remain there until I get an
officer. We’ll see then whether you’ll confess or not. I’ll call in a
policeman at once. See that he does not leave the house,” he added
to his sister, as he hurried from the room.
Jack started from the library.
“Where are you going?” asked Miss Klopper, placing herself in his
path. She was a large woman, and strong.
“I am going to my room,” replied Jack, sore at heart and very
miserable over the unjust accusation.
