
Biometrika (2023), 110, 4, pp. 867–869 https://doi.org/10.1093/biomet/asad043

Discussion of ‘Statistical inference for streamed longitudinal data’
By YANG NING AND JINGYI DUAN

Downloaded from https://academic.oup.com/biomet/article/110/4/867/7424087 by East China Normal University user on 02 June 2024
Department of Statistics and Data Science, Cornell University,
1188 Comstock Hall, Ithaca, New York 14853, U.S.A.
yn265@cornell.edu jd2222@cornell.edu

1. Introduction
We congratulate the authors on their contributions to statistical analysis for streamed
longitudinal data. Streamed data occur frequently in many modern biomedical areas, such as mobile
health, and have drawn increasing attention in recent years. Luo et al. (2023) developed a new
statistical inference framework for streamed longitudinal data via the quadratic inference function
approach. The proposed method avoids retrospective calculations using the individual-level data in
the previous data batches, and therefore reduces the memory and computational burdens of storing
and analysing the entire cumulative dataset.
In their work, the authors focused on the case in which the dimension of the regression parameter
β is small and fixed. In this discussion, we comment on statistical inference for high-dimensional
streamed longitudinal data. There is a large literature on how to construct confidence intervals and
hypothesis tests for high-dimensional generalized linear models via debiased or decorrelated
estimators; see, for instance, Javanmard & Montanari (2014), van de Geer et al. (2014), Zhang &
Zhang (2014), Belloni et al. (2016), Cai & Guo (2017), Ning & Liu (2017), Ning et al. (2017) and
Neykov et al. (2018). Recently, Fang et al. (2020) further extended the decorrelated method to deal
with longitudinal data in the offline setting via the quadratic inference function. In the following,
we briefly outline the main idea of how the decorrelated method can be adapted to make inference for
high-dimensional streamed longitudinal data.

2. High-dimensional estimation for streamed longitudinal data


Following the same notation as Luo et al. (2023), we assume that the batch-specific parameter
βj ∈ ℝᵖ is high dimensional, i.e., p ≫ max(nj, m) for j = 1, …, b. For ease of presentation, we
focus on the derivation with two data batches Di1 and Di2, where Di2 arrives after Di1. One
possible approach for estimating β1 and β2 is via the penalized quasilikelihood under the working
independence assumption. In particular, the estimate of β1 from the first data batch Di1 can be
obtained as

$$
\hat\beta_1 = \arg\min \left\{ -\sum_{i=1}^{m} \sum_{k=1}^{n_j} \int_{Y_{ik1}}^{h(X_{i,k1}^{T}\beta_1)} \frac{Y_{ik1}-\mu}{V(\mu)}\, d\mu + P_\lambda(\beta_1) \right\}, \tag{1}
$$

where Pλ(β1) is a penalty function encouraging sparsity of β1, e.g., the lasso penalty
Pλ(β1) = λ‖β1‖1, and λ is a tuning parameter. Once the second data batch Di2 arrives, we only need
to estimate β2 by minimizing the penalized quasilikelihood loss with data Di2 and keep the same
estimator β̂1, since the quasilikelihood function with the combined data (Di1, Di2) is decomposable.
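As a rough illustration of the batch-by-batch estimation described above, the following sketch
specializes (1) to the identity link with V(μ) = 1, in which case the quasilikelihood term reduces
to the least-squares loss Σ(Y − Xᵀβ)²/2 and (1) becomes a lasso problem. The function names, the
1/N scaling of the loss, the proximal gradient solver and the toy data are all assumptions made
for this example; none of them comes from the discussion itself.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_working_independence(X, y, lam, n_iter=500):
    """Minimize (1/(2N)) * ||y - X b||_2^2 + lam * ||b||_1 by proximal gradient.

    X stacks all N rows (subjects times repeated measurements) of one batch;
    the rows are treated as independent (working independence).
    """
    N, p = X.shape
    step = N / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / N
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Hypothetical toy data: two batches with their own sparse parameters.
rng = np.random.default_rng(0)
N, p = 1000, 100
beta1 = np.zeros(p); beta1[:5] = 1.0
beta2 = np.zeros(p); beta2[5:10] = 1.0

# Batch 1 arrives: estimate beta1 from batch 1 only.
X1 = rng.standard_normal((N, p))
y1 = X1 @ beta1 + rng.standard_normal(N)
beta1_hat = lasso_working_independence(X1, y1, lam=0.1)

# Batch 2 arrives: estimate beta2 from batch 2 only; beta1_hat is left untouched
# and the raw data of batch 1 are not needed again.
X2 = rng.standard_normal((N, p))
y2 = X2 @ beta2 + rng.standard_normal(N)
beta2_hat = lasso_working_independence(X2, y2, lam=0.1)
```

Because the loss is a sum over batches and each batch has its own parameter, the two calls are
independent of each other, which is the decomposability property used in the text.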


© The Author(s) 2023. Published by Oxford University Press on behalf of the Biometrika Trust.
All rights reserved. For permissions, please email: journals.permissions@oup.com
Table 1. Estimation errors of the penalized quasilikelihood estimators β̂ and the
penalized quadratic inference function estimators β̃

            L1 error              L2 error              L∞ error
            ‖β̂ − β‖1  ‖β̃ − β‖1   ‖β̂ − β‖2  ‖β̃ − β‖2   ‖β̂ − β‖∞  ‖β̃ − β‖∞
ρ = 0.25    0.75      1.26       0.32      0.41       0.21      0.19
ρ = 0.4     0.72      1.22       0.31      0.40       0.20      0.18
ρ = 0.6     0.89      1.26       0.32      0.42       0.20      0.20
ρ = 0.75    1.57      1.57       0.36      0.52       0.23      0.25

To account for the dependence of the longitudinal measurements on the same subject, an alternative
approach is based on the quadratic inference function. Given the first data batch Di1, when the
dimension of β1 is fixed and small, the estimator is given by the solution of the estimating
equation S1ᵀ(β1){V1(β1)}⁻¹U1(β1) = 0. However, directly imposing regularization on the estimating
equation does not work for high-dimensional data, for two reasons. First, the dimension of the
matrix V1(β1) scales with p, so the matrix may not be invertible in high dimensions. Second, the
estimating equation is highly nonlinear as a function of β1, so that solving a properly regularized
estimating equation is complicated in practice. To overcome these difficulties, inspired by
Luo et al. (2023), we can approximate the quadratic inference function Q1(β1) by its second-order
Taylor expansion at β̂1, Q̃1(β1) := Q1(β̂1) − 2G1(β̂1)(β1 − β̂1) + (β1 − β̂1)ᵀH1(β̂1)(β1 − β̂1), where
G1(β1) = S1ᵀ(β1)M1U1(β1), H1(β1) = S1ᵀ(β1)M1S1(β1) and β̂1 is the penalized quasilikelihood
estimator (1). Here, M1 is a weight matrix; for example, we can set M1 = I or take M1 to be an
approximate inverse of V1(β̂1). Therefore, we can estimate β1 by

$$
\tilde\beta_1 = \arg\min \{ \tilde Q_1(\beta_1) + P_\lambda(\beta_1) \}. \tag{2}
$$

Similarly, when the second data batch Di2 is available, we can approximate the cumulative quadratic
inference function Q2(β) by its second-order Taylor expansion at β̂ = (β̂1, β̂2), Q̃2(β) := Q2(β̂) −
2G2(β̂)(β − β̂) + (β − β̂)ᵀH2(β̂)(β − β̂), where G2(β) = S2ᵀ(β)M2U2(β), H2(β) = S2ᵀ(β)M2S2(β)
and M2 is another weight matrix. Via the Taylor expansion, we avoid the retrospective calculations
of U1(β) and S1(β) with the individual-level data Di1. Given the loss function Q̃2(β), we can
similarly define the regularized estimator as in (2).
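Since the surrogate loss in (2) is a quadratic function plus a penalty, it can be minimized with
standard tools. The sketch below uses proximal gradient descent with the lasso penalty; the
solver, the function names and the toy inputs are assumptions chosen for illustration, and the
vector G and matrix H are taken as given rather than computed from data.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def penalized_quadratic_surrogate(G, H, beta_hat, lam, n_iter=1000):
    """Minimize -2 G'(b - beta_hat) + (b - beta_hat)' H (b - beta_hat) + lam * ||b||_1.

    The constant Q(beta_hat) is dropped because it does not change the minimizer.
    H is assumed symmetric, positive semidefinite and nonzero.
    """
    lipschitz = 2.0 * np.linalg.eigvalsh(H).max()  # smooth part has Hessian 2H
    step = 1.0 / lipschitz
    b = beta_hat.copy()
    for _ in range(n_iter):
        grad = -2.0 * G + 2.0 * H @ (b - beta_hat)
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Toy check with H = I and G = 0: the problem separates across coordinates and
# the minimizer is the soft-thresholding of beta_hat at level lam / 2.
beta_hat = np.array([1.0, 0.2, -0.6])
beta_tilde = penalized_quadratic_surrogate(
    G=np.zeros(3), H=np.eye(3), beta_hat=beta_hat, lam=0.5
)
```

Only G, H and the expansion point enter the computation, so the individual-level data used to
form them need not be revisited when solving the penalized problem.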
To compare the performance of the penalized quasilikelihood estimators and the penalized quadratic
inference function estimators, we consider the following simulation study. We generate the
response variables from the linear model yi,kj = xi,kjᵀβj + εi,kj, where xi,kj is a p-dimensional
covariate vector for subject i at the kth observation in batch j, k = 1, …, nj, j = 1, …, b,
i = 1, …, m, with m = 50, nj = 20, b = 2 and p = 100. For the covariate effects βj, we randomly
select s = 5 of the p components and set their values to 1, with the remaining entries set to 0.
We generate the covariate xi,kj ∼ N(0, Σ1), where the (ℓ1, ℓ2)th element of Σ1 equals ρ^|ℓ1 − ℓ2|,
and ρ = 0.25, 0.4, 0.6 or 0.75. For subject i at batch j, (εi,1j, …, εi,njj) ∼ N(0, Σ2), where the
(ℓ1, ℓ2)th element of Σ2 also equals ρ^|ℓ1 − ℓ2|. Table 1 shows the L1, L2 and L∞ errors of the
penalized quasilikelihood estimators β̂ and the penalized quadratic inference function estimators
β̃, averaged over 30 replicates. One interesting result is that the use of the quadratic inference
function does not generally reduce the estimation error compared to the quasilikelihood-type
estimator. One possible reason is that the current quadratic inference function estimator is
implemented with M1 and M2 being the identity matrix, which does not correspond to the optimal
combinations of U1(β1) and U2(β). The quadratic inference function estimator may be improved by
using a more informative weight matrix, although using the approximate inverse of V1(β̂1) can be
computationally very expensive. We leave this point for future investigation.
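The data-generating design above can be sketched as follows. This is only a reading of the
description, not the authors' simulation code: the function names are hypothetical, covariate
vectors are assumed independent across observations and subjects, and the support of βj is
assumed to be drawn separately for each batch.

```python
import numpy as np

def ar1_cov(d, rho):
    """d x d matrix whose (l1, l2) entry equals rho ** |l1 - l2|."""
    idx = np.arange(d)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def simulate_batch(m, n, p, s, rho, rng):
    """Simulate one batch: m subjects, n repeated measurements, p covariates."""
    beta = np.zeros(p)
    beta[rng.choice(p, size=s, replace=False)] = 1.0  # s randomly chosen ones
    # Covariates: each p-vector is N(0, Sigma_1); result has shape (m, n, p).
    X = rng.multivariate_normal(np.zeros(p), ar1_cov(p, rho), size=(m, n))
    # Errors: each subject's n-vector is N(0, Sigma_2); result has shape (m, n).
    eps = rng.multivariate_normal(np.zeros(n), ar1_cov(n, rho), size=m)
    y = X @ beta + eps
    return X, y, beta

def estimation_errors(beta_est, beta_true):
    """Return the L1, L2 and L-infinity errors reported in Table 1."""
    d = beta_est - beta_true
    return np.abs(d).sum(), np.sqrt((d ** 2).sum()), np.abs(d).max()

rng = np.random.default_rng(1)
X, y, beta = simulate_batch(m=50, n=20, p=100, s=5, rho=0.25, rng=rng)
```

Calling `simulate_batch` once per batch and passing the fitted and true parameters to
`estimation_errors` would give one replicate of the three error measures.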
3. Quadratic decorrelated inference function
In this section, we briefly discuss how to make inference on a component of βj via the quadratic
decorrelated inference function (Fang et al., 2020). Assume that we can partition βj as (θj, γjᵀ)ᵀ,
where θj ∈ ℝ is the parameter of interest and γj ∈ ℝᵖ⁻¹ is the nuisance parameter. Let
θ = (θ1, θ2) and γ = (γ1ᵀ, γ2ᵀ)ᵀ. Given the first data batch Di1, the inferential methods proposed
in Fang et al. (2020) can be directly applied to construct confidence intervals and hypothesis tests
for θ1. When the second data batch Di2 is available, the loss function Q̃2(β) via the second-order
Taylor expansion can be used to construct the decorrelated inference function. Following Ning & Liu
(2017), the decorrelated inference function for θ is defined as
g(θ) = ∇θQ̃2(θ, γ̂) − ŵᵀ∇γQ̃2(θ, γ̂), where

$$
\hat w = \arg\min \|w\|_1 \quad \text{such that} \quad \left\| \nabla^2_{\theta\gamma} \tilde Q_2(\hat\beta) - w^{T} \nabla^2_{\gamma\gamma} \tilde Q_2(\hat\beta) \right\|_\infty \le \eta
$$

with a proper tuning parameter η, ∇θ denoting the derivative with respect to θ and ∇²θγ denoting
the mixed derivatives with respect to θ and γ. The decorrelated estimator θ̂d is obtained by solving
g(θ) = 0. We expect that, under similar regularity conditions as in Luo et al. (2023), the estimator
θ̂d is asymptotically normal and that the asymptotic variance can be consistently estimated.
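To make the construction concrete, the sketch below works through a simplified case in which the
parameter of interest is a single scalar stored in coordinate 0 and γ̂ is taken to be the nuisance
block of the expansion point β̂; both are assumptions of this example. Because the surrogate loss
is quadratic with gradient −2G + 2H(β − β̂) and Hessian 2H, the equation g(θ) = 0 is linear in θ
and has the closed-form root θ̂ + (Gθ − ŵᵀGγ)/(Hθθ − ŵᵀHγθ). For the direction ŵ, the code does
not solve the constrained ℓ1 minimization in the display; it swaps in a lasso-type problem whose
optimality conditions make the solution feasible for the same sup-norm constraint, although it is
not necessarily the feasible point of smallest ℓ1 norm. All function names and toy inputs are
hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def decorrelation_direction(H, eta, n_iter=5000):
    """Lasso-type surrogate: minimize w' H_gg w - 2 H_gt' w + eta * ||w||_1.

    Its optimality conditions give ||2 H_tg - 2 w' H_gg||_inf <= eta, i.e. the
    solution satisfies the constraint written in terms of the Hessian 2H.
    """
    H_gg, H_gt = H[1:, 1:], H[1:, 0]
    step = 1.0 / (2.0 * np.linalg.eigvalsh(H_gg).max())
    w = np.zeros(H_gg.shape[0])
    for _ in range(n_iter):
        grad = 2.0 * (H_gg @ w - H_gt)
        w = soft_threshold(w - step * grad, step * eta)
    return w

def decorrelated_function(theta, G, H, beta_hat, w):
    """g(theta) for the quadratic surrogate, with gamma fixed at beta_hat[1:]."""
    delta = np.zeros_like(beta_hat)
    delta[0] = theta - beta_hat[0]
    grad = -2.0 * G + 2.0 * H @ delta  # gradient of the surrogate at (theta, gamma_hat)
    return grad[0] - w @ grad[1:]

def decorrelated_estimate(G, H, beta_hat, w):
    """Closed-form root of g(theta) = 0."""
    return beta_hat[0] + (G[0] - w @ G[1:]) / (H[0, 0] - w @ H[1:, 0])

# Hypothetical toy inputs standing in for G_2, H_2 and the expansion point.
rng = np.random.default_rng(2)
A = rng.standard_normal((200, 6))
H = A.T @ A / 200
G = 0.1 * rng.standard_normal(6)
beta_hat = rng.standard_normal(6)
eta = 0.05

w_hat = decorrelation_direction(H, eta)
theta_d = decorrelated_estimate(G, H, beta_hat, w_hat)
```

In high dimensions one would more likely solve the constrained problem directly with a linear
programming solver; the lasso-type surrogate is used here only to keep the example short.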

Acknowledgement
Ning was supported by the National Science Foundation and National Institutes of Health.

REFERENCES
Belloni, A., Chernozhukov, V. & Wei, Y. (2016). Post-selection inference for generalized linear models with
many controls. J. Bus. Econ. Statist. 34, 606–19.
Cai, T. T. & Guo, Z. (2017). Confidence intervals for high-dimensional linear regression: minimax rates and
adaptivity. Ann. Statist. 45, 615–46.
Fang, E. X., Ning, Y. & Li, R. (2020). Test of significance for high-dimensional longitudinal data. Ann. Statist.
48, 2622.
Javanmard, A. & Montanari, A. (2014). Confidence intervals and hypothesis testing for high-dimensional
regression. J. Mach. Learn. Res. 15, 2869–909.
Luo, L., Wang, J. & Hector, E. C. (2023). Statistical inference for streamed longitudinal data. Biometrika 110,
841–58.
Neykov, M., Ning, Y., Liu, J. S. & Liu, H. (2018). A unified theory of confidence regions and testing for high-
dimensional estimating equations. Statist. Sci. 33, 427–43.
Ning, Y. & Liu, H. (2017). A general theory of hypothesis tests and confidence regions for sparse high
dimensional models. Ann. Statist. 45, 158–95.
Ning, Y., Zhao, T. & Liu, H. (2017). A likelihood ratio framework for high-dimensional semiparametric
regression. Ann. Statist. 45, 2299–327.
Van de Geer, S., Bühlmann, P., Ritov, Y. & Dezeure, R. (2014). On asymptotically optimal confidence regions
and tests for high-dimensional models. Ann. Statist. 42, 1166–202.
Zhang, C.-H. & Zhang, S. S. (2014). Confidence intervals for low dimensional parameters in high dimensional
linear models. J. R. Statist. Soc. B 76, 217–42.

[Received on 26 June 2023. Editorial decision on 28 June 2023]

