The Cox-type proportional rate model (available in reReg() via model = "cox.LWYY") is a common semiparametric regression model for recurrent event processes under the noninformative censoring assumption. Since the construction of the pseudo-partial score function ignores the dependency among recurrent events, the efficiency of the Cox-type proportional rate model can be improved by combining a system of weighted pseudo-partial score equations via the generalized method of moments (GMM) and empirical likelihood (EL) estimation (Huang and Huang 2022). In this vignette, we demonstrate the proposed GMM and EL procedures in the reReg package. We will illustrate the idea with simulated data generated from the simGSC() function.

> library(reReg)
> packageVersion("reReg")
[1] '1.4.2'

Loading data set

We first generate a simulated data where the recurrent event process is generated from a Cox-type proportional rate model. This can be achieved with the default settings of simGSC(); see the vignette on simulating recurrent event data for illustrations of simGSC().

> set.seed(1); head(dat <- simGSC(50))
  id   t.start    t.stop event status x1         x2
1  1 0.0000000 2.8295437     1      0  1  0.1335494
2  1 2.8295437 5.4491071     1      0  1  0.1335494
3  1 5.4491071 6.6067969     0      1  1  0.1335494
4  2 0.0000000 0.2094167     1      0  0 -0.2712600
5  2 0.2094167 0.2450093     1      0  0 -0.2712600
6  2 0.2450093 0.6668239     1      0  0 -0.2712600

Fitting Cox-type model

We can fit the Cox-type proportional rate model (Lin et al. 2000) with reReg() via the following specification.

> fit <- reReg(Recur(t.stop, id, event) ~ x1 + x2, model = "cox.LWYY", data = dat)
> summary(fit)
Call: 
reReg(formula = Recur(t.stop, id, event) ~ x1 + x2, data = dat, 
    model = "cox.LWYY")

Recurrent event process:
   Estimate   StdErr z.value p.value    
x1 -0.94573  0.18261 -5.1789 < 2e-16 ***
x2 -0.69611  0.23438 -2.9700 0.00298 ** 

Huang and Huang (2022) proposed improved semiparametric estimation methods of the above proportional rate model. By combining different weighted pseudo-partial score functions through generalized methods of moments or empirical likelihood methods, Huang and Huang (2022) showed that substantial efficiency gain can be achieved without imposing additional model assumptions than those assumed in Lin et al. (2000). Specifically, the efficiency gain is a result of combining a set of asymptotically independently and identically distributed estimating equations derived from the weighted pseudo-partial score equations of Lin et al. (2000).

Improved estimation

Two available approaches to combine estimating equations are the generalized method of moments (GMM) and the empirical likelihood (EL) estimations, both of which can be specified with the cppl argument within the control list. In addition to the combination approach, weight functions to be combined with the weighted pseudo-partial score functions can also be specified with the cppl.wfun argument within the control list. In theory, combining more weight functions leads to better asymptotic efficiency, especially when the sample size is large. However, as observed in simulation studies of Huang and Huang (2022), combining one or two weighted functions can achieve substantial efficiency gain under scenarios with small to moderate sample sizes. When the sample size is small, Huang and Huang (2022) suggests using a combination of the unit weight and a decreasing function, since the decreasing function can downweight the later time period, in which the events are more effectively censored. For those reason, the reReg() currently only allows user to specify up to two weight functions with cppl.wfun. The reReg() allows users choose either the cumulative baseline rate function or the Gehan’s weight as two possible options for the weight functions. Additionally, the reReg() also allows users to specify arbitrary function in cppl.wfun via function formulas.

The following gives an example of the improved estimation by combining estimating equations via GMM with the cumulative baseline rate function and the Gehan’s weight.

> fit2 <- reReg(Recur(t.stop, id, event) ~ x1 + x2, model = "cox.LWYY", data = dat, 
+               control = list(cppl = "GMM", cppl.wfun = list("Gehan", "cumbase"))) 
> summary(fit2) 
Call: 
reReg(formula = Recur(t.stop, id, event) ~ x1 + x2, data = dat, 
    model = "cox.LWYY", control = list(cppl = "GMM", cppl.wfun = list("Gehan", 
        "cumbase")))

Recurrent event process:
   Estimate   StdErr z.value p.value    
x1 -0.88307  0.16423 -5.3771 < 2e-16 ***
x2 -0.74891  0.19868 -3.7694 0.00016 ***

For this data set, the standard errors changed from 0.1826138 and 0.234383 to 0.1642281 and 0.1986836, respectively. This achieves a 10.07% and 15.23% efficiency gain, respectively. The above result can be called by updating the original fit, fit, with the update() function, as illustrated in the following.

> fit2 <- update(fit, control = list(cppl = "GMM", cppl.wfun = list("Gehan", "cumbase")))

The following gives an example of the improved estimation by combining estimating equations via EL with the user-specified function \(1 / (x + 1)\) and the Gehan’s weight.

> fit3 <- reReg(Recur(t.stop, id, event) ~ x1 + x2, model = "cox.LWYY", data = dat, 
+               control = list(cppl = "EL", cppl.wfun = list(function(x) 1 / (x + 1), "Gehan"))) 
> summary(fit3) 
Call: 
reReg(formula = Recur(t.stop, id, event) ~ x1 + x2, data = dat, 
    model = "cox.LWYY", control = list(cppl = "EL", cppl.wfun = list(function(x) 1/(x + 
        1), "Gehan")))

Recurrent event process:
   Estimate   StdErr z.value p.value    
x1 -0.90487  0.17677 -5.1189 < 2e-16 ***
x2 -0.62554  0.19248 -3.2498 0.00115 ** 

For this data set, the standard errors changed from 0.1826138 and 0.234383 to 0.1767699 and 0.1924846, respectively. This achieves a 3.2% and 17.88% efficiency gain, respectively. Although the above example assumes the covariates are time-independent, the proposed approach and the implementation allow time-dependent covariates.

Conclusion

Although this vignette is focus on improving the proportional rate model, the proposed approaches by Huang and Huang (2022) can be extended to handle other types of rate function. Such an extension will be investigated and the reReg package will be expanded accordingly.

Reference

Huang, Ming-Yueh, and Chiung-Yu Huang. 2022. “Improved Semiparametric Estimation of the Proportional Rate Model with Recurrent Event Data.” In Revision.

Lin, D. Y., L. J. Wei, I. Yang, and Z. Ying. 2000. “Semiparametric Regression for the Mean and Rate Functions of Recurrent Events.” Journal of the Royal Statistical Society: Series B 62 (4): 711–30.