`reReg`

`vignettes/reReg-cppl.Rmd`

`reReg-cppl.Rmd`

The Cox-type proportional rate model (available in `reReg()`

via `model = "cox.LWYY"`

) is a common semiparametric regression model for recurrent event processes under the noninformative censoring assumption. Since the construction of the pseudo-partial score function ignores the dependency among recurrent events, the efficiency of the Cox-type proportional rate model can be improved by combining a system of weighted pseudo-partial score equations via the generalized method of moments (GMM) and empirical likelihood (EL) estimation (Huang and Huang 2022). In this vignette, we demonstrate the proposed GMM and EL procedures in the `reReg`

package. We will illustrate the idea with simulated data generated from the `simGSC()`

function.

`[1] '1.4.2'`

We first generate a simulated data where the recurrent event process is generated from a Cox-type proportional rate model. This can be achieved with the default settings of `simGSC()`

; see the vignette on simulating recurrent event data for illustrations of `simGSC()`

.

```
id t.start t.stop event status x1 x2
1 1 0.0000000 2.8295437 1 0 1 0.1335494
2 1 2.8295437 5.4491071 1 0 1 0.1335494
3 1 5.4491071 6.6067969 0 1 1 0.1335494
4 2 0.0000000 0.2094167 1 0 0 -0.2712600
5 2 0.2094167 0.2450093 1 0 0 -0.2712600
6 2 0.2450093 0.6668239 1 0 0 -0.2712600
```

We can fit the Cox-type proportional rate model (Lin et al. 2000) with `reReg()`

via the following specification.

```
Call:
reReg(formula = Recur(t.stop, id, event) ~ x1 + x2, data = dat,
model = "cox.LWYY")
Recurrent event process:
Estimate StdErr z.value p.value
x1 -0.94573 0.18261 -5.1789 < 2e-16 ***
x2 -0.69611 0.23438 -2.9700 0.00298 **
```

Huang and Huang (2022) proposed improved semiparametric estimation methods of the above proportional rate model. By combining different weighted pseudo-partial score functions through generalized methods of moments or empirical likelihood methods, Huang and Huang (2022) showed that substantial efficiency gain can be achieved without imposing additional model assumptions than those assumed in Lin et al. (2000). Specifically, the efficiency gain is a result of combining a set of asymptotically independently and identically distributed estimating equations derived from the weighted pseudo-partial score equations of Lin et al. (2000).

Two available approaches to combine estimating equations are the generalized method of moments (GMM) and the empirical likelihood (EL) estimations, both of which can be specified with the `cppl`

argument within the `control`

list. In addition to the combination approach, weight functions to be combined with the weighted pseudo-partial score functions can also be specified with the `cppl.wfun`

argument within the `control`

list. In theory, combining more weight functions leads to better asymptotic efficiency, especially when the sample size is large. However, as observed in simulation studies of Huang and Huang (2022), combining one or two weighted functions can achieve substantial efficiency gain under scenarios with small to moderate sample sizes. When the sample size is small, Huang and Huang (2022) suggests using a combination of the unit weight and a decreasing function, since the decreasing function can downweight the later time period, in which the events are more effectively censored. For those reason, the `reReg()`

currently only allows user to specify up to two weight functions with `cppl.wfun`

. The `reReg()`

allows users choose either the cumulative baseline rate function or the Gehan’s weight as two possible options for the weight functions. Additionally, the `reReg()`

also allows users to specify arbitrary function in `cppl.wfun`

via function formulas.

The following gives an example of the improved estimation by combining estimating equations via GMM with the cumulative baseline rate function and the Gehan’s weight.

```
> fit2 <- reReg(Recur(t.stop, id, event) ~ x1 + x2, model = "cox.LWYY", data = dat,
+ control = list(cppl = "GMM", cppl.wfun = list("Gehan", "cumbase")))
> summary(fit2)
```

```
Call:
reReg(formula = Recur(t.stop, id, event) ~ x1 + x2, data = dat,
model = "cox.LWYY", control = list(cppl = "GMM", cppl.wfun = list("Gehan",
"cumbase")))
Recurrent event process:
Estimate StdErr z.value p.value
x1 -0.88307 0.16423 -5.3771 < 2e-16 ***
x2 -0.74891 0.19868 -3.7694 0.00016 ***
```

For this data set, the standard errors changed from 0.1826138 and 0.234383 to 0.1642281 and 0.1986836, respectively. This achieves a 10.07% and 15.23% efficiency gain, respectively. The above result can be called by updating the original fit, `fit`

, with the `update()`

function, as illustrated in the following.

The following gives an example of the improved estimation by combining estimating equations via EL with the user-specified function \(1 / (x + 1)\) and the Gehan’s weight.

```
> fit3 <- reReg(Recur(t.stop, id, event) ~ x1 + x2, model = "cox.LWYY", data = dat,
+ control = list(cppl = "EL", cppl.wfun = list(function(x) 1 / (x + 1), "Gehan")))
> summary(fit3)
```

```
Call:
reReg(formula = Recur(t.stop, id, event) ~ x1 + x2, data = dat,
model = "cox.LWYY", control = list(cppl = "EL", cppl.wfun = list(function(x) 1/(x +
1), "Gehan")))
Recurrent event process:
Estimate StdErr z.value p.value
x1 -0.90487 0.17677 -5.1189 < 2e-16 ***
x2 -0.62554 0.19248 -3.2498 0.00115 **
```

For this data set, the standard errors changed from 0.1826138 and 0.234383 to 0.1767699 and 0.1924846, respectively. This achieves a 3.2% and 17.88% efficiency gain, respectively. Although the above example assumes the covariates are time-independent, the proposed approach and the implementation allow time-dependent covariates.

Although this vignette is focus on improving the proportional rate model, the proposed approaches by Huang and Huang (2022) can be extended to handle other types of rate function. Such an extension will be investigated and the `reReg`

package will be expanded accordingly.

Huang, Ming-Yueh, and Chiung-Yu Huang. 2022. “Improved Semiparametric Estimation of the Proportional Rate Model with Recurrent Event Data.” *In Revision*.

Lin, D. Y., L. J. Wei, I. Yang, and Z. Ying. 2000. “Semiparametric Regression for the Mean and Rate Functions of Recurrent Events.” *Journal of the Royal Statistical Society: Series B* 62 (4): 711–30.