In this vignette, we demonstrate how to use the simSC function in reReg package to simulate recurrent event data from a scale-change model. Since the scale-change model includes the Cox-type model and the accelerated mean model as special cases, simSC can also be used to generate data from these submodels. The simSC function allows the censoring time to be non-informative (independent given covariate) or informative about the recurrent event process.

Notations

Suppose recurrent events can potentially be observed in the time period \([0, \tau]\). For a subject, let \(N_i(t)\) be the number of events in interval \([0, t]\), and \(X_i\) is a \(p\times 1\) covariate vector. Let \(C_i\) be a non-informative censoring time, which is independent of \(N_i(\cdot)\) given \(X_i\). On the contrary, let \(D_i\) be a failure time (informative censoring time), which is associated with \(N_i(\cdot)\) even after conditioning on \(X\). Then the follow-up time is defined as \(Y = \min(C, D, \tau)\). The observed data are independent and identically distributed copies of \(\{N_i(t), Y_i, X_i: t\le Y_i, i = 1, \ldots, n\}\). In the following, we suppress the index for the ease of discussion.

The simSC function

The function simSC generates the recurrent times from a recurrent event process, \(N(t)\), with a pre-specified rate function, \(\lambda(t)\), and the failure time, \(D\), from a pre-specified hazard function, \(h(t)\). By specifying the argument type in the function simSC, recurrent times and failure time can be generated from the following models:

  • type = "cox" for Cox-type models: \[\lambda(t) = Z\lambda_0(t) e^{X^\top\alpha}, h(t) = Zh_0(t) e^{X^\top\beta}.\]
  • type = "am" for accelerated mean models: \[\lambda(t) = Z\lambda_0(te^{X^\top\alpha})e^{X^\top\alpha}, h(t) = Zh_0(te^{X^\top\beta})e^{X^\top\beta}.\]
  • type = "sc" for scale-change models: \[\lambda(t) = Z\lambda_0(te^{X^\top\alpha})e^{X^\top\beta}, h(t) = Zh_0(te^{X^\top\beta})e^{X^\top\beta}.\]

The \(Z\) is a latent frailty variable. In simSC currently only allows two covariates, i.e., \(X = (X_{1}, X_{2})^\top\), where \(X_1\) is a Bernoulli random variable with probability 0.5 and \(X_2\) is a standard normal random variable. The non-informative censoring time, \(C\), is generated separately from an exponential distribution with mean 80. The observed follow-up time is then taken to be \(Y = \min(D, C, \tau)\). We further assume the baseline functions \[\lambda_0(t) = \frac{2}{1 + t}, h_0(t) = \frac{1}{8(1 + t)}.\]

The complete list of arguments in simSC are as follow

> library(reReg)
> args(simSC)
function (n, a, b, indCen = TRUE, type = c("cox", "am", "sc"), 
    tau = 60, summary = FALSE) 
NULL

The arguments are as follows

  • n number of individual
  • a, b numeric vectors of parameter of length two.
  • indCen a logical value indicating whether the censoring assumption is imposed. When indCen = TRUE, we set \(Z = 1\). Otherwise, \(Z\) is generated from a gamma distribution with mean 1 and variance 0.25 (e.g., rgamma(1, 4, 4)).
  • type a character string specifying the underlying model.
  • tau a numeric value specifying the maximum observation time, or \(\tau\) in the above notation.
  • summary a logical value indicating whether a brief data summary will be printed.

Examples

In the following examples, we simulate recurrent event using simSC, with summary = TRUE.

Cox-type model:

> dat.cox <- simSC(200, c(-1, 1), c(-1, 1), summary = TRUE)

Summary results for number of recurrent event per subject:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    1.00    2.00    3.87    5.00   33.00 

Number of failures: 49 (24.5%); Number of censored events: 151 (75.5%)

Number of x1 == 1: 96 (48%); Number of x1 == 0: 104 (52%)
Summary results for x2:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-2.16226 -0.68832 -0.08922 -0.04727  0.61537  2.43023 

Accelerated mean model:

> dat.am <- simSC(200, c(-1, 1), c(-1, 1), type = "am", summary = TRUE)

Summary results for number of recurrent event per subject:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    2.00    4.50    4.97    7.00   17.00 

Number of failures: 50 (25%); Number of censored events: 150 (75%)

Number of x1 == 1: 93 (46.5%); Number of x1 == 0: 107 (53.5%)
Summary results for x2:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-2.73351 -0.61499  0.06812  0.01714  0.63933  2.30666 

Scale-change model:

> dat.sc <- simSC(200, c(-1, 1), c(-1, 1), type = "sc", summary = TRUE)

Summary results for number of recurrent event per subject:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   1.750   3.000   4.055   6.000  14.000 

Number of failures: 58 (29%); Number of censored events: 142 (71%)

Number of x1 == 1: 112 (56%); Number of x1 == 0: 88 (44%)
Summary results for x2:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-2.33329 -0.70020  0.02404 -0.03567  0.59424  2.78031 

The output of simSC are tibble objects.

> class(dat.cox)
[1] "tbl_df"     "tbl"        "data.frame"
> dat.cox
# A tibble: 974 x 6
      id    Time event status    x1     x2
   <int>   <dbl> <dbl>  <dbl> <dbl>  <dbl>
 1     1  0.0737     1      0     0  0.801
 2     1  0.152      1      0     0  0.801
 3     1  0.190      1      0     0  0.801
 4     1  0.224      0      1     0  0.801
 5     2  1.14       1      0     1 -0.687
 6     2  4.92       1      0     1 -0.687
 7     2  8.84       1      0     1 -0.687
 8     2 60          0      0     1 -0.687
 9     3  0.403      1      0     0  1.79 
10     3  0.505      1      0     0  1.79 
# ... with 964 more rows
> library(DT)
> datatable(dat.cox, options = list(pageLength = 10, scrollX=TRUE)) %>% 
+   formatRound(c("Time", "x2"), 3)

In this example, subject #1 experienced 3 recurrent events (at time 0.074, 0.152, and 0.190) and died at time 0.224. Similary, subject #2 experienced 3 recurrent events (at time 1.136, 4.917, and 8.839) and is alive when censored at time 60.