In this vignette, we demonstrate how to create a reSurv object in reReg package. The reSurv object is then used to create event plots and cumulative sample mean (CSM) plots. We will illustrate the usage of our functions with the readmission data from the frailtypack package (Rondeau, Mazroui, and Gonzalez 2012, González Ruiz et al. (2005)). The data contains re-hospitalization times after surgery in patients diagnosed with colorectal cancer. In this data set, the recurrent event is the readmission and the terminal event is either death or end of study. See ?readmission for data details.

> library(reReg)
> packageVersion("reReg")
[1] '1.2.0'
> data(readmission, package = "frailtypack")
> head(readmission)
  id enum t.start t.stop time event      chemo    sex dukes charlson death
1  1    1       0     24   24     1    Treated Female     D        3     0
2  1    2      24    457  433     1    Treated Female     D        0     0
3  1    3     457   1037  580     0    Treated Female     D        0     0
4  2    1       0    489  489     1 NonTreated   Male     C        0     0
5  2    2     489   1182  693     0 NonTreated   Male     C        0     0
6  3    1       0     15   15     1 NonTreated   Male     C        3     0
> readmission <- subset(readmission, !(id %in% c(60, 109, 280)))
> attach(readmission)

reSurv/Recur objects

The reSurv() function is being deprecated in Version 1.2.0. A replacement of this function is the Recur() function imported from the reda package (Wang, Fu, and Yan 2017). We give some background on the reSurv() function and a brief introduction to the Recur() function here.

The usage of reSurv() is similar to the Surv() function in the survival package. There are five arguments in reSurv() but not all of them need to be specified. The arguments of reSurv() are as follow

> args(reSurv)
function (time1, time2, id, event, status, origin = 0) 
NULL

The five arguments are as follows

  • time1 starting time for the gap time between two successive recurrent events
  • time2 ending time for the gap time between two successive recurrent events
  • id subject’s id
  • event a vector used as the recurrent event indicator; event = 0 for non-recurrent times (such as the censoring times)
  • status a binary vector used as the status indicator for the terminal event; status = 0 for censored time
  • origin a numerical vector indicating the time origin of subjects

The arguments of Recur() are similar to those of reSurv(). THey are listed as follow

> args(Recur)
function (time, id, event, terminal, origin, check = c("hard", 
    "soft", "none"), ...) 
NULL

The six arguments are as follows

  • time is a numerical vector representing the time of recurrence event or censoring, or a list with elements named time1 and time2 for specifying the follow-up of recurrent events. In the latter case, function %to% (or %2%) can be used to construct such list. In addition to numeric values, Date and difftime are allowed and converted to numeric values. An error will be thrown if this argument is not specified.
  • id specifies the subject identity. It can be numeric vector, character vector, or a factor vector. If it is left unspecified, Recur() will assume that each row represents a subject.
  • event is a numeric vector that represents the types of the recurrent events. Logical vector is allowed and converted to numeric vector. Non-positive values are internally converted to zero indicating censoring status.
  • terminal is a numeric vector that represents the status of the terminal event. Logical vector is allowed and converted to numeric vector. Non-positive values are internally converted to zero indicating censoring status. If a scalar value is specified, all subjects will have the same status of terminal events at their last recurrent episodes. The length of the specified terminal should be equal to the number of subjects, or number of data rows. In the latter case, each subject may have at most one positive entry of terminal at the last recurrent episode.
  • origin a numerical vector indicating the time origin of each subject. If a scalar value is specified, all subjects will have the same origin at the specified value. The length of the specified origin should be equal to the number of subjects, or number of data rows. In the latter case, different subjects may have different origins. However, one subject must have the same origin. In addition to numeric values, Date and difftime are also supported and converted to numeric values.
  • check is a character value specifying how to perform the checks for recurrent event data. Errors or warnings will be thrown, respectively, if the check is specified to be "hard""’" (default) or "soft""’". If check = "none" is specified, no data checking procedure will be run.

The main arguments of Recur() come directly from reSurv(), but Recur() comes with a comprehensive check for recurrent event data. Moreover, Recur() returns a S4 class object that can be used for functions in both the reReg package and the reda package. Readers are referred to a separate vignette on Recur() for a detailed introduction of Recur().

The Recur object can be constructed in the same way as an reSurv object. In the current version (1.1.7), the reSurv() function can still be used, but the reSurv object will be automatically transformed to the corresponding Recur object. For example:

> invisible(reSurv(t.stop, id, event, death))
Warning in reSurv(t.stop, id, event, death): 'reSurv()' is being deprecated in Version 1.1.7. Output is prepared by 'Recur()'.
 See '?Recur()' for details.
> identical(suppressWarnings(reSurv(t.stop, id, event, death)), Recur(t.stop, id, event, death))
[1] TRUE

The usage of Recur() is very similar to that of reSurv(). In the following, we introduced different ways to create an Recur object. When the starting times and the ending times of recurrent events are known, one can specify both times in Recur.

> Recur(t.start %to% t.stop, id = id, event = event, status = death)
> Recur(time = t.stop, id = id, event = event, status = death)

The same Recur objects can be achieved without specifying the argument names:

> Recur(t.start %to% t.stop, id, event, death)
> Recur(t.stop, id, event, death)

For each subject, the function Recur() prints intervals to represent the duration until the next event (a recurrent event or a terminal event).

> Recur(t.stop, id, event, death)
...
  [1] 1: (0, 24], (24, 457], (457, 1037+]           
  [2] 2: (0, 489], (489, 1182+]                     
  [3] 3: (0, 15], (15, 783*]                        
  [4] 4: (0, 163], (163, 288], ..., (686, 2048+]    
  [5] 5: (0, 1134], (1134, 1144+]                   
  [6] 6: (0, 627], (627, 1190], ..., (1406, 1407+]  
  [7] 7: (0, 38], (38, 42], ..., (63, 1049+]        
  [8] 8: (0, 1466*]                                 
  [9] 9: (0, 148], (148, 1474+]                     
 [10] 10: (0, 1113+]                                
 [11] 11: (0, 267], (267, 276], (276, 1635+]        
 [12] 12: (0, 1189+]                                
 [13] 13: (0, 1028], (1028, 1207+]                  
 [14] 14: (0, 1125+]                                
 [15] 15: (0, 734], (734, 892], (892, 893*]         
...

The example above shows patient id #1 experienced 2 readmission with a terminal event at t = 1037 (days). The + at t = 1037 indicates the terminal time was censored, e.g., this patient did not experience the event of interest (death) at t = 1037. Similarly, patient id #3 has one readmission and died at t = 783 (days) as indicated by * at 783. On the other hand patient id # 4 has more than 3 readmissions and was censored at t = 2048 (days). The readmission intervals was suppressed to prevent printing results wider than the screen allowance. The number of intervals to be printed can be tuned using the options and argument reda.Recur.maxPrint. For example,

> options(reda.Recur.maxPrint = 6)
> Recur(t.stop, id, event, death)
...
  [1] 1: (0, 24], (24, 457], (457, 1037+]                                                 
  [2] 2: (0, 489], (489, 1182+]                                                           
  [3] 3: (0, 15], (15, 783*]                                                              
  [4] 4: (0, 163], (163, 288], (288, 638], (638, 686], (686, 2048+]                       
  [5] 5: (0, 1134], (1134, 1144+]                                                         
  [6] 6: (0, 627], (627, 1190], (1190, 1406], (1406, 1407+]                               
  [7] 7: (0, 38], (38, 42], (42, 63], (63, 1049+]                                         
  [8] 8: (0, 1466*]                                                                       
  [9] 9: (0, 148], (148, 1474+]                                                           
 [10] 10: (0, 1113+]                                                                      
 [11] 11: (0, 267], (267, 276], (276, 1635+]                                              
 [12] 12: (0, 1189+]                                                                      
 [13] 13: (0, 1028], (1028, 1207+]                                                        
 [14] 14: (0, 1125+]                                                                      
 [15] 15: (0, 734], (734, 892], (892, 893*]                                               
...

Event plots

Event plots are a quick and easy way to glance at recurrent event data. These can be produced by plotting the Recur object with R’s generic function plot, shown in Figure 1.

> reObj <- Recur(t.stop, id, event, death)
> plot(reObj)
Figure 1: Creating an event plot from a `Recur` object.

Figure 1: Creating an event plot from a Recur object.

Common graphical options like xlab, ylab, main, and more can be directly passed down to plot.

> plot(reObj, cex = 1.5, xlab = "Time in days", ylab = "Patients", 
+      main = "Event plot for readmission data", 
+      terminal.name = "Death", 
+      recurrent.name = "Hospital readmission")
Figure 2: Creating an event plot from a `Recur` object with custom labels.

Figure 2: Creating an event plot from a Recur object with custom labels.

Separate (stratified) event plots can be produced with the plotEvents function, which is a more specialized function for event plots. To demonstrate this, we first detach readmission.

> detach(readmission)

Unlike the generic plot function, plotEvents uses a formula object to specify the stratification.

> args(plotEvents)
function (formula, data, result = c("increasing", "decreasing", 
    "asis"), control = list(), ...) 
NULL

Here are some examples to re-create Figure 1 with plotEvents:

> plotEvents(reObj)
> plotEvents(reObj, data = readmission)
> plotEvents(reObj ~ 1, data = readmission)
> plotEvents(Recur(t.stop, id, event, death) ~ 1, data = readmission)

Figure 2 can be re-created with plotEvents in a similar fashion:

> plotEvents(Recur(t.stop, id, event, death) ~ 1, data = readmission,
+            cex = 1.5, xlab = "Time in days", ylab = "Patients", 
+            main = "Event plot for readmission data", 
+            terminal.name = "Death", recurrent.name = "Hospital readmission")

The plotEvents function can also stratify event plots by groups. For example, the following can be used to stratify an event plot by sex.

> plotEvents(Recur(t.stop, id, event, death) ~ sex, data = readmission)
Figure 3: Event plot grouped by `sex`

Figure 3: Event plot grouped by sex

Event plot by sex and chemo:

> plotEvents(Recur(t.stop, id, event, death) ~ sex + chemo, data = readmission)
Figure 4: Event plot grouped by `sex` and `chemo`.

Figure 4: Event plot grouped by sex and chemo.

CSM plots

The non-parametric cumulative sample mean (CSM) function can be used to aid event plots to determine visually whether a trend or other pattern exists. The CSM function is defined as follows: \[\begin{array}{c} \hat \mu_n(t) = \int_0^tY_\cdot^{-1}(t)dN_\cdot(t), \end{array}\] where \(Y_\cdot(t) = \sum_{i=1}^nY_i(t)\) is the total number of subjects who are at risk over \([t, t + dt)\) and \(dN_\cdot(t) = \sum_{i=1}^n dN_i(t)\) is the total number of events over the time interval [0, t]. This estimator is also known as the Nelson-Aalen estimator (Nelson 2003) and is also implemented in the mcf() function in the reda package (Wang, Fu, and Yan 2017). Another variation of the CSM function is discussed in (Cook and Lawless 2007) assuming \(Y_i(t) = n\) for all \(t\). This variation is equivalent to the Nelson-Aalen estimator without adjusting for risk set. We refer the Nelson-Aalen estimator and the Cook-Lawless estimator as the CSM function with (default) and without adjusting for risk set, respectively.

The CSM plot can be created with R’s generic function plot, shown in Figure 5.

> plot(reObj, CSM = TRUE)
Figure 5: Creating a CSM plot from a `Recur` object.

Figure 5: Creating a CSM plot from a Recur object.

The CSM plot can be created with the more specialized function, plotCSM. Both plotEvents and plotCSM return ggplot2 objects. A side-by-side CSM plot, showing the CSM functions with and without risk adjustment, is shown in Figure 6.

> library(gridExtra)
> p1 <- plotCSM(Recur(t.stop, id, event, death) ~ 1, data = readmission, main = "")
> p2 <- plotCSM(Recur(t.stop, id, event, death) ~ 1, data = readmission, 
+               adjrisk = FALSE, main = "")
> grid.arrange(p1, p2, ncol=2)
Figure 6: Creating a CSM plot from a `Recur` object.

Figure 6: Creating a CSM plot from a Recur object.

As with the plotEvents, plotCSM can generate CSM plots given covariates. Figure 7 depicts the CSM plots grouped by sex.

> p1 <- plotCSM(Recur(t.stop, id, event, death) ~ sex, data = readmission, main = "")
> p2 <- plotCSM(Recur(t.stop, id, event, death) ~ sex, data = readmission, 
+               adjrisk = FALSE, main = "")
> grid.arrange(p1, p2, ncol=2)
Figure 7: CSM plot grouped by `sex`.

Figure 7: CSM plot grouped by sex.

As in plotEvents, the more complicated Figure 8 depicts the CSM plots grouped by sex and chemo.

> p1 <- plotCSM(Recur(t.stop, id, event, death) ~ sex + chemo, data = readmission, main = "")
> p2 <- plotCSM(Recur(t.stop, id, event, death) ~ sex + chemo, data = readmission, 
+               adjrisk = FALSE, main = "")
> grid.arrange(p1, p2, ncol=2)
Figure 8: CSM plots grouped by `sex` and `chemo`.

Figure 8: CSM plots grouped by sex and chemo.

Multiple recurrent event types

The functions plotEvents and plotCSM can be used to accommodate recurrent event data with multiple recurrent types. To illustrate this, we generate hypothetical event types and store these in event2.

> set.seed(1)
> readmission$event2 <- readmission$event * sample(1:3, 852, TRUE)

plotEvents and plotCSM functions can still be applied when event is replaced by event2. Different recurrent events will be denoted by different colors and shapes.

> plotEvents(Recur(t.stop, id, event2, death) ~ sex, data = readmission)
Figure 9: Event plots with multiple recurrent event types.

Figure 9: Event plots with multiple recurrent event types.

Default labels can be changed

> plotCSM(Recur(t.stop, id, event2, death) ~ sex, adjrisk = FALSE, data = readmission,
+         recurrent.name = "Event types", recurrent.type = c("Type 1", "Type 2", "Type 3"))
Figure 10: CSM plot with multiple recurrent event types and customized labels.

Figure 10: CSM plot with multiple recurrent event types and customized labels.

> library(ggplot2)
> p1 <- plotCSM(Recur(t.stop, id, event2, death) ~ sex, data = readmission, main = "") +
+   theme(legend.position="none")
> p2 <- plotCSM(Recur(t.stop, id, event2, death) ~ sex, data = readmission, 
+               adjrisk = FALSE, main = "") +
+   theme(legend.position="none")
> grid.arrange(p1, p2, ncol = 2)
Figure 11: CSM plot with multiple recurrent event types.

Figure 11: CSM plot with multiple recurrent event types.

Reference

Cook, Richard J, and Jerald Lawless. 2007. The Statistical Analysis of Recurrent Events. New York: Wiley.

González Ruiz, Juan Ramón, Esteve Fernández Muñoz, Víctor Moreno Aguado, Josepa Ribes Puig, Mercè Peris, Matilde Navarro, Maria Cambray i Amenós, and Borràs Andrés. 2005. “Sex Differences in Hospital Readmission Among Colorectal Cancer Patients.” Journal of Epidemiology and Community Health, 2005, Vol. 59, Núm. 6, P. 506-511. BMJ Group.

Nelson, Wayne B. 2003. Recurrent Events Data Analysis for Product Repairs, Disease Recurrences, and Other Applications. Vol. 10. SIAM.

Rondeau, Virginie, Yassin Mazroui, and Juan R Gonzalez. 2012. “Frailtypack: An R Package for the Analysis of Correlated Survival Data with Frailty Models Using Penalized Likelihood Estimation or Parametrical Estimation.” Journal of Statistical Software 47 (4): 1–28.

Wang, Wenjie, Haoda Fu, and Jun Yan. 2017. reda: Recurrent Event Data Analysis. https://CRAN.R-project.org/package=reda.