In this vignette, we demonstrate how to create a recurrent event object with the Recur() function from the reda package (Wang et al. 2021). The Recur() function is imported when the reReg package is loaded. The Recur object bundles together a set of recurrent event times, a failure time, and a censoring status, with the convenience that it can be used as the response in a model formula in the reReg package. We illustrate the usage of Recur() with the cgd data set from the survival package (Therneau 2021) and the readmission data set from the frailtypack package (Rondeau, Mazroui, and González 2012; González et al. 2005).

> library(reReg)
> packageVersion("reReg")
[1] '1.4.0'
> data(readmission, package = "frailtypack")
> head(readmission)
  id enum t.start t.stop time event      chemo    sex dukes charlson death
1  1    1       0     24   24     1    Treated Female     D        3     0
2  1    2      24    457  433     1    Treated Female     D        0     0
3  1    3     457   1037  580     0    Treated Female     D        0     0
4  2    1       0    489  489     1 NonTreated   Male     C        0     0
5  2    2     489   1182  693     0 NonTreated   Male     C        0     0
6  3    1       0     15   15     1 NonTreated   Male     C        3     0
> readmission <- subset(readmission, !(id %in% c(60, 109, 280)))
> attach(readmission)

The Recur object

The Recur() function is modeled after the Surv() function in the survival package (Therneau 2021). The function interface of Recur() is

> args(Recur)
function (time, id, event, terminal, origin, check = c("hard", 
    "soft", "none"), ...) 
NULL

The six arguments are

  • time is a vector that represents the times of recurrent events and censoring, or a list of time intervals that contains the starting time and the ending time of each interval. In the latter case, the intervals are assumed to be open on the left and closed on the right, where the right endpoints are the times of recurrent events and censoring.
  • id specifies the subject identity. It can be a numeric vector, a character vector, or a factor. If it is left unspecified, Recur() will assume that each row represents a subject.
  • event is a numeric vector that represents the types of the recurrent events. A logical vector is allowed and is converted to a numeric vector. Non-positive values are internally converted to zero, indicating censoring status.
  • terminal is a numeric vector that represents the status of the terminal event. A logical vector is allowed and is converted to a numeric vector. Non-positive values are internally converted to zero, indicating censoring status. If a scalar value is specified, all subjects will have the same terminal event status at their last recurrent episodes. The length of terminal should equal either the number of subjects or the number of data rows. In the latter case, each subject may have at most one positive entry of terminal, at the last recurrent episode.
  • origin is a numeric vector indicating the time origin of each subject. If a scalar value is specified, all subjects will have the same origin at the specified value. The length of origin should equal either the number of subjects or the number of data rows. In the latter case, different subjects may have different origins, but all rows of the same subject must share one origin. In addition to numeric values, Date and difftime are also supported and are converted to numeric values.
  • check is a character value specifying how to perform the checks for recurrent event data. An error is thrown if check = "hard" (default) and a warning is issued if check = "soft". If check = "none" is specified, no data checking procedure is run.

When the time origin is zero for all subjects, as in the readmission data set, the time argument can be specified with time = t.stop or with time = t.start %to% t.stop, where the infix operator %to% creates a list of two elements containing the endpoints of the time intervals. When check = "hard" or check = "soft", the Recur() function performs an internal check for possible issues in the data structure. With check = "hard" (default), Recur() terminates with an error message as soon as a check fails. In contrast, Recur() proceeds with a warning message when check = "soft", and skips the checks entirely, with no message, when check = "none". The checking criteria include the following:

The Recur() function matches the arguments by position when the arguments' names are not specified. Among all the arguments, only time has no default value and must be specified by the user. The default value for the argument id is seq_along(time); thus, Recur() assumes each row specifies the time point for a different subject when id is not specified. However, relying on the default value of id defeats the purpose of using recurrent event methods. The default value for the argument event is a numerical vector, where the values 0 and 1 indicate whether the endpoint of each time interval in time is a non-recurrent event or a recurrent event, respectively. The event argument can accommodate more than one type of recurrent event; in this case the reference level (value 0) indicates a non-recurrent event. Finally, a zero vector is used as the default value for the arguments terminal and origin.
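As a minimal sketch of the argument matching described above, the following builds a small made-up data frame and constructs the response once by position and once by name. The data frame toy and its values are hypothetical; its columns are only laid out like those of readmission. The sketch loads reda directly, since that is where Recur() and %to% are defined, which keeps it independent of the data packages.

```r
library(reda)  # Recur() and %to% are defined in reda and made available by reReg

# Hypothetical toy data laid out like readmission: one row per episode,
# and the last row of each subject is its censoring or terminal record.
toy <- data.frame(
  id      = c(1, 1, 1, 2, 2, 3),
  t.start = c(0, 5, 12, 0, 7, 0),
  t.stop  = c(5, 12, 20, 7, 15, 9),
  event   = c(1, 1, 0, 1, 0, 0),
  death   = c(0, 0, 0, 0, 1, 0)
)

# Positional matching: time, id, event, terminal, in that order.
r_pos <- with(toy, Recur(t.stop, id, event, death))

# Named matching, with time given as intervals built by %to%.
r_named <- with(toy, Recur(time = t.start %to% t.stop, id = id,
                           event = event, terminal = death))

r_pos
r_named
```

Naming the arguments is the safer habit when only some of them are supplied, because a skipped positional argument silently shifts the meaning of the ones that follow.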

The default values in Recur() are chosen so that Recur() can be conveniently adopted in common situations. For example, when the recurrent events are observed continuously and there are no terminal events, the event and terminal arguments can be left unspecified. In this case, the last entry within each subject is treated as a censoring time. One example is the cgd data from the survival package, where the recurrent event is a serious infection observed in a placebo-controlled trial of gamma interferon in chronic granulomatous disease. A terminal event was not defined in the cgd data and the patients were observed through the end of the study. For this data set, the Recur object can be constructed as below:
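The same defaults can be tried on data small enough to read at a glance. The sketch below is illustrative only: toy is a hypothetical data frame with no event or terminal columns, and the call mirrors the two-argument pattern used for cgd.

```r
library(reda)  # provides Recur() and %to%

# Hypothetical follow-up intervals for three subjects; no event or
# terminal indicator is recorded.
toy <- data.frame(
  id      = c(1, 1, 1, 2, 2, 3),
  t.start = c(0, 5, 12, 0, 7, 0),
  t.stop  = c(5, 12, 20, 7, 15, 9)
)

# Only time and id are supplied; event, terminal, and origin fall back
# to their defaults.
r_default <- with(toy, Recur(t.start %to% t.stop, id))
r_default
```

Printing r_default is a quick way to confirm how the last interval of each subject is marked before the object is used in a model formula.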

> data(cgd, package = "survival")
> with(cgd, Recur(tstart %2% tstop, id))
...
  [1] 1: (0, 219], (219, 373], (373, 414+]      
  [2] 2: (0, 8], (8, 26], ..., (350, 439+]      
  [3] 3: (0, 382+]                              
  [4] 4: (0, 388+]                              
  [5] 5: (0, 246], (246, 253], (253, 383+]      
  [6] 6: (0, 364+]                              
  [7] 7: (0, 292], (292, 364+]                  
  [8] 8: (0, 363+]                              
  [9] 9: (0, 294], (294, 349+]                  
 [10] 10: (0, 371+]                             
...

For each subject, the function Recur() prints intervals to represent the duration until the next event (a recurrent event or a terminal event). The Recur object for the readmission dataset can be constructed as below:

> Recur(t.stop, id, event, death)
...
  [1] 1: (0, 24], (24, 457], (457, 1037+]           
  [2] 2: (0, 489], (489, 1182+]                     
  [3] 3: (0, 15], (15, 783*]                        
  [4] 4: (0, 163], (163, 288], ..., (686, 2048+]    
  [5] 5: (0, 1134], (1134, 1144+]                   
  [6] 6: (0, 627], (627, 1190], ..., (1406, 1407+]  
  [7] 7: (0, 38], (38, 42], ..., (63, 1049+]        
  [8] 8: (0, 1466*]                                 
  [9] 9: (0, 148], (148, 1474+]                     
 [10] 10: (0, 1113+]                                
...

The readmission example above shows that patient id #1 experienced 2 hospital readmissions and was followed until t = 1037 (days). The + at t = 1037 indicates that the terminal time was censored, i.e., this patient had not experienced the event of interest (death) by t = 1037. Similarly, patient id #3 had one readmission and died at t = 783 (days), as indicated by the * at 783. On the other hand, patient id #4 had more than 3 readmissions and was censored at t = 2048 (days). Some of this patient's readmission intervals are suppressed (shown as ...) to keep the printed output from exceeding the screen width. The number of intervals to be printed can be tuned through options() with the argument reda.Recur.maxPrint.
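As a rough sketch of that tuning, the following prints one long made-up record before and after raising the limit. The data frame toy is hypothetical, and the sketch assumes that reda.Recur.maxPrint is read each time the object is printed; the value 6 is an arbitrary choice.

```r
library(reda)  # provides Recur()

# Hypothetical single subject with five recurrent events and a final
# censoring record.
toy <- data.frame(
  id     = rep(1, 6),
  t.stop = c(3, 8, 14, 21, 30, 40),
  event  = c(1, 1, 1, 1, 1, 0)
)
r <- with(toy, Recur(t.stop, id, event))

r                                        # printed with the current setting

old <- options(reda.Recur.maxPrint = 6)  # assumed to raise the print limit
r
options(old)                             # restore the previous setting
```

Saving the value returned by options() and restoring it afterwards keeps the change local to the lines in between.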

Readers are referred to the separate vignette on Recur() for a detailed introduction. The reSurv() function is deprecated as of Version 1.2.0. In the current version, reSurv() can still be used, but the reSurv object is automatically transformed to the corresponding Recur object.

References

González, Juan Ramón, Esteve Fernandez, Víctor Moreno, Josepa Ribes, Mercè Peris, Matilde Navarro, Maria Cambray, and Josep Maria Borrás. 2005. “Sex Differences in Hospital Readmission Among Colorectal Cancer Patients.” Journal of Epidemiology & Community Health 59 (6): 506–11.

Rondeau, Virginie, Yassin Mazroui, and Juan Ramón González. 2012. “frailtypack: An R Package for the Analysis of Correlated Survival Data with Frailty Models Using Penalized Likelihood Estimation or Parametrical Estimation.” Journal of Statistical Software 47 (4): 1–28.

Therneau, Terry M. 2021. A Package for Survival Analysis in R. https://CRAN.R-project.org/package=survival.

Wang, Wenjie, Haoda Fu, Sy Han Chiou, and Jun Yan. 2021. reda: Recurrent Event Data Analysis. https://github.com/wenjie2wang/reda.