vignettes/reReg-plots.Rmd
reReg-plots.Rmd
In this vignette, we demonstrate how to create a reSurv
object in reReg
package. The reSurv
object is then used to create event plots and cumulative sample mean (CSM) plots. We will illustrate the usage of our functions with the readmission
data from the frailtypack
package (Rondeau, Mazroui, and Gonzalez 2012, González Ruiz et al. (2005)). The data contains re-hospitalization times after surgery in patients diagnosed with colorectal cancer. In this data set, the recurrent event is the readmission and the terminal event is either death or end of study. See ?readmission
for data details.
> library(reReg)
> data(readmission, package = "frailtypack")
> head(readmission)
id enum t.start t.stop time event chemo sex dukes charlson death
1 1 1 0 24 24 1 Treated Female D 3 0
2 1 2 24 457 433 1 Treated Female D 0 0
3 1 3 457 1037 580 0 Treated Female D 0 0
4 2 1 0 489 489 1 NonTreated Male C 0 0
5 2 2 489 1182 693 0 NonTreated Male C 0 0
6 3 1 0 15 15 1 NonTreated Male C 3 0
> attach(readmission)
reSurv
objectsThe usage of reSurv
is similar to the Surv
function in the survival
package. There are five arguments in reSurv
but not all of them need to be specified. The arguments of reSurv
is as follow
> args(reSurv)
function (time1, time2, id, event, status, origin = 0)
NULL
The five arguments are as follows
time1
starting time for the gap time between two successive recurrent eventstime2
ending time for the gap time between two successive recurrent eventsid
subject’s idevent
a binary vector used as the recurrent event indicator; event = 0
for non-recurrent times (such as the censoring times)status
a binary vector used as the status indicator for the terminal event; status = 0
for censored timeorigin
a numerical vector indicating the time origin of subjectsThere are a few ways to create an reSurv
object. When the starting times and the ending times of recurrent events are known, one can specify both times in reSurv
.
> reSurv(time1 = t.start, time2 = t.stop, id = id, event = event, status = death)
> reSurv(time1 = t.stop, id = id, event = event, status = death)
The same reSurv
objects can be achieved without specifying the argument names:
The function reSurv
prints a list-column tibble
.
> reSurv(t.stop, id, event, death)
# A tibble: 403 x 5
id recTime recType temTime temStatus
<int> <list> <list> <dbl> <dbl>
1 1 <dbl [2]> <int [2]> 1037 0
2 2 <dbl [1]> <int [1]> 1182 0
3 3 <dbl [1]> <int [1]> 783 1
4 4 <dbl [4]> <int [4]> 2048 0
5 5 <dbl [1]> <int [1]> 1144 0
# ... with 398 more rows
The example above shows patient id #1 experienced 2 readmissions (tij[1]
is a list of two double
s) with a terminal event at t = 1037
(days). The terminal event was censored (status = 0
). Similarly, patient id #3 has one readmission and died at t = 783
(days). On the other hand patient id # 4 has 4 readmission and was censored at t = 2048
(days).
Event plots are a quick and easy way to glance at recurrent event data. These can be produced by plotting the reSurv
object with R
’s generic function plot
, shown in Figure 1.
> reObj <- reSurv(t.stop, id, event, death)
> plot(reObj)
Figure 1: Creating an event plot from a reSurv
object.
Common graphical options like xlab
, ylab
, main
, and more can be directly passed down to plot
.
> plot(reObj, cex = 1.5, xlab = "Time in days", ylab = "Patients",
+ main = "Event plot for readmission data",
+ terminal.name = "Death",
+ recurrent.name = "Hospital readmission")
Figure 2: Creating an event plot from a reSurv
object with custom labels.
Separate (stratified) event plots can be produced with the plotEvents
function, which is a more specialized function for event plots. To demonstrate this, we first detach readmission
.
> detach(readmission)
Unlike the generic plot function, plotEvents
uses a formula object to specify the stratification.
> args(plotEvents)
function (formula, data, order = TRUE, control = list(), ...)
NULL
Here are some examples to re-create Figure 1 with plotEvents
:
> plotEvents(reObj)
> plotEvents(reObj, data = readmission)
> plotEvents(reObj ~ 1, data = readmission)
> plotEvents(reSurv(t.stop, id, event, death) ~ 1, data = readmission)
Figure 2 can be re-created with plotEvents
in a similar fashion:
> plotEvents(reSurv(t.stop, id, event, death) ~ 1, data = readmission,
+ cex = 1.5, xlab = "Time in days", ylab = "Patients",
+ main = "Event plot for readmission data",
+ terminal.name = "Death", recurrent.name = "Hospital readmission")
The plotEvents
function can also stratify event plots by groups. For example, the following can be used to stratify an event plot by sex
.
> plotEvents(reSurv(t.stop, id, event, death) ~ sex, data = readmission)
Figure 3: Event plot grouped by sex
Event plot by sex
and chemo
:
> plotEvents(reSurv(t.stop, id, event, death) ~ sex + chemo, data = readmission)
Figure 4: Event plot grouped by sex
and chemo
.
The nonparametric cumulative sample mean (CSM) function can be used to aid event plots to determine visually whether a trend or other pattern exists. The CSM function is defined as follows: \[\begin{array}{c}
\hat \mu_n(t) = \sum_{i = 1}^n\int_0^tY_\cdot^{-1}(t)dN_i(t),
\end{array}\] where \(Y_\cdot(t) = \sum_{j=1}^nY_i(t)\) is the total number of subjects who are at risk at time \(t\) and \(N_i(t)\) is the number of events over the time interval [0, t] for the ith process. This estimator is also known as the Nelson-Aalen estimator (Nelson 2003) and is also implemented in the mcf
function in the reda
package (Wang, Fu, and Yan 2017). Another variation of the CSM function is discussed in (Cook and Lawless 2007) assuming \(Y_i(t) = n\) for all \(t\). This variation is equivalent to the Nelson-Aalen estimator without adjusting for risk set. We refer the Nelson-Aalen estimator and the Cook-Lawless estimator as the CSM function with (default) and without adjusting for risk set, respectively.
The CSM plot can be created with R
’s generic function plot
, shown in Figure 5.
> plot(reObj, CSM = TRUE)
Figure 5: Creating a CSM plot from a reSurv
object.
The CSM plot can be created with the more specialized function, plotCSM
. Both plotEvents
and plotCSM
return ggplot2
objects. A side-by-side CSM plot, showing the CSM functions with and without risk adjustment, is shown in Figure 6.
> library(gridExtra)
> p1 <- plotCSM(reSurv(t.stop, id, event, death) ~ 1, data = readmission,
+ CSM = TRUE, main = "")
> p2 <- plotCSM(reSurv(t.stop, id, event, death) ~ 1, data = readmission,
+ CSM = TRUE, adjrisk = FALSE, main = "")
> grid.arrange(p1, p2, ncol=2)
Figure 6: Creating a CSM plot from a reSurv
object.
As with the plotEvents
, plotCSM
can generate CSM plots given covariates. Figure 7 depicts the CSM plots grouped by sex
.
> p1 <- plotCSM(reSurv(t.stop, id, event, death) ~ sex, data = readmission,
+ CSM = TRUE, main = "")
> p2 <- plotCSM(reSurv(t.stop, id, event, death) ~ sex, data = readmission,
+ CSM = TRUE, adjrisk = FALSE, main = "")
> grid.arrange(p1, p2, ncol=2)
Figure 7: CSM plot grouped by sex
.
As in plotEvents
, the more complicated Figure 8 depicts the CSM plots grouped by sex
and chemo
.
> p1 <- plotCSM(reSurv(t.stop, id, event, death) ~ sex + chemo, data = readmission,
+ CSM = TRUE, main = "")
> p2 <- plotCSM(reSurv(t.stop, id, event, death) ~ sex + chemo, data = readmission,
+ CSM = TRUE, adjrisk = FALSE, main = "")
> grid.arrange(p1, p2, ncol=2)
Figure 8: CSM plots grouped by sex
and chemo
.
The functions plotEvents
and plotCSM
can be used to accommodate recurrent event data with multiple recurrent types. To illustrate this, we generate hypothetical event types and store these in event2
.
> library(dplyr)
> set.seed(1)
> readmission <- readmission %>% mutate(event2 = event * sample(1:3, 861, TRUE))
plotEvents
and plotCSM
functions can still be applied when event
is replaced by event2
. Different recurrent events will be denoted by different colors and shapes.
> plotEvents(reSurv(t.stop, id, event2, death) ~ sex, data = readmission)
Figure 9: Event plots with multiple recurrent event types.
Default labels can be changed
> plotCSM(reSurv(t.stop, id, event2, death) ~ sex, adjrisk = FALSE, data = readmission,
+ recurrent.name = "Event types", recurrent.type = c("Type 1", "Type 2", "Type 3"))
Figure 10: CSM plot with multiple recurrent event types and customized labels.
> library(ggplot2)
> p1 <- plotCSM(reSurv(t.stop, id, event2, death) ~ sex, data = readmission, main = "") +
+ theme(legend.position="none")
> p2 <- plotCSM(reSurv(t.stop, id, event2, death) ~ sex, data = readmission,
+ adjrisk = FALSE, main = "") +
+ theme(legend.position="none")
> grid.arrange(p1, p2, ncol = 2)
Figure 11: CSM plot with multiple recurrent event types.
Cook, Richard J, and Jerald Lawless. 2007. The Statistical Analysis of Recurrent Events. New York: Wiley.
González Ruiz, Juan Ramón, Esteve Fernández Muñoz, Víctor Moreno Aguado, Josepa Ribes Puig, Mercè Peris, Matilde Navarro, Maria Cambray i Amenós, and Borràs Andrés. 2005. “Sex Differences in Hospital Readmission Among Colorectal Cancer Patients.” Journal of Epidemiology and Community Health, 2005, Vol. 59, Núm. 6, P. 506-511. BMJ Group.
Nelson, Wayne B. 2003. Recurrent Events Data Analysis for Product Repairs, Disease Recurrences, and Other Applications. Vol. 10. SIAM.
Rondeau, Virginie, Yassin Mazroui, and Juan R Gonzalez. 2012. “Frailtypack: An R Package for the Analysis of Correlated Survival Data with Frailty Models Using Penalized Likelihood Estimation or Parametrical Estimation.” Journal of Statistical Software 47 (4): 1–28.
Wang, Wenjie, Haoda Fu, and Jun Yan. 2017. reda: Recurrent Event Data Analysis. https://CRAN.R-project.org/package=reda.