permDep functionIn this vignette, we demonstrate the usage of the permDep function.
We define the following functions to generate survival data with dependent left-truncation.
In the first setting, we generate the survival time from a Weibull distribution with shape parameter 3 and scale parameter 8.5 and generate the truncation time from an exponential with mean 5. The dependence between the survival time and the truncation time is controlled by a normal copula from the copula package with a pre-specified pre-truncation unconditional Kendall’s tau (tau).
> library(copula)
> simDat1 <- function(n, tau) {
+ k <- 1
+ tt <- xx <- rep(-1, n)
+ kt <- iTau(normalCopula(), tau)
+ cp <- normalCopula(kt, dim = 2)
+ dist <- mvdc(cp, c("exp", "weibull"), list(list(rate = 0.2), list(shape = 3, scale = 8.5)))
+ while(k <= n){
+ dat <- rMvdc(1, dist)
+ tt[k] <- dat[1]
+ xx[k] <- dat[2]
+ if(tt[k] <= xx[k]) k = k + 1
+ }
+ data.frame(Trun = tt, Time = xx)
+ }
The simDat1 function generates monotone dependence model and is used in Chiou et al. (2018) in the absence of censoring. The following scatterplots confirms the monotonic dependence.
> library(ggplot2)
> library(gridExtra)
> set.seed(123)
> tmp <- lapply(-4:4/5, function(x)
+ qplot(Trun, Time, data = simDat1(100, x), main = paste("Tau = ", x)))
> do.call("grid.arrange", c(tmp, ncol = 3))
In the second setting, we consider a scenario with non-monotonic dependence between the survival time and the truncation time. We used a normal copula to specify the joint distribution of \((X, |T- 0.5|)\), where \(X\) is the survival time and \(T\) is the truncation time. We generate \(X\) from a Weibull distribution with shape parameter 1 and scale parameter 4, and generate \(T\) from a uniform distribution between 0 and 1. This implies that the pre-truncation unconditional Kendall’s taus have different signs for \(T < 0.5\) and \(T\ge0.5\).
> simDat2 <- function(n, tau) {
+ k <- 1
+ tt <- xx <- rep(-1, n)
+ kt <- iTau(normalCopula(), tau)
+ cp <- normalCopula(kt, dim = 2)
+ dist <- mvdc(cp, c("unif", "weibull"),
+ list(list(min = 0, max = 1), list(shape = 1, scale = 4)))
+ while(k <= n){
+ tmp <- rMvdc(1, dist)
+ if (k <= n/2) tt[k] <- tmp[1]
+ if (k > n/2) tt[k] <- 2 - tmp[1]
+ xx[k] <- tmp[2]
+ if(tt[k] <= xx[k]) k = k + 1
+ }
+ dat <- data.frame(Trun = tt, Time = xx)
+ dat
+ }
> set.seed(123)
> tmp <- lapply(-4:4/5, function(x)
+ qplot(Trun, Time, data = simDat2(100, x), main = paste("Tau = ", x)))
> do.call("grid.arrange", c(tmp, ncol = 3))
permDep functionThe arguments of permDep are as follow
> args(permDep)
function (trun, obs, permSize, cens, sampling = c("conditional",
"unconditional", "is.conditional", "is.unconditional"), kendallOnly = FALSE,
minp1Only = FALSE, minp2Only = FALSE, nc = ceiling(detectCores()/2),
seed = NULL, minp.eps = NULL, plot.int = FALSE, anim_name = NULL)
NULL
The arguments are
trun is the left truncation time
obs is the observed failure time; this is defined to be the minimum of the survival time and censoring time In addition, this is required to be greater than the truncation time
permSize is the number of permutations
cens is the status indicator; 0 = censored, 1 = event
sampling is a character string specifying the sampling method used in permutation. The following are permitted.
conditional conditional permutationunconditional unconditional permutationis.conditional importance sampling version of the conditional permutation (under development)is.unconditional importance sampling version of the unconditional permutation (under development)kendallOnly is a logical value indicating whether to use the conditional Kendall’s tau as the only test statistics
minp1Only is a logical value indicating whether to use the minp1 test as the only test statistics
minp2Only is a logical value indicating whether to use the minp2 test as the only test statistics
nc is the number of cores used in the permutation test. When nc > 1, permutation is carried out with parallel computing methods. The default is to use half of the available number of CPU cores on the current host.
seed is an optional vector containing random seeds to be used to generate permutation samples. Random seeds will be used when left unspecified.
minp.eps is an optional value indicating the width of the intervals used in minp2 procedure. The following are permitted.
NULL uses the automatic selection algorithm outlined in Chiou et al. (2018)plot.int is a logical value indicating whether to plot the intervals when applying the minp tests. This is more useful when only one of minp1Only and minp2Only is TRUE.
We will illustrate the usage of permDep with simulated data generated from simDat1 and simDat2.
We first use a small sample size example to demonstrate the different features.
> set.seed(123)
> dat <- simDat1(30, .9) ## highly correlated data
Note that the computation time depends on the machine setting in the current host.
> parallel::detectCores()
[1] 8
> set.seed(123); system.time(f1 <- permDep(dat$Trun, dat$Time, 500, nc = 1))
user system elapsed
20.324 0.016 20.343
> set.seed(123); system.time(f2 <- permDep(dat$Trun, dat$Time, 500, nc = 2))
user system elapsed
0.060 0.012 11.918
> identical(f1, f2)
[1] TRUE
> f1
Hypothesis test of quasi-independence based on conditional Kendall's tau statistics
Conditional permutation with size = 500
p-value = 0.0020
Hypothesis test of quasi-independence based on minp1 statistics
Conditional permutation size = 500
p-value = 0.0020
Hypothesis test of quasi-independence based on minp2 statistics
Conditional permutation size = 500
p-value = 0.0060
Computation time is faster with parallel computation. The following codes run parallel computation with all available cores:
> set.seed(123); system.time(f2 <- permDep(dat$Trun, dat$Time, 500, nc = parallel::detectCores()))
However, using 8 does not make the program run 8 faster.
minp.eps options> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE)$p.valueMinp2
[1] 0.005988024
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE, minp.eps = 0.5)$p.valueMinp2
[1] 0.01197605
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE, minp.eps = "in")$p.valueMinp2
[1] 0.003992016
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE, minp.eps = "out")$p.valueMinp2
[1] 0.001996008
The performance of the minp2 test depends on the specification of the interval width (e.g., minp.esp), however, the resulting \(p\)-values from the above example are significant at \(\alpha = 0.05\).
To render the animation in this section, the users might need to install the gifski package.
Enabling plot.int allows users to visualize the intervals used in computing the log-rank statistics when applying the minp1 and the minp2 tests on the observed data.
> permDep(dat$Trun, dat$Time, 500, minp1Only = TRUE,
+ plot.int = TRUE, anim_name = "vig-minp1")
Hypothesis test of quasi-independence based on minp1 statistics
Conditional permutation size = 500
p-value = 0.0020
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE,
+ plot.int = TRUE, anim_name = "vig-minp2")
Hypothesis test of quasi-independence based on minp2 statistics
Conditional permutation size = 500
p-value = 0.0080
We use simDat2 to generate truncated data with non-monotone dependence. This scenario would emphasize the usefulness of the minp2 test.
> set.seed(123)
> dat <- simDat2(100, -.9) ## highly correlated data
> permDep(dat$Trun, dat$Time, 500, minp1Only = TRUE,
+ plot.int = TRUE, anim_name = "vig-minp1-non-monotone")
Hypothesis test of quasi-independence based on minp1 statistics
Conditional permutation size = 500
p-value = 0.0080
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE,
+ plot.int = TRUE, anim_name = "vig-minp2-non-monotone")
Hypothesis test of quasi-independence based on minp2 statistics
Conditional permutation size = 500
p-value = 0.0080
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE, minp.eps = "out",
+ plot.int = TRUE, anim_name = "vig-minp2-2")
Hypothesis test of quasi-independence based on minp2 statistics
Conditional permutation size = 500
p-value = 0.0060
Chiou, Sy Han, Jing Qian, Elizabeth Mormino, Rebecca A Betensky, Lifestyle Flagship Study of Aging, Harvard Aging Brain Study, Alzheimer’s Disease Neuroimaging Initiative, and others. 2018. “Permutation Tests for General Dependent Truncation.” Computational Statistics & Data Analysis 128: 308–24.