permDep
functionIn this vignette, we demonstrate the usage of the permDep
function.
We define the following functions to generate survival data with dependent left-truncation.
In the first setting, we generate the survival time from a Weibull distribution with shape parameter 3 and scale parameter 8.5 and generate the truncation time from an exponential with mean 5. The dependence between the survival time and the truncation time is controlled by a normal copula from the copula
package with a pre-specified pre-truncation unconditional Kendall’s tau (tau
).
> library(copula)
> simDat1 <- function(n, tau) {
+ k <- 1
+ tt <- xx <- rep(-1, n)
+ kt <- iTau(normalCopula(), tau)
+ cp <- normalCopula(kt, dim = 2)
+ dist <- mvdc(cp, c("exp", "weibull"), list(list(rate = 0.2), list(shape = 3, scale = 8.5)))
+ while(k <= n){
+ dat <- rMvdc(1, dist)
+ tt[k] <- dat[1]
+ xx[k] <- dat[2]
+ if(tt[k] <= xx[k]) k = k + 1
+ }
+ data.frame(Trun = tt, Time = xx)
+ }
The simDat1
function generates monotone dependence model and is used in Chiou et al. (2018) in the absence of censoring. The following scatterplots confirms the monotonic dependence.
> library(ggplot2)
> library(gridExtra)
> set.seed(123)
> tmp <- lapply(-4:4/5, function(x)
+ qplot(Trun, Time, data = simDat1(100, x), main = paste("Tau = ", x)))
> do.call("grid.arrange", c(tmp, ncol = 3))
In the second setting, we consider a scenario with non-monotonic dependence between the survival time and the truncation time. We used a normal copula to specify the joint distribution of \((X, |T- 0.5|)\), where \(X\) is the survival time and \(T\) is the truncation time. We generate \(X\) from a Weibull distribution with shape parameter 1 and scale parameter 4, and generate \(T\) from a uniform distribution between 0 and 1. This implies that the pre-truncation unconditional Kendall’s taus have different signs for \(T < 0.5\) and \(T\ge0.5\).
> simDat2 <- function(n, tau) {
+ k <- 1
+ tt <- xx <- rep(-1, n)
+ kt <- iTau(normalCopula(), tau)
+ cp <- normalCopula(kt, dim = 2)
+ dist <- mvdc(cp, c("unif", "weibull"),
+ list(list(min = 0, max = 1), list(shape = 1, scale = 4)))
+ while(k <= n){
+ tmp <- rMvdc(1, dist)
+ if (k <= n/2) tt[k] <- tmp[1]
+ if (k > n/2) tt[k] <- 2 - tmp[1]
+ xx[k] <- tmp[2]
+ if(tt[k] <= xx[k]) k = k + 1
+ }
+ dat <- data.frame(Trun = tt, Time = xx)
+ dat
+ }
> set.seed(123)
> tmp <- lapply(-4:4/5, function(x)
+ qplot(Trun, Time, data = simDat2(100, x), main = paste("Tau = ", x)))
> do.call("grid.arrange", c(tmp, ncol = 3))
permDep
functionThe arguments of permDep
are as follow
> args(permDep)
function (trun, obs, permSize, cens, sampling = c("conditional",
"unconditional", "is.conditional", "is.unconditional"), kendallOnly = FALSE,
minp1Only = FALSE, minp2Only = FALSE, nc = ceiling(detectCores()/2),
seed = NULL, minp.eps = NULL, plot.int = FALSE, anim_name = NULL)
NULL
The arguments are
trun
is the left truncation time
obs
is the observed failure time; this is defined to be the minimum of the survival time and censoring time In addition, this is required to be greater than the truncation time
permSize
is the number of permutations
cens
is the status indicator; 0 = censored, 1 = event
sampling
is a character string specifying the sampling method used in permutation. The following are permitted.
conditional
conditional permutationunconditional
unconditional permutationis.conditional
importance sampling version of the conditional permutation (under development)is.unconditional
importance sampling version of the unconditional permutation (under development)kendallOnly
is a logical value indicating whether to use the conditional Kendall’s tau as the only test statistics
minp1Only
is a logical value indicating whether to use the minp1 test as the only test statistics
minp2Only
is a logical value indicating whether to use the minp2 test as the only test statistics
nc
is the number of cores used in the permutation test. When nc > 1
, permutation is carried out with parallel computing methods. The default is to use half of the available number of CPU cores on the current host.
seed
is an optional vector containing random seeds to be used to generate permutation samples. Random seeds will be used when left unspecified.
minp.eps
is an optional value indicating the width of the intervals used in minp2 procedure. The following are permitted.
NULL
uses the automatic selection algorithm outlined in Chiou et al. (2018)plot.int
is a logical value indicating whether to plot the intervals when applying the minp tests. This is more useful when only one of minp1Only
and minp2Only
is TRUE
.
We will illustrate the usage of permDep
with simulated data generated from simDat1
and simDat2
.
We first use a small sample size example to demonstrate the different features.
> set.seed(123)
> dat <- simDat1(30, .9) ## highly correlated data
Note that the computation time depends on the machine setting in the current host.
> parallel::detectCores()
[1] 8
> set.seed(123); system.time(f1 <- permDep(dat$Trun, dat$Time, 500, nc = 1))
user system elapsed
20.324 0.016 20.343
> set.seed(123); system.time(f2 <- permDep(dat$Trun, dat$Time, 500, nc = 2))
user system elapsed
0.060 0.012 11.918
> identical(f1, f2)
[1] TRUE
> f1
Hypothesis test of quasi-independence based on conditional Kendall's tau statistics
Conditional permutation with size = 500
p-value = 0.0020
Hypothesis test of quasi-independence based on minp1 statistics
Conditional permutation size = 500
p-value = 0.0020
Hypothesis test of quasi-independence based on minp2 statistics
Conditional permutation size = 500
p-value = 0.0060
Computation time is faster with parallel computation. The following codes run parallel computation with all available cores:
> set.seed(123); system.time(f2 <- permDep(dat$Trun, dat$Time, 500, nc = parallel::detectCores()))
However, using 8 does not make the program run 8 faster.
minp.eps
options> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE)$p.valueMinp2
[1] 0.005988024
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE, minp.eps = 0.5)$p.valueMinp2
[1] 0.01197605
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE, minp.eps = "in")$p.valueMinp2
[1] 0.003992016
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE, minp.eps = "out")$p.valueMinp2
[1] 0.001996008
The performance of the minp2 test depends on the specification of the interval width (e.g., minp.esp
), however, the resulting \(p\)-values from the above example are significant at \(\alpha = 0.05\).
To render the animation in this section, the users might need to install the gifski
package.
Enabling plot.int
allows users to visualize the intervals used in computing the log-rank statistics when applying the minp1 and the minp2 tests on the observed data.
> permDep(dat$Trun, dat$Time, 500, minp1Only = TRUE,
+ plot.int = TRUE, anim_name = "vig-minp1")
Hypothesis test of quasi-independence based on minp1 statistics
Conditional permutation size = 500
p-value = 0.0020
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE,
+ plot.int = TRUE, anim_name = "vig-minp2")
Hypothesis test of quasi-independence based on minp2 statistics
Conditional permutation size = 500
p-value = 0.0080
We use simDat2
to generate truncated data with non-monotone dependence. This scenario would emphasize the usefulness of the minp2 test.
> set.seed(123)
> dat <- simDat2(100, -.9) ## highly correlated data
> permDep(dat$Trun, dat$Time, 500, minp1Only = TRUE,
+ plot.int = TRUE, anim_name = "vig-minp1-non-monotone")
Hypothesis test of quasi-independence based on minp1 statistics
Conditional permutation size = 500
p-value = 0.0080
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE,
+ plot.int = TRUE, anim_name = "vig-minp2-non-monotone")
Hypothesis test of quasi-independence based on minp2 statistics
Conditional permutation size = 500
p-value = 0.0080
> permDep(dat$Trun, dat$Time, 500, minp2Only = TRUE, minp.eps = "out",
+ plot.int = TRUE, anim_name = "vig-minp2-2")
Hypothesis test of quasi-independence based on minp2 statistics
Conditional permutation size = 500
p-value = 0.0060
Chiou, Sy Han, Jing Qian, Elizabeth Mormino, Rebecca A Betensky, Lifestyle Flagship Study of Aging, Harvard Aging Brain Study, Alzheimer’s Disease Neuroimaging Initiative, and others. 2018. “Permutation Tests for General Dependent Truncation.” Computational Statistics & Data Analysis 128: 308–24.