March 23, 2019, at UTRGV

## Our team

• Dr. Jian Qian
• Department of Epidemiology and Biostatistics
• School of Public Health and Health Sciences
• University of Massachusetts Amherst
• Dr. Elizabeth Mormino
• Department of Neurology
• School of Medicine
• Stanford University
• Dr. Rebecca Betensky
• Department of Biostatistics
• College of Global Public Health
• New York University

## Outline

• Motivation and background
• Existing works
• Proposed model
• Conditional and unconditional permutation
• Test statistics
• $$p$$-value
• Simulation studies
• Cognitive and functional decline in aging study
• Conclusion
• Reference

## Aging study

• 490 cognitively normal older individuals (age $$\ge$$ 66) from
• Alzheimer's Disease Neuroimaging Initiative (ADNI, $$n$$ = 198)
• Australian Imaging Biomarkers and Lifestyle Study of Ageing (AIBL, $$n$$ = 131)
• Harvard Aging Brain Study (HABS, $$n$$ = 161)
• Participants had a global Clinical Dementia Rating (CDR) of 0 at enrollment.
• Survival features:
• Time to event: time from baseline to progression to a global CDR of 0.5
• Truncation: had a PET imaging scan within 1 year ($$n$$ = 444)
• Right-censoring: global CDR $$<$$ 0.5 by the end of study (8.4%)

## Challenges

• Standard survival approaches assume independence between the time to event and the truncation time; when this assumption fails, the resulting estimates can be substantially biased.
• Possible dependence:
• Subjects who appear completely cognitively normal may receive early PET scans as clean baselines.
• Non-monotone dependence:
• Subjects who have been declining slowly over extended follow-up may receive late PET scans.

## Notations

• $$X$$ is the failure or event time
• $$T$$ is the truncation time for $$X$$
• $$C$$ is the right censoring time
• $$Y$$ is the observed failure time: $$Y=\min(X, C)$$
• $$\delta$$ is the censoring indicator: $$\delta = 1$$ if $$X\leq C$$ and 0 otherwise.
• The observed data are $$(Y, T, \delta\mid Y \geq T)$$

## LTRC survival data

• An example of left-truncated right-censored survival data

## Quasi-independence

• Can we test the independence between failure time $$(X)$$ and truncation time $$(T)$$ nonparametrically?
• No information is observed when $$X\le T$$
• We can answer this question by testing for quasi-independence, $$X\perp_q T$$

## Permutation approach

• Knowledge of the distribution of the test statistic under the null hypothesis is not required
• To illustrate the idea, we temporarily ignore right-censoring by letting $$C\to\infty$$; the observed data are then $$\{(T_i, X_i); i = 1, \ldots, n\}$$
• In general, a permutation test consists of the following steps:

1. Generate a large number of permuted datasets under the null
2. For each permuted dataset, compute the test statistic
3. Compute a $$p$$-value
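
A minimal Python sketch of these three steps, ignoring truncation and using Pearson correlation as a placeholder statistic (both are simplifications for illustration, not the statistics considered later in this talk):

```python
import numpy as np

def permutation_test(t, x, statistic, n_perm=2000, seed=0):
    """Steps 1-3: permute T against X, recompute the statistic, count exceedances."""
    rng = np.random.default_rng(seed)
    z_obs = statistic(t, x)
    z_perm = np.array([statistic(rng.permutation(t), x) for _ in range(n_perm)])
    # Two-sided p-value with the +1 finite-sample correction
    return (np.sum(np.abs(z_perm) >= abs(z_obs)) + 1) / (n_perm + 1)

rng = np.random.default_rng(1)
t, x = rng.uniform(size=50), rng.uniform(size=50)
p = permutation_test(t, x, lambda a, b: np.corrcoef(a, b)[0, 1])
```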

## Permutation approach

• When there is no truncation, all $$n!$$ permutations are equally likely under the null hypothesis of quasi-independence.
• Assume $$X_1\le \ldots \le X_n$$, and define the permuted data as $$\{(T_i^\ast, X_i); i = 1, \ldots, n\}$$
• We consider two permutation algorithms in the presence of left truncation

1. Conditional permutation (Tsai, 1990; Efron and Petrosian, 1992)
2. Unconditional permutation

## Conditional permutation method

• The conditional permutation procedure consists of the following steps:

1. Initialize with $$m = 1$$
2. For $$X_m$$, select a $$T_m^\ast$$ from $$\{T_i: T_i \leq X_m\}$$
3. Remove $$T_m^\ast$$ from the candidate set and repeat step 2. with $$m = 2, \ldots, n$$.
• Example: suppose the observed data consist of 4 observations, $$\{(X,T):(3,2), (5,1), (8,7), (9,6)\}$$; then we have a total of four possible legal permutations:

{(3, 1), (5, 2), (8, 6), (9, 7)}

{(3, 1), (5, 2), (8, 7), (9, 6)}

{(3, 2), (5, 1), (8, 6), (9, 7)}

{(3, 2), (5, 1), (8, 7), (9, 6)}
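
The count of four legal permutations can be checked by brute-force enumeration (a verification of the example, not the sequential selection algorithm itself):

```python
from itertools import permutations

X = [3, 5, 8, 9]   # event times, sorted
T = [2, 1, 7, 6]   # observed truncation times

# A permutation is legal when every reassigned truncation time
# satisfies T* <= X for its new event time.
legal = [p for p in permutations(T) if all(t <= x for t, x in zip(p, X))]
print(len(legal))  # 4
```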

## Unconditional permutation method

• The unconditional permutation approach consists of the following steps:

1. Permute $$T$$ across all subjects in the dataset
2. Delete observations with $$T_i^\ast>X_i, i = 1, \ldots, n$$
• Example: suppose the observed data consist of 4 observations, $$\{(X,T):(3,2), (5,1), (8,7), (9,6)\}$$; then all $$4!=24$$ permutations are possible, for example:

{(3, 1), (5, 2), (8, 6), (9, 7)}

{(3, 1), (5, 2), (8, 7), (9, 6)}

{(3, 1), (5, 6), (8, 2), (9, 7)}

{(3, 1), (5, 6), (8, 7), (9, 2)}
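
A minimal sketch of these two steps on the same four-observation example (the function name is illustrative):

```python
import numpy as np

def unconditional_permute(T, X, rng):
    """Permute T across all subjects, then delete inadmissible pairs (T* > X)."""
    T_star = rng.permutation(T)
    keep = T_star <= X
    return T_star[keep], X[keep]

rng = np.random.default_rng(0)
X = np.array([3, 5, 8, 9])
T = np.array([2, 1, 7, 6])
T_new, X_new = unconditional_permute(T, X, rng)
# The retained sample size varies across draws (here between 2 and 4),
# which is the sample-size cost of this approach.
```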

## The two permutations

• Conditional permutation
• Suffers from low power when test statistics are risk-set-based but not fully determined by the sizes of the risk sets.
• Unconditional permutation
• Reduces the sample size due to inadmissibility of some permuted observations, but enlarges the sample space of permutations
• Easier to generate

## Test statistics

1. Conditional Kendall's tau
• A consistent estimator of $$\tau_c^*$$ is $\hat\tau_c^*=\frac{1}{M}\sum_{i = 1}^{n-1}\sum_{j=i+1}^n \mbox{sgn} [(Y_i - Y_j)(T_i-T_j)] I(\Lambda_{ij}),$ where $$M$$ is the number of comparable pairs indicated by $$I(\Lambda_{ij})$$
• Asymptotic properties are established via U-statistics Martin and Betensky (2005)
• Powerful for monotone relationships, but may completely miss non-monotone relationships
2. & 3. Minimally selected $$p$$-value: $$\mbox{minp}_1$$, $$\mbox{minp}_2$$
• Aim to detect non-monotone dependencies
• Asymptotic variance is complicated
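
A minimal sketch of $$\hat\tau_c^*$$ for the uncensored case, taking the comparability condition to be $$\max(T_i, T_j)\le\min(X_i, X_j)$$ (the censored case and the asymptotics are in Martin and Betensky (2005)):

```python
import numpy as np

def conditional_kendall_tau(T, X):
    """Conditional Kendall's tau for uncensored left-truncated data.

    A pair (i, j) contributes only when max(T_i, T_j) <= min(X_i, X_j),
    i.e., both orderings of the pair could have been observed."""
    T, X = np.asarray(T, float), np.asarray(X, float)
    total, m = 0.0, 0
    n = len(X)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if max(T[i], T[j]) <= min(X[i], X[j]):  # comparability I(Lambda_ij)
                total += np.sign((X[i] - X[j]) * (T[i] - T[j]))
                m += 1
    return total / m if m else 0.0

# On the four-observation example, only pairs (1,2) and (3,4) are comparable,
# and both are discordant, so the estimate is -1.
tau_hat = conditional_kendall_tau([2, 1, 7, 6], [3, 5, 8, 9])
```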

## Minimally selected $$p$$-value: $$\mbox{minp}_1$$

• We propose to obtain the $$\mbox{minp}_1$$ test statistic and $$p$$-value as follows:

1. Partition the data into two groups: $$\{T<t\}$$ or $$\{T>t\}$$
2. Compute the log-rank statistic ($$p$$ value) for the two groups
3. Repeat 1. and 2. for $$t\in\{T_1, \ldots, T_n\}$$
4. The $$\mbox{minp}_1$$ test statistic ($$p$$-value) is the maximum (minimum) of these realizations
• For every cut-point, require at least $$E$$ events in each group.
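
A sketch of the scanning procedure, using the ordinary two-sample log-rank statistic with no truncation adjustment (a simplification; all names are illustrative):

```python
import numpy as np

def logrank_stat(time, event, group):
    """Two-sample log-rank chi-square statistic (non-truncation-adjusted)."""
    OmE, V = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        r = time >= t                                  # risk set at t
        n, n1 = r.sum(), (r & (group == 1)).sum()
        d = ((time == t) & (event == 1)).sum()         # events at t
        d1 = ((time == t) & (event == 1) & (group == 1)).sum()
        OmE += d1 - d * n1 / n
        if n > 1:
            V += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return OmE ** 2 / V if V > 0 else 0.0

def minp1_stat(T, Y, delta, E=5):
    """Scan cut-points t over T, split subjects by {T <= t} vs {T > t},
    and keep the largest log-rank statistic among admissible splits."""
    best = 0.0
    for t in np.sort(T):
        g = (T > t).astype(int)
        if delta[g == 0].sum() >= E and delta[g == 1].sum() >= E:
            best = max(best, logrank_stat(Y, delta, g))
    return best
```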

## Minimally selected $$p$$-value: $$\mbox{minp}_2$$

• An alternative is the $$\mbox{minp}_2$$ test:

1. Partition the data into two groups: $$\{T\in(t-\epsilon, t+\epsilon)\}$$ or $$\{T\not\in(t-\epsilon, t+\epsilon)\}$$
2. Compute the log-rank statistic ($$p$$-value) for the two groups
3. Repeat 1. and 2. for $$t\in\{T_1, \ldots, T_n\}$$
4. The $$\mbox{minp}_2$$ test statistic ($$p$$-value) is the maximum (minimum) of these realizations.
• This allows for $$X$$ to be associated with moderate $$T$$ differently from small or large $$T$$.
• Choose $$\epsilon$$ so that each group retains at least $$E$$ events.
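
The window-based split can be sketched as follows; `stat` stands in for any log-rank-type two-sample statistic, and the function names are illustrative:

```python
import numpy as np

def minp2_splits(T, eps):
    """For each cut-point t in T, label subjects by T inside (t - eps, t + eps)."""
    return [((T > t - eps) & (T < t + eps)).astype(int) for t in np.sort(T)]

def minp2_stat(T, Y, delta, eps, stat, E=5):
    """Maximize a two-sample statistic over window-based splits that retain
    at least E events in each group; stat(Y, delta, g) is any log-rank-type
    statistic for group labels g in {0, 1}."""
    best = 0.0
    for g in minp2_splits(T, eps):
        if delta[g == 0].sum() >= E and delta[g == 1].sum() >= E:
            best = max(best, stat(Y, delta, g))
    return best

# Example with a placeholder statistic (difference in mean observed times):
rng = np.random.default_rng(0)
T = rng.uniform(0, 2, 60)
Y = rng.uniform(1, 5, 60)
delta = np.ones(60, dtype=int)
s = minp2_stat(T, Y, delta, eps=0.5,
               stat=lambda y, d, g: abs(y[g == 1].mean() - y[g == 0].mean()))
```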

## Computing $$p$$-values

• Let $$z_{\mbox{obs}}$$ be the observed test statistic
• The exact permutation $$p$$-value is defined as $p_N = \frac{\sum_{i=1}^NI(|z_i| \geq |z_{\mbox{obs}}|)}{N},$ where $$z_1, z_2, \ldots, z_N$$ are the test statistics computed from all possible permuted datasets.
• $$p_N$$ can be approximated by $\hat{p}_N = \frac{\sum_{i=1}^{N^*}I(|z^*_i|\geq |z_{\mbox{obs}}|)}{N^*} \approx \frac{\sum_{i=1}^{N^*}I(|z^*_i|\geq |z_{\mbox{obs}}|)+1}{N^*+1},$ where $$z^\ast_1, \ldots, z^\ast_{N^\ast}$$ are the sampled permutation test statistics.
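
The approximate $$p$$-value with the "+1" correction, which guarantees a strictly positive estimate from finitely many permutations:

```python
import numpy as np

def permutation_pvalue(z_obs, z_perm):
    """Two-sided permutation p-value with the +1 correction, which keeps
    the estimate strictly positive for a finite number of permutations."""
    z_perm = np.asarray(z_perm, dtype=float)
    return (np.sum(np.abs(z_perm) >= abs(z_obs)) + 1) / (len(z_perm) + 1)

p = permutation_pvalue(2.0, [0.5, -2.5, 1.0, 3.0])  # (2 + 1) / (4 + 1) = 0.6
```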

## Simulation 1: setups

• Generate $$(X,T)$$ from a bivariate normal copula
• $$X\sim\mbox{Weibull}(3, 8.5)$$
• $$T\sim\mbox{exp}(0.2)$$
• Nine levels of dependence measured by Kendall's tau $$(\tau)$$:
• $$\tau=0, \pm0.2, \pm0.4, \pm0.6, \pm0.8$$
• Sample size after truncation: 100 and 200.
• Censoring times follow an independent $$\mbox{Uniform}(0, c)$$
• $$0\%, 25\%,$$ and $$50\%$$ after truncation
• 5000 permutations
• 1000 replications
• We compare the rejection proportions at a significance level of 0.05
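
A sketch of the data-generating mechanism, assuming Weibull(shape, scale) and exponential-rate parameterizations and using $$\rho = \sin(\pi\tau/2)$$ to convert Kendall's tau into the Gaussian-copula correlation:

```python
import numpy as np
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def generate_ltrc(n_keep, tau, rng):
    """Draw (X, T) from a Gaussian copula and keep pairs with T <= X.

    Assumed margins: X ~ Weibull(shape 3, scale 8.5), T ~ Exponential(rate 0.2);
    rho = sin(pi * tau / 2) gives the copula correlation for Kendall's tau."""
    rho = np.sin(np.pi * tau / 2)
    X_out, T_out = [], []
    while len(X_out) < n_keep:
        z1 = rng.standard_normal()
        z2 = rho * z1 + sqrt(1 - rho ** 2) * rng.standard_normal()
        u, v = phi(z1), phi(z2)
        x = 8.5 * (-np.log(1 - u)) ** (1 / 3)   # inverse Weibull CDF
        t = -np.log(1 - v) / 0.2                # inverse exponential CDF
        if t <= x:                              # left truncation
            X_out.append(x)
            T_out.append(t)
    return np.array(X_out), np.array(T_out)

X, T = generate_ltrc(100, 0.4, np.random.default_rng(2))
```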

## Simulation 1: convergence

Consistency of the rejection proportions (0% censoring)

## Simulation 1: Timing results

Timing results in seconds (0% censoring)

## Simulation 1: Rejection proportion

Rejection proportion with $$n = 100$$:

## Simulation 2: setups

• Generate $$(|T - 2.5|, X)$$ from a bivariate normal copula
• $$T\sim\mbox{Uniform}(0, 5)$$
• $$X\sim\mbox{Weibull}(3, 8.5)$$
• Nine dependence levels
• $$0, \pm0.2, \pm0.4, \pm0.6, \pm0.8$$
• Sample size after truncation: 100 and 200
• Censoring times follow an independent $$\mbox{Uniform}(0, c)$$
• 0%, 25%, and 50% after truncation
• 5000 permutations
• 1000 replications
• We compare the rejection proportions at a significance level of 0.05.
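
The non-monotone design can be sketched similarly: the copula links $$X$$ to $$|T - 2.5|$$, and a random sign around 2.5 recovers $$T\sim\mbox{Uniform}(0, 5)$$ (marginal parameterizations are assumptions, as before):

```python
import numpy as np
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def generate_nonmonotone(n_keep, tau, rng):
    """Link X to |T - 2.5| through a Gaussian copula (rho = sin(pi * tau / 2)).

    T ~ Uniform(0, 5), so |T - 2.5| ~ Uniform(0, 2.5); a random sign around
    2.5 recovers T. Assumed margin: X ~ Weibull(shape 3, scale 8.5).
    Pairs with T > X are discarded (left truncation)."""
    rho = np.sin(np.pi * tau / 2)
    X_out, T_out = [], []
    while len(X_out) < n_keep:
        z1 = rng.standard_normal()
        z2 = rho * z1 + sqrt(1 - rho ** 2) * rng.standard_normal()
        u, v = phi(z1), phi(z2)
        t = 2.5 + rng.choice([-1, 1]) * 2.5 * u   # |T - 2.5| = 2.5 u
        x = 8.5 * (-np.log(1 - v)) ** (1 / 3)     # inverse Weibull CDF
        if t <= x:
            X_out.append(x)
            T_out.append(t)
    return np.array(X_out), np.array(T_out)

X, T = generate_nonmonotone(100, 0.6, np.random.default_rng(4))
```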

## Simulation 2: rejection proportion

Rejection proportion with $$n = 100$$: