Threshold regression with a censored covariate

This function fits a linear regression model when one covariate is censored. The method thresholds the continuous covariate into a binary covariate. A collection of threshold regression methods is implemented to estimate the regression coefficient and to test the significance of the effect of the censored covariate. When there is no censoring, the method reduces to standard linear regression.

The assumed linear regression model is $$Y = a_0 + a_1X + a_2Z + e,$$ where X is the covariate of interest, which is subject to right censoring, Z is a matrix of fully observed covariates, Y is the response variable, and e is an independent random error term with mean 0 and finite variance.

The hypothesis test of association is based on the significance of the regression coefficient a1. However, when deletion threshold regression or complete threshold regression is used, an equivalent test that is easier to evaluate is performed instead. Namely, given a threshold t*, we define a derived binary covariate, X*, such that X* = 1 when X > t* and X* = 0 when X is uncensored and X < t*. The resulting linear regression can be expressed as $$E(Y|X^\ast, Z) = b_0 + b_1X^\ast + b_2Z.$$ The association is then tested through the significance of b1. Under the assumption that X is independent of Z given X*, b2 is equivalent to a2.
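The link between the two models can be made explicit. The following is a sketch rather than a statement from the package documentation: it assumes that the error e has mean zero given (X*, Z), and it writes $\mu_1 = E(X \mid X^\ast = 1)$ and $\mu_0 = E(X \mid X^\ast = 0)$, symbols introduced here only for illustration.

```latex
\begin{align*}
E(Y \mid X^\ast, Z)
  &= a_0 + a_1 E(X \mid X^\ast, Z) + a_2 Z \\
  &= a_0 + a_1 E(X \mid X^\ast) + a_2 Z
     && \text{($X$ independent of $Z$ given $X^\ast$)} \\
  &= a_0 + a_1 \{\mu_0 + (\mu_1 - \mu_0) X^\ast\} + a_2 Z \\
  &= \underbrace{(a_0 + a_1 \mu_0)}_{b_0}
     + \underbrace{a_1 (\mu_1 - \mu_0)}_{b_1} X^\ast
     + \underbrace{a_2}_{b_2} Z .
\end{align*}
```

Since $\mu_1 > t^\ast > \mu_0$, the factor $\mu_1 - \mu_0$ is strictly positive, so under these assumptions b1 = 0 exactly when a1 = 0, and b2 coincides with a2.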

thlm(formula, data, method = c("cc", "reverse", "deletion-threshold",
  "complete-threshold", "all"), B = 0, subset, x.upplim = NULL,
  t0 = NULL, control = thlm.control())

Arguments

formula	A formula expression in the form `response ~ predictors`. The response variable is assumed to be fully observed. The `thlm` function can accommodate at most one censored covariate, which is entered as a `Surv` object; see `survival::Surv` for more detail. When all the covariates are uncensored, the `thlm` function returns an `lm` object.
data	An optional data frame, list, or environment containing the variables in the `formula` and the `subset` argument. If left unspecified, the variables are taken from `environment(formula)`, typically the environment from which `thlm` is called.
method	A character string specifying the threshold regression method to be used. The following are permitted: `cc` for complete-cases regression; `reverse` for reverse survival regression; `deletion-threshold` for deletion threshold regression; `complete-threshold` for complete threshold regression; `all` for all four approaches.
B	A numeric value specifying the bootstrap size for estimating the standard deviation of the regression coefficient for the censored covariate when `method = "deletion-threshold"` or `method = "complete-threshold"`. When `B = 0`, only the beta estimate will be displayed.
subset	An optional vector specifying a subset of observations to be used in the fitting process.
x.upplim	An optional numeric value specifying the upper support of the censored covariate. When left unspecified, the maximum of the censored covariate will be used.
t0	An optional numeric value specifying the threshold when `method = "deletion-threshold"` or `method = "complete-threshold"`. When left unspecified, an optimal threshold will be determined to optimize test power using the procedure proposed in Qian et al. (2018).
control	A list of parameters: `t0.interval` sets the end points of the interval to be searched for the optimal threshold when `t0` is left unspecified; `t0.plot` controls whether the objective function will be plotted. When `t0.plot` is true, both the raw objective function values and the smoothed estimates (using local polynomial regression fitting) are plotted.
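The roles of `t0` and `B` can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helpers `derive_xstar` and `fit_b1` are hypothetical names introduced here, the threshold is fixed by hand rather than optimized, and the bootstrap targets the coefficient b1 of the derived binary covariate only.

```r
## Hypothetical helper (not part of the package): deletion-threshold covariate.
## time  = observed value min(X, C); delta = 1 if X is uncensored, 0 if censored.
derive_xstar <- function(time, delta, t0) {
  xstar <- rep(NA_real_, length(time))
  xstar[time > t0] <- 1                 # X > t0 is known, censored or not
  xstar[delta == 1 & time < t0] <- 0    # uncensored and below the threshold
  xstar                                 # NA: censored before t0, status unknown
}

## Hypothetical helper: estimate of b1 at a fixed threshold t0.
fit_b1 <- function(dat, t0) {
  dat$xstar <- derive_xstar(dat$X, dat$delta, t0)
  ## lm() drops rows with NA xstar (default na.action), i.e. the "deletion" step
  unname(coef(lm(Y ~ xstar + Z, data = dat))["xstar"])
}

## Simulated data, same design as simDat() in the Examples
set.seed(1)
n <- 200
X <- rexp(n, 3)
Z <- runif(n, 1, 6)
Y <- 0.5 + 0.5 * X - 0.5 * Z + rnorm(n, 0, 0.75)
cstime <- rexp(n, 0.75)
dat <- data.frame(Y = Y, X = pmin(X, cstime), Z = Z, delta = (X <= cstime) * 1)

t0 <- 0.3   # illustrative threshold, chosen by hand (assumption)
B <- 100    # bootstrap size

b1_hat <- fit_b1(dat, t0)
## Nonparametric bootstrap: resample rows with replacement, refit, collect b1
b1_boot <- replicate(B, fit_b1(dat[sample(n, replace = TRUE), ], t0))
b1_se <- sd(b1_boot)
```

Here `b1_se` is the bootstrap standard deviation of the b1 estimate at the fixed threshold; a larger `B` gives a more stable value at the cost of more refits.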

References

Qian, J., Chiou, S.H., Maye, J.E., Atem, F., Johnson, K.A. and Betensky, R.A. (2018). Threshold regression to accommodate a censored covariate. Biometrics, 74(4): 1261--1270.

Atem, F., Qian, J., Maye, J.E., Johnson, K.A. and Betensky, R.A. (2017). Linear regression with a randomly censored covariate: Application to an Alzheimer's study. Journal of the Royal Statistical Society: Series C, 66(2): 313--328.

Examples

simDat <- function(n) {
  X <- rexp(n, 3)
  Z <- runif(n, 1, 6)
  Y <- 0.5 + 0.5 * X - 0.5 * Z + rnorm(n, 0, .75)
  cstime <- rexp(n, .75)
  delta <- (X <= cstime) * 1
  X <- pmin(X, cstime)
  data.frame(Y = Y, X = X, Z = Z, delta = delta)
}

set.seed(0)
dat <- simDat(200)

library(survival)
## Falsely assumes all covariates are free of censoring
thlm(Y ~ X + Z, data = dat)
#> 
#>  Call: thlm(formula = Y ~ X + Z, data = dat)
#> 
#>  Hypothesis test of association
#>  H0: a1 = 0, p-value = 0.0023 
#> 

## Complete cases regression
thlm(Y ~ Surv(X, delta) + Z, data = dat, method = "cc")
#> 
#>  Call: thlm(formula = Y ~ Surv(X, delta) + Z, data = dat, method = "cc")
#> 
#>  Hypothesis test of association
#> H0: a1 = 0, p-value = 0.0033 
#> 

## reverse survival regression
thlm(Y ~ Surv(X, delta) + Z, data = dat, method = "rev")
#> 
#>  Call: thlm(formula = Y ~ Surv(X, delta) + Z, data = dat, method = "rev")
#> 
#>  Hypothesis test of association
#> H0: a1 = 0, p-value = 0.0026 
#> 

## threshold regression without bootstrap
thlm(Y ~ Surv(X, delta) + Z, data = dat, method = "del")
#> 
#>  Call: thlm(formula = Y ~ Surv(X, delta) + Z, data = dat, method = "del")
#> 
#>  Hypothesis test of association
#>  H0: b1 = 0, p-value = 0.0080 
#> 
thlm(Y ~ Surv(X, delta) + Z, data = dat, method = "com",
     control = list(t0.interval = c(0.2, 0.6), t0.plot = FALSE))
#> 
#>  Call: thlm(formula = Y ~ Surv(X, delta) + Z, data = dat, method = "com", 
#>     control = list(t0.interval = c(0.2, 0.6), t0.plot = FALSE))
#> 
#>  Hypothesis test of association
#>  H0: b1 = 0, p-value = 0.0040 
#> 

## threshold regression with bootstrap
thlm(Y ~ Surv(X, delta) + Z, data = dat, method = "del", B = 100)
#> 
#>  Call: thlm(formula = Y ~ Surv(X, delta) + Z, data = dat, method = "del", 
#>     B = 100)
#> 
#>  Hypothesis test of association
#>  H0: b1 = 0, p-value = 0.0080
#>  H0: a1 = 0, p-value = 0.0082 
#> 
thlm(Y ~ Surv(X, delta) + Z, data = dat, method = "com", B = 100)
#> 
#>  Call: thlm(formula = Y ~ Surv(X, delta) + Z, data = dat, method = "com", 
#>     B = 100)
#> 
#>  Hypothesis test of association
#>  H0: b1 = 0, p-value = 0.0053
#>  H0: a1 = 0, p-value = 0.0111 
#> 

## display all
thlm(Y ~ Surv(X, delta) + Z, data = dat, method = "all", B = 100)
#> 
#>  Call: thlm(formula = Y ~ Surv(X, delta) + Z, data = dat, method = "all", 
#>     B = 100)
#> 
#>  Hypothesis test of association
#> 
#>  Complete-cases
#>  H0: a1 = 0, p-value = 0.0033 
#> 
#>  Reverse survival
#>  H0: a1 = 0, p-value = 0.0026 
#> 
#>  Deletion threshold
#>  H0: b1 = 0: p-value = 0.0080
#>  H0: a1 = 0: p-value = 0.0090
#> 
#>  Complete threshold
#>  H0: b1 = 0: p-value = 0.0053
#>  H0: a1 = 0: p-value = 0.0136
