| Title: | Generalized Promotion Time Cure Model with Bayesian Shrinkage Priors |
|---|---|
| Description: | Generalized promotion time cure model (GPTCM) via Bayesian hierarchical modeling for multiscale data integration (Zhao et al. (2025) <doi:10.48550/arXiv.2509.01001>). The Bayesian GPTCMs are applicable for both low- and high-dimensional data. |
| Authors: | Zhi Zhao [aut, cre] |
| Maintainer: | Zhi Zhao <[email protected]> |
| License: | GPL-3 |
| Version: | 2.0.0 |
| Built: | 2026-06-04 23:10:34 UTC |
| Source: | https://github.com/ocbe-uio/GPTCM |
Extract the posterior estimate of the parameters of a GPTCM class object.
getEstimator(object, estimator = "gamma", Pmax = 0, type = "marginal")getEstimator(object, estimator = "gamma", Pmax = 0, type = "marginal")
object |
an object of class |
estimator |
the name of one estimator. Default is the latent indicator
estimator " |
Pmax |
threshold that truncate the estimator " |
type |
the type of output beta. Default is |
Return the estimator from an object of class GPTCM. It is
a matrix or vector
Zhao Z, Kızılaslan F, Wang S, Zucknick M (2025). Generalized promotion time cure model: A new modeling framework to identify cell-type-specific genes and improve survival prognosis. arXiv:2509.01001
# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0) gamma.hat <- getEstimator(fit, estimator = "gamma")# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0) gamma.hat <- getEstimator(fit, estimator = "gamma")
This is the main function to fit the Bayesian GPTCMs (Zhao et al. 2025) with multiscale data for sparse identification of high-dimensional covariates. The core code for MCMC algorithm uses Rcpp (Eddelbuettel and François 2011) and RcppArmadillo (Eddelbuettel and Sanderson 2014)
GPTCM( dat, nIter = 500, burnin = 200, thin = 1, tick = 500, proportion.model = TRUE, dirichlet = TRUE, hyperpar = NULL, BVS = TRUE, threads = 1, kappaIGamma = FALSE, kappaSampler = "arms", gammaPrior = "bernoulli", gammaSampler = "MC3", etaPrior = "bernoulli", etaSampler = "MC3", w0IGamma = TRUE, initial = NULL, arms.list = NULL )GPTCM( dat, nIter = 500, burnin = 200, thin = 1, tick = 500, proportion.model = TRUE, dirichlet = TRUE, hyperpar = NULL, BVS = TRUE, threads = 1, kappaIGamma = FALSE, kappaSampler = "arms", gammaPrior = "bernoulli", gammaSampler = "MC3", etaPrior = "bernoulli", etaSampler = "MC3", w0IGamma = TRUE, initial = NULL, arms.list = NULL )
dat |
input data as a list containing survival data sub-list
|
nIter |
the number of iterations of the chain |
burnin |
number of iterations to discard at the start of the chain |
thin |
thinning MCMC intermediate results to be stored |
tick |
an integer used for printing the iteration index and some updated parameters every tick-th iteration. Default is 1 |
proportion.model |
logical value; should the proportions be modeled or
not. If ( |
dirichlet |
logical value; should the proportions be modeled via the
common ( |
hyperpar |
a list of relevant hyperparameters |
BVS |
logical value for implementing Bayesian variable selection |
threads |
maximum threads used for parallelization. Default is 1 |
kappaIGamma |
logical value for using inverse-gamma prior ( |
kappaSampler |
one of |
gammaPrior |
one of |
gammaSampler |
one of |
etaPrior |
one of |
etaSampler |
one of |
w0IGamma |
logical value; if |
initial |
a list of initial values for parameters "kappa", "xi", "betas", and "zetas" |
arms.list |
a list of parameters for the ARMS method |
An object of a list including the following components:
input - a list of all input parameters by the user
output - a list of the all mcmc output estimates:
"xi" - a matrix with MCMC intermediate estimates of effects on clinical variables
"kappa" - a vector with MCMC intermediate estimates of the Weibull's shape parameter
"betas" - a matrix with MCMC intermediate estimates of effects on cluster-specific survival
"zetas" - a matrix with MCMC intermediate estimates of effects on cluster-specific proportions
"gammas" - a matrix with MCMC intermediate estimates of inclusion indicators of variables for cluster-specific survival
"gamma_acc_rate" - acceptance rate of the M-H sampling for gammas
"etas" - a matrix with MCMC intermediate estimates of inclusion indicators of variables for cluster-specific proportions
"eta_acc_rate" - acceptance rate of the M-H sampling for etas
"loglikelihood" - a matrix with MCMC intermediate estimates of individuals' likelihoods
"tauSq" - a vector with MCMC intermediate estimates of tauSq
"wSq" - a matrix with MCMC intermediate estimates of wSq
"vSq" - a matrix with MCMC intermediate estimates of vSq
"post" - a list with posterior means of "xi", "kappa", "betas", "zetas", "gammas", "etas"
call - the matched call
Eddelbuettel D, Sanderson C (2014). RcppArmadillo: Accelerating R with high-performance C++ linear algebra. Computational Statistics and Data Analysis, 71, 1054–1063
Zhao Z, Kızılaslan F, Wang S, Zucknick M (2025). Generalized promotion time cure model: A new modeling framework to identify cell-type-specific genes and improve survival prognosis. arXiv:2509.01001
# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0)# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0)
Random number generator via Metropolis-Hastings algorithm.
metropolis_sampler( initial_value, n = n, proposal_shape = 1, proposal_scale = 1, theta = 1, proportion = 0.5, mu = 1, kappas = 0.9, burnin = 0, lag = 1 )metropolis_sampler( initial_value, n = n, proposal_shape = 1, proposal_scale = 1, theta = 1, proportion = 0.5, mu = 1, kappas = 0.9, burnin = 0, lag = 1 )
initial_value |
initial values |
n |
number of draws |
proposal_shape |
Weibull's shape parameter in the proposal |
proposal_scale |
Weibull's scale parameter in the proposal |
theta |
cure rate parameter (log scale) |
proportion |
proportions data |
mu |
mean survival time |
kappas |
Weibull's true shape parameter |
burnin |
length of burn-in period |
lag |
discarding lag-1 values in the Metropolis step |
A dataframe consisting of the sampled values and acceptance rate
times <- metropolis_sampler(10, 5)times <- metropolis_sampler(10, 5)
Predict time-dependent Brier scores based on different survival models
plotBrier( dat, datMCMC, dat.new = NULL, time.star = NULL, xlab = "Time", ylab = "Brier score", PTCM = TRUE, ... )plotBrier( dat, datMCMC, dat.new = NULL, time.star = NULL, xlab = "Time", ylab = "Brier score", PTCM = TRUE, ... )
dat |
input data as a list containing survival data sub-list
|
datMCMC |
returned object from the main function |
dat.new |
input data for out-sample prediction, with the same format
as |
time.star |
largest time for survival prediction |
xlab |
a title for the x axis |
ylab |
a title for the y axis |
PTCM |
logical value for adding survival prediction by the PTCM |
... |
other parameters |
A ggplot2::ggplot object. See ?ggplot2::ggplot for more
details of the object.
Zhao Z, Kızılaslan F, Wang S, Zucknick M (2025). Generalized promotion time cure model: A new modeling framework to identify cell-type-specific genes and improve survival prognosis. arXiv:2509.01001
# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 5, burnin = 0) plotBrier(dat, datMCMC = fit, PTCM = FALSE)# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 5, burnin = 0) plotBrier(dat, datMCMC = fit, PTCM = FALSE)
create nice plots for estimated coefficients and 95
plotCoeff( dat, datMCMC, estimator = "beta", intercept = FALSE, bandwidth = NULL, xlim = NULL, xlab = NULL, label.y = NULL, first.coef = NULL, y.axis.size = 8, ... )plotCoeff( dat, datMCMC, estimator = "beta", intercept = FALSE, bandwidth = NULL, xlim = NULL, xlab = NULL, label.y = NULL, first.coef = NULL, y.axis.size = 8, ... )
dat |
input data as a list containing survival data sub-list
|
datMCMC |
returned object from the main function |
estimator |
print estimators, one of
|
intercept |
logical value to print intercepts |
bandwidth |
a value of bandwidth used for the ridgeplot |
xlim |
numeric vectors of length 2, giving the x-coordinate range. |
xlab |
a title for the x axis |
label.y |
a title for the y axis |
first.coef |
number of the first variables. Default |
y.axis.size |
text size in pts |
... |
others |
A ggplot2::ggplot object. See ?ggplot2::ggplot for more
details of the object.
Zhao Z, Kızılaslan F, Wang S, Zucknick M (2025). Generalized promotion time cure model: A new modeling framework to identify cell-type-specific genes and improve survival prognosis. arXiv:2509.01001
# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0) plotCoeff(dat, datMCMC = fit, estimator = "beta")# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0) plotCoeff(dat, datMCMC = fit, estimator = "beta")
Trace-plots of regression coefficients over MCMC iterations
plotMCMC(dat, datMCMC, estimator = "xi")plotMCMC(dat, datMCMC, estimator = "xi")
dat |
input data as a list containing survival data sub-list
|
datMCMC |
returned object from the main function |
estimator |
print estimators, one of
|
A ggplot2::ggplot object. See ?ggplot2::ggplot for more
details of the object.
Zhao Z, Kızılaslan F, Wang S, Zucknick M (2025). Generalized promotion time cure model: A new modeling framework to identify cell-type-specific genes and improve survival prognosis. arXiv:2509.01001
# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0) plotMCMC(dat, datMCMC = fit, estimator = "xi")# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0) plotMCMC(dat, datMCMC = fit, estimator = "xi")
Compute predicted survival probability for a GPTCM
## S3 method for class 'GPTCM' predict(object, dat, newdata = NULL, type = "survival", times = NULL, ...)## S3 method for class 'GPTCM' predict(object, dat, newdata = NULL, type = "survival", times = NULL, ...)
object |
the results of a |
dat |
the dataset used in |
newdata |
optional new data at which to do predictions. If missing, the prediction will be based on the training data |
type |
the type of predicted value. Currently it is only valid with
|
times |
evaluation time points for survival prediction. Default
|
... |
for future methods |
A matrix object
Zhao Z, Kızılaslan F, Wang S, Zucknick M (2025). Generalized promotion time cure model: A new modeling framework to identify cell-type-specific genes and improve survival prognosis. arXiv:2509.01001
# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0) pred.survival <- predict(fit, dat, newdata = dat, times = c(1, 3, 5))# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) # run a Bayesian GPTCM model: GPTCM-Ber2 fit <- GPTCM(dat, nIter = 10, burnin = 0) pred.survival <- predict(fit, dat, newdata = dat, times = c(1, 3, 5))
Main function implemented in C++ for the MCMC loop
run_mcmc( nIter, burnin, thin, tick, n, nsamp, ninit, convex, npoint, dirichlet, proportion_model, BVS, threads, gamma_prior, gamma_sampler, eta_prior, eta_sampler, initList, rangeList, hyperparList, datEvent, datTime, datX, datX0, datProportionConst )run_mcmc( nIter, burnin, thin, tick, n, nsamp, ninit, convex, npoint, dirichlet, proportion_model, BVS, threads, gamma_prior, gamma_sampler, eta_prior, eta_sampler, initList, rangeList, hyperparList, datEvent, datTime, datX, datX0, datProportionConst )
nIter |
number of MCMC iterations |
burnin |
length of MCMC burn-in period |
thin |
number of thinning |
tick |
an integer used for printing the iteration index |
n |
number of samples to draw |
nsamp |
how many samples to draw for generating each sample; only the last draw will be kept |
ninit |
number of initials as meshgrid values for envelop search |
convex |
adjustment for convexity (non-negative value, default 1.0) |
npoint |
maximum number of envelope points |
dirichlet |
not yet implemented |
proportion_model |
logical value for modeling the proportions data |
BVS |
logical value for implementing Bayesian variable selection |
threads |
maximum threads used for parallelization. Default is 1 |
gamma_prior |
one of |
gamma_sampler |
one of |
eta_prior |
one of |
eta_sampler |
one of |
initList |
a list of initial values for parameters "kappa", "xi", "betas", and "zetas" |
rangeList |
a list of ranges of initial values for parameters "kappa", "xi", "betas", and "zetas" |
hyperparList |
a list of relevant hyperparameters |
datEvent |
a vector of survival status |
datTime |
a vector of survival times |
datX |
an array of cluster-specific covariates |
datX0 |
a matrix of mandatory variables |
datProportionConst |
an array of cluster-specific proportions |
Simulate survival data based on a GPTCM or Cox model
simData( n = 200, p = 10, L = 3, Sigma = NULL, kappas = 2, proportion.model = "dirichlet", model = "GPTCM" )simData( n = 200, p = 10, L = 3, Sigma = NULL, kappas = 2, proportion.model = "dirichlet", model = "GPTCM" )
n |
number of subjects |
p |
number of covariates in each cluster |
L |
number of clusters |
Sigma |
NULL (for a default covariance matrix) or "independent" (i.e. Sigma=diag(p*L)) or a self-defined matrix |
kappas |
value of the Weibull's shape parameter |
proportion.model |
One of |
model |
one of |
An object of a list with 12 components
"survObj" - a list including events and times
"accepted" - a vector with acceptance rates to generate each time-to-event data point by Metropolis-Hastings algorithm.
"proportion.model" - value to indicate the model type for simulating proportions data.
"proportion" - a matrix with simulated proportions data.
"kappas" - value of the Weibull's shape parameter.
"x0" - a matrix with the data of clinical variables
"X" - an array with cluster-specific covariates
"xi" - effects of clinical variables
"beta0" - intercepts related to cluster-specific-survival.
"betas" - effects related to cluster-specific-survival.
"zetas" - effects related to cluster-specific-proportions.
"mrfG" - a graph corresponding to the precision matrix of cluster-specific covariates
"mrfG2" - a graph corresponding to every second edge in "mrfG"
Zhao Z, Kızılaslan F, Wang S, Zucknick M (2025). Generalized promotion time cure model: A new modeling framework to identify cell-type-specific genes and improve survival prognosis. arXiv:2509.01001
# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) str(dat)# simulate data set.seed(123) n <- 200 # subjects p <- 10 # variable selection predictors L <- 3 # cell types dat <- simData(n, p, L) str(dat)
Predefined target density corresponding to the population survival function of GPTCM
target(x, theta, proportion, mu, kappas)target(x, theta, proportion, mu, kappas)
x |
value generated from the proposal distribution |
theta |
cure rate parameter (log scale) |
proportion |
proportions data |
mu |
mean survival time |
kappas |
Weibull's true shape parameter |
value of the targeted (improper) probability density function
time1 <- target(1.2, 0.1, c(0.2, 0.3, 0.5), c(0.2, 0.1, 0.4), 2)time1 <- target(1.2, 0.1, c(0.2, 0.3, 0.5), c(0.2, 0.1, 0.4), 2)