Title: | Bayesian Survival Models for High-Dimensional Data |
---|---|
Description: | An implementation of Bayesian survival models with graph-structured selection priors for sparse identification of omics features predictive of survival (Madjar et al., 2021 <doi:10.1186/s12859-021-04483-z>) and its extension to use a fixed graph via a Markov Random Field (MRF) prior for capturing known structure of omics features, e.g. disease-specific pathways from the Kyoto Encyclopedia of Genes and Genomes database. |
Authors: | Zhi Zhao [aut, cre], Waldir Leoncio [aut], Katrin Madjar [aut], Tobias Østmo Hermansen [aut], Manuela Zucknick [ctb], Jörg Rahnenführer [ctb] |
Maintainer: | Zhi Zhao <[email protected]> |
License: | GPL-3 |
Version: | 0.0.5 |
Built: | 2025-01-02 06:29:43 UTC |
Source: | https://github.com/ocbe-uio/bayessurvive |
This is the main function to fit a Bayesian Cox model with graph-structured selection priors for sparse identification of high-dimensional covariates.
BayesSurvive( survObj, model.type = "Pooled", MRF2b = FALSE, MRF.G = TRUE, g.ini = 0, hyperpar = NULL, initial = NULL, nIter = 1, burnin = 0, thin = 1, output_graph_para = FALSE, verbose = TRUE, cpp = FALSE )
BayesSurvive( survObj, model.type = "Pooled", MRF2b = FALSE, MRF.G = TRUE, g.ini = 0, hyperpar = NULL, initial = NULL, nIter = 1, burnin = 0, thin = 1, output_graph_para = FALSE, verbose = TRUE, cpp = FALSE )
survObj |
a list containing observed data from |
model.type |
a method option from
|
MRF2b |
logical value. |
MRF.G |
logical value. |
g.ini |
initial values for latent edge inclusion indicators in graph, should be a value in [0,1]. 0 or 1: set all random edges to 0 or 1; value in (0,1): rate of indicators randomly set to 1, the remaining indicators are 0 |
hyperpar |
a list containing prior parameter values |
initial |
a list containing prior parameters' initial values |
nIter |
the number of iterations of the chain |
burnin |
number of iterations to discard at the start of the chain. Default is 0 |
thin |
thinning MCMC intermediate results to be stored |
output_graph_para |
allow ( |
verbose |
logical value to display the progess of MCMC |
cpp |
logical, whether to use C++ code for faster computation |
An object of class BayesSurvive
is saved as
obj_BayesSurvive.rda
in the output file, including the following components:
input - a list of all input parameters by the user
output - a list of the all output estimates:
"gamma.p
" - a matrix with MCMC intermediate estimates of the indicator variables of regression coefficients.
"beta.p
" - a matrix with MCMC intermediate estimates of the regression coefficients.
"h.p
" - a matrix with MCMC intermediate estimates of the increments in the cumulative baseline hazard in each interval.
call - the matched call.
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd (spike) for coefficient prior "cb" = 20, # sd (spike) for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # show posterior mean of coefficients and 95% credible intervals library("GGally") plot(fit) + coord_flip() + theme(axis.text.x = element_text(angle = 90, size = 7))
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd (spike) for coefficient prior "cb" = 20, # sd (spike) for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # show posterior mean of coefficients and 95% credible intervals library("GGally") plot(fit) + coord_flip() + theme(axis.text.x = element_text(angle = 90, size = 7))
Estimate regression coefficients with posterior mean/median, credible intervals, standard deviation, or MPM estimates, posterior gammas
## S3 method for class 'BayesSurvive' coef( object, MPM = FALSE, type = "mean", CI = 95, SD = FALSE, subgroup = 1, ... )
## S3 method for class 'BayesSurvive' coef( object, MPM = FALSE, type = "mean", CI = 95, SD = FALSE, subgroup = 1, ... )
object |
an object of class |
MPM |
logical value to obtain MPM coefficients. Default: FALSE |
type |
type of point estimates of regression coefficients. One of
|
CI |
size (level, as a percentage) of the credible interval to report. Default: 95, i.e. a 95% credible interval |
SD |
logical value to show each coefficient's standard deviation over MCMC iterations |
subgroup |
index of the subgroup for visualizing posterior coefficients |
... |
other arguments |
dataframe object
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # show posterior coefficients betas <- coef(fit) head(betas)
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # show posterior coefficients betas <- coef(fit) head(betas)
This an internal function for MCMC sampling
func_MCMC( survObj, hyperpar, initial, nIter, thin, burnin, S, method, MRF_2b, MRF_G, output_graph_para, verbose, cpp = FALSE )
func_MCMC( survObj, hyperpar, initial, nIter, thin, burnin, S, method, MRF_2b, MRF_G, output_graph_para, verbose, cpp = FALSE )
survObj |
a list containing observed data from |
hyperpar |
a list containing prior parameter values |
initial |
a list containing prior parameters' initial values |
nIter |
the number of iterations of the chain |
thin |
thinning MCMC intermediate results to be stored |
burnin |
number of iterations to discard at the start of the chain. Default is 0 |
S |
the number of subgroups |
method |
a method option from
|
MRF_2b |
two different b in MRF prior for subgraphs G_ss and G_rs |
MRF_G |
logical value. |
output_graph_para |
allow ( |
verbose |
logical value to display the progess of MCMC |
cpp |
logical, whether to use C++ code for faster computation |
A list object saving the MCMC results with components including 'gamma.p', 'beta.p', 'h.p', 'gamma.margin', 'beta.margin', 's', 'eta0', 'kappa0', 'c0', 'pi.ga', 'tau', 'cb', 'accept.RW', 'log.jpost', 'log.like', 'post.gamma'
This an internal function for MCMC sampling
func_MCMC_graph(sobj, hyperpar, ini, S, method, MRF_2b, cpp = FALSE)
func_MCMC_graph(sobj, hyperpar, ini, S, method, MRF_2b, cpp = FALSE)
sobj |
a list containing observed data from |
hyperpar |
a list containing prior parameter values |
ini |
a list containing prior parameters' ini values |
S |
the number of subgroups |
method |
a method option from
|
MRF_2b |
two different b in MRF prior for subgraphs G_ss and G_rs |
cpp |
logical, whether to use C++ code for faster computation |
A list object with components "Sig" the updated covariance matrices, "G.ini" the updated graph, "V.ini" the updated variances for precision matrices in all subgroups, "C.ini" the updated precision matrices omega for each subgroup
Plot point estimates of regression coefficients and 95% credible intervals
## S3 method for class 'BayesSurvive' plot(x, type = "mean", interval = TRUE, subgroup = 1, ...)
## S3 method for class 'BayesSurvive' plot(x, type = "mean", interval = TRUE, subgroup = 1, ...)
x |
an object of class |
type |
type of point estimates of regression coefficients. One of
|
interval |
logical argument to show 95% credible intervals. Default
is |
subgroup |
index of the subgroup for visualizing posterior coefficients |
... |
additional arguments sent to |
ggplot object
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # show posterior mean of coefficients and 95% credible intervals library("GGally") plot(fit) + coord_flip() + theme(axis.text.x = element_text(angle = 90, size = 7))
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # show posterior mean of coefficients and 95% credible intervals library("GGally") plot(fit) + coord_flip() + theme(axis.text.x = element_text(angle = 90, size = 7))
Predict time-dependent Brier scores based on Cox regression models
plotBrier( object, survObj.new = NULL, method = "mean", times = NULL, subgroup = 1 )
plotBrier( object, survObj.new = NULL, method = "mean", times = NULL, subgroup = 1 )
object |
fitted object obtained with |
survObj.new |
a list containing observed data from new subjects with
components |
method |
option to use the posterior mean ( |
times |
maximum time point to evaluate the prediction |
subgroup |
index of the subgroup in |
A ggplot2::ggplot
object. See ?ggplot2::ggplot
for more
details of the object.
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # predict survival probabilities of the train data plotBrier(fit, survObj.new = dataset)
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # predict survival probabilities of the train data plotBrier(fit, survObj.new = dataset)
Predict survival probability, (cumulative) hazard or (integrated) Brier scores based on Cox regression models
## S3 method for class 'BayesSurvive' predict( object, survObj.new, type = "brier", method = "mean", times = NULL, subgroup = 1, verbose = TRUE, ... )
## S3 method for class 'BayesSurvive' predict( object, survObj.new, type = "brier", method = "mean", times = NULL, subgroup = 1, verbose = TRUE, ... )
object |
fitted object obtained with |
survObj.new |
a list containing observed data from new subjects with
components |
type |
option to chose for predicting brier scores with |
method |
option to use the posterior mean ( |
times |
time points at which to evaluate the risks. If |
subgroup |
index of the subgroup in |
verbose |
logical value to print IBS of the NULL model and the Bayesian Cox model |
... |
not used |
A list object with 5 components if type="brier"
including
"model", "times", "Brier", "IBS" and "IPA" (Index of Prediction Accuracy),
otherwise a list of 7 components with the first component as
the specified argument type
and "se", "band", "type", "diag",
"baseline" and "times", see function riskRegression::predictCox
for
details
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # predict survival probabilities of the train data predict(fit, survObj.new = dataset)
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100 ) # predict survival probabilities of the train data predict(fit, survObj.new = dataset)
Simulated data set for a quick test. The data set is a list with six
components: covariates "X"
, survival times "time"
,
event status "status"
. The R code for generating the simulated data
is given in the Examples.
simData
simData
An object of class list
of length 3.
This contains subfunctions to update parameters gammas, betas, baseline hazard and graph learning parameters
UpdateGamma(sobj, hyperpar, ini, S, method, MRF_G, MRF_2b)
UpdateGamma(sobj, hyperpar, ini, S, method, MRF_G, MRF_2b)
sobj |
a list containing observed data |
hyperpar |
a list containing prior parameter values |
ini |
a list containing prior parameters' initial values |
S |
the number of subgroups |
method |
a method option from
|
MRF_G |
logical value. |
MRF_2b |
two different b in MRF prior for subgraphs G_ss and G_rs |
A list object with two components for the latent variable selection indicators gamma with either independent Bernoulli prior
This an internal function to update coefficients of the Bayesian Cox Lasso Model
UpdateRPlee11(sobj, hyperpar, ini, S, method, MRF_G)
UpdateRPlee11(sobj, hyperpar, ini, S, method, MRF_G)
sobj |
a list containing observed data |
hyperpar |
a list containing prior parameter values |
ini |
a list containing prior parameters' initial values |
S |
the number of subgroups |
method |
a method option from
|
MRF_G |
logical value. |
A list object with component 'beta.ini' for the updated coefficients and component 'acceptlee' for the MCMC acceptance rate
Perform variable selection using the 95 neighborhood criterion (SNC), median probability model (MPM) or Bayesian false discovery rate (FDR). Note that the Bayesian FDR only applies for each subgroup if there are subgroups.
VS(x, method = "FDR", threshold = NA, subgroup = 1)
VS(x, method = "FDR", threshold = NA, subgroup = 1)
x |
fitted object obtained with |
method |
variable selection method to choose from
|
threshold |
SNC threshold value (default 0.5) or the Bayesian expected false discovery rate threshold (default 0.05) |
subgroup |
index(es) of subgroup(s) for visualizing variable selection |
A boolean vector of selected (= TRUE) and rejected (= FALSE) variables for one group or a list for multiple groups
Lee KH, Chakraborty S, Sun J (2015). Survival prediction and variable selection with simultaneous shrinkage and grouping priors. Statistical Analysis and Data Mining, 8:114-127
Newton MA, Noueiry A, Sarkar D, Ahlquist P (2004). Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics, 5(2), 155-76
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100, burnin = 50 ) # show variable selection VS(fit, method = "FDR")
library("BayesSurvive") set.seed(123) # Load the example dataset data("simData", package = "BayesSurvive") dataset <- list( "X" = simData[[1]]$X, "t" = simData[[1]]$time, "di" = simData[[1]]$status ) # Initial value: null model without covariates initial <- list("gamma.ini" = rep(0, ncol(dataset$X))) # Hyperparameters hyperparPooled <- list( "c0" = 2, # prior of baseline hazard "tau" = 0.0375, # sd for coefficient prior "cb" = 20, # sd for coefficient prior "pi.ga" = 0.02, # prior variable selection probability for standard Cox models "a" = -4, # hyperparameter in MRF prior "b" = 0.1, # hyperparameter in MRF prior "G" = simData$G # hyperparameter in MRF prior ) # run Bayesian Cox with graph-structured priors fit <- BayesSurvive( survObj = dataset, hyperpar = hyperparPooled, initial = initial, nIter = 100, burnin = 50 ) # show variable selection VS(fit, method = "FDR")