Department of Statistical Science, UCL
Resources
Slides and code here: github.com/n8thangreen/stan-code-generation-talk
if/else for the Distribution
sample_if_else <- function(distribution_name, n, ...) {
  if (distribution_name == "norm") {
    return(rnorm(n, ...))
  } else if (distribution_name == "pois") {
    return(rpois(n, ...))
  } else if (distribution_name == "unif") {
    return(runif(n, ...))
  } else if (distribution_name == "exp") {
    return(rexp(n, ...))
  } else {
    stop("Unsupported distribution.")
  }
}
sample_if_else("norm", 5, mean = 0, sd = 1)
sample_if_else("pois", 5, lambda = 5)
Provide a string for the distribution name
Could use a switch statement instead
Duplicates code across branches
Pass kernel / root of distribution name
Create the appropriate function name
Then execute with additional arguments
generate_code <- function(dist_base_name, n, ...) {
  func_name_string <- paste0("r", dist_base_name)
  dots <- match.call(expand.dots = FALSE)$`...`
  arg_strings <- c()
  arg_strings <- c(arg_strings, paste0("n = ", n))
  # Add other arguments from '...' as "name = value" strings
  if (!is.null(dots)) {
    arg_strings <- c(arg_strings,
                     paste0(names(dots), " = ", sapply(dots, deparse)))
  }
  all_args_string <- paste(arg_strings, collapse = ", ")
  r_code_string <-
    paste0(func_name_string, "(", all_args_string, ")")
  return(r_code_string)
}
norm_code <-
generate_code("norm", 10, mean = 5, sd = 2.5)
norm_samples <- eval(parse(text = norm_code))
Separate the creation of the sampling code from its execution
The two-step approach allows the generated code to be reused
Extend to a combination of generated code and passed arguments
A functional paradigm may be better (see the sketch below)
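A minimal sketch of the functional alternative (not from the talk): look the sampling function up by name with match.fun() and call it directly, with no string building or eval(parse()).
sample_fn <- function(dist_base_name) {
  rng <- match.fun(paste0("r", dist_base_name))  # e.g. "norm" -> rnorm
  function(n, ...) rng(n, ...)
}
norm_sampler <- sample_fn("norm")
norm_sampler(5, mean = 0, sd = 1)
# or as a one-off call
do.call(paste0("r", "pois"), list(n = 5, lambda = 5))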
Code Generation vs. Configuration/Parameterization
Compile-time vs. Run-time Flexibility
Code generation leans towards compile-time flexibility. The variations are created before the program runs (at “compile time” in a broader sense, even for interpreted languages like R where it’s more about script generation). This can lead to highly optimized, lean code for each specific use case.
Flexible inputs/flags lean towards run-time flexibility. The same code handles different scenarios by evaluating parameters or flags at the time the program is executed. This offers greater adaptability without needing to re-generate or re-deploy.
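To make the contrast concrete, a small illustrative sketch (not from the slides): the compile-time flavour generates and evaluates a specialised sampler once, while the run-time flavour keeps one general function that checks a flag on every call.
# "compile-time" flavour: generate a specialised function definition as text, then evaluate it
spec_code <- "sampler_norm <- function(n) rnorm(n, mean = 0, sd = 1)"
eval(parse(text = spec_code))
sampler_norm(3)
# "run-time" flavour: one general function that branches on an argument
sampler_general <- function(n, dist = c("norm", "pois")) {
  dist <- match.arg(dist)
  if (dist == "norm") rnorm(n) else rpois(n, lambda = 1)
}
sampler_general(3, dist = "pois")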
Generality vs. Specialization
Code generation allows for highly specialized code, tailored precisely to a given set of requirements. This can lead to better performance and simpler code for that specific instance.
Flexible inputs favour generality, creating a single solution that can handle a broader range of cases. This can lead to more complex code internally (e.g., more if/else statements, more parameters) but fewer distinct code artifacts.
DRY (Don’t Repeat Yourself) vs. YAGNI (You Ain’t Gonna Need It)
Sometimes code generation is used to adhere to DRY principles by generating repetitive boilerplate. However, if the variations are minor, generating separate code for each can introduce repetition at a higher level (e.g., repeating the “model structure” with slight changes).
The “flexible inputs” approach often aligns with YAGNI, as you’re not generating extra code until it’s explicitly needed via configuration.
Advantages of code generation
Performance: Generated code can be highly optimized for a specific task, as it doesn’t need to carry the overhead of conditional logic for other scenarios.
Simplicity (per generated file): Each generated code file can be simpler and more readable because it only contains the logic relevant to its specific configuration.
Reduced runtime overhead: No need to parse flags or configuration settings at runtime.
Type safety/early error detection: If the generated code is truly distinct, type errors or other issues might be caught earlier in the development cycle.
Disadvantages of code generation
Increased Maintenance Overhead: You now have multiple code files to manage. If the underlying logic changes, you might need to regenerate all of them, which can be prone to errors if the generation process isn’t robust.
Debugging Challenges: Debugging generated code can be harder, especially if the generation logic itself is complex. The actual code you’re debugging might not be what you wrote directly.
Version Control Complexity: Managing many generated files in version control can be cumbersome.
Potential for “Code Bloat”: If the variations are numerous, you could end up with a large number of very similar code files.
Advantages of flexible inputs/parameterization
Easier Maintenance: You maintain one core piece of code. Changes to the underlying logic only need to be applied in one place.
Greater Flexibility: It’s easier to introduce new variations or modify existing ones by simply changing configuration or input parameters, without needing to touch the core code or regenerate anything.
Reduced Code Duplication: Promotes the DRY principle by centralizing common logic.
Simpler Deployment: Only one version of the code needs to be deployed.
Disadvantages of flexible inputs/parameterization
Increased Code Complexity: The single code base might become more complex due to the need for conditional logic (e.g., if/else, switch statements) to handle different configurations.
Potential Performance Overhead: Runtime checks for flags or parameters can introduce minor performance overhead.
Readability Challenges: A highly flexible, parameterized code base can sometimes be harder to read and understand due to the many branches and conditions.
Risk of “God Function”: A single function or object trying to do too much based on various flags can become an unmanageable “God” entity.
From Green et al. (2025)
Reference
Green, N., Kurt, M., Moshyk, A., Larkin, J. and Baio, G. (2025), A Bayesian Hierarchical Mixture Cure Modelling Framework to Utilize Multiple Survival Datasets for Long-Term Survivorship Estimates: A Case Study From Previously Untreated Metastatic Melanoma. Statistics in Medicine, 44: e70132. https://doi.org/10.1002/sim.70132
functions {
#include /include/distributions.stan
}
data {
int<lower=1> nTx;
int<lower=1, upper=3> cf_model; // cure fraction
int<lower=0> N_1;
array[nTx] int<lower=0> n_1;
int<lower=0> H_1;
vector<lower=0>[N_1] t_1;
vector<lower=0, upper=1>[N_1] d_1;
matrix[N_1, H_1] X_1;
vector[H_1] mu_S_1;
vector<lower=0>[H_1] sigma_S_1;
int<lower=0> N_2;
array[nTx] int<lower=0> n_2;
int<lower=0> H_2;
vector<lower=0>[N_2] t_2;
vector<lower=0, upper=1>[N_2] d_2;
matrix[N_2, H_2] X_2;
vector[H_2] mu_S_2;
vector<lower=0>[H_2] sigma_S_2;
int<lower=0> N_3;
array[nTx] int<lower=0> n_3;
int<lower=0> H_3;
vector<lower=0>[N_3] t_3;
vector<lower=0, upper=1>[N_3] d_3;
matrix[N_3, H_3] X_3;
vector[H_3] mu_S_3;
vector<lower=0>[H_3] sigma_S_3;
int<lower=0> N_4;
array[nTx] int<lower=0> n_4;
int<lower=0> H_4;
vector<lower=0>[N_4] t_4;
vector<lower=0, upper=1>[N_4] d_4;
matrix[N_4, H_4] X_4;
vector[H_4] mu_S_4;
vector<lower=0>[H_4] sigma_S_4;
int<lower=1, upper=2> bg_model;
vector[bg_model == 1 ? H_1 : 0] mu_bg;
vector<lower=0>[bg_model == 1 ? H_1 : 0] sigma_bg;
vector<lower=0>[bg_model == 2 ? N_1 : 0] h_bg_1;
vector<lower=0>[bg_model == 2 ? N_2 : 0] h_bg_2;
vector<lower=0>[bg_model == 2 ? N_3 : 0] h_bg_3;
vector<lower=0>[bg_model == 2 ? N_4 : 0] h_bg_4;
matrix[nTx, nTx] Tx_dmat; // treatment design matrix
vector[cf_model == 3 ? nTx : 0] mu_alpha; // treatment regression coefficients
vector<lower=0>[cf_model == 3 ? nTx : 0] sigma_alpha;
int<lower=0> t_max;
vector[cf_model == 2 ? nTx : 0] mu_alpha_1;
vector<lower=0>[cf_model == 2 ? nTx : 0] sigma_alpha_1;
vector[cf_model == 2 ? nTx : 0] mu_alpha_2;
vector<lower=0>[cf_model == 2 ? nTx : 0] sigma_alpha_2;
vector[cf_model == 2 ? nTx : 0] mu_alpha_3;
vector<lower=0>[cf_model == 2 ? nTx : 0] sigma_alpha_3;
vector[cf_model == 2 ? nTx : 0] mu_alpha_4;
vector<lower=0>[cf_model == 2 ? nTx : 0] sigma_alpha_4;
array[cf_model == 1 ? 1 : 0] real<lower=0> a_cf;
array[cf_model == 1 ? 1 : 0] real<lower=0> b_cf;
vector[cf_model == 3 ? nTx : 0] mu_sd_cf;
vector<lower=0>[cf_model == 3 ? nTx : 0] sigma_sd_cf;
}
parameters {
// coefficients in linear predictor (including intercept)
vector[bg_model == 1 ? H_1 : 0] beta_bg;
vector[cf_model != 2 ? nTx : 0] alpha;
vector[cf_model == 2 ? nTx : 0] alpha_1;
vector[H_1] beta_1;
vector[cf_model == 2 ? nTx : 0] alpha_2;
vector[H_2] beta_2;
vector[cf_model == 2 ? nTx : 0] alpha_3;
vector[H_3] beta_3;
vector[cf_model == 2 ? nTx : 0] alpha_4;
vector[H_4] beta_4;
vector<lower=0, upper=1>[cf_model == 1 ? nTx : 0] cf_pooled;
vector[cf_model == 3 ? nTx : 0] lp_cf_1;
vector[cf_model == 3 ? nTx : 0] lp_cf_2;
vector[cf_model == 3 ? nTx : 0] lp_cf_3;
vector[cf_model == 3 ? nTx : 0] lp_cf_4;
vector<lower=0>[cf_model == 3 ? nTx : 0] sd_cf;
}
transformed parameters {
vector[N_1] lp_1_bg;
vector<lower=0>[N_1] lambda_1_bg;
vector[N_2] lp_2_bg;
vector<lower=0>[N_2] lambda_2_bg;
vector[N_3] lp_3_bg;
vector<lower=0>[N_3] lambda_3_bg;
vector[N_4] lp_4_bg;
vector<lower=0>[N_4] lambda_4_bg;
vector[N_1] lp_1;
// rate parameters
vector<lower=0>[N_1] lambda_1;
vector[N_2] lp_2;
// rate parameters
vector<lower=0>[N_2] lambda_2;
vector[N_3] lp_3;
// rate parameters
vector<lower=0>[N_3] lambda_3;
vector[N_4] lp_4;
// rate parameters
vector<lower=0>[N_4] lambda_4;
vector<lower=0, upper=1>[cf_model == 3 ? nTx : 0] cf_global;
vector<lower=0, upper=1>[nTx] cf_1;
vector[cf_model == 2 ? nTx : 0] tx_cf_1;
vector<lower=0, upper=1>[nTx] cf_2;
vector[cf_model == 2 ? nTx : 0] tx_cf_2;
vector<lower=0, upper=1>[nTx] cf_3;
vector[cf_model == 2 ? nTx : 0] tx_cf_3;
vector<lower=0, upper=1>[nTx] cf_4;
vector[cf_model == 2 ? nTx : 0] tx_cf_4;
vector[cf_model == 3 ? nTx : 0] lp_cf_global;
// correlated event times
lp_1 = X_1*beta_1;
// background survival with uncertainty
if (bg_model == 1) {
lp_1_bg = X_1*beta_bg;
} else {
lp_1_bg = log(h_bg_1);
}
lambda_1_bg = exp(lp_1_bg);
// correlated event times
lp_2 = X_2*beta_2;
// background survival with uncertainty
if (bg_model == 1) {
lp_2_bg = X_2*beta_bg;
} else {
lp_2_bg = log(h_bg_2);
}
lambda_2_bg = exp(lp_2_bg);
// correlated event times
lp_3 = X_3*beta_3;
// background survival with uncertainty
if (bg_model == 1) {
lp_3_bg = X_3*beta_bg;
} else {
lp_3_bg = log(h_bg_3);
}
lambda_3_bg = exp(lp_3_bg);
// correlated event times
lp_4 = X_4*beta_4;
// background survival with uncertainty
if (bg_model == 1) {
lp_4_bg = X_4*beta_bg;
} else {
lp_4_bg = log(h_bg_4);
}
lambda_4_bg = exp(lp_4_bg);
lambda_1 = exp(lp_1);
lambda_2 = exp(lp_2);
lambda_3 = exp(lp_3);
lambda_4 = exp(lp_4);
if (cf_model == 1) {
cf_1 = cf_pooled;
cf_2 = cf_pooled;
cf_3 = cf_pooled;
cf_4 = cf_pooled;
}
if (cf_model == 3) {
lp_cf_global = Tx_dmat*alpha;
cf_global = inv_logit(lp_cf_global);
cf_1 = inv_logit(lp_cf_1);
cf_2 = inv_logit(lp_cf_2);
cf_3 = inv_logit(lp_cf_3);
cf_4 = inv_logit(lp_cf_4);
}
if (cf_model == 2) {
tx_cf_1 = Tx_dmat*alpha_1;
cf_1 = inv_logit(tx_cf_1);
tx_cf_2 = Tx_dmat*alpha_2;
cf_2 = inv_logit(tx_cf_2);
tx_cf_3 = Tx_dmat*alpha_3;
cf_3 = inv_logit(tx_cf_3);
tx_cf_4 = Tx_dmat*alpha_4;
cf_4 = inv_logit(tx_cf_4);
}
}
model {
int idx_1;
int idx_2;
int idx_3;
int idx_4;
beta_1 ~ normal(mu_S_1, sigma_S_1);
beta_2 ~ normal(mu_S_2, sigma_S_2);
beta_3 ~ normal(mu_S_3, sigma_S_3);
beta_4 ~ normal(mu_S_4, sigma_S_4);
if (bg_model == 1) {
beta_bg ~ normal(mu_bg, sigma_bg);
}
// cure fraction
if (cf_model == 3) {
alpha ~ normal(mu_alpha, sigma_alpha);
sd_cf ~ cauchy(mu_sd_cf, sigma_sd_cf); // truncated
lp_cf_1 ~ normal(lp_cf_global, sd_cf);
lp_cf_2 ~ normal(lp_cf_global, sd_cf);
lp_cf_3 ~ normal(lp_cf_global, sd_cf);
lp_cf_4 ~ normal(lp_cf_global, sd_cf);
} else if (cf_model == 2) {
alpha_1 ~ normal(mu_alpha_1, sigma_alpha_1);
alpha_2 ~ normal(mu_alpha_2, sigma_alpha_2);
alpha_3 ~ normal(mu_alpha_3, sigma_alpha_3);
alpha_4 ~ normal(mu_alpha_4, sigma_alpha_4);
} else {
cf_pooled ~ beta(a_cf, b_cf);
}
idx_1 = 1;
// likelihood
for (Tx in 1:nTx) {
for (i in idx_1:(idx_1 + n_1[Tx] - 1)) {
target += log_sum_exp(
log(cf_1[Tx]) +
exp_log_S(t_1[i], lambda_1_bg[i]),
log1m(cf_1[Tx]) +
exp_log_S(t_1[i], lambda_1_bg[i]) + exp_log_S(t_1[i], lambda_1[i]));
target += d_1[i] * log_sum_exp(
log(lambda_1_bg[i]),
log1m(cf_1[Tx]) +
exp_lpdf(t_1[i] | lambda_1[i]) - log(cf_1[Tx] + (1 - cf_1[Tx])*exp_Surv(t_1[i], lambda_1[i])));
}
idx_1 = idx_1 + n_1[Tx];
}
idx_2 = 1;
// likelihood
for (Tx in 1:nTx) {
for (i in idx_2:(idx_2 + n_2[Tx] - 1)) {
target += log_sum_exp(
log(cf_2[Tx]) +
exp_log_S(t_2[i], lambda_2_bg[i]),
log1m(cf_2[Tx]) +
exp_log_S(t_2[i], lambda_2_bg[i]) + exp_log_S(t_2[i], lambda_2[i]));
target += d_2[i] * log_sum_exp(
log(lambda_2_bg[i]),
log1m(cf_2[Tx]) +
exp_lpdf(t_2[i] | lambda_2[i]) - log(cf_2[Tx] + (1 - cf_2[Tx])*exp_Surv(t_2[i], lambda_2[i])));
}
idx_2 = idx_2 + n_2[Tx];
}
idx_3 = 1;
// likelihood
for (Tx in 1:nTx) {
for (i in idx_3:(idx_3 + n_3[Tx] - 1)) {
target += log_sum_exp(
log(cf_3[Tx]) +
exp_log_S(t_3[i], lambda_3_bg[i]),
log1m(cf_3[Tx]) +
exp_log_S(t_3[i], lambda_3_bg[i]) + exp_log_S(t_3[i], lambda_3[i]));
target += d_3[i] * log_sum_exp(
log(lambda_3_bg[i]),
log1m(cf_3[Tx]) +
exp_lpdf(t_3[i] | lambda_3[i]) - log(cf_3[Tx] + (1 - cf_3[Tx])*exp_Surv(t_3[i], lambda_3[i])));
}
idx_3 = idx_3 + n_3[Tx];
}
idx_4 = 1;
// likelihood
for (Tx in 1:nTx) {
for (i in idx_4:(idx_4 + n_4[Tx] - 1)) {
target += log_sum_exp(
log(cf_4[Tx]) +
exp_log_S(t_4[i], lambda_4_bg[i]),
log1m(cf_4[Tx]) +
exp_log_S(t_4[i], lambda_4_bg[i]) + exp_log_S(t_4[i], lambda_4[i]));
target += d_4[i] * log_sum_exp(
log(lambda_4_bg[i]),
log1m(cf_4[Tx]) +
exp_lpdf(t_4[i] | lambda_4[i]) - log(cf_4[Tx] + (1 - cf_4[Tx])*exp_Surv(t_4[i], lambda_4[i])));
}
idx_4 = idx_4 + n_4[Tx];
}
}
generated quantities {
real mean_bg;
// real pbeta_bg;
real log_lik = 0;
vector[t_max] S_bg;
vector[t_max] S_1;
vector[nTx] rmst_1;
vector[nTx] median_1;
matrix[t_max, nTx] S_1_pred;
real mean_1;
int idx_1;
real log_lik_1;
// real pbeta_1 = normal_rng(mu_S_1[1], sigma_S_1[1]);
vector[t_max] S_2;
vector[nTx] rmst_2;
vector[nTx] median_2;
matrix[t_max, nTx] S_2_pred;
real mean_2;
int idx_2;
real log_lik_2;
// real pbeta_2 = normal_rng(mu_S_2[1], sigma_S_2[1]);
vector[t_max] S_3;
vector[nTx] rmst_3;
vector[nTx] median_3;
matrix[t_max, nTx] S_3_pred;
real mean_3;
int idx_3;
real log_lik_3;
// real pbeta_3 = normal_rng(mu_S_3[1], sigma_S_3[1]);
vector[t_max] S_4;
vector[nTx] rmst_4;
vector[nTx] median_4;
matrix[t_max, nTx] S_4_pred;
real mean_4;
int idx_4;
real log_lik_4;
// real pbeta_4 = normal_rng(mu_S_4[1], sigma_S_4[1]);
mean_1 = exp(beta_1[1]);
mean_2 = exp(beta_2[1]);
mean_3 = exp(beta_3[1]);
mean_4 = exp(beta_4[1]);
// background rate
if (bg_model == 1) {
mean_bg = exp(beta_bg[1]);
} else {
// mean_bg = 0.001;
mean_bg = mean(h_bg_1);
mean_bg = mean(h_bg_2);
mean_bg = mean(h_bg_3);
mean_bg = mean(h_bg_4);
}
// posterior mean checks
for (j in 1:nTx) {
for (i in 1:t_max) {
S_bg[i] = exp_Surv(i, mean_bg);
S_1[i] = exp_exp_Surv(i, mean_1, mean_bg);
S_1_pred[i, j] = cf_1[j]*S_bg[i] + (1 - cf_1[j])*S_1[i];
}
}
// posterior mean checks
for (j in 1:nTx) {
for (i in 1:t_max) {
S_bg[i] = exp_Surv(i, mean_bg);
S_2[i] = exp_exp_Surv(i, mean_2, mean_bg);
S_2_pred[i, j] = cf_2[j]*S_bg[i] + (1 - cf_2[j])*S_2[i];
}
}
// posterior mean checks
for (j in 1:nTx) {
for (i in 1:t_max) {
S_bg[i] = exp_Surv(i, mean_bg);
S_3[i] = exp_exp_Surv(i, mean_3, mean_bg);
S_3_pred[i, j] = cf_3[j]*S_bg[i] + (1 - cf_3[j])*S_3[i];
}
}
// posterior mean checks
for (j in 1:nTx) {
for (i in 1:t_max) {
S_bg[i] = exp_Surv(i, mean_bg);
S_4[i] = exp_exp_Surv(i, mean_4, mean_bg);
S_4_pred[i, j] = cf_4[j]*S_bg[i] + (1 - cf_4[j])*S_4[i];
}
}
// restricted mean survival time
for (i in 1:nTx) {
rmst_1[i] = cf_1[i]*t_max + (1 - cf_1[i])*rmst_exp(mean_1, t_max);
median_1[i] = median_surv_cf_exp(mean_1, cf_1[i]);
//median_1[i] = median_surv_exp(mean_1); // uncured
}
// restricted mean survival time
for (i in 1:nTx) {
rmst_2[i] = cf_2[i]*t_max + (1 - cf_2[i])*rmst_exp(mean_2, t_max);
median_2[i] = median_surv_cf_exp(mean_2, cf_2[i]);
//median_2[i] = median_surv_exp(mean_2); // uncured
}
// restricted mean survival time
for (i in 1:nTx) {
rmst_3[i] = cf_3[i]*t_max + (1 - cf_3[i])*rmst_exp(mean_3, t_max);
median_3[i] = median_surv_cf_exp(mean_3, cf_3[i]);
//median_3[i] = median_surv_exp(mean_3); // uncured
}
// restricted mean survival time
for (i in 1:nTx) {
rmst_4[i] = cf_4[i]*t_max + (1 - cf_4[i])*rmst_exp(mean_4, t_max);
median_4[i] = median_surv_cf_exp(mean_4, cf_4[i]);
//median_4[i] = median_surv_exp(mean_4); // uncured
}
// likelihood
idx_1 = 1;
for (Tx in 1:nTx) {
for (i in idx_1:(idx_1 + n_1[Tx] - 1)) {
log_lik_1 += log_sum_exp(
log(cf_1[Tx]) +
surv_exp_lpdf(t_1[i] | d_1[i], lambda_1_bg[i]),
log1m(cf_1[Tx]) +
joint_exp_exp_lpdf(t_1[i] | d_1[i], lambda_1[i], lambda_1_bg[i]));
}
idx_1 = idx_1 + n_1[Tx];
}
log_lik = log_lik + log_lik_1;
// likelihood
idx_2 = 1;
for (Tx in 1:nTx) {
for (i in idx_2:(idx_2 + n_2[Tx] - 1)) {
log_lik_2 += log_sum_exp(
log(cf_2[Tx]) +
surv_exp_lpdf(t_2[i] | d_2[i], lambda_2_bg[i]),
log1m(cf_2[Tx]) +
joint_exp_exp_lpdf(t_2[i] | d_2[i], lambda_2[i], lambda_2_bg[i]));
}
idx_2 = idx_2 + n_2[Tx];
}
log_lik = log_lik + log_lik_2;
// likelihood
idx_3 = 1;
for (Tx in 1:nTx) {
for (i in idx_3:(idx_3 + n_3[Tx] - 1)) {
log_lik_3 += log_sum_exp(
log(cf_3[Tx]) +
surv_exp_lpdf(t_3[i] | d_3[i], lambda_3_bg[i]),
log1m(cf_3[Tx]) +
joint_exp_exp_lpdf(t_3[i] | d_3[i], lambda_3[i], lambda_3_bg[i]));
}
idx_3 = idx_3 + n_3[Tx];
}
log_lik = log_lik + log_lik_3;
// likelihood
idx_4 = 1;
for (Tx in 1:nTx) {
for (i in idx_4:(idx_4 + n_4[Tx] - 1)) {
log_lik_4 += log_sum_exp(
log(cf_4[Tx]) +
surv_exp_lpdf(t_4[i] | d_4[i], lambda_4_bg[i]),
log1m(cf_4[Tx]) +
joint_exp_exp_lpdf(t_4[i] | d_4[i], lambda_4[i], lambda_4_bg[i]));
}
idx_4 = idx_4 + n_4[Tx];
}
log_lik = log_lik + log_lik_4;
}
create_stancode <- function(models) {
n_grps <- length(models)
# generate separate blocks of Stan code
stancode <- create_code_skeleton(n_grps)
cf_code <- create_cf_code(n_grps)
latent_model_code <- list()
priorpred_code <- postpred_code <- list()
estimates_code <- list()
loo_code <- loglik_code <- list()
for (i in seq_along(models)) {
latent_model_code[[i]] <- make_latent_model_code(models[i], id = i)
priorpred_code[[i]] <- make_priorpred(models[i], id = i)
postpred_code[[i]] <- make_postpred(models[i], id = i)
estimates_code[[i]] <- make_summary_estimates(models[i], id = i)
loglik_code[[i]] <- make_pop_loglik(models[i], id = i)
# loglik_code[[i]] <- make_loglik(models[i], id = i)
loo_code[[i]] <- make_loo(models[i], id = i)
}
# interleave all model lists
latent_model_code <- rearrange_blocks(latent_model_code)
priorpred_code <- rearrange_blocks(priorpred_code)
postpred_code <- rearrange_blocks(postpred_code)
estimates_code <- rearrange_blocks(estimates_code)
loglik_code <- rearrange_blocks(loglik_code)
loo_code <- rearrange_blocks(loo_code)
scode <- list()
scode$functions <-
c("functions {\n#include /include/distributions.stan\n}\n\n")
# generate data block
scode$data <- paste0(
"data {\n",
stancode$data_def,
cf_code$data_def,
latent_model_code$data_def,
stancode$data_main,
cf_code$data_main,
"\n}\n\n"
)
# generate parameters block
scode$parameters <- paste0(
"parameters {\n",
latent_model_code$parameters,
stancode$parameters,
cf_code$parameters,
"\n}\n\n"
)
# generate transformed parameters block
scode$trans_params <- paste0(
"transformed parameters {\n",
stancode$trans_params_def,
latent_model_code$trans_params_def,
cf_code$trans_params_def,
stancode$trans_params_main,
latent_model_code$trans_params_main,
cf_code$trans_params_main,
"\n}\n\n"
)
# combine likelihood with prior part
scode$model <- paste0(
"model {\n",
stancode$model,
latent_model_code$model,
cf_code$model,
loglik_code,
"\n}\n\n"
)
# generate generated quantities block
scode$generated_quantities <- paste0(
"generated quantities {\n",
stancode$generated_quantities_def,
latent_model_code$generated_quantities_def,
latent_model_code$generated_quantities_main,
stancode$generated_quantities_main,
cf_code$generated_quantities_main,
do.call(paste, postpred_code),
do.call(paste, estimates_code),
do.call(paste, priorpred_code),
do.call(paste, loo_code),
"}\n"
)
# combine all elements into a complete Stan model
paste0(
scode$functions,
scode$data,
scode$parameters,
scode$trans_params,
scode$model,
scode$generated_quantities
)
}
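A hypothetical usage sketch (not from the talk; the file name, include path, and data list are assumed): generate the Stan code for four exponential submodels, write it to file, then compile it with cmdstanr.
models <- c("exp", "exp", "exp", "exp")
stan_code <- create_stancode(models)
stan_file <- file.path(tempdir(), "cure_model.stan")
writeLines(stan_code, stan_file)
library(cmdstanr)
# include_paths should point at the folder containing include/distributions.stan (assumed layout)
mod <- cmdstan_model(stan_file, include_paths = ".")
# fit <- mod$sample(data = stan_data)  # stan_data assembled separately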
make_latent_model_code <- function(model, id = 1L) {
scode <- list()
scode$data_def <-
common_code_event_data(id)
scode$trans_params_def <-
glue("\nvector[N_{id}] lp_{id};\n")
if (model == "exp") {
scode$trans_params_def <-
glue(scode$trans_params_def,
"\n// rate parameters
vector<lower=0>[N_{id}] lambda_{id};\n")
scode$parameters <- ""
scode$trans_params_main <-
glue("\nlambda_{id} = exp(lp_{id});\n")
scode$model <- ""
scode$generated_quantities_def <- ""
scode$generated_quantities_main <-
glue("mean_{id} = exp(beta_{id}[1]);\n")
}
if (model %in% c("gompertz", "loglogistic", "weibull")) {
scode$data_def <-
glue(scode$data_def,
"real<lower=0> a_shape_{id};\n",
"real<lower=0> b_shape_{id};\n")
# "array[0] real x_r; // empty data array\n",
# "array[0] int x_i; // for integration\n")
scode$parameters <-
glue("real<lower=0> shape_{id};\n")
scode$trans_params_def <-
glue(scode$trans_params_def,
"// rate parameters
vector[N_{id}] lambda_{id};\n")
scode$trans_params_main <-
glue("lambda_{id} = exp(lp_{id});\n",
"for (i in 1:num_elements(lambda_{id})) {{\n",
" if (lambda_{id}[i] > 100) {{ // constrain upper limit\n",
" lambda_{id}[i] = 100;\n",
" }\n",
" if (lambda_{id}[i] <= 0) {{ // constrain lower limit\n",
" lambda_{id}[i] = 1e-10;\n",
" }\n",
"}\n")
scode$model <-
glue("shape_{id} ~ gamma(a_shape_{id}, b_shape_{id});\n")
scode$generated_quantities_def <-
glue("real pshape_{id} = gamma_rng(a_shape_{id}, b_shape_{id});\n")
scode$generated_quantities_main <-
glue("mean_{id} = exp(beta_{id}[1]);\n")
}
scode
}
create_code_skeleton <- function(n_grp) {
scode <- list()
ids <- data.frame(id = 1:n_grp)
scode$data_def <-
c(" int<lower=1> nTx;\n")
scode$data_main <-
paste0(
paste("\nint<lower=1, upper=2> bg_model;\n",
"vector[bg_model == 1 ? H_1 : 0] mu_bg;\n",
"vector<lower=0>[bg_model == 1 ? H_1 : 0] sigma_bg;\n", collapse = "\n"),
cglue_data(ids, " vector<lower=0>[bg_model == 2 ? N_{id} : 0] h_bg_{id};\n"),
paste("\n matrix[nTx, nTx] Tx_dmat; // treatment design matrix\n",
"vector[cf_model == 3 ? nTx : 0] mu_alpha; // treatment regression coefficients\n",
"vector<lower=0>[cf_model == 3 ? nTx : 0] sigma_alpha;\n",
"int<lower=0> t_max;\n", collapse = "\n"),
cglue_data(ids,
"vector[cf_model == 2 ? nTx : 0] mu_alpha_{id};
vector<lower=0>[cf_model == 2 ? nTx : 0] sigma_alpha_{id};\n"))
scode$parameters <-
paste0(
paste("// coefficients in linear predictor (including intercept)\n",
"vector[bg_model == 1 ? H_1 : 0] beta_bg;\n",
"vector[cf_model != 2 ? nTx : 0] alpha;\n", collapse = "\n"),
cglue_data(ids, "vector[cf_model == 2 ? nTx : 0] alpha_{id};
vector[H_{id}] beta_{id};\n"), collapse = "\n")
scode$trans_params_def <-
cglue_data(ids,
"vector[N_{id}] lp_{id}_bg;\n
vector<lower=0>[N_{id}] lambda_{id}_bg;\n
")
scode$trans_params_main <-
cglue_data(ids,
"// correlated event times
lp_{id} = X_{id}*beta_{id};
// background survival with uncertainty\n
if (bg_model == 1) {{
lp_{id}_bg = X_{id}*beta_bg;
} else {{
lp_{id}_bg = log(h_bg_{id});
}\n
lambda_{id}_bg = exp(lp_{id}_bg);\n
")
scode$model <-
paste0(
cglue_data(ids, "\nint idx_{id};"),
cglue_data(ids,"
\nbeta_{id} ~ normal(mu_S_{id}, sigma_S_{id});\n"),
paste0("\nif (bg_model == 1) {\n",
"\t beta_bg ~ normal(mu_bg, sigma_bg);\n",
"}\n"), collapse = "\n")
...
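The templates above rely on the glue package; cglue_data appears to be a project-specific helper rather than a glue export. A minimal sketch of what such a helper might do (assumed, not from the source): apply a glue template to each row of the ids data frame and collapse the results into a single string.
library(glue)
# apply a template row-wise over a data frame, then collapse to one string
cglue_data <- function(.data, template, sep = "") {
  paste(glue_data(.data, template), collapse = sep)
}
ids <- data.frame(id = 1:2)
cglue_data(ids, "vector[N_{id}] lp_{id};", sep = "\n")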
Degree of variation: If the differences between your Stan models are truly substantial (e.g., entirely different likelihoods, highly custom priors that fundamentally change the model structure), then code generation might be beneficial.
Commonalities: If there’s a lot of shared structure and only minor tweaks (e.g., changing the number of groups, adding a specific type of covariate), flexible inputs with flags would likely be more maintainable.
Expertise and comfort: If you have strong metaprogramming skills in R and are comfortable with generating code dynamically, it might be a powerful approach. If you prioritize simpler, more direct code that’s easy to read and debug, flexible inputs are probably better.
Performance requirements: For highly performance-critical Stan models, generating optimized code for each specific case might be worth the extra complexity.
Nathan Green | UCL | n.green@ucl.ac.uk