The Stan Language

1 The Stan Language

The Stan language is a strongly statically typed compiled language (similar to C or C++). For us in practive, this means we have two major differences between Stan and interpreted languages that we use for our daily tasks like R and Python:

We declare a type (int, real, array, vector, matrix, ...) for each variable, and this type cannot change. This is contrary to languages like R and Python where we don’t have to declare a variable type, and we can overwrite and change a variable anytime.
A Stan program must be written completely and then compiled - we can’t run it line-by-line as we can in an interpreted language like R.

We write a Stan model down in a .stan file¹, after which the Stan program is internally translated to C++ and compiled.

2 Stan Program Blocks

A Stan model is written in program blocks, similarly to NONMEM with $PROB, $DATA, $PK, .... There is a good explanation of the Stan program blocks here, and I highly recommend you read that to get a more thorough understanding. Here we give a brief overview²:

2.1 `functions`

The functions block is an optional block at the beginning of the program where user-defined functions appear.
User defined random number generator functions and probability distributions can be defined here
Void functions (those that return no value) are allowed

Here is a functions block with two functions:

depot_1cmt() returns a vector containing the concentration of drug after a single dose assuming first-order absorption and elimination at a vector of timepoints.
normal_lb_rng() generates a single random number from a normal distribution that is truncated below at lb (often this is 0):

functions{
  
  vector depot_1cmt(real dose, real cl, real vc, real ka, 
                    vector time_since_dose){
    
    real ke = cl/vc;
    
    vector[num_elements(time_since_dose)] conc = dose/vc * ka/(ka - ke) .* 
      (exp(-ke*time_since_dose) - exp(-ka*time_since_dose));
    
    return conc;
    
  }
  
  real normal_lb_rng(real mu, real sigma, real lb){
    
    real p_lb = normal_cdf(lb | mu, sigma);
    real u = uniform_rng(p_lb, 1);
    real y = mu + sigma * inv_Phi(u);
    return y; 

  }
  
}

2.2 `data`

Data are specified upfront and remain fixed
They are either specified in the block or read from outside - generally as R users using cmdstanr we supply the $sample() function with either a list of R objects (most examples on this website do this) or the path to a data file containing the required variables.
They are read once at the beginning of the process

Example: Define observed PK (dv) data (and PD if you have it). We can also define our independent variables (time), parameters for our prior distributions (scale_x), covariates, times at which we want to make predictions (time_new), or anything else we want to input into the model.

data{
  
  int n_obs;                        // Number of observations
  real<lower = 0> dose;             // Dose amount
  real time_of_first_dose;          // Time of first dose
  array[n_obs] real<lower = time_of_first_dose> time; // Times at which we have observations
  vector[n_obs] dv;                 // Observed PK data
  
  real<lower = 0> location_cl;      // Prior location parameter for CL
  real<lower = 0> location_vc;      // Prior location parameter for VC
  real<lower = 0> location_ka;      // Prior location parameter for KA
  
  real<lower = 0> scale_cl;         // Prior Scale parameter for CL
  real<lower = 0> scale_vc;         // Prior Scale parameter for VC
  real<lower = 0> scale_ka;         // Prior Scale parameter for KA
  
  real<lower = 0> scale_sigma;      // Prior Scale parameter for lognormal error
  
  int n_time_new;                   // Number of new times at which to make a prediction
  array[n_time_new] real time_new;  // New times at which to make a prediction
 
}

2.3 `transformed data`

We declare and define variables that do not need to be changed when running the program.
We can hard code variables here (e.g. n_cmt).
We can also manipulate our data variables into a form we will use later in the Stan program.
The statements in transformed data are executed only once and directly after reading the data in the data block.

transformed data{ 
  
  vector[n_obs] time_since_dose = to_vector(time) - time_of_first_dose;
  vector[n_time_new] time_since_dose_new = to_vector(time_new) - 
                                                     time_of_first_dose;
  int n_cmt = 2;                                        
  
}

2.4 `parameters`

Parameters are altered during the sampling process. These are the model parameters sampled by Stan.
These are the ones we provide priors and initial estimates for later on
Can specify bounds here

Example: we define the parameters for the one-compartment depot model and constrain the absorption rate constant to be larger than elimination to ensure no flip-flop kinetics.

parameters{  
  
  real<lower = 0> CL;     
  real<lower = 0> VC;
  real<lower = CL/VC> KA;
  
  real<lower = 0> sigma;
  
}

2.5 `transformed parameters`

We define and calculate variables that are needed for the calculation of the target density or other values we want to keep. In practice, this means we calculate values needed to compute the likelihood.
If parameters depend on both data and parameters, we specify them in the transformed parameters block.
If parameters depend on only data, they should be specified in transformed data.
The statements in transformed parameters are calculated at every leapfrog step in the NUTS algorithm, so the calculation is relatively expensive. Quantities that you wish to keep but aren’t necessary for computing the target density should be computed in generated quantities.

Example: Calculate the PK expected value (ipred) before accounting for the residual error

transformed parameters{
  
  vector[n_obs] ipred = depot_1cmt(dose, CL, VC, KA, time_since_dose);
  
}

2.6 `model`

We define the model here
Stochastic definitions and sampling statements are included here
Constraints on parameters and the statements in this block define prior distributions
Likelihood statement is defined here

Examples:

Specifying the prior distributions (CL ~ , VC ~, KA ~)
Likelihood dv is defined in vectorized notation here.

model{ 
  
  // Priors
  CL ~ lognormal(log(location_cl), scale_cl);
  VC ~ lognormal(log(location_vc), scale_vc);
  KA ~ lognormal(log(location_ka), scale_ka) T[CL/VC, ];
  
  sigma ~ normal(0, scale_sigma);
  
  // Likelihood
  dv ~ lognormal(log(ipred), sigma);

}

Note

You can also increment the log density with the equivalent statements target += lognormal_lpdf(CL | log(location_cl), scale_cl); and so on.

model{ 
  
  // Priors
  target += lognormal_lpdf(CL | log(location_cl), scale_cl);
  target += lognormal_lpdf(VC | log(location_vc), scale_vc);
  target += lognormal_lpdf(KA | log(location_ka), scale_ka) - 
            lognormal_lccdf(CL/VC | log(location_ka), scale_sigma);
            
  target += normal_lpdf(sigma | 0, scale_sigma) - 
            normal_lccdf(0 | 0, scale_sigma); 
  
  // Likelihood
  target += lognormal_lpdf(dv | log(ipred), sigma))

}

2.7 `generated quantities`

Used to calculate a derived quantity or some other quantity you wish to keep in the output (posterior event probabilities, transforming parameters for reporting, …)
Used to make predictions
This block is executed only once per iteration, so is computationally inexpensive.

Examples:

Posterior predictive check (dv_ppc) or a measurement of a plasma concentration (dv_pred) at an unobserved time.
We might want to have samples for the elimination rate constant, KE, but it did not play a role in the model, so we do that here rather than in transformed parameters.

generated quantities{
  
  real<lower = 0> KE = CL/V;
  real<lower = 0> sigma_sq = square(sigma);
  vector[n_obs] dv_ppc;
  vector[n_time_new] ipred_new;
  vector[n_time_new] dv_new;
  
  dv_ppc = to_vector(lognormal_rng(log(ipred), sigma));
  ipred_new = depot_1cmt(dose, CL, V, KA, time_since_dose_new);
  dv_new = to_vector(lognormal_rng(log(ipred_new), sigma));
  
}

Footnotes

The model can be written inline in a text string, but it’s highly discouraged for anything beyond the simplest of models (nothing we see in the PK/PD world)↩︎
See here for a simple linear regression example that showcases the Stan program blocks and gives some tips and examples and here for a PK example with a single individual.↩︎

--- title: "The Stan Language" format: html: code-fold: false code-summary: "Code" code-tools: true number-sections: true toc-depth: 5 toc-location: right execute: warning: false error: false editor_options: chunk_output_type: console --- ```{r} #| label: other-setup #| include: true #| message: false #| warning: false #| echo: false #| output: false cmdstanr::register_knitr_engine(override = TRUE) ``` ## The Stan Language The Stan language is a strongly statically typed [compiled language](https://www.geeksforgeeks.org/difference-between-compiled-and-interpreted-language/) (similar to C or C++). For us in practive, this means we have two major differences between Stan and interpreted languages that we use for our daily tasks like R and Python: 1. We declare a type (`int, real, array, vector, matrix, ...`) for each variable, and this type cannot change. This is contrary to languages like R and Python where we don't have to declare a variable type, and we can overwrite and change a variable anytime. 2. A Stan program must be written completely and then compiled - we can't run it line-by-line as we can in an interpreted language like R. We write a Stan model down in a `.stan` file^[The model can be written inline in a text string, but it's highly discouraged for anything beyond the simplest of models (nothing we see in the PK/PD world)], after which the Stan program is internally translated to C++ and compiled. ## Stan Program Blocks A Stan model is written in program blocks, similarly to NONMEM with `$PROB, $DATA, $PK, ...`. There is a good explanation of the Stan program blocks [here](https://mc-stan.org/docs/reference-manual/blocks.html#overview-of-stans-program-blocks), and I highly recommend you read that to get a more thorough understanding. Here we give a brief overview[^examples]: [^examples]: See [here](/Content/Fundamentals/Stan/simple_linear_regression.qmd) for a simple linear regression example that showcases the Stan program blocks and gives some tips and examples and [here](/Content/Fundamentals/Stan/single_individual_pk.qmd) for a PK example with a single individual. ### `functions` - The `functions` block is an optional block at the beginning of the program where [user-defined functions](https://mc-stan.org/docs/reference-manual/user-functions.html) appear. - User defined random number generator functions and probability distributions can be defined here - Void functions (those that return no value) are allowed Here is a `functions` block with two functions: 1. `depot_1cmt()` returns a vector containing the [concentration of drug](http://www.pfim.biostat.fr/pfim/download/PFIM_PKPD_library.pdf) after a single dose assuming first-order absorption and elimination at a vector of timepoints. 2. `normal_lb_rng()` generates a single random number from a normal distribution that is truncated below at ```lb``` (often this is 0): ```{stan, output.var="functions"} #| label: functions #| cache: true functions{ vector depot_1cmt(real dose, real cl, real vc, real ka, vector time_since_dose){ real ke = cl/vc; vector[num_elements(time_since_dose)] conc = dose/vc * ka/(ka - ke) .* (exp(-ke*time_since_dose) - exp(-ka*time_since_dose)); return conc; } real normal_lb_rng(real mu, real sigma, real lb){ real p_lb = normal_cdf(lb | mu, sigma); real u = uniform_rng(p_lb, 1); real y = mu + sigma * inv_Phi(u); return y; } } ``` ### `data` - Data are specified upfront and remain fixed - They are either specified in the block or read from outside - generally as R users using ```cmdstanr``` we supply the [$sample()](https://mc-stan.org/cmdstanr/reference/model-method-sample.html) function with either a list of R objects (most examples on this website do this) or the path to a data file containing the required variables. - They are read once at the beginning of the process Example: Define observed PK (`dv`) data (and PD if you have it). We can also define our independent variables (`time`), parameters for our prior distributions (`scale_x`), covariates, times at which we want to make predictions (`time_new`), or anything else we want to input into the model. ```stan data{ int n_obs; // Number of observations real<lower = 0> dose; // Dose amount real time_of_first_dose; // Time of first dose array[n_obs] real<lower = time_of_first_dose> time; // Times at which we have observations vector[n_obs] dv; // Observed PK data real<lower = 0> location_cl; // Prior location parameter for CL real<lower = 0> location_vc; // Prior location parameter for VC real<lower = 0> location_ka; // Prior location parameter for KA real<lower = 0> scale_cl; // Prior Scale parameter for CL real<lower = 0> scale_vc; // Prior Scale parameter for VC real<lower = 0> scale_ka; // Prior Scale parameter for KA real<lower = 0> scale_sigma; // Prior Scale parameter for lognormal error int n_time_new; // Number of new times at which to make a prediction array[n_time_new] real time_new; // New times at which to make a prediction } ``` ### `transformed data` - We declare and define variables that do not need to be changed when running the program. - We can hard code variables here (e.g. `n_cmt`). - We can also manipulate our `data` variables into a form we will use later in the Stan program. - The statements in `transformed data` are executed only once and directly after reading the data in the `data` block. ```stan transformed data{ vector[n_obs] time_since_dose = to_vector(time) - time_of_first_dose; vector[n_time_new] time_since_dose_new = to_vector(time_new) - time_of_first_dose; int n_cmt = 2; } ``` ### `parameters` - Parameters are altered during the sampling process. These are the model parameters sampled by Stan. - These are the ones we provide priors and initial estimates for later on - Can specify bounds here Example: we define the parameters for the one-compartment depot model and constrain the absorption rate constant to be larger than elimination to ensure no flip-flop kinetics. ```stan parameters{ real<lower = 0> CL; real<lower = 0> VC; real<lower = CL/VC> KA; real<lower = 0> sigma; } ``` ### `transformed parameters` - We define and calculate variables that are needed for the calculation of the target density or other values we want to keep. In practice, this means we calculate values needed to compute the likelihood. - If parameters depend on both data and parameters, we specify them in the transformed parameters block. - If parameters depend on only data, they should be specified in `transformed data`. - The statements in `transformed parameters` are calculated at every leapfrog step in the NUTS algorithm, so the calculation is relatively expensive. Quantities that you wish to keep but aren't necessary for computing the target density should be computed in `generated quantities`. Example: Calculate the PK expected value (`ipred`) before accounting for the residual error ```stan transformed parameters{ vector[n_obs] ipred = depot_1cmt(dose, CL, VC, KA, time_since_dose); } ``` ### `model` - We define the model here - Stochastic definitions and sampling statements are included here\ - Constraints on parameters and the statements in this block define prior distributions\ - Likelihood statement is defined here Examples: 1. Specifying the prior distributions (`CL ~ , VC ~, KA ~`) 2. Likelihood `dv` is defined in vectorized notation here. ```stan model{ // Priors CL ~ lognormal(log(location_cl), scale_cl); VC ~ lognormal(log(location_vc), scale_vc); KA ~ lognormal(log(location_ka), scale_ka) T[CL/VC, ]; sigma ~ normal(0, scale_sigma); // Likelihood dv ~ lognormal(log(ipred), sigma); } ``` :::{.callout-note} You can also increment the log density with the equivalent statements `target += lognormal_lpdf(CL | log(location_cl), scale_cl);` and so on. ::: ```stan model{ // Priors target += lognormal_lpdf(CL | log(location_cl), scale_cl); target += lognormal_lpdf(VC | log(location_vc), scale_vc); target += lognormal_lpdf(KA | log(location_ka), scale_ka) - lognormal_lccdf(CL/VC | log(location_ka), scale_sigma); target += normal_lpdf(sigma | 0, scale_sigma) - normal_lccdf(0 | 0, scale_sigma); // Likelihood target += lognormal_lpdf(dv | log(ipred), sigma)) } ``` ### `generated quantities` - Used to calculate a derived quantity or some other quantity you wish to keep in the output (posterior event probabilities, transforming parameters for reporting, ...) - Used to make predictions - This block is executed only once per iteration, so is computationally inexpensive. Examples: 1. Posterior predictive check (`dv_ppc`) or a measurement of a plasma concentration (`dv_pred`) at an unobserved time. 2. We might want to have samples for the elimination rate constant, `KE`, but it did not play a role in the model, so we do that here rather than in `transformed parameters`. ```stan generated quantities{ real<lower = 0> KE = CL/V; real<lower = 0> sigma_sq = square(sigma); vector[n_obs] dv_ppc; vector[n_time_new] ipred_new; vector[n_time_new] dv_new; dv_ppc = to_vector(lognormal_rng(log(ipred), sigma)); ipred_new = depot_1cmt(dose, CL, V, KA, time_since_dose_new); dv_new = to_vector(lognormal_rng(log(ipred_new), sigma)); } ```

1 The Stan Language

2 Stan Program Blocks

2.1 functions

2.2 data

2.3 transformed data

2.4 parameters

2.5 transformed parameters

2.6 model

2.7 generated quantities

Footnotes

2.1 `functions`

2.2 `data`

2.3 `transformed data`

2.4 `parameters`

2.5 `transformed parameters`

2.6 `model`

2.7 `generated quantities`