MCMC Sampling#

Overview#

Stan’s MCMC sampler implements the Hamiltonian Monte Carlo (HMC) algorithm and its adaptive variant the no-U-turn sampler (NUTS). It creates a set of draws from the posterior distribution of the model conditioned on the data, allowing for exact Bayesian inference of the model parameters. Each draw consists of the values for all parameter, transformed parameter, and generated quantities variables, reported on the constrained scale.

The CmdStanModel sample method wraps the CmdStan sample method. Underlyingly, the CmdStan outputs are a set of per-chain Stan CSV files. In addition to the resulting sample, reported as one row per draw, the Stan CSV files encode information about the inference engine configuration and the sampler state. The NUTS-HMC adaptive sampler algorithm also outputs the per-chain HMC tuning parameters step_size and metric.

The sample method returns a CmdStanMCMC object, which provides access to the disparate information from the Stan CSV files. Accessor functions allow the user to access the sample in whatever data format is needed for further analysis.

The sample can be extracted in tabular format, either as
- an numpy.ndarray
- a pandas.DataFrame
The sample can be treated as a collection of named, structured variables, and extracted as
- a Python dict mapping names to numpy.ndarray objects
- an xarray.Dataset

In addition, the CmdStanMCMC object has accessor methods for

The per-chain HMC tuning parameters step_size and metric
The CmdStan run configuration and console outputs
The sampler algorithm diagnostics
The mapping between the Stan model variables and the corresponding CSV file columns.

Notebook prerequisites#

CmdStanPy displays progress bars during sampling via use of package tqdm. In order for these to display properly in a Jupyter notebook, you must have the ipywidgets package installed, and depending on your version of Jupyter or JupyterLab, you must enable it via command:

[1]:

!jupyter nbextension enable --py widgetsnbextension

Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK

For more information, see the the installation instructions, also this tqdm GitHub issue.

Fitting the model and data#

In this example we use the CmdStan example model bernoulli.stan and data file bernoulli.data.json.

We instantiate a CmdStanModel from the Stan program file

[2]:

import os
from cmdstanpy import CmdStanModel

# instantiate, compile bernoulli model
model = CmdStanModel(stan_file='bernoulli.stan')

By default, the model is compiled during instantiation. The compiled executable is created in the same directory as the program file. If the directory already contains an executable file with a newer timestamp, the model is not recompiled.

We run the sampler on the data using all default settings: 4 chains, each of which runs 1000 warmup and sampling iterations.

[3]:

# run CmdStan's sample method, returns object `CmdStanMCMC`
fit = model.sample(data='bernoulli.data.json')

17:32:09 - cmdstanpy - INFO - CmdStan start processing

17:32:09 - cmdstanpy - INFO - CmdStan done processing.

The CmdStanMCMC object records the command, the return code, and the paths to the sampler output csv and console files. The sample is lazily instantiated on first access of either the draws or the HMC tuning parameters, i.e., the step size and metric.

The string representation of this object displays the CmdStan commands and the location of the output files. Output filenames are composed of the model name, a timestamp in the form YYYYMMDDhhmmss and the chain id, plus the corresponding filetype suffix, either ‘.csv’ for the CmdStan output or ‘.txt’ for the console messages, e.g. bernoulli-20220617170100_1.csv.

[4]:

fit

[4]:

CmdStanMCMC: model=bernoulli chains=4['method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
 csv_files:
        /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/bernoullibqw94rqo/bernoulli-20220626173209_1.csv
        /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/bernoullibqw94rqo/bernoulli-20220626173209_2.csv
        /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/bernoullibqw94rqo/bernoulli-20220626173209_3.csv
        /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/bernoullibqw94rqo/bernoulli-20220626173209_4.csv
 output_files:
        /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/bernoullibqw94rqo/bernoulli-20220626173209-stdout.txt

Sampler Progress#

Your model make take a long time to fit. The sample method provides two arguments:

visual progress bar: show_progress=True
stream CmdStan output to the console - show_console=True

By default, CmdStanPy displays a progress bar during sampling, as seen above. Since the progress bars are only visible while the sampler is running and the bernoulli example model takes no time at all to fit, we run this model for 200K iterations, in order to see the progress bars in action.

[5]:

fit = model.sample(data='bernoulli.data.json', iter_warmup=100000, iter_sampling=100000, show_progress=True)

17:32:09 - cmdstanpy - INFO - CmdStan start processing

17:32:11 - cmdstanpy - INFO - CmdStan done processing.

To see the CmdStan console outputs instead of progress bars, specify show_console=True This will stream all CmdStan messages to the terminal while the sampler is running. This option will allow you to debug a Stan program using the Stan language print statement.

[6]:

fit = model.sample(data='bernoulli.data.json', chains=2, parallel_chains=1, show_console=True)

17:32:13 - cmdstanpy - INFO - CmdStan start processing
17:32:13 - cmdstanpy - INFO - CmdStan done processing

method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.050000000000000003 (Default)
delta = 0.80000000000000004 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file =  (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
num_chains = 2
id = 1 (Default)
data
file = bernoulli.data.json
init = 2 (Default)
random
seed = 51542
output
file = /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/bernoullix2pr1fwg/bernoulli-20220626173213.csv
diagnostic_file =  (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
profile_file = profile.csv (Default)
num_threads = 1 (Default)


Gradient evaluation took 5e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Adjust your expectations accordingly!



Gradient evaluation took 1e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.01 seconds.
Adjust your expectations accordingly!


Chain [1] Iteration:    1 / 2000 [  0%]  (Warmup)
Chain [1] Iteration:  100 / 2000 [  5%]  (Warmup)
Chain [1] Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain [1] Iteration:  300 / 2000 [ 15%]  (Warmup)
Chain [1] Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain [1] Iteration:  500 / 2000 [ 25%]  (Warmup)
Chain [1] Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain [1] Iteration:  700 / 2000 [ 35%]  (Warmup)
Chain [1] Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain [1] Iteration:  900 / 2000 [ 45%]  (Warmup)
Chain [1] Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain [1] Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain [1] Iteration: 1100 / 2000 [ 55%]  (Sampling)
Chain [1] Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain [1] Iteration: 1300 / 2000 [ 65%]  (Sampling)
Chain [1] Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain [1] Iteration: 1500 / 2000 [ 75%]  (Sampling)
Chain [1] Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain [1] Iteration: 1700 / 2000 [ 85%]  (Sampling)
Chain [1] Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain [1] Iteration: 1900 / 2000 [ 95%]  (Sampling)
Chain [1] Iteration: 2000 / 2000 [100%]  (Sampling)

Elapsed Time: 0.004 seconds (Warm-up)
0.011 seconds (Sampling)
0.015 seconds (Total)

Chain [2] Iteration:    1 / 2000 [  0%]  (Warmup)
Chain [2] Iteration:  100 / 2000 [  5%]  (Warmup)
Chain [2] Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain [2] Iteration:  300 / 2000 [ 15%]  (Warmup)
Chain [2] Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain [2] Iteration:  500 / 2000 [ 25%]  (Warmup)
Chain [2] Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain [2] Iteration:  700 / 2000 [ 35%]  (Warmup)
Chain [2] Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain [2] Iteration:  900 / 2000 [ 45%]  (Warmup)
Chain [2] Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain [2] Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain [2] Iteration: 1100 / 2000 [ 55%]  (Sampling)
Chain [2] Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain [2] Iteration: 1300 / 2000 [ 65%]  (Sampling)
Chain [2] Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain [2] Iteration: 1500 / 2000 [ 75%]  (Sampling)
Chain [2] Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain [2] Iteration: 1700 / 2000 [ 85%]  (Sampling)
Chain [2] Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain [2] Iteration: 1900 / 2000 [ 95%]  (Sampling)
Chain [2] Iteration: 2000 / 2000 [100%]  (Sampling)

Elapsed Time: 0.004 seconds (Warm-up)
0.01 seconds (Sampling)
0.014 seconds (Total)

Checking the fit#

The first question to ask of the CmdStanMCMC object is: is this a valid sample from the posterior?

It is important to check whether or not the sampler was able to fit the model given the data. Often, this is not possible, for any number of reasons. To appreciate the sampler diagnostics, we use a hierarchical model which, given a small amount of data, encounters difficulty: the centered parameterization of the “8-schools” model (Rubin, 1981). The “8-schools” model is a simple hierarchical model, first developed on a dataset taken from an experiment was conducted in 8 schools, with only treatment effects and their standard errors reported.

The Stan model and the original dataset are in files eight_schools.stan and eight_schools.data.json.

eight_schools.stan

[7]:

with open('eight_schools.stan', 'r') as fd:
    print(fd.read())

data {
  int<lower=0> J; // number of schools
  array[J] real y; // estimated treatment effect (school j)
  array[J] real<lower=0> sigma; // std err of effect estimate (school j)
}
parameters {
  real mu;
  array[J] real theta;
  real<lower=0> tau;
}
model {
  theta ~ normal(mu, tau);
  y ~ normal(theta, sigma);
}

eight_schools.data.json

[8]:

with open('eight_schools.data.json', 'r') as fd:
    print(fd.read())

{
    "J" : 8,
    "y" : [28,8,-3,7,-1,1,18,12],
    "sigma" : [15,10,16,11,9,11,10,18],
    "tau" : 25
}

Because there is not much data, the geometry of posterior distribution is highly curved, thus the sampler may encounter difficulty in fitting the model. By specifying the initial seed for the pseudo-random number generator, we insure that the sampler will have difficulty in fitting this model. In particular, some post-warmup iterations diverge, resulting in a biased sample. In addition, some post-warmup iterations hit the maximum allowed treedepth before the trajectory hits the “U-turn” condition of the NUTS algorithm, in which case the sampler may fail to properly explore the entire posterior.

These diagnostics are checked for automatically at the end of each run; if problems are detected, a WARNING message is logged.

[9]:

eight_schools_model = CmdStanModel(stan_file='eight_schools.stan')
eight_schools_fit = eight_schools_model.sample(data='eight_schools.data.json', seed=55157)

17:32:14 - cmdstanpy - INFO - CmdStan start processing

17:32:14 - cmdstanpy - INFO - CmdStan done processing.
17:32:14 - cmdstanpy - WARNING - Some chains may have failed to converge.
        Chain 1 had 10 divergent transitions (1.0%)
        Chain 2 had 143 divergent transitions (14.3%)
        Chain 3 had 5 divergent transitions (0.5%)
        Chain 4 had 4 divergent transitions (0.4%)
        Chain 4 had 6 iterations at max treedepth (0.6%)
        Use function "diagnose()" to see further information.

The number of post-warmup divergences and iterations which hit the maximum treedepth can be inspected directly via properties divergences and max_treedepths.

[10]:

print(f'divergences:\n{eight_schools_fit.divergences}\niterations at max_treedepth:\n{eight_schools_fit.max_treedepths}')

divergences:
[ 10 143   5   4]
iterations at max_treedepth:
[0 0 0 6]

Summarizing the sample#

The summary method reports the R-hat statistic, a measure of how well the sampler chains have converged.

[11]:

eight_schools_fit.summary()

[11]:

	Mean	MCSE	StdDev	5%	50%	95%	N_Eff	N_Eff/s	R_hat
name
lp__	-16.47510	1.393830	7.33884	-26.342300	-17.71920	-5.82214	27.72280	44.14460	1.12057
mu	7.36580	0.358441	5.36980	-0.090157	6.98708	16.39460	224.43000	357.37300	1.03328
theta[1]	11.02040	0.693969	8.59668	0.398902	9.67094	27.12670	153.45500	244.35500	1.03967
theta[2]	7.44547	0.222970	6.32424	-2.193170	6.77458	18.33640	804.49900	1281.05000	1.01501
theta[3]	5.46814	0.230146	7.65499	-7.436710	5.20094	17.52600	1106.33000	1761.67000	1.00842
theta[4]	7.07208	0.257662	6.80149	-3.296620	6.58401	18.50130	696.79800	1109.55000	1.01756
theta[5]	4.59664	0.193292	6.13195	-6.067650	4.04537	14.53360	1006.39000	1602.54000	1.00407
theta[6]	5.69326	0.199079	6.77053	-5.402900	5.30265	16.48680	1156.63000	1841.76000	1.00729
theta[7]	10.04950	0.628118	7.10079	0.675246	9.17402	23.36110	127.80000	203.50400	1.03588
theta[8]	7.84009	0.291526	8.04207	-4.016420	7.00731	21.58680	760.99400	1211.77000	1.01463
tau	6.60333	0.557360	5.77492	1.019850	5.21419	17.68580	107.35287	170.94406	1.04858

Sampler Diagnostics#

The diagnose() method provides more information about the sample.

[12]:

print(eight_schools_fit.diagnose())

Processing csv files: /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/eight_schoolssk9k0rkd/eight_schools-20220626173214_1.csv, /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/eight_schoolssk9k0rkd/eight_schools-20220626173214_2.csv, /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/eight_schoolssk9k0rkd/eight_schools-20220626173214_3.csv, /var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmp86q_w0yg/eight_schoolssk9k0rkd/eight_schools-20220626173214_4.csv

Checking sampler transitions treedepth.
6 of 4000 (0.15%) transitions hit the maximum treedepth limit of 10, or 2^10 leapfrog steps.
Trajectories that are prematurely terminated due to this limit will result in slow exploration.
For optimal performance, increase this limit.

Checking sampler transitions for divergences.
162 of 4000 (4.05%) transitions ended with a divergence.
These divergent transitions indicate that HMC is not fully able to explore the posterior distribution.
Try increasing adapt delta closer to 1.
If this doesn't remove all divergences, try to reparameterize the model.

Checking E-BFMI - sampler transitions HMC potential energy.
The E-BFMI, 0.18, is below the nominal threshold of 0.30 which suggests that HMC may have trouble exploring the target distribution.
If possible, try to reparameterize the model.

Effective sample size satisfactory.

Split R-hat values satisfactory all parameters.

Processing complete.

Accessing the sampler outputs#

[13]:

fit = model.sample(data='bernoulli.data.json')

17:32:14 - cmdstanpy - INFO - CmdStan start processing

17:32:15 - cmdstanpy - INFO - CmdStan done processing.

Extracting the draws in tabular format#

The sample can be accessed either as a numpy array or a pandas DataFrame:

[14]:

print(f'sample as ndarray: {fit.draws().shape}\nfirst 2 draws, chain 1:\n{fit.draws()[:2, 0, :]}')

sample as ndarray: (1000, 4, 8)
first 2 draws, chain 1:
[[-6.75944   1.        1.13928   2.        3.        0.        6.77954
   0.2692  ]
 [-7.09693   0.867647  1.13928   1.        1.        0.        7.09717
   0.36266 ]]

[15]:

fit.draws_pd().head()

[15]:

	lp__	accept_stat__	stepsize__	treedepth__	n_leapfrog__	energy__	theta
0	-6.75944	1.000000	1.13928	2.0	3.0	6.77954	0.269200
1	-7.09693	0.867647	1.13928	1.0	1.0	7.09717	0.362660
2	-7.23328	0.937972	1.13928	1.0	1.0	7.38250	0.384201
3	-7.05500	1.000000	1.13928	2.0	3.0	7.16122	0.161502
4	-7.28793	0.941969	1.13928	1.0	1.0	7.33886	0.137127

Extracting the draws as structured Stan program variables#

Per-variable draws can be accessed as either a numpy.ndarray object via method stan_variable or as an xarray.Dataset object via draws_xr.

[16]:

print(fit.stan_variable('theta'))

[0.2692   0.36266  0.384201 ... 0.277657 0.277316 0.101517]

[17]:

print(fit.draws_xr('theta'))

<xarray.Dataset>
Dimensions:  (draw: 1000, chain: 4)
Coordinates:
  * chain    (chain) int64 1 2 3 4
  * draw     (draw) int64 0 1 2 3 4 5 6 7 8 ... 992 993 994 995 996 997 998 999
Data variables:
    theta    (chain, draw) float64 0.2692 0.3627 0.3842 ... 0.2777 0.2773 0.1015
Attributes:
    stan_version:        2.29.2
    model:               bernoulli_model
    num_draws_sampling:  1000

The stan_variables method returns a Python dict over all Stan variables in the output.

[18]:

for k, v in fit.stan_variables().items():
    print(f'name: {k}, shape: {v.shape}')

name: theta, shape: (4000,)

Extracting sampler method diagnostics#

[19]:

for k, v in fit.method_variables().items():
    print(f'name: {k}, shape: {v.shape}')

name: lp__, shape: (1000, 4)
name: accept_stat__, shape: (1000, 4)
name: stepsize__, shape: (1000, 4)
name: treedepth__, shape: (1000, 4)
name: n_leapfrog__, shape: (1000, 4)
name: divergent__, shape: (1000, 4)
name: energy__, shape: (1000, 4)

Extracting the per-chain HMC tuning parameters#

[20]:

print(f'adapted step_size per chain\n{fit.step_size}\nmetric_type: {fit.metric_type}\nmetric:\n{fit.metric}')

adapted step_size per chain
[1.13928  1.02295  1.13435  0.993666]
metric_type: diag_e
metric:
[[0.515798]
 [0.456194]
 [0.588636]
 [0.474007]]

Extracting the sample meta-data#

[21]:

print('sample method variables:\n{}\n'.format(fit.metadata.method_vars_cols.keys()))
print('stan model variables:\n{}'.format(fit.metadata.stan_vars_cols.keys()))

sample method variables:
dict_keys(['lp__', 'accept_stat__', 'stepsize__', 'treedepth__', 'n_leapfrog__', 'divergent__', 'energy__'])

stan model variables:
dict_keys(['theta'])

Saving the sampler output files#

The sampler output files are written to a temporary directory which is deleted upon session exit unless the output_dir argument is specified. The save_csvfiles function moves the CmdStan CSV output files to a specified directory without having to re-run the sampler. The console output files are not saved. These files are treated as ephemeral; if the sample is valid, all relevant information is recorded in the CSV files.

Parallelization via multi-threaded processing#

Stan’s multi-threaded processing is based on the Intel Threading Building Blocks (TBB) library, which must be linked to by the C++ compiler. To take advantage of this option, you must compile (or recompile) the program with the the C++ compiler option STAN_THREADS. The CmdStanModel object constructor and its compile method both have argument cpp_options which takes as its value a dictionary of compiler flags.

We compile the example model bernoulli.stan, this time with arguments cpp_options and compile, and use the function exe_info() to check that the model has been compiled for multi-threading.

[22]:

model = CmdStanModel(stan_file='bernoulli.stan',
                     cpp_options={'STAN_THREADS': 'TRUE'},
                     compile='force')
model.exe_info()

17:32:15 - cmdstanpy - INFO - compiling stan file /Users/mitzi/github/stan-dev/cmdstanpy/docsrc/examples/bernoulli.stan to exe file /Users/mitzi/github/stan-dev/cmdstanpy/docsrc/examples/bernoulli
17:32:24 - cmdstanpy - INFO - compiled model executable: /Users/mitzi/github/stan-dev/cmdstanpy/docsrc/examples/bernoulli

[22]:

{'stan_version_major': '2',
 'stan_version_minor': '29',
 'stan_version_patch': '2',
 'STAN_THREADS': 'true',
 'STAN_MPI': 'false',
 'STAN_OPENCL': 'false',
 'STAN_NO_RANGE_CHECKS': 'false',
 'STAN_CPP_OPTIMS': 'false'}

Cross-chain multi-threading#

As of version CmdStan 2.28, it is possible to run the NUTS-HMC sampler on multiple chains from within a single executable using threads. This has the potential to speed up sampling. It also reduces the overall memory footprint required for sampling as all chains share the same copy of data.the input data. When using within-chain parallelization all chains started within a single executable can share all the available threads and once a chain finishes the threads will be reused.

The sample program argument parallel_chains takes an integer value which specifies how many chains to run in parallel. For models which have been compiled with option STAN_THREADS set, all chains are run from within a single process and the value of the parallel_chains argument specifies the total number of threads.

[23]:

fit = model.sample(data='bernoulli.data.json', parallel_chains=4)

17:32:24 - cmdstanpy - INFO - CmdStan start processing

17:32:24 - cmdstanpy - INFO - CmdStan done processing.

Within-chain multi-threading#

The Stan language reduce_sum function provides within-chain parallelization. For models which require computing the sum of a number of independent function evaluations, e.g., when evaluating a number of conditionally independent terms in a log-likelihood, the reduce_sum function is used to parallelize this computation.

To see how this works, we run the “reflag” model, used in the reduce_sum minimal example case study. The Stan model and the original dataset are in files “redcard_reduce_sum.stan” and “redcard.json”.

[24]:

with open('redcard_reduce_sum.stan', 'r') as fd:
    print(fd.read())

functions {
  real partial_sum(array[] int slice_n_redcards, int start, int end,
                   array[] int n_games, vector rating, vector beta) {
    return binomial_logit_lpmf(slice_n_redcards | n_games[start : end], beta[1]
                                                                    + beta[2]
                                                                    * rating[start : end]);
  }
}
data {
  int<lower=0> N;
  array[N] int<lower=0> n_redcards;
  array[N] int<lower=0> n_games;
  vector[N] rating;
  int<lower=1> grainsize;
}
parameters {
  vector[2] beta;
}
model {
  beta[1] ~ normal(0, 10);
  beta[2] ~ normal(0, 1);

  target += reduce_sum(partial_sum, n_redcards, grainsize, n_games, rating,
                       beta);
}

As before, we compile the model specifying argument cpp_options.

[25]:

redcard_model = CmdStanModel(stan_file='redcard_reduce_sum.stan',
                     cpp_options={'STAN_THREADS': 'TRUE'},
                     compile='force')
redcard_model.exe_info()

17:32:24 - cmdstanpy - INFO - compiling stan file /Users/mitzi/github/stan-dev/cmdstanpy/docsrc/examples/redcard_reduce_sum.stan to exe file /Users/mitzi/github/stan-dev/cmdstanpy/docsrc/examples/redcard_reduce_sum
17:32:33 - cmdstanpy - INFO - compiled model executable: /Users/mitzi/github/stan-dev/cmdstanpy/docsrc/examples/redcard_reduce_sum

[25]:

{'stan_version_major': '2',
 'stan_version_minor': '29',
 'stan_version_patch': '2',
 'STAN_THREADS': 'true',
 'STAN_MPI': 'false',
 'STAN_OPENCL': 'false',
 'STAN_NO_RANGE_CHECKS': 'false',
 'STAN_CPP_OPTIMS': 'false'}

The sample method argument threads_per_chain specifies the number of threads allotted to each chain; this corresponds to CmdStan’s num_threads argument.

[28]:

redcard_fit = redcard_model.sample(data='redcard.json', threads_per_chain=4)

17:33:36 - cmdstanpy - INFO - CmdStan start processing

17:34:21 - cmdstanpy - INFO - CmdStan done processing.

The number of threads to use is passed to the model exe file by means of the shell environment variable STAN_NUM_THREADS.

On my machine, which has 4 cores, all 4 chains are run in parallel from within a single process. Therefore, the total number of threads used by this process will be threads_per_chain * chains. To check this, we examine the shell environment variable STAN_NUM_THREADS.

[29]:

os.environ['STAN_NUM_THREADS']

[29]:

'16'

[ ]:

CmdStanPy Examples

Maximum Likelihood Estimation