Bayesian Analysis Notes

Developing the new temperature calibration model using the Ring Index (RI).

This page contains notes for the culture-informed, Ring Index-based temperature calibration model developed with a Bayesian inference approach. The model is called culRI-Bayesian (chosen to sound like "calibration").

Based on the Temperature-RI distribution across all archives (cultures, mesocosms, and coretops), the logistic function with a fixed upper asymptote (see the interactive plot for how each hyperparameter shapes the curve; a minimal Python sketch of this functional form follows the model table below) is selected because:

Below are examples of studies highlighting the use of logistic functions to describe growth curves:


The model is implemented in Stan, a probabilistic programming language for Bayesian inference, and is fit to the data using Markov chain Monte Carlo (MCMC) sampling.
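As an illustration, the two-archive model below (joint_culture_meso.stan) expects each archive's observations as separate arrays in its data block; a minimal sketch of the corresponding CmdStanPy data dictionary, with placeholder values rather than real measurements:

import numpy as np

# Placeholder arrays standing in for the culture (1) and mesocosm (2) observations;
# x = temperature, y = scaled RI, matching the data block of joint_culture_meso.stan.
culture_T  = np.array([5.0, 10.0, 15.0, 20.0, 25.0])
culture_RI = np.array([0.25, 0.32, 0.45, 0.60, 0.72])
meso_T     = np.array([8.0, 12.0, 18.0])
meso_RI    = np.array([0.30, 0.42, 0.55])

data_dict = {
    "N1": len(culture_T), "x1": culture_T, "y1": culture_RI,
    "N2": len(meso_T),    "x2": meso_T,    "y2": meso_RI,
}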

Logistic functional forms (interactive plot)


Model Name | T | scaled RI | gdgt23ratio | x0 | k | b | beta0 | beta1
joint-cul-meso | ✓ | ✓ |  | ✓ | ✓ | ✓ |  | 
joint-cul-meso-coretop | ✓ | ✓ |  | ✓ | ✓ | ✓ |  | 
joint-cul-meso-coretop-multivariate | ✓ | ✓ | ✓ (coretop) | ✓ | ✓ | ✓ | ✓ | ✓
joint-cul-meso-coretop-multivariate-fixedbeta1 | ✓ | ✓ | ✓ (coretop) | ✓ | ✓ | ✓ | ✓ | 
joint-all-full-multivariate | ✓ | ✓ | ✓ (all) | ✓ | ✓ | ✓ | ✓ | ✓
joint-all-full-multivariate-fixedbeta1 | ✓ | ✓ | ✓ (all) | ✓ | ✓ | ✓ | ✓ | 


Legend | Culture — Mesocosm — Coretop
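For reference, a minimal Python sketch of the fixed-upper-asymptote logistic form shared by all the Stan models below (x0 is the inflection point, k the steepness, b the lower asymptote; the upper asymptote is fixed at 1; the function name is illustrative):

import numpy as np

def logistic_ri(T, x0, k, b):
    """Logistic Temperature-RI curve with the upper asymptote fixed at 1."""
    # (1 - b) * inv_logit(k * (T - x0)) + b, with inv_logit(z) = 1 / (1 + exp(-z))
    return (1.0 - b) / (1.0 + np.exp(-k * (T - x0))) + b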

Stan model files

joint_all_full_multivariate.stan



// joint_all_full_multivariate.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;
  vector[N1] z1;        // e.g. nutrient concentration or depth, etc.

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;
  vector[N2] z2;        // e.g. nutrient concentration or depth, etc.

  // third dataset, now with an extra predictor z3
  int<lower=1>   N3;
  vector[N3]     x3;
  vector[N3]     y3;
  vector[N3]     z3;        // e.g. nutrient concentration or depth, etc.
}

parameters {
  real<lower=-4>        x0;      // shared inflection
  real<lower=0>         k;       // shared steepness
  real<lower=0,upper=1> b;       // shared lower asymptote

  real                  beta1;   // shared intercept offset (added to all three datasets)
  real                  beta0;   // shared slope for the extra predictor z
  real<lower=0.01>      sigma1;  // noise DS1 (lower bound matches the truncated prior)
  real<lower=0.01>      sigma2;  // noise DS2
  real<lower=0.01>      sigma3;  // noise DS3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);

  // prior on the linear term
  beta1     ~ normal(0, 0.3);
  beta0     ~ normal(0, 0.01);

  // noise priors
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = ((1 - b) * inv_logit(k * (x1 - x0)) + b)
                  + beta0 * z1 + beta1;

  vector[N2] mu2 = ((1 - b) * inv_logit(k * (x2 - x0)) + b)
                  + beta0 * z2 + beta1;

  // dataset 3 likewise gets the shared linear term beta0 * z3 and intercept beta1
  vector[N3] mu3 = ((1 - b) * inv_logit(k * (x3 - x0)) + b)
                  + beta0 * z3 + beta1;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);

}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}
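In plain Python, the mean structure of this full multivariate model is the shared logistic curve plus a linear term in the extra predictor z (the gdgt23ratio column in the table above) and a constant offset beta1; a minimal sketch (the function name is illustrative):

import numpy as np

def mu_full_multivariate(T, z, x0, k, b, beta0, beta1):
    """Mean scaled RI for joint_all_full_multivariate.stan: shared logistic
    curve plus a linear term in the extra predictor z and an intercept offset."""
    logistic = (1.0 - b) / (1.0 + np.exp(-k * (T - x0))) + b
    return logistic + beta0 * z + beta1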

joint_all_full_multivariate_fixedbeta1.stan



// joint_all_full_multivariate_fixedbeta1.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;
  vector[N1] z1;        // e.g. nutrient concentration or depth, etc.

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;
  vector[N2] z2;        // e.g. nutrient concentration or depth, etc.

  // third dataset, now with an extra predictor z3
  int<lower=1>   N3;
  vector[N3]     x3;
  vector[N3]     y3;
  vector[N3]     z3;        // e.g. nutrient concentration or depth, etc.
}

parameters {
  real<lower=-4>        x0;      // shared inflection
  real<lower=0>         k;       // shared steepness
  real<lower=0,upper=1> b;       // shared lower asymptote

  real                  beta0;   // shared slope for the extra predictor z
  real<lower=0.01>      sigma1;  // noise DS1 (lower bound matches the truncated prior)
  real<lower=0.01>      sigma2;  // noise DS2
  real<lower=0.01>      sigma3;  // noise DS3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);

  // prior on the linear term
  beta0     ~ normal(0, 0.01);

  // noise priors
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = ((1 - b) * inv_logit(k * (x1 - x0)) + b)
                  + beta0 * z1;
  vector[N2] mu2 = ((1 - b) * inv_logit(k * (x2 - x0)) + b)
                  + beta0 * z2;
  // dataset 3 likewise gets the shared linear term beta0 * z3 (no intercept term in this variant)
  vector[N3] mu3 = ((1 - b) * inv_logit(k * (x3 - x0)) + b)
                  + beta0 * z3;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);
}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}

joint_culture_meso.stan



// joint_culture_meso.stan
data {
  // first dataset
  int<lower=1> N1;       
  vector[N1] x1;         
  vector[N1] y1;         

  // second dataset
  int<lower=1> N2;       
  vector[N2] x2;         
  vector[N2] y2;         
}

parameters {
  real<lower=-4>         x0;      // inflection point
  real<lower=0>          k;       // steepness
  real<lower=0,upper=1>  b;       // lower asymptote
  
  real<lower=0.01> sigma1;        // noise for dataset 1 (lower bound matches the truncated prior)
  real<lower=0.01> sigma2;        // noise for dataset 2
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic means (vectorized); since inv_logit(z) = 1 / (1 + exp(-z)),
  // (1 - b) * inv_logit(k * (x - x0)) + b  ==  (1 - b) ./ (1 + exp(-k * (x - x0))) + b
  vector[N1] mu1 = (1 - b) * inv_logit(k * (x1 - x0)) + b;
  vector[N2] mu2 = (1 - b) * inv_logit(k * (x2 - x0)) + b;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
}

generated quantities {
  real sigma_ratio = sigma2 / sigma1;
}

joint_culture_meso_coretop.stan



// joint_culture_meso_coretop.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;

  // third dataset
  int<lower=1> N3;
  vector[N3] x3;
  vector[N3] y3;
}

parameters {
  real<lower=-4>         x0;      // inflection point
  real<lower=0>          k;       // steepness
  real<lower=0,upper=1>  b;       // lower asymptote

  real<lower=0.01> sigma1;       // noise for dataset 1 (lower bound matches the truncated prior)
  real<lower=0.01> sigma2;       // noise for dataset 2
  real<lower=0.01> sigma3;       // noise for dataset 3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = (1 - b) * inv_logit(k * (x1 - x0)) + b;
  vector[N2] mu2 = (1 - b) * inv_logit(k * (x2 - x0)) + b;
  vector[N3] mu3 = (1 - b) * inv_logit(k * (x3 - x0)) + b;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);
}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}

joint_culture_meso_coretop_multivariate.stan



// joint_culture_meso_coretop_multivariate.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;

  // third dataset, now with an extra predictor z3
  int<lower=1>   N3;
  vector[N3]     x3;
  vector[N3]     y3;
  vector[N3]     z3;        // e.g. nutrient concentration or depth, etc.
}

parameters {
  real<lower=-4>        x0;      // shared inflection
  real<lower=0>         k;       // shared steepness
  real<lower=0,upper=1> b;       // shared lower asymptote

  real                  beta1; // intercept for dataset3 (coretop)
  real                  beta0;      // new slope for z3
  real<lower=0.01>      sigma1;  // noise DS1 (lower bound matches the truncated prior)
  real<lower=0.01>      sigma2;  // noise DS2
  real<lower=0.01>      sigma3;  // noise DS3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);

  // prior on the linear term
  beta1     ~ normal(0, 0.3);
  beta0     ~ normal(0, 0.01);

  // noise priors
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = (1 - b) * inv_logit(k * (x1 - x0)) + b;
  vector[N2] mu2 = (1 - b) * inv_logit(k * (x2 - x0)) + b;

  // for dataset 3, add the linear term beta0 * z3 and the intercept beta1
  vector[N3] mu3 = ((1 - b) * inv_logit(k * (x3 - x0)) + b)
                  + beta0 * z3
                  + beta1;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);

}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}

joint_culture_meso_coretop_multivariate_fixedbeta1.stan



// joint_culture_meso_coretop_multivariate_fixedbeta1.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;

  // third dataset, now with an extra predictor z3
  int<lower=1>   N3;
  vector[N3]     x3;
  vector[N3]     y3;
  vector[N3]     z3;        // e.g. nutrient concentration or depth, etc.
}

parameters {
  real<lower=-4>        x0;      // shared inflection
  real<lower=0>         k;       // shared steepness
  real<lower=0,upper=1> b;       // shared lower asymptote

  real                  beta0;      // new slope for z3
  real<lower=0.01>      sigma1;  // noise DS1 (lower bound matches the truncated prior)
  real<lower=0.01>      sigma2;  // noise DS2
  real<lower=0.01>      sigma3;  // noise DS3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);

  // prior on the linear term
  beta0     ~ normal(0, 0.01);

  // noise priors
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = (1 - b) * inv_logit(k * (x1 - x0)) + b;
  vector[N2] mu2 = (1 - b) * inv_logit(k * (x2 - x0)) + b;

  // for dataset 3, add the linear term beta0 * z3
  vector[N3] mu3 = (1 - b) * inv_logit(k * (x3 - x0)) + b + (beta0 * z3);

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);
}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}


Figure 1. Scatter plot of scaled RI vs. temperature (colored by culture strain)


Python example: a minimal sketch of fitting a model with CmdStanPy and returning its posterior draws.

def get_posteriors(data_dict, stan_file_name):
    from cmdstanpy import CmdStanModel
    # compile the Stan program, run MCMC sampling, and return the posterior draws
    model = CmdStanModel(stan_file=stan_file_name)
    fit = model.sample(data=data_dict)
    return fit.draws_pd()
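A hypothetical usage sketch, reusing the placeholder data_dict from above and assuming the helper returns the posterior draws as a pandas DataFrame (as in the sketch above), so the shared parameters and the sigma_ratio diagnostic can be summarized directly:

draws = get_posteriors(data_dict, "joint_culture_meso.stan")
print(draws[["x0", "k", "b", "sigma_ratio"]].describe())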
Figure 2. A dashboard of posterior distributions of culRI-Bayesian models

RI residuals of culRI-Bayesian models
Figure 3. Paleo-showcases

3.1 PETM showcase


3.2 Glacial-Interglacial showcase