Bayesian Analysis Notes

Developing the new temperature calibration model using the Ring Index (RI).

This page contains notes for the culture-informed, Ring Index-based temperature calibration model developed with a Bayesian inference approach. The model is called culRI-Bayesian (chosen to sound like "calibration").

Based on the Temperature-RI distribution across all archives (cultures, mesocosms, and coretops), the logistic function with a fixed upper asymptote (see the interactive plot for how each hyperparameter shapes the curve; a minimal Python sketch of this functional form follows the model table below) is selected because:

Below are examples of studies highlighting the use of logistic functions to describe growth curves:


The model is implemented in Stan, a probabilistic programming language for Bayesian inference, and is fit to the data using Markov chain Monte Carlo (MCMC) sampling.
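As an illustration, the two-archive model below (joint_culture_meso.stan) expects each archive's observations as separate arrays in its data block; a minimal sketch of the corresponding CmdStanPy data dictionary, with placeholder values rather than real measurements:

import numpy as np

# Placeholder arrays standing in for the culture (1) and mesocosm (2) observations;
# x = temperature, y = scaled RI, matching the data block of joint_culture_meso.stan.
culture_T  = np.array([5.0, 10.0, 15.0, 20.0, 25.0])
culture_RI = np.array([0.25, 0.32, 0.45, 0.60, 0.72])
meso_T     = np.array([8.0, 12.0, 18.0])
meso_RI    = np.array([0.30, 0.42, 0.55])

data_dict = {
    "N1": len(culture_T), "x1": culture_T, "y1": culture_RI,
    "N2": len(meso_T),    "x2": meso_T,    "y2": meso_RI,
}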

Logistic functional forms (interactive plot)


Model Name | T | scaled RI | gdgt23ratio | x0 | k | b | beta0 | beta1
joint-cul-meso | ✓ | ✓ |  | ✓ | ✓ | ✓ |  | 
joint-cul-meso-coretop | ✓ | ✓ |  | ✓ | ✓ | ✓ |  | 
joint-cul-meso-coretop-multivariate | ✓ | ✓ | ✓ (coretop) | ✓ | ✓ | ✓ | ✓ | ✓
joint-cul-meso-coretop-multivariate-fixedbeta1 | ✓ | ✓ | ✓ (coretop) | ✓ | ✓ | ✓ | ✓ | 
joint-all-full-multivariate | ✓ | ✓ | ✓ (all) | ✓ | ✓ | ✓ | ✓ | ✓
joint-all-full-multivariate-fixedbeta1 | ✓ | ✓ | ✓ (all) | ✓ | ✓ | ✓ | ✓ | 


Legend | Culture — Mesocosm — Coretop
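For reference, a minimal Python sketch of the fixed-upper-asymptote logistic form shared by all the Stan models below (x0 is the inflection point, k the steepness, b the lower asymptote; the upper asymptote is fixed at 1; the function name is illustrative):

import numpy as np

def logistic_ri(T, x0, k, b):
    """Logistic Temperature-RI curve with the upper asymptote fixed at 1."""
    # (1 - b) * inv_logit(k * (T - x0)) + b, with inv_logit(z) = 1 / (1 + exp(-z))
    return (1.0 - b) / (1.0 + np.exp(-k * (T - x0))) + b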

Stan model files

joint_all_full_multivariate.stan



// joint_all_full_multivariate.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;
  vector[N1] z1;        // e.g. nutrient concentration or depth, etc.

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;
  vector[N2] z2;        // e.g. nutrient concentration or depth, etc.

  // third dataset, now with an extra predictor z3
  int<lower=1>   N3;
  vector[N3]     x3;
  vector[N3]     y3;
  vector[N3]     z3;        // e.g. nutrient concentration or depth, etc.
}

parameters {
  real<lower=-4>        x0;      // shared inflection
  real<lower=0>         k;       // shared steepness
  real<lower=0,upper=1> b;       // shared lower asymptote

  real                  beta1;   // shared intercept offset (added to all three datasets)
  real                  beta0;   // shared slope for the extra predictor z
  real<lower=0.01>      sigma1;  // noise DS1 (lower bound matches the truncated prior)
  real<lower=0.01>      sigma2;  // noise DS2
  real<lower=0.01>      sigma3;  // noise DS3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);

  // prior on the linear term
  beta1     ~ normal(0, 0.3);
  beta0     ~ normal(0, 0.01);

  // noise priors
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = ((1 - b) * inv_logit(k * (x1 - x0)) + b)
                  + beta0 * z1 + beta1;

  vector[N2] mu2 = ((1 - b) * inv_logit(k * (x2 - x0)) + b)
                  + beta0 * z2 + beta1;

  // dataset 3 likewise gets the shared linear term beta0 * z3 and intercept beta1
  vector[N3] mu3 = ((1 - b) * inv_logit(k * (x3 - x0)) + b)
                  + beta0 * z3 + beta1;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);

}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}
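In plain Python, the mean structure of this full multivariate model is the shared logistic curve plus a linear term in the extra predictor z (the gdgt23ratio column in the table above) and a constant offset beta1; a minimal sketch (the function name is illustrative):

import numpy as np

def mu_full_multivariate(T, z, x0, k, b, beta0, beta1):
    """Mean scaled RI for joint_all_full_multivariate.stan: shared logistic
    curve plus a linear term in the extra predictor z and an intercept offset."""
    logistic = (1.0 - b) / (1.0 + np.exp(-k * (T - x0))) + b
    return logistic + beta0 * z + beta1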

joint_all_full_multivariate_fixedbeta1.stan



// joint_all_full_multivariate_fixedbeta1.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;
  vector[N1] z1;        // e.g. nutrient concentration or depth, etc.

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;
  vector[N2] z2;        // e.g. nutrient concentration or depth, etc.

  // third dataset, now with an extra predictor z3
  int<lower=1>   N3;
  vector[N3]     x3;
  vector[N3]     y3;
  vector[N3]     z3;        // e.g. nutrient concentration or depth, etc.
}

parameters {
  real<lower=-4>        x0;      // shared inflection
  real<lower=0>         k;       // shared steepness
  real<lower=0,upper=1> b;       // shared lower asymptote

  real                  beta0;   // shared slope for the extra predictor z
  real<lower=0.01>      sigma1;  // noise DS1 (lower bound matches the truncated prior)
  real<lower=0.01>      sigma2;  // noise DS2
  real<lower=0.01>      sigma3;  // noise DS3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);

  // prior on the linear term
  beta0     ~ normal(0, 0.01);

  // noise priors
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = ((1 - b) * inv_logit(k * (x1 - x0)) + b)
                  + beta0 * z1;
  vector[N2] mu2 = ((1 - b) * inv_logit(k * (x2 - x0)) + b)
                  + beta0 * z2;
  // dataset 3 likewise gets the shared linear term beta0 * z3 (no intercept term in this variant)
  vector[N3] mu3 = ((1 - b) * inv_logit(k * (x3 - x0)) + b)
                  + beta0 * z3;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);
}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}

joint_culture_meso.stan



// joint_culture_meso.stan
data {
  // first dataset
  int<lower=1> N1;       
  vector[N1] x1;         
  vector[N1] y1;         

  // second dataset
  int<lower=1> N2;       
  vector[N2] x2;         
  vector[N2] y2;         
}

parameters {
  real<lower=-4>         x0;      // inflection point
  real<lower=0>          k;       // steepness
  real<lower=0,upper=1>  b;       // lower asymptote
  
  real<lower=0.01> sigma1;        // noise for dataset 1 (lower bound matches the truncated prior)
  real<lower=0.01> sigma2;        // noise for dataset 2
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic means (vectorized); since inv_logit(z) = 1 / (1 + exp(-z)),
  // (1 - b) * inv_logit(k * (x - x0)) + b  ==  (1 - b) ./ (1 + exp(-k * (x - x0))) + b
  vector[N1] mu1 = (1 - b) * inv_logit(k * (x1 - x0)) + b;
  vector[N2] mu2 = (1 - b) * inv_logit(k * (x2 - x0)) + b;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
}

generated quantities {
  real sigma_ratio = sigma2 / sigma1;
}

joint_culture_meso_coretop.stan



// joint_culture_meso_coretop.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;

  // third dataset
  int<lower=1> N3;
  vector[N3] x3;
  vector[N3] y3;
}

parameters {
  real<lower=-4>         x0;      // inflection point
  real<lower=0>          k;       // steepness
  real<lower=0,upper=1>  b;       // lower asymptote

  real<lower=0.01> sigma1;       // noise for dataset 1 (lower bound matches the truncated prior)
  real<lower=0.01> sigma2;       // noise for dataset 2
  real<lower=0.01> sigma3;       // noise for dataset 3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = (1 - b) * inv_logit(k * (x1 - x0)) + b;
  vector[N2] mu2 = (1 - b) * inv_logit(k * (x2 - x0)) + b;
  vector[N3] mu3 = (1 - b) * inv_logit(k * (x3 - x0)) + b;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);
}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}

joint_culture_meso_coretop_multivariate.stan



// joint_culture_meso_coretop_multivariate.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;

  // third dataset, now with an extra predictor z3
  int<lower=1>   N3;
  vector[N3]     x3;
  vector[N3]     y3;
  vector[N3]     z3;        // e.g. nutrient concentration or depth, etc.
}

parameters {
  real<lower=-4>        x0;      // shared inflection
  real<lower=0>         k;       // shared steepness
  real<lower=0,upper=1> b;       // shared lower asymptote

  real                  beta1; // intercept for dataset3 (coretop)
  real                  beta0;      // new slope for z3
  real<lower=0.01>      sigma1;  // noise DS1 (lower bound matches the truncated prior)
  real<lower=0.01>      sigma2;  // noise DS2
  real<lower=0.01>      sigma3;  // noise DS3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);

  // prior on the linear term
  beta1     ~ normal(0, 0.3);
  beta0     ~ normal(0, 0.01);

  // noise priors
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = (1 - b) * inv_logit(k * (x1 - x0)) + b;
  vector[N2] mu2 = (1 - b) * inv_logit(k * (x2 - x0)) + b;

  // for dataset 3, add the linear term beta0 * z3 and the intercept beta1
  vector[N3] mu3 = ((1 - b) * inv_logit(k * (x3 - x0)) + b)
                  + beta0 * z3
                  + beta1;

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);

}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}

joint_culture_meso_coretop_multivariate_fixedbeta1.stan



// joint_culture_meso_coretop_multivariate_fixedbeta1.stan
data {
  // first dataset
  int<lower=1> N1;
  vector[N1] x1;
  vector[N1] y1;

  // second dataset
  int<lower=1> N2;
  vector[N2] x2;
  vector[N2] y2;

  // third dataset, now with an extra predictor z3
  int<lower=1>   N3;
  vector[N3]     x3;
  vector[N3]     y3;
  vector[N3]     z3;        // e.g. nutrient concentration or depth, etc.
}

parameters {
  real<lower=-4>        x0;      // shared inflection
  real<lower=0>         k;       // shared steepness
  real<lower=0,upper=1> b;       // shared lower asymptote

  real                  beta0;      // new slope for z3
  real<lower=0.01>      sigma1;  // noise DS1 (lower bound matches the truncated prior)
  real<lower=0.01>      sigma2;  // noise DS2
  real<lower=0.01>      sigma3;  // noise DS3
}

model {
  // Priors
  x0     ~ normal(20, 20) T[-4, ];
  k      ~ normal(0, 0.25);
  b      ~ beta(2, 5);

  // prior on the linear term
  beta0     ~ normal(0, 0.01);

  // noise priors
  sigma1 ~ normal(0.01, 0.1) T[0.01, ];
  sigma2 ~ normal(0.01, 0.1) T[0.01, ];
  sigma3 ~ normal(0.01, 0.1) T[0.01, ];

  // Logistic‐curve means (vectorized)
  vector[N1] mu1 = (1 - b) * inv_logit(k * (x1 - x0)) + b;
  vector[N2] mu2 = (1 - b) * inv_logit(k * (x2 - x0)) + b;

  // for dataset 3, add the linear term beta0 * z3
  vector[N3] mu3 = (1 - b) * inv_logit(k * (x3 - x0)) + b + (beta0 * z3);

  // Likelihoods
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2);
  y3 ~ normal(mu3, sigma3);
}

generated quantities {
  // ratios for diagnostics
  real sigma2_sigma1 = sigma2 / sigma1;
  real sigma3_sigma1 = sigma3 / sigma1;
}


Figure 1. Scatter plot of scaled RI vs. temperature (colored by culture strain)


Python example: a minimal sketch of fitting a model with CmdStanPy and returning its posterior draws.

def get_posteriors(data_dict, stan_file_name):
    from cmdstanpy import CmdStanModel
    # compile the Stan program, run MCMC sampling, and return the posterior draws
    model = CmdStanModel(stan_file=stan_file_name)
    fit = model.sample(data=data_dict)
    return fit.draws_pd()
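A hypothetical usage sketch, reusing the placeholder data_dict from above and assuming the helper returns the posterior draws as a pandas DataFrame (as in the sketch above), so the shared parameters and the sigma_ratio diagnostic can be summarized directly:

draws = get_posteriors(data_dict, "joint_culture_meso.stan")
print(draws[["x0", "k", "b", "sigma_ratio"]].describe())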
Figure 2. A dashboard of posterior distributions of culRI-Bayesian models

RI residuals of culRI-Bayesian models
Figure 3. Paleo-showcases

3.1 PETM showcase


3.2 Glacial-Interglacial showcase