Posted on November 9, 2015 by Philipp Neubauer

Emulating trait-based models 2: a more serious attempt

In a previous try, I played with history matching, a method for fitting complex computer simulation models to data (Kennedy and O’Hagan 2001). My first try (hack?) was a rather naive attempt to understand how history matching works in general, and to get an idea of its usefulness for fitting size-spectra to data. But I didn't really make a serious attempt at going through a history matching exercise that might produce useful insights into possible parameter values of uncertain size-spectrum parameters.

So for this try I will make a more serious attempt at fitting statistical models to emulate trait based simulation models with multivariate outputs (e.g., species biomass levels). I will start with a very coarse evaluation of plausible parameter spaces, before building the emulator within that space. Due to the massive non-linearities in the size-spectra, a more loose approach of building an emulator over a large parameter space leads to poor emulation.

One thing that did become clear in the first round of history matching was that uni-variate emulators are very inefficient. In this post, I will try more efficient multi-variate emulators that are based on assumption of separability of input and output variances (Rougier 2008). I will also more explicitly query if the emulator makes reasonable predictions away from data, which is be crucial if this type of method is to work for ecosystem models and provide a path towards fitting these models to data.

Setting up a trait based model

I will use the mizer package to set up a trait based model, and estimate its (active) parameters. The mizer vignette provides a convenient reminder how to set this up:

no_sp = 10
min_w_inf <- 10
max_w_inf <- 1e5
w_inf <- 10^seq(from=log10(min_w_inf), to = log10(max_w_inf), length=10)
knife_edges <- w_inf * 0.05

truth <- set_trait_model(no_sp = no_sp, 
                         kappa = 0.01,
                         r_pp = 10,
                         beta = 300,
                         sigma= 1.3,
                         min_w_inf = min_w_inf, 
                         max_w_inf = max_w_inf,
                         knife_edge_size = knife_edges)
sim <- project(truth, t_max = 100, dt=0.2, effort = 0.4)


Looks like this arbitrary trait based model has converged to a stable solution. I now use the mean biomass over the last 25 years as data for the calibration (though I could use just the last year, or some other variables).

biom <- colMeans(getBiomass(sim)[75:100,])

Building an emulator

In this example, I chose to use just three ecological parameters (h - the maximum food intake rate, sigma - the width of the prey selection kernel and r_pp - the primary productivity) as unknowns, keeping all other parameters fixed to provide a tractable start. Some initial exploration with the trait based model suggested that these variables had disproportionate effects - although other variables such as k0 - the multiplier on maximum recruitment - and beta (the mean of the prey selection kernel) should be considered uncertain in the future. Furthermore, some parameters, such as h and k0 may well cay among species, so could be considered as on a per-species basis in the future. But for now, I focus on the three parameters above to see if I can get a workable example. Here's a start, making a first set of training data for the emulator by running the size-based model on a grid of parameter values.:

minmax <- data.frame(h = c(3,150),
                     r_pp = c(0.005,100),
                     sigma = c(0.5,4))
LHC <- latin.hypercube(250,3)

LHC_in <- data.frame(sapply(1:ncol(LHC), function(x)

colnames(LHC_in) <- c('h','r_pp','sigma')
pred_list <- split(LHC_in,1:nrow(LHC_in))       
system.time( simdat <- parallel::mclapply(pred_list,
                                          mc.cores = 4))
##     user   system  elapsed 
## 3605.216   23.464 1402.426
sim_data <-'rbind',simdat)

Here I chose to right away get rid of some regions of parameter space (i.e., calculate the implausibility (Andrianakis et al. 2015) based on simulation runs alone.). Since I do not use an emulator here, and the simulations are deterministic, I need to consider model uncertainty as the main source of uncertainty. Model uncertainty is the uncertainty due to model mis-specification and is notoriously hard to pin down. Here I set it very conservatively to a third of the variance of the simulation outputs over a wide range of input parameters. This should ensure that I only discard parameter values that produce really absurd outputs (i.e., extinction of some species). the implausibility theshold is set at 10, which is the approximate value for a Chi-squared dsitribution with 10 degrees of freedom. For a more detailed intro to Implausibility and its application in the analysis of simulation outputs, see Andrianakis et al. (2015) and references therein.

#naset <- which(apply(sim_data,1,function(x) any(
#sim_data <- sim_data[-naset,]

LHC_in$p_reg  <- apply(sim_data,1,function(pred){
  t(biom - pred) %*% 
    solve(var(sim_data)/4) %*% 
    (biom - pred)

mpreds <- reshape2::melt(LHC_in)

ggplot(mpreds) + 
  facet_grid( p_reg ~ variable,scales = "free") + 
  geom_bar(aes(x=value)) + 
  theme_bw() + 

I now use the remaining points that were not discarded (there aren't many!) to define a set of point to build the emulator on. This avoids training the emulator on a set of parameters and outputs that give widely different answers and dominate the model fit, but lead to poor predictions in the area of interest (i.e., parameters that make sense). The new set is made by drawing from a multivariate normal distribution around the remaining points to explore the space around those points.

leftovers <- LHC_in[!LHC_in$p_reg,1:3]
oc = round(500/nrow(leftovers))
isd <- cov(leftovers)

# draw from mv norm around leftover values
new_pred_pre_list_jitter <- t(apply(repmat(leftovers,oc,1),1,function(x) t(mvtnorm::rmvnorm(1,x,0.05*isd))))

#remove values <0
new_pred_pre_list_jitter <- new_pred_pre_list_jitter[(apply(new_pred_pre_list_jitter,1,function(x) !any(x<=0))),]

colnames(new_pred_pre_list_jitter) <- c('h','r_pp','sigma')

mp <- reshape2::melt(new_pred_pre_list_jitter)
ggplot(mp,aes(x=value)) + 
  geom_histogram(aes(x=value)) + 
  facet_wrap(~Var2,scales='free') + 

Running the simulator to make a training set for the emulator:

new_pred_list <- split(data.frame(new_pred_pre_list_jitter),

system.time( new_simdat <- parallel::mclapply(new_pred_list,
                                              mc.cores = 4))
##     user   system  elapsed 
## 8155.927   61.125 3178.952

I throw out some combinations of parameters that lead to collapse. In a full history matching exerciser I wouldn't want to do this since they hold information about areas of poor fit to the data - but here I use it to show that the emulator can indeed give OK (and fast!) predictions of the size spectrum in the parameter space of interest.

new_sim_data <-'rbind',new_simdat)

##               1            2            3            4            5
## 0%   0.02623961 3.014166e-18 1.690778e-06 3.577906e-16 1.337676e-11
## 25%  0.04066894 2.982046e-02 2.018920e-02 1.259627e-02 7.494951e-03
## 50%  0.04473751 3.178591e-02 2.264478e-02 1.573578e-02 1.020695e-02
## 75%  0.05080083 3.516367e-02 2.509736e-02 1.768273e-02 1.266223e-02
## 100% 0.06548712 4.533041e-02 3.276044e-02 2.454326e-02 1.885972e-02
##                 6            7            8            9           10
## 0%   2.021669e-08 1.115931e-08 1.221133e-08 6.819790e-09 3.433630e-09
## 25%  4.308016e-03 2.615995e-03 1.478547e-03 8.508476e-04 4.471478e-04
## 50%  6.254554e-03 3.755085e-03 2.087824e-03 1.166121e-03 5.963557e-04
## 75%  8.439304e-03 5.433474e-03 3.117022e-03 1.752072e-03 9.151875e-04
## 100% 1.436430e-02 1.037388e-02 6.870761e-03 4.196598e-03 2.361536e-03
pred_pre_list_df <- data.frame(new_pred_pre_list_jitter)
pred_pre_list_df$p_reg <- apply(new_sim_data,1,function(x) any(x<(min(biom)/10^4)))

mpreds <- reshape2::melt(pred_pre_list_df)

ggplot(mpreds) + 
  facet_grid( p_reg ~ variable,scales = "free") + 
  geom_bar(aes(x=value)) + 
  theme_bw() + 
keepers <-  !$p_reg) & !(pred_pre_list_df$p_reg)

sim_data <- new_sim_data[keepers,]

sim_data <- unique(sim_data)

train <-, size = 0.9*nrow(sim_data))
test <- which(!(1:nrow(sim_data)) %in% train)

sim_data_train <- sim_data[train,]
sim_data_test <- sim_data[test,]

pred_pre_list <- unique(new_pred_pre_list_jitter[keepers,])

Time to think about how to emulate this model. In general, species are linked in the model, so uni-variate emulation seems like the wrong approach. An alternative is multivariate emulation. In particular, the method described in Rougier (2008) may be appropriate here since we are working with a standard matrix of outputs. The method assumes that the predictions can be decomposed into a regression in the inputs and a regression in the outputs (i.e., the size classes).

First, source Johnathan Rougiers code:


For the method to work, we need to specify regressors on the inputs and outputs, along with a covariance, computed from the inputs and at points where the simulator is evaluated (i.e., where we calculate the biomass - at w_inf for each species). I use the same for inputs and outputs here as I have little prior idea. For the mean part of the GP regression, a second order polynomial seems like a relatively flexible start. Tor the covariance, I will start with an exponential covariance for simplicity. Later, I will try to learn the scale parameters of both matrices in order to optimise the emulator. The functions for means, variances and other helper functions are defined here

I can now define a GP emulator:

  # stdise is a normal centering and fixing to [-1,1], whereas stdise_wrt(x,y) is with respect to the mean and max of y 

ins <- stdise(as.matrix(pred_pre_list[train,]))
ins_test <- stdise_wrt(m(pred_pre_list[test,]),m(pred_pre_list[train,]))

outs <- stdise(log(sim_data_train))
tests <- stdise_wrt(log(sim_data_test),log(sim_data_train))

scales <- 1/(0.125*as.vector(diff(apply(ins,2,range))))^2
out_grid <- (seq(from=log10(min_w_inf), to = log10(max_w_inf), length=10))
out_grid <- (out_grid-mean(out_grid))/sd(out_grid)

OPE <- define_OPE(c(1,1*(scales)),
                  inData = ins,
                  outGrid = out_grid,
                  outData = outs)
cuts <- cut(1:nrow(ins_test),nrow(ins_test)/2)
ins_test_sp <- split(data.frame(ins_test), cuts)

test_pred <- mclapply(ins_test_sp, 
                      function(ins) emulate(ins,
                                            rev_std_data = log(sim_data_train)),
                      mc.cores = 4)

test_pred <-'rbind',test_pred)

How well does the emulator perform?

For these inputs, it seems as though the emulator does well in interpolating the results from the size-spectrum for most of the test set:

## Call:
## lm(formula = exp(test_pred$mu) ~ 0 + as.vector(t((sim_data_test))))
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0050637 -0.0003211 -0.0000239  0.0003372  0.0063478 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## as.vector(t((sim_data_test))) 1.005866   0.002592     388   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 0.001145 on 479 degrees of freedom
## Multiple R-squared:  0.9968, Adjusted R-squared:  0.9968 
## F-statistic: 1.505e+05 on 1 and 479 DF,  p-value: < 2.2e-16

Improving the emulator

It might be possible to get even better prediction by adjusting the length scale of the GP covariances. For this, I use a function to optimise the length scales based on the GP marginal likelihoods, and then define the new emulator by those length scales:

opt_OPE <- optim(c(1,(scales)),
      inData = stdise(as.matrix(pred_pre_list)),
      outGrid = out_grid,
      outData = stdise(log(sim_data)),
      control = list(fnscale=-1,
                     trace = 100,
                     reltol = 1e-6))

The new emulator with optimised GP length scales is now:

OPE_opt <- define_OPE(opt_OPE$par,
                      inData = stdise(as.matrix(pred_pre_list)),
                      outGrid = out_grid,
                      outData = stdise(log(sim_data)))

OPE_opt_test <- adjustOPE(OPE_opt, R = ins, Y = outs)

test_pred <- mclapply(ins_test_sp, 
                      function(ins) emulate(ins,
                                            rev_std_data = log(sim_data_train)),
                      mc.cores = 4)

test_pred <-'rbind',test_pred)

## Call:
## lm(formula = exp(test_pred$mu) ~ 0 + as.vector(t(sim_data_test)))
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0026919 -0.0001583 -0.0000180  0.0000440  0.0049725 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## as.vector(t(sim_data_test)) 1.002935   0.001517   661.3   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 0.0006698 on 479 degrees of freedom
## Multiple R-squared:  0.9989, Adjusted R-squared:  0.9989 
## F-statistic: 4.373e+05 on 1 and 479 DF,  p-value: < 2.2e-16

Indeed, now 75% of predictions are within quantile(abs((exp(test_pred$mu)-as.vector(t(sim_data_test)))/as.vector(t(sim_data_test)))*100)[4]% of the simulations.

Can the emulator emulate the size-spectrum response?

To test the emulator, its worthwhile testing how well the emulator predicts the marginal responses to changes in the inputs:

in_df <- data.frame(pred_pre_list)

pred_size <- 8

testset1 <- data.frame(h=seq(min(in_df$h),max(in_df$h),l=pred_size),
                       r_pp = 10,
                       sigma= 1.3)

testset2 <- data.frame(h=40,
                       r_pp = 10,
                       sigma= seq(min(in_df$sigma),max(in_df$sigma),l=pred_size))

testset3 <- data.frame(h=40,
                       r_pp = lseq(quantile(in_df$r_pp,0.05),quantile(in_df$r_pp,0.95),l=pred_size),
                       sigma= 1.3)


This looks pretty promising! Especially given a big caveats here: the parameter sets over which these marginal responses were estimated may not all be plausible. However, I also removed some implausible inputs further up to improve model fits in relevant regions - so I cheated a bit. But all in all this looks like a promising way forward, and I will follow this up with a more elaborate example of emulation and history matching next.


Andrianakis, I., I. R. Vernon, N. McCreesh, T. J. McKinley, J. E. Oakley, R. N. Nsubuga, M. Goldstein, and R. G. White. 2015. Bayesian History Matching of Complex Infectious Disease Models Using Emulation: A Tutorial and a Case Study on HIV in Uganda. PLoS Comput Biol 11:e1003968.

Kennedy, M. C., and A. O’Hagan. 2001. Bayesian calibration of computer models. Journal of the Royal Statistical Society. Series B, Statistical Methodology:425–464.

Rougier, J. 2008. Efficient emulators for multivariate deterministic functions. Journal of Computational and Graphical Statistics 17:827–843.