Posted on October 20, 2015 by Philipp Neubauer

# Sources of variation in PPMRs: PT1 - INLA analysis

## Background

Many size-spectrum models for marine ecosystems are based on size specific predation, and specifically, the preferred predator-prey mass ratio *β*. It seems that most recent theoretical/simulation papers have used a value of *β* = 100. This number was based on Jennings et al. (2002), who used stable isotope analysis to relate size to trophic level.

Few (any?) papers acknowledge that this is the observed PPMR, and is the integral over the preference and availability kernels (e.g., following Hartvig et al. (2011), the food available for a predator of size *m* is (simplified): *ϕ*(*m*)=∫*N*(*m*_{p}) *s*(*m*_{p}, *m*) *d* *m*_{p}, where *m*_{p} is the prey mass, *N*(*m*_{p}) is number of fish at mass *m*_{p} and *s*(*m*_{p}, *m*) is the prey selectivity kernel.)

A meta-analysis of stomach content data by Barnes et al. (2010) found both geographical differences (or study specific, the two factors seem largely confounded in the study) and a positive relationship between the PPMR and predator mass. The over-all mean PPMR was found to be higher (2.66 + 0.24 × *l**o**g*_{10}(*m*)).

Subsequent in depth analyses of that dataset found both strong taxonomic components as well as site and species specific relationships between the PPMR and predator mass (Nakazawa et al. 2011). Furthermore, model selection in another study found a temperature signal in the biomass trends, suggesting that changes in the PPMR with predator mass are temperature dependent (Gibert and DeLong 2014).

## A Bayesian analysis using INLA

In order to use size-spectra and other models based on size-based predation for strategic decision making in real ecosystems, a key question remains: What drivers are important to consider for plug-in values of PPMRs, and what are reasonable values to use in strategic models.

This first post takes a glimpse at drivers of the PPMR using Bayesian random effects models implemented in INLA. The idea is to partition variability into taxonomic, habitat and predation components, where the latter relates to predator/prey size, feeding mode etc. For this I will use the dataset supplied with Barnes et al. (2010), and supplement it with detailed taxonomic and habitat data, using the taxize package (Chamberlain and Szocs 2013, Chamberlain et al. 2014) and the rfishbase package (Boettiger et al. 2015), respectively.

### Preparing the data

The first step is to load the data, downloaded from the ecology archives.

```
require(dplyr)
require(INLA)
ppmr.tab <-read.csv('include/PPMR_files/PPMR.csv',header=T,na.strings = 'n/a',stringsAsFactors = F)
ppmr <- tbl_df(ppmr.tab)
# convert all weights to mg
ppmr$Prey.mass[ppmr$Prey.mass.unit=='mg'] <- ppmr$Prey.mass[ppmr$Prey.mass.unit=='mg'] /1000
# numeric Latitude
ppmr$Latitude <- as.numeric(substr(ppmr$Latitude,start = 1,stop = 2))
# filter insectivorous fish
ppmr <- ppmr %>% filter(!Type.of.feeding.interaction %in% c('insectivorous','predacious/piscivorous'))
```

I then standardise numerical variables to check relative importance, and use taxize and rfishbase to get the covariates:

```
std <- function(x) (x-mean(x))/(2*sd(x))
ppmr <- ppmr %>% mutate(Mean.annual.temp = std(Mean.annual.temp),
SD.annual.temp = std(SD.annual.temp),
Latitude = std(Latitude),
Mean.PP = std(Mean.PP),
Depth = std(Depth),
Pmass=Predator.mass)
################################
###### Taxise ##################
################################
require("taxize")
# taxonomy of unresolved names, fix some bad entries manually
us <- unique(ppmr$Predator)
us[us=='Urophysis chuss'] <- 'Urophycis chuss'
us[us=='Leucoraja fullonica '] <- 'Leucoraja fullonica'
us[us=='Myoxocephalus octodecimspinosus'] <- 'Myoxocephalus octodecemspinosus'
# get rid of leading or trailing white spaces in species names
trim <- function (x) gsub("^\\s+|\\s+$", "", x)
us <-trim(us)
nus<- strsplit(us,split = ' sp.',fixed=T)
# get taxonomy
cs <- classification(nus, db = 'ncbi')
# assign taxonomy
ppmr$Class <- unlist(lapply(cs,function(x) x$name[x$rank=='Class' |x$rank=='class' ]))[match(ppmr$Predator,unique(ppmr$Predator))]
ppmr$Order <- unlist(lapply(cs,function(x) x$name[x$rank=='Order' | x$rank=='order']))[match(ppmr$Predator,unique(ppmr$Predator))]
ppmr$Family <- unlist(lapply(cs,function(x) x$name[x$rank=='Family' | x$rank=='family']))[match(ppmr$Predator,unique(ppmr$Predator))]
##################################################
################ Habitats ########################
##################################################
# get habitat from fishbase and other DBs
require(rfishbase)
# %in% doesn't work, need grepl as some ScientificNames don't match - makes it really slow
match_tax <- function (sp, fish.data = NULL, path = NULL) {
if (is.null(fish.data))
fish.data <- fishbase
matches <- vector('list',length(sp))
for (i in 1:length(sp)){
matches[[i]] <- which(do.call('c',lapply(split(fish.data,1:nrow(fish.data)), function(x) grepl(sp[[i]],paste(x$Genus,x$Species,collapse=' ')) | grepl(sp[[i]],x$Genus) | grepl(sp[[i]],x$Family)| grepl(sp[[i]],x$Class))))
}
return(matches)
}
# parallelise this call
sp_match <- parallel::mclapply(nus,match_tax,mc.cores=4)
sp_match <- do.call('c',sp_match)
# get habitat for FB species.
habitats <- do.call('c',lapply(sp_match,function(x) {
if(length(x)>0){
habitats <- species(species_names(load_taxa()[x[[1]],]$SpecCode,all_taxa = load_taxa()))$DemersPelag
habitats <- table(unlist(lapply(habitats,function(y) {
if(!is.null(y)) strsplit(y,'[;]')[[1]][1]
})))
habitat<-names(which.max(habitats))
} else {
habitat <- NA
}
habitat
}
)
)
# any without a match?
unique(ppmr$Predator)[is.na(habitats)]
```

```
## [1] "Loligo pealeii" "Notolepis rissoi"
## [3] "Paralichthys oblongus" "Raja erinacea"
## [5] "Gilbertidia sigalutes" "Raja ocellata"
## [7] "Aspitrigla cuculus" "Leucoraja fullonica "
## [9] "Raja naevus" "Pleuragramma antarcticum"
```

```
# manually assign habitats based on internet search...
habitats[is.na(habitats)] <- c('pelagic-neritic','bathypelagic','demersal','demersal','demersal','demersal','demersal','demersal','demersal','benthopelagic')
ppmr$habitat <- habitats[match(ppmr$Predator,unique(ppmr$Predator))]
any(is.na(ppmr$habitat))
```

`## [1] FALSE`

```
##################################################
############## Subset ############################
##################################################
ppmr.better <- ppmr %>% filter(Predator.quality.of.length.mass.conversion <5,
Prey.quality.of.conversion.to.mass <5)
# only use data that is of "good" quality for this analysis
ppmr.best <- ppmr %>% filter(Predator.quality.of.length.mass.conversion <4,
Prey.quality.of.conversion.to.mass <4,
!is.na(Individual.ID))
```

### Testing the waters with INLA

I now fit a Bayesian model in INLA (Rue et al. 2014), using the log10 prey mass regressed against the log10 predator mass and the covariates; taxonomy, habitat and feed type are supplied as random effects. I also added a random effect for geographic location, which pretty much proxies for the study in the original data.

I also let INLA calculate the CPO and pit values, as the latter can provide a clue about model calibration - they should be uniformly distributed if the model is well specified.

```
best.model.int <- inla(log10(Prey.mass) ~ log10(Predator.mass)*Mean.annual.temp+
SD.annual.temp +
#f(log10(Pmass),model='rw2', diagonal=1e-5) +
f(Predator,model='iid') +
f(Geographic.location,model='iid')+
f(Type.of.feeding.interaction,model='iid')+
f(factor(Individual.ID),model='iid')+
f(Order,model='iid')+
f(Class,model='iid')+
f(habitat,model='iid')+
f(Family,model='iid')+
#f(Specific.habitat,model='iid')+
Depth +
#Latitude +
Mean.PP,
data = ppmr.best,
control.compute = list(dic=TRUE,cpo=T),
control.predictor = list(compute=T)
)
```

### Model fit and results

A summary of the model fit is given here:

`summary(best.model.int)`

```
##
## Call:
## c("inla(formula = log10(Prey.mass) ~ log10(Predator.mass) * Mean.annual.temp + ", " SD.annual.temp + f(Predator, model = \"iid\") + f(Geographic.location, ", " model = \"iid\") + f(Type.of.feeding.interaction, model = \"iid\") + ", " f(factor(Individual.ID), model = \"iid\") + f(Order, model = \"iid\") + ", " f(Class, model = \"iid\") + f(habitat, model = \"iid\") + f(Family, ", " model = \"iid\") + Depth + Mean.PP, data = ppmr.best, control.compute = list(dic = TRUE, ", " cpo = T), control.predictor = list(compute = T))")
##
## Time used:
## Pre-processing Running inla Post-processing Total
## 2.5232 272.3183 1.2328 276.0743
##
## Fixed effects:
## mean sd 0.025quant 0.5quant
## (Intercept) -2.0986 0.3946 -2.9005 -2.0891
## log10(Predator.mass) 0.8646 0.0453 0.7753 0.8647
## Mean.annual.temp 0.8738 0.8799 -0.9171 0.9143
## SD.annual.temp 0.8886 1.1505 -1.3620 0.8818
## Depth -0.9497 0.8127 -2.2860 -1.0169
## Mean.PP -0.8536 0.3500 -1.5039 -0.8644
## log10(Predator.mass):Mean.annual.temp 0.1753 0.0552 0.0669 0.1753
## 0.975quant mode kld
## (Intercept) -1.3487 -2.0773 0
## log10(Predator.mass) 0.9532 0.8649 0
## Mean.annual.temp 2.4544 0.9471 0
## SD.annual.temp 3.1752 0.8793 0
## Depth 0.7513 -1.0766 0
## Mean.PP -0.1480 -0.8726 0
## log10(Predator.mass):Mean.annual.temp 0.2835 0.1754 0
##
## Random effects:
## Name Model
## Predator IID model
## Geographic.location IID model
## Type.of.feeding.interaction IID model
## factor(Individual.ID) IID model
## Order IID model
## Class IID model
## habitat IID model
## Family IID model
##
## Model hyperparameters:
## mean sd 0.025quant
## Precision for the Gaussian observations 7.481 1.170e-01 7.2590
## Precision for Geographic.location 9.079 7.095e+00 1.5589
## Precision for Predator 1.273 3.749e-01 0.5932
## Precision for Type.of.feeding.interaction 573.938 5.774e+02 50.6217
## Precision for factor(Individual.ID) 4.961 2.644e-01 4.4722
## Precision for Order 19462.135 1.890e+04 1272.0609
## Precision for Class 16042.542 1.766e+04 1077.8208
## Precision for habitat 20029.042 1.924e+04 1204.4973
## Precision for Family 15978.116 1.762e+04 1063.6941
## 0.5quant 0.975quant mode
## Precision for the Gaussian observations 7.478 7.718 7.469
## Precision for Geographic.location 7.220 27.645 4.127
## Precision for Predator 1.266 2.014 1.234
## Precision for Type.of.feeding.interaction 405.322 2101.621 141.752
## Precision for factor(Individual.ID) 4.950 5.511 4.923
## Precision for Order 13911.587 69281.296 3430.782
## Precision for Class 10700.136 63102.813 2915.287
## Precision for habitat 14360.221 70233.378 3165.792
## Precision for Family 10644.986 62921.371 2865.652
##
## Expected number of effective parameters(std dev): 1037.84(11.46)
## Number of equivalent replicates : 9.341
##
## Deviance Information Criterion (DIC) ...: 9043.64
## Effective number of parameters .........: 1038.39
##
## Marginal log-Likelihood: -5305.44
## CPO and PIT are computed
##
## Posterior marginals for linear predictor and fitted values computed
```

The summary suggests that inferences made in Barnes et al. (2010), Gibert and DeLong (2014) and Nakazawa et al. (2011) all hold up in a comprehensive analysis of the data: the effects for predator size and its interaction with temperature are unlikely to be zero. Taxonomic variability was high, but was inferred only at the individual and species level: at higher taxonomic groupings (e.g., Family, Order), the SD (precision) of random effects was extremely small (large). Similarly, habitat did not seem an important driver of realised PPMRs. The over-all mean PPMR, however, was inferred to be 123:1, which is quite a bit lower than that found in Barnes et al. (2010) and more in line with Jennings et al. (2002). Furthermore, mean primary production was negatively related to the ratio of prey to predator mass, indicating a larger PPMR in productive systems.

A look at model fits indicates an OK fit with some over-dispersion (peak in the pit histogram):

```
#Predicted values vs observations from the model
plot(best.model.int$summary.linear.predictor$`0.5quant`,log10(ppmr.best$Prey.mass))
abline(a=0,b=1)
```

```
# pit histogram
hist(best.model.int$cpo$pit)
```

The pit histogram, as well as the low variance attributed to the higher taxonomic groupings in the INLA model make me suspicious that the formulation of independent random effects in this case estimates their contribution correctly or produces reasonable predictions at higher taxonomic levels. Taxonomy is by nature nested, but it seems difficult to represent this explicitly in INLA (but perhaps I just didn't find the right way!). I'll try a more sophisticated approach for the next go at this...TBC

## References

Barnes, C., D. Maxwell, D. C. Reuman, and S. Jennings. 2010. Global patterns in predator-prey size relationships reveal size dependency of trophic transfer efficiency. Ecology 91:222–232.

Boettiger, C., S. Chamberlain, D. Temple Lang, and P. Wainwright. 2015. Rfishbase: R interface to ’fishBase’.

Chamberlain, S., and E. Szocs. 2013. Taxize - taxonomic search and retrieval in r. F1000Research.

Chamberlain, S., E. Szocs, C. Boettiger, K. Ram, I. Bartomeus, and J. Baumgartner. 2014. Taxize: Taxonomic information from around the web.

Gibert, J. P., and J. P. DeLong. 2014. Temperature alters food web body-size structure. Biology letters 10:20140473.

Hartvig, M., K. H. Andersen, and J. E. Beyer. 2011. Food web framework for size-structured populations. Journal of Theoretical Biology 272:113–122.

Jennings, S., K. J. Warr, and S. Mackinson. 2002. Use of size-based production and stable isotope analyses to predict trophic transfer efficiencies and predator-prey body mass ratios in food webs. Marine Ecology Progress Series 240:11–20.

Nakazawa, T., M. Ushio, and M. Kondoh. 2011. Scale dependence of predator–prey mass ratio: Determinants and applications. The Role of Body Size in Multispecies Systems 45:269.

Rue, H., S. Martino, F. Lindgren, D. Simpson, A. Riebler, and E. T. Krainski. 2014. INLA: Functions which allow to perform full bayesian analysis of latent gaussian models using integrated nested laplace approximaxion.