[R-sig-phylo] Comparing DIC of phylogenetic and non-phylogenetic GLMM run with MCMC (MCMCglmm)

Discussion:

Liam Kendall

2018-06-20 09:13:28 UTC

Dear all,

I am conducting an analysis predicting insect body sizes using a co-varying trait and their biogeographic region within two model formulations using MCMCglmm.

The first model has the structure: log(Weight) ~ log(Trait) + Biogeography + Family (i.e. the taxonomic family of each species).

The second model has the structure: log(Weight) ~ log(Trait) + Biogeography + (1|Species/Animal), pedigree = phylogeny, i.e. the variance between species is constrained by the branch lengths between them.

The aim of running these two models is to compare which is more predictive and to increase usability. Including family is user-friendly (and easy for the end user, especially if they’re not a taxonomist), whereas the phylogenetic model is more attractive theoretically; for prediction, however, it requires your species of interest to be contained within the phylogeny used to fit the model.

Therefore, my question is: how best can I compare these two models in model selection? Can I compare them directly by their DIC weighting if the only difference is the phylogenetic random term? Or is there a better way to compare them? So far, we are also comparing their performance based on k-fold cross-validation and RMSE, but in the ‘age of AIC’, DIC appears a good place to start for model selection.
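
For concreteness, "DIC weighting" is usually computed by analogy with Akaike weights: each model's DIC is expressed as a difference from the lowest DIC, and the weights are the normalised values of exp(-difference / 2). A minimal sketch of that arithmetic follows, in Python only to keep it language-neutral. The function name `dic_weights` and the two DIC values are hypothetical placeholders, not output from either model. The sketch shows the calculation only; it says nothing about whether DIC is an appropriate criterion for these models.

```python
import math

def dic_weights(dic_values):
    """Convert DIC values to relative weights, by analogy with Akaike weights."""
    best = min(dic_values)
    # Relative support for each model from its DIC difference to the best model
    rel = [math.exp(-0.5 * (d - best)) for d in dic_values]
    total = sum(rel)
    # Normalise so that the weights sum to 1
    return [r / total for r in rel]

# Hypothetical DIC values: [family model, phylogenetic model]
weights = dic_weights([1012.4, 1008.1])
print(weights)
```

A lower DIC gives a larger weight, and two models with identical DIC each receive a weight of 0.5.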

Any advice would be much appreciated.

Best,
Liam

_______________________________________________
R-sig-phylo mailing list - R-sig-***@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-***@r-project.org/

jonnations

2018-06-21 13:24:37 UTC

Hi Liam,

I don't have the exact answer you are looking for, but I would highly
recommend the brms package in R. It is incredibly flexible and has
excellent diagnostic tools like LOO and WAIC that are easy to use and
interpret for model selection. I think it would work well for the models
you presented. There is an easy to follow tutorial on phylogenetic mixed
models too.

Also there is another mailing list called "r-sig-mixed-models" that you might
be interested in. It's not "phylo" focused, but these sorts of questions
come up on there all the time.

Good luck!
Jon

ps- my first time responding to the list, sorry for any format errors
--
Jonathan A. Nations
PhD Candidate
Esselstyn Lab <http://www.museum.lsu.edu/esselstyn>
Museum of Natural Sciences <http://sites01.lsu.edu/wp/mns>
Louisiana State University


Jarrod Hadfield

2018-06-21 15:34:05 UTC

Hi Liam,

In multi-level models DIC can be 'focussed' at different levels. In
MCMCglmm, DIC is focussed at the highest possible level because this is
the only level at which it can be analytically computed for non-Gaussian
models. The highest level is not the level at which most scientists want
their information criteria focussed, and so I would not recommend it. In
fact I have wondered about removing it completely from MCMCglmm.
Cross-validation is a much better approach, and in some ways is what
information criteria aspire to. But it's more computationally demanding,
of course.
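
As a rough illustration of the cross-validation approach, here is a minimal sketch of k-fold cross-validation scored by RMSE. It is written in Python with the standard library only, so it is not tied to any particular modelling package. Everything in it is an assumption made for illustration: the data are simulated, and an intercept-only model and a simple least-squares line stand in for the two candidate models. With MCMCglmm, the `fit` and `predict` callables would instead wrap the actual model fit and its predictions for the held-out rows.

```python
import math
import random

def kfold_rmse(xs, ys, fit, predict, k=5, seed=1):
    """Return the RMSE over all held-out predictions from k-fold CV."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    sq_err = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the training rows only, then score the held-out rows
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            sq_err.append((ys[i] - predict(model, xs[i])) ** 2)
    return math.sqrt(sum(sq_err) / len(sq_err))

# Stand-in model 1: intercept only (predicts the training mean)
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def predict_mean(model, x):
    return model

# Stand-in model 2: simple least-squares line y = a + b * x
def fit_line(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return (my - b * mx, b)

def predict_line(model, x):
    a, b = model
    return a + b * x

# Simulated data (illustrative only): log(weight) as a noisy line in log(trait)
rng = random.Random(42)
log_trait = [rng.uniform(0.0, 3.0) for _ in range(100)]
log_weight = [0.5 + 2.0 * x + rng.gauss(0.0, 0.3) for x in log_trait]

# The same seed gives both models identical folds, so the RMSEs are comparable
rmse_mean = kfold_rmse(log_trait, log_weight, fit_mean, predict_mean)
rmse_line = kfold_rmse(log_trait, log_weight, fit_line, predict_line)
print(f"intercept-only RMSE: {rmse_mean:.3f}")
print(f"linear-model RMSE:   {rmse_line:.3f}")
```

The model with the lower held-out RMSE predicts better on data it was not fitted to. The cost is k model fits per candidate, which is where the extra computation comes from.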

Cheers,

Jarrod


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


Liam Kendall

2018-06-22 03:43:31 UTC

Thank you all for your very informative responses.

I will try the brms package as Jon suggested - I have read a bit about WAIC being more appropriate or favourable than the DIC but I was (until now) unfamiliar with the brms package.

We are very much working within a predictive framework where model selection is followed by k-fold cross-validation, so I would be very curious to hear your thoughts on that type of cross-validation for a phylogenetic GLMM.

Thanks again and all the best

Liam
