David Bapst
2018-10-27 17:37:40 UTC
Hi all,
I am wondering whether anyone is familiar with R code that can estimate an
extended majority consensus tree (referred to as an 'allcompat' tree by the
sumt command in MrBayes). This is a fully bifurcating summary of a tree
posterior, where each clade is maximally resolved by the split that is most
abundant in the considered post-burn-in posterior (i.e., the split that has
the plurality, if not the majority: a higher posterior probability than any
competing, conflicting split recovered within the posterior). So, I guess
one could also call these plurality consensus trees.
The ape function `consensus` seemed promising at first, as it takes a `p`
argument which at 1 returns the strict consensus (the default), and at 0.5
returns the majority rule consensus (effectively the same as the
'halfcompat' option in MrBayes). So, I thought, I wonder what happens if I
set `p` below 0.5 - you could imagine that the extended majority consensus
is basically a similar threshold algorithm, but accepting solutions
(splits) of any frequency of occurrence in the tree set, so effectively
p~0.
Unfortunately, that is not how that works out, as `consensus` simply
assembles all splits with a frequency above the `p` value, but doesn't
discard conflicting splits. This means you can theoretically get more
resolved consensus trees below 0.5, but in practical terms your ability to
recover reasonable tree objects lasts until the frequency drops to the
point that you begin to accept conflicting splits.
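To see where that point lies for a given tree set, a rough sketch along the following lines might help. It assumes clades are counted the way ape's `prop.part` counts them (trees treated as rooted); `clades_compatible` and `first_conflict_freq` are hypothetical helper names, not ape functions, and the four-taxon toy trees are made up purely for illustration.

```r
library(ape)

# Hypothetical helper (not part of ape): two clades are compatible
# when they are disjoint or when one is nested inside the other.
clades_compatible <- function(a, b) {
  shared <- length(intersect(a, b))
  shared == 0 || shared == length(a) || shared == length(b)
}

# Hypothetical helper: the highest frequency at which a clade conflicts
# with a clade that is at least as frequent. Returns NA if nothing conflicts.
first_conflict_freq <- function(trees) {
  pp <- prop.part(trees)                      # clades and their counts
  freq <- attr(pp, "number") / length(trees)  # counts -> proportions
  ord <- order(freq, decreasing = TRUE)
  clades <- unclass(pp)[ord]
  freq <- freq[ord]
  for (i in seq_along(clades)[-1]) {
    for (j in seq_len(i - 1)) {
      if (!clades_compatible(clades[[i]], clades[[j]])) return(freq[i])
    }
  }
  NA_real_
}

# Made-up toy tree set: (A,B) is the plurality resolution inside (A,B,C),
# but it occurs in only 2 of the 5 trees.
toy <- read.tree(text = c(
  "(((A,B),C),D);", "(((A,B),C),D);",
  "(((A,C),B),D);", "(((B,C),A),D);",
  "(A,(B,(C,D)));"
))
first_conflict_freq(toy)
```

Under these assumptions, clades with a frequency strictly above the returned value are mutually compatible, so that value marks roughly how far a pure threshold rule can be pushed before conflicting clades start to be collected.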
Here's some code based on a phangorn example, where I use `consensus` to
get a more resolved tree as I delve into lower `p` values. You can see I
get a reasonable (if you think the extended majority rule is reasonable),
more resolved tree at `p = 0.45`, but at `p = 0.2` conflicting splits are
accepted, such that the output no longer has a rational tree structure.
```
library(ape)
library(phangorn)
data(Laurasiatherian)
set.seed(42)
# 100 bootstrap replicates, each summarized as a UPGMA tree
bs <- bootstrap.phyDat(Laurasiatherian, FUN =
    function(x) upgma(dist.hamming(x)), bs = 100)
tA <- consensus(bs, p = 1)
tB <- consensus(bs, p = 0.5)
tC <- consensus(bs, p = 0.45)
tD <- consensus(bs, p = 0.2)
layout(matrix(1:4, 2, 2))
plot(tA); plot(tB); plot(tC); plot(tD)
# Error in plot.phylo(tD) :
#   tree badly conformed; cannot plot. Check the edge matrix.
str(tD)
# List of 3
#  $ edge : int [1:109, 1:2] 48 49 50 51 52 53 54 55 56 57 ...
#  $ tip.label: chr [1:47] "GraySeal" "Vole" "Wallaroo" "Loris" ...
#  $ Nnode : int 63
#  - attr(*, "class")= chr "phylo"
# Warning messages:
# 1: display list redraw incomplete
# 2: display list redraw incomplete
# 3: display list redraw incomplete
# 4: display list redraw incomplete
```
I'd be interested to know if anyone knows of an alternative way to do this
in R, or if perhaps I need to figure out how to modify `consensus` to
reject conflicting splits.
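For concreteness, here is a minimal sketch of what "rejecting conflicting splits" could look like as a greedy procedure: rank clades by frequency, accept each one only if it is compatible with everything accepted so far, then assemble a tree from the accepted set. This is an illustration under stated assumptions, not how `consensus` or MrBayes actually implement it. `greedy_consensus` is a hypothetical name, clades are taken from `prop.part` (so trees are treated as rooted), ties in frequency are broken arbitrarily, tip labels are assumed to be Newick-safe, and the toy trees are made up.

```r
library(ape)

# Hypothetical function (not part of ape): greedy "plurality" consensus.
greedy_consensus <- function(trees) {
  pp <- prop.part(trees)          # clades and their counts across the set
  labels <- attr(pp, "labels")
  counts <- attr(pp, "number")
  clades <- unclass(pp)

  # Two clades are compatible if they are disjoint or nested.
  compatible <- function(a, b) {
    shared <- length(intersect(a, b))
    shared == 0 || shared == length(a) || shared == length(b)
  }

  # Visit clades from most to least frequent; keep the compatible ones.
  accepted <- list()
  for (i in order(counts, decreasing = TRUE)) {
    ok <- all(vapply(accepted, compatible, logical(1), b = clades[[i]]))
    if (ok) accepted[[length(accepted) + 1]] <- clades[[i]]
  }

  # Build a Newick string by recursing into the maximal accepted sub-clades.
  to_newick <- function(tips) {
    inner <- Filter(function(cl) length(cl) < length(tips) &&
                      all(cl %in% tips), accepted)
    maximal <- Filter(function(cl) !any(vapply(inner, function(o)
      length(o) > length(cl) && all(cl %in% o), logical(1))), inner)
    loose <- setdiff(tips, unlist(maximal))   # tips not in any sub-clade
    parts <- c(vapply(maximal, to_newick, character(1)), labels[loose])
    paste0("(", paste(parts, collapse = ","), ")")
  }
  read.tree(text = paste0(to_newick(seq_along(labels)), ";"))
}

# Made-up toy tree set: (A,B) occurs in only 2 of 5 trees, but it is more
# frequent than either competing resolution, (A,C) or (B,C).
toy <- read.tree(text = c(
  "(((A,B),C),D);", "(((A,B),C),D);",
  "(((A,C),B),D);", "(((B,C),A),D);",
  "(A,(B,(C,D)));"
))
res <- greedy_consensus(toy)
plot(res)
```

The result is only fully bifurcating if enough compatible clades were observed in the tree set, and it carries no support values or branch lengths; both would need to be added separately.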
Cheers,
-Dave B
PS: Yes, I know there are real issues with such exhaustive consensus trees,
particularly that they will likely agglomerate a combination of splits that
exists on no tree recovered within the posterior, but I have my reasons!
--
David W. Bapst, PhD
Asst Research Professor, Geology & Geophysics, Texas A & M University
Postdoc, Ecology & Evolutionary Biology, Univ of Tenn Knoxville
https://github.com/dwbapst/paleotree
_______________________________________________
R-sig-phylo mailing list - R-sig-***@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-***@r-project.org/