Title: | Enumeration and Uniform Sampling of Transmission Trees for a Known Phylogeny |
---|---|
Description: | For a single, known pathogen phylogeny, provides functions for enumeration of the set of compatible epidemic transmission trees, and for uniform sampling from that set. Optional arguments allow for incomplete sampling with a known number of missing individuals, multiple sampling, and known infection time limits. Always assumed are a complete transmission bottleneck and no superinfection or reinfection. See Hall and Colijn (2019) <doi:10.1093/molbev/msz058> for methodology. |
Authors: | Matthew Hall [aut, cre], Caroline Colijn [ctb] |
Maintainer: | Matthew Hall <[email protected]> |
License: | GPL |
Version: | 1.1.2 |
Built: | 2025-02-20 05:00:17 UTC |
Source: | https://github.com/mdhall272/stratus |
For a sample, produce the transmission tree as an igraph object
build.edgelist(generator, sample)
generator |
A list of class |
sample |
A list of class |
An igraph object
generator <- tt.generator(stratus.example.tree)
samples <- sample.tt(generator, 1)
build.edgelist(generator, samples[[1]])
For a sample with no unsampled hosts, draw the annotated phylogeny using ggtree
draw.fully.sampled(generator, sample)
generator |
A list of class |
sample |
A list of class |
A ggtree object
generator <- tt.generator(stratus.example.tree)
samples <- sample.tt(generator, 1)
draw.fully.sampled(generator, samples[[1]])
For a sample with or without unsampled hosts, draw the annotated phylogeny using ggtree
draw.incompletely.sampled(generator, sample)
generator |
A list of class |
sample |
A list of class |
A ggtree object
generator <- tt.generator(stratus.example.tree, max.unsampled = 2)
samples <- sample.tt(generator, 1, unsampled = 2)
# Tree is annotated with the number of unsampled hosts along each branch
draw.incompletely.sampled(generator, samples[[1]])
# This still works if there are no unsampled hosts
samples <- sample.tt(generator, 1)
draw.incompletely.sampled(generator, samples[[1]])
A vector assigning the tips of stratus.example.tree to groups (in the order they appear in stratus.example.tree$tip.label), as an example of multiple sampling.
A character vector of length 20
Resample the subtree rooted at any tree node, keeping the annotations for the rest of the tree fixed
sample.partial.tt(
  generator,
  count = 1,
  unsampled = 0,
  starting.node = phangorn::getRoot(generator$tree),
  existing = NULL,
  check.integrity = TRUE,
  draw = count == 1,
  igraph = FALSE,
  verbose = FALSE
)
generator |
A list of class |
count |
How many transmission trees to sample. |
unsampled |
The number of unsampled hosts in the transmission chain. (The whole transmission chain, even if only part of the transmission tree is being resampled). A value >0 requires a |
starting.node |
The root of the subtree to resample. If this is the root of the whole tree, then |
existing |
An existing list of class |
check.integrity |
Whether to check if |
draw |
Use |
igraph |
Produce the transmission trees in |
verbose |
Verbose output |
A list, each of whose elements is a list of class tt with one or more of the following elements:
annotations
Always present. A vector indicating which host (given by numbers corresponding to the ordering in generator$hosts) is assigned to each phylogeny node.
edgelist
Always present. A data.frame giving the edge list; the first column contains the parents and the second the children.
hidden
Present if unsampled is greater than 0. The number of "hidden" unsampled hosts (with no associated nodes) along each branch.
picture
Present if draw was TRUE; a ggtree object.
igraph
Present if igraph was TRUE; an igraph object.
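One way to see what a partial resample changed is to compare the annotations vectors of the original and the revised tree. The sketch below is only illustrative: the host names and both annotation vectors are made up, standing in for generator$hosts and the annotations elements of two real tt objects.

```r
# Hypothetical stand-ins: in practice these would be generator$hosts,
# original.tt$annotations and revised.tt$annotations.
hosts <- c("A", "B", "C")
original.annotations <- c(1, 2, 3, 1, 1)
revised.annotations  <- c(1, 2, 3, 1, 3)

# Phylogeny nodes whose host assignment differs between the two trees
changed.nodes <- which(original.annotations != revised.annotations)

# Report the old and new host for each changed node
changes <- data.frame(
  node = changed.nodes,
  old.host = hosts[original.annotations[changed.nodes]],
  new.host = hosts[revised.annotations[changed.nodes]],
  stringsAsFactors = FALSE
)
print(changes)
```

Nodes outside the resampled subtree should not appear in such a comparison, since their annotations are kept fixed.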
# draw one sample from the uniform distribution
generator <- tt.generator(stratus.example.tree)
samples <- sample.tt(generator, 1, draw = TRUE)
original.tt <- samples[[1]]
# sample anew, from node 31 downwards
revised.tt <- sample.partial.tt(generator, 1, starting.node = 31, existing = original.tt, draw = TRUE)
Sample one or more transmission trees uniformly
sample.tt(
  generator,
  count = 1,
  unsampled = 0,
  draw = count == 1,
  igraph = FALSE,
  verbose = FALSE
)
generator |
A list of class |
count |
How many transmission trees to sample. |
unsampled |
The number of unsampled hosts in the transmission chain. |
draw |
Use |
igraph |
Produce the transmission trees in |
verbose |
Verbose output |
A list, each of whose elements is a list of class tt with one or more of the following elements:
annotations
Always present. A vector indicating which host (given by numbers corresponding to the ordering in generator$hosts) is assigned to each phylogeny node.
edgelist
Always present. A data.frame giving the edge list; the first column contains the parents and the second the children.
hidden
Present if unsampled is greater than 0. The number of "hidden" unsampled hosts (with no associated nodes) along each branch.
picture
Present if draw was TRUE; a ggtree object.
igraph
Present if igraph was TRUE; an igraph object.
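The edgelist element lends itself to simple summaries of who infected whom. The following sketch uses a made-up edge list in the documented layout (parents in the first column, children in the second); the column names, the host labels, and the assumption that the index case appears only as a parent are illustrative and may not match what the package actually returns.

```r
# Hypothetical edge list in the documented layout (parents first, children second)
edgelist <- data.frame(
  parent = c("A", "A", "B"),
  child  = c("B", "C", "D"),
  stringsAsFactors = FALSE
)

# Index by position, since the real column names are not assumed here
parents  <- edgelist[[1]]
children <- edgelist[[2]]

# Hosts that infect someone but are never infected in this edge list
index.case <- setdiff(parents, children)

# Number of onward transmissions per host (0 for hosts who infect nobody)
all.hosts <- union(parents, children)
onward <- table(factor(parents, levels = all.hosts))

print(index.case)
print(onward)
```

Applied to each element of a list returned with count greater than 1, the same summary would give the distribution of onward transmissions across sampled trees.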
# draw one sample from the uniform distribution
generator <- tt.generator(stratus.example.tree)
samples <- sample.tt(generator, 1, draw = TRUE)
samples[[1]]
# with unsampled hosts
generator.us <- tt.generator(stratus.example.tree, max.unsampled = 2)
# note that you can ask for fewer unsampled hosts than the generator allows (but not more)
samples.1us <- sample.tt(generator.us, 1, unsampled = 1, draw = TRUE)
samples.1us[[1]]
# with multiply sampled hosts
generator.ms <- tt.generator(stratus.example.tree, tip.map = grouping.map)
samples.ms <- sample.tt(generator.ms, 1, draw = TRUE)
An unexceptional phylogeny generated with rtree from ape
A phylogenetic tree (phylo format) with 20 tips and 19 internal nodes
This function produces a list of class tt.generator, which can be used to randomly sample transmission trees for the input phylogeny and contains information on the number of compatible transmission trees.
tt.generator(
  tree,
  max.unsampled = 0,
  max.infection.to.sampling = Inf,
  max.sampling.to.noninfectious = Inf,
  minimum.heights = NULL,
  maximum.heights = NULL,
  tip.map = tree$tip.label,
  bigz = FALSE
)
tree |
A |
max.unsampled |
The maximum number of unsampled hosts in the transmission chain. The default is 0. |
max.infection.to.sampling |
The greatest time period (in tree branch length units) that can have elapsed between the infection of a host and a tip from that host appearing. The default is infinity, meaning that no such time limit exists. |
max.sampling.to.noninfectious |
The greatest time period (in tree branch length units) that can have elapsed between a tip from a host appearing and that host becoming noninfectious. If this is 0, a host's infection ends at the time of its last tip. The default is infinity, meaning that no such time limit exists. |
minimum.heights |
A vector of the same length as the set of sampled hosts (at present this is always the number of tips of the tree) dictating the minimum height at which nodes can be allocated to each host. The order is the same as the order of tips in |
maximum.heights |
A vector of the same length as the set of sampled hosts (at present this is always the number of tips of the tree) dictating the maximum height at which nodes can be allocated to each host. The order is the same as the order of tips in |
tip.map |
A vector of the same length as the tip set of the tree listing a string giving the host from which the corresponding sample was derived. If absent, each tip is assumed to come from a different host and the tip names are taken to be the host names. |
bigz |
Use |
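As an illustration of the tip.map argument, the sketch below derives a host vector from tip labels. The labels and their <host>_<sample number> naming scheme are hypothetical; a real tree may need a different rule, or a lookup table, to map tips to hosts.

```r
# Hypothetical tip labels of the form <host>_<sample number>
tip.labels <- c("hostA_1", "hostA_2", "hostB_1", "hostC_1", "hostC_2")

# Strip the sample suffix so that tips from the same host share a name
tip.map <- sub("_[0-9]+$", "", tip.labels)

# tip.map needs one entry per tip, in the same order as the tip labels
stopifnot(length(tip.map) == length(tip.labels))
print(table(tip.map))

# It could then be supplied as tt.generator(tree, tip.map = tip.map),
# provided tree$tip.label is in the same order as tip.labels.
```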
A list of class tt.info with the following fields:
tree
The input tree
tt.count
The total number of possible transmission trees.
hosts
The vector of host names. The order of the elements of this vector is used in the output of sample.tt.
height.limits
A matrix giving maximum and minimum node heights, in two columns. Rows are ordered by the order of hosts given in the hosts field.
bridge
A vector with the same length as the node set of the tree, dictating which nodes have their annotation forced by the tip annotations. Entries are host numbers for nodes whose annotation must be that host, and NA for nodes which can take multiple hosts.
node.calculations
A list with the same length as the number of nodes of the tree and whose entries are indexed in the same order. If max.unsampled is 0, each has the following fields (the terminology here comes from Hall and Colijn (2019)):
p
The number of valid partitions of the subtree rooted at this node.
pstar
The number of valid partitions of the unrooted tree obtained by attaching a single extra tip to the root node of the subtree rooted at this node. Alternatively, if any height constraints are given, a vector of the same length as the set of hosts, giving the number of partitions of the unrooted tree if the extra partition element is subject to the same minimum (but not maximum) height constraint as each host in turn.
v
A list indexed by the set of hosts, whose entries are the number of valid partitions of the subtree rooted at this node where the root node is in the partition element from each host.
Alternatively, if max.unsampled is greater than 0, the entries are:
p
A vector of length 1 + max.unsampled giving the number of valid partitions of the subtree rooted at this node if there are between 0 and max.unsampled (in order) partition elements containing no tips.
pstar
A vector of length 1 + max.unsampled giving the number of valid partitions of the tree obtained from the subtree rooted at this node by adding an extra tip connected to the root node, if there are between 0 and max.unsampled (in order) partition elements containing no tips.
ps
As with p, except this counts only partitions that have the root node in a sampled component (one containing at least one tip).
pu
As with p, except this counts only partitions that have the root node in an unsampled component (one containing no tip).
v
A list indexed by the set of hosts and "unsampled", whose entries are, for each host and an unsampled host, a vector of length 1 + max.unsampled counting the number of partitions that have the root node in that host's component if there are between 0 and max.unsampled partition elements containing no tips.
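To give a feel for what the per-node counts represent, here is a minimal base R sketch for the simplest setting only: every host sampled exactly once, no unsampled hosts and no height constraints. It assumes that a valid partition splits the tree nodes into connected elements, each containing exactly one tip. The nested-list tree format, the function name count.partitions and the field name external are all hypothetical, and the recursion is a plain reading of that definition rather than the package's implementation (in particular, external is not the same quantity as pstar above).

```r
# A tree is a nested list: a tip is a character string (its host name),
# an internal node is a list of child subtrees.
count.partitions <- function(node) {
  if (is.character(node)) {
    # A tip always sits in its own host's element and cannot join an outside one
    return(list(v = structure(1, names = node), external = 0))
  }
  children <- lapply(node, count.partitions)
  # Ways to partition each child's subtree, with the child's root either in an
  # element whose tip is inside that subtree (sum of v) or in an element
  # reached through its parent (external)
  free <- vapply(children, function(ch) sum(ch$v) + ch$external, numeric(1))
  # Root assigned to host h: the child holding h's tip must also be in h's
  # element, and every other child subtree is unconstrained
  v <- unlist(lapply(seq_along(children), function(i) {
    children[[i]]$v * prod(free[-i])
  }))
  # Root in an element whose tip lies outside this subtree
  list(v = v, external = prod(free))
}

# Three tips: ((a, b), c)
tree3 <- list(list("a", "b"), "c")
res3 <- count.partitions(tree3)
print(res3$v)       # partitions by the host assigned to the root
print(sum(res3$v))  # total number of partitions of the whole tree
```

For real phylogenies, generator$tt.count remains the authoritative count; the sketch is only meant to make the role of v concrete on a tree small enough to enumerate by hand.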
# make a generator for the example tree
generator <- tt.generator(stratus.example.tree)
# count the total number of transmission trees
generator$tt.count
# make a generator for the example tree with at most two unsampled hosts
generator.2us <- tt.generator(stratus.example.tree, max.unsampled = 2)
# make a generator for the example tree with no infection after sampling
generator.limits <- tt.generator(stratus.example.tree, max.sampling.to.noninfectious = 0)
# make a generator with multiple sampling defined by the vector grouping.map
generator.ms <- tt.generator(stratus.example.tree, tip.map = grouping.map)