Skip to content

API Reference

I recommend that TACT only be used via its console commands.

While TACT can be used as a Python library, its internal interface is subject to change at any time, even for minor or patch versions. This API documentation is provided merely for the sake of completeness.

Numerical functions

Functions in tact/lib.py.

Functions to handle various numerical operations, including optimization.

crown_capture_probability(n, k)

Calculate the probability of observing the crown node of an incompletely sampled node.

That is, the probability that a sample of k taxa from a clade of n total taxa includes the root (crown) node of the clade, under a Yule process.

This equation is taken from:

Sanderson, M. J. 1996. How many taxa must be sampled to identify the root node of a large clade? Systematic Biology 45:168-173

Parameters:

Name Type Description Default
n int

total number of taxa

required
k int

sampled taxa

required

Returns:

Type Description
float

probability of including a root node.

Source code in tact/lib.py
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
def crown_capture_probability(n, k):
    """Calculate the probability of observing the crown node of an incompletely sampled node.

    That is, the probability that a sample of `k` taxa from a clade of `n` total taxa
    includes the root (crown) node of the clade, under a Yule process.

    This equation is taken from:

    Sanderson, M. J. 1996. How many taxa must be sampled to identify
    the root node of a large clade? Systematic Biology 45:168-173

    Args:
        n (int): total number of taxa
        k (int): sampled taxa

    Returns:
        (float): probability of including a root node.
    """
    if n < k:
        raise Exception(f"n must be greater than or equal to k (n={n}, k={k})")
    if n == 1 and k == 1:
        return 0  # not technically correct but it works for our purposes
    return 1 - 2 * (n - k) / ((n - 1) * (k + 1))

get_bd(r, a)

Converts turnover and relative extinction to birth and death rates.

Parameters:

Name Type Description Default
r float

turnover or net diversification (birth - death)

required
a float

relative extinction (death / birth)

required

Returns:

Type Description
(float, float)

birth, death

Source code in tact/lib.py
15
16
17
18
19
20
21
22
23
24
25
def get_bd(r, a):
    """Converts turnover and relative extinction to birth and death rates.

    Args:
        r (float): turnover or net diversification (birth - death)
        a (float): relative extinction (death / birth)

    Returns:
        (float, float): birth, death
    """
    return -r / (a - 1), -a * r / (a - 1)

get_new_times(ages, birth, death, missing, told=None, tyoung=None)

Simulates new speciation events in an incomplete phylogeny.

Assumes a constant-rate birth-death process. Adapted from the R function TreeSim::corsim, written by Tanja Stadler.

N. Cusimano, T. Stadler, S. Renner. A new method for handling missing species in diversification analysis applicable to randomly or non-randomly sampled phylogenies. Syst. Biol., 61(5): 785-792, 2012.

Parameters:

Name Type Description Default
ages list

vector of waiting times

required
birth float

birth rate

required
death float

death rate

required
missing int

number of missing taxa to simulate

required
told float

maximum simulated age (default: max(ages))

None
tyoung float

minimum simulated age bound (default: 0)

None

Returns:

Type Description
list

vector of simulated waiting times.

Source code in tact/lib.py
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
def get_new_times(ages, birth, death, missing, told=None, tyoung=None):
    """Simulates new speciation events in an incomplete phylogeny.

    Assumes a constant-rate birth-death process. Adapted from the R function `TreeSim::corsim`,
    written by Tanja Stadler.

    N. Cusimano, T. Stadler, S. Renner. A new method for handling missing
    species in diversification analysis applicable to randomly or
    non-randomly sampled phylogenies. Syst. Biol., 61(5): 785-792, 2012.

    Args:
        ages (list): vector of waiting times
        birth (float): birth rate
        death (float): death rate
        missing (int): number of missing taxa to simulate
        told (float): maximum simulated age (default: `max(ages)`)
        tyoung (float): minimum simulated age bound (default: `0`)

    Returns:
        (list): vector of simulated waiting times.
    """
    if told is None:
        told = max(ages)
    if len(ages) > 0:
        if max(ages) > told and abs(max(ages) - told) > sys.float_info.epsilon:
            raise Exception("Zero or negative branch lengths detected in backbone phylogeny")
    if tyoung is None:
        tyoung = 0

    ages.sort(reverse=True)
    times = [x for x in ages if told >= x >= tyoung]
    times = [told, *times, tyoung]
    ranks = range(0, len(times))
    only_new = []
    while missing > 0:
        if len(ranks) > 2:
            distrranks = []
            for i in range(1, len(ranks)):
                temp = ranks[i] * (intp1(times[i - 1], birth, death) - intp1(times[i], birth, death))
                distrranks.append(temp)
            try:
                dsum = sum(distrranks)
                distrranks = [x / dsum for x in distrranks]
                for i in range(1, len(distrranks)):
                    distrranks[i] = distrranks[i] + distrranks[i - 1]
                r = random.uniform(0, 1)
                addrank = min([idx for idx, x in enumerate(distrranks) if x > r])
            except ZeroDivisionError:
                addrank = 0
            except ValueError:
                addrank = 0
        else:
            addrank = 0
        r = random.uniform(0, 1)
        const = intp1(times[addrank], birth, death) - intp1(times[addrank + 1], birth, death)
        try:
            temp = intp1(times[addrank + 1], birth, death) / const
        except ZeroDivisionError:
            temp = 0.0
        xnew = 1 / (death - birth) * log((1 - (r + temp) * const * birth) / (1 - (r + temp) * const * death))
        only_new.append(xnew)
        missing -= 1
    only_new.sort(reverse=True)
    return only_new

get_ra(b, d)

Converts birth and death to turnover and relative extinction rates.

Parameters:

Name Type Description Default
b float

birth rate

required
d float

extinction rate

required

Returns:

Type Description
(float, float)

turnover, relative extinction

Source code in tact/lib.py
28
29
30
31
32
33
34
35
36
37
38
def get_ra(b, d):
    """Converts birth and death to turnover and relative extinction rates.

    Args:
        b (float): birth rate
        d (float): extinction rate

    Returns:
        (float, float): turnover, relative extinction
    """
    return (b - d, d / b)

intp1(t, l, m)

Computes a constant necessary to sample the time of a missing speciation event.

This constant is not named, but was used in eqn A.2 and called c_2, described in:

N. Cusimano, T. Stadler, S. Renner. A new method for handling missing species in diversification analysis applicable to randomly or non-randomly sampled phylogenies. Syst. Biol., 61(5): 785-792, 2012.

This function was originally implemented as TreeSim:::intp1.

Source code in tact/lib.py
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
def intp1(t, l, m):  # noqa: E741
    """Computes a constant necessary to sample the time of a missing speciation event.

    This constant is not named, but was used in eqn A.2 and called c_2, described in:

    N. Cusimano, T. Stadler, S. Renner. A new method for handling missing
    species in diversification analysis applicable to randomly or
    non-randomly sampled phylogenies. Syst. Biol., 61(5): 785-792, 2012.

    This function was originally implemented as `TreeSim:::intp1`.
    """
    try:
        return (1 - exp(-(l - m) * t)) / (l - m * exp(-(l - m) * t))
    except OverflowError:
        return float(intp1_exact(t, l, m))

intp1_exact(t, l, m)

Exact version of intp1 using Decimal math.

Source code in tact/lib.py
230
231
232
233
234
235
236
237
def intp1_exact(t, l, m):  # noqa: E741
    """Exact version of `intp1` using Decimal math."""
    l = D(l)  # noqa: E741
    m = D(m)
    t = D(t)
    num = D(1) - (-(l - m) * t).exp()
    denom = l - m * (-(l - m) * t).exp()
    return num / denom

lik_constant(vec, rho, t, root=1, survival=1, p1=p1)

Calculates the likelihood of a constant-rate birth-death process.

This likelihood function is conditioned on the waiting times of a phylogenetic tree and degree of incomplete sampling. Based off of the R function TreePar::LikConstant written by Tanja Stadler.

T. Stadler. On incomplete sampling under birth-death models and connections to the sampling-based coalescent. Jour. Theo. Biol. 261: 58-66, 2009.

Parameters:

Name Type Description Default
vec (float, float)

two element tuple of birth and death

required
rho float

sampling fraction

required
t list

vector of waiting times

required
root bool

include the root or not? (default: 1)

1
survival bool

assume survival of the process? (default: 1)

1
p1

(func): the p1 function used to compute this likelihood.

p1

Returns:

Type Description
float

likelihood of the birth-death process.

Source code in tact/lib.py
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
def lik_constant(vec, rho, t, root=1, survival=1, p1=p1):
    """Calculates the likelihood of a constant-rate birth-death process.

    This likelihood function is conditioned on the waiting times of a phylogenetic tree and
    degree of incomplete sampling. Based off of the R function `TreePar::LikConstant` written by Tanja Stadler.

    T. Stadler. On incomplete sampling under birth-death models and connections
    to the sampling-based coalescent. Jour. Theo. Biol. 261: 58-66, 2009.

    Args:
        vec (float, float): two element tuple of birth and death
        rho (float): sampling fraction
        t (list): vector of waiting times
        root (bool): include the root or not? (default: 1)
        survival (bool): assume survival of the process? (default: 1)
        p1: (func): the `p1` function used to compute this likelihood.

    Returns:
        (float): likelihood of the birth-death process.
    """
    l = vec[0]  # noqa: E741
    m = vec[1]
    t.sort(reverse=True)
    lik = (root + 1) * log(p1(t[0], l, m, rho))
    for tt in t[1:]:
        lik += log(l) + log(p1(tt, l, m, rho))
    if survival == 1:
        lik -= (root + 1) * log(1 - p0(t[0], l, m, rho))
    return -lik

optim_bd(ages, sampling, min_bound=1e-09)

Optimizes birth and death parameters given a vector of splitting times and sampling fraction.

Parameters:

Name Type Description Default
ages list

vector of node ages

required
sampling float

sampling fraction (0, 1]

required
min_bound float

minimum birth rate

1e-09

Returns:

Name Type Description
birth float

optimized birth rate.

death float

optimized death rate.

Source code in tact/lib.py
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
def optim_bd(ages, sampling, min_bound=1e-9):
    """Optimizes birth and death parameters given a vector of splitting times and sampling fraction.

    Args:
        ages (list): vector of node ages
        sampling (float): sampling fraction (0, 1]
        min_bound (float): minimum birth rate

    Returns:
        birth (float): optimized birth rate.
        death (float): optimized death rate.
    """
    if max(ages) < 0.000001:
        init_r = 1e-3
    else:
        # Magallon-Sanderson crown estimator
        init_r = (log((len(ages) + 1) / sampling) - log(2)) / max(ages)
        init_r = max(1e-3, init_r)
    bounds = ((min_bound, 100), (0, 1 - min_bound))
    result = two_step_optim(wrapped_lik_constant, x0=(init_r, min_bound), bounds=bounds, args=(sampling, ages))
    return get_bd(*result)

optim_yule(ages, sampling, min_bound=1e-09)

Optimizes birth parameter under a Yule model, given a vector of splitting times and sampling fraction.

Parameters:

Name Type Description Default
ages list

vector of node ages

required
sampling float

sampling fraction (0, 1]

required
min_bound float

minimum birth rate

1e-09

Returns:

Name Type Description
birth float

optimized birth rate.

death float

optimized death rate. Always 0.

Source code in tact/lib.py
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
def optim_yule(ages, sampling, min_bound=1e-9):
    """Optimizes birth parameter under a Yule model, given a vector of splitting times and sampling fraction.

    Args:
        ages (list): vector of node ages
        sampling (float): sampling fraction (0, 1]
        min_bound (float): minimum birth rate

    Returns:
        birth (float): optimized birth rate.
        death (float): optimized death rate. Always 0.
    """
    bounds = (min_bound, 100)
    result = minimize_scalar(wrapped_lik_constant_yule, bounds=bounds, args=(sampling, ages), method="Bounded")
    if result["success"]:
        return (result["x"], 0.0)

    raise Exception(f"Optimization failed: {result['message']} (code {result['status']})")

p0(t, l, m, rho)

Compute the probability of no sampled descendants.

Specifically, this is the probability that an individual alive at time t before today has no sampled extinct or extant descendants, and assumes that there is no sampling in the past. This can alternatively be interpreted as the probability of sampling zero extant individuals and potentially infinite extinct individuals.

This equation is described as remark 3.2 in:

Stadler, T. (2010). Sampling-through-time in birth-death trees. Journal of Theoretical Biology, 267(3), 396-404.

It was originally implemented as TreePar:::p0, whose original description was in:

Stadler, T. (2011). Mammalian phylogeny reveals recent diversification rate shifts. Proceedings of the National Academy of Sciences, 108(15), 6187-6192.

Source code in tact/lib.py
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
def p0(t, l, m, rho):  # noqa: E741
    """Compute the probability of no sampled descendants.

    Specifically, this is the probability that an individual alive at time `t` before today has no
    sampled extinct or extant descendants, and assumes that there is no sampling in the past. This
    can alternatively be interpreted as the probability of sampling zero extant individuals and
    potentially infinite extinct individuals.

    This equation is described as remark 3.2 in:

    Stadler, T. (2010). Sampling-through-time in birth-death trees.
    Journal of Theoretical Biology, 267(3), 396-404.

    It was originally implemented as `TreePar:::p0`, whose original description was in:

    Stadler, T. (2011). Mammalian phylogeny reveals recent diversification rate shifts.
    Proceedings of the National Academy of Sciences, 108(15), 6187-6192.
    """
    try:
        return 1 - rho * (l - m) / (rho * l + (l * (1 - rho) - m) * exp(-(l - m) * t))
    except FloatingPointError:
        return float(p0_exact(t, l, m, rho))

p0_exact(t, l, m, rho)

Exact version of p0 using Decimal math.

Source code in tact/lib.py
140
141
142
143
144
145
146
def p0_exact(t, l, m, rho):  # noqa: E741
    """Exact version of `p0` using Decimal math."""
    t = D(t)
    l = D(l)  # noqa: E741
    m = D(m)
    rho = D(rho)
    return D(1) - rho * (l - m) / (rho * l + (l * (D(1) - rho) - m) * (-(l - m) * t).exp())

p1(t, l, m, rho)

Compute the probability of exactly one sampled descendant.

Specifically, the probability that an individual alive at time t before today has precisely one sampled extant descendant and no sampled extinct descendant, and assumes that there is no sampling in the past. This can alternatively be interpreted as the probability of sampling exactly one extant individual and potentially infinite extinct individuals.

This implementation is an optimized version of p1_orig, using common subexpression elimination and strength reduction from exponentiation to multiplication.

This equation is described as remark 3.2 in:

Stadler, T. (2010). Sampling-through-time in birth-death trees. Journal of Theoretical Biology, 267(3), 396-404.

It was originally implemented as TreePar:::p1, whose original description was in:

Stadler, T. (2011). Mammalian phylogeny reveals recent diversification rate shifts. Proceedings of the National Academy of Sciences, 108(15), 6187-6192.

Source code in tact/lib.py
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
def p1(t, l, m, rho):  # noqa: E741
    """Compute the probability of exactly one sampled descendant.

    Specifically, the probability that an individual alive at time `t` before today has precisely one sampled
    extant descendant and no sampled extinct descendant, and assumes that there is no sampling in the past.
    This can alternatively be interpreted as the probability of sampling exactly one extant individual and
    potentially infinite extinct individuals.

    This implementation is an optimized version of `p1_orig`, using common subexpression elimination
    and strength reduction from exponentiation to multiplication.

    This equation is described as remark 3.2 in:

    Stadler, T. (2010). Sampling-through-time in birth-death trees.
    Journal of Theoretical Biology, 267(3), 396-404.

    It was originally implemented as `TreePar:::p1`, whose original description was in:

    Stadler, T. (2011). Mammalian phylogeny reveals recent diversification rate shifts.
    Proceedings of the National Academy of Sciences, 108(15), 6187-6192.
    """
    try:
        ert = np.exp(-(l - m) * t, dtype=np.float64)
        num = rho * (l - m) ** 2 * ert
        denom = (rho * l + (l * (1 - rho) - m) * ert) ** 2
        res = num / denom
    except (OverflowError, FloatingPointError):
        res = float(p1_exact(t, l, m, rho))
    if res == 0.0:
        return sys.float_info.min
    return res

p1_exact(t, l, m, rho)

Exact version of p1 using Decimal math.

Source code in tact/lib.py
173
174
175
176
177
178
179
180
181
def p1_exact(t, l, m, rho):  # noqa: E741
    """Exact version of `p1` using Decimal math."""
    t = D(t)
    l = D(l)  # noqa: E741
    m = D(m)
    rho = D(rho)
    num = rho * (l - m) ** D(2) * (-(l - m) * t).exp()
    denom = (rho * l + (l * (1 - rho) - m) * (-(l - m) * t).exp()) ** D(2)
    return num / denom

p1_orig(t, l, m, rho)

Original version of p1, here for testing and comparison purposes.

Source code in tact/lib.py
184
185
186
187
188
189
190
191
192
193
194
def p1_orig(t, l, m, rho):  # noqa: E741
    """Original version of `p1`, here for testing and comparison purposes."""
    try:
        num = rho * (l - m) ** 2 * np.exp(-(l - m) * t)
        denom = (rho * l + (l * (1 - rho) - m) * np.exp(-(l - m) * t)) ** 2
        res = num / denom
    except (OverflowError, FloatingPointError):
        res = float(p1_exact(t, l, m, rho))
    if res == 0.0:
        return sys.float_info.min
    return res

two_step_optim(func, x0, bounds, args)

Conduct a two-step function optimization.

First, use the fast L-BFGS-B method, and if that fails, use simulated annealing.

Parameters:

Name Type Description Default
func callable

function to optimize

required
x0 tuple

initial conditions

required
bounds tuple

boundary conditions

required
args list

additional arguments to pass to func

required

Returns:

Name Type Description
params tuple

optimized parameter values

Source code in tact/lib.py
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
def two_step_optim(func, x0, bounds, args):
    """Conduct a two-step function optimization.

    First, use the fast L-BFGS-B method, and if that fails, use simulated annealing.

    Args:
        func (callable): function to optimize
        x0 (tuple): initial conditions
        bounds (tuple): boundary conditions
        args (list): additional arguments to pass to `func`

    Returns:
        params (tuple): optimized parameter values
    """
    try:
        result = minimize(func, x0=x0, bounds=bounds, args=args, method="L-BFGS-B")
        if result["success"]:
            return result["x"].tolist()
    except FloatingPointError:
        pass

    result = dual_annealing(func, x0=x0, bounds=bounds, args=args)
    if result["success"]:
        return result["x"].tolist()

    raise Exception(f"Optimization failed: {result['message']} (code {result['status']})")

wrapped_lik_constant(x, sampling, ages)

Wrapper for birth-death likelihood to make optimizing more convenient.

Parameters:

Name Type Description Default
x (float, float)

turnover, relative extinction

required
sampling float

sampling fraction (0, 1]

required
ages list

vector of node ages

required

Returns:

Type Description
float

a likelihood

Source code in tact/lib.py
41
42
43
44
45
46
47
48
49
50
51
52
def wrapped_lik_constant(x, sampling, ages):
    """Wrapper for birth-death likelihood to make optimizing more convenient.

    Args:
        x (float, float): turnover, relative extinction
        sampling (float): sampling fraction (0, 1]
        ages (list): vector of node ages

    Returns:
        (float): a likelihood
    """
    return lik_constant(get_bd(*x), sampling, ages)

wrapped_lik_constant_yule(x, sampling, ages)

Wrapper for Yule likelihood to make optimizing more convenient.

Parameters:

Name Type Description Default
x float

birth rate

required
sampling float

sampling fraction (0, 1]

required
ages list

vector of node ages

required

Returns:

Type Description
float

a likelihood

Source code in tact/lib.py
55
56
57
58
59
60
61
62
63
64
65
66
def wrapped_lik_constant_yule(x, sampling, ages):
    """Wrapper for Yule likelihood to make optimizing more convenient.

    Args:
        x (float): birth rate
        sampling (float): sampling fraction (0, 1]
        ages (list): vector of node ages

    Returns:
        (float): a likelihood
    """
    return lik_constant((x, 0.0), sampling, ages)

Tree functions

Functions in tact/tree_util.py.

Functions specifically to handle DendroPy tree objects.

compute_node_depths(tree)

Returns a dictionary of node depths for each node with a label.

Source code in tact/tree_util.py
116
117
118
119
120
121
122
123
124
125
def compute_node_depths(tree):
    """Returns a dictionary of node depths for each node with a label."""
    res = {}
    for leaf in tree.leaf_node_iter():
        cnt = 0
        for anc in leaf.ancestor_iter():
            if anc.label:
                cnt += 1
        res[leaf.taxon.label] = cnt
    return res

count_locked(node)

How many edges under node are locked?

Source code in tact/tree_util.py
205
206
207
def count_locked(node):
    """How many edges under `node` are locked?"""
    sum([x.label == "locked" for x in edge_iter(node)])

edge_iter(node, filter_fn=None)

Iterates over the child edge of node and all its descendants.

Can optionally be filtered by filter_fn.

Source code in tact/tree_util.py
55
56
57
58
59
60
61
62
63
64
65
def edge_iter(node, filter_fn=None):
    """Iterates over the child edge of `node` and all its descendants.

    Can optionally be filtered by `filter_fn`.
    """
    stack = list(node.child_edge_iter())
    while stack:
        edge = stack.pop()
        if filter_fn is None or filter_fn(edge):
            yield edge
        stack.extend(edge.head_node.child_edge_iter())

get_age_intervals(node)

Gets the (possibly disjoint) interval that could be generated in the clade under node.

This assumes that grafts to locked edges are restricted.

Source code in tact/tree_util.py
231
232
233
234
235
236
237
238
239
def get_age_intervals(node):
    """Gets the (possibly disjoint) interval that could be generated in the clade under `node`.

    This assumes that grafts to locked edges are restricted.
    """
    acc = portion.empty()
    for edge in edge_iter(node, lambda x: x.label != "locked"):
        acc = acc | portion.closed(edge.head_node.age, edge.tail_node.age)
    return acc

get_ages(node, include_root=False)

Get list of ages under a node.

Returns the list of ages of the children of a given node, optionally including the node's age if include_root is True.

Source code in tact/tree_util.py
35
36
37
38
39
40
41
42
43
44
def get_ages(node, include_root=False):
    """Get list of ages under a node.

    Returns the list of ages of the children of a given `node`,
    optionally including the `node`'s age if `include_root` is True.
    """
    ages = [x.age for x in node.ageorder_iter(include_leaves=False, descending=True)]
    if include_root:
        ages += [node.age]
    return ages

get_birth_death_rates(node, sampfrac, yule=False, include_root=False)

Estimate birth-death rates from a subtree.

These birth and death rates descend from node with sampling fraction sampfrac.

Optionally restrict to a Yule pure-birth model.

Source code in tact/tree_util.py
13
14
15
16
17
18
19
20
21
22
23
def get_birth_death_rates(node, sampfrac, yule=False, include_root=False):
    """Estimate birth-death rates from a subtree.

    These birth and death rates descend from `node` with sampling fraction `sampfrac`.

    Optionally restrict to a Yule pure-birth model.
    """
    if yule:
        return optim_yule(get_ages(node, include_root), sampfrac)

    return optim_bd(get_ages(node, include_root), sampfrac)

get_min_age(node)

Gets the minimum possible age that could be generated in a clade under node.

This assumes that grafts to locked edges are restricted.

Source code in tact/tree_util.py
215
216
217
218
219
220
221
222
223
224
225
226
227
228
def get_min_age(node):
    """Gets the minimum possible age that could be generated in a clade under `node`.

    This assumes that grafts to locked edges are restricted.
    """
    interval = get_age_intervals(node)

    if not interval.atomic:
        raise DisjointConstraintError(f"Constraint on {node} implies disjoint interval {interval}")

    if interval.empty:
        return 0.0

    return interval.lower

get_monophyletic_node(tree, species)

Returns the node or None that is the MRCA of the species in tree.

Source code in tact/tree_util.py
26
27
28
29
30
31
32
def get_monophyletic_node(tree, species):
    """Returns the node or None that is the MRCA of the `species` in `tree`."""
    mrca = tree.mrca(taxon_labels=species)
    if mrca and species.issuperset(get_tip_labels(mrca)):
        return mrca

    return None

get_short_branches(node)

Yields an iterator of especially short edges under node.

Source code in tact/tree_util.py
109
110
111
112
113
def get_short_branches(node):
    """Yields an iterator of especially short edges under `node`."""
    for edge in edge_iter(node):
        if edge.length <= 0.001:
            yield edge

get_tip_labels(tree_or_node)

Returns a set of tip labels for a node or tree.

Source code in tact/tree_util.py
47
48
49
50
51
52
def get_tip_labels(tree_or_node):
    """Returns a `set` of tip labels for a node or tree."""
    try:
        return {x.taxon.label for x in tree_or_node.leaf_node_iter()}
    except AttributeError:
        return {x.taxon.label for x in tree_or_node.leaf_iter()}

get_tree(path, namespace=None)

Gets a DendroPy tree from a path and precalculate its node ages and bipartition bitmask.

Source code in tact/tree_util.py
68
69
70
71
72
def get_tree(path, namespace=None):
    """Gets a DendroPy tree from a path and precalculate its node ages and bipartition bitmask."""
    tree = dendropy.Tree.get_from_path(path, schema="newick", taxon_namespace=namespace, rooting="default-rooted")
    update_tree_view(tree)
    return tree

graft_node(graft_recipient, graft, stem=False)

Grafts a node graft randomly in the subtree below node graft_recipient.

The attribute graft.age must be set so we know where the best place is to graft the node. The node graft can optionally have child nodes, in this case the edge.length attribute should be set on all child nodes if the tree is to remain ultrametric.

We graft things "below" a node by picking one of the children of that node and forcing it to be sister to the grafted node and adjusting the edge lengths accordingly. Therefore, the node above which the graft lives (i.e., the one that will be the child of the new graft) must fulfill the following requirements:

  1. Must not be the crown node (cannot graft things above crown node)
  2. Must be younger than the graft node (no negative branches)
  3. Seed node must be older than graft node (no negative branches)
  4. Must not be locked (intruding on monophyly)
Source code in tact/tree_util.py
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
def graft_node(graft_recipient, graft, stem=False):
    """Grafts a node `graft` randomly in the subtree below node `graft_recipient`.

    The attribute `graft.age` must be set so we know where the best place is to graft the node.
    The node `graft` can optionally have child nodes, in this case the `edge.length` attribute
    should be set on all child nodes if the tree is to remain ultrametric.

    We graft things "below" a node by picking one of the children
    of that node and forcing it to be sister to the grafted node
    and adjusting the edge lengths accordingly. Therefore, the node
    *above* which the graft lives (i.e., the one that will be the child
    of the new graft) must fulfill the following requirements:

    1. Must not be the crown node (cannot graft things above crown node)
    2. Must be younger than the graft node (no negative branches)
    3. Seed node must be older than graft node (no negative branches)
    4. Must not be locked (intruding on monophyly)
    """

    def filter_fn(x):
        return x.head_node.age <= graft.age and x.head_node.parent_node.age >= graft.age and x.label != "locked"

    all_edges = list(edge_iter(graft_recipient))
    if stem:
        # also include the crown node's subtending edge
        all_edges.append(graft_recipient.edge)
    eligible_edges = [x for x in all_edges if filter_fn(x)]

    if not eligible_edges:
        raise Exception(f"could not place node {graft} in clade {graft_recipient}")

    focal_node = random.choice([x.head_node for x in eligible_edges])
    seed_node = focal_node.parent_node
    sisters = focal_node.sibling_nodes()

    # pick a child edge and detach its corresponding node
    #
    # DendroPy's Node.remove_child() messes with the edge lengths.
    # But, Node.clear_child_nodes() simply cuts that bit of the tree out.
    seed_node.clear_child_nodes()

    # set the correct edge length on the grafted node and make the grafted
    # node a child of the seed node
    graft.edge.length = seed_node.age - graft.age
    if graft.edge.length < 0:
        raise Exception("negative branch length")
    sisters.append(graft)
    seed_node.set_child_nodes(sisters)

    # make the focal node a child of the grafted node and set edge length
    focal_node.edge.length = graft.age - focal_node.age
    if focal_node.edge.length < 0:
        raise Exception("negative branch length")
    graft.add_child(focal_node)

    # return the (potentially new) crown of the clade
    if graft_recipient.parent_node == graft:
        return graft
    return graft_recipient

is_binary(node)

Is the subtree under node a fully bifurcating tree?

Source code in tact/tree_util.py
87
88
89
90
91
92
def is_binary(node):
    """Is the subtree under `node` a fully bifurcating tree?"""
    for internal_node in node.preorder_internal_node_iter():
        if len(internal_node.child_nodes()) != 2:
            return False
    return True

is_fully_locked(node)

Are all the edges below node locked?

Source code in tact/tree_util.py
210
211
212
def is_fully_locked(node):
    """Are all the edges below `node` locked?"""
    return all(x.label == "locked" for x in edge_iter(node))

is_ultrametric(tree, tolerance=1e-06)

Is the tree ultrametric, within a specified tolerance?

Uses the relative difference between minimum and maximum root-to-tip distances.

Source code in tact/tree_util.py
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
def is_ultrametric(tree, tolerance=1e-6):
    """Is the `tree` ultrametric, within a specified `tolerance`?

    Uses the relative difference between minimum and maximum root-to-tip distances.
    """
    tree.calc_node_root_distances()
    lengths = {}
    for leaf in tree.leaf_node_iter():
        lengths[leaf.taxon.label] = leaf.root_distance
    t_min = min(lengths.items(), key=lambda x: x[1])
    t_max = max(lengths.items(), key=lambda x: x[1])
    return (math.isclose(t_min[1], t_max[1], rel_tol=tolerance), (t_min, t_max))

lock_clade(node, stem=False)

Locks a clade descending from node so future grafts will avoid locked edges.

Source code in tact/tree_util.py
189
190
191
192
193
194
def lock_clade(node, stem=False):
    """Locks a clade descending from `node` so future grafts will avoid locked edges."""
    for edge in edge_iter(node):
        edge.label = "locked"
    if stem:
        node.edge.label = "locked"

unlock_clade(node, stem=False)

Unlocks a clade descending from node so new tips can be grafted to its edges.

Source code in tact/tree_util.py
197
198
199
200
201
202
def unlock_clade(node, stem=False):
    """Unlocks a clade descending from `node` so new tips can be grafted to its edges."""
    for edge in edge_iter(node):
        edge.label = ""
    if stem:
        node.edge.label = ""

update_tree_view(tree)

Perform an in-place update of a DendroPy tree object with node ages and bipartition bitmask.

We also correct for minor ultrametricity errors.

Returns a list of tip labels.

Source code in tact/tree_util.py
75
76
77
78
79
80
81
82
83
84
def update_tree_view(tree):
    """Perform an in-place update of a DendroPy tree object with node ages and bipartition bitmask.

    We also correct for minor ultrametricity errors.

    Returns a list of tip labels.
    """
    tree.calc_node_ages(is_force_max_age=True)
    tree.update_bipartitions()
    return get_tip_labels(tree)

FastMRCA

Functions in tact/fastmrca.py.

Singleton object that helps speed up MRCA lookups.

bitmask(labels)

Gets a bitmask for the taxa in labels, potentially in parallel.

Source code in tact/fastmrca.py
14
15
16
17
18
def bitmask(labels):
    """Gets a bitmask for the taxa in `labels`, potentially in parallel."""
    global tree
    tn = tree.taxon_namespace
    return tn.taxa_bitmask(labels=labels)

fastmrca_getter(tn, x)

Helper function for submitting stuff.

Source code in tact/fastmrca.py
31
32
33
34
35
36
37
def fastmrca_getter(tn, x):
    """Helper function for submitting stuff."""
    taxa = tn.get_taxa(labels=x)
    mask = 0
    for taxon in taxa:
        mask |= tn.taxon_bitmask(taxon)
    return mask

get(labels)

Pulls a MRCA node out for the taxa in labels.

Source code in tact/fastmrca.py
21
22
23
24
25
26
27
28
def get(labels):
    """Pulls a MRCA node out for the taxa in `labels`."""
    global tree
    labels = set(labels)
    mrca = tree.mrca(leafset_bitmask=bitmask(labels))
    if mrca and labels.issuperset(get_tip_labels(mrca)):
        return mrca
    return None

initialize(phy)

Initialize the fastmrca singleton with a tree.

Source code in tact/fastmrca.py
 8
 9
10
11
def initialize(phy):
    """Initialize the fastmrca singleton with a tree."""
    global tree
    tree = phy

Validation

Functions in tact/validation.py.

Various validation functions for click classes and parameters.

BackboneCommand

Bases: Command

Helper class to validate a Click Command that contains a backbone tree.

At a minimum, the Command must contain a backbone parameter, which is validated by validate_newick and checked to ensure it is a binary tree.

If the command also contains a taxonomy parameter, representing a taxonomic phylogeny, this is also validated to ensure that the DendroPy TaxonNamespace is non-strict superset of the taxa contained in backbone. An optional outgroups parameter may add other taxa not in the taxonomy.

If the command also contains an ultrametricity_precision parameter, the ultrametricity of the backbone is also checked.

Source code in tact/validation.py
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
class BackboneCommand(click.Command):
    """Helper class to validate a Click Command that contains a backbone tree.

    At a minimum, the Command must contain a `backbone` parameter, which is validated by `validate_newick`
    and checked to ensure it is a binary tree.

    If the command also contains a `taxonomy` parameter, representing a taxonomic phylogeny,
    this is also validated to ensure that the DendroPy TaxonNamespace is non-strict superset
    of the taxa contained in `backbone`. An optional `outgroups` parameter may add
    other taxa not in the `taxonomy`.

    If the command also contains an `ultrametricity_precision` parameter, the
    ultrametricity of the `backbone` is also checked.
    """

    def validate_backbone_variables(self, ctx, params):
        """Validates variables related to the backbone and taxonomy files."""
        if "taxonomy" in params:
            tn = params["taxonomy"].taxon_namespace
            tn.is_mutable = True
            if params.get("outgroups"):
                tn.new_taxa(params["outgroups"])
            tn.is_mutable = False
            try:
                backbone = validate_newick(ctx, params, params["backbone"], taxon_namespace=tn)
            except dendropy.utility.error.ImmutableTaxonNamespaceError as e:
                msg = f"""
                DendroPy error: {e}

                This usually indicates your backbone has species that are not present in your
                taxonomy. Outgroups not in the taxonomy can be excluded with the --outgroups argument.
                """
                raise click.BadParameter(msg) from None
        else:
            backbone = validate_newick(ctx, params, params["backbone"])

        if not is_binary(backbone):
            raise click.BadParameter("Backbone tree is not binary!")
        update_tree_view(backbone)

        if "ultrametricity_precision" in params:
            ultra, res = is_ultrametric(backbone, params["ultrametricity_precision"])
            if not ultra:
                msg = f"""
                Tree is not ultrametric!
                {res[0][0]} has a root distance of {res[0][1]}, but {res[1][0]} has {res[1][1]}

                Increase `--ultrametricity-precision` or use phytools::force.ultrametric in R
                """
                raise click.BadParameter(msg) from None

        params["backbone"] = backbone
        return params

    def make_context(self, *args, **kwargs):
        """Set up the proper Click context for a command handler."""
        ctx = super().make_context(*args, **kwargs)
        ctx.params = self.validate_backbone_variables(ctx, ctx.params)
        return ctx

make_context(*args, **kwargs)

Set up the proper Click context for a command handler.

Source code in tact/validation.py
102
103
104
105
106
def make_context(self, *args, **kwargs):
    """Set up the proper Click context for a command handler."""
    ctx = super().make_context(*args, **kwargs)
    ctx.params = self.validate_backbone_variables(ctx, ctx.params)
    return ctx

validate_backbone_variables(ctx, params)

Validates variables related to the backbone and taxonomy files.

Source code in tact/validation.py
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
def validate_backbone_variables(self, ctx, params):
    """Validates variables related to the backbone and taxonomy files."""
    if "taxonomy" in params:
        tn = params["taxonomy"].taxon_namespace
        tn.is_mutable = True
        if params.get("outgroups"):
            tn.new_taxa(params["outgroups"])
        tn.is_mutable = False
        try:
            backbone = validate_newick(ctx, params, params["backbone"], taxon_namespace=tn)
        except dendropy.utility.error.ImmutableTaxonNamespaceError as e:
            msg = f"""
            DendroPy error: {e}

            This usually indicates your backbone has species that are not present in your
            taxonomy. Outgroups not in the taxonomy can be excluded with the --outgroups argument.
            """
            raise click.BadParameter(msg) from None
    else:
        backbone = validate_newick(ctx, params, params["backbone"])

    if not is_binary(backbone):
        raise click.BadParameter("Backbone tree is not binary!")
    update_tree_view(backbone)

    if "ultrametricity_precision" in params:
        ultra, res = is_ultrametric(backbone, params["ultrametricity_precision"])
        if not ultra:
            msg = f"""
            Tree is not ultrametric!
            {res[0][0]} has a root distance of {res[0][1]}, but {res[1][0]} has {res[1][1]}

            Increase `--ultrametricity-precision` or use phytools::force.ultrametric in R
            """
            raise click.BadParameter(msg) from None

    params["backbone"] = backbone
    return params

validate_newick(ctx, param, value, **kwargs)

Validates a Newick tree, using appropriate defaults.

Source code in tact/validation.py
23
24
25
def validate_newick(ctx, param, value, **kwargs):
    """Validates a Newick tree, using appropriate defaults."""
    return dendropy.Tree.get_from_stream(value, schema="newick", rooting="default-rooted", **kwargs)

validate_outgroups(ctx, param, value)

Validates an outgroups parameter, by splitting on commas and transforming underscores to spaces.

Source code in tact/validation.py
11
12
13
14
15
16
17
18
19
20
def validate_outgroups(ctx, param, value):
    """Validates an `outgroups` parameter, by splitting on commas and transforming underscores to spaces."""
    if value is None:
        return
    try:
        value = value.split(",")
    except AttributeError:
        # Tuples and lists shouldn't have the .split method
        pass
    return [x.replace("_", " ") for x in value]

validate_taxonomy_tree(ctx, param, value)

Validates a taxonomy tree.

Source code in tact/validation.py
42
43
44
45
def validate_taxonomy_tree(ctx, param, value):
    """Validates a taxonomy tree."""
    value = validate_newick(ctx, param, value)
    return validate_tree_node_depths(ctx, param, value)

validate_tree_node_depths(ctx, param, value)

Validates a DendroPy tree, ensuring that the node depth is equal for all tips.

Source code in tact/validation.py
28
29
30
31
32
33
34
35
36
37
38
39
def validate_tree_node_depths(ctx, param, value):
    """Validates a DendroPy tree, ensuring that the node depth is equal for all tips."""
    node_depths = compute_node_depths(value)
    stats = collections.defaultdict(int)
    for v in node_depths.values():
        stats[v] += 1
    if len(stats) > 1:
        msg = "The tips of your taxonomy tree do not have equal numbers of ranked clades in their ancestor chain:\n"
        for k in sorted(stats.keys()):
            msg += f"* {stats[k]} tips have {k} ranked ancestors\n"
        raise click.BadParameter(msg)
    return value

Exceptions

Functions in tact/exceptions.py.

Exceptions used by TACT.

DisjointConstraintError

Bases: TactError

Exception raised when a set of constraints lead to a disjoint implied age interval.

Source code in tact/exceptions.py
8
9
class DisjointConstraintError(TactError):
    """Exception raised when a set of constraints lead to a disjoint implied age interval."""

TactError

Bases: Exception

Base class for errors raised by TACT.

Source code in tact/exceptions.py
4
5
class TactError(Exception):
    """Base class for errors raised by TACT."""