The ggtree mailing-list is a great place to get help, once you have created a reproducible example that illustrates your problem.
Each Bioconductor release is tied to a specific R version. Please make sure you are using the latest version of R if you want to install the latest release of Bioconductor packages, including ggtree. Beware that bugs will only be fixed in the current release and development branches. If you find a bug, please follow the guide to report it.
If you are new to R and want to use ggtree for tree visualization, please learn some basic R and ggplot2 first.
A very common issue is that users copy and paste commands without checking what the functions actually do.
system.file() is used in the ggtree package documentation to find files that ship with packages.

system.file                package:base                R Documentation

Find Names of R System Files

Description:

     Finds the full file names of files in packages etc.

Usage:

     system.file(..., package = "base", lib.loc = NULL,
                 mustWork = FALSE)
If you want to use your own files, just use a relative or absolute file path (e.g. f = "your/folder/filename").
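As a minimal sketch of the difference, the following uses only base R. The file name my_tree.nwk is hypothetical and is written to a temporary directory only to keep the example self-contained.

```r
# system.file() looks inside an installed package; every package ships a DESCRIPTION file
pkg_file <- system.file("DESCRIPTION", package = "base")
file.exists(pkg_file)

# For a file the package does not contain, it returns "" instead of raising an error
not_found <- system.file("no/such/file.nwk", package = "base")
nzchar(not_found)

# Your own file: pass its path directly (hypothetical file created here for illustration)
f <- file.path(tempdir(), "my_tree.nwk")
writeLines("((A,B),(C,D));", f)
newick_text <- readLines(f)
```

Because system.file() silently returns an empty string when nothing is found, passing the path of your own file to it usually leads to a confusing downstream error rather than a clear "file not found" message.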
For example, we can add symbolic points to nodes with geom_point().
The magic here is that we don't need to map the x and y positions of the points by providing aes(x, y) to geom_point(), since they were already mapped by the ggtree function, which serves as a global mapping for all layers.
But what if we provide a dataset in a layer and the dataset doesn't contain the columns x and y? The layer function will still try to map x and y, and also any other aesthetics you mapped in the ggtree function. As these variables are not available in your dataset, you will get the following error:
Error in eval(expr, envir, enclos) : object 'x' not found
This can be fixed by setting the parameter inherit.aes=FALSE, which disables inheriting the mapping from the ggtree function.
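The following is a minimal sketch of the problem and the fix, assuming a random tree and a hypothetical layer dataset d with columns pos and height. Here the inherited aesthetic that cannot be found is colour (mapped to branch.length) rather than x, but the mechanism is the same.

```r
library(ggtree)
library(ggplot2)

set.seed(1)
tree <- ape::rtree(5)

# colour is mapped globally here, in addition to the x and y that ggtree() maps itself
p <- ggtree(tree, aes(colour = branch.length))

# Hypothetical layer data: it has its own columns and no 'branch.length'
d <- data.frame(pos = c(0.2, 0.4), height = c(1.5, 3.5))

# Fails when the plot is built: the layer inherits colour = branch.length, which 'd' lacks
p_bad <- p + geom_point(aes(x = pos, y = height), data = d)
bad <- tryCatch(ggplot_build(p_bad), error = function(e) e)

# Works: with inherit.aes = FALSE the layer uses only the mapping given in this call
p_ok <- p + geom_point(aes(x = pos, y = height), data = d, inherit.aes = FALSE)
ok <- ggplot_build(p_ok)
```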
NEVER DO THIS.
See the explanation in the ggplot2 book, 2nd edition:
For trees in the rectangular or dendrogram layout, users can display tip labels as y-axis labels. In this case, the labels will not be truncated no matter how long they are (see Figure 4.8C).
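A minimal sketch of this approach, assuming a random tree with hypothetical long tip labels, uses the as_ylab argument of geom_tiplab():

```r
library(ggtree)

set.seed(1)
tree <- ape::rtree(6)
# Hypothetical long tip labels, for illustration only
tree$tip.label <- paste0("a_rather_long_taxon_name_", seq_along(tree$tip.label))

# Tip labels are drawn as y-axis labels instead of text inside the plot panel
p_ylab <- ggtree(tree) + geom_tiplab(as_ylab = TRUE)
```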
In this example, the tip labels displayed in Figure A.1A are truncated. This is because the units are in two different spaces (data and pixel). Users can use xlim to allocate more space for tip labels (Figure A.1B).
p + xlim(0, 0.08)
Another solution is to set clip = "off" to allow drawing outside of the plot panel. We may also need to set plot.margin to allocate more space for the margin (Figure A.1C).
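A minimal sketch of the second solution follows. The tree, the tip labels, and the right margin of 120 points are assumptions chosen for illustration, and the margin needs tuning for real labels.

```r
library(ggtree)
library(ggplot2)

set.seed(1)
tree <- ape::rtree(5)
# Hypothetical long tip labels that would be cut off at the panel border
tree$tip.label <- paste0("long_tip_label_number_", 1:5)

p <- ggtree(tree) + geom_tiplab()

# Turn clipping off so labels may extend beyond the panel,
# then widen the right margin (top, right, bottom, left; in points) to make room for them
p_clip <- p +
    coord_cartesian(clip = "off") +
    theme(plot.margin = margin(6, 120, 6, 6))
```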
If you want to modify the tip labels of a tree, you can use treeio::rename_taxa() to rename a phylo or treedata object.
tree <- read.tree(text = "((A, B), (C, D));")
d <- data.frame(label = LETTERS[1:4],
                label2 = c("sunflower", "tree", "snail", "mushroom"))
## rename_taxa use 1st column as key and 2nd column as value by default
## rename_taxa(tree, d)
rename_taxa(tree, d, label, label2) %>% write.tree
##  "((sunflower,tree),(snail,mushroom));"
If the input tree object is a treedata instance, you can use write.beast() to export the tree with its associated data to a BEAST-compatible NEXUS file.
Renaming phylogeny tip labels does not seem to be a good idea, since it may introduce problems when mapping the original sequence alignment to the tree. Personally, I recommend storing the new labels as a tip annotation in a treedata object.
tree2 <- full_join(tree, d, by = "label")
tree2
## 'treedata' S4 object'.
##
## ...@ phylo:
## Phylogenetic tree with 4 tips and 3 internal nodes.
##
## Tip labels:
## A, B, C, D
##
## Rooted; no branch lengths.
##
## with the following features available:
## 'label2'.
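As a minimal sketch of exporting such a treedata object with write.beast(), the following rebuilds a comparable object and writes it to a temporary file. The branch lengths in the tree text and the output location are assumptions made for illustration.

```r
library(treeio)

# A small tree with (assumed) branch lengths and a data frame of alternative labels
tree <- ape::read.tree(text = "((A:1, B:1):1, (C:1, D:1):1);")
d <- data.frame(label = LETTERS[1:4],
                label2 = c("sunflower", "tree", "snail", "mushroom"))

# Store the new labels as a tip annotation; the result is a treedata object
tree2 <- full_join(tree, d, by = "label")

# Export the tree together with its annotation to a NEXUS file (hypothetical location)
out <- tempfile(fileext = ".nex")
write.beast(tree2, file = out)
nexus <- readLines(out)
```

The original tip labels stay untouched in the exported tree, so the file can still be matched against the original sequence alignment.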
If you just want to show different or additional information when plotting the tree, you don't need to modify the tip labels. This can easily be done by using the %<+% operator to attach the modified version of the labels and then using geom_tiplab to display the modified version (Figure A.2).
p <- ggtree(tree) + xlim(NA, 3)
p1 <- p + geom_tiplab()
## the following command will produce identical figure of p2
## ggtree(tree2) + geom_tiplab(aes(label = label2))
p2 <- p %<+% d + geom_tiplab(aes(label=label2))
cowplot::plot_grid(p1, p2, ncol=2, labels = c("A", "B"))
If you want to format labels, you need to set parse = TRUE in geom_tiplab, and the label should be a string that can be parsed into an expression and displayed as described in ?plotmath.
For example, if a tip label contains two parts, a species name and an accession number, and we want to display the species name in italic, we can format that specific tip/node label with a plotmath expression (Figure A.3A).
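One way to do this is sketched below, under the assumption that the formatted strings are prepared in a data frame and attached with %<+%. The tree, the column name lab, and the placeholder texts "species name" and "accession number" are hypothetical.

```r
library(ggtree)
library(ggplot2)

tree <- ape::read.tree(text = "((a,(b,c)),d);")

# One plotmath string per tip; only tip 'a' gets the italic/plain two-part format.
# The other tips keep their plain names, which also parse as valid expressions.
d <- data.frame(
    label = tree$tip.label,
    lab   = c('italic("species name")~"accession number"', "b", "c", "d")
)

p_fmt <- ggtree(tree) %<+% d +
    geom_tiplab(aes(label = lab), parse = TRUE) +
    xlim(NA, 6)
```

In plotmath, ~ inserts a space between two parts, and quoted strings are shown verbatim, which is why the accession part stays upright.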
Another example, which formats all tip labels, is demonstrated in Figure A.3B:
p2 <- ggtree(tree) +
    geom_tiplab(aes(label=paste0('bold(', label, ')~italic(', node, ')')),
                parse=TRUE) +
    xlim(0, 5)
The label can be provided by a data.frame that contains related information about the taxa (Figure A.3C).
tree <- read.tree(text = "((a,(b,c)),d);")
genus <- c("Gorilla", "Pan", "Homo", "Pongo")
species <- c("gorilla", "spp.", "sapiens", "pygmaeus")
geo <- c("Africa", "Africa", "World", "Asia")
d <- data.frame(label = tree$tip.label, genus = genus,
                species = species, geo = geo)
p3 <- ggtree(tree) %<+% d + xlim(NA, 6) +
    geom_tiplab(aes(label=paste0('italic(', genus, ')~bolditalic(', species, ')~', geo)),
                parse=TRUE)
cowplot::plot_grid(p1, p2, p3, ncol=3, labels = LETTERS[1:3])
library(ggrepel)
library(ggtree)
library(treeio)  ## provides read.raxml()
raxml_file <- system.file("extdata/RAxML", "RAxML_bipartitionsBranchLabels.H3",
                          package="treeio")
raxml <- read.raxml(raxml_file)
ggtree(raxml) +
    geom_label_repel(aes(label=bootstrap, fill=bootstrap)) +
    theme(legend.position = c(.1, .8)) +
    scale_fill_viridis_c()
It's quite common to store bootstrap values as node labels in newick format. Visualizing node labels is easy using geom_text2(aes(subset = !isTip, label=label)).
If you want to display only a subset of the bootstrap values (e.g. bootstrap > 80), you can't simply use geom_text2(subset= (label > 80), label=label), because label is a character vector that contains both node labels (bootstrap values) and tip labels (taxa names). If we use geom_text2(subset=(as.numeric(label) > 80), label=label), it will also fail, since NAs are introduced by coercion. We need to convert the NAs to logical FALSE, which can be done with the following code:
nwk <- system.file("extdata/RAxML","RAxML_bipartitions.H3", package='treeio')
tr <- read.tree(nwk)
ggtree(tr) +
    geom_label2(aes(label=label,
                    subset = !is.na(as.numeric(label)) & as.numeric(label) > 80))
Another solution is to convert the bootstrap values outside of ggtree.
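A minimal sketch of this alternative is given below, assuming a small hypothetical newick string whose node labels are bootstrap values. The plot data are extracted, filtered as an ordinary data frame, and passed back to a regular layer.

```r
library(ggtree)
library(ggplot2)

# Hypothetical tree: node labels 95, 60 and 100 stand for bootstrap values
tr <- ape::read.tree(text = "((A,B)95,(C,D)60)100;")
p <- ggtree(tr)

# Work on the plot data outside of the layer call
d <- as.data.frame(p$data)
d <- d[!d$isTip, ]                        # internal nodes only, so taxa names are gone
d$label <- as.numeric(d$label)            # bootstrap values as numbers
d <- d[!is.na(d$label) & d$label > 80, ]  # drop empty labels and values <= 80

# The filtered data still carry x and y, so the layer can inherit the global mapping
p_boot <- p + geom_text(data = d, aes(label = label))
```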
ggtree() ladderizes the input tree so that the tree appears less cluttered. This is the reason why the tree visualized by ggtree() is different from the one drawn by plot.phylo(), which displays a non-ladderized tree. To disable the ladderize effect, users can pass the parameter ladderize = FALSE to the ggtree() function, as demonstrated in Figure \@ref(fig:ggtreeladderize).
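The effect can be checked with a short sketch, assuming a random tree. get_taxa_name() returns the tip labels in the order in which they are displayed, so the two calls can be compared directly.

```r
library(ggtree)

set.seed(1)
tree <- ape::rtree(8)

p_lad <- ggtree(tree) + geom_tiplab()                     # default: ladderized
p_raw <- ggtree(tree, ladderize = FALSE) + geom_tiplab()  # ladderizing disabled

# Displayed tip order for each version
order_lad <- get_taxa_name(p_lad)
order_raw <- get_taxa_name(p_raw)
```

Both vectors contain the same taxa; whether their order differs depends on the topology of the particular tree.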
The rotateConstr() function provided in ape rotates internal branches based on the specified order of the tips, and this order should be followed when plotting the tree (from bottom to top). As ggtree() ladderizes the input tree by default, users need to disable this by passing ladderize = FALSE. Then the tips of the tree will be displayed in the expected order (Figure A.7). Users can also extract the tip order displayed by ggtree() using the get_taxa_name() function, as demonstrated in section 12.6.
y <- ape::rotateConstr(x, c('t4', 't2', 't5', 't1', 't3'))
ggtree(y, ladderize = FALSE) + geom_tiplab()
When outgroups are on very long branches (Figure A.8A), we would like to keep the outgroups in the tree but ignore their branch lengths (Figure A.8B). This can easily be done by modifying the coordinates of the outgroups.
x <- read.tree("data/long-branch-example.newick")
m <- MRCA(x, 75, 76)
y <- groupClade(x, m)
p <- p1 <- ggtree(y, aes(linetype = group)) +
    geom_tiplab(size = 2) +
    theme(legend.position = 'none')
## move the two outgroup nodes to the mean x position of all nodes
p$data[p$data$node %in% c(75, 76), "x"] <- mean(p$data$x)
cowplot::plot_grid(p1, p, ncol=2)
Sometimes there are known branches that are not in the tree, and we would like to have them on the tree. Another scenario is that we have a newly sequenced species and would like to update a reference tree with this species by inferring its evolutionary position.
Users can use phytools::bind.tip() (Revell 2012) to attach a new tip to a tree. With tidytree, it is easy to add annotation that differentiates the newly introduced branch from the original ones and reflects the uncertainty about where the added branch splits off, as demonstrated in Figure A.9.
library(phytools)
library(tidytree)
library(ggplot2)
library(ggtree)
set.seed(2019-11-18)
tr <- rtree(5)
tr2 <- bind.tip(tr, 'U', edge.length = 0.1, where = 7, position=0.15)
d <- as_tibble(tr2)
d$type <- "original"
d$type[d$label == 'U'] <- 'newly introduce'
d$sd <- NA
d$sd[parent(d, 'U')$node] <- 0.05
tr3 <- as.treedata(d)
ggtree(tr3, aes(linetype=type)) + geom_tiplab() +
    geom_errorbarh(aes(xmin=x-sd, xmax=x+sd, y = y - 0.3),
                   linetype='dashed', height=0.1) +
    scale_linetype_manual(values = c("newly introduce" = "dashed",
                                     "original" = "solid")) +
    theme(legend.position=c(.8, .2))
If you want to colour specific branches or change their line types, you only need to prepare a data frame with variables that hold the branch settings (e.g. selected and unselected).
set.seed(123)
x <- rtree(10)
## binary choices of colours
d <- data.frame(node=1:Nnode2(x), colour = 'black')
d[c(2,3,14,15), 2] <- "red"
## multiple choices of line types
d2 <- data.frame(node=1:Nnode2(x), lty = 1)
d2[c(2,5,13, 14), 2] <- c(2, 3, 2,4)
p <- ggtree(x) + geom_label(aes(label=node))
p %<+% d %<+% d2 + aes(colour=I(colour), linetype=I(lty))
Users can use the gginnards package to manipulate plot elements for more complicated scenarios.
If you want to add an arbitrary point to a branch, you can use geom_point2 (which works for both external and internal nodes) to select the node at the end point of the branch via the subset aesthetic mapping and specify the horizontal position with the x = x - offset aesthetic mapping, where the offset can be an absolute value (Figure A.11A) or a proportion of the branch length (Figure A.11B).
set.seed(2020-05-20)
x <- rtree(10)
p <- ggtree(x)
p1 <- p + geom_nodepoint(aes(subset = node == 13, x = x - .1),
                         size = 5, colour = 'firebrick', shape = 21)
p2 <- p + geom_nodepoint(aes(subset = node == 13, x = x - branch.length * 0.2),
                         size = 3, colour = 'firebrick') +
    geom_nodepoint(aes(subset = node == 13, x = x - branch.length * 0.8),
                   size = 5, colour = 'steelblue')
cowplot::plot_grid(p1, p2, labels=c("A", "B"))
library(ggtree)
library(ggplot2)
set.seed(2019-05-02)
x <- rtree(30)
p <- ggtree(x) + geom_tiplab()
d <- data.frame(label = x$tip.label, value = rnorm(30))
p2 <- facet_plot(p, panel = "Dot", data = d,
                 geom = geom_point, mapping = aes(x = value))
p2 <- p2 + theme_bw() + xlim_tree(5) + xlim_expand(c(-5, 5), 'Dot')
d = data.frame(.panel = c('Tree', 'Dot'),
               lab = c("Distance", "Dot Units"),
               x=c(2.5,0), y=-2)
p2 + scale_y_continuous(limits=c(0, 31), expand=c(0,0), oob=function(x, ...) x) +
    geom_text(aes(label=lab), data=d) +
    coord_cartesian(clip='off') +
    theme(plot.margin=margin(6, 6, 40, 6))
The ggtree function plots the tree structure, and normally we add layers on top of the tree.
If we want the layers to be drawn behind the tree layer, we can reverse the order of all the layers.
p$layers <- rev(p$layers)
This question has been asked several times, and a published example can be found in https://www.ncbi.nlm.nih.gov/pubmed/27605062. Increasing the percentage of white space at the center of a circular tree is useful to avoid overlapping tip labels and to increase the readability of the tree by moving all nodes and branches further out. This can be done simply by using + xlim() to allocate more space, just like in Figure 4.3G, or by assigning a long root branch, similar to the “Root Length” parameter in FigTree.
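A minimal sketch of the xlim() approach follows, assuming a random tree. The lower limit of -10 is an arbitrary value chosen for illustration and should be scaled to the depth of your own tree.

```r
library(ggtree)
library(ggplot2)

set.seed(1)
tree <- ape::rtree(30)

# Default circular layout: branches start at the center
p_default <- ggtree(tree, layout = "circular")

# A negative lower x limit reserves empty space in the center;
# NA keeps the upper limit determined by the data
p_hollow <- ggtree(tree, layout = "circular") + xlim(-10, NA)
```

The more negative the lower limit is relative to the tree depth, the larger the share of the radius that is left empty.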
revts() reverses the x-axis by setting the most recent tip to 0. We can use scale_x_continuous(labels=abs) to label the x-axis with absolute values.
tr <- rtree(10)
p <- ggtree(tr) + theme_tree2()
p2 <- revts(p) + scale_x_continuous(labels=abs)
cowplot::plot_grid(p, p2, ncol=2, labels=c("A", "B"))