# 11 Other ggtree Extensions

The ggtree package is a general package for visualizing tree structures and associated data. If you have special requirements that ggtree does not directly address, you may need one of the extension packages built on top of it. Examples include the RevGadgets package for visualizing the output of RevBayes, the sitePath package for visualizing fixation events on phylogenetic pathways, and the enrichplot package for visualizing the hierarchical structure of enriched pathways.

rp <- BiocManager::repositories()
db <- utils::available.packages(repos=rp)
# list packages whose Depends or Imports fields mention ggtree
x <- tools::package_dependencies('ggtree', db=db,
                                 which = c("Depends", "Imports"),
                                 reverse=TRUE)
print(x)
## $ggtree
##  [1] "enrichplot"        "ggtreeExtra"
##  [3] "LymphoSeq"         "miaViz"
##  [5] "microbiomeMarker"  "MicrobiotaProcess"
##  [7] "philr"             "singleCellTK"
##  [9] "sitePath"          "systemPipeTools"
## [11] "tanggle"           "treekoR"

There are 12 packages on CRAN or Bioconductor that depend on or import ggtree, and several packages on GitHub that extend ggtree. Here we briefly introduce some of these extension packages, including MicrobiotaProcess and tanggle.
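The reverse-dependency lookup can also be tried offline on a small, hand-built package database. The following is a minimal sketch: the matrix `db` and the package names `pkgA`, `pkgB`, and `pkgC` are hypothetical stand-ins for the result of `utils::available.packages()`, and only the `Package`, `Depends`, and `Imports` columns are supplied.

```r
# Toy package database; every package name except ggtree is hypothetical.
db <- rbind(
  c(Package = "ggtree", Depends = NA,       Imports = "ggplot2"),
  c(Package = "pkgA",   Depends = "ggtree", Imports = NA),
  c(Package = "pkgB",   Depends = NA,       Imports = "ggtree, ape"),
  c(Package = "pkgC",   Depends = NA,       Imports = "ape")
)

# Packages that list ggtree in their Depends or Imports field
rev_deps <- tools::package_dependencies("ggtree", db = db,
                                        which = c("Depends", "Imports"),
                                        reverse = TRUE)

# Number of reverse dependencies found for ggtree
n_rev <- length(rev_deps$ggtree)
print(rev_deps$ggtree)
```

With a real repository index, the same `length()` call gives the package count quoted in the text.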

## 11.1 Taxonomy Annotation Using MicrobiotaProcess

The MicrobiotaProcess package provides a LEfSe-like algorithm to discover microbiome biomarkers by comparing taxon abundance between different classes. It provides several methods to visualize the analysis result, one of which, ggdiffclade(), is built on ggtree. In addition to the diff_analysis() result, ggdiffclade() also accepts a data frame that contains a hierarchical relationship (e.g., taxonomy annotation or KEGG annotation) together with another data frame that contains taxa and factor information and/or p-values. The following example demonstrates how to use data frames (i.e., analysis results) to visualize the differential taxonomy tree. More details can be found in the vignette of the MicrobiotaProcess package.

library(MicrobiotaProcess)
library(ggplot2)
library(TDbook)

# load df_alltax_info and df_difftax from TDbook
taxa <- df_alltax_info
dt <- df_difftax

ggdiffclade(obj=taxa,
            nodedf=dt,
            factorName="DIAGNOSIS",
            skpointsize=0.6,
            linewd=0.2,
            taxlevel=3,
            # This argument is to remove the branch of unknown taxonomy.
            reduce=TRUE) +
    scale_fill_manual(values=c("#00AED7", "#009E73")) +
    guides(color = guide_legend(keywidth = 0.1, keyheight = 0.6,
                                order = 3, ncol=1)) +
    theme(panel.background=element_rect(fill=NA),
          legend.position="right",
          plot.margin=margin(0,0,0,0),
          legend.spacing.y=unit(0.02, "cm"),
          legend.title=element_text(size=7.5),
          legend.text=element_text(size=5.5),
          legend.box.spacing=unit(0.02,"cm")
    )

The data frames in this example come from the result of diff_analysis() applied to public datasets. The colors represent the features enriched in the relevant class groups. The size of each circle represents -log10(p-value), i.e., a larger point indicates greater significance. In Figure 11.1, we can see that Fusobacterium sequences were enriched in carcinomas, while Firmicutes, Bacteroides, and Clostridiales were greatly reduced in tumors. These results are consistent with the original article. Campylobacter species have been shown to be associated with colorectal cancer. Figure 11.1 also shows that Campylobacter was enriched in tumors, although its relative abundance is lower than that of Fusobacterium.
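To make the point-size mapping concrete, the short sketch below applies the -log10 transform to a few p-values. The values in `pvalues` are hypothetical and are not taken from the dataset used in the figure.

```r
# Hypothetical p-values, ordered from least to most significant
pvalues <- c(0.05, 0.01, 0.001)

# The transform behind the point sizes: smaller p-value, larger value
point_size <- -log10(pvalues)
print(round(point_size, 2))

# Sizes increase as p-values decrease
stopifnot(all(diff(point_size) > 0))
```

Because the transform is logarithmic, each tenfold decrease in the p-value increases the mapped value by exactly one unit.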

## 11.2 Visualizing Phylogenetic Network Using Tanggle

The tanggle package provides functions to display a split network. It extends the ggtree package to allow the visualization of phylogenetic networks (Figure 11.2).

library(ggplot2)
library(ggtree)
library(tanggle)

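file <- system.file("extdata/trees/woodmouse.nxs", package = "phangorn")
# read the split network stored in the Nexus file
Nnet <- phangorn::read.nexus.networx(file)

# draw the network, label the tips, and expand the plot so labels fit
ggsplitnet(Nnet) +
    geom_tiplab2() +
    ggexpand(.1) + ggexpand(.1, direction=-1)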