# 11 Other ggtree extensions

rp <- BiocManager::repositories()
db <- utils::available.packages(repos=rp)
x <- tools::package_dependencies('ggtree', db=db,
                                 which = c("Depends", "Imports"),
                                 reverse=TRUE)
print(x)
## $ggtree
## [1] "LymphoSeq"         "MicrobiotaProcess"
## [3] "philr"             "singleCellTK"
## [5] "sitePath"          "genBaRcode"
## [7] "harrietr"          "RAINBOWR"
## [9] "STraTUS"

Nine packages on CRAN or Bioconductor depend on or import ggtree, and several packages on GitHub also extend ggtree.
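To see how the reverse-dependency lookup behaves without querying a live repository, the following is a minimal sketch that runs the same `tools::package_dependencies()` call on a hand-made package database. The packages `pkgA`, `pkgB` and `pkgC` are hypothetical and exist only for illustration.

```r
# Toy package database; pkgA, pkgB and pkgC are hypothetical packages.
db <- matrix(
  c("ggtree", NA,       "ggplot2",
    "pkgA",   "ggtree", NA,
    "pkgB",   NA,       "ggtree, ape",
    "pkgC",   NA,       "ape"),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Package", "Depends", "Imports"))
)
# Mirror the layout of available.packages(), which uses package names as row names
rownames(db) <- db[, "Package"]

# Packages that list ggtree under Depends or Imports
rev_deps <- tools::package_dependencies("ggtree", db = db,
                                        which = c("Depends", "Imports"),
                                        reverse = TRUE)

# Number of reverse dependencies found in this toy database
n_rev <- length(rev_deps$ggtree)
print(rev_deps$ggtree)
```

With a real repository database, `length(x$ggtree)` gives the package count quoted in the text, and that count will change as repositories are updated.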

## 11.1 Taxonomy annotation using MicrobiotaProcess

The MicrobiotaProcess package provides a LEfSe-like algorithm (Segata et al. 2011) to discover microbiome biomarkers by comparing taxon abundance between different classes. It also provides several methods to visualize the analysis results. The ggdiffclade() function is built on ggtree (Yu et al. 2017). In addition to the diff_analysis() result, it also accepts a data frame that contains hierarchical relationships (e.g. taxonomy annotation or KEGG annotation) together with another data frame that contains taxa and factor information and/or p-values. The following example demonstrates how to use data frames (i.e. analysis results) to visualize the differential taxonomy tree. More details can be found in the vignette of the MicrobiotaProcess package.
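As a rough illustration of the two inputs described above, the sketch below builds a small hierarchy table and a matching per-taxon result table in base R. The column names (`Phylum`, `Class`, `Order`, `f`, `pvalue`), the group labels and the p-values are assumptions made for this sketch, not the layout required by MicrobiotaProcess; the package vignette documents the expected format.

```r
# Hierarchical relationships: one row per lineage, one column per rank.
# Column names are illustrative assumptions.
toy_taxa <- data.frame(
  Phylum = c("p__Fusobacteria", "p__Firmicutes"),
  Class  = c("c__Fusobacteriia", "c__Clostridia"),
  Order  = c("o__Fusobacteriales", "o__Clostridiales"),
  stringsAsFactors = FALSE
)

# Per-taxon results: feature label, group in which it is enriched, p-value.
# Group labels and p-values are made up for this example.
toy_nodedf <- data.frame(
  f         = c("o__Fusobacteriales", "o__Clostridiales"),
  DIAGNOSIS = c("Tumor", "Healthy"),
  pvalue    = c(0.001, 0.01),
  stringsAsFactors = FALSE
)

# Every annotated feature should appear somewhere in the hierarchy,
# otherwise it cannot be placed on the tree.
all_found <- all(toy_nodedf$f %in% unlist(toy_taxa))
```

Checking that the two tables agree before plotting is a cheap way to catch label mismatches such as missing rank prefixes.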

library(MicrobiotaProcess)
library(ggplot2)

# `taxa` (the hierarchical taxonomy data frame) is an assumed name; its
# definition and that of `dt` (the diff_analysis() result table) are not shown here.
ggdiffclade(obj=taxa,
            nodedf=dt,
            factorName="DIAGNOSIS",
            skpointsize=0.6,
            linewd=0.2,
            taxlevel=3,
            reduce=TRUE) + # reduce=TRUE removes branches of unknown taxonomy.
    scale_fill_manual(values=c("#00AED7", "#009E73")) +
    guides(color = guide_legend(keywidth = 0.1, keyheight = 0.6,
                                order = 3, ncol = 1)) +
    theme(panel.background=element_rect(fill=NA),
          legend.position="right",
          plot.margin=margin(0,0,0,0),
          legend.spacing.y=unit(0.02, "cm"),
          legend.title=element_text(size=7.5), # Adjust for different output devices.
          legend.text=element_text(size=5.5),
          legend.box.spacing=unit(0.02,"cm")
    )

The data frame in this example comes from the diff_analysis() result on public datasets (Kostic et al. 2012). The colors represent the features enriched in the relevant class groups. The size of the circle points represents -log10(pvalue), i.e. a larger point indicates greater significance. In Figure 11.1, we can see that Fusobacterium sequences were enriched in carcinomas, while Firmicutes, Bacteroides and Clostridiales were depleted in tumors. These results are consistent with the original article (Kostic et al. 2012). Campylobacter species have been shown to be associated with colorectal cancer (He et al. 2019; Wu et al. 2013; Amer et al. 2017). Figure 11.1 also shows that Campylobacter was enriched in tumors, although its relative abundance is lower than that of Fusobacterium.
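The point-size scaling mentioned above can be worked through with a few numbers. This is a minimal sketch with hypothetical p-values, not values taken from the analysis; it only shows why smaller p-values map to larger points.

```r
# Hypothetical p-values for three taxa (not taken from the analysis above)
pvalue <- c(taxonA = 0.05, taxonB = 0.001, taxonC = 1e-6)

# Point size follows -log10(pvalue): the smaller the p-value, the larger the point
point_size <- -log10(pvalue)
print(round(point_size, 2))
```

Because the transform is logarithmic, each tenfold drop in the p-value adds one unit to the point size.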

## 11.2 Visualizing phylogenetic network

The ggnetworx package visualizes phylogenetic networks using ggplot2 and ggtree.

library(ggplot2)
library(ggtree)
library(ggnetworx)

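file <- system.file("extdata/trees/woodmouse.nxs", package = "phangorn")
# Assumed reconstruction: the lines that read and draw the network are not
# shown in the source, so the calls up to ggexpand() are a sketch.
Nnet <- phangorn::read.nexus.networx(file)

ggplot(Nnet, aes(x, y)) + geom_splitnet() + theme_tree() +
    geom_tiplab2(aes(color=label), hjust=-.1) +
    theme(legend.position="none") +
    ggexpand(.1) + ggexpand(.1, direction=-1)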