6  WikiPathways analysis

WikiPathways is a continuously updated pathway database curated by a community of researchers and pathway enthusiasts. It produces monthly releases of GMT files for supported organisms at data.wikipathways.org. The clusterProfiler package (Yu et al. 2012) supports both over-representation analysis (ORA) and gene set enrichment analysis (GSEA) of WikiPathways through the enrichWP() and gseWP() functions, which automatically download and parse the latest WikiPathways GMT file for the selected organism.
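
To make the parsing step more concrete, here is a minimal base R sketch of what reading a WikiPathways GMT file involves. It assumes the layout used in WikiPathways GMT releases: each tab-separated line starts with a term name of the form name%version%wpid%organism, followed by a URL and then the Entrez gene IDs. The two input lines are illustrative only (the version tag and URLs are placeholders, and the gene lists are short subsets), and parse_wp_gmt() is a hypothetical helper, not a clusterProfiler function.

```r
# Two illustrative GMT lines; version tag and URLs are placeholders.
gmt_lines <- c(
  paste("Cell cycle%WikiPathways_EXAMPLE%WP179%Homo sapiens",
        "http://example.org/WP179",
        "890", "891", "9133", sep = "\t"),
  paste("DNA replication%WikiPathways_EXAMPLE%WP466%Homo sapiens",
        "http://example.org/WP466",
        "8318", "55388", sep = "\t")
)

# Hypothetical helper: turn GMT lines into a long data frame,
# one row per (pathway, gene) pair.
parse_wp_gmt <- function(lines) {
  fields <- strsplit(lines, "\t", fixed = TRUE)
  rows <- lapply(fields, function(f) {
    # The first field packs four values separated by "%".
    term <- strsplit(f[1], "%", fixed = TRUE)[[1]]
    data.frame(
      name    = term[1],
      version = term[2],
      wpid    = term[3],
      species = term[4],
      gene    = f[-(1:2)],   # drop the term name and the URL
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

wp <- parse_wp_gmt(gmt_lines)

# Term-to-gene and term-to-name tables in the shape generic
# enrichment functions usually expect.
wpid2gene <- wp[, c("wpid", "gene")]
wpid2name <- unique(wp[, c("wpid", "name")])
```

For a GMT file that has already been downloaded, clusterProfiler also provides read.gmt.wp(), and tables like wpid2gene and wpid2name can be supplied to enricher() or GSEA() through their TERM2GENE and TERM2NAME arguments.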

Supported organisms can be listed with get_wp_organisms():

library(clusterProfiler)
get_wp_organisms()
 [1] "Anopheles gambiae"        "Arabidopsis thaliana"    
 [3] "Bos taurus"               "Caenorhabditis elegans"  
 [5] "Canis familiaris"         "Danio rerio"             
 [7] "Drosophila melanogaster"  "Equus caballus"          
 [9] "Gallus gallus"            "Homo sapiens"            
[11] "Mus musculus"             "Pan troglodytes"         
[13] "Populus trichocarpa"      "Rattus norvegicus"       
[15] "Saccharomyces cerevisiae" "Solanum lycopersicum"    
[17] "Sus scrofa"               "Zea mays"                

6.1 WikiPathways over-representation analysis

data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]

enrichWP(gene, organism = "Homo sapiens") 
#
# over-representation test
#
#...@organism    Homo sapiens 
#...@ontology    WikiPathways 
#...@keytype     ENTREZID 
#...@gene    chr [1:207] "4312" "8318" "10874" "55143" "55388" "991" "6280" "2305" ...
#...pvalues adjusted by 'BH' with cutoff < 0.05
#...7 enriched terms found
'data.frame':   7 obs. of  12 variables:
 $ ID            : chr  "WP2446" "WP2361" "WP3942" "WP179" ...
 $ Description   : chr  "Retinoblastoma gene in cancer" "Gastric cancer network 1" "PPAR signaling" "Cell cycle" ...
 $ GeneRatio     : chr  "11/122" "6/122" "7/122" "10/122" ...
 $ BgRatio       : chr  "89/9008" "28/9008" "50/9008" "120/9008" ...
 $ RichFactor    : num  0.1236 0.2143 0.14 0.0833 0.2857 ...
 $ FoldEnrichment: num  9.13 15.82 10.34 6.15 21.1 ...
 $ zScore        : num  9.03 9.2 7.76 6.66 8.82 ...
 $ pvalue        : num  2.67e-08 1.61e-06 4.33e-06 4.74e-06 2.89e-05 ...
 $ p.adjust      : num  2.13e-05 6.42e-04 9.45e-04 9.45e-04 4.60e-03 ...
 $ qvalue        : num  2.13e-05 6.42e-04 9.45e-04 9.45e-04 4.60e-03 ...
 $ geneID        : chr  "890/891/9133/8318/1111/81620/7272/24137/983/6241/7153" "11065/6286/22974/4605/6790/7153" "5346/2167/4312/5105/3158/9370/9415" "890/891/9133/8318/1111/991/7272/983/9232/4174" ...
 $ Count         : int  11 6 7 10 4 7 8
#...Citation
S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize multiomics data. Nature Protocols. 2024, 19(11):3292-3320 
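
The ratio columns in this output can be reproduced by hand, which helps when interpreting them. The sketch below uses the numbers printed for the first term (WP2446) and assumes the standard hypergeometric formulation of the over-representation test. The variable names are illustrative. Note that the GeneRatio denominator (122) is smaller than the 207 input genes, presumably because only genes annotated to at least one pathway are counted.

```r
k <- 11    # input genes in WP2446 (numerator of GeneRatio)
n <- 122   # input genes annotated to any pathway (denominator of GeneRatio)
M <- 89    # background genes in WP2446 (numerator of BgRatio)
N <- 9008  # background genes annotated to any pathway (denominator of BgRatio)

# RichFactor: fraction of the pathway's genes found in the input list.
rich_factor <- k / M

# FoldEnrichment: GeneRatio divided by BgRatio.
fold_enrichment <- (k / n) / (M / N)

# Hypergeometric upper tail: probability of seeing k or more
# pathway genes when drawing n genes from the background.
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

round(rich_factor, 4)      # 0.1236, as in the RichFactor column
round(fold_enrichment, 2)  # 9.13, as in the FoldEnrichment column
p_value                    # compare with the pvalue column
```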

6.2 WikiPathways gene set enrichment analysis

gseWP(geneList, organism = "Homo sapiens")
#
# Gene Set Enrichment Analysis
#
#...@organism    Homo sapiens 
#...@setType     WikiPathways 
#...@keytype     ENTREZID 
#...@geneList    Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm    1000 
#...pvalues adjusted by 'BH' with cutoff < 0.05
#...168 enriched terms found
'data.frame':   168 obs. of  11 variables:
 $ ID             : chr  "WP2446" "WP466" "WP179" "WP2361" ...
 $ Description    : chr  "Retinoblastoma gene in cancer" "DNA replication" "Cell cycle" "Gastric cancer network 1" ...
 $ setSize        : int  84 42 111 23 62 41 78 36 63 61 ...
 $ enrichmentScore: num  0.731 0.792 0.663 0.837 0.665 ...
 $ NES            : num  2.89 2.76 2.75 2.54 2.48 ...
 $ pvalue         : num  2.79e-12 2.76e-12 2.92e-12 6.16e-09 1.27e-09 ...
 $ p.adjust       : num  7.77e-10 7.77e-10 7.77e-10 8.18e-07 2.52e-07 ...
 $ qvalue         : num  6.41e-10 6.41e-10 6.41e-10 6.75e-07 2.08e-07 ...
 $ rank           : int  1333 1002 1234 854 1111 1750 1627 2454 1961 2622 ...
 $ leading_edge   : chr  "tags=51%, list=11%, signal=46%" "tags=55%, list=8%, signal=51%" "tags=40%, list=10%, signal=36%" "tags=52%, list=7%, signal=49%" ...
 $ core_enrichment: chr  "8318/9133/7153/6241/890/983/81620/7272/1111/891/24137/993/898/4998/10733/9134/4175/4173/6502/5984/994/7298/3015"| __truncated__ "8318/55388/81620/4174/4171/990/23594/4998/4175/4173/10926/5984/5111/51053/8317/5427/23649/4176/5982/5557/5558/4172/5424" "8318/991/9133/890/983/7272/1111/891/4174/9232/4171/993/990/5347/9700/898/23594/4998/9134/4175/4173/10926/6502/9"| __truncated__ "4605/7153/11065/22974/6286/6790/1894/56992/4173/1063/9585/8607" ...
#...Citation
S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize multiomics data. Nature Protocols. 2024, 19(11):3292-3320 
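
gseWP() expects the same kind of input as the geneList shown above: a named numeric vector, with Entrez gene IDs as names, sorted in decreasing order. As a minimal sketch, assuming your statistics live in a data frame (the deg table, its column names, and its values are made up for illustration), such a vector could be built like this:

```r
# Toy differential expression table; the values are invented.
deg <- data.frame(
  entrez = c("991", "4312", "2305", "8318"),
  log2fc = c(2.4, 4.6, -1.3, 4.5),
  stringsAsFactors = FALSE
)

# Numeric statistic, named by Entrez gene ID ...
ranked <- deg$log2fc
names(ranked) <- deg$entrez

# ... and sorted from most up-regulated to most down-regulated.
ranked <- sort(ranked, decreasing = TRUE)
ranked
```

The resulting vector can then be passed as the first argument to gseWP() in place of geneList.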

If your input gene IDs are not Entrez gene IDs, use the bitr() function to convert them first. To convert the Entrez gene IDs in the result to gene symbols, use the setReadable() function; see also Section 17.2.
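
As a sketch of both conversions, assuming the human annotation package org.Hs.eg.db is installed and that the three gene symbols below stand in for your own identifiers (they are example input, not taken from the analysis above):

```r
library(clusterProfiler)
library(org.Hs.eg.db)

# Convert gene symbols to Entrez gene IDs before the analysis.
symbols <- c("MMP1", "CDC45", "NMU")   # example input
ids <- bitr(symbols, fromType = "SYMBOL", toType = "ENTREZID",
            OrgDb = org.Hs.eg.db)
head(ids)

# Run the ORA from this chapter, then map the Entrez IDs in the
# geneID column of the result to gene symbols.
data(geneList, package = "DOSE")
gene <- names(geneList)[abs(geneList) > 2]
ewp <- enrichWP(gene, organism = "Homo sapiens")
ewp_readable <- setReadable(ewp, OrgDb = org.Hs.eg.db, keyType = "ENTREZID")
head(ewp_readable)
```

bitr() may report that some input IDs could not be mapped; those genes are dropped from the returned table, so it is worth checking how many remain before running the enrichment.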

References

Yu, Guangchuang, Le-Gen Wang, Yanyan Han, and Qing-Yu He. 2012. “clusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters.” OMICS: A Journal of Integrative Biology 16 (5): 284–87. https://doi.org/10.1089/omi.2011.0118.