6  WikiPathways analysis

WikiPathways is a continuously updated pathway database curated by a community of researchers and pathway enthusiasts. WikiPathways produces monthly releases of GMT files for supported organisms at data.wikipathways.org. The clusterProfiler package (Yu et al. 2012) supports both over-representation analysis (ORA) and gene set enrichment analysis (GSEA) for WikiPathways through the enrichWP() and gseWP() functions. These functions automatically download and parse the latest WikiPathways GMT file for the selected organism.
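To make the "download and parse" step more concrete, here is a minimal base-R sketch of how one line of a WikiPathways GMT file could be split into pathway metadata and member genes. It assumes the usual layout: a first field of the form name%version%wpid%species, a second field holding a URL, and tab-separated gene IDs after that. The version string and URL are made up for illustration, and this is not clusterProfiler's internal code.

```r
# One hypothetical GMT line in the WikiPathways layout (tab-separated):
# <name>%<version>%<wpid>%<species>  <url>  <gene1>  <gene2> ...
gmt_line <- paste(
  "Cell cycle%WikiPathways_20240101%WP179%Homo sapiens",  # version is made up
  "https://www.wikipathways.org/instance/WP179",          # URL is illustrative
  "890", "891", "983",
  sep = "\t"
)

# Split the line into its tab-separated fields
fields <- strsplit(gmt_line, "\t", fixed = TRUE)[[1]]

# The first field packs four pieces of metadata, separated by "%"
meta <- strsplit(fields[1], "%", fixed = TRUE)[[1]]
names(meta) <- c("name", "version", "wpid", "species")

# Everything after the first two fields is a gene ID
genes <- fields[-(1:2)]

# Long-format tables: pathway-to-gene and pathway-to-name
term2gene <- data.frame(wpid = meta[["wpid"]], gene = genes)
term2name <- data.frame(wpid = meta[["wpid"]], name = meta[["name"]])
```

Tables of this shape are what the TERM2GENE and TERM2NAME arguments of clusterProfiler's enricher() expect, which is a convenient way to think about what enrichWP() needs from the GMT file.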

To list the supported organisms:

library(clusterProfiler)
get_wp_organisms()
 [1] "Anopheles gambiae"        "Arabidopsis thaliana"    
 [3] "Bos taurus"               "Caenorhabditis elegans"  
 [5] "Canis familiaris"         "Danio rerio"             
 [7] "Drosophila melanogaster"  "Equus caballus"          
 [9] "Gallus gallus"            "Homo sapiens"            
[11] "Mus musculus"             "Pan troglodytes"         
[13] "Populus trichocarpa"      "Rattus norvegicus"       
[15] "Saccharomyces cerevisiae" "Solanum lycopersicum"    
[17] "Sus scrofa"               "Zea mays"                

6.1 WikiPathways over-representation analysis

data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]

enrichWP(gene, organism = "Homo sapiens") 
#
# over-representation test
#
#...@organism    Homo sapiens 
#...@ontology    WikiPathways 
#...@keytype     ENTREZID 
#...@gene    chr [1:207] "4312" "8318" "10874" "55143" "55388" "991" "6280" "2305" ...
#...pvalues adjusted by 'BH' with cutoff < 0.05
#...7 enriched terms found
'data.frame':   7 obs. of  12 variables:
 $ ID            : chr  "WP2446" "WP2361" "WP3942" "WP179" ...
 $ Description   : chr  "Retinoblastoma gene in cancer" "Gastric cancer network 1" "PPAR signaling" "Cell cycle" ...
 $ GeneRatio     : chr  "11/122" "6/122" "7/122" "10/122" ...
 $ BgRatio       : chr  "89/9006" "28/9006" "50/9006" "120/9006" ...
 $ RichFactor    : num  0.1236 0.2143 0.14 0.0833 0.2857 ...
 $ FoldEnrichment: num  9.12 15.82 10.33 6.15 21.09 ...
 $ zScore        : num  9.03 9.2 7.76 6.66 8.82 ...
 $ pvalue        : num  2.68e-08 1.61e-06 4.34e-06 4.75e-06 2.89e-05 ...
 $ p.adjust      : num  2.14e-05 6.43e-04 9.47e-04 9.47e-04 4.61e-03 ...
 $ qvalue        : logi  NA NA NA NA NA NA ...
 $ geneID        : chr  "890/6241/24137/891/7153/7272/1111/8318/9133/983/81620" "6286/4605/7153/22974/11065/6790" "9415/9370/5105/3158/5346/4312/2167" "890/891/7272/1111/8318/9133/983/4174/9232/991" ...
 $ Count         : int  11 6 7 10 4 7 8
#...Citation
S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize multiomics data. Nature Protocols. 2024, 19(11):3292-3320 
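The ratio columns in the output above can be checked by hand. The sketch below takes the first row (WP2446, GeneRatio 11/122, BgRatio 89/9006) and recomputes RichFactor and FoldEnrichment from those four counts. The p-value line assumes the over-representation test is a one-sided hypergeometric test, which is the usual choice for ORA; the variable names are illustrative.

```r
# Counts taken from the first row of the enrichWP() output (WP2446)
k <- 11    # input genes annotated to this pathway
n <- 122   # input genes annotated to any pathway
M <- 89    # background genes annotated to this pathway
N <- 9006  # background genes annotated to any pathway

gene_ratio <- k / n   # GeneRatio as a number
bg_ratio   <- M / N   # BgRatio as a number

# RichFactor: fraction of the pathway's genes found in the input list
rich_factor <- k / M

# FoldEnrichment: GeneRatio relative to BgRatio
fold_enrichment <- gene_ratio / bg_ratio

# Assumed test: probability of seeing k or more pathway genes
# when drawing n genes from a background of N containing M pathway genes
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

round(c(RichFactor = rich_factor, FoldEnrichment = fold_enrichment), 2)
```

The rounded values should agree with the RichFactor and FoldEnrichment columns shown for WP2446.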

6.2 WikiPathways gene set enrichment analysis

gseWP(geneList, organism = "Homo sapiens")
#
# Gene Set Enrichment Analysis
#
#...@organism    Homo sapiens 
#...@setType     WikiPathways 
#...@keytype     ENTREZID 
#...@geneList    Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm    1000 
#...pvalues adjusted by 'BH' with cutoff < 0.05
#...158 enriched terms found
'data.frame':   158 obs. of  11 variables:
 $ ID             : chr  "WP2446" "WP179" "WP466" "WP2361" ...
 $ Description    : chr  "Retinoblastoma gene in cancer" "Cell cycle" "DNA replication" "Gastric cancer network 1" ...
 $ setSize        : int  84 111 42 23 62 41 78 24 61 36 ...
 $ enrichmentScore: num  0.731 0.663 0.792 0.837 0.665 ...
 $ NES            : num  2.85 2.74 2.73 2.49 2.48 ...
 $ pvalue         : num  3.11e-12 3.17e-12 2.53e-12 2.41e-09 5.39e-10 ...
 $ p.adjust       : num  8.42e-10 8.42e-10 8.42e-10 3.20e-07 1.07e-07 ...
 $ qvalue         : num  6.89e-10 6.89e-10 6.89e-10 2.62e-07 8.78e-08 ...
 $ rank           : int  1333 1234 1002 854 1111 1750 1627 1270 2622 2454 ...
 $ leading_edge   : chr  "tags=51%, list=11%, signal=46%" "tags=40%, list=10%, signal=36%" "tags=55%, list=8%, signal=51%" "tags=52%, list=7%, signal=49%" ...
 $ core_enrichment: chr  "8318/9133/7153/6241/890/983/81620/7272/1111/891/24137/993/898/4998/10733/9134/4175/4173/6502/5984/994/7298/3015"| __truncated__ "8318/991/9133/890/983/7272/1111/891/4174/9232/4171/993/990/5347/9700/898/23594/4998/9134/4175/4173/10926/6502/9"| __truncated__ "8318/55388/81620/4174/4171/990/23594/4998/4175/4173/10926/5984/5111/51053/8317/5427/23649/4176/5982/5557/5558/4172/5424" "4605/7153/11065/22974/6286/6790/1894/56992/4173/1063/9585/8607" ...
#...Citation
S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize multiomics data. Nature Protocols. 2024, 19(11):3292-3320 

If your input genes are not identified by Entrez gene IDs, use the bitr() function to convert them first. To convert the gene IDs in the output to gene symbols, use the setReadable() function; see also Section 17.2.
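Both conversions can be sketched as follows, assuming the human annotation package org.Hs.eg.db is installed and a network connection is available for enrichWP(). The symbol vector is an illustrative input, not part of the example dataset.

```r
library(clusterProfiler)
library(org.Hs.eg.db)  # assumed: human annotation database from Bioconductor

# Input side: convert gene symbols to Entrez gene IDs before the analysis
symbols <- c("MMP1", "CDC45", "NMU", "CDCA8", "MCM10")  # illustrative symbols
ids <- bitr(symbols,
            fromType = "SYMBOL",
            toType   = "ENTREZID",
            OrgDb    = org.Hs.eg.db)
head(ids)

# Output side: replace Entrez IDs in the result with gene symbols
data(geneList, package = "DOSE")
gene <- names(geneList)[abs(geneList) > 2]
wp <- enrichWP(gene, organism = "Homo sapiens")
wp_readable <- setReadable(wp, OrgDb = org.Hs.eg.db, keyType = "ENTREZID")
head(wp_readable)
```

bitr() returns a data frame with one column per ID type, so ids$ENTREZID can be passed on to enrichWP(). For gseWP(), the names of the ranked geneList would need the same conversion.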

References

Yu, Guangchuang, Le-Gen Wang, Yanyan Han, and Qing-Yu He. 2012. “clusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters.” OMICS: A Journal of Integrative Biology 16 (5): 284–87. https://doi.org/10.1089/omi.2011.0118.