6  WikiPathways analysis

WikiPathways is a continuously updated pathway database curated by a community of researchers and pathway enthusiasts. It publishes monthly releases of GMT files for supported organisms at data.wikipathways.org. The clusterProfiler package (Yu et al. 2012) supports both over-representation analysis (ORA) and gene set enrichment analysis (GSEA) of WikiPathways through the enrichWP() and gseWP() functions. These functions automatically download and parse the latest WikiPathways GMT file for the selected organism.
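To make the "download and parse" step more concrete, the following base R sketch parses a toy file written in the layout that WikiPathways GMT files are assumed to follow here: a first field of the form name%version%wpid%organism, then a URL, then the member genes, all tab-separated. The helper parse_wp_gmt(), the version tag, and the URLs are hypothetical placeholders, and this is not the code that enrichWP() or gseWP() run internally.

```r
# Toy GMT file in the assumed WikiPathways layout (tab-separated):
#   name%version%wpid%organism <TAB> url <TAB> gene1 <TAB> gene2 ...
# The version tag and URLs are placeholders; the gene sets are truncated
# to a few Entrez IDs that appear in the output shown in this chapter.
gmt_lines <- c(
  paste("Cell cycle%WikiPathways_YYYYMMDD%WP179%Homo sapiens",
        "http://example.org/WP179", "8318", "991", "9133", sep = "\t"),
  paste("DNA replication%WikiPathways_YYYYMMDD%WP466%Homo sapiens",
        "http://example.org/WP466", "8318", "55388", sep = "\t")
)
gmt_file <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, gmt_file)

# Hypothetical helper: returns one row per (pathway, gene) pair.
parse_wp_gmt <- function(path) {
  fields <- strsplit(readLines(path), "\t", fixed = TRUE)
  rows <- lapply(fields, function(f) {
    # Split the first field into name, version, wpid, organism.
    meta <- strsplit(f[1], "%", fixed = TRUE)[[1]]
    # Drop the first two fields (term and URL); the rest are genes.
    data.frame(wpid = meta[3], name = meta[1], gene = f[-(1:2)],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

wp <- parse_wp_gmt(gmt_file)
term2gene <- wp[, c("wpid", "gene")]          # pathway ID -> gene
term2name <- unique(wp[, c("wpid", "name")])  # pathway ID -> description
print(term2gene)
print(term2name)
```

Tables of this shape (term to gene, term to name) are what the generic enricher() and GSEA() functions accept as TERM2GENE and TERM2NAME, which can be useful when an analysis has to be pinned to a specific WikiPathways release rather than the latest one.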

Supported organisms can be listed with the get_wp_organisms() function:

 [1] "Anopheles gambiae"        "Arabidopsis thaliana"    
 [3] "Bos taurus"               "Caenorhabditis elegans"  
 [5] "Canis familiaris"         "Danio rerio"             
 [7] "Drosophila melanogaster"  "Equus caballus"          
 [9] "Gallus gallus"            "Homo sapiens"            
[11] "Mus musculus"             "Pan troglodytes"         
[13] "Populus trichocarpa"      "Rattus norvegicus"       
[15] "Saccharomyces cerevisiae" "Solanum lycopersicum"    
[17] "Sus scrofa"               "Zea mays"                

6.1 WikiPathways over-representation analysis

data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]

enrichWP(gene, organism = "Homo sapiens") 
#
# over-representation test
#
#...@organism    Homo sapiens 
#...@ontology    WikiPathways 
#...@keytype     ENTREZID 
#...@gene    chr [1:207] "4312" "8318" "10874" "55143" "55388" "991" "6280" "2305" ...
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...7 enriched terms found
'data.frame':   7 obs. of  12 variables:
 $ ID            : chr  "WP2446" "WP2361" "WP3942" "WP179" ...
 $ Description   : chr  "Retinoblastoma gene in cancer" "Gastric cancer network 1" "PPAR signaling" "Cell cycle" ...
 $ GeneRatio     : chr  "11/122" "6/122" "7/122" "10/122" ...
 $ BgRatio       : chr  "89/9032" "28/9032" "50/9032" "120/9032" ...
 $ RichFactor    : num  0.1236 0.2143 0.14 0.0833 0.2857 ...
 $ FoldEnrichment: num  9.15 15.86 10.36 6.17 21.15 ...
 $ zScore        : num  9.04 9.22 7.77 6.67 8.83 ...
 $ pvalue        : num  2.60e-08 1.59e-06 4.26e-06 4.63e-06 2.86e-05 ...
 $ p.adjust      : num  8.82e-06 2.69e-04 3.93e-04 3.93e-04 1.94e-03 ...
 $ qvalue        : num  8.19e-06 2.50e-04 3.65e-04 3.65e-04 1.80e-03 ...
 $ geneID        : chr  "8318/9133/7153/6241/890/983/81620/7272/1111/891/24137" "4605/7153/11065/22974/6286/6790" "4312/9415/9370/5105/2167/3158/5346" "8318/991/9133/890/983/7272/1111/891/4174/9232" ...
 $ Count         : int  11 6 7 10 4 7 8
#...Citation
S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize multiomics data. Nature Protocols. 2024, 19(11):3292-3320 
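As a worked check on how the columns of the ORA result relate to each other, the sketch below recomputes the statistics for the first row (WP2446) from its GeneRatio (11/122) and BgRatio (89/9032). It assumes the standard one-sided hypergeometric formulation of an over-representation test; the variable names are illustrative and the package's internal code is not reproduced here.

```r
# Counts taken from the first row of the result above (WP2446).
k <- 11    # input genes annotated to the pathway (GeneRatio numerator)
n <- 122   # input genes annotated to any pathway (GeneRatio denominator)
M <- 89    # background genes in the pathway (BgRatio numerator)
N <- 9032  # background genes in total (BgRatio denominator)

# One-sided hypergeometric test: probability of seeing k or more hits.
pvalue <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Fraction of the pathway covered by the input genes.
rich_factor <- k / M

# Observed proportion of hits relative to the background proportion.
fold_enrichment <- (k / n) / (M / N)

# z-score against the hypergeometric mean and standard deviation.
mu <- n * M / N
sigma <- sqrt(mu * (1 - M / N) * (N - n) / (N - 1))
z_score <- (k - mu) / sigma

print(c(pvalue = pvalue, RichFactor = rich_factor,
        FoldEnrichment = fold_enrichment, zScore = z_score))
```

The printed values can be compared with the pvalue, RichFactor, FoldEnrichment, and zScore entries reported for WP2446 in the output above.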

6.2 WikiPathways gene set enrichment analysis

gseWP(geneList, organism = "Homo sapiens")
#
# Gene Set Enrichment Analysis
#
#...@organism    Homo sapiens 
#...@setType     WikiPathways 
#...@keytype     ENTREZID 
#...@geneList    Named num [1:12495] 4.57 4.51 4.42 4.14 3.88 ...
 - attr(*, "names")= chr [1:12495] "4312" "8318" "10874" "55143" ...
#...nPerm    
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...66 enriched terms found
'data.frame':   66 obs. of  11 variables:
 $ ID             : chr  "WP2446" "WP179" "WP466" "WP5115" ...
 $ Description    : chr  "Retinoblastoma gene in cancer" "Cell cycle" "DNA replication" "Network map of SARS CoV 2 signaling" ...
 $ setSize        : int  84 111 42 231 23 62 78 102 41 63 ...
 $ enrichmentScore: num  0.731 0.663 0.792 0.445 0.837 ...
 $ NES            : num  2.85 2.73 2.72 2.05 2.55 ...
 $ pvalue         : num  1.00e-10 1.00e-10 1.00e-10 1.53e-09 2.46e-09 ...
 $ p.adjust       : num  2.56e-08 2.56e-08 2.56e-08 2.94e-07 3.79e-07 ...
 $ qvalue         : num  2.17e-08 2.17e-08 2.17e-08 2.49e-07 3.21e-07 ...
 $ rank           : num  1333 1234 1002 1818 854 ...
 $ leading_edge   : chr  "tags=51%, list=11%, signal=46%" "tags=40%, list=10%, signal=36%" "tags=55%, list=8%, signal=51%" "tags=29%, list=15%, signal=25%" ...
 $ core_enrichment: chr  "8318/9133/7153/6241/890/983/81620/7272/1111/891/24137/993/898/4998/10733/9134/4175/4173/6502/5984/994/7298/3015"| __truncated__ "8318/991/9133/890/983/7272/1111/891/4174/9232/4171/993/990/5347/9700/898/23594/4998/9134/4175/4173/10926/6502/9"| __truncated__ "8318/55388/81620/4174/4171/990/23594/4998/4175/4173/10926/5984/5111/51053/8317/5427/23649/4176/5982/5557/5558/4172/5424" "9133/3627/6241/10563/4283/983/7850/332/891/6355/2921/6364/51512/3576/1978/8836/6352/59272/4599/3932/2537/6772/6"| __truncated__ ...
#...Citation
S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize multiomics data. Nature Protocols. 2024, 19(11):3292-3320 
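The geneList passed to gseWP() above is a named numeric vector sorted in decreasing order, with Entrez gene IDs as names. A minimal sketch of building such a vector from a differential-expression table might look like the following; the data frame de and its column names are hypothetical, and its values simply mirror the first entries of the geneList printed above.

```r
# Hypothetical differential-expression table (illustrative values only).
de <- data.frame(
  entrez = c("10874", "4312", "55143", "8318"),
  log2FC = c(4.42, 4.57, 4.14, 4.51),
  stringsAsFactors = FALSE
)

# Numeric vector of the ranking statistic, named by gene ID.
ranked <- de$log2FC
names(ranked) <- de$entrez

# Each gene should appear only once in the ranking.
stopifnot(!anyDuplicated(names(ranked)))

# Sort in decreasing order, as expected for a ranked gene list.
ranked <- sort(ranked, decreasing = TRUE)
print(ranked)
```

A vector prepared this way can then be supplied in place of geneList, for example gseWP(ranked, organism = "Homo sapiens"), although a real analysis needs the full ranked list rather than four genes.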

If your input gene IDs are not Entrez gene IDs, use the bitr() function to convert them first. To convert the gene IDs in the output to gene symbols, use the setReadable() function (see also Section 15.2).
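The ID conversion described above can be sketched as follows. The example assumes human data and that the org.Hs.eg.db annotation package is installed; the two gene symbols are illustrative inputs.

```r
library(clusterProfiler)
library(org.Hs.eg.db)  # human annotation package; choose the OrgDb for your organism

# Illustrative input: gene symbols rather than Entrez gene IDs.
symbols <- c("MMP1", "CDC45")

# Map symbols to Entrez gene IDs; the result is a data frame with
# one column per ID type.
ids <- bitr(symbols, fromType = "SYMBOL", toType = "ENTREZID",
            OrgDb = org.Hs.eg.db)
print(ids)

# Entrez IDs ready to be passed to enrichWP().
gene <- ids$ENTREZID
```

In the other direction, a result object x returned by enrichWP() or gseWP() could be made readable with setReadable(x, OrgDb = org.Hs.eg.db, keyType = "ENTREZID"), where x is a placeholder name for your own result.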

References

Yu, Guangchuang, Le-Gen Wang, Yanyan Han, and Qing-Yu He. 2012. “clusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters.” OMICS: A Journal of Integrative Biology 16 (5): 284–87. https://doi.org/10.1089/omi.2011.0118.