The output includes the core enriched genes in Entrez ID format for each significant term.
Enhancing Readability with setReadable
To make the results more interpretable, use setReadable() to convert Entrez IDs to gene symbols:
library(clusterProfiler)
This transformation makes the core enrichment results much more readable and biologically meaningful.
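The conversion described above can be sketched as follows. This is a minimal illustration, not necessarily the exact analysis used earlier in this chapter: it assumes human data annotated with org.Hs.eg.db, uses the example geneList shipped with the DOSE package, and picks gseGO() as the enrichment step. The object names gsea_result and gsea_readable are hypothetical.

```r
library(clusterProfiler)
library(org.Hs.eg.db)

# Example ranked gene list from the DOSE package; names are Entrez IDs.
# (Assumption: your own analysis would supply its own ranked list.)
data(geneList, package = "DOSE")

# GSEA against GO Biological Process. The core_enrichment column of the
# result lists leading-edge genes as Entrez IDs.
gsea_result <- gseGO(geneList, ont = "BP", OrgDb = org.Hs.eg.db,
                     keyType = "ENTREZID", pvalueCutoff = 0.05)

# Map the Entrez IDs in the result object to gene symbols.
gsea_readable <- setReadable(gsea_result, OrgDb = org.Hs.eg.db,
                             keyType = "ENTREZID")

# Inspect the converted leading-edge genes for the first few terms.
head(as.data.frame(gsea_readable)$core_enrichment, 3)
```

For other organisms, substitute the matching OrgDb package; keyType must name the identifier type used in the input gene list.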
For visualization of leading edge analysis results using cnetplot, please refer to the enrichplot chapter.
Non-Model Plant Annotation with clusterProfiler
For non-model plants and other organisms lacking standard annotation packages, clusterProfiler can be used with custom annotation data obtained from tools like eggNOG.
Workflow Overview
1. Annotation with eggNOG: Use the eggNOG web server to annotate protein sequences
2. Parse eggNOG Results: Extract GO and KEGG annotations using custom scripts
3. Enrichment Analysis: Use clusterProfiler’s enricher() function with custom annotation data
Key Steps
1. eggNOG Annotation
Upload protein sequences to the eggNOG mapper with appropriate parameters for your organism.
2. Parse eggNOG Results
Use the parsing scripts to extract GO and KEGG annotations from the eggNOG output:
# Parse GO ontology file
python parse_go_obofile.py -i go-basic.obo -o go.tb

# Parse eggNOG annotations with reference species filtering
python parse_eggNOG.py -i panax_ginseng.annotations \
    -g go.tb \
    -O ath,osa \
    -o output_directory
This generates two key files:
- GOannotation.tsv: GO term annotations
- KOannotation.tsv: KEGG pathway annotations
3. Enrichment Analysis with clusterProfiler
library(clusterProfiler)

# Read annotation files
KOannotation <- read.delim("KOannotation.tsv", stringsAsFactors = FALSE)
GOannotation <- read.delim("GOannotation.tsv", stringsAsFactors = FALSE)
GOinfo <- read.delim("go.tb", stringsAsFactors = FALSE)

# Your gene list
gene_list <- c("gene1", "gene2", "gene3")  # Replace with your actual gene list

# GO enrichment (Molecular Function as example)
GOannotation_split <- split(GOannotation, GOannotation$level)
enricher(gene_list,
         TERM2GENE = GOannotation_split[['molecular_function']][c(2, 1)],
         TERM2NAME = GOinfo[1:2])

# KEGG enrichment
enricher(gene_list,
         TERM2GENE = KOannotation[c(3, 1)],
         TERM2NAME = KOannotation[c(3, 4)])
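To make the column reordering and the underlying test concrete, here is a small base R sketch on made-up data. It assumes the annotation table stores the gene in column 1 and the term in column 2, which is what the `[c(2, 1)]` indexing above implies, since TERM2GENE expects the term first and the gene second. The term IDs, gene names, and the helper enrich_one() are hypothetical. The sketch reproduces only the hypergeometric over-representation test; enricher() additionally filters gene sets by size (minGSSize, maxGSSize) and adjusts p-values for multiple testing.

```r
# Toy annotation table in the assumed layout: gene, term, ontology level.
GOannotation <- data.frame(
  gene  = paste0("g", 1:10),
  ID    = c(rep("termA", 4), rep("termB", 6)),  # placeholder term IDs
  level = "molecular_function",
  stringsAsFactors = FALSE
)

# Split by ontology and reorder columns to (term, gene) for TERM2GENE.
by_level  <- split(GOannotation, GOannotation$level)
term2gene <- by_level[["molecular_function"]][c(2, 1)]

gene_list  <- c("g1", "g2", "g3", "not_annotated")
background <- unique(term2gene[[2]])           # all annotated genes
query      <- intersect(gene_list, background) # unannotated genes drop out

# One-sided hypergeometric test for a single term (hypothetical helper).
enrich_one <- function(term) {
  term_genes <- unique(term2gene[[2]][term2gene[[1]] == term])
  k <- length(intersect(query, term_genes))  # query genes in the term
  M <- length(term_genes)                    # genes annotated to the term
  N <- length(background)                    # background size
  n <- length(query)                         # query size
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)  # P(X >= k)
}

pvals <- vapply(unique(term2gene[[1]]), enrich_one, numeric(1))
print(pvals)  # termA: 4/120 (about 0.033); termB: 1
```

All three annotated query genes fall in termA, which contains 4 of the 10 background genes, so its p-value is C(4,3)/C(10,3) = 4/120. With real data, a TERM2GENE table with the columns swapped typically leaves no overlap between the gene list and the gene column, which is why the column order matters.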
Advantages
Works for any organism with protein sequences
Uses reliable eggNOG annotation pipeline
Flexible reference species filtering for KEGG
Considerations
Requires intermediate Python scripting
Performance may vary with dataset size
Manual integration of annotation and analysis steps
This approach enables comprehensive functional enrichment analysis for non-model organisms using clusterProfiler’s powerful enrichment capabilities combined with custom annotation data.