Visualization of pathfindR Enrichment Results

2024-01-19

suppressPackageStartupMessages(library(pathfindR))

pathfindR offers several functions for visualizing enrichment results. This vignette demonstrates each of them.

enrichment_chart(): Bubble Chart of Enrichment Results

enrichment_chart() generates a bubble chart. The x-axis shows fold enrichment values and the y-axis lists the enriched terms. Bubble size indicates the number of significant genes in the given enriched term. Color indicates the -log10(lowest-p) value: the closer the color is to red, the more significant the enrichment.

enrichment_chart(example_pathfindR_output)
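To make the mapping from results to bubbles concrete, here is a minimal base-R sketch of how the color and size values could be derived. The data frame toy_res and its values are made up for illustration, the column names are assumed to mirror a pathfindR result table, and the helper count_genes() is hypothetical. This is not the internal code of enrichment_chart().

```r
# Toy stand-in for an enrichment result table (all values are invented).
# Column names are an assumption about the layout of a pathfindR result.
toy_res <- data.frame(
  ID = c("term_A", "term_B", "term_C"),
  Fold_Enrichment = c(4.2, 2.5, 1.8),
  lowest_p = c(1e-6, 1e-3, 0.04),
  Up_regulated = c("G1, G2", "G3", ""),
  Down_regulated = c("G4", "", "G5, G6"),
  stringsAsFactors = FALSE
)

# Color scale: -log10 of the lowest p value (larger means more significant)
toy_res$neg_log10_p <- -log10(toy_res$lowest_p)

# Hypothetical helper: count genes in a comma-separated string
count_genes <- function(x) {
  genes <- trimws(unlist(strsplit(x, ",")))
  sum(nzchar(genes))
}

# Bubble size: number of significant (up- plus down-regulated) genes per term
toy_res$n_genes <- mapply(
  function(up, down) count_genes(up) + count_genes(down),
  toy_res$Up_regulated, toy_res$Down_regulated,
  USE.NAMES = FALSE
)

toy_res[, c("ID", "Fold_Enrichment", "neg_log10_p", "n_genes")]
```

In this toy table, term_A has the smallest p value, so it would be drawn closest to red, and it also has the most genes, so it would get the largest bubble.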

By default, the bubble chart is generated for the top 10 terms. This can be controlled by the top_terms argument:

## change top_terms
enrichment_chart(example_pathfindR_output, top_terms = 3)

## set to NULL to display all terms
enrichment_chart(example_pathfindR_output, top_terms = NULL)

If the enrichment results were clustered, setting plot_by_cluster = TRUE groups the enriched terms by cluster:

enrichment_chart(example_pathfindR_output_clustered, plot_by_cluster = TRUE)
#> Plotting the enrichment bubble chart

See ?enrichment_chart for more details.

visualize_terms(): Enriched Term Diagrams

For H.sapiens KEGG enrichment analyses, visualize_terms() can be used to generate KEGG pathway diagrams that are saved as PNG files in a directory called “term_visualizations” under the current working directory:

input_processed <- input_processing(example_pathfindR_input)
visualize_terms(
  result_df = example_pathfindR_output,
  input_processed = input_processed,
  hsa_KEGG = TRUE
)

For other types of enrichment analyses (i.e., non-KEGG or non-H.sapiens), visualize_terms() can instead generate an interaction diagram per enriched term. These diagrams are also saved as PNG files in a directory called “term_visualizations” under the current working directory:

input_processed <- input_processing(example_pathfindR_input)
visualize_terms(
  result_df = example_pathfindR_output,
  input_processed = input_processed,
  hsa_KEGG = FALSE,
  pin_name_path = "Biogrid"
)
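Since both modes write their diagrams to “term_visualizations” under the current working directory, a quick way to check what was produced is to list the PNG files there. The sketch below uses only base R; the variable names out_dir and png_files are illustrative, and it assumes visualize_terms() was run from the current working directory.

```r
# Directory described above, relative to the current working directory
out_dir <- file.path(getwd(), "term_visualizations")

# Collect PNG files if the directory exists; otherwise an empty vector
png_files <- if (dir.exists(out_dir)) {
  list.files(out_dir, pattern = "\\.png$", full.names = TRUE)
} else {
  character(0)
}

if (length(png_files) == 0) {
  message("No PNG files found in ", out_dir)
} else {
  print(basename(png_files))
}
```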

See ?visualize_terms for more details.

term_gene_heatmap(): Terms by Genes Heatmap

term_gene_heatmap() is used to create a heatmap where rows are enriched terms and columns are involved input genes. This heatmap allows visual identification of the input genes involved in the enriched terms, as well as the common or distinct genes between different terms.

term_gene_heatmap(example_pathfindR_output)
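The structure behind such a heatmap can be sketched as a term-by-gene membership matrix. The example below uses a made-up list, toy_terms, rather than real pathfindR output, and is only meant to illustrate the layout (rows are terms, columns are genes), not how term_gene_heatmap() is implemented.

```r
# Made-up term-to-gene assignments (for illustration only)
toy_terms <- list(
  term_A = c("G1", "G2", "G4"),
  term_B = c("G2", "G3"),
  term_C = c("G4", "G5")
)

# All genes that appear in at least one term
all_genes <- sort(unique(unlist(toy_terms)))

# Rows are terms, columns are genes; 1 means the gene is involved in the term
membership <- t(sapply(toy_terms, function(g) as.integer(all_genes %in% g)))
colnames(membership) <- all_genes
membership

# Columns with more than one 1 correspond to genes shared between terms
names(which(colSums(membership) > 1))
```

Each filled tile of the heatmap corresponds to a 1 in a matrix like this, which is why common and distinct genes are easy to spot by scanning down a column.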

By default, the heatmap is generated for the top 10 terms. This can be controlled by the num_terms argument:

term_gene_heatmap(example_pathfindR_output, num_terms = 3)

## set to NULL to display all terms
term_gene_heatmap(example_pathfindR_output, num_terms = NULL)

By default, term IDs are used as labels. To use the full term descriptions instead, set use_description = TRUE:

term_gene_heatmap(example_pathfindR_output, use_description = TRUE)

If the input data frame (the same one passed to run_pathfindR()) is supplied via genes_df, the tile colors indicate the change values:

term_gene_heatmap(result_df = example_pathfindR_output, genes_df = example_pathfindR_input)

See ?term_gene_heatmap for more details.

term_gene_graph(): Term-Gene Graph

term_gene_graph() (adapted from the Gene-Concept network visualization of the R package enrichplot) visualizes which significant genes are involved in the enriched terms. It creates a term-gene graph that displays the connections between genes and biological terms (enriched pathways or gene sets), which makes it possible to see all of the terms a significant gene is related to. The graph also shows the degree of overlap between enriched terms by revealing their shared and distinct significant genes. By default, the function visualizes the term-gene graph for the top 10 enriched terms:

term_gene_graph(example_pathfindR_output)
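The idea of a term-gene graph can be sketched in base R as an edge list linking each term to its genes. The list toy_terms below is made up, and the variable names are illustrative; this is a simplified picture of the underlying data, not the code used by term_gene_graph().

```r
# Made-up term-to-gene assignments (for illustration only)
toy_terms <- list(
  term_A = c("G1", "G2", "G4"),
  term_B = c("G2", "G3"),
  term_C = c("G4", "G5")
)

# One row per term-gene connection, i.e. one edge of the graph
edges <- stack(toy_terms)
names(edges) <- c("gene", "term")
edges$term <- as.character(edges$term)

# Genes connected to more than one term sit between terms in the graph
gene_degree <- table(edges$gene)
hub_genes <- names(gene_degree[gene_degree > 1])

# Degree of overlap: number of shared genes for each pair of terms
term_pairs <- combn(names(toy_terms), 2, simplify = FALSE)
overlap <- sapply(term_pairs, function(p) {
  length(intersect(toy_terms[[p[1]]], toy_terms[[p[2]]]))
})
names(overlap) <- sapply(term_pairs, paste, collapse = " ~ ")

hub_genes
overlap
```

Here term_A shares one gene with each of the other two terms, while term_B and term_C share none, so in a graph layout they would only be connected through term_A.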

To plot all of the enriched terms in the enrichment results, set num_terms = NULL (not advised, as the resulting graph becomes cluttered):

term_gene_graph(example_pathfindR_output, num_terms = NULL)

To label terms with their full names (instead of IDs, which are the default), set use_description = TRUE:

term_gene_graph(example_pathfindR_output, num_terms = 3, use_description = TRUE)