Functional analyses

Investigation of sample composition gives insight into “What organisms are present in our sample?”. We now want to know “What are they doing in that sample?”, using metabolic analyses.

Metabolic analysis

For this investigation, we need to assign sequences to entries in a protein database. We chose HUMAnN2 for this task. HUMAnN2 profiles the presence/absence and abundance of gene families and microbial pathways in a community from metagenomic or metatranscriptomic sequencing data.

HUMAnN2 is available in Analyze metabolism (Functional assignation section). We run HUMAnN2 on the non-rRNA sequences:

../../_images/humann2_param.png

Three output files are generated:

  • A file with the abundance of the UniRef50 gene families found
  • A file with the coverage of the MetaCyc pathways found
  • A file with the abundance of the MetaCyc pathways found
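
As a minimal sketch of how such an abundance table could be read outside Galaxy, the snippet below parses a tab-separated table and keeps only the community-level totals. It assumes the usual HUMAnN2 layout: a header line starting with `#`, one feature per line, and per-taxon (stratified) rows marked by a `|` in the feature name. The function name and the example content are hypothetical and are not taken from the tutorial dataset.

```python
import csv
import io


def read_community_totals(table_text):
    """Return {feature: abundance} for community-level rows only."""
    totals = {}
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and the header, which starts with '#'.
        if not row or row[0].startswith("#"):
            continue
        feature, value = row[0], float(row[1])
        # Stratified rows carry a '|taxon' suffix; keep only the totals.
        if "|" in feature:
            continue
        totals[feature] = value
    return totals


# Hypothetical example content (assumption: not real tutorial data).
EXAMPLE = (
    "# Gene Family\tsample_Abundance\n"
    "UNMAPPED\t10.0\n"
    "UniRef50_X1\t6.0\n"
    "UniRef50_X1|g__GenusA.s__SpeciesA\t4.0\n"
    "UniRef50_X1|unclassified\t2.0\n"
    "UniRef50_X2\t3.5\n"
)

totals = read_community_totals(EXAMPLE)
print(totals)
```

Dropping the stratified rows avoids counting the same abundance twice when the values are summed later.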

These three files give detailed insight into gene families and pathways. This is useful when we want to look at a particular pathway or check the abundance of a given gene family. However, to get a broad overview of the metabolic processes in a community, we need tools that regroup gene families or pathways into global categories.

Broad overview of metabolic analysis

To get global categories from the HUMAnN2 outputs, we use the Gene Ontology (GO). The Gene Ontology project is a collaborative effort that maintains three structured ontologies describing gene products in terms of their associated biological processes, cellular components and molecular functions.

HUMAnN2 can regroup UniRef50 gene family abundances into GO term abundances. However, these GO terms are still too specific to give a good overview of metabolic processes.
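
The regrouping idea can be sketched as a simple summation. This is not HUMAnN2's implementation: it assumes that a gene family annotated with several GO terms contributes its full abundance to each of them, and that families without annotation are collected under an `UNGROUPED` label. The mapping, the identifiers and the function name are all hypothetical.

```python
from collections import defaultdict


def regroup(abundances, family_to_groups, ungrouped_label="UNGROUPED"):
    """Sum gene family abundances per group (here: per GO term)."""
    grouped = defaultdict(float)
    for family, value in abundances.items():
        groups = family_to_groups.get(family)
        if not groups:
            # No annotation for this family: keep its abundance visible.
            grouped[ungrouped_label] += value
            continue
        for group in groups:
            # Assumption: full abundance is added to every annotated group.
            grouped[group] += value
    return dict(grouped)


# Hypothetical gene families and GO identifiers.
family_abundances = {"UniRef50_X1": 6.0, "UniRef50_X2": 3.5, "UniRef50_X3": 1.0}
family_to_go = {
    "UniRef50_X1": ["GO:term_A", "GO:term_B"],
    "UniRef50_X2": ["GO:term_B"],
}

go_abundances = regroup(family_abundances, family_to_go)
print(go_abundances)
```

Because one family can feed several terms, the regrouped values do not necessarily sum to the original total.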

The Gene Ontology Consortium provides GO slims, which are cut-down versions of the GO ontologies that give a broad overview of the ontology content. In our case, we use the metagenomic GO slim terms developed by Jane Lomax and the InterPro group.

To regroup the HUMAnN2 output containing UniRef50 gene family abundances into abundances of metagenomic GO slim terms, we use Group humann2 uniref50 abundances to Gene Ontology (GO) slim terms [batut_group_humann2_uniref_abundances_to_go:_2016]. This tool combines GOATOOLS [1] to map GO terms to GO slim terms, HUMAnN2 to regroup the abundances of UniRef50 gene families into abundances of metagenomic GO slim terms, and custom Python scripts.
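
Mapping a GO term to GO slim terms amounts to walking up the ontology graph and keeping the ancestors that belong to the slim. The sketch below illustrates this on a toy graph. It is neither the GOATOOLS API nor the tool's own script: the term names, the `parents` dictionary and the function names are hypothetical, and it simply keeps every slim term found among the ancestors.

```python
def ancestors(term, parents):
    """Return the term itself plus every ancestor reachable via `parents`."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def map_to_slim(term, parents, slim_terms):
    """Return the slim terms that are the term itself or one of its ancestors."""
    return ancestors(term, parents) & slim_terms


def slim_abundances(go_abundances, parents, slim_terms):
    """Add each GO term abundance to every slim term it maps to."""
    totals = {}
    for term, value in go_abundances.items():
        for slim in map_to_slim(term, parents, slim_terms):
            totals[slim] = totals.get(slim, 0.0) + value
    return totals


# Toy ontology (hypothetical): child term -> list of parent terms.
toy_parents = {
    "leaf_1": ["mid_1"],
    "leaf_2": ["mid_1", "mid_2"],
    "mid_1": ["root"],
    "mid_2": ["root"],
}
toy_slim = {"mid_1", "mid_2"}

slim_totals = slim_abundances({"leaf_1": 2.0, "leaf_2": 3.0}, toy_parents, toy_slim)
print(slim_totals)
```

A term with no slim ancestor, such as `root` here, maps to an empty set and contributes to no slim term.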

The tool to group HUMAnN2 UniRef50 abundances to Gene Ontology (GO) slim terms is available in Analyse metabolism (Functional assignation section). We run it on the HUMAnN2 output containing UniRef50 gene family abundances:

../../_images/group_humann2_uniref_abundances_to_go_param.png

This tool generates three tabular outputs:

  • A file with abundances of GO terms corresponding to molecular functions
  • A file with abundances of GO terms corresponding to biological processes
  • A file with abundances of GO terms corresponding to cellular components
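
The split into three outputs follows the three GO namespaces. A minimal sketch of that separation is shown below. It assumes each slim term is labeled with one of the standard GO namespace names; the slim term names, the abundances and the function name are hypothetical.

```python
NAMESPACES = ("molecular_function", "biological_process", "cellular_component")


def split_by_namespace(slim_abundances, term_namespace):
    """Return one {term: abundance} table per GO namespace."""
    tables = {name: {} for name in NAMESPACES}
    for term, value in slim_abundances.items():
        tables[term_namespace[term]][term] = value
    return tables


# Hypothetical slim terms and their namespaces.
example_abundances = {"slim_A": 4.0, "slim_B": 2.5, "slim_C": 1.0}
example_namespaces = {
    "slim_A": "molecular_function",
    "slim_B": "biological_process",
    "slim_C": "molecular_function",
}

tables = split_by_namespace(example_abundances, example_namespaces)
print(tables)
```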

Visualization of metabolic analysis

The three tabular files generated previously, which contain relative abundances of Gene Ontology slim terms, can be visualised as barplots. The Plot barplot with R tool is available in Visualize data (Post treatments section):

../../_images/plot_barplot.png

Several graphical options are available, such as margins, labels and bar color.

For our dataset, we run this tool three times to obtain the following three graphics:

../../_images/cellular_component_abundance.png

Relative abundance of GO slim terms corresponding to cellular components

../../_images/biological_process_abundance.png

Relative abundance of GO slim terms corresponding to biological processes

../../_images/molecular_function_abundance.png

Relative abundance of GO slim terms corresponding to molecular functions
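
The values behind such barplots can be sketched in a few lines. The snippet below is not the Plot barplot with R tool: it is a rough Python stand-in that converts abundances to percentages of their total and renders a text barplot. The term names, the abundances and both function names are hypothetical.

```python
def relative_abundances(abundances):
    """Convert raw abundances to percentages of their total."""
    total = sum(abundances.values())
    if total == 0:
        return {term: 0.0 for term in abundances}
    return {term: 100.0 * value / total for term, value in abundances.items()}


def text_barplot(percentages, width=40):
    """Return one text line per term, longest bar first."""
    lines = []
    ordered = sorted(percentages.items(), key=lambda item: item[1], reverse=True)
    for term, pct in ordered:
        bar = "#" * round(width * pct / 100.0)
        lines.append(f"{term:<12} {bar} {pct:.1f}%")
    return lines


# Hypothetical GO slim abundances.
percentages = relative_abundances({"term_A": 6.0, "term_B": 3.0, "term_C": 1.0})
bar_lines = text_barplot(percentages)
print("\n".join(bar_lines))
```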

Our analyses are done. You may want to download the generated files and extract these numerous steps into a workflow to reproduce them. When you are done, do not forget to stop the Galaxy instance and clean the environment.

[1] Haibao Tang, Debra Klopfenstein, Brent Pedersen, Patrick Flick, Kenta Sato, Fidel Ramirez, Jeff Yunes, and Chris Mungall. GOATOOLS: Tools for Gene Ontology. 2016. URL: https://github.com/tanghaibao/goatools.