Comparative metagenomics

Authors: Alejandra Escobar, Christian Atallah

Published: October 5, 2023

Normalization methods, alpha & beta diversity, and differentially abundant features

In this practical session, we demonstrate how the MGnifyR tool can be used to fetch the data and metadata of a MGnify metagenomic analysis. We then show how to calculate diversity metrics and apply two methods for identifying differentially abundant features, using taxonomic and functional profiles generated by the MGnify v5.0 pipeline for metagenomic assemblies, as shown in the workflow schema below.

The dataset is the TARA Oceans metagenomic study (WMS) corresponding to the size fractions for prokaryotes, MGYS00002008. Find more information about the TARA Oceans project.

MGnify assembly analysis pipeline v5.0

MGnifyR is an R library that provides a set of tools for accessing and processing MGnify data, querying the MGnify databases through the MGnify API. Its main benefit is that data can either be fetched in TSV format or combined directly into a phyloseq object for analysis in a custom workflow.
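To make "querying the MGnify databases through the MGnify API" more concrete, here is a minimal Python sketch of the kind of request that a client such as MGnifyR wraps. It is not MGnifyR code. The API root, the `studies/<accession>/analyses` path, the `page_size` parameter and the JSON:API response layout (`data`, `links.next`) are assumptions about the public MGnify REST API, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: root of the public MGnify REST API (JSON:API format).
API_ROOT = "https://www.ebi.ac.uk/metagenomics/api/v1"


def analyses_url(study_accession, page_size=50):
    """Build the URL that lists the analyses belonging to one study."""
    query = urlencode({"page_size": page_size})
    return f"{API_ROOT}/studies/{study_accession}/analyses?{query}"


def fetch_analysis_accessions(study_accession):
    """Collect analysis accessions page by page (requires network access)."""
    url = analyses_url(study_accession)
    accessions = []
    while url:
        with urllib.request.urlopen(url) as response:
            payload = json.load(response)
        # Assumption: each entry in "data" carries its accession in "id".
        accessions.extend(item["id"] for item in payload["data"])
        # Assumption: "links.next" is null (None) on the last page.
        url = payload.get("links", {}).get("next")
    return accessions


if __name__ == "__main__":
    # Only builds and prints the URL; no request is sent here.
    print(analyses_url("MGYS00002008"))
```

Calling `fetch_analysis_accessions("MGYS00002008")` would walk through all result pages for the study used in this practical. In the notebook, MGnifyR handles this for you and adds the conversion to TSV or phyloseq.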

The exercises are organized into 5 main sections:

  1. Fetching data and preprocessing
  2. Normalization, alpha diversity indices and taxonomic profile visualization
  3. Comparative metagenomics at community level: Beta diversity
  4. Detection of differentially abundant taxa (SIAMCAT)
  5. Detection of differentially abundant functions (ALDEx2)
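As a rough illustration of the quantities behind sections 2, 3 and 5, the Python sketch below computes relative abundances (a simple normalization), the Shannon index (alpha diversity), the Bray-Curtis dissimilarity (beta diversity) and a centred log-ratio (CLR) transform, which is the kind of transform ALDEx2 builds on. The notebook itself performs these steps in R. The function names are hypothetical, the counts are made-up toy values, and the fixed pseudocount is a simplification: ALDEx2 itself draws Monte Carlo samples from a Dirichlet distribution before applying the CLR.

```python
import math


def relative_abundance(counts):
    """Total-sum scaling: divide each count by the sample total."""
    total = sum(counts)
    return [c / total for c in counts]


def shannon_index(counts):
    """Shannon alpha diversity, H = -sum(p * ln p), skipping absent taxa."""
    return -sum(p * math.log(p) for p in relative_abundance(counts) if p > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: 0 = identical samples, 1 = no shared taxa."""
    shared = sum(min(a, b) for a, b in zip(sample_a, sample_b))
    return 1 - 2 * shared / (sum(sample_a) + sum(sample_b))


def clr(counts, pseudocount=0.5):
    """Centred log-ratio: log counts minus their mean log (assumed pseudocount)."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


if __name__ == "__main__":
    # Toy counts for four taxa in two samples (invented for illustration).
    sample_a = [10, 20, 30, 40]
    sample_b = [40, 30, 20, 10]
    print("Relative abundance:", relative_abundance(sample_a))
    print("Shannon index:", shannon_index(sample_a))
    print("Bray-Curtis:", bray_curtis(sample_a, sample_b))
    print("CLR:", clr(sample_a))
```

The CLR values of a sample always sum to zero, which is why the transform suits compositional data: it expresses each feature relative to the geometric mean of its sample rather than as an absolute count.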

Open the Jupyter Notebook

The practical has been prepared as a Jupyter Notebook available in the MGnify Notebooks GitHub repo.

To access the notebook, open a terminal and run the following commands:
cd ~/mgnify-notebooks
sudo chown -R training .
git stash --all
git pull
git switch comparative_practice_2023
task edit-notebooks

After a few seconds, some URLs will be printed in the terminal. Open the last one (http://127.0.0.1:8888/lab?token=.....) by right-clicking on the URL and selecting “Open Link”, or by copying and pasting it into a web browser such as Chromium or Firefox.

Find and open the ‘comparative_practice_2023’ notebook in the ‘R examples’ directory.
The notebook has coloured boxes to indicate relevant steps in the analysis or to introduce material for discussion.

Yellow boxes:

  • Up to you: Re-run the analysis with different parameters
  • Questions: For open discussion

Blue boxes:

  • Notes: Information about the command being run
  • Tips: Information useful for interpreting the results

To leave the notebook, save your changes and close the browser window. Then terminate the process in the terminal:
CTRL+C

Use the Jupyter Notebook after the course

This notebook is based on a publicly accessible version, which you can use at any time:

  1. It is available to use from your web browser, no installation needed: notebooks.mgnify.org
  2. You can see a completed version of it, with all the outputs, on docs.mgnify.org
  3. You can use a prebuilt docker image and our public notebooks repository: github.com/ebi-metagenomics/notebooks. This should work on any computer you can install Docker on.
  4. You can try and install all the dependencies yourself ¯\_(ツ)_/¯

Reuse

Apache 2.0