The virtualome: a computational framework to analyse microbiomics

Serrano Antón, Belén (2021). The virtualome: a computational framework to analyse microbiomics. Thesis (Master thesis), E.T.S. de Ingeniería Agronómica, Alimentaria y de Biosistemas (UPM).

Description

Title: The virtualome: a computational framework to analyse microbiomics
Author/s:
  • Serrano Antón, Belén
Contributor/s:
  • Bertocchini, Federica
  • Pagán Muñoz, Jesús Israel
Item Type: Thesis (Master thesis)
Masters title: Biología Computacional
Date: June 2021
Subjects:
Freetext Keywords: 16S rRNA, Amplicon sequencing, Whole genome shotgun sequencing, Metagenomics, Microbiome, Next-generation sequencing, Comparison, Metaclassification
Faculty: E.T.S. de Ingeniería Agronómica, Alimentaria y de Biosistemas (UPM)
Department: Biotecnología - Biología Vegetal
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

Microbiome characterisation is one of the most common applications of metagenomics, but also one of its greatest challenges. Currently, the two most widely used methods are amplicon sequencing (based on taxonomic markers such as the 16S ribosomal RNA gene for bacteria, or ITS or COI for fungi) and whole genome sequencing (WGS). Although these techniques play a fundamental role in characterising microbial communities and, in particular, symbiotic bacteria, they have limitations. Indeed, this work started after experimental data from the gut of Galleria mellonella showed that the percentage overlap between amplicon (16S) and WGS results was only 16%. This study discusses the limitations of these techniques, focusing on the overlap between the results of the two techniques (16S and WGS) and on their ability to detect changes in bacterial populations. To that end, we formulate a model for generating virtual samples with which to test metagenomic analysis techniques. Virtual samples (virtualomes) are generated following the ecological structure of realistic bacterial communities. In the analyses, both intrinsic characteristics of the community (ecology and composition) and technical aspects (number of reads and information in the databases) were varied, and the analyses were carried out at species and genus level. The study reveals several factors that greatly affect the performance of metagenomic analyses: 1) species are lost because of low abundance and, in the case of 16S, because the primer cannot amplify the target region; 2) communities are characterised better at species level with WGS, although the results are still far from reality; 3) the overlap between the two techniques is low, and they do not necessarily overlap in true positives; 4) detection of changes in microbial populations is poor, and changes tend to be overestimated. All these results are strongly affected by community diversity and database information (at both genus and species level).
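
As a rough illustration of the ideas in the abstract, the Python sketch below draws reads from a known virtual community and compares the taxa recovered by two simulated runs. It is not the model used in the thesis. The function names, the read-count detection threshold, the toy community, and the definition of percentage overlap as shared taxa divided by the union of detected taxa are all assumptions made for this example.

```python
import random

def simulate_detection(abundances, n_reads, min_reads=1, seed=0):
    """Draw n_reads from a virtual community and return the detected taxa.

    abundances: dict mapping taxon name -> relative abundance (hypothetical input).
    A taxon counts as detected if it receives at least min_reads reads
    (an assumed detection rule, not taken from the thesis).
    """
    rng = random.Random(seed)
    taxa = list(abundances)
    weights = [abundances[t] for t in taxa]
    counts = {}
    for taxon in rng.choices(taxa, weights=weights, k=n_reads):
        counts[taxon] = counts.get(taxon, 0) + 1
    return {t for t, c in counts.items() if c >= min_reads}

def percent_overlap(detected_a, detected_b):
    """Shared taxa as a percentage of all taxa detected by either method.

    Assumed definition (intersection over union); the thesis may use another.
    """
    union = detected_a | detected_b
    if not union:
        return 0.0
    return 100.0 * len(detected_a & detected_b) / len(union)

def shared_true_positives(detected_a, detected_b, truth):
    """Taxa reported by both methods that are really present in the community."""
    return detected_a & detected_b & truth

# Toy community with one dominant and several rare taxa (made-up values).
community = {"sp1": 0.90, "sp2": 0.05, "sp3": 0.03, "sp4": 0.015, "sp5": 0.005}
run_a = simulate_detection(community, n_reads=50, seed=1)
run_b = simulate_detection(community, n_reads=50, seed=2)
print(sorted(run_a), sorted(run_b), percent_overlap(run_a, run_b))
```

With few reads, rare taxa tend to drop out of one or both runs, which lowers the overlap. This is the low-abundance effect listed as point 1 of the abstract, reproduced here only in toy form.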

More information

Item ID: 68843
DC Identifier: https://oa.upm.es/68843/
OAI Identifier: oai:oa.upm.es:68843
Deposited by: Belén Serrano Antón
Deposited on: 15 Oct 2021 04:56
Last Modified: 15 Oct 2021 04:56