Microbiome Research
The term “microbiome” refers to the collection of microbial taxa, or microbes, and their genes. It thus describes all the organisms and genomes that compose a sample (Xia, Sun, & Chen, 2018). In 2005, with advances in DNA-sequencing technologies such as 454 pyrosequencing and Illumina sequencing, researchers started to analyze DNA extracted directly from a sample rather than from individually cultured microbes (Eckburg, et al., 2005). This approach has been widely applied in different scenarios (Beiko, Hsiao, & Parkinson, 2018), especially because it does not depend on culture-based methods, which in the past resulted in underestimation of the complexity of the human microbiome (Xia, Sun, & Chen, 2018).
Different strategies have been used to investigate microbiomes. Of those, the amplicon sequencing approach, in which one particular gene (often 16S rRNA) is amplified and sequenced, and the shotgun metagenomic sequencing approach, in which DNA from the whole community is randomly sampled and sequenced, are the most commonly found in the literature (Beiko, Hsiao, & Parkinson, 2018), (Morgan, Toit, & Setati, 2017), (Xia, Sun, & Chen, 2018).
Amplicon Sequencing with Targeted Genes
The development of targeted gene sequencing, also known as amplicon sequencing, in which a single taxonomically informative “marker gene” is amplified by PCR from the organisms of interest, has further reduced cost and increased the depth of taxonomic information (Beiko, Hsiao, & Parkinson, 2018). Ribosomal genes are conserved among all known organisms. Therefore, these genes allow for the joint reconstruction of both prokaryotic and eukaryotic phylogenies (Silva, Oliveira, & Grisolia, 2016). In bacteria, targeting the 16S rRNA gene has yielded a large amount of data on bacterial diversity in the environment and in the human microbiome. Just as the 16S rRNA gene is used to survey bacterial taxa, the 18S rRNA gene is a suitable marker for eukaryotes because of its conserved genetic regions (Beiko, Hsiao, & Parkinson, 2018), although the low nucleotide substitution rate in the 18S rRNA gene often prevents full discrimination between closely related species (Silva, Oliveira, & Grisolia, 2016). The internal transcribed spacer (ITS) region is a reliable DNA marker for the identification of fungal species in metagenomic samples (Turenne, Sanche, Hoban, Karlowsky, & Kabani, 1999) and has been formally proposed to the Consortium for the Barcode of Life for adoption as the primary fungal barcode marker (Schoch, et al., 2012). In addition, the 28S rRNA gene can discriminate fungal species on its own or combined with ITS (Silva, Oliveira, & Grisolia, 2016).
The major steps of the amplicon sequencing approach are: (i) DNA extraction from samples, (ii) PCR amplification of targeted genes and library preparation, (iii) DNA sequencing, (iv) quality checking and (v) clustering of sequences into operational taxonomic units (OTUs). So far, there is no standard protocol for sample collection that guarantees the quality of microbiome data, and no DNA extraction method can provide a truly unbiased DNA sample, although the Human Microbiome Project (HMP) has published a few recommended protocols (NIH, 2010). Barcoded primers are usually used during PCR amplification, followed by library preparation, in which adaptors and indexes are fused to the targeted fragments of DNA or RNA (e.g. 16S, 18S, ITS, …). This allows genomic regions to be compared between experimental data and reference libraries to provide taxonomic resolution. Next-generation sequencers such as 454 pyrosequencing and Illumina are commonly used for microbiome research, and since each technology has pros and cons, selecting the appropriate NGS approach is a critical choice in the workflow. 454 pyrosequencing can generate longer sequences, which improves taxonomic analysis, whereas Illumina provides more coverage at lower cost. After sequencing, the raw files must pass through a pre-processing step in which chimeras, low-quality sequences and short reads are removed, denoising the data. This is an important step, since it improves accuracy and avoids overestimating the number of taxa in the community (Beiko, Hsiao, & Parkinson, 2018). Finally, the sequences are clustered using an OTU-based method (or phylotype-based method), which provides the taxonomic distance between sequences. OTUs are defined as groups of sequences that share high similarity (usually 97% at the species level) with one another. The percentage similarity between OTUs and a reference database (e.g. SILVA, RDP, Greengenes, NCBI, UNITE, …) allows taxonomy assignment and the estimation of relative abundances in the microbiota sample (Xia, Sun, & Chen, 2018).
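The quality-checking and clustering steps described above can be illustrated with a minimal Python sketch. It is not the algorithm of any particular pipeline: the function names, the length and quality cut-offs, and the simplification that sequences are already aligned and of equal length are all assumptions made for illustration. Only the 97% threshold comes from the text.

```python
from typing import List, Tuple


def mean_quality(phred: List[int]) -> float:
    """Mean Phred score of a read (0.0 for an empty read)."""
    return sum(phred) / len(phred) if phred else 0.0


def quality_filter(reads: List[Tuple[str, List[int]]],
                   min_len: int = 50,
                   min_mean_q: float = 20.0) -> List[str]:
    """Drop short reads and reads with low mean quality.

    The cut-offs are illustrative assumptions, not recommended values.
    """
    return [seq for seq, quals in reads
            if len(seq) >= min_len and mean_quality(quals) >= min_mean_q]


def identity(a: str, b: str) -> float:
    """Fraction of identical positions.

    Assumes pre-aligned sequences of equal length; real tools align first.
    """
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otu_clustering(seqs: List[str],
                          threshold: float = 0.97) -> List[Tuple[str, List[str]]]:
    """Greedy centroid clustering: each sequence joins the first OTU whose
    centroid it matches at or above the threshold, else it founds a new OTU."""
    otus: List[Tuple[str, List[str]]] = []
    for s in seqs:
        for centroid, members in otus:
            if identity(s, centroid) >= threshold:
                members.append(s)
                break
        else:
            otus.append((s, [s]))
    return otus
```

In this sketch the result depends on input order, because the first sequence of each cluster becomes its centroid. Production tools typically sort sequences by abundance or length before clustering for that reason.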
There is no perfect database choice, since each has its own protocols, taxonomic coverage and particularities. For instance, SILVA, RDP and Greengenes are commonly used for 16S analysis due to their vast archaeal and bacterial data, while UNITE is better suited to 18S and ITS analysis due to its high content of fungal data (Beiko, Hsiao, & Parkinson, 2018).
On the other hand, the amplicon sequencing methodology has a few disadvantages, such as biases associated with PCR, overestimation of community diversity or species abundance, and the inability to directly analyze the biological functions of the associated taxa (Xia, Sun, & Chen, 2018).
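As a rough illustration of how percentage similarity to a reference database leads to taxonomy assignment and relative abundances, the following Python sketch may help. The reference dictionary stands in for a real database such as SILVA or UNITE, and its labels and sequences, the function names and the equal-length identity measure are all hypothetical simplifications.

```python
from typing import Dict, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of identical positions (assumes aligned, equal-length input)."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign_taxonomy(otu_seq: str,
                    reference: Dict[str, str],
                    min_identity: float = 0.97) -> Tuple[str, float]:
    """Return the best-matching reference label and its identity, or
    'Unclassified' when no reference reaches the minimum identity."""
    best_label, best_id = "Unclassified", 0.0
    for label, ref_seq in reference.items():
        pid = identity(otu_seq, ref_seq)
        if pid > best_id:
            best_label, best_id = label, pid
    if best_id < min_identity:
        return "Unclassified", best_id
    return best_label, best_id


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-taxon read counts into proportions of the sample total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


# Hypothetical toy reference; real databases hold curated marker sequences.
toy_reference = {"taxon_A": "ACGT" * 25, "taxon_B": "TGCA" * 25}
print(assign_taxonomy("ACGT" * 25, toy_reference))
print(relative_abundance({"taxon_A": 3, "taxon_B": 1}))
```

Because each database differs in coverage, the same OTU may be classified under one reference and left unclassified under another, which is why the choice of database matters.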
Shotgun Metagenomic Sequencing
As an alternative to targeted-gene amplicon sequencing, shotgun metagenomic sequencing can be used to fill these gaps and provide a better understanding of the microbiome, especially through taxonomic analysis (who is there?), functional analysis (what are they doing?) and comparative analysis (how do communities compare?) (Xia, Sun, & Chen, 2018).
This approach does not target any specific gene. Instead, the DNA of the entire community is extracted and independently sequenced, which produces a massive number of DNA reads that can be aligned to genomic locations in the sample. For instance, reads can come from taxonomically informative genomic loci (e.g. 16S) and from coding sequences, providing insight into both the community structure and the metagenome. In other words, the approach can discriminate strains of a common species by gene content. In addition, shotgun metagenomic analysis is potentially unbiased, which allows better resolution of novel microorganisms (Xia, Sun, & Chen, 2018).
The main steps of shotgun metagenomic sequencing are: (i) DNA extraction from samples, (ii) random fragmentation of the DNA and library preparation, (iii) DNA sequencing, (iv) quality checking, (v) assembly and (vi) binning/annotation. The first four steps are similar to those of the amplicon sequencing approach, except that no specific gene is targeted by PCR. In this way, a massive amount of DNA from all cells is collected. The assembly step consists of assembling short reads into longer, contiguous sequences (also called “contigs”). It can be a reference-based assembly, in which the reads are assembled against a reference genome, or a de novo assembly, in which no reference genome is used. De novo assembly is usually computationally costlier. However, it can be used as a cross-check for the reference-based assembly, providing insights that help to better understand the metagenome (Xia, Sun, & Chen, 2018). Binning is a process in which the contigs are sorted into groups that might represent an individual genome or closely related organisms. The most common algorithms are compositional binning, based on the genome-specific GC content or on the abundance/distribution of k-mers, and similarity binning, in which DNA fragments are compared to a reference genome and clustered based on the similarity of their genes. The annotation step makes it possible to identify genes of interest and to better understand the functional pathways that define the microbiome.
This methodology, however, has a few disadvantages, such as the technical challenge of processing huge amounts of data, large and complex outputs that make gene tracking difficult, and complications in identifying taxa that differ between communities (Xia, Sun, & Chen, 2018).
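The two signals used in compositional binning, GC content and k-mer distribution, can be computed with a few lines of Python. The sketch below is only an illustration under simplifying assumptions: the function names are hypothetical, the contigs are toy strings, and the fixed-width GC bins stand in for the statistical clustering that real binning tools perform.

```python
from collections import Counter
from itertools import product
from typing import Dict, List


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_profile(seq: str, k: int = 2) -> List[float]:
    """Normalized k-mer frequency vector over the ACGT alphabet.

    k-mers containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


def bin_by_gc(contigs: Dict[str, str], n_bins: int = 10) -> Dict[int, List[str]]:
    """Group contig names into equal-width GC-content bins.

    A deliberately crude stand-in for compositional binning.
    """
    bins: Dict[int, List[str]] = {}
    for name, seq in contigs.items():
        key = min(int(gc_content(seq) * n_bins), n_bins - 1)
        bins.setdefault(key, []).append(name)
    return bins


# Toy contigs (made up for illustration): two AT-rich, one GC-rich.
toy_contigs = {"c1": "ATATATGC", "c2": "ATTAATGC", "c3": "GCGCGCAT"}
print(bin_by_gc(toy_contigs))
```

In practice, k-mer profiles (often tetranucleotides, k = 4) are combined with coverage information and clustered, since GC content alone cannot separate genomes with similar base composition.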
Bioinformatics Data Analysis Tools
With the rise of computer processing capability in recent years, the outputs of microbiome research have improved significantly. Pipelines such as Quantitative Insights Into Microbial Ecology (QIIME) and mothur allow users to demultiplex files, remove barcodes and adaptors, perform quality checking and cluster sequences into OTUs, generating an OTU table and a phylogenetic tree. Furthermore, they can perform statistical analyses such as alpha/beta diversity, dispersion plots and boxplots. For instance, Buza et al. (2019) developed a full pipeline for 16S analysis in which both mothur and QIIME are used as platforms, with raw reads and a mapping file as input and alpha/beta diversity and phylogenetic trees as outputs. For shotgun metagenomic sequencing, the most common pipelines include Metagenomics Rapid Annotation using Subsystem Technology (MG-RAST) and the Rapid Analysis of Multiple Metagenomes with Clustering and Annotation Pipeline (RAMMCAP), which can provide useful outputs for metagenomic analysis. Buza et al. (2019) also unified several protocols and bioinformatic pipeline instructions for microbiome analysis, which can give a head start to anyone who is new to this field.
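To make the alpha/beta diversity outputs mentioned above more concrete, here is a minimal Python sketch of two widely used measures: the Shannon index (alpha diversity within one sample) and Bray-Curtis dissimilarity (beta diversity between two samples). The choice of these two measures and the function names are assumptions for illustration. The text does not state which metrics a given pipeline reports, and this is not the API of QIIME or mothur.

```python
import math
from typing import Sequence


def shannon(counts: Sequence[int]) -> float:
    """Shannon index H = -sum(p_i * ln p_i) over the OTUs present in a sample."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a: Sequence[int], b: Sequence[int]) -> float:
    """Bray-Curtis dissimilarity between two samples over the same OTU order:
    sum(|a_i - b_i|) / (sum(a) + sum(b)); 0 = identical, 1 = no shared OTUs."""
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Toy OTU table rows (made-up counts): one row per sample, one column per OTU.
sample_1 = [10, 10, 0]
sample_2 = [0, 10, 10]
print(shannon(sample_1), bray_curtis(sample_1, sample_2))
```

Both functions take rows of an OTU table as plain count lists, so the columns must refer to the same OTUs in the same order for every sample being compared.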
Bibliography
- Beiko, R. G., Hsiao, W., & Parkinson, J. (Eds.). (2018). Microbiome Analysis: Methods and Protocols.
- Buza, T. M., Tonui, T., Stomeo, F., Tiambo, C., Katani, R., Schilling, M., . . . Kapur, V. (2019). iMAP: an integrated bioinformatics and visualization pipeline for microbiome data analysis. BMC Bioinformatics.
- Eckburg, P. B., Bik, E. M., Bernstein, C. N., Purdom, E., Dethlefsen, L., Sargent, M., . . . Relman, D. A. (2005, June 10). Diversity of the Human Intestinal Microbial Flora. Science, pp. 1635-1638.
- Morgan, H. H., Toit, M. D., & Setati, M. E. (2017). The Grapevine and Wine Microbiome: Insights from High-Throughput Amplicon Sequencing. Frontiers in Microbiology.
- NIH. (2010). Human Microbiome Project – Core Microbiome Sampling Protocol A. Retrieved from NIH Human Microbiome Project: https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/GetPdf.cgi?id=phd002854.2
- Schoch, C. L., Seifert, K. A., Huhndorf, S., Robert, V., Spouge, J. L., Levesque, A., & Chen, W. (2012). Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proceedings of the National Academy of Sciences of the United States of America, 6241-6246.
- Silva, D. B., Oliveira, K. M., & Grisolia, A. B. (2016). Molecular Methods Developed for the Identification and Characterization of Candida Species. International Journal of Genetic Science, 1-6.
- Turenne, C. Y., Sanche, S. E., Hoban, D. J., Karlowsky, J. A., & Kabani, A. M. (1999). Rapid Identification of Fungi by Using the ITS2 Genetic Region and an Automated Fluorescent Capillary Electrophoresis System. Journal of Clinical Microbiology, 1846-1851.
- Xia, Y., Sun, J., & Chen, D. (2018). Statistical Analysis of Microbiome Data with R. Singapore.