Introducing EzBioCloud: A taxonomically united database of 16S rRNA and whole genome assemblies
- Authors: Seok-Hwan Yoon (1), Sung-Min Ha (1), Soonjae Kwon (1), Jeongmin Lim (1), Yeseul Kim (1), Hyungseok Seo (1), Jongsik Chun (2)
- Affiliations: (1) ChunLab, Inc.; (2) Seoul National University
- First Published Online: 22 December 2016, International Journal of Systematic and Evolutionary Microbiology doi: 10.1099/ijsem.0.001755
- This is an open access article published by the Microbiology Society under the Creative Commons Attribution License
Recent advances in DNA sequencing technologies have made genome sequence data widely available, enabling more informative and precise classification and identification of Bacteria and Archaea. Because the current species definition is based on the comparison of genome sequences between the type strain and other strains of a given species, a genome database with correct taxonomic information is essential for exploring prokaryotic diversity, discovering new species and performing routine identifications. Here we introduce an integrated database, called EzBioCloud, that holds the taxonomic hierarchy of Bacteria and Archaea, represented by quality-controlled 16S rRNA gene and genome sequences. Whole genome assemblies in the NCBI Assembly Database were screened to exclude low-quality assemblies and then subjected to a composite bioinformatics identification pipeline that employs gene-based searches followed by the calculation of average nucleotide identity. As a result, the database comprises 61,700 species/phylotypes, including 13,132 with validly published names, and 62,362 whole genome assemblies that were taxonomically identified at the genus, species and subspecies levels. Genomic properties, such as genome size and GC content, and occurrence in human microbiome data were calculated for each genus and higher taxon. This united database of taxonomy, 16S rRNA gene sequences and genome sequences, with accompanying bioinformatics tools, should accelerate genome-based classification and identification of Bacteria and Archaea. The database and related search tools are available at http://www.ezbiocloud.net/.
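As an illustration of the genomic properties mentioned in the abstract, the following is a minimal sketch of how genome size and GC content could be derived from the contigs of one assembly. It is not the EzBioCloud implementation: the function name `genome_stats`, the choice to count ambiguous bases toward genome size but not toward GC content, and the toy contigs are all assumptions made for this example.

```python
from typing import Dict, Iterable


def genome_stats(contigs: Iterable[str]) -> Dict[str, float]:
    """Return total assembly length and GC percentage for a set of contigs.

    Hypothetical helper for illustration only. Ambiguous bases (e.g. N)
    count toward genome size but are excluded from the GC denominator.
    """
    total = 0  # every base in the assembly, ambiguous ones included
    gc = 0     # G and C bases
    acgt = 0   # unambiguous bases only
    for seq in contigs:
        seq = seq.upper()  # tolerate soft-masked (lowercase) sequence
        total += len(seq)
        g_c = seq.count("G") + seq.count("C")
        a_t = seq.count("A") + seq.count("T")
        gc += g_c
        acgt += g_c + a_t
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return {"genome_size": total, "gc_percent": gc_percent}


# Toy contigs, not real data.
stats = genome_stats(["ATGCGC", "ggccNNat"])
print(stats["genome_size"], round(stats["gc_percent"], 2))
```

Per-genus or higher-taxon values such as those described above could then be summarized from the per-assembly results, for example as a mean or a range. The abstract does not state which summary statistic is used.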