In the latter half of the twentieth century, it became clear that bacteria could be grouped into taxonomic clusters based on stable phenotypic characters (e.g. cellular morphology and composition, growth requirements and other metabolic traits) that could be measured reliably in the laboratory. In the 1960s and 1970s, Sneath and Sokal exploited improved technical and statistical methods to develop a numerical taxonomy, which revealed discrete phenotypic clustering within many bacterial genera [6]. Such phenotypic approaches soon faced competition from genotypic approaches, such as DNA base composition (mol% G+C content) [7] and whole-genome DNA-DNA hybridization (DDH); the latter remains the gold standard in bacterial taxonomy [8]. Within this framework, Wayne et al. [8] recommended that “a species generally would include strains with approximately 70% or greater DNA-DNA relatedness”. However, few laboratories now perform DNA-DNA hybridization assays, as these are onerous and technically demanding compared with the rapid and easy sequencing of small signature sequences, such as the 16S ribosomal RNA gene. This shift has led to an updated species definition: “a prokaryotic species is considered to be a group of strains that are characterized by a certain degree of phenotypic consistency, showing 70% of DNA–DNA binding and over 97% of 16S ribosomal RNA (rRNA) gene-sequence identity” [9]. Most recently, whole-genome sequencing has delivered new taxonomic metrics, for example average nucleotide identity (ANI), calculated from pairwise comparisons of all sequences shared between any two strains. ANI exhibits a strong correlation with DDH values, with an ANI value of ≥ 95% corresponding to the traditional 70% DDH threshold [10].

Despite the ready availability of genome sequence data, microbial taxonomy remains a conservative discipline. When defining a bacterial species, most modern microbial taxonomists use a polyphasic approach, whereby a bacterial species represents “a monophyletic and genomically coherent cluster of individual organisms that show a high degree of overall similarity with respect to many independent characteristics, and is diagnosable by a discriminative phenotypic property” [11]. Although the polyphasic approach is pragmatic and widely applicable, it has drawbacks. It relies on phenotypic information, which in turn relies on growth in the laboratory, usually in pure culture, and this may not be achievable for many bacterial species [12]. It also relies on techniques that are time-consuming and difficult to standardize, particularly when compared to the ease of modern genome sequencing [4, 13, 14]. We, like others, are therefore driven to consider whether, in the genomic era, bacterial taxonomy could, and should, abandon phenotypic approaches and rely exclusively on analyses of genome sequence data [4, 10, 14–18].
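To make the genomic criterion quoted above concrete, the following is a minimal sketch of how an ANI-based species call could be expressed. It assumes that per-fragment percent identities for the sequences shared between two strains have already been obtained from an alignment step that is not shown; the function names and the unweighted mean are illustrative assumptions, not the procedure of any particular ANI tool, which may weight or filter fragments differently. Only the 95% cut-off is taken from the text.

```python
# Illustrative sketch only: names and the unweighted mean are assumptions.
ANI_SPECIES_THRESHOLD = 95.0  # percent; corresponds to ~70% DDH per the text [10]


def average_nucleotide_identity(fragment_identities):
    """Return the mean percent identity over fragments shared by two genomes.

    `fragment_identities` is a sequence of percent identities (0-100),
    one per shared fragment, produced by an alignment step not shown here.
    """
    if not fragment_identities:
        raise ValueError("no shared fragments: ANI is undefined")
    return sum(fragment_identities) / len(fragment_identities)


def same_species_by_ani(ani, threshold=ANI_SPECIES_THRESHOLD):
    """Apply the >= 95% ANI criterion to a single pairwise ANI value."""
    return ani >= threshold


if __name__ == "__main__":
    # Hypothetical identities for three shared fragments.
    ani = average_nucleotide_identity([96.0, 98.0, 94.0])
    print(ani, same_species_by_ani(ani))
```

Such a check captures only the genomic threshold; it says nothing about the phenotypic or 16S rRNA criteria that the polyphasic definition also requires.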
