Why sequence “known” genomes?

Genetic variability is the driving force of evolution. In prokaryotes, many mechanisms have emerged that enhance variability, allowing microorganisms to spread across different ecological niches. In a clinical context, major attention is given to the capacity of some microorganisms to colonize human tissues and cause disease.

Even though a plethora of genome sequences are already known, they are usually not enough to explain the variable behavior of such organisms. For instance, some bacteria are known to be highly pathogenic, even though other variants of the same species are harmless to humans. This is due to the genetic variability intrinsic to each strain.

Therefore, getting to know as many genomes as possible allows us to better understand patterns of variability and the genetic basis of some phenotypes.

As an example, consider the sequencing of a uropathogenic E. coli strain named BH100, isolated in 1973 in Belo Horizonte, Brazil. This strain was found to be resistant to several kinds of antibiotics and, in some cases, showed unstable resistance to different concentrations of streptomycin, which no one could explain. Interestingly, since its isolation it has been cultivated under environmental conditions, in simple growth media, which makes it a good model for studying what happens to pathogenic traits in the absence of a selective environment.

Thanks to Next Generation Sequencing (NGS) and bioinformatics protocols, we were able to assemble the genome of this strain and compare it with others already in the databases, and with variants obtained by curing the original plasmids. As a consequence, we could investigate the effects of these elements on the emergence of variability.

As a result, we found that:

1) This strain carries a considerable number of insertion sequences, distributed in substantially different ways along the chromosomes of BH100 and its variants, revealing a strong structural effect on the genome triggered by mobile elements.

2) Insertion sequences in the plasmids contribute to the variability observed in the chromosome, in some cases with plasmid components inserted into the chromosome.

3) The comparative analysis showed that this strain retained its uropathogenic traits, in spite of its cultivation conditions.

4) The functional annotation confirmed that all resistance markers are located in transposons (Tn). Moreover, a new transposon was discovered that carries a kanamycin-resistance gene.

5) We were also able to propose a new mechanism capable of explaining the emergence of unstable and diversified resistance to streptomycin within the E. coli BH100 population. In this model, we suggest that streptomycin resistance arises from a random increase in the copy number of the gene aadA1, located in one of the plasmids, triggered by the independent segregation of plasmid copies during cell division.
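The intuition behind the mechanism in item 5 can be illustrated with a toy simulation. This is only a minimal sketch and not the model from the dissertation: it assumes that every plasmid copy replicates exactly once per generation and that each copy goes to either daughter cell with probability 0.5, independently of the others. The function names, the starting copy number and the number of generations are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def divide(copies, rng):
    """Split one cell into two daughters.

    Assumption: every plasmid replicates once, then each copy is sent to
    daughter A with probability 0.5, independently of the other copies.
    """
    replicated = 2 * copies
    to_a = sum(1 for _ in range(replicated) if rng.random() < 0.5)
    return to_a, replicated - to_a


def simulate(start_copies=5, generations=8, seed=0):
    """Grow a population from a single cell and return the plasmid copy
    number (a stand-in for the aadA1 dosage) of every descendant."""
    rng = random.Random(seed)
    population = [start_copies]
    for _ in range(generations):
        next_generation = []
        for copies in population:
            next_generation.extend(divide(copies, rng))
        population = next_generation
    return population


def copy_number_histogram(population):
    """Count how many cells carry each copy number, sorted by copy number."""
    return dict(sorted(Counter(population).items()))


if __name__ == "__main__":
    cells = simulate()
    print("cells:", len(cells))
    print("mean copies per cell:", sum(cells) / len(cells))
    print("copy number -> number of cells:", copy_number_histogram(cells))
```

Under these assumptions the average copy number per cell stays at its starting value, but individual cells drift away from it: some lose the plasmid entirely while others accumulate extra copies. If resistance scales with aadA1 dosage, as the proposed model suggests, such a population would contain cells that tolerate different streptomycin concentrations, and that tolerance would not be stably inherited.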

Many different E. coli genomes had already been sequenced by the time this research was conducted. However, as our results revealed, sequencing one more strain contributed new information about microbial variability and even about aspects of uropathogenicity. This work demonstrates the importance of whole genome sequencing as a strategy for adding relevant data to research.

As the price of NGS keeps decreasing, it will become easier to sequence as many variant strains as needed, allowing a deeper understanding of microbial behavior.


OLIVEIRA, G. S. Estudo da diversidade genética da Escherichia coli BH100 via genômica estrutural e comparativa. 2018. Dissertação (Mestrado em Bioinformática)
