The E. coli outbreak was responsible for 50 deaths, and caused kidney failure in nearly 1,000 individuals who developed hemolytic-uremic syndrome.
The findings, published online on November 8, 2012, in Nature Biotechnology, provide novel insights into the role of epigenetic DNA base modifications in driving molecular processes of this E. coli strain. The discovery enables scientists to use new approaches to fully characterize pathogens, which could lead to the development of new treatments.
During the 2011 outbreak, Eric Schadt, PhD, Director of Mount Sinai’s Institute for Genomics and Multiscale Biology, and colleagues teamed up with researchers from around the world to sequence the DNA of the outbreak strain and 11 related strains, providing a detailed identification of the outbreak strain and insights into its evolutionary origins. While their work provided valuable initial findings, the DNA sequence alone did not fully explain the unusually high virulence seen in this E. coli outbreak strain.
To further investigate, Dr. Schadt and study co-author Matthew Waldor, MD, PhD, BWH Division of Infectious Diseases, Department of Medicine, reanalyzed the sequencing data to examine the chemical modifications to individual DNA bases.
“The information content of the genetic code is not limited to the primary nucleotide sequence of A’s, G’s, C’s and T’s,” said Schadt, who is also the Jean C. and James W. Crystal Professor of Genomics, and Chair of the Department of Genetics and Genomics Sciences at Mount Sinai. “Individual DNA bases can be chemically modified, changing how proteins interact with that particular sequence and as a result having significant functional consequences. Without genome-wide DNA base modification information, you simply don’t have a complete picture of all the variation and the phenotypic variability that we see.”
Although it is important to be able to capture base modification information during genomic analysis, limitations in sequencing technology make it difficult for scientists to study the many types of base modifications that occur in nature. The research team employed the PacBio® RS system from Pacific Biosciences, which collects data on base modifications at the same time as it collects DNA sequence data. The researchers applied advanced algorithms published online on October 23 in Genome Research, in the first paper to statistically model PacBio sequencing data for epigenetics. Schadt is a member of the scientific advisory board of Pacific Biosciences.
“By enabling genome-wide detection of chemical modifications of DNA, PacBio sequencing opens a new window in our understanding of epigenetic mechanisms in bacteria,” said Waldor.
The researchers detected widespread base modifications in the genome of the E. coli outbreak strain, identifying ~50,000 methyladenine-modified bases. In collaboration with Richard Roberts, PhD, Chief Scientific Officer of New England Biolabs, Schadt and team discovered a series of enzymes that appeared to target specific DNA sequence motifs throughout the genome as they made their chemical changes. For example, the well-known Dam methyltransferase was directly observed to target the A residue in DNA with the sequence motif GATC, while a different, novel methyltransferase was found to act on the ACCACC motif.
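To make the idea of motif targeting concrete, here is a minimal sketch, in Python, of how one might locate candidate sites for such enzymes in a DNA sequence. It is an illustration only and not the researchers' method: the study detected modifications from PacBio sequencing data, whereas this sketch simply scans text for a motif. The function name `find_motif_starts` and the toy sequence are hypothetical, and the scan covers only the strand given.

```python
def find_motif_starts(sequence, motif):
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of `motif` in `sequence`, scanning the given strand only."""
    sequence = sequence.upper()
    motif = motif.upper()
    starts = []
    start = sequence.find(motif)
    while start != -1:
        starts.append(start)
        # Advance by one base so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return starts


# Hypothetical toy sequence, made up for illustration (not real genome data).
toy_sequence = "TTGATCAAGATCCACCACCGG"

# Dam targets the A in GATC; that A is the second base of the motif,
# so its position is the motif start plus one.
gatc_starts = find_motif_starts(toy_sequence, "GATC")
dam_target_positions = [s + 1 for s in gatc_starts]

# For ACCACC, the text does not say which base is modified,
# so only the motif start positions are reported here.
accacc_starts = find_motif_starts(toy_sequence, "ACCACC")

print("GATC motif starts:", gatc_starts)
print("Candidate Dam target A positions:", dam_target_positions)
print("ACCACC motif starts:", accacc_starts)
```

A scan like this only lists where an enzyme could act; whether a given base is actually modified has to come from the sequencing data itself.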
“We found a whole array of methylase enzymes that were making modifications by targeting different sequence motifs,” said Schadt. “It almost appears like another language. The DNA bases targeted for modification are highly non-random, and the targeting had a broad effect on the transcription of genes.”
The team followed up the base modification study with RNA sequencing to determine how the genes inducing these epigenetic marks were affecting the transcriptome. “The accepted dogma for the primary role of restriction modification systems is defense of host cells from invasion by foreign DNA,” Schadt said. “However, we found that these modifications had a very significant impact on the transcription of genes, and that the genes being affected were enriched in a number of different pathways.”
Notably, the team found marked enrichment for pathways linked to processes associated with bacterial growth and motility. Throughout the organism’s genome, many pathways were up- or down-regulated by one of the methylases found in a mobile element next to the Shiga toxin gene, which is known to impact virulence. This methylase, targeting the motif CTGCAG, was specific to the outbreak strain and absent from several non-virulent strains that were also studied. In addition, when this methylase was removed in a knockout experiment, structural changes were observed in the genome of the outbreak strain, which may signify a potential role in further affecting processes associated with virulence and pathogenicity.
This research highlights the importance of generating more dimensions of data (such as DNA, DNA base modifications, RNA, and proteins) and then integrating them to form a multiscale view of the organism’s biology. “Living systems are composed of lots of pieces interacting in very complex ways,” Schadt said. “To understand such systems, we need to take into account more of the information on a global level, not just a single protein level. This is how we can see the whole picture of an organism’s biology.”
Study authors include scientists from Harvard Medical School; Howard Hughes Medical Institute; University of Minnesota; New England Biolabs; University of Michigan; Pacific Biosciences; and Stanford University.
This press release was adapted from a release by Mount Sinai School of Medicine.