Genome-scale analysis of Shiga toxin-producing E. coli (STEC) O-serogroups


Strains of E. coli commonly associated with food poisoning and other serious human illnesses often produce Shiga toxins (Stx), a family of related protein toxins encoded by lambdoid prophages, with two major types, Stx1 and Stx2. Shiga toxin was originally described in Shigella dysenteriae by the Japanese bacteriologist Kiyoshi Shiga. Over 100 serotypes of Shiga toxin-producing E. coli (STEC) have been associated with human infections, including the most common serotype, E. coli O157:H7, a major food-borne pathogen that has been implicated in many food-poisoning outbreaks worldwide. It is estimated that E. coli O157:H7 causes more than 73,000 cases of illness and 61 deaths in humans each year in the United States. A total of 70 serogroups of non-O157 STEC have been described in the literature. Non-O157 strains belonging to serogroups O26, O45, O91, O103, O111, O121, O145, and others have become important public health problems in the United States, causing an estimated 37,000 cases of illness and 30 deaths each year. Although several strains of E. coli O157:H7 have been sequenced, the genomes of strains from only 4 of the 70 non-O157 STEC serogroups (O26, O111, O103, and O127) have been characterized thus far. It is widely recognized that this lack of genome-scale information considerably limits our understanding of the genetics, pathogenic potential, and evolutionary history of this important group of organisms.

The project seeks to help fill this knowledge gap by obtaining complete genome sequence information for five major groups of human and animal pathogenic E. coli:
1. Non-O157 STEC (23 isolates);
2. Non-STEC O157 (4 isolates);
3. Diarrheagenic E. coli (21 isolates);
4. Extraintestinal pathogenic E. coli (7 isolates); and
5. Reference isolates from the international E. coli serotype collection (112 isolates).

Groups 1 and 2 are considered high priority and will be sequenced to high quality using whole-genome shotgun methodology and autoclosure. For Groups 3, 4, and 5, we propose only draft sequencing, which is sufficient to determine the genetic potential and mechanisms of pathogenesis of these human and animal pathogens.

The availability of the whole genome sequences of multiple O groups of STEC will enable researchers to:
(a) Identify and determine the role and relevance of genes encoding virulence factors such as extracellular toxins, cell-surface antigens, and other molecules implicated as determinants of pathogenicity and disease specificity;
(b) Study the molecular mechanisms involved in generating host specificity of clones recovered from human and animal infections;
(c) Elucidate the molecular basis for the nonrandom association of certain bacterial clones with specific disease conditions in humans and other mammalian hosts;
(d) Examine molecular mechanisms involved in the rise of new and unusually virulent bacterial clones; and
(e) Identify specific genes and proteins suitable for use in the development of the next generation of diagnostic, therapeutic, and immunoprophylactic agents.

White Paper Access

The initial white paper submitted can be downloaded here. Since white papers are not always approved exactly as submitted, this document may not exactly describe the final form of the project. Please contact if you have any questions.

All publications that use data generated by, or are otherwise supported by, the Sequencing Center at JCVI should acknowledge the sponsor as follows: This project has been funded in whole or part with federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services under contract numbers N01-AI30071 and/or HHSN272200900007C.

Investigators and Collaborators

Liliana Losada, PhD

Assistant Professor, JCVI

Vivek Kapur, PhD

Department of Veterinary and Biomedical Sciences, Penn State University
