8.8D: Actinobacteria (High G + C Gram-Positive Bacteria) - Biology


Actinobacteria are a group of Gram-positive bacteria with high guanine and cytosine content in their DNA.

Learning Objectives

  • Outline the characteristics associated with Actinobacteria

Key Points

  • Actinobacteria is one of the dominant phyla of the bacteria.
  • Actinobacteria include some of the most common soil, freshwater, and marine life. They play an important role in the decomposition of organic materials such as cellulose and chitin, and thereby a vital part in organic matter turnover and the carbon cycle.
  • Actinobacteria are well known as secondary metabolite producers and hence of high pharmacological and commercial interest.
  • Some types of Actinobacteria are responsible for the peculiar odor emanating from the soil after rain (Petrichor), mainly in warmer climates.

Key Terms

  • actinobacteria: A group of Gram-positive bacteria with high guanine and cytosine content in their DNA
  • petrichor: The distinctive scent which accompanies the first rain after a long warm dry spell.
  • actinomycin: Any of a class of toxic polypeptide antibiotics found in soil bacteria of genus Streptomyces.

Actinobacteria are a group of Gram-positive bacteria with high guanine and cytosine content in their DNA. They can be terrestrial or aquatic. Actinobacteria is one of the dominant phyla of the bacteria. Analysis of glutamine synthetase sequence has been suggested for phylogenetic analysis of Actinobacteria.
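To make "high guanine and cytosine content" concrete, the following minimal Python sketch computes the G + C fraction of a DNA string. The function name `gc_content` and the two toy sequences are illustrative assumptions and do not come from any real genome.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases A, C, G, T."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (counts["G"] + counts["C"]) / total


# Hypothetical toy sequences, chosen only to illustrate the calculation.
gc_rich = "GCCGGCGCATGCCGGC"
at_rich = "ATTAAGCTTATAATTA"

print(f"GC-rich example: {gc_content(gc_rich):.3f}")
print(f"AT-rich example: {gc_content(at_rich):.3f}")
```

Ambiguous symbols such as N are ignored rather than counted as non-GC, which is one possible design choice for draft sequences.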

Actinobacteria include some of the most common soil, freshwater, and marine life. They play an important role in the decomposition of organic materials such as cellulose and chitin, and thereby a vital part in organic matter turnover and the carbon cycle. This replenishes the supply of nutrients in the soil and is an important part of humus formation. Other Actinobacteria inhabit plants and animals, including a few pathogens, such as Mycobacterium, Corynebacterium, Nocardia, Rhodococcus, and a few species of Streptomyces.

Actinobacteria are well known as secondary metabolite producers and hence of high pharmacological and commercial interest. In 1940 Selman Waksman discovered that the soil bacteria he was studying made actinomycin; he later received the 1952 Nobel Prize for the discovery of streptomycin. Since then, hundreds of naturally occurring antibiotics have been discovered in these terrestrial microorganisms, especially from the genus Streptomyces.

Some Actinobacteria form branching filaments, which somewhat resemble the mycelia of the unrelated fungi, among which they were originally classified under the older name Actinomycetes. Most members are aerobic, but a few, such as Actinomyces israelii, can grow under anaerobic conditions. Unlike the Firmicutes, the other main group of Gram-positive bacteria, they have DNA with a high GC-content, and some Actinomycetes species produce external spores.

Some types of Actinobacteria are responsible for the peculiar odor emanating from the soil after rain (petrichor), mainly in warmer climates. The chemical that produces this odor is known as geosmin.

Most Actinobacteria of medical or economic significance are in subclass Actinobacteridae, order Actinomycetales. While many of these cause disease in humans, Streptomyces is notable as a source of antibiotics. Of those Actinobacteria not in Actinomycetales, Gardnerella is one of the most researched. Classification of Gardnerella is controversial, and MeSH catalogues it as both a Gram-positive and a Gram-negative organism.

Genomic basis for natural product biosynthetic diversity in the actinomycetes

The phylum Actinobacteria hosts diverse high G + C, Gram-positive bacteria that have evolved a complex chemical language of natural product chemistry to help navigate their fascinatingly varied lifestyles. To date, 71 Actinobacteria genomes have been completed and annotated, with the vast majority representing the Actinomycetales, which are the source of numerous antibiotics and other drugs from genera such as Streptomyces, Saccharopolyspora and Salinispora. These genomic analyses have illuminated the secondary metabolic proficiency of these microbes – underappreciated for years based on conventional isolation programs – and have helped set the foundation for a new natural product discovery paradigm based on genome mining. Trends in the secondary metabolomes of natural product-rich actinomycetes are highlighted in this review article, which contains 199 references.


Phylogenetic relatedness of Actinobacteria based on 16S rRNA sequence and the relative number…

Circular representation of the actinomycete chromosomes S. coelicolor A3(2), S. erythraea NRRL2338 and…

Schematic representation of the S. avermitilis MA-4680 chromosome. Drawing details are the same…

Schematic representation of the S. griseus IFO 13350 genome. Drawing details are the…

The comparative analysis of nucleotide sequences of S. griseus IFO 13350 (top), S.…

Conserved genomic regions amongst S. griseus IFO 13350, S. avermitilis MA-4680 and S.…

Examples of “island” secondary metabolic gene clusters in Streptomyces involving (A) oligomycin biosynthesis…

Schematic representation of the S. arenicola CNS-205 genome. Drawing details are the same…

Gene organization and proposed biosynthesis of a phosphonoformate-like compound in Frankia alni ACN14a.…

Actinobacteria: High G+C Gram-Positive Bacteria

The name Actinobacteria comes from the Greek words for rays and small rod, but Actinobacteria are very diverse. Their microscopic appearance can range from thin filamentous branching rods to coccobacilli. Some Actinobacteria are very large and complex, whereas others are among the smallest independently living organisms. Most Actinobacteria live in the soil, but some are aquatic. The vast majority are aerobic. One distinctive feature of this group is the presence of several different peptidoglycans in the cell wall.

The genus Actinomyces is a much studied representative of Actinobacteria. Actinomyces spp. play an important role in soil ecology, and some species are human pathogens. A number of Actinomyces spp. inhabit the human mouth and are opportunistic pathogens, causing infectious diseases like periodontitis (inflammation of the gums) and oral abscesses. The species A. israelii is an anaerobe notorious for causing endocarditis (inflammation of the inner lining of the heart) (Figure 1).

Figure 1. (a) Actinomyces israelii (false-color scanning electron micrograph [SEM]) has a branched structure. (b) Corynebacterium diphtheriae causes the deadly disease diphtheria. Note the distinctive palisades. (c) The gram-variable bacterium Gardnerella vaginalis causes bacterial vaginosis in women. This micrograph shows a Pap smear from a woman with vaginosis. (credit a: modification of work by “GrahamColm”/Wikimedia Commons; credit b: modification of work by Centers for Disease Control and Prevention; credit c: modification of work by Mwakigonja AR, Torres LM, Mwakyoma HA, Kaaya EE)

The genus Mycobacterium is represented by bacilli covered with a mycolic acid coat. This waxy coat protects the bacteria from some antibiotics, prevents them from drying out, and blocks penetration by Gram stain reagents (see Staining Microscopic Specimens). Because of this, a special acid-fast staining procedure is used to visualize these bacteria. The genus Mycobacterium is an important cause of a diverse group of infectious diseases. M. tuberculosis is the causative agent of tuberculosis, a disease that primarily impacts the lungs but can infect other parts of the body as well. It has been estimated that one-third of the world’s population has been infected with M. tuberculosis and millions of new infections occur each year. Treatment of M. tuberculosis is challenging and requires patients to take a combination of drugs for an extended time. Complicating treatment even further is the development and spread of multidrug-resistant strains of this pathogen.

Another pathogenic species, M. leprae, is the cause of Hansen’s disease (leprosy), a chronic disease that impacts peripheral nerves and the integrity of the skin and mucosal surface of the respiratory tract. Loss of pain sensation and the presence of skin lesions increase susceptibility to secondary injuries and infections with other pathogens.

Bacteria in the genus Corynebacterium contain diaminopimelic acid in their cell walls, and microscopically often form palisades, or pairs of rod-shaped cells resembling the letter V. Cells may contain metachromatic granules, intracellular stores of inorganic phosphates that are useful for identification of Corynebacterium. The vast majority of Corynebacterium spp. are nonpathogenic; however, C. diphtheriae is the causative agent of diphtheria, a disease that can be fatal, especially in children (Figure 1b). C. diphtheriae produces a toxin that forms a pseudomembrane in the patient’s throat, causing swelling, difficulty breathing, and other symptoms that can become serious if untreated.

The genus Bifidobacterium consists of filamentous anaerobes, many of which are commonly found in the gastrointestinal tract, vagina, and mouth. In fact, Bifidobacterium spp. constitute a substantial part of the human gut microbiota and are frequently used as probiotics and in yogurt production.

The genus Gardnerella contains only one species, G. vaginalis. This species is defined as “gram-variable” because its small coccobacilli do not show consistent results when Gram stained (Figure 1c). Based on its genome, it is placed into the high G+C gram-positive group. G. vaginalis can cause bacterial vaginosis in women; symptoms are typically mild or even undetectable, but the infection can lead to complications during pregnancy.

Table 1 summarizes the characteristics of some important genera of Actinobacteria. Additional information on Actinobacteria appears in Taxonomy of Clinically Relevant Microorganisms.

Table 1. Actinobacteria: High G+C Gram-Positive
Example Genus | Microscopic Morphology | Unique Characteristics
Actinomyces | Gram-positive bacillus; in colonies, shows fungus-like threads (hyphae) | Facultative anaerobes; in soil, decompose organic matter; in the human mouth, may cause gum disease
Arthrobacter | Gram-positive bacillus (at the exponential stage of growth) or coccus (in stationary phase) | Obligate aerobes; divide by “snapping,” forming V-like pairs of daughter cells; degrade phenol, can be used in bioremediation
Bifidobacterium | Gram-positive, filamentous actinobacterium | Anaerobes commonly found in human gut microbiota
Corynebacterium | Gram-positive bacillus | Aerobes or facultative anaerobes; form palisades; grow slowly; require enriched media in culture; C. diphtheriae causes diphtheria
Frankia | Gram-positive, fungus-like (filamentous) bacillus | Nitrogen-fixing bacteria; live in symbiosis with actinorhizal (non-leguminous) plants such as alder
Gardnerella | Gram-variable coccobacillus | Colonize the human vagina, may alter the microbial ecology, thus leading to vaginosis
Micrococcus | Gram-positive coccus, form microscopic clusters | Ubiquitous in the environment and on the human skin; oxidase-positive (as opposed to morphologically similar S. aureus); some are opportunistic pathogens
Mycobacterium | Gram-positive, acid-fast bacillus | Slow growing, aerobic, resistant to drying and phagocytosis; covered with a waxy coat made of mycolic acid; M. tuberculosis causes tuberculosis; M. leprae causes leprosy
Nocardia | Weakly gram-positive bacillus; forms acid-fast branches | May colonize the human gingiva; may cause severe pneumonia and inflammation of the skin
Propionibacterium | Gram-positive bacillus | Aerotolerant anaerobe; slow-growing; P. acnes reproduces in the human sebaceous glands and may cause or contribute to acne
Rhodococcus | Gram-positive bacillus | Strict aerobe; used in industry for biodegradation of pollutants; R. fascians is a plant pathogen, and R. equi causes pneumonia in foals
Streptomyces | Gram-positive, fungus-like (filamentous) bacillus | Very diverse genus (>500 species); aerobic, spore-forming bacteria; scavengers, decomposers found in soil (give the soil its “earthy” odor); used in pharmaceutical industry as antibiotic producers (more than two-thirds of clinically useful antibiotics)

Distribution and Ecology

As members of the phylum Actinobacteria, Actinomycetes are widely distributed in different parts of the world. Because they can live in different environments and are nutritionally versatile, they spread and thrive in regions across the globe and compete with other organisms in their surroundings.

Diseases such as actinomycosis (caused by Actinomyces) also occur worldwide, but studies have shown that prevalence is influenced by factors such as socioeconomic status and hygiene.

While Actinomycetes can be found in a variety of habitats, they exist in the soil in significant numbers, making them some of the most common microorganisms in different types of soil (about 1 million cells per gram of soil). A variety of factors (e.g., pH) influence which species inhabit different types of soil.

Other factors that may influence the species present in soil include temperature and oxygen. In soil, Actinomycetes are particularly useful because they break down a variety of tough compounds and polymers, ranging from lignocelluloses to pectin and the cell walls of fungi.

In doing so, they play an important role in recycling organic matter in the soil while also controlling the growth of other microorganisms that tend to be pathogenic to plants.

* Actinomycetes are also involved in nitrogen fixation in soil.

* The earthy smell of freshly turned soil is due to the activities of actinomycetes in soil.

Apart from the soil, some species can be found in aquatic environments. Although they occur in both marine and freshwater environments, they make up a smaller population in marine habitats than in terrestrial and freshwater habitats.

As is the case in terrestrial/soil habitats, Actinomycetes are also involved in the breakdown of various materials in marine habitats. Here, they have been shown to help in the decomposition of cellulose, alginates, and various hydrocarbons.

* Thermophilic Actinomycetes can be found in compost/manure, particularly during the early stages of decomposition. Their involvement in decomposition raises the temperature of the manure/compost, which provides a favorable living environment.

* Studies investigating the characteristics of Actinomycetes found in freshwater (lakes, rivers, etc.) have shown that the majority of these organisms wash in from land.

Here, the spores that originate from the soil are hydrated, which activates them to release motile spores. There is no conclusive evidence that species found in marine habitats originated from land, given that they are found in increasing numbers at lower depths and are well adapted to this environment.

Some of the species are capable of surviving in various extreme environments and can, therefore, be classified based on these habitats.

· Alkalophilic species - Identified in soda lake soil (e.g. Bogoriella caseilytica)

· Halophilic species - Survive in areas with high salt concentrations (e.g. Saccharomonospora halophila)

· Psychrophilic species - Commonly found in very low temperatures (e.g. Modestobacter multiseptatus)

Bacteria, Distribution and Community Structure

A.C. Yannarell, A.D. Kent, in Encyclopedia of Inland Waters, 2009

The Actinobacteria, formerly known as the high G + C gram-positive bacteria, are another group of bacteria that are commonly found in lakes with a wide range of water chemistries. Actinobacteria may comprise a large fraction (up to 60%) of the bacterioplankton in some freshwater systems, and their abundance often peaks in late fall and winter. Actinobacteria appear to be more tolerant of conditions with low concentrations of organic carbon, and they may be replaced by Betaproteobacteria when algal blooms lead to increased carbon levels. Freshwater Actinobacteria fall into four distinct phylogenetic clusters: acI, acII, acIII, and acIV. Actinobacteria in the acI and acII clades are exclusive to freshwaters. Two acII subclusters, acII-B (formerly Luna-1) and acII-D (formerly Luna-2), are comprised of ultramicrobacterial isolates (cell volume <0.1 μm³). The acIII cluster (formerly Actinobacteria cluster 2) has only been detected in the chemocline of a single European lake and from a hypersaline soda lake in California. AcIII Actinobacteria are most closely related to soil organisms. Actinobacteria from the acIV cluster may be found in a variety of habitat types: lakes and rivers, as well as estuaries and marine waters and sediments.

Taxonomy, Physiology, and Natural Products of Actinobacteria

Actinobacteria are Gram-positive bacteria with high G+C DNA content that constitute one of the largest bacterial phyla, and they are ubiquitously distributed in both aquatic and terrestrial ecosystems. Many Actinobacteria have a mycelial lifestyle and undergo complex morphological differentiation. They also have an extensive secondary metabolism and produce about two-thirds of all naturally derived antibiotics in current clinical use, as well as many anticancer, anthelmintic, and antifungal compounds. Consequently, these bacteria are of major importance for biotechnology, medicine, and agriculture. Actinobacteria play diverse roles in their associations with various higher organisms, since their members have adopted different lifestyles, and the phylum includes pathogens (notably, species of Corynebacterium, Mycobacterium, Nocardia, Propionibacterium, and Tropheryma), soil inhabitants (e.g., Micromonospora and Streptomyces species), plant commensals (e.g., Frankia spp.), and gastrointestinal commensals (Bifidobacterium spp.). Actinobacteria also play an important role as symbionts and as pathogens in plant-associated microbial communities. This review presents an update on the biology of this important bacterial phylum.

Copyright © 2015, American Society for Microbiology. All Rights Reserved.

Article information

Genomic basis for natural product biosynthetic diversity in the actinomycetes

M. Nett, H. Ikeda and B. S. Moore, Nat. Prod. Rep., 2009, 26, 1362 DOI: 10.1039/B817069J


A novel firmicute protein family related to the actinobacterial resuscitation-promoting factors by non-orthologous domain displacement

Background: In Micrococcus luteus, growth and resuscitation from starvation-induced dormancy are controlled by the production of a secreted growth factor. This autocrine resuscitation-promoting factor (Rpf) is the founder member of a family of proteins found throughout and confined to the actinobacteria (high G + C Gram-positive bacteria). The aim of this work was to search for and characterise a cognate gene family in the firmicutes (low G + C Gram-positive bacteria) and obtain information about how they may control bacterial growth and resuscitation.

Results: In silico analysis of the accessory domains of the Rpf proteins permitted their classification into several subfamilies. The RpfB subfamily is related to a group of firmicute proteins of unknown function, represented by YabE of Bacillus subtilis. The actinobacterial RpfB and firmicute YabE proteins have very similar domain structures and genomic contexts, except that in YabE, the actinobacterial Rpf domain is replaced by another domain, which we have called Sps. Although totally unrelated in both sequence and secondary structure, the Rpf and Sps domains fulfil the same function. We propose that these proteins have undergone "non-orthologous domain displacement", a phenomenon akin to "non-orthologous gene displacement" that has been described previously. Proteins containing the Sps domain are widely distributed throughout the firmicutes and they too fall into a number of distinct subfamilies. Comparative analysis of the accessory domains in the Rpf and Sps proteins, together with their weak similarity to lytic transglycosylases, provide clear evidence that they are muralytic enzymes.

Conclusions: The results indicate that the firmicute Sps proteins and the actinobacterial Rpf proteins are cognate and that they control bacterial culturability via enzymatic modification of the bacterial cell envelope.

Tropheryma whipplei Twist: a human pathogenic Actinobacteria with a reduced genome

The human pathogen Tropheryma whipplei is the only known reduced genome species (<1 Mb) within the Actinobacteria [high G+C Gram-positive bacteria]. We present the sequence of the 927,303-bp circular genome of T. whipplei Twist strain, encoding 808 predicted protein-coding genes. Specific genome features include deficiencies in amino acid metabolisms, the lack of clear thioredoxin and thioredoxin reductase homologs, and a mutation in DNA gyrase predicting a resistance to quinolone antibiotics. Moreover, the alignment of the two available T. whipplei genome sequences (Twist vs. TW08/27) revealed a large chromosomal inversion, the extremities of which are located within two paralogous genes. These genes belong to a large cell-surface protein family defined by the presence of a common repeat highly conserved at the nucleotide level. The repeats appear to trigger frequent genome rearrangements in T. whipplei, potentially resulting in the expression of different subsets of cell surface proteins. This might represent a new mechanism for evading host defenses. The T. whipplei genome sequence was also compared to other reduced bacterial genomes to examine the generality of previously detected features. The analysis of the genome sequence of this previously largely unknown human pathogen is now guiding the development of molecular diagnostic tools and more convenient culture conditions.
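The two figures quoted in this abstract (genome length and predicted gene count) are enough for a quick back-of-the-envelope check of gene density. The sketch below is illustrative only; the function names are hypothetical, and the "mean spacing" it reports is genome length divided by gene count (so it includes intergenic DNA), not a measured average gene length.

```python
GENOME_LENGTH_BP = 927_303   # circular genome length, as quoted in the abstract
PREDICTED_GENES = 808        # predicted protein-coding genes, as quoted in the abstract


def genes_per_megabase(genes: int, length_bp: int) -> float:
    """Gene density expressed as predicted genes per million base pairs."""
    return genes / (length_bp / 1_000_000)


def mean_bp_per_gene(genes: int, length_bp: int) -> float:
    """Average stretch of genome per gene, including intergenic sequence."""
    return length_bp / genes


density = genes_per_megabase(PREDICTED_GENES, GENOME_LENGTH_BP)
spacing = mean_bp_per_gene(PREDICTED_GENES, GENOME_LENGTH_BP)

print(f"Reduced genome (<1 Mb): {GENOME_LENGTH_BP < 1_000_000}")
print(f"Predicted genes per Mb: {density:.1f}")
print(f"Mean bp of genome per gene: {spacing:.1f}")
```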


(A) Circular representation of the T. whipplei Twist genome (upper…

Comparative analysis of the number of genes present in each functional category as…

Predicted amino acid metabolisms of T. whipplei based on the M. tuberculosis metabolisms…

Thioredoxin and glutaredoxin systems. Proteins in pink rectangles are predicted to be present…

Typical examples of anomalous phylogenies exhibited by T. whipplei genes. Trees were constructed…

Extra domains in secondary transport carriers and channel proteins

Ravi D. Barabote, ... Milton H. Saier Jr., in Biochimica et Biophysica Acta (BBA) - Biomembranes, 2006

3.7 TrkA-C domains in proteins of the aspartate:alanine exchanger (AAE) family (TC #2.A.81)

No publication has previously described the characteristics of the AAE family and its constituent members. Consequently, in this section, we not only describe the uniquely situated duplicated TrkA-C domains, we also provide a brief description of this family.

A single functionally characterized protein, the aspartate:alanine exchanger (AspT) of the Gram-positive lactic acid bacterium, Tetragenococcus halophila D10 (Pediococcus halophilus), serves to characterize the AAE family [51]. This organism takes up L-aspartate via AspT, decarboxylates it to L-alanine and CO2 in the cytoplasm using L-aspartate β-decarboxylase (AspD), and exports the L-alanine in a 1:1 exchange reaction with L-aspartate. AspT is a hydrophobic protein of 543 aas and 10–12 putative TMSs. This protein has many bacterial homologues of unknown function and one very distant homologue in the archaeon, Halobacterium sp. strain NRC-1. This last protein (384 aas; 10–12 putative TMSs; AAC82885) is the only AAE family member from an archaeon in the NCBI database, and there are none from eukaryotes (see below).

Because one more negative charge is brought in (aspartate) than is exported (alanine), the exchange transport process may result in net charge movement, creating a membrane potential, negative inside. Further, decarboxylation of aspartate consumes a scalar proton and thus generates a pH gradient (basic inside). The resultant pmf can drive ATP synthesis via the F-type ATPase (TC #3.A.2). Other such exchangers generating a pmf are the prototypical oxalate/formate exchanger of the MFS (TC #2.A.1) as well as glutamate/γ-amino butyrate, malate/lactate, citrate/lactate and histidine/histamine exchangers (for references see Ref. [51]).
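The charge and proton bookkeeping described in this paragraph can be sketched in a few lines of Python. The net charges assigned here (aspartate −1, alanine 0 near neutral pH) are assumptions consistent with the statement that one more negative charge enters than leaves; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solute:
    name: str
    net_charge: int  # assumed net charge near neutral pH


ASPARTATE = Solute("L-aspartate", -1)  # assumption: one net negative charge
ALANINE = Solute("L-alanine", 0)       # assumption: electrically neutral zwitterion


def net_charge_imported(imported: Solute, exported: Solute) -> int:
    """Net charge carried into the cell by one 1:1 exchange cycle."""
    return imported.net_charge - exported.net_charge


def cycle_summary(cycles: int) -> dict:
    """Tally charge movement and proton consumption over a number of cycles."""
    return {
        # A negative value means negative charge accumulates inside the cell.
        "net_charge_in": cycles * net_charge_imported(ASPARTATE, ALANINE),
        # One scalar proton is consumed per decarboxylation, as stated in the text.
        "cytoplasmic_protons_consumed": cycles,
    }


print(cycle_summary(1))
print(cycle_summary(1000))
```

Both terms act in the same direction: the interior becomes more negative and more basic, which is what lets the exchange contribute to a proton motive force.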

Each of the two hydrophobic halves of AAE family proteins contains six peaks of hydropathy that may correspond to 5 or 6 TMSs. The two halves exhibit almost unprecedented degrees of sequence similarity, suggesting that they arose by an intragenic duplication event relatively recently. If the six hydrophobic peaks correspond to 5 TMSs, there may be a single re-entrant loop between TMSs 4 and 5 [26,27]. Although this characteristic resembles that of one well-studied putative member of the IT superfamily, the CitS citrate transporter of Klebsiella pneumoniae in the CCS family (TC #2.A.24) [22,26,52], we could not establish a statistically significant relationship between these two groups of proteins. Interestingly, in all but one of the bacterial homologues, but not in the archaeal or the Fusobacterium nucleatum homologue, the two hydrophobic repeat elements are separated by two tandem hydrophilic TrkA-C domains [23,24] that must have arisen by an independent duplication or insertional event.

Table AAE-S1 on our website (msaier/supmat/AAE) presents the members of the AAE family in alphabetical order according to species name. All but one of the members of this family derive from bacteria, but several bacterial kingdoms are represented. Most are derived from proteobacteria of the α, β, γ and δ-subclasses. Others are found in low and high G + C Gram-positive bacteria (Firmicutes and Actinobacteria, respectively) as well as Bacteroides species, Chloroflexus aurantiacus, Fusobacterium nucleatum and Rhodopirellula baltica. Several species have multiple paralogues. Within the Bacteroides group, Porphyromonas gingivalis has two, Bacteroides fragilis has 3, and B. thetaiotaomicron has 4, the most of any species examined.

Among the proteobacteria, organisms listed in Table AAE-S1 can encode within their genomes from one to three paralogues. The α- and β-proteobacteria represented can have 1–3 paralogues while the γ- and δ-proteobacteria have just 1 or 2. The two Firmicutes represented have just one member while actinobacteria can have either one or two. All three Corynebacterial species have two. Finally, all other bacteria represented have only one. These proteins are almost all large, between 516 and 580 residues. This large size reflects the presence of the two tandem TrkA-C domains between the two putative 5 or 6 TMS repeat units (mentioned above). The two TrkA-C domains are each about 70 residues long. The central hydrophilic region is usually about 220 residues long while the two flanking hydrophobic domains are about 170 residues long. These two hydrophobic and two hydrophilic domains thus account for much of the full sizes of most of these proteins.

A few proteins listed in Table AAE-S1 are either shorter or longer than the large majority of homologues. Two of these, Msu1 and Msu2 from Mannheimia succiniciproducens, are half-sized (280 and 297 residues). They correspond to the N- and C-terminal halves of a typical AAE family member, each possessing a single AAE and a single TrkA–C domain. Fused, they comprise a full-length AAE family homologue. Either a sequencing error or a mutation accounts for this splitting event.
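The approximate domain lengths quoted above can be checked against the reported 516–580 residue range with simple arithmetic. The sketch below uses only numbers stated in the text; the constant and function names are hypothetical, and all lengths are the rounded figures given by the authors rather than exact domain boundaries.

```python
# Approximate lengths (residues) as quoted in the text.
TRKA_C_LEN = 70                 # each of the two tandem TrkA-C domains
CENTRAL_HYDROPHILIC_LEN = 220   # central region that contains both TrkA-C domains
HYDROPHOBIC_DOMAIN_LEN = 170    # each flanking hydrophobic repeat unit
TYPICAL_RANGE = (516, 580)      # size range reported for most family members

MSU1_LEN, MSU2_LEN = 280, 297   # the two half-sized M. succiniciproducens proteins


def approximate_full_length() -> int:
    """Central hydrophilic region plus the two flanking hydrophobic domains."""
    return CENTRAL_HYDROPHILIC_LEN + 2 * HYDROPHOBIC_DOMAIN_LEN


def in_typical_range(length: int) -> bool:
    low, high = TYPICAL_RANGE
    return low <= length <= high


budget = approximate_full_length()
fused = MSU1_LEN + MSU2_LEN

print(f"Domain budget: {budget} residues, typical: {in_typical_range(budget)}")
print(f"Two TrkA-C domains occupy {2 * TRKA_C_LEN} of ~{CENTRAL_HYDROPHILIC_LEN} central residues")
print(f"Msu1 + Msu2 fused: {fused} residues, typical: {in_typical_range(fused)}")
```

The fused Msu1 and Msu2 length falling inside the typical range is consistent with the suggestion that the two proteins are halves of one full-length homologue.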

Two proteins, Dps2 of Desulfotalea psychrophila and Vfi2 from Vibrio fischeri, are large (669 and 624 residues) because they possess N-terminal extensions of 95 and 50 residues, respectively, both of which contain a single hydrophobic putative TMS. The former, but not the latter, exhibits sequence similarity with the N-terminal region of the HlyA protein of E. coli (gi #68520714). Thus, when residues 5–83 of Dps2 were compared with residues 15–94 of HlyD, with the GAP program [53], 36% identity, 50% similarity and a comparison score of 14.3 S.D. were obtained. This is sufficient to establish homology. It is possible that these sequences are important for targeting of the proteins to the secretory (Sec) apparatus for membrane insertion [54].

The sequences of members of the AAE family were multiply aligned with the CLUSTAL X program [17], and this alignment (see Fig. AAE-S1 on our website) was used to generate average hydropathy/similarity plots (Fig. 5A; AveHas program, Ref. [19]) as well as a clustal-generated tree (Fig. 5B; TreeView program, Ref. [55]).

Fig. 5. (A) Average hydropathy and similarity plots for the proteins of the AAE family. The proteins (Table AAE-S1) included in the CLUSTAL X alignment (Fig. AAE-S1), upon which this plot was based, can be viewed on our website (msaier/supmat/Domains). The putative 5 TMS repeat unit (putative TMSs 1–5 or 6–10) plus a proposed semipolar re-entrant loop (R1 or R2) [26,27] are indicated as are the duplicated TrkA-C domains [23,24]. (B) Clustal-generated tree for the AAE family. The TreeView program was used to draw the tree, based on the CLUSTAL X alignment shown in Figure AAE-S1 on our website. The proteins comprising the eleven clusters and their abbreviations are presented in Table AAE-S1.

The plot in Fig. 5A reveals one hydrophobic peak (peak 0) at alignment positions 20–60, very poorly conserved, found in only two proteins as noted above. Five peaks of average hydropathy (peaks 1–5) follow peak 0, and these are all well conserved (alignment positions 100–300). A potential semipolar re-entrant loop occurs between putative TMSs 4 and 5 [26,52]. Then follows the central hydrophilic region where the two tandem TrkA–C domains are found. Finally, 5 more peaks of hydrophobicity, including a potential re-entrant loop between putative TMSs 9 and 10, follow. Similarity between peaks 1–5 and 6–10 can be seen; for example, peaks 1, 4, 6 and 9 are the most hydrophobic while the probable re-entrant loops, peaks R1 and R2, are the least hydrophobic, exhibiting semipolar characteristics.
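Hydropathy peaks of the kind described here are typically found by sliding a window along the sequence and averaging a per-residue hydropathy scale. The sketch below is not the AveHas program (which averages across an alignment); it is a minimal single-sequence illustration using the Kyte-Doolittle scale. The 19-residue window, the function name `hydropathy_profile`, and the toy sequence are assumptions made for the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window; entry i covers seq[i:i + window]."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on non-standard residues
    running = sum(values[:window])
    profile = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]  # slide the window by one residue
        profile.append(running / window)
    return profile


# Hypothetical toy sequence: a 19-residue hydrophobic stretch between polar flanks.
flank = "DKERNQ" * 4
core = "LIVAFLLIVAGLLIVAFLV"
toy = flank + core + flank

profile = hydropathy_profile(toy)
peak_start = profile.index(max(profile))
print(f"Most hydrophobic window starts at residue index {peak_start}")
```

In this toy case the peak window coincides with the hydrophobic core, which is the signal interpreted as a putative TMS in real plots.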

These observations prompted us to examine segments encompassing TMSs 1–5 and TMSs 6–10 to determine if they are homologous. The IC [18] and GAP [53] programs were used for this purpose. The results of one comparison are shown in Figure AAE-S2 on our website. The startling comparison score value of 23 S.D. is among the greatest observed for the two repeat units in any family within the TCDB (see Refs. [22,56]). The high values obtained suggest that the intragenic duplication event that gave rise to these putative 10 TMS proteins must have occurred relatively recently in evolutionary history. They also suggest that the two halves of these proteins serve similar functions and have not yet acquired a high degree of functional selectivity.
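A comparison score expressed in standard deviations (S.D.) can be read as a z-score: the alignment score of the real pair is compared with the distribution of scores obtained after shuffling one sequence. The sketch below illustrates that general idea only. It does not reimplement the IC or GAP programs; the scoring parameters (match 2, mismatch −1, linear gap −2), the shuffle count, the function names, and the two toy sequences are all assumptions.

```python
import random
import statistics


def global_alignment_score(a: str, b: str, match: int = 2,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def comparison_score_sd(a: str, b: str, shuffles: int = 200, seed: int = 0) -> float:
    """Z-score of the real alignment score against shuffled versions of b."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    real = global_alignment_score(a, b)
    residues = list(b)
    shuffled_scores = []
    for _ in range(shuffles):
        rng.shuffle(residues)  # preserves composition, destroys residue order
        shuffled_scores.append(global_alignment_score(a, "".join(residues)))
    sd = statistics.stdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores have zero variance")
    return (real - statistics.mean(shuffled_scores)) / sd


# Hypothetical toy sequences differing at three positions.
first_half = "MKTLLVAGIFSLWDERNQPHYCMKTAGILV"
second_half = "MKTLIVAGIFSLWDDRNQPHYCMKSAGILV"
print(f"Comparison score: {comparison_score_sd(first_half, second_half):.1f} S.D.")
```

Shuffling keeps amino acid composition constant, so a high score reflects shared residue order rather than shared composition, which is why such values are used as evidence of homology.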

The clustal-generated tree for the AAE family proteins is shown in Fig. 5B. It can be seen that the various clusters exhibit a limited diversity of organismal types from which these proteins were derived. Thus, clusters 1 and 7 proteins are all from γ-proteobacteria, and all proteins in each of these two clusters are from different organisms. The only exception is the Msu1 and Msu2 proteins which may represent the two halves of a single protein, split because of a sequencing error or an authentic mutation. We suggest that each of these clusters represents a set of orthologues although the two clusters are clearly non-orthologous to each other. In a like fashion, cluster 4 proteins are derived from the only two Firmicutes represented, and cluster 6 proteins are all derived from members of the Bacteroides. Of these proteins, only Porphyromonas gingivalis has two paralogues. Clusters 10 and 11 each include a single protein, from the only member of the Planctomycetacia and from Fusobacterium nucleatum, respectively. Other clusters consist of homologues from broader groups of bacteria, and each of these clusters is likely to comprise a different set of functionally distinct proteins. Thus, cluster 2 includes proteins from β-, γ- and δ-proteobacteria, cluster 3 is from α- and β-proteobacteria, cluster 5 is from α- and β-proteobacteria as well as Bacteroides, cluster 8 is from Bacteroides as well as a δ-proteobacterium, and cluster 9 is from β- and δ-proteobacteria as well as Chloroflexi and Actinobacteria.