What is the transmembrane 'Positive-Inside Rule' nowadays? Has the definition changed over time?


First definition.

Two publications by von Heijne, in 1989 and 1992, coined the 'Positive-Inside rule' and showed its practical value in topology prediction of transmembrane helices. They clearly defined and evidenced that in bacteria positively charged residues are more commonly found on the "inside" of the membrane (the cytoplasm rather than the periplasm).

Recent Literature.

However, it seems that as more data became available the field has moved away from the idea of "inside the compartment" towards "inside the cytoplasm". In a 2006 review, even von Heijne describes some transmembrane helices as abiding by the positive-inside rule because their positively charged residues were in the cytoplasm, although he never explicitly backtracks on the original definition. A similar review by von Heijne in 2007 offers a more definitive refinement of the rule, but in doing so stops the concept from applying to subcellular membranes:

… the loops connecting the helices differ in amino acid composition, depending on whether they face the inside or outside of the cell (the “positive-inside” rule).

We are now faced with an awkward rule that doesn't account for proteins elsewhere in the secretory pathway or in the other organelles. All those proteins are inside the cell.

More recently still, large-scale analyses of transmembrane helices from different biological membranes (Sharpe et al., 2010; Baeza-Delgado et al., 2013) show positive charge clustering on the cytosolic side, rather than discussing it in terms of the inside of the compartment. Both of these papers still say that their results corroborate the positive-inside rule.

Question.

To me it seems that the definition is somewhat sloppy and can broadly be used to say "inside the cytoplasm", despite publications being reluctant to define the rule clearly. Has the definition ever explicitly been changed, or has the field silently changed what "inside" means?… Or have I completely misinterpreted something? How do the organelles fit into any of these definitions?


I think you have misunderstood the "inside" part of the "positive-inside rule", perhaps because "inside" is indeed an imprecise term (but it is now history and cannot be changed ;) ). To understand it a bit better, it helps to think about the topology of the membrane. During synthesis, most membrane proteins (ignoring peroxisomal and mitochondrial proteins, which are a whole other topic) are inserted into the ER membrane by the Sec translocon. The parts of the membrane protein that are exposed to the lumen are matured by the ER and Golgi apparatus to be presented on the extracellular surface. Secretory proteins (those that will ultimately be exported from the cell) are made in a similar way, except that they pass entirely through the translocon and end up in the ER lumen. Therefore, the lumen of the ER (and consequently the lumen of the other endomembrane compartments) is topologically equivalent to the extracellular space, the "outside", and not to the "inside" of the cell.

Perhaps it is easiest to see in a picture (borrowed from here: http://www.bioon.com/book/biology/mboc/chapter12/figure12-4.gif). Coloured in pink are compartments whose topology is consistent with the "outside" and coloured grey are compartments whose topology is consistent with the "inside". So, for membrane proteins found in the endosome, for example, amino acid residues in the lumen are actually "outside", not "inside".

It is true that this may not be immediately clear to those not steeped deeply in the field. As the initial studies used bacteria, which do not have membrane-enclosed subcellular compartments, "inside" was simple and made sense. Perhaps "positive-in-the-cytoplasm rule" would have been more accurate. Hope this helps!
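
To make the "positive charges end up in the cytoplasm" idea concrete, here is a minimal Python sketch that counts Lys/Arg residues on either side of a single transmembrane helix and guesses which flank is cytoplasmic. The function name, the 15-residue flank width and the toy sequence are all made up for illustration; real topology predictors use far more than this simple count.

```python
def positive_inside_orientation(sequence, tm_start, tm_end, flank=15):
    """Guess which flank of one TM helix faces the cytoplasm.

    tm_start/tm_end are 0-based, end-exclusive indices of the helix.
    flank is an illustrative window size (an assumption, not a standard).
    """
    positive = set("KR")  # Lys and Arg carry the positive charge
    n_flank = sequence[max(0, tm_start - flank):tm_start]
    c_flank = sequence[tm_end:tm_end + flank]
    n_count = sum(res in positive for res in n_flank)
    c_count = sum(res in positive for res in c_flank)
    if n_count > c_count:
        return "N-terminal flank cytoplasmic", n_count, c_count
    if c_count > n_count:
        return "C-terminal flank cytoplasmic", n_count, c_count
    return "undetermined", n_count, c_count


# Toy sequence: a charged N-terminal flank, a 20-residue hydrophobic
# stretch and an uncharged/acidic C-terminal flank (hypothetical data).
toy = "MKRKSRG" + "LLIVALLAVGLLIALVAGLL" + "SDESTNQ"
print(positive_inside_orientation(toy, 7, 27))
```

Note that "cytoplasmic" here is the same side whether the helix sits in the plasma membrane, the ER or an endosome, which is exactly the point made above.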


Membrane-protein topology

The topology of an integral membrane protein describes the number and approximate locations in the sequence of the transmembrane segments, as well as the overall orientation of the protein in a membrane.

Topology is controlled primarily by the hydrophobicity and length of transmembrane helices as well as the distribution of positively charged residues in the loops that connect the helices.

In most cases, topology is determined co-translationally during the translocon-mediated insertion of a polypeptide into a membrane.

Topologies in which both the N terminus and the C terminus of a protein are in the cytoplasm are predominant in both prokaryotic and eukaryotic cells.

Membrane proteins evolve primarily by gene duplication and gene fusion. Many membrane proteins form dimers in which the two homologous chains have the same topology (parallel dimer) or opposite topologies (antiparallel dimer). Gene fusions create internally duplicated structures in which the two halves of a protein are orientated either in a parallel or an antiparallel manner.
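
As an illustration of how hydrophobicity along the sequence can flag candidate transmembrane segments, the sketch below averages the Kyte-Doolittle hydropathy scale over a sliding window. The 19-residue window and the 1.6 cutoff are a commonly used convention, taken here as assumptions rather than as the method of any work summarized above; the function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    return [
        sum(KD[res] for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Start indices of windows whose mean hydropathy reaches the cutoff."""
    profile = hydropathy_profile(seq, window)
    return [i for i, score in enumerate(profile) if score >= threshold]


# Hypothetical sequence: charged flank, hydrophobic core, polar flank.
toy_seq = "MKRKSRG" + "LLIVALLAVGLLIALVAGLL" + "SDESTNQ"
print(candidate_tm_windows(toy_seq))
```

A hydropathy scan only locates candidate helices; the orientation of each helix would still have to come from the charge distribution in the connecting loops.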


Recent advances in the understanding of membrane protein assembly and structure

For a variety of reasons – not least their biomedical importance – integral membrane proteins are now very much in focus in many areas of molecular biology, biochemistry, biophysics, and cell biology. Our understanding of the basic processes of membrane protein assembly, folding, and structure has grown significantly in recent times as a result of new methodological developments, more high-resolution structure data, and the possibility to analyze membrane proteins on a genome-wide scale.

So what is new in the membrane protein field? Various aspects of membrane protein assembly and structure have been reviewed over the past few years (Cowan & Rosenbusch, 1994; Hegde & Lingappa, 1997; Lanyi, 1997; von Heijne, 1997; Bernstein, 1998); here, I will try to bring together a number of exciting recent developments. Particularly noteworthy are the discoveries related to the mechanisms of membrane protein assembly into the inner membrane of E. coli, the inner membrane of mitochondria, and the way transmembrane segments are handled by the ER translocon.

Other advances include detailed studies of the interaction between transmembrane helices and the lipid bilayer, and of helix–helix packing interactions in the membrane environment. The availability of full genomic sequences has made it possible to study membrane proteins on a genome-wide scale. Finally, a handful of new high-resolution 3D structures have appeared.


MATERIALS AND METHODS

DNA Constructs and Chemicals

The extracellular triple hemagglutinin (3HA) tag, consisting of the amino acid residues SLEYPYDVPDYASYPYDVPDYAYPYDVPD, was inserted into the fourth extracellular loop after residue 897 in the G551D CFTR, as described for the wild-type (wt) and ΔF508 CFTR (Sharma et al., 2004; Pedemonte et al., 2005). The G551D CFTR cDNA was kindly provided by Dr. J. Rommens (Hospital for Sick Children, Toronto). All chemicals of the highest available purity were obtained from Sigma-Aldrich (Oakville, ON, Canada) except when indicated. Bafilomycin A1 (Baf) was from LC Laboratories (Woburn, MA).

Cells

HeLa, Madin-Darby canine kidney (MDCK) and RAW264.7 cells were grown in Dulbecco's modified Eagle's medium (DMEM) containing 10% fetal bovine serum (FBS) in a thermostated cell culture incubator in 5% CO2 at 37°C. Baby hamster kidney (BHK) cells were grown in DMEM/F12 containing 5% FBS. IB3 cells (kindly provided by Dr. P. Zeitlin, Johns Hopkins University) were grown in LHC-8 basal medium (Invitrogen, Carlsbad, CA) containing 5% FBS. The IB3 cells have a ΔF508/W1282X genotype (Zeitlin et al., 1991). The human bronchial epithelial cell line derived from a CF patient with a ΔF508/ΔF508 genotype (CFBE41o-, designated as CFBE) has been characterized (Cozens et al., 1994) and was cultured in MEM with GlutaMAX (Invitrogen, Carlsbad, CA) supplemented with 10% FBS at 37°C in a 5% CO2-humidified incubator as described (Gruenert et al., 1995). To allow differentiation, MDCK and CFBE epithelial cells were cultured at confluence for 3–5 d. Primary macrophages were obtained by intraperitoneal and bronchoalveolar lavages as previously described (Guilbault et al., 2008). Macrophages were isolated by centrifugation, resuspended in RPMI 1640 with 10% FBS, and seeded on polylysine (Sigma)-coated glass coverslips. Coverslips were coated according to the manufacturer's instructions. Experiments were performed 24–48 h after plating the cells.

All cell types used were either transiently or stably transfected with wt or G551D CFTR-3HA harboring triple hemagglutinin tags in the fourth extracellular loop (Sharma et al., 2004). BHK cells expressing the wt and G551D CFTR-3HA were generated as previously described, and clones were selected in the presence of 500 μM methotrexate (Sharma et al., 2001).

CFTR variants were stably expressed in the CFBE and HeLa cells using lentiviral vectors by Dr. J. Wakefield (Tranzyme, Birmingham, AL) and selected in the presence of 5 μg/ml puromycin as described (Bebok et al., 2005). IB3 bronchial epithelia, expressing endogenous ΔF508 and W1282X CFTR at undetectable levels, were stably transfected with the pCEP4 expression plasmid encoding the wt or G551D CFTR-3HA. A mixture of clones was selected in hygromycin B. Transfected cells were maintained in LHC-8 containing 5% FBS and 100 μg · ml⁻¹ hygromycin B. RAW cells were transiently transfected with CFTR-encoding pNut-CFTR using FuGENE6 as the DNA complexing agent (Roche, Basel, Switzerland) according to the manufacturer's recommendation and analyzed after 48 h. MDCK cells were infected with retrovirus encoding the CFTR-3HA variants as described previously (Benharouga et al., 2003).

Animals

Inbred C57BL/6-Cftr +/− heterozygous mice were maintained and bred in the Animal Facility of the McGill University Health Center Research Institute. All pups were genotyped between 12 and 14 d of age. The animals were kept in cages with sterile corn bedding (Anderson, Bestmonro, LA) and maintained in ventilated racks (Lab Products, Seaford, DE). Age-matched C57BL/6-Cftr +/+ mice and C57BL/6-Cftr −/− mice were maintained in murine pathogen–, Helicobacter-, and parasite-free conditions. They were housed (1–4 animals/cage), bred, and maintained in a facility under specific pathogen-free conditions. Mice were fed with either the NIH-31–modified, irradiated mouse diet for wt mice (Harlan Teklad, Indianapolis, IN) or a liquid diet starting at 14 d of age for knockout mice (Peptamen liquid diet Nestlé Canada, Brampton, ON, Canada). The liquid diet was freshly prepared each morning and provided in 50-ml centrifuge tubes (Fisher Scientific, Nepean, ON, Canada). Mice used for experiments were between 19 and 22 wk of age. Experimental procedures with mice were conducted in accordance with the Canadian Council on Animal Care guidelines and with the approval of the Facility Animal Care Committee of the Montreal General Hospital Research Institute (Montreal, PQ, Canada).

Labeling of Endocytic Organelles with pH-sensitive Dyes

The luminal pH of early endosomes, recycling endosomes, lysosomes, and phagosomes was routinely determined after the selective labeling of the respective compartment with fluorescein isothiocyanate (FITC)-conjugated cargo (e.g., CFTR, transferrin, dextran, and P. aeruginosa PAO1) by FRIA as described for CFTR and other cargo molecules (Sharma et al., 2004; Barriere et al., 2007; Kumar et al., 2007; Barriere and Lukacs, 2008; Duarri et al., 2008; Varghese et al., 2008; Glozman et al., 2009).

To label internalized CFTR, cells were incubated with anti-HA primary Ab (1:500 dilution, equivalent to 10 μg/ml; MMS101R, Covance Laboratories, Madison, WI) and FITC-conjugated goat anti-mouse secondary Fab (1:500 dilution; Jackson ImmunoResearch Laboratories, West Grove, PA); the primary Ab was incubated for 1 h at 37°C. Cells were then washed (140 mM NaCl, 5 mM KCl, 20 mM HEPES, 10 mM glucose, 0.1 mM CaCl2, and 1 mM MgCl2, pH 7.3) and chased for the indicated time at 37°C. Fluid-phase Ab uptake was not detectable in mock-transfected cells (data not shown). When indicated, cell surface–resident CFTR was labeled on ice by successive incubation with the primary anti-HA Ab and secondary FITC-Fab.

To confirm that the primary and secondary Abs remain bound to CFTR during FRIA experiments, the pH resistance of Ab binding was measured by immunoperoxidase assay. After anti-HA and HRP-conjugated secondary Ab binding to CFTR-expressing HeLa cells, the extracellular medium pH was adjusted to pH 7.2, 5.0, or 2.5 for 5 min (van Kerkhof et al., 2001; 0.15 M NaCl, 50 mM glycine, 0.1% BSA, with 20 mM MES for pH 2.5 and 5.0 or 10 mM HEPES for pH 7.2). The amount of bound HRP-conjugated Ab was measured with Amplex Red as substrate, using a POLARstar OPTIMA fluorescence plate reader (BMG Labtech, Offenburg, Germany; Barriere et al., 2006). The Ab binding was virtually unaltered at pH 5.0, but reduced by 50% at pH 2.5 (Supplemental Figure S1A).

Recycling endosomes were labeled with FITC-transferrin (Tf; 15 μg/ml, 45-min loading after 45-min serum depletion at 37°C) and chased for 0–3 min. Lysosomal pH was measured with similar results on cells labeled by overnight fluid-phase uptake of FITC-dextran alone or in combination with Oregon Green 488-dextran (50 μg/ml, MW 10 kDa, Molecular Probes, Eugene, OR) and chased for >3 h.

Phagosomal pH was monitored following the uptake of FITC- or FITC/TRITC-conjugated P. aeruginosa (PAO1 strain, kindly provided by Dr. M. Parsek, University of Washington, Seattle). From an overnight culture, ∼5 × 10⁸ bacteria were opsonized in 20% FBS-PBS for 30 min at 37°C with agitation. Cells were washed with PBS and labeled in the presence of 1 mg/ml FITC, TRITC, or a combination of both in PBS at pH 8.0 for 30 min at room temperature under rotation. Excess fluorescent dye was removed by repeated centrifugation in ice-cold PBS. Bacteria were snap-frozen and stored at −80°C.

Organellar pH Measurement

FRIA of endocytic organelles was performed at room temperature on an Axiovert 100 inverted fluorescence microscope (Carl Zeiss MicroImaging, Toronto, ON, Canada) equipped with a Hamamatsu ORCA-ER 1394 (Hamamatsu, Japan) cooled CCD camera and a Planachromat (63× NA 1.4) objective, essentially as described previously (Sharma et al., 2004; Barriere et al., 2007; Barriere and Lukacs, 2008; Glozman et al., 2009). Image acquisition and FRIA were performed with MetaFluor software (Molecular Devices, Downingtown, PA). Images were acquired at 490 ± 5 and 440 ± 10-nm excitation wavelengths, using a 535 ± 25-nm emission filter. To determine the initial rates of acidification of endosomes (see Figure 4E), CFTR was labeled with anti-HA and FITC-conjugated goat anti-mouse Fab sequentially for 1 h on ice and chased at 37°C for the indicated times. The Abs remaining at the cell surface were acid-stripped before image acquisition (van Kerkhof et al., 2001).

In situ calibration curves, describing the relationship between the fluorescence ratio values and endosome, lysosome, or phagosome pH, served to calculate the luminal pH of individual vesicles after fluorescence background subtraction at both excitation wavelengths (e.g., Supplemental Figure S1, B–D). In situ calibration was performed by clamping the vesicular pH between 4.5 and 7.4 in K⁺-rich medium (135 mM KCl, 10 mM NaCl, 20 mM HEPES or 20 mM MES, 1 mM MgCl2, and 0.1 mM CaCl2) with 10 μM nigericin, 10 μM monensin, 0.4 μM Baf, and 20 μM carbonyl cyanide 3-chlorophenylhydrazone (CCCP; Sigma-Aldrich) and recording the fluorescence ratios. Calibration curves were obtained for each cargo molecule and repeated at regular intervals. As an internal control, one-point calibration was performed on each coverslip by clamping the organellar pH to 6.5 with 10 μM monensin and nigericin, 0.4 μM Baf, and 20 μM CCCP. In each experiment the pH of 200–800 endosomes/lysosomes and 100–400 phagosomes was determined. Mono- or multipeak Gaussian distributions of vesicular pH values were obtained with Origin 7.0 software (OriginLab, Northampton, MA), and the results of individual experiments were illustrated. The mean pH of each vesicle population was calculated as the arithmetic mean of the data in each individual experiment using 200–800 vesicles from 15 to 60 cells. At least three independent experiments were performed for each condition.
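
The conversion from a background-subtracted fluorescence ratio to a vesicular pH can be sketched as follows. This is only an illustration under stated assumptions: the calibration points are invented, piecewise-linear interpolation stands in for whatever curve fit was actually used, and the function names are hypothetical.

```python
from statistics import mean


def background_corrected_ratio(f490, f440, bg490, bg440):
    """490/440-nm excitation ratio after subtracting background at both wavelengths."""
    return (f490 - bg490) / (f440 - bg440)


def ratio_to_ph(ratio, calibration):
    """Interpolate pH linearly between (ratio, pH) calibration points."""
    points = sorted(calibration)
    if not points[0][0] <= ratio <= points[-1][0]:
        raise ValueError("ratio outside the calibrated range")
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if r0 <= ratio <= r1:
            return p0 + (p1 - p0) * (ratio - r0) / (r1 - r0)


# Hypothetical in situ calibration spanning pH 4.5 to 7.4.
calibration = [(0.5, 4.5), (1.0, 5.5), (1.5, 6.5), (2.0, 7.4)]

# Three hypothetical vesicles; the first is built from raw intensities.
ratios = [background_corrected_ratio(1300, 1100, 100, 100), 1.25, 1.0]
vesicle_ph = [ratio_to_ph(r, calibration) for r in ratios]
mean_ph = mean(vesicle_ph)  # arithmetic mean of the vesicle population
print(vesicle_ph, mean_ph)
```

Ratios outside the calibrated range raise an error rather than being extrapolated, since the clamped pH values only cover 4.5 to 7.4.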

Determination of the Relative Counterion Permeability and Buffer Capacity of Organelles

Rapid dissipation of the organellar pH gradient by the proton pump inhibitor (Baf) and the protonophore (CCCP) was used to determine the relative counterion permeability of the organelles. Cells were labeled as described above and the organellar pH was continuously monitored by FRIA for the indicated time. pH dissipation was measured in the presence of 0.4 μM Baf and 20 μM CCCP. To measure the passive proton leak, only Baf was added. Both Baf and CCCP were used at saturating concentrations (Lukacs et al., 1990, 1991; Hackam et al., 1997; Steinberg et al., 2007b). When indicated, CFTR was activated with the PKA agonist cocktail (20 μM forskolin, 0.5 mM 8-(4-chlorophenyl-thio)adenosine 3′,5′-cyclic monophosphate sodium salt [CPT-cAMP], and 0.2 mM isobutylmethylxanthine [IBMX]) for 3 min before Baf+CCCP addition. FRIA was performed as described above. The pH dissipation rate was calculated from the initial slope of the fluorescence ratio change. The buffer capacity of endocytic organelles was calculated from the extent of rapid alkalinization after the addition of 0.5–2 mM NH4Cl in the presence or absence of Baf. The calculation was based on the formula from Roos and Boron (Roos and Boron, 1981; Sonawane and Verkman, 2003).
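
The two quantities described above can be sketched numerically as follows. This is an illustration, not the authors' code: the initial slope is taken as a least-squares fit over the first few time points (the number of points is an assumption), and the buffer capacity uses the usual NH4Cl-pulse form β = Δ[NH4⁺]/ΔpH, assuming that NH3 equilibrates across the membrane and that the pKa of NH4⁺ is about 9.25 (a 25°C value, labeled here as an assumption). All function names and example numbers are hypothetical.

```python
def initial_slope(times, values, n_points=5):
    """Least-squares slope over the first n_points samples."""
    t = times[:n_points]
    y = values[:n_points]
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den


def buffer_capacity(nh4cl_total_mM, ph_outside, ph_before, ph_after, pka=9.25):
    """Buffer capacity in mM per pH unit from an NH4Cl pulse.

    Assumes [NH3] is equal on both sides of the membrane and that the
    organelle contained no NH4+ before the pulse.
    """
    nh3 = nh4cl_total_mM / (1 + 10 ** (pka - ph_outside))
    nh4_inside = nh3 * 10 ** (pka - ph_after)
    return nh4_inside / (ph_after - ph_before)


# Hypothetical ratio trace sampled every second after Baf + CCCP addition.
print(initial_slope([0, 1, 2, 3, 4], [1.0, 1.5, 2.0, 2.5, 3.0]))

# Hypothetical endosome: pH 6.0 -> 6.3 after 1 mM NH4Cl, medium at pH 7.3.
print(buffer_capacity(1.0, 7.3, 6.0, 6.3))
```

The slope is expressed in ratio units per second; converting it to a pH change per second would require the calibration curve for the same cargo.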

The Passive Proton Permeability Determination

The passive proton permeability of recycling endosomes, lysosomes, and phagosomes was calculated according to the following equation:

$$P_{H^+} = \frac{(dpH_o/dt)\,\beta_v\,V}{S\,\left([H^+]_o - [H^+]_c\right)}$$

where $P_{H^+}$ is the organellar passive proton permeability in cm · s⁻¹, $dpH_o/dt$ is the organellar proton flux in pH · s⁻¹, $V$ is the organellar volume in cm³, $S$ is the organellar surface in cm², $\beta_v$ is the organellar buffer capacity in M · pH⁻¹, and $([H^+]_o - [H^+]_c)$ is the transmembrane proton gradient between the organelle lumen and the cytosol (Chandy et al., 2001; Grabe and Oster, 2001). The passive proton flux was measured by monitoring the organellar alkalinization rate immediately after 400 nM Baf addition as shown in Figure 2D. The buffer capacity was measured by monitoring the rapid alkalinization of organelles after NH4Cl addition in the presence of Baf as detailed in Materials and Methods and in Chandy et al. (2001). The surface and volume of endosomes and lysosomes were obtained from published data (Griffiths et al., 1989). The phagosome volume and surface calculation was based on the assumption that the average phagosome diameter is 3 μm and the distance between the bacterial wall and the inner leaflet of the phagosomal membrane is 100 nm, based on EM observations (Hart et al., 1987; Chastellier, 2008). In light of the high counterion conductance of the endocytic organelles and the modest membrane potential of phagolysosomes (<25 mV including the Donnan potential effect; Steinberg et al., 2007b), we assumed that the membrane potential makes only a modest contribution to the electrochemical proton driving force in endocytic organelles. Therefore, the passive proton permeability was calculated by taking into consideration only the chemical proton driving force, and thus it likely represents an overestimate. The cytoplasmic pH was assumed to be constant (pH 7.3) during the proton efflux measurements (Mukherjee et al., 1997).
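
A numerical sketch of this calculation is given below. It assumes the permeability takes the form P = (dpH/dt · βv · V) / (S · ([H⁺]o − [H⁺]c)), which is the arrangement that yields cm · s⁻¹ from the units listed above; the spherical geometry, the function names and all input values are hypothetical and are not the published organelle dimensions.

```python
import math


def sphere_volume_and_surface(radius_cm):
    """Volume (cm^3) and surface (cm^2) of an idealized spherical organelle."""
    volume = 4.0 / 3.0 * math.pi * radius_cm ** 3
    surface = 4.0 * math.pi * radius_cm ** 2
    return volume, surface


def passive_proton_permeability(dph_dt, beta_v, volume_cm3, surface_cm2,
                                ph_lumen, ph_cytosol=7.3):
    """Permeability in cm/s from the chemical proton gradient only.

    dph_dt : alkalinization rate in pH units per second
    beta_v : buffer capacity in M per pH unit
    """
    gradient = 10 ** (-ph_lumen) - 10 ** (-ph_cytosol)  # [H+]o - [H+]c in M
    return dph_dt * beta_v * volume_cm3 / (surface_cm2 * gradient)


# Hypothetical organelle: 1-um radius, lumen at pH 5.0, 30 mM/pH buffering,
# alkalinizing at 0.001 pH/s after the proton pump is blocked.
volume, surface = sphere_volume_and_surface(1e-4)
permeability = passive_proton_permeability(0.001, 0.03, volume, surface, 5.0)
print(permeability)
```

For a sphere the ratio V/S reduces to r/3, so the result scales linearly with the assumed radius.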

Immunofluorescence Microscopy

CFTR subcellular distribution was determined after internalization of anti-HA Ab for 1 h in DMEM and chased for 0.5 h. CFTR was visualized by FITC-conjugated goat anti-mouse Fab or TRITC-conjugated goat anti-mouse after cell fixation. During the last 45 min, cells were labeled with FITC-Tf or TRITC-Tf (15 μg/ml) after 45-min serum depletion to visualize recycling endosomes. Early endosome marker 1 (EEA1) and Rab5 were detected by indirect immunostaining using rabbit polyclonal anti-EEA-1 and anti-Rab5 antibodies from Abcam (Cambridge, United Kingdom) and Santa Cruz Biotechnology (Santa Cruz, CA), respectively. Lysosomes were labeled with FITC-dextran (50 μg/ml, MW 10 kDa) as described for the pH determination. In some experiments, the lysosome was stained with mouse monoclonal anti-Lamp-2 Ab (H4B4 Ab was developed by Dr. J. Thomas August and Dr. James E. K. Hildreth and was obtained from the Developmental Studies Hybridoma Bank maintained by The University of Iowa, Department of Biological Sciences, Iowa City, IA). Single optical sections were collected with a Zeiss LSM510 laser confocal fluorescence microscope, equipped with a Plan-Apochromat 63X/1.4 objective (Carl Zeiss Microimaging, Thornwood, NY) as described (Lechardeur et al., 2004). Images were processed with Adobe Photoshop (Adobe Systems, San Jose, CA) software. CFTR phagosomal colocalization was measured by internalizing anti-HA Ab complexed with TRITC-conjugated goat anti-mouse IgG for 1 h in DMEM and chased for 0.5 h, and then FITC-PAO1 bacteria were phagocytosed for the indicated time. Colocalization was performed using the “Colocalization” feature in Volocity 4.1.0 (Improvision software; Molecular Devices, Sunnyvale, CA) as described in the Supplemental Materials.

Western Blotting

CFTR expression level of cell lines was determined by immunoblotting using anti-HA (MMS101R, Covance) for epithelial cells or anti-CFTR antibodies (M3A7 and L12B4, Chemokine, East Orange, NJ) for macrophages. Cell lysates were prepared using RIPA buffer containing 10 μg/ml leupeptin, pepstatin, 100 μM phenylmethylsulfonyl fluoride, 10 μM MG132, and 10 mM N-ethylmaleimide, and immunoblotting of CFTR was performed as described previously using enhanced chemiluminescence (Sharma et al., 2004).

Iodide Efflux Assay

The plasma membrane cAMP-dependent halide conductance of transfected and parental cells was determined with the iodide efflux assay as described (Sharma et al., 2001). The parental cells have no or a negligible amount of cAMP-activated halide conductance. The CFTR inhibitor MalH2 (kindly provided by Dr. A. Verkman, University of California, San Francisco) was added simultaneously with the PKA agonist cocktail containing 20 μM forskolin, 0.5 mM CPT-cAMP, and 0.2 mM IBMX. The pH sensitivity of the iodide efflux was determined after incubating the cells at pH 5 during the last 10 min of the iodide loading and the efflux measurement in 20 mM MES-supplemented loading and efflux buffer, respectively. The iodide-selective electrode was calibrated in MES-containing buffer at pH 5.

Statistical Analysis

Experiments were repeated at least three times or as indicated. Data are means ± SEM. Significance was assessed by calculating two-tailed p values at 95% confidence level with unpaired t test, using Prism software (GraphPad Software, San Diego, CA).
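
The summary statistics named above can be sketched with the Python standard library as follows. The source does not state whether the unpaired t test assumed equal variances, so the pooled-variance (Student) form used here is an assumption, and the function names and data are hypothetical. Converting the t statistic to a two-tailed p value requires the t distribution, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev, variance


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


def unpaired_t(a, b):
    """Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t_stat = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t_stat, na + nb - 2


# Hypothetical mean vesicular pH from three independent experiments each.
control = [5.9, 6.1, 6.0]
treated = [6.4, 6.6, 6.5]
print(mean_sem(control), mean_sem(treated))
print(unpaired_t(control, treated))
```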


3 Results

3.1 Inter-residue interactions in ATM: comparison to BTM and GLOB

We analyzed inter-residue interactions in the TM region of 3462 helices from 430 proteins (ATM dataset; see Section 2). Absolute counts for the interactions are available in Supplementary Table S2. The results are presented as heatmaps that display: (i) all the interactions (Fig. 1A), (ii) those in the protein interior (Fig. 1B) or in the lipid-exposed protein surface (Fig. 1C), (iii) those of inter-helical (Fig. 1D) or intra-helical (Fig. 1E) types and (iv) interactions that involve either sidechain–sidechain, sidechain–backbone or backbone–backbone contacts (Supplementary Figs S1 and S2). The participation of each residue in the total number of inter-residue interactions is shown in Table 1. The distribution of the relative frequencies of inter-residue interactions in ATM presents a heterogeneous profile with a noteworthy participation of aliphatic residues, followed by aromatic and polar residues (mainly Ser and Thr), and very few interactions involving charged residues. The most frequent residue–residue pairs involve all combinations of aliphatic residues and Phe, mainly Leu–Phe, Leu–Ile and Leu–Val. Clearly, interactions involving branched aliphatic residues appear mainly at the protein surface, whereas interactions involving Ala mostly happen in the protein interior (Fig. 1B and C). There are also many interactions of aliphatic residues or Phe with Ser or Thr. By contrast, there are few polar–polar, charged–charged or polar–charged interactions, despite their well-known role in protein function (Muller et al., 2008) (Fig. 1H). These interactions occur mainly in the protein core. Inter-helical interactions are about three times as frequent as intra-helical interactions and are mainly sidechain–sidechain interactions of hydrophobic and Phe residues and interactions between the sidechains of hydrophobic and Phe residues and the backbone of Gly and Ala residues (see Fig. 1D and Supplementary Fig. S2). Intra-helical interactions are mainly formed by Pro, Ser and Thr residues, which interact through their sidechains with the backbone of residues in the preceding turn, with a minor contribution from hydrophobic and Phe sidechain–sidechain interactions (see Fig. 1E and Supplementary Fig. S2).
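
The normalization behind such a heatmap, pair counts divided by the total number of interactions, can be sketched as follows. The contact list is a made-up toy input and the function name is hypothetical; how contacts are detected in the structures is not covered here.

```python
from collections import Counter


def pair_frequencies(contacts):
    """Relative frequency of each unordered residue pair.

    contacts is an iterable of (residue_a, residue_b) three-letter codes;
    (Leu, Phe) and (Phe, Leu) are counted as the same pair.
    """
    counts = Counter(tuple(sorted(pair)) for pair in contacts)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Toy contact list (hypothetical, for illustration only).
toy_contacts = [("Leu", "Phe"), ("Phe", "Leu"), ("Leu", "Ile"), ("Ser", "Leu")]
frequencies = pair_frequencies(toy_contacts)
for pair, freq in sorted(frequencies.items(), key=lambda item: -item[1]):
    print(pair, freq)
```

Sorting each pair before counting keeps the matrix symmetric, so only one triangle of the heatmap needs to be stored.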

Fig. 1. The distribution of interactions in the ATM set compared with BTM and GLOB sets. Heatmaps of the frequency of inter-residue interactions by residue pairs normalized by the total number of interactions in (A) ATM set, (B) the interior (ATM IN) and (C) surface (ATM OUT) of ATMs, (D) inter-helical (INTER) and (E) intra-helical (INTRA) in ATMs, (F) BTM set and (G) α-helical GLOB set. (H) Net-plot representing the percentage of inter-residue interactions grouped by residue types: aromatic (Trp, Tyr and Phe), aliphatic (Ile, Leu, Val and Ala), Gly–Pro, sulfur containing (Met and Cys), polar (Ser, Thr, Asn and Gln) and charged (His, Arg, Lys, Glu and Asp) residues. Backbone–backbone interactions were not considered in these plots.

Table 1. Number of residues and participation in inter-residue interactions for each residue in ATMs

Residue  Group              Number of residues  Number of interacting residues  Number of interactions
Trp      Aromatic           1846 (2.6%)         6801 (3.6%)                     6643 (7.1%)
Tyr      Aromatic           2414 (3.4%)         8745 (4.7%)                     8492 (9.1%)
Phe      Aromatic           6071 (8.6%)         19 688 (10.5%)                  18 446 (19.7%)
Ile      Aliphatic          7653 (10.8%)        18 541 (9.9%)                   17 389 (18.6%)
Leu      Aliphatic          11 694 (16.5%)      29 210 (15.6%)                  26 498 (28.3%)
Val      Aliphatic          7634 (10.8%)        17 641 (9.4%)                   16 734 (17.9%)
Ala      Aliphatic          8141 (11.5%)        16 538 (8.8%)                   15 699 (16.8%)
Gly      Gly–Pro            6241 (8.8%)         9133 (4.9%)                     9133 (9.8%)
Pro      Gly–Pro            1806 (2.5%)         6991 (3.7%)                     6919 (7.4%)
Met      Sulfur containing  2612 (3.7%)         8689 (4.6%)                     8467 (9.0%)
Cys      Sulfur containing  1039 (1.5%)         2905 (1.6%)                     2872 (3.1%)
Ser      Polar              3755 (5.3%)         11 384 (6.1%)                   11 013 (11.8%)
Thr      Polar              3728 (5.3%)         11 927 (6.4%)                   11 541 (12.3%)
Asn      Polar              1302 (1.8%)         4433 (2.4%)                     4328 (4.6%)
Gln      Polar              902 (1.3%)          2915 (1.6%)                     2863 (3.1%)
His      Charged            819 (1.2%)          2597 (1.4%)                     2496 (2.7%)
Arg      Charged            1016 (1.4%)         2961 (1.6%)                     2906 (3.1%)
Lys      Charged            718 (1.0%)          1559 (0.8%)                     1546 (1.7%)
Glu      Charged            799 (1.1%)          2440 (1.3%)                     2410 (2.6%)
Asp      Charged            668 (0.9%)          2088 (1.1%)                     2060 (2.2%)

Group totals       Number of interacting residues  Number of interactions
Aromatic           35 234 (18.8%)                  31 650 (33.8%)
Aliphatic          81 930 (43.8%)                  62 300 (66.6%)
Gly–Pro            16 124 (8.6%)                   15 597 (16.7%)
Sulfur containing  11 594 (6.2%)                   11 190 (12.0%)
Polar              30 659 (16.4%)                  27 685 (29.6%)
Charged            11 645 (6.2%)                   10 702 (11.4%)

Absolute counts and percentages (in parentheses) of each amino acid on the composition and number of interacting residues and interactions in the ATM set. Percentages of the number of interacting residues and number of interactions are calculated for each residue (or group of residues) dividing by the total number of interacting residues (187 186) and by the total number of interactions (93 593), respectively. Backbone–backbone interactions are not considered in this table.
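
As a quick check of how the percentages in the table are derived, the short sketch below recomputes a few of them from the absolute counts and the two totals quoted in the caption. The helper name is hypothetical.

```python
TOTAL_INTERACTING_RESIDUES = 187186  # total quoted in the table caption
TOTAL_INTERACTIONS = 93593           # total quoted in the table caption


def percent(count, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Trp: 6801 interacting residues and 6643 interactions.
trp_residue_pct = percent(6801, TOTAL_INTERACTING_RESIDUES)
trp_interaction_pct = percent(6643, TOTAL_INTERACTIONS)

# Leu: 29 210 interacting residues and 26 498 interactions.
leu_residue_pct = percent(29210, TOTAL_INTERACTING_RESIDUES)
leu_interaction_pct = percent(26498, TOTAL_INTERACTIONS)

print(trp_residue_pct, trp_interaction_pct)  # 3.6 7.1
print(leu_residue_pct, leu_interaction_pct)  # 15.6 28.3
```

The total number of interacting residues is roughly twice the total number of interactions because each interaction involves two residues; the per-residue interaction percentages therefore sum to more than 100%.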

For comparison purposes, we analyzed a set of 129 BTMs and a set of 1231 bundles (19 861 helices) of α-helical GLOB (see Section 2, Supplementary Table S2 and Supplementary Figs S1 and S3). The distribution of interactions observed for the ATM set is remarkably different from that observed for the BTM set (Fig. 1F and Supplementary Tables S3 and S4). The latter shows a more homogeneous distribution of the interactions, with fewer aliphatic–aliphatic and aliphatic–polar interactions and more interactions involving aromatic residues. Tyr accounts for the highest participation in inter-residue interactions. Because the protein core of β-barrels is hydrophilic, the frequencies of polar–polar, polar–charged and charged–charged interactions are larger. The distribution of interactions in the GLOB set is notably similar to that of the ATM set (see Fig. 1G and Supplementary Tables S3 and S4). The GLOB set features a similar frequency of polar interactions compared with the ATM set, more charged–charged interactions and fewer aliphatic interactions involving residues other than Leu.

3.2 The role of residues in ATM: comparison to BTM and GLOB

To understand the role of each residue in the above-described inter-residue interactions in MPs (Fig. 1 and Supplementary Figs S1 and S2), it is important to know the preference of each residue (or group of residues) for specific localizations within the bundle: for the protein interior or for the surface, and for the deep membrane regions or for interfacial regions. For this purpose we analyzed the frequency and distribution of each amino acid considering the overall protein (ALL), and residues located either in the protein core (IN) or in the lipid-exposed protein surface (OUT) (see Fig. 2A and Supplementary Table S5). We also looked at the residue distribution along a membrane profile (see Fig. 2D and Supplementary Fig. S4). These data show that many residues do exhibit a marked preference for being located at the protein interior or facing the lipid bilayer, and for lying at the center of the TM segments or at one or both (extracellular and cytoplasmic) ends. The frequencies of each residue and group of residues and their contribution to the overall number of interactions are shown in Table 1. χ2 goodness-of-fit tests (see Section 2) were performed to determine whether inter-residue interactions observed in the ATM set (Fig. 1) are statistically over-represented or under-represented (Supplementary Fig. S5) compared with what would be expected assuming random interactions based on the observed composition (amino acid frequencies in Table 1 and Supplementary Table S5). The same analysis was carried out, for comparative purposes, in the BTM and GLOB sets (Fig. 2F and G, Supplementary Table S5 and Supplementary Fig. S6).
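The over- and under-representation test described above can be illustrated with a small sketch. The exact procedure of the original study is in its Section 2, which is not reproduced here, so everything below is an assumption: the expected share of an unordered pair is taken as f_a² for identical residues and 2·f_a·f_b otherwise, one pair is tested against all remaining pairs (1 degree of freedom), and the frequencies and counts are invented toy values. The function names `expected_pair_fractions` and `pair_enrichment` are hypothetical.

```python
import math
from itertools import combinations_with_replacement


def expected_pair_fractions(freqs):
    """Expected share of each unordered residue pair if residues paired at random.

    Keys of the result are alphabetically sorted tuples such as ("F", "L").
    """
    expected = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # A pair of two different residues can be drawn in two orders.
        expected[(a, b)] = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
    return expected


def pair_enrichment(observed, freqs, pair):
    """Chi-square goodness-of-fit (1 d.o.f.) of one pair versus all other pairs.

    Returns (observed/expected ratio, chi-square statistic, p-value).
    """
    total = sum(observed.values())
    exp_pair = expected_pair_fractions(freqs)[pair] * total
    obs_pair = observed.get(pair, 0)
    exp_rest = total - exp_pair
    obs_rest = total - obs_pair
    chi2 = (obs_pair - exp_pair) ** 2 / exp_pair + (obs_rest - exp_rest) ** 2 / exp_rest
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return obs_pair / exp_pair, chi2, p_value


# Toy composition and toy interaction counts (NOT data from the study).
toy_freqs = {"L": 0.5, "F": 0.3, "S": 0.2}
toy_observed = {
    ("F", "F"): 60, ("F", "L"): 400, ("F", "S"): 90,
    ("L", "L"): 250, ("L", "S"): 170, ("S", "S"): 30,
}

ratio, chi2, p = pair_enrichment(toy_observed, toy_freqs, ("F", "L"))
print(f"F-L observed/expected = {ratio:.2f}, chi2 = {chi2:.1f}, p = {p:.2e}")
```

A ratio above 1 with a small p-value marks a pair as over-represented. When many pairs are tested, the p-values need a multiple-testing correction, for example by multiplying each by the number of tests (Bonferroni).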

Composition, distribution and preferred localization of each amino acid. The distribution of amino acid frequencies (top) and groups of residues (bottom; see Fig. 1 for the definition of residue groups) in the ATM (A), BTM (B) and GLOB (C) datasets. The contributions from residues in the protein interior (IN) or in the protein surface (OUT) are also shown. * indicates statistical differences (P < 0.05) between IN and OUT frequencies in a χ2 goodness-of-fit test with Bonferroni corrections. (D) Sequence logos for the residue composition along the TM helices of the ATM dataset. Image generated with WebLogo (Crooks et al., 2004). For each helix, the membrane center (set at 0 Å) is based on Orientations of Proteins in Membranes (Lomize et al., 2012). The membrane extends approximately from −15 Å (intracellular) to +15 Å (extracellular).


3.2.1 Aliphatic residues

Aliphatic residues are the most common in TM helices. They account for half of the total number of residues and participate in two-thirds of the inter-residue interactions in TM helices. They are ubiquitously distributed along the hydrophobic membrane segment, though with a maximum at the bilayer center. Leu is by far the star residue and has the largest frequency (16%) and the largest participation in inter-residue interactions (28%). At the protein surface, Leu, Ile and Val are the most frequent residues, indicating that the membrane prefers to interact with branched residues. These residues interact with other Leu, Ile and Val residues (notably in Leu-Val, Leu-Leu and Leu-Ile interactions), as well as with Phe and Ala residues located at the protein surface. These interactions are over-represented relative to the individual residue frequencies. In the protein core, Ala is the most frequent aliphatic residue, followed by Leu, and consequently they are involved in most of the inter-residue interactions, mainly with other aliphatic residues (Leu, Ala, Val and Ile) and Phe through inter-helical sidechain–sidechain interactions. Ala also plays an important role in inter-helical interactions, as its small sidechain can accommodate hydrophobic and polar side chains close to its backbone. Thus, aliphatic residues play an important role in helix packing.
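The group shares quoted in this section (about half of all residues aliphatic, roughly 15% aromatic, 14% polar, 6% charged) can be rechecked from the per-residue counts in Table 1. The following is a minimal sketch of that arithmetic; the counts are copied from the "Number of residues" column, while the dictionary and function names are only illustrative.

```python
# Residue counts in the ATM set, copied from the "Number of residues" column of Table 1.
COUNTS = {
    "Trp": 1846, "Tyr": 2414, "Phe": 6071,
    "Ile": 7653, "Leu": 11694, "Val": 7634, "Ala": 8141,
    "Gly": 6241, "Pro": 1806,
    "Met": 2612, "Cys": 1039,
    "Ser": 3755, "Thr": 3728, "Asn": 1302, "Gln": 902,
    "His": 819, "Arg": 1016, "Lys": 718, "Glu": 799, "Asp": 668,
}

# Residue groups as used in Table 1.
GROUPS = {
    "Aromatic": ("Trp", "Tyr", "Phe"),
    "Aliphatic": ("Ile", "Leu", "Val", "Ala"),
    "Gly-Pro": ("Gly", "Pro"),
    "Sulfur containing": ("Met", "Cys"),
    "Polar": ("Ser", "Thr", "Asn", "Gln"),
    "Charged": ("His", "Arg", "Lys", "Glu", "Asp"),
}


def group_shares(counts, groups):
    """Percentage of all residues that falls into each residue group."""
    total = sum(counts.values())
    return {
        name: 100.0 * sum(counts[res] for res in members) / total
        for name, members in groups.items()
    }


shares = group_shares(COUNTS, GROUPS)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Because every residue belongs to exactly one group, the six shares add up to 100%.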

In BTM, aliphatic residues are exposed to the lipid bilayer, participating in 40% of all interactions, through aliphatic–aliphatic and through aliphatic–aromatic interactions. The presence of aliphatic residues in the core of BTM is rare as water molecules can access the protein core. In GLOB, aliphatic residues participate in 60% of the interactions, mainly through aliphatic–aliphatic interactions. The core of GLOB proteins is highly rich in aliphatic residues, especially in Leu and Ala residues. Although Leu and Ala have similar frequencies, Leu participates in one third of all interactions, while Ala contributes little.

3.2.2 Aromatic residues

Aromatic residues account for 15% of all residues in TM helices and participate in nearly 36% of the interactions. Although aromatic–aromatic interactions are known to contribute a significant portion of the stabilizing forces in proteins (Goyal et al., 2017), interactions of this type are scarce (about 4%). Instead, aromatic residues interact mainly with aliphatic residues. Phe is the second most frequent residue in TM helices and by far the most frequent aromatic residue. Phe residues are equally present at the surface and in the protein core. They exhibit a relatively flat distribution along the TM segment, though with a peak at the extracellular end. Phe mainly interacts with Leu at the surface and with aliphatic residues in general in the protein core. In fact, Phe-Leu is the most frequent inter-residue interaction in TM helices, where it has an important role in packing helices through sidechain–sidechain interhelical interactions. Tyr and Trp residues have small frequencies and a marked preference for the TM ends, in accordance with their role in interacting with the lipid head-groups (Baker et al., 2017; Muller et al., 2008; von Heijne, 1992) and with polar residues in the extracellular and intracellular interface. Whereas Tyr residues prefer the protein core, Trp residues appear equally buried in the protein interior or at the surface. Tyr and Trp participate in few inter-residue interactions, mainly with Leu. Yet, these interactions are over-represented compared with the interactions expected from the individual amino acid frequencies.

Aromatic residues account for 40% of the total interactions in BTM, mainly through Tyr and Trp, facing the lipid bilayer and performing aromatic–aromatic and aliphatic–aromatic interactions. Tyr performs a significant number of interactions, mainly with polar and charged residues in the protein core and with aliphatic and other aromatic residues in the surface of the protein. The few aromatic residues found in GLOB are mainly located at the core of the protein.

3.2.3 Polar residues

Polar residues represent 14% of the total number of residues in TM segments and participate in 30% of the interactions, mainly involving Ser and Thr. They have a clear preference for the protein core and are ubiquitously spread along the TM helix. Most of the interactions occur between the sidechain of Ser or Thr and the backbone carbonyl of the residue in the preceding turn, inducing distortions (Ballesteros et al., 2000). Thus, Ser and Thr have an important role in intra-helical interactions. A few inter-helical interactions are also formed between the methylene group of the Thr sidechain and hydrophobic residues, or through weak C-H···O-H hydrogen bonds between the hydroxyl groups and hydrophobic residues (Desiraju, 2005). Although polar–polar interactions have an important role in the function of proteins (Muller et al., 2008), there are relatively few such interactions in TM helices. Gln and Asn exhibit small frequencies and thus participate in a marginal number of inter-residue interactions. They prefer the exterior of the protein, close to the lipid head-groups.

BTM presents the highest percentage of polar residues (around 20%) compared with ATM and GLOB. Polar residues interact with all residue types and are mainly located at the protein interior where they can interact with water molecules. Although polar residues are frequent in GLOB, they contribute little to interactions, as these residues are located at the protein surface and interact mainly with water molecules.

3.2.4 Gly and Pro

Gly and Pro are well-known helix distorters or breakers. Gly prefers to be in the protein interior (in fact it is the second most frequent residue in the protein core) and in the central part of the TM helices. Its lack of a sidechain allows the sidechains of bulky neighboring hydrophobic residues to be accommodated close to its backbone (Eilers et al., 2002). The neighboring sidechains form weak C-H···O=C interactions with the backbone of Gly. Thus, Gly has an important role in backbone–sidechain interhelical interactions. Pro is present along the TM helix, although with a preference for the extracellular region. Pro mainly interacts with Leu, either in the protein core or at the protein surface. Pro–aliphatic interactions are over-represented considering the frequencies of residues in TM helices. Pro has an important role in intra-helical interactions, as its sidechain interacts with the backbone of the preceding turn, as do Ser and Thr residues. Thus, besides its role as helix breaker or as helix distortion inducer (Cordes et al., 2002), Pro contributes to stabilizing TM bundles through intra-helical interactions.

In BTM, Pro contributes to very few interactions, whereas Gly participates in a significant number of interactions with hydrophobic and aromatic residues through its backbone. In GLOB, both Pro and Gly have small frequencies and thus little participation in interactions.

3.2.5 Cys and Met

The sulfur-containing residues Cys and Met show low frequencies (especially Cys) in TM helices and prefer the protein core and the center of the TM helices (see Fig. 2). Cys residues interact mainly with aliphatic residues, Met and Phe. This is compatible with the fact that sulfur-containing residues form strong interactions with aliphatic and aromatic amino acids (Cordomi et al., 2013; Gómez-Tamayo et al., 2016). Most Cys–Cys interaction distances are compatible with non-bonded interactions rather than disulfide bonds. Met makes an important contribution to the overall inter-residue interactions despite its low frequency. In fact, interactions involving Met are over-represented compared with the interactions expected from the individual amino acid frequencies.

Cys and Met are even scarcer in BTM and GLOB and contribute few interactions. They have no preference for being IN or OUT.

3.2.6 Charged residues

The group of putatively charged residues (His, Lys, Arg, Glu and Asp) has the smallest frequencies in TM segments (they sum to only 6%) and participates in a small number of inter-residue interactions (around 11%). They are located close to the ends of the TM segments. Arg and Lys predominantly face the cytoplasmic part, following the ‘positive inside rule’ (Baker et al., 2017; Muller et al., 2008; von Heijne, 1992). Despite the low participation in inter-residue interactions, the few charged–charged interactions are over-represented compared with the interactions expected from the individual amino acid frequencies, consistent with their predicted role in protein function (Muller et al., 2008).

Charged residues are few in BTM; they are mainly found in the protein core, where they interact with charged and polar residues. In contrast, they account for 40% of residues at the surface of GLOB, where these residues can interact with the hydrophilic environment and with other charged residues.


Functional sub-classification of transmembrane protein classes

After membrane proteins are identified and separated from soluble proteins using the topology prediction programs outlined above, the second level of classification in Figure 1 involves grouping the population of membrane proteins into individual functional classes, and to prospectively identify the function of characterized genes. Several methods have been reported to accomplish this task, which are summarized in Table 3 and visually diagrammed in Figure 2.

As with many topology prediction algorithms, these methods often require the amino acid sequence to be summarized in a quantitative fashion to compare two proteins. One such descriptor that has been successfully utilized is the fraction of a protein's sequence comprised of each of the twenty naturally occurring amino acids, a vector of length twenty that sums to one and is termed the 'amino acid composition' [56,57]. The intuition behind this descriptor is that distinct classes of membrane proteins are biased to include particular amino acids at greater frequency due to the structural requirements or constraints of their function. Refinements of the amino acid composition descriptor have also been proposed, such as using the un-normalized count of the twenty amino acids in a protein sequence, a method reported to be more effective as it also captures differences in the characteristic length of a protein family [58]. Similarly, expanding the normalized amino acid composition to a vector of length sixty (twenty elements for the composition of the whole protein, and twenty each for the composition of the transmembrane and non-transmembrane segments) has also allowed better discrimination [59]. Like amino acid composition, dipeptide frequencies have also been successfully utilized as descriptors to discriminate membrane proteins of different classes [56,57,60]. The previously mentioned PSSM derived from Position-Specific Iterative BLAST (PSI-BLAST), which measures the likelihood of a substitution from the observed to an alternate amino acid at a particular position based on substitution patterns between a protein and its homologous neighbors, has also been found to have high sensitivity as a descriptor [61]. More abstractly, numerical descriptors of folding energetics have also been employed in predictive models [61].
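As an illustration of the composition descriptors described above, the following minimal sketch computes the length-twenty amino acid composition and the length-sixty variant. It is not the implementation used in the cited studies: the function names, the convention of marking transmembrane positions with 'M' in a mask string, and the toy sequence are all assumptions made for this example.

```python
from collections import Counter

# The twenty standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Length-20 vector of amino acid fractions; sums to one for a non-empty sequence."""
    counts = Counter(seq)
    n = sum(counts[aa] for aa in AMINO_ACIDS)  # ignore non-standard symbols
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / n for aa in AMINO_ACIDS]


def composition60(seq, tm_mask):
    """Length-60 vector: whole protein, transmembrane part, non-transmembrane part.

    tm_mask is a hypothetical per-residue annotation of the same length as seq,
    with 'M' marking residues predicted to lie in a transmembrane segment.
    """
    if len(seq) != len(tm_mask):
        raise ValueError("sequence and mask must have the same length")
    tm = "".join(res for res, tag in zip(seq, tm_mask) if tag == "M")
    non_tm = "".join(res for res, tag in zip(seq, tm_mask) if tag != "M")
    return composition(seq) + composition(tm) + composition(non_tm)


# Toy sequence and mask, invented for illustration only.
toy_seq = "MKLLLLAVVFFGRK"
toy_mask = "iiMMMMMMMMMMoo"
vec60 = composition60(toy_seq, toy_mask)
print(len(vec60), round(sum(vec60[:20]), 6))
```

The un-normalized variant mentioned in the text corresponds to returning the raw counts instead of dividing by the sequence length, which keeps information about protein size in the descriptor.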

Just as the input descriptors to these algorithms are varied, so are the kinds of functional predictions produced in these studies. Several methods have been used to predict a query gene's family membership, such as classifying channels, transporters, and carriers from one another [58]. In greater detail, these methods have also been used to predict a protein's substrate, such as different metal ions for channels or protein/nucleic acids for transporters [62]. Predictions have also been targeted at functional parameters specific to particular classes of membrane proteins. For example, amino acid sequence has been used to predict the half-maximal activation potential of voltage-gated channels [63], discriminate between channels based on their electrophysiological parameters [64], or identify channels that may serve as promising therapeutic targets [65].

These previously described methods, in essence, rely on the proximity of a query protein to a neighborhood of known proteins in the space of the descriptor used. Further refinements have been proposed, where this proximity measurement may be combined with other features such as Gene Ontology terms describing the biological processes, molecular functions, and subcellular localization of a protein [66], the presence of class-associated protein family (Pfam) domains [67], or the number of predicted transmembrane domains [68]. The resulting combination of annotated and raw sequence information may then be used in a prediction algorithm such as the previously discussed SVM [68]. Indeed, the ability of amino acid profiles to serve as relevant features for identifying functionally related proteins may suggest that families share specific motifs, and specific structural fragments and motifs have also been identified in related studies [69,70].
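The idea of classifying a query protein by its proximity to known proteins in descriptor space can be sketched with a simple k-nearest-neighbour vote. This is only one possible instance of the approach, not the algorithm of any cited study: the descriptor is the plain amino acid composition, the distance is Euclidean, and the labelled sequences are short invented strings rather than real proteins.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Length-20 vector of amino acid fractions for a non-empty sequence."""
    counts = Counter(seq)
    n = sum(counts[aa] for aa in AMINO_ACIDS)
    return [counts[aa] / n for aa in AMINO_ACIDS]


def predict_class(query, labelled, k=3):
    """Majority label among the k labelled sequences closest to the query.

    labelled is a list of (sequence, label) pairs; distance is Euclidean
    distance between composition vectors (math.dist, Python 3.8+).
    """
    q = composition(query)
    nearest = sorted(labelled, key=lambda item: math.dist(q, composition(item[0])))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented toy training set; the labels only echo the class names used in the text.
toy_training = [
    ("LLLLVVVVIIAA", "channel"),
    ("LLLVVVIIIAAG", "channel"),
    ("LLVVIIAAFF", "channel"),
    ("KKRRDDEESS", "transporter"),
    ("KRDESSTTNN", "transporter"),
    ("KKRRSSTTQQ", "transporter"),
]

print(predict_class("LLVVIIAG", toy_training))
```

Additional features such as the number of predicted transmembrane domains could be appended to the descriptor vector before computing distances, which is the kind of combination the paragraph above describes.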

Expanding on these predictions based on two-dimensional structure correlated with classifications or functional parameters, methods have also been developed to directly infer function from a three-dimensional conformation. For example, the SLITHER program uses molecular modeling simulations to predict whether a putative substrate molecule may permeate the cavities or channels in a protein structure [71]. In cases where the existence of a channel in a protein is unverified, the MolAxis program can be used to predict whether one exists using computational geometry [72]. Obviously, both of these methodologies require three-dimensional protein coordinates, which are experimentally unavailable for most channels or other membrane proteins, but they might be combined with the homology-based three-dimensional structure predictions described in the previous section to generate functional predictions for inferred three-dimensional structures.

A related functional prediction is to identify the natural ligand, ion substrate or protein interaction partner of novel proteins. Indeed, examples highlight the challenge of deorphanizing a large number of seven-transmembrane receptors: the natural binding partner(s) of some otherwise well-characterized transmembrane receptors, such as BRS-3, remain unknown [73]. Though not specifically developed to identify peptide–receptor interactions in silico, large-scale predictions of protein–protein interactions have been described using two- and three-dimensional information [74,75,76]. Conceivably, such algorithms might be employed to identify novel interactions between peptide ligands and the subset of peptide-binding receptors. Direct bioinformatic identification of ligands such as neuropeptide precursors has also benefited from the increased availability of genome-wide proteomic and nucleotide data, as demonstrated by the computational prediction of more than 200 novel neuropeptides in the honeybee Apis mellifera, of which 100 were validated using peptidomics [77]. Related studies of the red flour beetle Tribolium castaneum have employed homology analysis to validate 30/41 predicted neuropeptide genes using mass spectrometry data, encoding 71 peptides [78]. Given the accuracy of the predictions in these studies using large genomics datasets, we speculate that such methods and information provide a promising pool of potential novel ligands that might be screened in functional assays against putative peptide-binding receptors.


The role of other protein factors

Interactions between TM domains cannot explain how two proteins that have identical primary structures and use the same basic translocation machinery can be synthesized in two different orientations. Several proteins, including the prion protein (PrP), ductin, myelin proteolipid protein (PLP) and the cystic fibrosis transmembrane conductance regulator (CFTR), exist in multiple topological forms (Lopez et al., 1990; Dunlop et al., 1995; Hegde et al., 1998b; Wahle and Stoffel, 1998). Although a nascent chain may access one of the many available folding funnels, studies of PrP have demonstrated that this distribution can be altered both in cis and in trans.

Interprotein interactions can play a role in both TM domain integration and STE recognition. PrP can be synthesized in three different topological forms: Ntm PrP, a type I membrane protein in which the N-terminus is in the lumen; Ctm PrP, a type II membrane protein in which the C-terminus is in the lumen; and a secretory form called sec PrP. In vitro, in the absence of translocation accessory factor (TrAF) activity, PrP is made exclusively as the Ctm PrP form (Hegde et al., 1998b), which causes neurodegeneration in mice and humans when synthesized in vivo (Hegde et al., 1998a). Little more is known about TrAF, but perhaps it regulates how or when other factors, such as TRAM, interact with PrP and probably with many other proteins. Early studies suggested that receptor-mediated recognition events occur during translocation starting and stopping (Mize et al., 1986), which is consistent with the subsequent identification of STEs (Yost et al., 1990). Recently, crosslinking studies of an IgM STE sequence identified two membrane proteins involved in STE recognition or function (Falcone et al., 1999). Characterization of these STE receptors will be one of the next steps toward understanding how integration is regulated.

Chaperone activity also appears to have a role in integration. At least one protein factor in the ER membrane is proposed to be responsible for proper biosynthesis of the gap junction component connexin. In vitro synthesis or in vivo overexpression of connexin results in the production of aberrantly cleaved molecules because signal peptidase mistakes the first TM domain for a signal peptide. In vivo cleavage of the TM domain is believed to be prevented by an unidentified chaperone in the membrane, which recognizes the nascent chain and blocks the access of signal peptidase. In vitro this chaperone may be absent or non-functional (Falk and Gilula, 1998).

Co-translocational modification of nascent chains can also affect biosynthesis. Oligosaccharyl transferase (OST) associates with the translocon and glycosylates nascent chains as they emerge in the ER. To look at possible effects of glycosylation on TM domain orientation, Goder et al. (1999) created a chimeric protein that can be synthesized in either of two topological forms. When they engineered glycosylation sites, they found that reorientation of a transmembrane domain in the translocon was prevented by glycosylation of the lumenal TM loop. These results suggest that regulation of glycosylation of native proteins can control folding and orientation of proteins according to the needs of the cell.

The interprotein interactions described above probably affect biosynthesis of many different membrane proteins. Substrate-specific interprotein interactions also affect biosynthesis. In the membrane, as in the cytosol, proteins associate to form functional complexes. Studies of the P-type Na+/K+-ATPase revealed that the correct insertion of the seventh and eighth TM domains of the polytopic α subunit requires association of the bitopic β subunit with the extra-cytosolic loop between the two TM domains (Beguin et al., 1998). When the β subunit encounters the proper region of the α subunit, it appears to induce a conformational change that promotes proper folding and integration of the TM domains. Specific trans interactions that facilitate proper formation of membrane protein complexes might prevent the nascent chain from making undesirable or deleterious associations with itself or other proteins.

We are beginning to understand more about the proteins that influence membrane protein biosynthesis, but there is much left to learn. Characterization of both TrAFs and the STE receptors will improve our understanding of the mechanism of membrane domain integration, as will additional examples of substrate-specific interactions. Identification of the chaperone involved in connexin biosynthesis will enable us to learn how membrane chaperones function. Finally, discovery of proteins that use glycosylation to control orientation in vivo will clarify other ways in which biosynthesis can be regulated.


Are we missing out on the dynamic oligomeric state? How to study a symmetric and flexible dimer?

Gaining control of a weak dimer

The low number of currently available homodimeric TMD structures from SPTMRs is likely a reflection of the challenges associated with structural studies of these symmetric and flexible dimers. A high conformational flexibility and modest affinity of the TMD homodimers are presumably a prerequisite for their anticipated ability to switch between conformations; these properties, however, also clearly challenge structural studies of SPTMR-TMD dimers. As discussed, the overall quality of the available structures appears limited, and, in addition, the oligomeric state of the protein is not always clearly defined. The homodimeric state of most of the TMDs has mainly been confirmed under the applied conditions by acquisition of interhelical NOEs [15, 23, 58-63, 65, 66, 69, 71, 72], sometimes in combination with cross-linking experiments [64] or with the detection of size changes as estimated from backbone relaxation rates [61, 62, 69] or ultracentrifugation [70]. Interhelical NOEs obtained in filter-based experiments should preferably be compared to background reference spectra of the protein in the absence of unlabeled protein to avoid misinterpretation of strong intramolecular peaks leaking through the filter due to imperfect isotope labeling [50], but it is rarely clear whether these have been acquired. In the structural studies of the monomeric TMDs, it is also not always shown that the TMD is in fact monomeric under the applied conditions. This lack of evaluation of the oligomeric state is probably a result of the complications introduced by their embedment in membrane mimetics, which renders classical size estimations dependent on two unknowns (the extent of oligomerization of the protein and the size of the membrane mimetics). Detection of interhelical NOEs is, when used properly, a reliable measure of a significant population of dimer.
Nonetheless, interspecies NOEs are resource-heavy measurements that are not suitable for evaluating large sets of conditions and are not easily, if at all possible, acquired on transiently interacting proteins [120] . Furthermore, they are not appropriate to confirm the absence of oligomerization for studies of the monomeric state. A possible strategy is utilization of paramagnetic tags [48] , the implementation of which nonetheless may require prior knowledge of the dimerization site, the generation of a series of mutants, as well as additional technical challenges associated with the use of hydrophobic labels in the context of a membrane-mimicking environment. Thus, development of high-throughput methods for reliable determination of the oligomeric state of micelle-embedded TMDs under a large set of NMR-suitable conditions is essential for the development of the field. One promising lead is the recent development of native mass spectrometry on membrane proteins, allowing acquisition of spectra on intact membrane protein complexes embedded in micelles [125] . These methods are, however, still not sufficiently quantitative to determine the extent of oligomerization.

A particular challenge is that of stabilizing membrane protein oligomers in non-native membrane mimetics. It is becoming increasingly clear that the lipid bilayer may participate in TMD associations through sequence independent effects such as membrane thickness and charge [126, 127] . Furthermore, sequence dependent effects mediating specific lipid binding to TMD helices have been suggested to play important roles in the regulation of the monomer–dimer equilibrium, both by counteracting and enhancing dimerization [128] . These effects are naturally not well simulated in the membrane mimetics that allow structural studies by solution-state NMR spectroscopy, putatively destabilizing the TMD homodimers further. Several studies have reported that the monomer–dimer equilibrium for some TMDs may be manipulated, and thereby studied, through changes in the detergent-to-protein (DP) or lipid-to-protein (LP) ratio, as first reported by Fisher et al. [129] on the GpA-TMD. This is essentially analogous to the common approach of diluting a protein complex with the purpose of pushing the equilibrium toward the monomer conformation to allow its dissociation constant to be determined. In addition to GpA, monomer–dimer transitions as a result of changes in the DP ratio have been reported for many of the TMDs of which structures have been reported, e.g., the homodimers of FGFR3, VEGFR, ErbB1-4, APP, and TLR3 [23, 59-63, 69, 71, 130] .
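The dilution argument above can be made concrete with the textbook mass-action expression for a homodimer, 2M ⇌ D with Kd = [M]²/[D] and total protein P = [M] + 2[D]. The sketch below solves that quadratic for the free monomer. It assumes an ideal two-state equilibrium and treats "concentration" abstractly; in detergent or lipid systems it would typically be expressed relative to the detergent or lipid (for example via the DP or LP ratio), and the function names are only illustrative.

```python
import math


def monomer_concentration(total, kd):
    """Free monomer concentration for 2M <-> D with Kd = [M]^2 / [D].

    Solves 2[M]^2 + Kd*[M] - Kd*P = 0, which follows from P = [M] + 2[D].
    total and kd must share the same concentration unit.
    """
    return (-kd + math.sqrt(kd * kd + 8.0 * kd * total)) / 4.0


def dimer_fraction(total, kd):
    """Fraction of all protein chains that are part of a dimer."""
    return 1.0 - monomer_concentration(total, kd) / total


# Diluting the protein (lower total relative to Kd) shifts the population to monomer.
for total in (100.0, 10.0, 1.0, 0.1, 0.01):
    print(f"P/Kd = {total:>6}: dimer fraction = {dimer_fraction(total, 1.0):.3f}")
```

In this model, half of the chains are dimeric when the total protein concentration equals Kd, so observing the transition midpoint during a dilution series gives a direct estimate of the dissociation constant under the stated assumptions.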

Solution-state NMR spectroscopy is a powerful tool for studying monomer–dimer transitions as a consequence of changes in, e.g., the DP ratio, offering detection and residue-level mapping of the formation and dissociation of low to moderate affinity complexes, such as homodimers, as well as providing structural information. Even in cases where multiple oligomeric states occur simultaneously, data may under certain conditions be extracted and assigned to each state. Methodologies for utilizing solution-state NMR spectroscopy for investigation of TMD oligomerization based on manipulating the DP or LP ratio have been developed [130, 131]; however, a high-throughput method for confirming that the chemical shift changes observed stem from changes in oligomeric states is still desirable. An additional major drawback is that the detergent concentrations applicable for NMR studies have both upper and lower limits. The upper limit is caused by an increase in sample viscosity, and thereby a slowdown in molecular tumbling, leading to line broadening [132], with the exact concentration limit dependent on the nature of the detergent and the experimental conditions. The lower limit arises because it is difficult to properly analyze data obtained with detergent concentrations below the critical micelle concentration (CMC) of the detergent, as the nature of protein/detergent micelles under these conditions is poorly understood. Consequently, only systems undergoing complete monomer–dimer transitions within the applicable detergent range are interpretable. Furthermore, with the exception of APP-TMD in LMPG micelles [130], the majority of TMD-detergent systems studied so far exhibit slow exchange on the NMR timescale of the monomer–dimer transition. Consequently, NMR methodologies for quantitatively studying systems in fast exchange are still lagging behind.
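To make the distinction between exchange regimes more tangible: in slow exchange the monomer and dimer give separate peaks, whereas in fast exchange a two-state system shows a single peak at the population-weighted average chemical shift. The sketch below uses that standard two-state relation to convert between dimer population and observed shift. It is a generic illustration under the assumption of ideal two-state fast exchange, not a method from the cited studies, and the shift values are invented.

```python
def fast_exchange_shift(dimer_population, delta_monomer, delta_dimer):
    """Observed chemical shift (ppm) for two-state fast exchange.

    The observed shift is the population-weighted average of the two states.
    """
    return (1.0 - dimer_population) * delta_monomer + dimer_population * delta_dimer


def dimer_population_from_shift(delta_obs, delta_monomer, delta_dimer):
    """Invert the weighted average to recover the dimer population."""
    if delta_dimer == delta_monomer:
        raise ValueError("the two states must have different chemical shifts")
    return (delta_obs - delta_monomer) / (delta_dimer - delta_monomer)


# Invented example: an amide proton at 8.00 ppm (monomer) and 8.40 ppm (dimer).
observed = fast_exchange_shift(0.25, 8.00, 8.40)
print(f"observed shift: {observed:.2f} ppm")
print(f"recovered dimer population: {dimer_population_from_shift(observed, 8.00, 8.40):.2f}")
```

The inversion requires the limiting shifts of the pure monomer and pure dimer, which is one reason a system must complete its transition within the accessible detergent range to be interpreted quantitatively.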

Solution-state NMR studies of the TMD monomer–dimer transition

Determining the kinetics of monomer–dimer transitions in detergent solvent is a rather complex case, in which factors imposed by the solvent system, such as the CMC, the properties and participation of the detergent, and micellar collisions, are possible contributors to the observed behavior. Several models have been proposed to aid the derivation of formalisms to describe the oligomerization of single-pass TMDs in such systems: the continuous solvent model [133, 134], the detergent-release model [129, 134], and the micellar solvent model [131]. In essence, the continuous solvent model assumes that the detergent acts as a solvent in the dimerization process, i.e., that dimerization in micellar solvent behaves similarly to dimerization in water or a lipid bilayer. Such an assumption is mainly applicable when the monomer–dimer transition occurs within the same micelle (e.g., at low DP ratio). In the detergent-release model, the detergent is not treated as a solvent but rather as a participant in the dimerization process, a description found best applicable at a high DP ratio and to relatively strong dimers. Lastly, the micellar solvent model is based on the assumption that dimerization and dissociation occur only upon collision and decay of the micelles, respectively. The associated formalism only applies when the micelle collisions are frequent and the dimer dissociation rates are slow on the NMR timescale, and transitions within a single micelle therefore become negligible. A detailed description of all the models and formalisms for describing the equilibrium can be found in Ref. [131]. That no single formalism or model is capable of describing all monomer–dimer transitions highlights their complexity, and that much remains to be understood.
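
The continuous solvent picture can be sketched numerically. The example below treats the micellar detergent as an ideal solvent, expresses the protein concentration as a mole ratio relative to micellar detergent (total detergent minus the CMC), and solves the mass balance for 2M ⇌ D. This is an illustrative simplification, not the formalism of Refs. [131, 133, 134]; the function name, the units chosen for the dissociation constant, and all numerical values are assumptions.

```python
import math


def dimer_fraction(protein_conc, detergent_conc, cmc, kd_ratio):
    """Fraction of protein chains in dimers for 2M <=> D in an ideal micellar solvent.

    protein_conc, detergent_conc, cmc: same concentration units (e.g. mM).
    kd_ratio: assumed dissociation constant in protein-per-micellar-detergent
              mole-ratio units, Kd = x_M**2 / x_D.
    """
    micellar_detergent = detergent_conc - cmc
    if micellar_detergent <= 0:
        raise ValueError("detergent must exceed the CMC for micelles to exist")
    x_total = protein_conc / micellar_detergent  # inverse of the effective DP ratio
    # Mass balance x_total = x_M + 2 * x_M**2 / Kd, solved for the monomer x_M.
    x_monomer = (-kd_ratio + math.sqrt(kd_ratio**2 + 8 * kd_ratio * x_total)) / 4
    return 1 - x_monomer / x_total


# Hypothetical titration: fixed protein, increasing detergent (rising DP ratio).
for detergent_mm in (21.0, 101.0, 1001.0):
    f = dimer_fraction(protein_conc=1.0, detergent_conc=detergent_mm, cmc=1.0, kd_ratio=0.01)
    print(f"DP ~ {detergent_mm - 1.0:.0f}: dimer fraction = {f:.2f}")
```

Under these assumptions the dimer fraction falls as the DP ratio rises, which is the dilution effect exploited experimentally; half of the chains are dimeric when the protein-to-micellar-detergent ratio equals the assumed dissociation constant.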
The homodimerization of the APP-TMD has, for example, been studied in both DPC micelles [71] and LMPG micelles [130]; the monomer–dimer transition is in slow exchange on the NMR timescale in DPC, but in fast exchange in LMPG. That is, the monomer–dimer transition of the APP-TMD fits within the detergent-release model and the continuous solvent model, depending on the detergent rather than the intrinsic nature of the TMD association. This is in line with the observation that the magnitude of the dimer dissociation constant is dependent on the applied detergent [129, 134], and thus the question of the biological relevance of such estimations of the equilibrium kinetics may be raised. Even when these measurements are conducted in lipid bilayers, the strength of the associations varies considerably with membrane composition [126, 127], a scenario which might reflect biological relevance. It can be argued, as suggested by Mineev et al. [131], that these estimations may be used to compare the strength of dimer formation of TMDs measured in the same membrane-mimicking environment. However, as mentioned, different detergents and lipids do not affect structure, stability, and functionality of membrane proteins similarly, and this interplay is currently not well understood. The question is therefore whether the same membrane mimetic environment also affects different TMD monomer–dimer equilibria differently, so that the measured equilibrium kinetics becomes an equation with too many unknowns to provide any biologically meaningful results. It should be noted that most of these studies have been conducted on the GpA-TMD only, which is generally considered to form a very stable homodimer [135], presumably also more tolerant to varying conditions.

In terms of atomic-resolution structural knowledge on the monomer–dimer transition, only the TMDs of ErbB1, ErbB2, and APP are represented by structures of both a monomer and a homodimer, and in the case of ErbB2, the monomer and dimer do not originate from the same species. Additionally, as discussed, controversy around the structures of the ErbB1-TMD exists, while the characteristics of the APP-TMD are unclear due to the multiple structures representing the monomeric and dimeric APP-TMD. Consequently, little direct knowledge is available on the details of the structural transformation of the TMDs upon monomer-to-dimer transition. The handful of monomeric, helical TMDs, however, clearly demonstrates that the formation of secondary structure is uncoupled from oligomerization. It should, however, be noted that the ErbB1-TMD α-helix has been found to be extended upon homodimerization [62].


Cellular tensegrity

Tensegrity is a building principle that was first described by the architect R. Buckminster Fuller (Fuller, 1961) and first visualized by the sculptor Kenneth Snelson (Snelson, 1996). Fuller defines tensegrity systems as structures that stabilize their shape by continuous tension or 'tensional integrity' rather than by continuous compression (e.g. as used in a stone arch). This is clearly seen in the Snelson sculptures, which are composed of isolated stainless steel bars that are held in position and suspended in space by high tension cables (Fig. 1A). The striking simplicity of these sculptures has led to a description of tensegrity architecture as a tensed network of structural members that resists shape distortion and self-stabilizes by incorporating other support elements that resist compression. These sculptures and similar structures composed of wood struts and elastic strings (Fig. 1B) beautifully illustrate the underlying force balance, which is based on local compression and continuous tension (Fig. 2A) that is responsible for their stability. However, rigid elements are not required, because similar structures can be constructed from flexible springs that simply differ in their elasticity (Fig. 1C).

Fig. 1. Tensegrity structures. (A) Triple crown, a tensegrity sculpture by the artist Kenneth Snelson that is composed of stainless steel bars and tension cables. Note that this structure is composed of multiple tensegrity modules that are interconnected by similar rules. (B) A tensegrity sphere composed of six wood struts and 24 white elastic strings, which mimics how a cell changes shape when it adheres to a substrate (Ingber, 1993b). (C) The same tensegrity configuration as in B constructed entirely from springs with different elasticities.

Fig. 2. (A) A high magnification view of a Snelson sculpture with sample compression and tension elements labeled to visualize the tensegrity force balance based on local compression and continuous tension. (B) A schematic diagram of the complementary force balance between tensed microfilaments (MFs), intermediate filaments (IFs), compressed microtubules (MTs) and the ECM in a region of a cellular tensegrity array. Compressive forces borne by microtubules (top) are transferred to ECM adhesions when microtubules are disrupted (bottom), thereby increasing substrate traction.

According to Fuller's more general definition, tensegrity includes two broad structural classes, prestressed and geodesic, which would both fail to act like a single entity or to maintain their shape stability when mechanically stressed without continuous transmission of tensional forces (Fuller, 1961; Fuller, 1979; Ingber, 1998; Chen and Ingber, 1999). The former hold their joints in position as the result of a 'prestress' (pre-existing tensile stress or isometric tension) within the structure (Fig. 1). The latter triangulate their structural members and orient them along geodesics (minimal paths) to geometrically constrain movement. Our bodies provide a familiar example of a prestressed tensegrity structure: our bones act like struts to resist the pull of tensile muscles, tendons and ligaments, and the shape stability (stiffness) of our bodies varies depending on the tone (prestress) in our muscles. Examples of geodesic tensegrity structures include Fuller's geodesic domes, carbon-based buckminsterfullerenes (Bucky Balls), and tetrahedral space frames, which are popular with NASA because they maintain their stability in the absence of gravity and, hence, without continuous compression.

Some investigators use tensegrity to refer only to the prestressed 'bar and cable' structures or particular subclasses of these (e.g. unanchored forms) (Snelson, 1996; Heidemann et al., 2000). Since Fuller defined the term tensegrity, I use his more general definition here. The existence of a common structural basis for these two different classes of structure is also supported by recent work by the mathematician Robert Connelly. He developed a highly simplified method to describe prestressed tensegrity configurations and then discovered that the same fundamental mathematical rules describe the closest packing of spheres (Connelly and Back, 1998), which also delineate the different geodesic forms (Fuller, 1965).

The cellular tensegrity model proposes that the whole cell is a prestressed tensegrity structure, although geodesic structures are also found in the cell at smaller size scales. In the model, tensional forces are borne by cytoskeletal microfilaments and intermediate filaments, and these forces are balanced by interconnected structural elements that resist compression, most notably, internal microtubule struts and extracellular matrix (ECM) adhesions (Fig. 2B). However, individual filaments can have dual functions and hence bear either tension or compression in different structural contexts or at different size scales (e.g. rigid actin filament bundles bear compression in filopodia). The tensional prestress that stabilizes the whole cell is generated actively by the contractile actomyosin apparatus. Additional passive contributions to this prestress come from cell distension through adhesions to the ECM and other cells, osmotic forces acting on the cell membrane, and forces exerted by filament polymerization. Intermediate filaments that interconnect at many points along microtubules, microfilaments and the nuclear surface provide mechanical stiffness to the cell through their material properties and their ability to act as suspensory cables that interconnect and tensionally stiffen the entire cytoskeleton and nuclear lattice. In addition, the internal cytoskeleton interconnects at the cell periphery with a highly elastic, cortical cytoskeletal network directly beneath the plasma membrane. The efficiency of mechanical coupling between this submembranous structural network and the internal cytoskeletal lattice depends on the type of molecular adhesion complex that forms on the cell surface. The entire integrated cytoskeleton is then permeated by a viscous cytosol and enclosed by a differentially permeable surface membrane.
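
The complementary force balance described above can be illustrated with a deliberately simplified one-dimensional sketch, in which the cytoskeletal tension is resisted jointly by microtubule compression and ECM adhesions. The function name and the force values are hypothetical, and a real cell is not one-dimensional; the point is only that, at constant tension, any compression no longer borne by microtubules must appear as traction on the substrate.

```python
def ecm_traction(cytoskeletal_tension, microtubule_compression):
    """Traction carried by ECM adhesions in a 1-D force balance.

    Assumes tension = microtubule compression + ECM traction (arbitrary force units).
    """
    if not 0 <= microtubule_compression <= cytoskeletal_tension:
        raise ValueError("compression must lie between 0 and the total tension")
    return cytoskeletal_tension - microtubule_compression


# Hypothetical values: intact microtubules versus disrupted microtubules.
intact = ecm_traction(cytoskeletal_tension=10.0, microtubule_compression=4.0)
disrupted = ecm_traction(cytoskeletal_tension=10.0, microtubule_compression=0.0)
print(intact, disrupted)
```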


CMB 334 Exam 1

Modern-day filter systems use disposable plastic laboratory items rather than algae-based filters, have upper and lower chambers, and are available in a variety of pore sizes.

1.) Nature of the nucleic acid in the virion (Baltimore Model, most often used)

2.) Symmetry of the protein shell/capsid

3.) Presence of a lipid membrane/envelope

Resistant cell- does not allow virus entry

Fluorescent-Focus Assay- similar to the plaque assay, except that an antibody against the virus is added along with a second, fluorescent indicator antibody that attaches to the first antibody, so that the virus titer can be read as "fluorescent-focus-forming units/ml".

Infectious-Centers Assay- similar to the plaque assay, except it begins with a known number of infected cells and is used to measure the proportion of infected cells in persistently infected cultures.

Transformation Assay- useful for retroviruses that don't form plaques; it measures the number of "piles", or foci, that form when infected cells lose the contact inhibition that would otherwise keep them in a normal monolayer. This is measured in focus-forming units/ml.
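
The units quoted for these assays follow from the usual titer arithmetic: the number of foci counted, divided by the dilution of the inoculum and the volume plated. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def titer_per_ml(foci_counted, dilution, inoculum_volume_ml):
    """Titer in focus-forming units per ml of the undiluted stock.

    dilution: fraction of the original stock in the inoculum
              (e.g. 1e-5 for a 10^-5 dilution).
    """
    if dilution <= 0 or inoculum_volume_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return foci_counted / (dilution * inoculum_volume_ml)


# Hypothetical count: 42 foci from 0.1 ml of a 10^-5 dilution.
print(f"{titer_per_ml(42, 1e-5, 0.1):.1e} focus-forming units/ml")
```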

