F3. Hydrogen Bonding - Biology

Linus Pauling first suggested that H bonds (between water and the protein and within the protein itself) would play a dominant role in protein folding and stability. Remember all the intrachain H bonds in the helix? Are they collectively more stable than H bonds between water and the peptide in a (random) coil?

In thinking about conformational studies involving small peptides, it is useful to apply Le Chatelier's Principle to the equilibrium below:

random coil \(\rightleftharpoons\) helix

Any perturbant (small molecule, solvent, etc.) that preferentially interacts with the helical form will push the equilibrium toward the helix.

Early models assumed that intrachain H bonds were energetically (enthalpically) more favorable than H bonds between peptide and water. But forming intrachain hydrogen bonds carries an entropic cost, since a helix is much more ordered (lower entropy) than a random coil (higher entropy). At low temperature, enthalpy predominates and helix formation in solution is favored. At high temperature, the helix is disfavored entropically. Imagine the increased vibrational and rotational states permitted to the atoms at higher temperatures. (Remember the trans to gauche conformational changes in the acyl chains of double chain amphiphiles as the temperature increased, leading to a transition from a gel to liquid crystalline phase in bilayer vesicles.) Theoretical studies on helix-coil transitions predict the following:

  • as the chain length increases, the helix gets more stable;
  • increasing the charge on the molecule destabilizes the helix, since the coil, compared to the more compact helix, has a lower charge density;
  • solvents that protonate the carbonyl oxygen (like formic acid) destabilize the helix; and
  • solvents that form strong H bonds compete with the peptide and destabilize the helix. In contrast, solvents such as CHCl3 and dimethylformamide (a nonprotic solvent) stabilize the helix. Likewise, 2-chloroethanol and trifluoroethanol (TFE), which form no H bonds or weaker H bonds to the peptide than water does, stabilize the helix. (In the case of trifluoroethanol, molecular dynamics simulations have shown that TFE preferentially interacts with (solvates) the peptide, which inhibits H bonds from the peptide backbone to water, stabilizing the intrahelical H bonds.)
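The enthalpy-entropy trade-off described before the list can be illustrated with a minimal sketch of ΔG = ΔH - TΔS for the coil-to-helix transition. The ΔH and ΔS values are hypothetical, chosen only so that the helix is favored at low temperature and disfavored at high temperature. They are not measured values.

```python
def delta_g(delta_h, delta_s, temp_k):
    """Free energy change (kcal/mol) for coil -> helix at temperature temp_k (K)."""
    return delta_h - temp_k * delta_s


def midpoint_temperature(delta_h, delta_s):
    """Temperature (K) at which delta_g is zero, so helix and coil are equally favored."""
    return delta_h / delta_s


# Hypothetical values: favorable enthalpy, unfavorable entropy for helix formation.
DELTA_H = -1.0    # kcal/mol (assumed)
DELTA_S = -0.003  # kcal/(mol K) (assumed)

for temp in (273.15, 310.15, 373.15):
    dg = delta_g(DELTA_H, DELTA_S, temp)
    state = "helix favored" if dg < 0 else "coil favored"
    print(f"T = {temp:6.2f} K  dG = {dg:+.3f} kcal/mol  ({state})")
```

With these assumed numbers the sign of ΔG flips near 333 K, which is the qualitative behavior the text describes.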

These helix-coil studies suggest that H bonds are important in stabilizing a protein.

But do they really? Why should these H bonds differ from those in water? It's difficult to figure out whether they are since there are so many possible H bonds (between protein and water, water and water, and protein and protein), and their strength depends on their orientation and the dielectric constant of the medium in which they are located.

If intrachain H bonds in a protein are not that much different in energy than intermolecular H bonds between the protein and water, and given that proteins are marginally stable at physiological temperatures, then it follows that the folded state must contain about as many intramolecular hydrogen bonds within the protein as possible intermolecular H bonds between the protein and water, otherwise the protein would unfold.

To resolve this issue, and determine the relative strength of H bonds between the varying possible donors and acceptors, many studies have been conducted to compare the energy of H bonds between small molecules in water with the energy of H bonds between the same small molecules but in a nonpolar solvent. The rationale goes like this. If the interior of a protein is more nonpolar than water (lower dielectric constant than water), then intrastrand H bonds in a protein might be modeled by looking at the H bonds between small molecules in nonpolar solvents and asking the question, is the free energy change for the following process < 0:

Dw + Aw \(\rightleftharpoons\) (DA)n, ΔG°, K

where D is a hydrogen bond donor (like NH), A is a hydrogen bond acceptor (like C=O), w is water (i.e., donor and acceptor are in water), n is a nonpolar solvent, and ΔG° and K are the standard free energy change and the equilibrium constant, respectively, for the formation of an H bond in a nonpolar solvent from a donor and acceptor in water. This reaction simulates H-bond contributions to protein folding, where a buried H bond is mimicked by an H bond in a nonpolar solvent. The reaction written above is really a thought experiment, since it would be hard to set up the necessary conditions to make the measurement. However, we can calculate ΔG° for this reaction, since free energy is a state function and it doesn't matter how one accomplishes the process.

Let's consider a specific example: the formation of H bonds between two molecules of N-methylacetamide (NMA) in water and in a nonpolar solvent. The reaction scheme shown below describes a set of reactions (a thermodynamic cycle) involving the formation of H-bonded dimers of NMA. A and B are both molecules of NMA, in either water (w) or a nonpolar solvent (n).

N-methylacetamide is a good mimic for the H bond donors and acceptors of the peptide bond of a polypeptide chain.

In the reaction scheme shown above,

K1 is the equilibrium constant for the dimerization of NMA in a nonpolar medium. This can be readily determined, and is >1, implying that ΔG° < 0. (Remember, ΔG° = -RT ln Keq.) For the dimerization of NMA in CCl4, ΔG°1 = -2.4 kcal/mol.

K2 is the equilibrium constant (think of it as a partition coefficient) for the transfer of two NMA molecules from water to a nonpolar solvent (again easily measurable). For NMA transferring from water to CCl4, ΔG°2 = +6.12 kcal/mol.

K3 is the equilibrium constant for the dimerization of NMA in water. This can be readily determined, and is <1, implying that ΔG° > 0. For the dimerization of NMA in water, ΔG°3 = +3.1 kcal/mol.
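As a rough check on the signs quoted above, the relation ΔG° = -RT ln K can be inverted to turn each free energy into an equilibrium constant. This is a minimal sketch. The temperature of 298.15 K is an assumption, since the text does not state one.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)
TEMP_K = 298.15    # assumed temperature (25 C); not given in the text


def equilibrium_constant(delta_g_kcal, temp_k=TEMP_K):
    """Equilibrium constant from a standard free energy change via dG = -RT ln K."""
    return math.exp(-delta_g_kcal / (R_KCAL * temp_k))


# Standard free energy changes quoted in the text (kcal/mol).
DELTA_G = {
    "K1, dimerization in CCl4": -2.4,
    "K2, transfer of two NMA from water to CCl4": 6.12,
    "K3, dimerization in water": 3.1,
}

for label, dg in DELTA_G.items():
    print(f"{label}: dG = {dg:+.2f} kcal/mol, K = {equilibrium_constant(dg):.3g}")
```

A negative ΔG° gives K > 1 (K1), and the two positive values give K < 1 (K2 and K3), matching the statements in the text.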

K4 is the equilibrium constant (think of it as a partition coefficient) for the transfer of a hydrogen-bonded dimer of NMA from water to a nonpolar solvent. You try to think of a way to measure that! I can't. This is where thermodynamic cycles come in so nicely. You don't have to measure it. You can calculate it from K1 through K3, since ΔG° is a state function!

\[ \Delta G^o_2 + \Delta G^o_1 = \Delta G^o_3 + \Delta G^o_4 \hspace{15px} \textrm{ or } \hspace{15px} -RT\ln K_2 - RT\ln K_1 = -RT\ln K_3 - RT\ln K_4 \]

\[ \ln K_2 + \ln K_1 = \ln K_3 + \ln K_4 \hspace{15px} \textrm{ or } \hspace{15px} \ln(K_2K_1) = \ln(K_3K_4) \hspace{15px} \textrm{ or } \hspace{15px} K_2K_1 = K_3K_4 \]

For the H-bonded NMA dimer transferring from water to CCl4, ΔG°4 = +0.62 kcal/mol.
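The cycle closure can be sketched in a few lines: solve ΔG°2 + ΔG°1 = ΔG°3 + ΔG°4 for the unmeasured leg, then confirm that the equivalent product form K2K1 = K3K4 gives the same answer. The temperature is an assumption (298.15 K), and the function names are illustrative only.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)
TEMP_K = 298.15    # assumed temperature; not given in the text


def close_cycle(dg1, dg2, dg3):
    """Solve dG2 + dG1 = dG3 + dG4 for the unmeasured leg dG4 (kcal/mol)."""
    return dg2 + dg1 - dg3


def k_from_dg(dg):
    """Equilibrium constant from dG = -RT ln K."""
    return math.exp(-dg / (R_KCAL * TEMP_K))


# Values quoted in the text (kcal/mol).
DG1, DG2, DG3 = -2.4, 6.12, 3.1

dg4 = close_cycle(DG1, DG2, DG3)
k4_from_products = k_from_dg(DG2) * k_from_dg(DG1) / k_from_dg(DG3)

print(f"dG4 = {dg4:+.2f} kcal/mol")
print(f"K4 from dG4:      {k_from_dg(dg4):.4g}")
print(f"K4 from K2*K1/K3: {k4_from_products:.4g}")
```

Both routes to K4 agree because the free energy sum and the product of equilibrium constants are the same statement written in two forms.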

(Note: Biochemists like to talk about thermodynamic cycles which may seem new to you. However, believe it or not, you have seen them before - in General Chemistry - in the form of Hess's Law!)

From K1-4 and the corresponding ΔG° values, we can now calculate ΔG°5 for the formation of H-bonded NMA dimers in a nonpolar solvent from two molecules of NMA(aq). This reaction, which we hope simulates formation of buried intrachain H bonds in proteins on protein folding, is:

Dw + Aw \(\rightleftharpoons\) (DA)n, for which ΔG°5 = +3.72 kcal/mol (i.e., disfavored).
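Because free energy is a state function, the same value is obtained along either path around the cycle. Using only the numbers quoted in the text:

```latex
\[ \Delta G^o_5 = \Delta G^o_2 + \Delta G^o_1 = (+6.12) + (-2.4) = +3.72 \ \textrm{kcal/mol} \]

\[ \Delta G^o_5 = \Delta G^o_3 + \Delta G^o_4 = (+3.1) + (+0.62) = +3.72 \ \textrm{kcal/mol} \]
```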

If this model is a good mimic for studying H bond formation on protein folding, it suggests that the formation of buried H bonds during protein folding does not drive protein folding.

However, if the transfer of D and A (from a large protein) from water to the nonpolar medium (modeled by K2) is driven by other forces (such as the hydrophobic effect), the favorable value of K1 (>1) will strongly favor buried H bond formation. So, if this happens in proteins, it is clear why so many intrachain H bonds occur, since K1 is so favorable. H bonds may not assist the collapse of a protein, but would favor internal organization within a compact protein. That is, H bonds don't drive protein folding per se, but form so that the folded protein would not be destabilized by too many unsatisfied H bonds.

There are potential problems with this simple model. The interior of a protein is not homogeneous (i.e. the effective dielectric within the protein will vary). H bond strength is also very sensitive to geometry. Also, there are many H bonds within a protein, so slight errors in the estimation of H bond strength would lead to large errors in determination of protein stability.

Another argument against H bonds being the determining factor in protein folding and stability comes from solvent denaturation studies. If intrachain H bonds are so important, then shouldn't solvents that can H bond to the backbone denature the protein? Shouldn't water (55 M) act as a denaturant? It doesn't, however. Dioxane (a six-membered heterocyclic ring with two O atoms), which has only H bond acceptors, wouldn't be expected to denature proteins, but it does. H bonds also increase in nonpolar solvents. Peptides which have random structures in water can be induced to form helices when placed in alcohol solutions (trifluoroethanol, for example), which are more nonpolar than water, as explained above in the helix-coil studies. If H bonds were the dominant factor in protein stability, alcohols would stabilize proteins. Instead, at low concentrations of alcohol, proteins are destabilized.

Hence, based on small molecule studies and the study of proteins in various cosolvents, it is unlikely that H bonds are the big stabilizers of protein structure. Only 11% of all C=O's and 12% of all NH's in proteins have no H bonds (determined by analysis of X-ray crystallographic structures). Of all H bonds to C=O, 43% are to water, 11% to side chains, and 46% to main chain NH's. Of all H bonds to NH, 21% are to water, 11% to side chains, and 68% to the main chain C=O. We will see later, however, that an opposite conclusion is reached from studies using site-specific mutagenesis.


A noncanonical FLT3 gatekeeper mutation disrupts gilteritinib binding and confers resistance

Elie Traer, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, KR-HEM, Portland, OR 97239.

Knight Cancer Institute, Oregon Health & Science University, Portland, Oregon, USA

Department of Physiology & Pharmacology, School of Medicine, Oregon Health & Science University, Portland, Oregon, USA

Division of Hematology & Medical Oncology, Department of Medicine, Oregon Health & Science University, Portland, Oregon, USA

Department of Cell, Development, & Cancer Biology, Oregon Health & Science University, Portland, Oregon, USA

Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA

Acute myeloid leukemia (AML) is a genetically heterogeneous disease with approximately 20 000 new cases per year in the United States. 1 Patients with AML have a 5-year survival of <25%, and intense efforts are underway to develop new treatments to improve survival. 2 Mutations in the FMS-like tyrosine kinase-3 (FLT3) gene are among the most common genomic aberrations in AML. Internal tandem duplications (ITDs) in the juxtamembrane domain of FLT3 are present in approximately 20% of patients with AML. These mutations cause constitutive kinase activity, and lead to an increased risk of relapse and reduced survival. Another set of mutations in the tyrosine kinase domain (TKD) of FLT3 occurs in 5–10% of AML patients and also results in activation of FLT3. 3

Multiple FLT3 tyrosine kinase inhibitors have been developed and can be separated into two classes. Type I inhibitors are canonical ATP competitors that bind the ATP binding site of FLT3 in the active conformation and are effective against both ITD and TKD mutations. By contrast, type II inhibitors bind the hydrophobic region adjacent to the ATP binding domain in the inactive conformation. Type II inhibitors are effective against FLT3-ITD, but do not inhibit FLT3-TKD mutations. Quizartinib, a type II inhibitor, has potent activity against FLT3, KIT, and RET. Despite high response rates as a monotherapy in patients with relapsed/refractory AML, the duration of response to quizartinib is approximately 4 months, and resistance via FLT3-TKD mutations is common. 4, 5 These mutations occur frequently at the activation loop residue D835 and less commonly at F691 which represents the “gatekeeper” position in FLT3. 6

Gilteritinib is a second-generation inhibitor that targets FLT3 and AXL. 7 As a type I inhibitor, it is active against TKD mutations that impart quizartinib resistance. It was approved as monotherapy in relapsed/refractory patients with AML based upon the randomized phase 3 clinical study (ADMIRAL), which compared gilteritinib with chemotherapy. 8 Despite the significant survival benefit in the gilteritinib arm, monotherapy is limited by the development of resistance, which typically occurs after 6–7 months. Resistance to gilteritinib most commonly occurs through acquisition/expansion of NRAS mutations; however, a minority of patients with F691L gatekeeper mutations were also identified. 7 To search for additional resistance mutations to gilteritinib, Tarver et al. used a well-established ENU mutagenesis assay and identified Y693C/N and G697S as mutations that confer resistance in vitro. 5 These mutations appear to function similarly to the gatekeeper mutation by blocking gilteritinib binding to FLT3, but have not been reported in patients.

To more broadly investigate mechanisms of resistance to gilteritinib, we developed a two-step model of resistance that recapitulates the role of the marrow microenvironment (Figure 1A). In the first stage of resistance, or early resistance, the FLT3-mutated AML cell lines MOLM14 and MV411 are cultured with exogenous ligands, fibroblast growth factor 2 (FGF2) and FLT3 ligand (FL), that are normally supplied by marrow stromal cells. These culture conditions allow the cells to become resistant to gilteritinib without the need for resistance mutations. 11 When ligands are removed, the cells regain sensitivity to gilteritinib, but ultimately become resistant, which we term late resistance. At this point, intrinsic resistance mutations were identified in all of the cultures via whole exome sequencing. Similar to clinical data, 8 we found that the most common mutations are activating mutations in NRAS. 12 One late resistant culture had an FLT3 F691L gatekeeper mutation, and three cultures had an FLT3 N701K mutation, which has not previously been reported (Figure 1B). Given its proximity to F691L (Figure 1C,D), we hypothesized that this mutation might also disrupt gilteritinib binding to FLT3.

To determine whether the FLT3 N701K mutation has oncogenic capacity, we evaluated this mutation in the Ba/F3 transformation assay. The Ba/F3 cells are normally IL-3 dependent, but the presence of certain oncogenes transforms them to grow indefinitely in the absence of IL-3. 13 The FLT3 N701K mutation, similar to FLT3 ITD and FLT3 D835Y, is an activating mutation and promoted growth of Ba/F3 cells in the absence of IL-3, whereas the parental, empty vector, FLT3 wild type (FLT3 WT), or FLT3 F691L did not confer IL-3-independent growth (Figure 1E).

In contrast to Ba/F3 cells expressing FLT3 D835Y, Ba/F3 cells with FLT3 N701K were much less sensitive to gilteritinib, with an approximate 8.5-fold increase in IC50 (Figure 1F). To test whether FLT3 N701K also promoted resistance to gilteritinib in the presence of FLT3 ITD mutations (Figure 1B), we generated FLT3 ITD+N701K and FLT3 ITD+F691L double mutants and expressed them in Ba/F3 cells. Concordant with previous studies, 6 the FLT3 ITD+F691L mutant demonstrated an approximate 11-fold increase in IC50 to gilteritinib compared to FLT3-ITD alone. The FLT3 ITD+N701K Ba/F3 cells were nearly identical to FLT3 ITD+F691L cells in their resistance to gilteritinib (Figure 1G). As a control, FLT3 WT Ba/F3 cells grown with IL-3 were insensitive to gilteritinib at comparable doses.
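To make the meaning of a fold increase in IC50 concrete, the sketch below uses a simple Hill (logistic) inhibition model. The model, the Hill slope and the reference IC50 of 2 nM are assumptions chosen for illustration. Only the approximately 11-fold shift comes from the text, and the curve-fitting method actually used for Figure 1 is not stated here.

```python
def viability(dose_nm, ic50_nm, hill=1.0):
    """Fraction of viable cells under a simple Hill inhibition model (assumed)."""
    return 1.0 / (1.0 + (dose_nm / ic50_nm) ** hill)


def fold_change(ic50_mutant_nm, ic50_reference_nm):
    """Fold increase in IC50 of a mutant relative to a reference."""
    return ic50_mutant_nm / ic50_reference_nm


# Hypothetical reference IC50; only the ~11-fold shift is taken from the text.
IC50_ITD_NM = 2.0                       # nM, assumed for illustration
IC50_ITD_F691L_NM = 11.0 * IC50_ITD_NM  # ~11-fold increase reported in the text

for dose_nm in (1.0, 10.0, 100.0):
    v_ref = viability(dose_nm, IC50_ITD_NM)
    v_mut = viability(dose_nm, IC50_ITD_F691L_NM)
    print(f"dose {dose_nm:6.1f} nM: ITD {v_ref:.2f}, ITD+F691L {v_mut:.2f}")
```

Under this model a higher IC50 shifts the whole curve to the right, so at any given dose a larger fraction of the resistant cells remains viable.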

Next, we assessed the impact of FLT3 N701K mutations on downstream FLT3 signaling pathways. The Ba/F3 cells transformed with FLT3 N701K, FLT3 ITD, FLT3 ITD+F691L, and FLT3 ITD+N701K all showed phosphorylation of FLT3 (Y589/591), STAT5 (Y694), AKT (S473), and ERK (T202/Y204) (Figure S1A). However, only FLT3 ITD+N701K or FLT3 ITD+F691L showed sustained phospho-FLT3 with increasing concentrations of gilteritinib (Figure 1H), indicating that both of these mutations prevent gilteritinib inhibition of FLT3, particularly at lower doses. The FLT3 kinase activity as reflected by FLT3 phosphorylation mirrored the viability assays in Figure 1G.

Since F691L gatekeeper mutations are known to drive resistance to multiple FLT3 inhibitors, 6, 7, 14, 15 we treated FLT3 ITD, FLT3 ITD+N701K and FLT3 ITD+F691L Ba/F3 cells with midostaurin, crenolanib, and quizartinib. Although FLT3 ITD+F691L and FLT3 ITD+N701K were largely insensitive to the type I inhibitors midostaurin and crenolanib, cells with FLT3 ITD+N701K were notably more sensitive to the type II inhibitor quizartinib (Figure S2), suggesting that N701K blocks binding of type I inhibitors more effectively than type II. This was further apparent from our modeling of the FLT3 N701K mutation. While the FLT3 N701K mutation may sterically interfere with the binding of gilteritinib, quizartinib binding does not appear to be affected (Figure S3).

Through our studies, we identified the novel FLT3 N701K mutation in addition to the FLT3 F691L gatekeeper mutation. We used the Ba/F3 system to demonstrate that N701K blocks gilteritinib binding to FLT3, similar to the gatekeeper F691L, and promotes resistance to gilteritinib. Our data fit nicely with recent data from a mutagenesis screen of Ba/F3 cells with FLT3-ITD that identified F691L in addition to D698N, G697S, and Y693C/N as mutations that drive resistance to gilteritinib. 5 Modeling of these mutations indicates that they cause the loss of hydrogen bonding that accommodates the FLT3 side chain, leading to a steric clash between the tetrahydropyran ring of gilteritinib and FLT3. 5 Given the proximity of N701K to these mutations, we speculate that the mechanism of resistance to gilteritinib imparted by this mutation is similar (Figure S3). Importantly, these complementary methods identify a common hotspot for gilteritinib resistance mutations (Figure S4). Given the increasing use of gilteritinib in the clinic, we anticipate that additional resistance mutations will likely be identified in patients. Of note, the N701K mutation appears to be more resistant to type I inhibitors but retains sensitivity to type II inhibitors such as quizartinib (Figure S2), suggesting that TKI class switching could serve as a promising avenue to mitigate development of gilteritinib resistance. The use of type I FLT3 inhibitors following the acquisition of resistance to type II inhibitors is a well-established approach to overcome resistance. However, what makes the case of the N701K mutation interesting is the acquired sensitivity to a type II inhibitor following development of resistance to a type I inhibitor, which is a largely underappreciated concept. This knowledge can be used to help rationally sequence FLT3 inhibitors upon development of resistance.


Materials and Methods

RNA synthesis and sample preparation

15N-labeled nucleotide triphosphates were prepared, as described previously (16, 17), from Escherichia coli grown on M9 minimal medium supplied with 15N-NH4Cl as the sole nitrogen source. 15N-labeled RNA molecules (hpGA, 5SDE and 5SE, see Fig. 1) were prepared by in vitro transcription with T7 RNA polymerase (18, 19) from either synthetic (MWG Biotech) double-stranded DNA (hpGA) or linearized plasmid DNA (5SDE, 5SE) templates containing the appropriate sequences. These molecules were purified on a DEAE Sepharose FF column (Amersham Pharmacia Biotech) developed with a sodium acetate step gradient and subsequently by HPLC on a preparative C18 column (Vydac 218TP510), equilibrated with 50 mM KH2PO4/K2HPO4 and 2 mM tetrabutylammonium hydrogen sulfate at pH 5.9 in water employing an acetonitrile gradient. In the case of hpGA and 5SDE a small amount of N+1 product was not separated from the main product. The 5SE template was fused to a hammerhead ribozyme sequence. The resulting transcript underwent self-cleavage at the 3′ end of 5SE (20) and no additional products (N+1, N-1) were observed in this case. Lyophilized products were desalted with a NAP-25 gel filtration column (Amersham Pharmacia Biotech) and precipitated twice with 5 vol of 2% (w/v) lithium perchlorate in acetone. hpGA was denatured at 95°C at a concentration of 0.03 mM in 10-fold diluted NMR buffer (100 mM NaCl, 10 mM NaH2PO4/Na2HPO4, 2.5 mM EDTA, pH 6.9) and slowly cooled to room temperature over a period of 3 h. 5SDE and 5SE were folded into a monomeric hairpin form by denaturing at 95°C at a concentration of 0.25 mM and subsequent 5-fold dilution into ice cold water and finally exchanged into NMR buffer (5 mM KH2PO4/K2HPO4, 60 mM KCl and 8 mM CaCl2, pH 5.75 for 5SDE or 20 mM NaH2PO4/Na2HPO4, 0.1 M KCl, 4 mM MgCl2, 3 mM NaN3, pH 7.2 for the 5SE-L25 complex) using Centricon-3 microconcentrators (Amicon, Inc.). Ribosomal protein L25 was prepared and the 5SE-L25 complex formed by titration as described previously (20). The final sample concentrations were ∼1.7 mM for 5SDE and ∼0.8 mM for hpGA and the 5SE-L25 complex.

NMR spectroscopy

NMR experiments were performed on a Varian UNITY INOVA 600 MHz, a Varian UNITY INOVA 750 MHz or a Bruker DMX 600 MHz spectrometer. All spectra were processed and analyzed using Vnmr, Xeasy (21) and NMRpipe (22). NMR spectra were recorded in 95% H2O/5% D2O at a temperature of 10°C (hpGA) or 25°C (5SDE and 5SE-L25 complex) using the WATERGATE (23) water suppression scheme including water flip-back pulses (24).

A 1H-1H-NOESY spectrum was recorded at 750 MHz for hpGA with a data matrix consisting of 400 (t1) × 960 (t2) complex data points. Sweep widths of 9000 and 16 000 Hz in F1 and F2, respectively, and a NOE mixing time of 150 ms were used. A total of 128 scans per complex t1 increment were collected. The 1H carrier was positioned at 6.8 p.p.m. during t1 and at the H2O resonance during t2 and the 15N carrier at 195 p.p.m. 15N decoupling was employed during data acquisition. Data were zero filled to 4K × 4K complex data points and apodized using squared cosine functions in both dimensions before Fourier transformation.
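The apodization and zero-filling steps mentioned for the NOESY data can be sketched in plain Python for one dimension. This is an illustration only: the exact window definition used by the processing software is not given in the text, so the squared cosine below is an assumed form, the FID is synthetic, and "4K" is taken to mean 4096 points.

```python
import cmath
import math


def squared_cosine_window(n_points):
    """Assumed squared cosine window: 1 at the first point, near 0 at the last."""
    return [math.cos(0.5 * math.pi * i / n_points) ** 2 for i in range(n_points)]


def apodize_and_zero_fill(fid, final_size):
    """Multiply a complex FID by the window, then pad with zeros to final_size."""
    window = squared_cosine_window(len(fid))
    shaped = [w * x for w, x in zip(window, fid)]
    return shaped + [0j] * (final_size - len(shaped))


def digital_resolution_hz(sweep_width_hz, n_points):
    """Spacing between frequency-domain points after Fourier transformation."""
    return sweep_width_hz / n_points


SWEEP_WIDTH_HZ = 9000.0  # F1 sweep width from the text
N_ACQUIRED = 400         # complex t1 points from the text
N_FILLED = 4096          # "4K", assumed to mean 4096

# Synthetic FID: one hypothetical resonance at 500 Hz with a 20 s^-1 decay rate.
fid = [cmath.exp((2j * math.pi * 500.0 - 20.0) * (i / SWEEP_WIDTH_HZ))
       for i in range(N_ACQUIRED)]

processed = apodize_and_zero_fill(fid, N_FILLED)
print(len(processed))
print(digital_resolution_hz(SWEEP_WIDTH_HZ, N_ACQUIRED))
print(digital_resolution_hz(SWEEP_WIDTH_HZ, N_FILLED))
```

Zero filling makes the point spacing finer (it interpolates the spectrum) without adding information beyond what the 400 acquired points contain, while the window suppresses truncation artifacts at the cost of some line broadening.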

Sequences and secondary structures of the RNA molecules used in this study. Non-Watson-Crick base pairs with NH⋯N hydrogen bonds are shaded. hpGA, secondary structure of hpGA. The boxed region indicates the sequence studied by Wu et al. (4). The (pseudo) 2-fold symmetry axis of the molecule is indicated by a black oval. The numbering scheme is according to (4). 5SDE, secondary structure of 5SDE, which contains nucleotides 70–106 of E. coli 5S ribosomal RNA. 5SE, secondary structure of 5SE, which contains nucleotides 70–82 and 94–106 of E. coli 5S ribosomal RNA. The internal bulge region known as the E-loop is indicated for 5SDE and 5SE.

3D 1H-1H-15N-NOESY-HSQC spectra were recorded at 750 MHz for 5SDE and the 5SE-L25 complex with 192 (t1) × 44 (t2) × 512 (t3) complex data points, eight scans per increment, spectral widths of 11 300 Hz, 2190 Hz and 16 000 Hz in F1, F2 and F3, respectively, and a NOE mixing time of 80 ms. The 1H carrier was positioned at 6.8 p.p.m. during t1 and at the H2O resonance during acquisition. The 15N carrier was positioned at 120 p.p.m. and 15N decoupling was employed during acquisition. The data were zero filled to 1K × 128 × 1K complex data points in F1, F2 and F3, respectively, and apodized using cosine functions in all dimensions before Fourier transformation.

The 2JHN-1H-15N-HSQC experiments were recorded at 600 MHz either as a conventional WATERGATE (23) water flip-back (24) 1H-15N-HSQC (hpGA) or according to Sklenar et al. (25) (5SDE, 5SE-L25 complex) with the INEPT transfer delays set to 10 ms. The data matrices consisted of 256 (t1) × 800 (t2) complex data points. A total of 64 scans per t1 increment were collected. Spectral widths were 7000 Hz in the 15N and 12 000 Hz in the 1H dimension. The experiments were performed with the 1H carrier positioned at the H2O resonance and the 15N carrier at 195 p.p.m. The data were zero filled to 2K × 2K complex data points and apodized using cosine functions in both dimensions before Fourier transformation. The quantitative JNN HNN-COSY experiments were performed as previously described (1). The data matrices consisted of 128 (t1) × 800 (t2) (5SE-L25 complex) or 350 (t1) × 800 (t2) complex data points (5SDE and hpGA). A total of 64 (5SDE, hpGA) or 256 (5SE-L25) scans per t1 increment were collected. Spectral widths were 7000 Hz in the 15N and 12 000 Hz in the 1H dimension. The experiments were performed with the 1H carrier positioned at the H2O resonance and the 15N carrier at 153 p.p.m. for the 1H-15N-INEPT and at 194 p.p.m. during the NN-COSY step. A mixing time of 30 ms was used for the NN-COSY transfer. The data were zero filled to 512 × 2K complex data points and apodized using cosine functions in both dimensions before Fourier transformation. The quantification of the 2hJNN-coupling constants was carried out as described previously without correcting for an underestimation of 10–20% due to the finite excitation bandwidth of the 15N radio frequency pulses (1).

Assignment of a 2hJNN coupling in hpGA to a GA base pair with a G N1H1-A N1 hydrogen bond. (A) Geometry of an imino-hydrogen bonded GA base pair (left) in comparison with a Watson-Crick GC base pair (right). (B) One-dimensional cross section from a 2D 1H-1H-NOESY spectrum taken at the chemical shift of the G5/G5′ H1 hydrogen showing the strong NOE cross peak to the A4/A4′ H2 hydrogen. (C) HNN-COSY spectrum of hpGA showing cross correlations between G H1 hydrogens and G N1 and C N3 nitrogens (dashed black lines) typical for Watson-Crick GC base pairs and between the G5/G5′ H1 hydrogen and the G5/G5′ N1 and A4/A4′ N1 nitrogens (red line). (D) 2JHN-1H-15N-HSQC spectrum of hpGA showing the correlation of the A4/A4′ H2 hydrogen to the A4/A4′ N1 and N3 nitrogens (blue line). Typical chemical shift ranges of the relevant nitrogen atoms are indicated at the right side of the spectrum.


Molecular switches in GPCRs

GPCRs are key players in cell-cell communication and pass the signal via coordinated action of switches.

There are two types of switches: toggle switches and locks.

The action of switches can be permanent or temporary.

Receptor activation is associated with breaking of hydrophobic barriers and influx of water molecules.

Molecular switches in GPCRs pass the signal from the agonist binding site, usually located close to the extracellular surface, to the intracellular part of the receptor. The switches are usually associated with conserved structural motifs on transmembrane helices (TMs), and they are accompanied by adjacent residues which provide the signal to the central residue in the toggle switch. When the molecular switches are locks, they break (permanently or temporarily) upon agonist binding. The cascade action of switches is correlated with an influx of water molecules that forms a pathway linking both sides of the receptor. The switches remove the hydrophobic barriers and facilitate water movement, while water molecules help to rearrange the hydrogen bond network inside the receptor.


2 GENOMICS STRUCTURE AND BIOLOGICAL FEATURES OF SARS-COV-2

Coronaviruses belong to the family Coronaviridae in the order Nidovirales. The family is divided into the subfamilies Coronavirinae and Torovirinae. The subfamily Coronavirinae is further divided into four genera: Alpha-, Beta-, Gamma- and Deltacoronavirus. 15 Phylogenetic analysis revealed that SARS-CoV-2 is closely related to the beta-coronaviruses. Similar to other coronaviruses, the genome of SARS-CoV-2 is positive-sense single-stranded RNA [(+) ssRNA] with a 5′-cap and a 3′-UTR poly(A) tail. The length of the SARS-CoV-2 genome is less than 30 kb, in which there are 14 open reading frames (ORFs), encoding non-structural proteins (NSPs) for virus replication and assembly processes, structural proteins including spike (S), envelope (E), membrane/matrix (M) and nucleocapsid (N), and accessory proteins. 16, 17 The first ORF contains approximately 65% of the viral genome and translates into either polyprotein pp1a (nsp1–11) or pp1ab (nsp1–16). Among them, six NSPs (NSP3, NSP9, NSP10, NSP12, NSP15 and NSP16) play critical roles in viral replication. Other ORFs encode structural and accessory proteins. 18, 19 The S protein is a transmembrane protein that facilitates the binding of the viral envelope to angiotensin-converting enzyme 2 (ACE2) receptors expressed on host cell surfaces. Functionally, the spike protein is composed of receptor binding (S1) and cell membrane fusion (S2) subunits. 20 The N protein attaches to the viral genome and is involved in RNA replication, virion formation and immune evasion. The nucleocapsid protein also interacts with the nsp3 and M proteins. 21 The M protein is one of the most abundant and well-conserved proteins in the virion structure. This protein promotes the assembly and budding of viral particles through interaction with N and accessory proteins 3a and 7a. 22, 23 The E protein is the smallest component in the SARS-CoV-2 structure and facilitates the production, maturation and release of virions. 18

The most complex component of the coronavirus genome is the receptor-binding domain (RBD) in the spike protein. 24, 25 Six RBD amino acids are necessary for attaching to the ACE2 receptor and for determining the host range of SARS-CoV-like coronaviruses. According to multiple sequence alignment, they are Y442, L472, N479, D480, T487 and Y491 in SARS-CoV, and L455, F486, Q493, S494, N501 and Y505 in SARS-CoV-2. 26 Therefore, SARS-CoV-2 and SARS-CoV vary with respect to five of these six residues. The SARS-CoV strain genome sequences derived from humans were very close to those in bats. Even so, several differences have been identified in the S gene, which encodes the attachment and fusion protein, and in the ORF3 and ORF8 genes, which encode replication proteins. 27 Specific MERS-CoV strains derived from camels were shown to be identical to those extracted from humans, with the exception of differences in the genomic regions S, ORF4b and ORF3. 28 In addition, genome sequencing-based experiments have shown that human MERS-CoV strains are phylogenetically linked to those of bats. However, for the S proteins, the species have similar genome and protein structures. 29 Based on recombination studies of the orf1ab and S encoding genes, MERS-CoV was derived from the interchange of genetic elements between coronaviruses in camels and bats. In comparison, with a 96% overall identity, the main protease is highly conserved between SARS-CoV-2 and SARS-CoV. 29-31

The ACE2 protein is found in many mammalian body tissues, primarily in the lungs, kidneys, gastrointestinal tract, heart, liver and blood vessels. 32 ACE2 receptors are vital elements in regulating the renin-angiotensin-aldosterone system pathway. Based on structural experiments and biochemical studies, SARS-CoV-2 appears to have an RBD that strongly binds to ACE2 receptors of humans, cats, ferrets and other organisms with the homologous receptors. 33

The genome sequencing of SARS-CoV-2 in January 2020 was shown to be 96% identical to the bat coronavirus (BatCoV) RaTG13 genome and 80% identical to the SARS-CoV genome. 34 However, significant differences exist. For example, the protein 8a sequence in the SARS-CoV genome is absent in the 2019-nCoV, and the protein 8b sequence of SARS-CoV-2 is 37 amino acids longer than that in SARS-CoV. 35

Sequence alignment analysis of CoV genomes indicates that non-structural and structural proteins are 60% and 45% identical, respectively, among various types of CoVs. 36 These data show that nsps are more conserved than structural proteins. RNA viruses have a higher mutational load as a result of shorter replication times (Figure 1). 36 Based on comparative genomic studies between SARS-CoV-2 and SARS-like coronaviruses, there are 380 amino acid substitutions in the nsps genes and 27 mutations in genes encoding the spike protein S of SARS-CoV-2. These variations might explain the different behavioral patterns of SARS-CoV-2 compared to SARS-CoVs. 8 For example, the primary N501T mutation in the spike protein of SARS-CoV-2 could have significantly increased its binding affinity to ACE2. 37
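The percent-identity figures quoted above come from comparing aligned sequences column by column. As a minimal sketch of that calculation (not the pipeline used in the cited studies), the Python function below counts matching columns in two pre-aligned sequences. The function name, the gap convention (columns where both sequences carry a gap are skipped, and a gap opposite a residue counts as a mismatch) and the toy sequences are assumptions made for illustration only; published identity values depend on the alignment method and on how gaps are treated.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two sequences that are ALREADY aligned.

    Assumed convention (hypothetical, for illustration):
    - columns where both sequences have a gap ('-') are ignored;
    - a gap opposite a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy aligned fragments (invented, not real viral sequences).
toy_a = "ATGGCTAGCT"
toy_b = "ATGGCAAG-T"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical")
```

In the toy pair, 8 of the 10 aligned columns match, so the function reports 80.0%.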

2.1 Pathogenesis of SARS-CoV-2

The entry of SARS-CoV-2 into host cells and the release of its genome into target cells depend on a sequence of steps. The virus uses the spike protein, which is important for determining tropism and virus transmissibility. Additionally, SARS-CoV-2 targets human respiratory epithelial cells bearing ACE2 receptors, indicating an RBD structure similar to that of SARS-CoV. 38 After attachment of the S1-RBD to the ACE2 receptor, host cell-surface proteases such as TMPRSS2 (transmembrane serine protease 2) act on a critical cleavage site on S2. 38 This results in membrane fusion and viral infection. Following virus entry, the uncoated genomic RNA is translated into polyproteins (pp1a and pp1ab), which are then assembled into replication/transcription complexes with virus-induced double-membrane vesicles (DMVs). Subsequently, this complex replicates the genome and synthesizes a nested set of subgenomic RNAs by transcription, encoding structural proteins and some accessory proteins. Newly formed virus particles are assembled via the endoplasmic reticulum and the Golgi complex. Finally, virus particles bud and are released into the extracellular milieu. Thus, both the viral replication cycle and progression begin. 10

Inside the host cells, survival of SARS-CoVs is maintained by multiple strategies to evade the host immune mechanism, which can also be generalized to SARS-CoV-2. 39, 40 Because DMVs originating from the first step of SARS-CoV infection lack pathogen-associated molecular patterns, they are not recognized by pattern recognition receptors of the host immune system. 25 Nsp1 can impede the interferon (IFN)-I responses through several mechanisms, such as silencing of the host translational system, induction of host mRNA degradation and repression of signal transducer and activator of transcription (STAT)1 phosphorylation. Nsp3 antagonizes interferon and cytokine production by blocking the phosphorylation of interferon regulation factor 3 (IRF3) and interrupting the nuclear factor-kappa B (NF-κB) signaling pathway. NSPs 14 and 16 cooperate to form a viral 5′ cap similar to that of the host. Thus, the viral RNA genome is not recognized by immune system cells. 41 The accessory proteins ORF3b and ORF6 can disrupt the IFN signaling pathway by inhibiting IRF3 and NF-κB-dependent IFNβ expression and blocking the JAK-STAT signaling pathway, respectively. Also, IFN signaling is dampened by structural proteins M and N, which disturb TANK-binding kinase 1 (TBK1)/IκB kinase ε and TRAF3/6-TBK1-IRF3/NF-κB/AP1 signals. 22, 39 Because the D614G mutation is found in the outer spike protein of the virus, it attracts a huge amount of attention from the human immune system and may therefore impair the ability of SARS-CoV-2 to avoid vaccine-induced immunity. D614G is not in the RBD, although it is involved in the hydrogen-bonding interaction between individual spike protomers that regulates their mature trimeric form on the surface of the virion. 42 Korber et al. reported that the SARS-CoV-2 variant carrying the D614G spike mutation has become dominant across the globe. Although clinical and in vitro evidence indicates that D614G alters the phenotype of the virus, it is not clear whether this is the outcome of adaptation to human ACE2 or whether it enhances transmissibility, and the effect of the mutation on replication, pathogenesis, and vaccine and therapy development remains largely unknown. 43

2.2 Diagnosis of COVID-19

Early diagnosis and isolation of suspected patients play a vital role in controlling this outbreak. 44 The specificity and sensitivity of different diagnostic techniques differ between populations and the types of equipment employed. 45 Several procedures have been recommended for the diagnosis of COVID-19:

COVID-19 symptoms are observed approximately 5 days after incubation. 46 The median time of symptom onset from COVID-19 incubation is 5.1 days, and those infected display symptoms for 11.5 days. 47 This duration was shown to have a close link with the patient's immune system and age. Gastrointestinal symptoms include diarrhea, vomiting and anorexia, recorded in almost 40% of patients. 48, 49 Up to 10% of patients with gastrointestinal symptoms show no signs of fever or respiratory tract infections. 50 COVID-19 has also been linked to hypercoagulable disease, elevating the risk of venous thrombosis. 51 There are also records of neurological symptoms (such as fatigue, dizziness and disturbed awareness), ischemic and hemorrhagic strokes, and muscle damage. 52 Extrapulmonary symptoms also include skin and eye manifestations. Italian researchers have identified skin manifestations in 20% of patients. 53 The clinical outlook for children can progressively worsen as a result of respiratory failure that cannot be corrected within 1–3 days by traditional oxygen therapy (i.e. nasal catheter). 54 In severe cases, the hallmarks are septic shock, sepsis, severe and continuous bleeding as a result of coagulation abnormalities, and metabolic acidosis. 55

Septic shock could cause severe damage and impair several organs, in addition to a severe pulmonary infection. When extrapulmonary system dysfunction occurs, including in the circulatory and digestive systems, septic shock is probable, and the mortality rate increases substantially. 55 Premature delivery and intrauterine hypoxia occur when the fetus is deprived of an adequate supply of oxygen. Insidious symptoms require specific care in some newborn and preterm infants. Reports have indicated a good prognosis for children within 1 or 2 weeks. 55 Children are prone to a hyperinflammatory response to COVID-19 similar to Kawasaki disease, which responds well to management and for which a new term is being coined. 56

The findings of most blood tests are usually nonspecific but could help determine the causes of the disease. A complete blood count typically shows a normal or low count of white blood cells and lymphopenia. C-reactive protein (CRP) and erythrocyte sedimentation rate were generally increased, which would optimally be rechecked on days 3, 5 and 7 after admission. 1, 57, 58 Creatine kinase plus myoglobin, aspartate aminotransferase and alanine aminotransferase, lactate dehydrogenase, D-dimer, and creatine phosphokinase levels could be increased in severe forms of COVID-19 disease. During viral-bacterial co-infections, procalcitonin levels may be elevated. 59, 60 In a systematic review and meta-analysis study, Pormohammad et al. 61 investigated the accessible laboratory results obtained among 2361 SARS-CoV2 patients, with the results demonstrating 26% leukopenia, 13.3% leukocytosis and 62.5% lymphopenia. Also, among 2200 patients, 91% and 81% revealed elevated platelets (thrombocytosis) and CRP, respectively. 61 Additionally, a review of case studies identified clinical diagnosis and clinical parameter modification in a 47-year-old man diagnosed with the disease from Wuweian. 62

To investigate the effect of the coronavirus during the acute phase of the disease, plasma cytokines/chemokines tumor necrosis factor (TNF)-α and interleukin (IL)-1β, IL1RA, IL2, IL4, IL5, IL-6, IL-10, IL13, IL15 and IL17A were measured. 1, 63 One study showed that macrophages and dendritic cells play crucial roles in the adaptive immune system. These cells contain inflammatory cytokines and chemokines, such as IL-6, IL-8, IL-12, TNF-α, monocyte chemoattractant protein-1, granulocyte-macrophage colony-stimulating factor and granulocyte colony-stimulating factor. These inflammatory reactions could cause systemic inflammation. 64

Chest X-ray examination may display diverse imaging characteristics or patterns in COVID-19 patients with different disease severity and duration. Imaging results differ based on patient age, disease stage during screening, immune competency and drug therapy protocols. 66 On the other hand, computed tomography (CT) imaging is essential for monitoring disease progression and assessing therapeutic effectiveness. CT can be repeated 1 to 2 days after admission, based on the Diagnostic and Treatment Protocols Regulation (DTPR). 67

The cardinal hallmark of COVID-19 was multiple, bilateral, posterior and peripheral ground-glass opacities with or without pulmonary consolidation and, in severe cases, infiltrating shadows. 68 Autopsy analysis of a COVID-19 patient displayed fluid accumulation and hyaline membrane formation in alveolar walls, which may be the primary pathological driver of the ground-glass opacity. 69

However, further studies indicated that small patchy shadows, pleural changes, a subpleural curvilinear line and reversed halo signs are generally observed in COVID-19 patients. 70, 71 The intralobular lines and thickened interlobular septa appeared as a crazy-paving pattern on the ground-glass opacity background. 67 Also, several lobar lesions can be found in the respiratory system in children with a severe infection. Evidence showed that chest CT manifestations (pulmonary edema) reported for COVID-19 are generally close to those of SARS and MERS. 69

The clinical diagnosis of COVID-19 is based primarily on epidemiological data, clinical symptoms and some adjuvant technologies, such as nucleic acid detection and immunological assays. In addition, the isolation of SARS-CoV-2 requires biosafety level-3 facilities to ensure personnel safety. Moreover, serological tests have not yet been validated. In the field of molecular diagnosis, there are three main issues: (i) decreasing the number of false negatives by detecting minimal amounts of viral RNA; (ii) reducing the number of false positives through the correct differentiation of positive signals between different pathogens; and (iii) achieving a high capacity for fast and accurate testing of a large number of samples in a short time. 73

2.3 Nucleic acid detection

Two widely used technologies for SARS-CoV-2 nucleic acid detection are real-time RT-PCR (rRT-PCR) and high-throughput sequencing. Nevertheless, because of its reliance on equipment and its high cost, high-throughput sequencing is of restricted use in clinical diagnosis. Access to the whole genome sequence of SARS-CoV-2 has helped the design of specific primers and has led to the best diagnostic protocols. 47, 74 In the first published reports on applying rRT-PCR in COVID-19 diagnosis, targeting the spike (S) gene region of SARS-CoV-2 showed remarkable specificity but limited sensitivity. 68 Later, the sensitivity of this technique was greatly improved by the use of specific probes for other virus-specific genes, including RNA-dependent RNA polymerase (RdRp) in the ORF1ab region, nucleocapsid (N) and envelope (E). To avoid cross-reaction with other human coronaviruses and to guard against the potential genetic drift of SARS-CoV-2, two molecular targets should be included in this assay: one nonspecific target to detect other CoVs, and one specific target for SARS-CoV-2. A comparison of the results obtained from targeting all studied genes showed that the RdRp gene is the most appropriate target, with the highest sensitivity. The RdRp assays were validated in approximately 30 European laboratories using synthetic nucleic acid technology. 73 More recently, Chan et al. 75 proposed a novel RT-PCR assay targeting a sequence of the RdRp/Hel that could detect low SARS-CoV-2 loads in upper respiratory tract, plasma and saliva samples without any cross-reactivity with other common respiratory viruses. Although the CDC-recommended assays in the USA rely on two nucleocapsid gene targets, N1 and N2, the WHO recommends the E gene assay as a first-line screening test, followed by the RdRp gene assay as a confirmatory test. Based on the most recent evidence, the QIAstat-Dx SARS-CoV-2 panel, a multiplex real-time RT-PCR system targeting the RdRp and E genes, remains highly sensitive despite nucleotide variations affecting the annealing of the PCR assay. 76

Generally, quantitative RT-PCR (RT-qPCR) has high specificity and serves as the gold standard assay for the final diagnosis of COVID-19. However, its sensitivity can vary with viral load, RNA extraction technique, sampling source and disease stage at the time of sampling. Indeed, RT-PCR false-positive results are related to the cross-contamination of samples and handling errors. By contrast, inaccuracies during any stage of the collection, storage and processing of samples may lead to false-negative results. Some studies have revealed that samples from the upper respiratory tract (bottom of the nostrils and the oropharynx) are more desirable for the RT-PCR assay because they contain many viral copies. 77 Moreover, other shortcomings of RT-qPCR assays include biological safety hazards arising from maintaining and working on patient samples, as well as a time-consuming and cumbersome nucleic acid detection process. 66, 68

To improve the molecular diagnostic techniques for COVID-19, isothermal amplification-based methods are currently in development. Loop-mediated isothermal amplification (LAMP) uses a DNA polymerase and 4 to 6 different primers that bind to distinct sequences on the target genome. 78 In LAMP reactions, the amplified DNA is indicated by turbidity arising from a by-product of the reaction, a detectable color generated by a pH-sensitive dye, or fluorescence produced by a fluorescent dye. 79 The reaction occurs at a single temperature, in less than 1 hour, and with minimal background signals. LAMP diagnostic testing for COVID-19 is more specific and sensitive than conventional RT-PCR assays and does not depend on specialized laboratory equipment such as a thermocycler. However, as a result of the multiplicity of primers used in this method, optimizing the reaction conditions presents a major challenge. 80, 81

2.4 Microarray-based technique

Antigen detection and immunological techniques can be used for a rapid and cost-effective diagnosis while providing an alternative to molecular methods. Immunological techniques, including the immunofluorescence assay, direct fluorescence antibody test, nucleocapsid protein detection assay, protein chip, semiconductor quantum dots and the microneutralization assay, rely on binding between a viral antigen and a specific antibody. 88-91 These immunological methods are simple to operate but have low specificity/sensitivity. In the case of COVID-19, virus morphology can be observed by electron microscopy according to traditional Koch’s postulates. 92, 93 Serological tests can improve coronavirus detection such that associated antigens and monoclonal antibodies can represent a new diagnostic approach for future development (Figure 2). 94, 95

Serological tests may be specific to one type of immunoglobulin, may concurrently measure IgM and IgG antibodies, or may be total antibody tests, which often also measure IgA antibodies. 96 Depending on the specific procedure and device, these tests can usually be completed within 1–2 hours after a sample arrives in the laboratory and is loaded onto the appropriate platform. 97 Guo et al. 98 indicated that IgA and IgM antibodies have positive rates of 93.0% and 85.5% after 3–6 days, respectively. Also, IgG antibodies were positive in 78.0% of cases at 10–18 days. The detection efficiency of an IgM enzyme-linked immunosorbent assay (ELISA) is higher than that of qPCR after 5.5 days of symptom onset.

Moreover, the combination of PCR and IgM ELISA increased the detection rate to 98.5%. 98 Xiang et al. 99 tested 63 patients infected with SARS-CoV-2 who were admitted to Jinyintan Hospital in Wuhan, Hubei, China. Patient serum samples were evaluated using an ELISA and indirect ELISA IgG capture. The study results indicate that IgM was positive in 28 of 63 samples, with a sensitivity of 44.4%, a specificity of 100% and an accuracy of 64.3%. IgG was positive in 52 cases, with a sensitivity of 82.54%, a specificity of 100% and an accuracy of 88.8%. In addition, a sensitivity of 87.3% was achieved using IgM and IgG combination analysis. 99
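The sensitivity, specificity and accuracy values reported for the IgM and IgG tests follow from simple counts. The sketch below reproduces them under one loud assumption: the text gives only the 63 patients and the positive counts (28 and 52), so the size of the negative control group is not stated here. A control group of 35, all testing negative, is a hypothetical value chosen because it is consistent with the reported 100% specificity and with both accuracy figures. The function and constant names are illustrative.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard definitions from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),            # detected patients / all patients
        "specificity": tn / (tn + fp),            # negative controls / all controls
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


PATIENTS = 63            # from the text
ASSUMED_CONTROLS = 35    # ASSUMPTION: not stated in the text, inferred for illustration

# IgM: 28 of 63 patients positive; IgG: 52 of 63 patients positive (from the text).
igm = diagnostic_metrics(tp=28, fn=PATIENTS - 28, tn=ASSUMED_CONTROLS, fp=0)
igg = diagnostic_metrics(tp=52, fn=PATIENTS - 52, tn=ASSUMED_CONTROLS, fp=0)

for name, result in (("IgM", igm), ("IgG", igg)):
    summary = ", ".join(f"{key} {value * 100:.1f}%" for key, value in result.items())
    print(f"{name}: {summary}")
```

With these inputs the IgM sensitivity is 28/63 and the IgG sensitivity is 52/63, matching the percentages quoted above; the accuracy values depend on the assumed control count.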

2.5 CRISPR technique

Nucleic acid detection with CRISPR-Cas13a/C2c2 is a highly rapid, sensitive and specific molecular detection platform, which may aid in the epidemiology, diagnosis and control of the disease. In addition, Cas13a/C2c2 can detect the expression of transcripts in live cells and different diseases. 101, 102 Zhang et al. presented a protocol for the detection of COVID-19 using the CRISPR diagnostics-based SHERLOCK (Specific High Sensitivity Enzymatic Reporter UnLOCKing) technique. Using RNA fragments of the SARS-CoV-2 virus, the protocol detects target sequences at approximately 100 copies. The test is performed by isothermal amplification of the nucleic acid extracted from patient samples, followed by detection of the amplified viral RNA sequence via Cas13, and is finally read out on a paper dipstick in less than 1 hour. 103, 104 Huang et al. 105 established a CRISPR-based assay using a custom CRISPR Cas12a/gRNA complex. They used a fluorescent probe to identify target amplicons produced by standard RT-PCR or isothermal recombinase polymerase amplification. This method enables specific detection at sites not equipped with the real-time PCR systems needed for qPCR diagnostic tests. The assay allows the identification of SARS-CoV-2 positive samples with a test-to-response time of approximately 50 minutes and a detection limit of two copies per sample. The findings of the CRISPR test on nasal samples collected from persons with COVID-19 were comparable with matched data obtained from the CDC-approved RT-qPCR test. 105

Broughton et al. 106 described the development of a fast (< 40 min), simple-to-implement and precise CRISPR-Cas12-based lateral flow test to diagnose SARS-CoV-2 from RNA extract from a nasal swab. Using artificial reference samples and clinical specimens from patients, comprising patients diagnosed with COVID-19 disease and 42 patients with other respiratory illnesses, they confirmed their process. This CRISPR-based approach has provided a visual and quicker alternative option to the SARS-CoV-2 real-time RT-PCR method used in the US Centers for Disease Control and Prevention, with approximately 100% negative predictive agreement and 95% positive predictive agreement. 106

2.6 LAMP-based technique

Loop-mediated isothermal amplification (LAMP) is a new isothermal nucleic acid amplification method with great efficiency. It is used to amplify RNAs and DNAs with high specificity and sensitivity as a result of its exponential amplification and the six particular target sequences recognized by four separate primers. 107 The LAMP assay is rapid and does not need high-priced reagents or equipment. Furthermore, gel electrophoresis is widely used to analyze the amplified products for endpoint detection. Hence, the LAMP test will help to decrease the cost of coronavirus detection. Several LAMP-based strategies for the detection of coronaviruses, as developed and performed in clinical diagnosis, are described here. 108

Poon et al. 109 reported a simple LAMP test in the SARS study and demonstrated the feasibility of this method for SARS-CoV detection. The SARS-CoV ORF1b site was selected for SARS detection and amplified in the presence of six primers via the LAMP reaction, and then the amplified products were assessed by gel electrophoresis. The sensitivity and detection levels of the LAMP test for SARS are close to those of traditional PCR-based techniques. Pyrc et al. 110 effectively applied LAMP to HCoV-NL63 detection with a desirable sensitivity and specificity in cell cultures and clinical specimens. Notably, the detection limit was found to be one copy of RNA template. Amplification is observed through a fluorescent dye or magnesium pyrophosphate precipitation. These readouts can be followed in real time by monitoring the turbidity of the pyrophosphate or the fluorescence, which overcomes the limitation of endpoint detection. 110

Shirato et al. 111 developed a useful RT-LAMP assay for the diagnosis and epidemiological monitoring of human MERS-CoV. This method was highly specific, without any cross-reaction with other respiratory viruses, and detected as few as 3.4 copies of RNA. 111 Subsequently, they extended the RT-LAMP assay by detecting the signal with a quenching probe (QProbe), which has the same efficiency as the usual real-time RT-PCR test with respect to MERS-CoV detection. 112

Based on other evidence, a nucleic acid visualization method was developed that combines RT-LAMP and a vertical flow visualization strip for MERS detection. 113

2.7 Penn RAMP technology

Based on the effectiveness reported by Zhang et al. 104 using the comparatively less sensitive LAMP, the improved sensitivity of the Penn-RAMP technique achieved by Huang et al. 114 , which is attributable to an updated two-step LAMP protocol, can prove to be substantially effective as a diagnostic. To amplify specific targets by recombinase polymerase amplification, in which all targets are simultaneously amplified, the Penn-RAMP requires a preliminary reaction with outer LAMP primers. A next highly precise LAMP reaction is then triggered. Especially, the first stage uses F3 and B3 outer LAMP primers, whereas the other four RAMP primers are further mixed in the stage 2. Compared to normal LAMP, this ‘nested’ concept considerably improved the sensitivity of LAMP by approximately 10–100 times, especially when working with distilled and crude samples. 115 Additionally, when extended to mock trials, the Penn-RAMP methodology was given a 100% approval rating at 7–10 copies of viral RNA per reaction, compared to a 100% approval rating at the 700 viral RNA copies needed for PCR analysis. 114, 115

2.8 Droplet digital PCR

For the direct identification and quantification of DNA and RNA targets, droplet digital PCR (ddPCR) is an extremely sensitive technique. 116 It has been widely used for infectious disease conditions, particularly because of its ability to identify a few copies of viral genomes accurately and efficiently. 117 If identification of low-level and/or residual virus is required, ddPCR quantitative data are much more informative than those provided by regular RT-PCR tests. In view of the need to restrict (as far as possible) false-negative results in COVID-19 diagnosis, use of ddPCR can provide vital support. Even so, the ddPCR assay is still very rarely studied in clinical settings and there is currently no available evidence for European cases. 118

2.9 Next-generation sequencing (NGS)-based technique

RNA viruses come in a great assortment of varieties and are the etiological agents of numerous significant infectious diseases in humans and animals, such as SARS, hepatitis, influenza and avian infectious bronchitis (IB). 119 High-throughput NGS technology has a vital role in primary and accurate diagnosis. 120 In addition, the NGS method can detect whether or not various types of virus comprise a pathogen. The rapid discovery of viruses by NGS, including DNA-sequencing and RNA-sequencing, has advanced the identification of viral diversity. 121 The identification of a wide range of pathogens using NGS technologies is also significant for controlling viral infection caused by a new pathogen. 122 In recent years, the advancement of the NGS method via RNA-sequencing has enabled great progress in the fast recognition of new RNA viruses. RNA-sequencing detects millions of reverse-transcribed DNA fragments from complex RNA samples at the same time using random primers. 123 Chen et al. 122 reported a new duck coronavirus using the RNA-sequencing method, which differed from chicken IBV (infectious bronchitis virus). 122 The new duck-specific CoV was a possible new species within the genus Gammacoronavirus, as shown by sequences of the viral 1b gene from three regions.

In conclusion, the outbreak of a novel virus emerged at the end of December 2019. COVID-19 spread rapidly and challenged medicine, economics and public health worldwide. Numerous lines of evidence suggest that the ACE2 receptor is crucial for virus entry into host cells. Both cross-species transmission from unidentified intermediate hosts and human-to-human transmission have been recognized. Hence, early detection and isolation of suspected patients can play an essential role in controlling this outbreak. Currently, diagnostic methods for COVID-19 are numerous; hence, it is imperative to choose a suitable detection protocol. Each of the described techniques has its specific advantages and disadvantages. Both chest CT imaging and RT-PCR tests are recommended for COVID-19 patients. However, the use of PCR requires various equipment and a well-established laboratory. LAMP can detect low numbers of DNA or RNA templates within 1 hour. Microarray is an expensive method for COVID-19 diagnosis, and other newly developed methods also require additional investigation to achieve rapid development and detection in the future. Given that the number of infected cases is rapidly increasing, future studies should reveal the secrets of the molecular pathways of the virus with respect to developing targeted vaccines and antiviral treatments.


Abstract

Density functional methods are used to examine the geometries and energetics of molecules containing a phenyl ring joined to the trigonal bipyramidal SF3 framework. The phenyl ring has a strong preference for an equatorial position. This preference remains when one or two ether −CH2OCH3 groups are added to the phenyl ring, ortho to SF3, wherein an apical structure lies nearly 30 kcal/mol higher in energy. Whether equatorial or apical, the molecule is stabilized by a S···O chalcogen bond, sometimes augmented by CH···F or CH···O H-bonds. The strength of the intramolecular S···O bond is estimated to lie in the range between 3 and 6 kcal/mol. A secondary effect of the S···O chalcogen bond is elongation of the S–F bonds. Solvation of the molecule strengthens the S···O interaction. Addition of substituents to the phenyl ring has only modest effects upon the S···O bond strength. A strengthening arises when an electron-withdrawing substituent is placed ortho to the ether and meta to SF3, while electron-releasing species produce an opposite effect.


F3. Hydrogen Bonding

Linus Pauling first suggested that H bonds (between water and the protein and within the protein itself) would play a dominant role in protein folding and stability. It would seem to make sense since amino acids are dipolar and secondary structure is common. Remember, however, the H bonds would be found not only in the native state but also in the denatured state. Do they contribute differently to the stability of the D vs N states? Many experimental and theoretical studies have been performed investigating helix <===> (random) coil transitions in small peptides. Remember all the intrachain H bonds in the helix? Are they collectively more stable than H bonds between water and the peptide in a (random) coil?

In thinking about conformational studies involving small peptides, it is useful to apply Le Chatelier's Principle to the equilibrium below:

random coil <===> helix

Any perturbant (small molecules, solvent, etc.) that preferentially interacts with the helical form would push the equilibrium toward the helical form.

Early models assumed that intrachain H bonds were energetically (enthalpically) more favorable than H bonds between peptide and water. But to form a hydrogen bond requires an entropy payback since a helix is much more ordered (lower entropy) than a random coil (higher entropy). At low temperature, enthalpy predominates and helix formation in solution is favored. At high temperature, the helix is disfavored entropically. Imagine the increased vibrational and rotational states permitted to the atoms at higher temperatures. (Remember the trans to gauche conformational changes in the acyl chains of double chain amphiphiles as the temperature increased, leading to a transition from a gel to liquid crystalline phase in bilayer vesicles.) Theoretical studies on helix-coil transitions predict the following:

  • as the chain length increases, the helix gets more stable;
  • increasing the charge on the molecule destabilizes the helix, since the coil, compared to the more compact helix, has a lower charge density;
  • solvents that protonate the carbonyl oxygen (like formic acid) destabilize the helix; and
  • solvents that form strong H bonds compete with the peptide and destabilize the helix. In contrast, solvents such as CHCl3 and dimethylformamide (a nonprotic solvent) stabilize the helix. Likewise, 2-chloroethanol and trifluoroethanol (TFE), which form no H bonds or weaker H bonds to the peptide than does water, stabilize the helix. (In the case of trifluoroethanol, molecular dynamics simulations have shown that TFE preferentially interacts with (solvates) the peptide, which inhibits H bonds from the peptide backbone to water, stabilizing the intrahelical H bonds.)
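The enthalpy versus entropy argument made just before the list can be illustrated numerically. The sketch below assumes a simple two-state coil ⇌ helix model (real helix-coil transitions are more cooperative than this), and the values of ΔH and ΔS are hypothetical numbers chosen only so that the crossover falls near room temperature; they are not taken from the text.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

# Hypothetical values for coil -> helix (NOT from the text):
# favorable enthalpy, unfavorable entropy.
DH = -10.0    # kcal/mol
DS = -0.032   # kcal/(mol*K)


def delta_g(delta_h, delta_s, temp_k):
    """Free energy change dG = dH - T*dS for coil -> helix, in kcal/mol."""
    return delta_h - temp_k * delta_s


def melting_temperature(delta_h, delta_s):
    """Temperature (K) at which dG = 0, i.e. helix and coil are equally populated."""
    return delta_h / delta_s


def helix_fraction(delta_h, delta_s, temp_k):
    """Two-state helix fraction K/(1+K) with K = exp(-dG/RT)."""
    k = math.exp(-delta_g(delta_h, delta_s, temp_k) / (R_KCAL * temp_k))
    return k / (1.0 + k)


for temp in (280.0, melting_temperature(DH, DS), 340.0):
    print(f"T = {temp:6.1f} K  dG = {delta_g(DH, DS, temp):+.2f} kcal/mol  "
          f"helix fraction = {helix_fraction(DH, DS, temp):.2f}")
```

With these assumed numbers the helix is favored below the crossover temperature (ΔH dominates) and disfavored above it (the -TΔS term dominates), which is the qualitative behavior described in the paragraph.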

These helix-coil studies suggest that H bonds are important in stabilizing a protein.

But do they really? Why should these H bonds differ from those in water? It is difficult to tell whether they do, since there are so many possible H bonds (between protein and water, water and water, and protein and protein), and their strength depends on their orientation and on the dielectric constant of the medium in which they are located.

If intrachain H bonds in a protein are not that much different in energy than intermolecular H bonds between the protein and water, and given that proteins are marginally stable at physiological temperatures, then it follows that the folded state must contain about as many intramolecular hydrogen bonds within the protein as possible intermolecular H bonds between the protein and water, otherwise the protein would unfold.

To resolve this issue, and to determine the relative strength of H bonds between the various possible donors and acceptors, many studies have compared the energy of H bonds between small molecules in water with the energy of H bonds between the same small molecules in a nonpolar solvent. The rationale goes like this. If the interior of a protein is more nonpolar than water (lower dielectric constant than water), then intrachain H bonds in a protein might be modeled by looking at the H bonds between small molecules in nonpolar solvents and asking the question: is the free energy change for the following process less than 0?

Dw + Aw ⇌ (DA)n     (ΔG°, K)

where D is a hydrogen bond donor (like NH), A is a hydrogen bond acceptor (like C=O), w is water (i.e. donor and acceptor are in water), n is a nonpolar solvent, and ΔG° and K are the standard free energy change and the equilibrium constant, respectively, for the formation of a H bond in a nonpolar solvent from a donor and acceptor in water. This reaction simulates H bond contributions to protein folding, where a buried H bond is mimicked by a H bond in a nonpolar solvent. The reaction written above is really a thought experiment, since it would be hard to set up the necessary conditions to make the measurement. However, we can calculate ΔG° for this reaction: since free energy is a state function, it doesn't matter by which path the process is accomplished.

Let's consider a specific example: the formation of H bonds between two molecules of N-methylacetamide (NMA) in water and in a nonpolar solvent. The reaction scheme shown below describes a set of reactions (a thermodynamic cycle) involving the formation of H-bonded dimers of NMA. A and B are both molecules of NMA, in either water (w) or a nonpolar solvent (n).
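A minimal sketch of the cycle, reconstructed from the definitions of K1 through K4 given in the text; the layout is an assumption and may differ from the original figure:

```latex
\[
\begin{array}{ccc}
A_w + B_w & \xrightarrow{\;K_3\;} & (AB)_w \\[4pt]
\Big\downarrow{\scriptstyle K_2} & & \Big\downarrow{\scriptstyle K_4} \\[4pt]
A_n + B_n & \xrightarrow{\;K_1\;} & (AB)_n
\end{array}
\]
```

The two routes from the upper left corner to the lower right corner (via K2 then K1, or via K3 then K4) connect the same initial and final states.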

N-methylacetamide is a good mimic for the H bond donors and acceptors of the peptide bond of a polypeptide chain.

In the reaction scheme shown above,

K1 is the equilibrium constant for the dimerization of NMA in a nonpolar medium. This can be readily determined, and is >1, implying that ΔG° < 0. (Remember, ΔG° = -RT ln Keq.) For the dimerization of NMA in CCl4, ΔG°1 = -2.4 kcal/mol.

K2 is the equilibrium constant (think of it as a partition coefficient) for the transfer of two NMA molecules from water to a nonpolar solvent (again easily measurable). For NMA transferring from water to CCl4, ΔG°2 = +6.12 kcal/mol.

K3 is the equilibrium constant for the dimerization of NMA in water. This can be readily determined, and is <1, implying that ΔG° > 0. For the dimerization of NMA in water, ΔG°3 = +3.1 kcal/mol.
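To connect the quoted ΔG° values to the statements that K1 > 1 while K3 < 1, the relation ΔG° = -RT ln Keq can be inverted. The sketch below assumes a temperature of 298.15 K, which the text does not state, so the printed K values are only indicative.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
TEMP_K = 298.15     # assumed temperature; not given in the text


def equilibrium_constant(delta_g_kcal, temp_k=TEMP_K):
    """Invert dG = -RT ln K to get K = exp(-dG / RT)."""
    return math.exp(-delta_g_kcal / (R_KCAL * temp_k))


# Standard free energy changes quoted in the text (kcal/mol).
STEPS = {
    "K1: dimerization of NMA in CCl4": -2.4,
    "K2: transfer of two NMA, water -> CCl4": 6.12,
    "K3: dimerization of NMA in water": 3.1,
}

for label, dg in STEPS.items():
    print(f"{label}: dG = {dg:+.2f} kcal/mol, K = {equilibrium_constant(dg):.3g}")
```

A negative ΔG° gives K > 1 (step 1), and a positive ΔG° gives K < 1 (steps 2 and 3), matching the signs described above.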

K4 is the equilibrium constant (think of it as a partition coefficient) for the transfer of a hydrogen-bonded dimer of NMA from water to a nonpolar solvent. You try to think of a way to measure that! I can't. This is where thermodynamic cycles come in so nicely. You don't have to measure it. You can calculate it from K1 through K3, since free energy is a state function!

ΔG°2 + ΔG°1 = ΔG°3 + ΔG°4, or -RT ln K2 - RT ln K1 = -RT ln K3 - RT ln K4

ln K2 + ln K1 = ln K3 + ln K4, so ln(K2K1) = ln(K3K4), or K2K1 = K3K4

For the H-bonded NMA dimer transferring from water to CCl4, ΔG°4 = +0.62 kcal/mol.
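The arithmetic behind the value of +0.62 kcal/mol can be checked with a short sketch that closes the cycle using the three measured values quoted above. The temperature used for the K check is an assumption (298.15 K), and the function name is illustrative only.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
TEMP_K = 298.15     # assumed temperature; not given in the text

# Values quoted in the text (kcal/mol).
DG1 = -2.4    # dimerization of NMA in CCl4
DG2 = 6.12    # transfer of two NMA molecules, water -> CCl4
DG3 = 3.1     # dimerization of NMA in water


def close_cycle(dg1, dg2, dg3):
    """Solve dG2 + dG1 = dG3 + dG4 for the unmeasured step dG4."""
    return dg2 + dg1 - dg3


def k_from_dg(dg):
    """K = exp(-dG / RT)."""
    return math.exp(-dg / (R_KCAL * TEMP_K))


DG4 = close_cycle(DG1, DG2, DG3)
print(f"dG4 = {DG4:+.2f} kcal/mol")

# The same closure expressed with equilibrium constants: K2*K1 = K3*K4.
lhs = k_from_dg(DG2) * k_from_dg(DG1)
rhs = k_from_dg(DG3) * k_from_dg(DG4)
print(f"K2*K1 = {lhs:.3e}, K3*K4 = {rhs:.3e}")
```

Because free energy is a state function, the sum around either path must agree, so the one step that cannot be measured follows from the three that can.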

(Note: Biochemists like to talk about thermodynamic cycles which may seem new to you. However, believe it or not, you have seen them before - in General Chemistry - in the form of Hess's Law!)

From K1 through K4 and the corresponding ΔG° values, we can now calculate ΔG°5 for the formation of H-bonded NMA dimers in a nonpolar solvent from two molecules of NMA(aq). This reaction, which we hope simulates formation of buried intrachain H bonds in proteins on protein folding, is:
Dw + Aw ⇌ (DA)n, for which ΔG°5 = +3.72 kcal/mol (i.e. disfavored).
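As a worked check using only the values quoted in the text, the overall step can be reached along either side of the cycle, and both routes give the same number:

```latex
\[
\Delta G^\circ_5 = \Delta G^\circ_2 + \Delta G^\circ_1
               = (+6.12) + (-2.4) = +3.72\ \text{kcal/mol}
\]
\[
\Delta G^\circ_5 = \Delta G^\circ_3 + \Delta G^\circ_4
               = (+3.1) + (+0.62) = +3.72\ \text{kcal/mol}
\]
```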

If this model is a good mimic for studying H bond formation on protein folding, it suggests that the formation of buried H bonds during protein folding does not drive protein folding.

However, if the transfer of D and A (from a large protein) from water to the nonpolar medium (modeled by K2) is driven by other forces (such as the hydrophobic effect), the favorable value of K1 (>1, so ΔG°1 < 0) will strongly favor buried H bond formation. So, if this happens in proteins, it is clear why so many intrachain H bonds occur, since the step described by K1 is so favorable. H bonds may not assist the collapse of a protein, but would favor internal organization within a compact protein. That is, H bonds don't drive protein folding per se, but form so that the folded protein is not destabilized by too many unsatisfied H bonds.

There are potential problems with this simple model. The interior of a protein is not homogeneous (i.e. the effective dielectric within the protein will vary). H bond strength is also very sensitive to geometry. Also, there are many H bonds within a protein, so slight errors in the estimation of H bond strength would lead to large errors in determination of protein stability.

Another argument against H bonds being the determining factor in protein folding and stability comes from solvent denaturation studies. If intrachain H bonds are so important, then should not solvents that can H bond to the backbone denature the protein? Shouldn't water (55 M) act as a denaturant? It doesn't, however. Dioxane (a six-membered heterocyclic ring containing two O atoms), which has only H bond acceptors, wouldn't be expected to denature proteins, but it does. H bonds also increase in nonpolar solvents. Peptides which have random structures in water can be induced to form helices when placed in alcohol solutions (trifluoroethanol, for example), which are more nonpolar than water, as explained above in the helix-coil studies. If H bonds were the dominant factor in protein stability, the alcohols would stabilize proteins. Instead, at low concentrations of alcohol, proteins are destabilized.

Hence, based on small molecule studies and the study of proteins in various cosolvents, it is unlikely that H bonds are the big stabilizers of protein structure. Only 11% of all C=O's and 12% of all NH's in proteins have no H bonds (determined by analysis of X-ray crystallographic structures). Of all H bonds to C=O, 43% are to water, 11% to side chains, and 46% to main chain NH's. Of all H bonds to NH, 21% are to water, 11% to side chains, and 68% to the main chain C=O. We will see later, however, that an opposite conclusion is reached from studies using site-specific mutagenesis.

Biochemistry Online by Henry Jakubowski is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


SUMMARY

Neuroblastoma is a childhood extracranial solid tumour that is associated with a number of genetic changes. Included in these genetic alterations are mutations in the kinase domain of the anaplastic lymphoma kinase (ALK) receptor tyrosine kinase (RTK), which have been found in both somatic and familial neuroblastoma. Treating patients accordingly requires characterisation of these mutations in terms of their response to ALK tyrosine kinase inhibitors (TKIs). Here, we report the identification and characterisation of two novel neuroblastoma ALK mutations (A1099T and R1464STOP), which we have investigated together with several previously reported but uncharacterised ALK mutations (T1087I, D1091N, T1151M, M1166R, F1174I and A1234T). In order to understand the potential role of these ALK mutations in neuroblastoma progression, we have employed cell culture-based systems together with the model organism Drosophila as a readout for ligand-independent activity. Mutation of ALK at position 1174 (F1174I) generates a gain-of-function receptor capable of activating intracellular targets such as ERK (extracellular signal regulated kinase) and STAT3 (signal transducer and activator of transcription 3) in a ligand-independent manner. Analysis of these previously uncharacterised ALK mutants and comparison with ALK F1174 mutants suggests that ALK mutations observed in neuroblastoma fall into three classes. These classes are: (i) gain-of-function ligand-independent mutations such as ALK F1174l, (ii) kinase-dead ALK mutants, e.g. ALK I1250T (Schönherr et al., 2011a) and (iii) ALK mutations that are ligand-dependent in nature. Irrespective of the nature of the observed ALK mutants, in every case the activity of the mutant ALK receptors could be abrogated by the ALK inhibitor crizotinib (Xalkori/PF-02341066), albeit with differing levels of sensitivity.


CONCLUSIONS AND PERSPECTIVES

The recent advances in the study of poly(ADP-ribosyl)ation are mainly concerned with the biological role and mode of operation of proteins belonging to the PARP family as well as with different functions of poly(ADP-ribosyl)ation in cells. The key role of this process in mammalian cells has stimulated the publication of excellent reviews (111–113) and journal issues [Mol Aspects Med. 2013 34(6)] devoted to PARP and poly(ADP-ribosyl)ation as well as to PARP inhibitors. At the same time, the molecular mechanism of this process, which is important for understanding the part played by mono- and poly(ADP-ribosyl)ation in regulation of replication, transcription, DNA repair and protein stability/degradation, remains unclear to a large extent. All these events have to be regulated by protein-DNA and protein-protein interactions as well as by interactions of proteins with poly(ADP-ribose). It is known that the function of RNA-binding proteins is also dependent on PARP and poly(ADP-ribosyl)ation (27, 114).

The combined action of PARP1 and PARP2 seems important in the regulation of poly(ADP-ribosyl)ation of proteins (43, 115) and recently discovered poly(ADP-ribosyl)ation of damaged DNA (32, 33, 116, 117). PARP3 along with other members of the PARP family catalyzing mono(ADP-ribosyl)ation may contribute to the initiation of PAR synthesis under specific conditions (117). It is very likely that the synthesis of short or long and branched PAR polymers may perform different functions during the regulation of cellular processes. The synthesis of branched PAR chains could mainly serve for the creation of non-membranous cell compartments by PAR-induced liquid demixing events (84) necessary for regulation of chromatin remodeling and the subsequent multienzyme processes of DNA repair and transcription. In this regard, the combined action of mono-, oligo- and poly(ADP-ribosyl)-transferases in the cell may ensure multilevel regulation of DNA and RNA metabolism.

Since damaged DNA is not the sole activator of PARP1 (85), it can be assumed that more complicated mechanisms are necessary for well-tuned activation and accuracy of the ADP-ribosylation process. Indeed, recent studies have revealed a number of proteins that can not only stimulate or inhibit PARP1 catalytic activity, but also appear to regulate target and PAR length. As PARP inhibition holds great promise in cancer therapy, elucidation of the nuances of the PARylation mechanism as well as the molecular mechanisms of activation and regulation of PAR synthesis in cells may provide a basis for the rational development of new treatment strategies.

The aim of this review was to summarize the current knowledge on the molecular mechanism of poly(ADP-ribosyl)ation catalyzed by PARP1 and its regulation, in order to advance the study of this key process in mammalian cells.