|
1. |
Introductory lecture: what are enzyme structures telling us? |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 1-11
M. F. Perutz,
|
|
|
Abstract:
Faraday Discuss., 1992, 93, 1-11

Introductory Lecture

What are Enzyme Structures telling Us?

M. F. Perutz
MRC Laboratory of Molecular Biology, Cambridge CB2 2QH, UK

Most globular proteins are waxy inside and soapy outside. Their compact structures are stabilised by the hydrophobic effect which is mainly entropic, and by hydrogen bonds and dispersion forces which are mainly enthalpic. Structurally homologous proteins tend to share a common set of internal sites from which polar residues are excluded, even if they share little sequence homology. Electrostatic effects are dominant in enzyme catalysis. The active sites of enzymes are generally buried in clefts or cavities where dipoles tend to be oriented so as to optimise the pKas of ionizable amino acid side chains for catalysis; substrates are clamped close by to maximise electrostatic interactions. The activities of many enzymes have long been known to be controlled by allosteric effects or by induced fit. Some serine proteinase inhibitors have evolved yet another control mechanism: this is a spring-loaded safety catch that makes them revert to their latent, stable, inactive form unless the catch is kept in the ‘loaded’ position by another molecule.

In 1953, when I discovered that the phase problem of protein crystallography could be solved by the method of isomorphous replacement with heavy atoms, I expected that the structures, not only of haemoglobin, but also of many other proteins would presently be solved, but this did not happen. Only three protein structures had been solved by 1965, and only eleven by 1970. The practical difficulties of crystallization, of preparing isomorphous heavy-atom derivatives and of recording the X-ray diffraction data were so great that determination of each new structure took many years. Besides, most professional crystallographers were reluctant to enter this risky new field.
Today the situation is transformed. Since 1975 there has been an exponential rise in the annual number of protein structures solved; in 1990 alone over 100 new ones came to light and by mid-1991 ca. 300 protein structures had been solved, many of them of practical interest to medicine (Fig. 1). What are these structures telling us?

Fig. 1 The growth of protein crystallography (Courtesy Dr. Arthur Lesk)

The Hydrophobic Effect

Most water-soluble proteins are waxy inside and soapy outside, because their larger hydrophobic amino acid residues shy away from water and coalesce. Van der Waals interactions make a large, but insufficient, contribution to the stability of the hydrophobic core thus formed. The main contribution comes from an entropic effect discovered in 1945 by Frank and Evans in a classic paper on the solubility of hydrocarbons in water [1]. Near room temperature, the enthalpy of dissolution of gaseous non-polar atoms or molecules in water is always negative and proportional to the surface area of the solute molecule; the absolute value of that enthalpy decreases with rising temperature. Privalov later demonstrated that dissolution of non-polar molecules in water raises the heat capacity of the water; that rise is also proportional to the surface area of the solute and also decreases with increasing temperature [2]. The entropy of solution is negative, and its magnitude drops with rising temperature. Frank and Evans concluded that the non-polar atoms and molecules become solvated, such that their surface becomes covered with a layer of partially ordered water molecules which they likened to icebergs.
In 1959 Kauzmann recognized the importance of Frank and Evans’ hydrophobic effect for the stability of proteins [3]. He suggested that the water molecules’ anarchic distaste for the orderly regimentation imposed upon them by the hydrophobic sidechains of the protein forces these sidechains to shy away from water and congregate in the centre of the protein. His prediction was borne out in the same year by Kendrew’s structure of myoglobin. Direct experimental evidence for Frank and Evans’ icebergs was first found by Hendrickson and Teeter in the structure of crambin where they saw ordered water molecules covering the surface of a leucine sidechain [4].

The hydrophobic effect stabilizes proteins only near ambient temperatures. With increasing temperature, the loss of entropy due to water adhering to the unfolded protein diminishes, which destabilizes the folded structure. When the temperature drops, the stability of the hydrated hydrocarbons in the unfolded polypeptide chain begins to exceed that of the compact hydrophobic core in the native protein, and the protein unfolds with the release of heat. Privalov used microcalorimetry to demonstrate this effect in myoglobin [2] (Fig. 2).

Fig. 2 Microcalorimetric recording of the enthalpy changes on cooling and subsequent heating of metmyoglobin solutions at pH 3.83, plotted against T/°C. The low-temperature peaks correspond to heat release on cold denaturation and heat absorption on subsequent renaturation by warming the protein. The temperature shift of the peaks is due to the slow kinetics of unfolding and folding of the myoglobin structure at low temperatures. The large peak on the right gives the heat absorbed on heat denaturation. (Reproduced, with permission, from P. L. Privalov and S. J. Gill, Adv. Protein Chem., 1988, 39, 205)

Chothia pointed out that the density of globular proteins approaches the average density of their component crystalline amino acids [5]. Their packing density, i.e. the ratio of the volume of their component atoms to their actual volume, is 0.75, compared to 0.44 or less for organic liquids, which proves that the protein interior really does resemble a crystalline wax rather than an oil drop. Fersht [6] will show in this Discussion that the crystallization of the hydrophobic core is one of the driving forces throughout the folding of a globular protein.

Common Sites from which Polar Residues are Excluded

By 1965 it had become clear that the amino acid sequences even of mammalian myoglobins and haemoglobins diverged greatly, yet there was strong evidence in favour of them all having the same tertiary structure. When I wondered what determines that structure, I found that all their sequences shared a set of 32 common sites from which polar residues other than serines and threonines were excluded, though the nature of the hydrophobic residue at each site varied [7]. Recently Ivo de Baere and Luc Moens at Antwerp determined the amino acid sequence of a haemoglobin in the parasitic worm Ascaris [8]. This haemoglobin consists of eight identical polypeptides, each containing two haem-binding domains in tandem. When I wondered whether their structure really was like that of mammalian globins I found that hydrophilic residues were excluded from the same set of sites, with the exception of the N-terminal residues of each domain which show no sign of being helical as in mammalian globins.

There have been many instances of structural homology between pairs of proteins that fulfil different functions and exhibit little sequence homology. A recent striking example is the structural homology between rabbit-muscle actin and the ATP-ase fragment of the bovine HSP70 heat-shock protein [9]. Actin contains 375 and HSP70 386 amino acid residues.
Superposition of the 241 pairs of structurally equivalent sites in the two proteins revealed only 39 chemical identities. Among the 241 structurally equivalent sites are 93 with accessible surface areas of less than 20 Å². (This figure was chosen because it applies to the invariant non-polar sites in the globins.) Of these sites 67 are occupied by non-polar residues in both actin and HSP70; 17 sites are occupied by a non-polar residue in one protein and by serine or threonine in the other; six sites are occupied by serines or threonines in both proteins. This leaves only three sites where the rule is violated; this usually happens when sidechains can take up two alternative conformations, so that the non-polar one is largely buried, while only the aliphatic stem of the polar one is buried and the polar group itself reaches the solvent. The result suggests that an appropriately programmed algorithm might have been able to detect the structural homology of the two proteins, despite the very different lengths of their external loops.

Electrostatic Effects in Enzyme Catalysis

In a Friday Evening Discourse at the Royal Institution in 1948, Linus Pauling said: ‘I think that enzymes are molecules that are complementary in structure to the activated complexes of the reactions that they catalyse, that is, the molecular configuration that is intermediate between the reacting substances and the products of the reaction’ [10]. His prediction was borne out in 1965 by lysozyme, the first enzyme structure to be solved.
That structure showed the transition state of the substrate to be stabilized by the strong electric field of two carboxylates on either side of the active site cleft [11]. I generalized that finding with the statement: ‘We may now ask ourselves why chemical reactions, which normally require powerful organic solvents or strong acids and bases, can be made to proceed in aqueous solution near neutral pH in the presence of enzyme catalysts. Organic solvents have the advantage over water of providing a medium of low relative permittivity, in which strong electrical interactions between the reactants can take place. The non-polar interiors of enzymes provide the living cell with the equivalent of the organic solvents used by the chemists. The substrate may be drawn into a medium of low relative permittivity in which strong electrical interactions between it and specific polar groups of the enzyme can occur’ [12]. Recent measurements have shown that the relative permittivities of protein interiors are between 40 and 50, rather than 4 as I then believed. Instead, as Warshel [13] will tell us, the protein dipoles tend to be oriented so as to reinforce the field created by the ionized amino acid sidechains in the active sites of enzymes. Pepsin offers a striking example.

Pepsin

Pepsin is most active at pH 2.0, the pH in the stomach, at which most other proteins are denatured. Moreover, it carries a net negative charge even at pH 1.0 and has an isoelectric point below 1.0. It was a triumph of structure analysis to explain these remarkable properties. When surrounded by water, external carboxylates of proteins normally have pKas between 3.6 and 4.5, so that one would expect them to be protonated and discharged at pH 1.0. On the other hand, internal carboxylates can have their pKas shifted, sometimes by several units, by the electric fields of neighbouring protein atoms or ions.
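The expectation that an ordinary carboxylate is discharged at pH 1.0 can be checked with a short calculation. The sketch below is illustrative only and is not part of the lecture: it assumes that a single, non-interacting titratable group obeying the Henderson-Hasselbalch relation is an adequate model, and the function name is hypothetical. The pKa values 3.6 and 4.5 are those quoted above; 1.2 is the value the lecture gives for the proton-donor aspartate of pepsin.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a carboxyl group present as the charged carboxylate.

    Assumes an isolated group obeying Henderson-Hasselbalch:
    [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Solvent-exposed carboxylates (pKa 3.6 to 4.5) versus a strongly
    # down-shifted buried one (pKa 1.2), at pH 1.0 and pH 2.0.
    for pka in (4.5, 3.6, 1.2):
        for ph in (1.0, 2.0):
            frac = fraction_deprotonated(pka, ph)
            print(f"pKa {pka:.1f}, pH {ph:.1f}: {100 * frac:.2f}% ionized")
```

Under this simple model an exposed carboxylate carries well under one per cent of a charge at pH 1.0, whereas a group whose pKa has been lowered to about 1 keeps a large part of its charge, which is the effect the structural argument relies on.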
Their pKas may be raised by neighbouring anions and lowered by neighbouring cations or by hydrogen-bond donors such as NH and OH which repel other protons. Porcine pepsin, which was the first mammalian acid proteinase whose structure became accurately known, contains 29 aspartates and 13 glutamates, but only two arginines, one lysine and one histidine in its 326 amino acid-long polypeptide chain. Eleven of the aspartates and glutamates are buried, either wholly or partly, two of them in the active-site cleft. The carboxylates of the two aspartates in that cleft are hydrogen bonded to each other, which lowers the pKa of the proton donor to 1.2 and raises that of the acceptor to 4.7. Both carboxylates accept hydrogen bonds from the neighbouring water molecule and from NHs and OHs of the protein (Fig. 3). One of the other buried aspartates has its carboxylate hydrogen bonded to an arginine, and another is close enough to an arginine to have its negative charge partly compensated. The remainder accept hydrogen bonds from NHs, OHs or from buried water molecules which repel protons and therefore stabilize their negative charges [14]. NH···O⁻ and OH···O⁻ bonds can be very strong; they may provide enough energy to keep the structure of pepsin intact at low pH. Its strong overall negative field reinforces the proteolytic activity of its catalytic centre.

Fig. 3 Structure of pepsin. (a) Stereodrawing of the tertiary structure of rhizopus pepsin. The catalytic site lies in the large cleft. (b) Pair of aspartates forming the catalytic site of pepsin. The pKas of the aspartates are lowered by hydrogen bonds from two main-chain NHs and by the OHs of a serine and a threonine. The distances marked are in Å. (Reproduced, with permission, from K. Suguna, R. B. Bott, E. A. Padlan, E. Subramanian, S. Sheriff, G. H. Cohen and D. R. Davies, J. Mol. Biol., 1987, 196, 884)

Haem Proteins

At about the time when Pauling delivered his discourse, my mentor David Keilin asked me: ‘Haemoglobin, peroxidase and catalase all contain the same haem. What gives them their very different properties?’ Keilin’s tone implied that as a structural chemist I ought to know the answer, but I was clueless. Had I had more imagination, I ought to have replied that these different properties were likely to be due to the different electrostatic effects exercised on the haem by the different proteins.

In all haem proteins, the haem iron is linked to the protein by amino acid residues that are electron donors: histidine in haemoglobin and peroxidase, tyrosine in catalase, cysteine in cytochrome P450 and methionine in cytochrome c. In oxyhaemoglobin the electron donation has been measured by replacing the haem iron by cobalt. While iron oxyhaemoglobin is diamagnetic, cobalt oxyhaemoglobin is paramagnetic with a spin of S = 1/2. EPR measurements showed that ca. 60% of the cobaltous spin density is transferred to the bound oxygen molecule [15].

Fig. 4 Stereodrawing of the active site of cytochrome c peroxidase, showing on the proximal side of the haem the haem-linked His-175 hydrogen bonded to Asp-235 and Trp-191; parallel to the imidazole ring of the histidine and flanking it on the right is the indole ring of Trp-191, itself backed by contact with the sulfur atoms of Met-230 and -231. On the distal side of the haem are Arg-48, His-52 and Trp-51. (Reproduced, with permission, from S. L. Edwards, J. Kraut and T. L. Poulos, Biochemistry, 1988, 27, 8078. Copyright 1988 American Chemical Society.)

Peroxidases catalyse the oxidation of a variety of substrates by peroxides. To accomplish this oxidation, they split the peroxide ion into a hydroxyl ion and an oxygen atom that becomes bound to the iron as a ferryl oxide.
This splitting requires the transfer of two electrons from the haem iron to the peroxide ion. The transfer proceeds in two distinct steps. In the first step an electron from the haem iron is transferred to the iron-bound oxygen atom, with formation of Fe4+O; in the second step an electron from the protein is transferred to the leaving OH with formation of a free radical. The electrostatic field required to do this is provided by appropriate charged residues on either side of the deeply buried haem. On the proximal side, electron donation by the iron-bound histidine is reinforced by its hydrogen bond to an aspartate that is also hydrogen bonded to the electron-rich indole sidechain of a tryptophan. The indole ring of the tryptophan covers the imidazole ring of the histidine, suggesting that they could form a charge-transfer complex in the transition state of the enzyme, allowing the tryptophan, together with two neighbouring methionines, to form the free radical. On the distal side of the haem an arginine acts as an electron-attractant and a histidine could act as an acid-base catalyst. The field produced by the buried aspartate and arginine seems to mediate the transfer of the electrons, first from the haem and then from the indole sidechain of the tryptophan to the substrate. In this way the protein provides an electrostatic push-me-pull-you (Fig. 4) [16,17].

In catalase the haem iron is bound to the phenolate ion of a tyrosine sidechain that is also hydrogen bonded to the guanidinium sidechain of an arginine [18]. Its catalytic mechanism remains unclear despite detailed knowledge of its structure.

Cytochromes P450 solubilize hydrocarbons by turning them into epoxides in the presence of oxygen. Their buried haem iron is linked to the thiolate of a cysteine that acts as a strong electron donor, but no other polar residues on either side of the haem seem to be needed to reinforce the thiolate’s action. The enzymes extract two electron equivalents from reduced pyridine nucleotides to cleave dioxygen into water and an oxygen atom that first binds to the iron, forming an Fe4+O complex, and is then inserted into the substrate’s hydrocarbon bond. I had often wondered how the enzyme activates hydrocarbons. Poulos and Kraut’s structure of a cytochrome P450 from a camphor-metabolizing species of Pseudomonas shows that the enzyme does no such thing. The camphor molecule lies undistorted in a largely hydrophobic pocket, tied down there only by one hydrogen bond from a tyrosine hydroxyl to its carbonyl oxygen. All the enzyme does is to force the substrate’s oxidizable hydrocarbon bond close to the iron-bound oxygen atom. In P450, therefore, it is not the substrate, but the enzyme itself that assumes a reactive transition state. The substrate is merely a passive victim of the reactive oxygen atom generated by the strong field in the deeply buried, water-free haem pocket of the enzyme (Fig. 5) [19].

Fig. 5 Stereodrawing of the active site of cytochrome P450 from Pseudomonas putida showing the hydrophobic environment of the substrate-binding site. The carbon atoms of the camphor substrate are shown as full black circles. (Reproduced, with permission, from T. L. Poulos, B. C. Finzel, I. C. Gunsalus, G. C. Wagner and J. Kraut, J. Biol. Chem., 1985, 260, 16122)

I shall not talk to you about allostery or induced fit since Johnson [20] and Schulz [21] will show examples of both in this Discussion; instead I should like to draw your attention to a new type of protein control mechanism predicted by Robin Carrell and recently verified both chemically and crystallographically.

Serpins: a New Control Mechanism

The health of the lungs depends on proper function of a serine proteinase inhibitor in blood plasma named, misleadingly, α1-antitrypsin, or better, α1-antiproteinase. This inhibits neutrophil elastase, the chief matrix-cleaving proteinase secreted by activated leukocytes. It is the best-studied example of the family of serpins, to which antithrombin and plasminogen-activator inhibitor also belong. It is a glycoprotein of 394 amino acid residues in a single chain. Its specificity for elastase is determined by a methionine. The structure of the intact protein is still unknown, but it can be inferred from that of its inactive homologue, egg albumin, solved here in Cambridge by Penny Stein and Andrew Leslie [22]. The structure is made of two layers of pleated sheets, partly covered by α-helices. Protruding from its compact structure like a lamp filament is an α-helical loop (Fig. 6). Its counterpart in α1-antiproteinase is the peptide loop that contains the active methionine, but to fit into the active site cleft of leukocyte elastase, the loop in α1-antiproteinase should be extended rather than helical.

Fig. 6 Chain tracing of hen egg albumin. The black line marks the α-helical loop that occupies the position equivalent to the active sites of the serpins. The active methionine of α1-antitrypsin would be at the top left of the helix, although its loop is probably not helical. (Courtesy Drs. A. Strand and E. J. Goldsmith)

Fig. 7 Stereoview of a ribbon model of cleaved α1-antiproteinase. Proteolysis has broken the peptide bond between Ser-359, shown at the top, and the active Met-358 at the bottom of this diagram. The active loop, shown in black, has inserted itself between the strands of the pleated β-sheet, removing the methionine 70 Å from its link to the serine in the active inhibitor. (Reproduced, with permission, from R. A. Engh, H. T. Wright and R. Huber, Protein Eng., 1990, 3, 470)
The bond between the active methionine and the serine that follows it is easily cleaved; the structure of that cleaved form was solved in Huber’s laboratory near Munich. It shows cleavage to be followed by a drastic and irreversible change of structure which separates the active methionine and the serine from each other by ca. 70 Å; the active-site loop becomes incorporated in the six-stranded pleated sheet of the inhibitor in the form of an additional β-strand (Fig. 7) [23].

The biological significance of this structural change was first suspected and then proved chemically in α1-antiproteinase, by Carrell’s group here at Cambridge; it was then confirmed crystallographically in plasminogen activator inhibitor by Goldsmith’s group at Dallas, Texas. Carrell predicted that serpins can take up two alternative conformations: one in which the active-site loop is exposed, and another in which it is withdrawn between the strands of the β-sheet. Chemical experiments showed active α1-antiproteinase to bind peptides that have the same sequence as their active loop [24]. X-ray crystallography proved that the inactive intact plasminogen activator inhibitor has its active loop tucked away between the β-strands of the pleated sheet, just as the chemical experiments had suggested (Fig. 8) [25]. In vivo, plasminogen activator inhibitor is kept in its active form by combination with the 54 kD protein vitronectin, but when freed from vitronectin, the active peptide is folded back to its stable, inert position. Similarly antithrombin is active only when combined with heparin; without that it snaps back to its inert form.

Antithrombin and plasminogen activator inhibitor seem to have evolved a spring-loaded safety catch that makes them revert to their latent, stable, inactive form unless the catch is kept in the ‘loaded’ position by another molecule: antithrombin by heparin, PAI by vitronectin, any serpin by the proteinase that it is designed to inhibit. Only when that safety catch is in the tense position is the active site loop of these serpins exposed and ready for action; otherwise it snaps back and hides inside the protein. α1-antitrypsin may exist in an equilibrium between the two forms in plasma even when free. Spring-loaded protein molecules represent a remarkable new biological control mechanism.

Fig. 8 Chain tracing of inactive human plasminogen activator inhibitor. The active loop (black) is folded away in the β-sheet as in the cleaved α1-antiproteinase. Its cleavage takes place between Arg-358 and Met-359 which would be near the arrowhead in this form. (Courtesy Drs. A. Strand and E. J. Goldsmith)

References
1 H. S. Frank and M. W. Evans, J. Chem. Phys., 1945, 13, 507.
2 P. L. Privalov and S. J. Gill, Adv. Protein Chem., 1988, 39, 191.
3 W. Kauzmann, Adv. Protein Chem., 1959, 14, 1.
4 W. A. Hendrickson and M. M. Teeter, Nature (London), 1981, 290, 107.
5 C. Chothia, Nature (London), 1975, 254, 304.
6 A. R. Fersht, A. Matouschek, J. Sancho, L. Serrano and S. Vuilleumier, Faraday Discuss., 1992, 93, 183.
7 M. F. Perutz, J. C. Kendrew and H. C. Watson, J. Mol. Biol., 1965, 13, 669.
8 I. de Baere, L. Liu, L. Moens, J. van Beeumen, C. Gielens, J. Richelle, C. Trotman, J. T. Finch, M. Gerstein and M. F. Perutz, Proc. Natl. Acad. Sci. USA, 1992, 89, in the press.
9 K. M. Flaherty, D. B. McKay, W. Kabsch and K. C. Holmes, Proc. Natl. Acad. Sci. USA, 1991, 88, 5041.
10 L. Pauling, Nature (London), 1948, 161, 707.
11 C. C. F. Blake, D. F. Koenig, G. A. Mair, A. C. T. North, D. C. Phillips and V. R. Sarma, Nature (London), 1965, 206, 757.
12 M. F. Perutz, Proc. R. Soc. London Biol. Sci. B, 1966, 167, 448.
13 A. Warshel, J. K. Hwang and J. Aqvist, Faraday Discuss., 1992, 93, 225.
14 A. R. Sielecki, A. A. Fedorov, A. Boodhoo, N. S. Andreeva and M. N. G. James, J. Mol. Biol., 1990, 214, 143.
15 R. K. Gupta, A. S. Mildvan, T. Yonetani and T. S. Srivastava, Biochem. Biophys. Research Commun., 1975, 67, 1005.
16 B. C. Finzel, T. L. Poulos and J. Kraut, J. Biol. Chem., 1984, 259, 13027.
17 S. L. Edwards, J. Kraut and T. L. Poulos, Biochemistry, 1988, 27, 8074.
18 M. R. N. Murthy, T. J. Reid III, A. Sicignano, N. Tanaka and M. G. Rossmann, J. Mol. Biol., 1981, 152, 465.
19 T. L. Poulos, B. C. Finzel, I. C. Gunsalus, G. C. Wagner and J. Kraut, J. Biol. Chem., 1985, 260, 16122.
20 L. N. Johnson, S-H. Hu and D. Barford, Faraday Discuss., 1992, 93, 131.
21 G. E. Schulz, Faraday Discuss., 1992, 93, 85.
22 P. E. Stein, A. G. W. Leslie, J. T. Finch and R. W. Carrell, J. Mol. Biol., 1991, 221, 941.
23 H. R. Loebermann, R. Tokuoka, J. Deisenhofer and R. Huber, J. Mol. Biol., 1984, 177, 531.
24 R. W. Carrell, D. Ll. Evans and P. E. Stein, Nature (London), 1991, 353, 576.
25 J. Mottonen, A. Strand, J. Symersky, R. M. Sweet, D. E. Danley, K. F. Geoghegan, R. D. Gerard and E. J. Goldsmith, Nature (London), 1992, 355, 270.

Paper 2/01778D; Received 3rd April, 1992
ISSN:1359-6640
DOI:10.1039/FD9929300001
Publisher: RSC
Year: 1992
Data source: RSC
|
2. |
Substrate recognition by proteinases |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 13-23
Simon J. Hubbard,
|
|
|
Abstract:
Faraday Discuss., 1992, 93, 13-23

Substrate Recognition by Proteinases

Simon J. Hubbard and Janet M. Thornton*
Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College, Gower Street, London WC1E 6BT, UK

Simon F. Campbell
Department of Discovery Chemistry, Pfizer Central Research, Sandwich, Kent, UK

The molecular recognition of limited proteolytic site substrates by serine proteinases has been compared and contrasted to the recognition of serine proteinase inhibitors, utilising the coordinate sets contained in the Brookhaven Protein Databank. Most families of these inhibitors are known to possess a structurally conserved recognition motif at their reactive site-binding loops. Structural comparisons with trypsin limited proteolytic sites revealed that the in situ conformation of these substrates bears little resemblance to the inhibitor-binding loops. Assuming that both inhibitors and substrates bind to the proteinase in the same manner, segmental mobility would be required to permit substrates to adopt an ‘inhibitor-like’ binding conformation, which is presumed to be necessary for proteolysis. Modelling experiments have been conducted to attempt to introduce such a conformation into tryptic limited proteolytic segments of the native proteins, to test the ability of the limited proteolytic sites to alter their geometry. Further to this, the conformational parameters of accessibility, protrusion, mobility and secondary structure have been analysed and incorporated into a predictive algorithm to assign likely limited proteolytic sites within native protein structures.

Serine proteinases are a family of proteolytic enzymes which are well characterised in terms of their structure, function and catalytic mechanism. The recognition which these enzymes exhibit may be classified into two categories: the recognition shown towards inhibitors and that shown towards their substrates.
The former has been widely examined, as there are a number of crystallographic studies at atomic resolution of inhibitors in their native forms, and complexed to serine proteinases. The inhibitors recognise proteinases generally via a conserved structural motif, being that of an extended binding loop situated about the putative reactive site bond itself [1,2]. The inhibitors bind extremely tightly to the enzyme, but are cleaved very slowly [2]. The substrates of the proteinases are not characterised so well in terms of their recognition conformations. The protein substrates of proteinases are termed limited proteolytic sites or ‘nick sites’, which are sites of specific fission amongst the many in a native protein fold. There are no crystallographic studies of proteinase-nick site complexes, although the structures of proteins containing known nick sites are available at atomic resolution.

In our earlier studies [3] the conformations of the recognition regions of the inhibitors and nick sites were compared. This confirmed that the inhibitor binding loops share a common main-chain conformation as observed by many authors [1,2,4,5]. This was shown by comparisons of main-chain torsion angles and least-squares fitting of the inhibitor-binding loops [see Fig. 1(a)]. The main-chain segment from P3-P3′ (see ref. 6 for notation) was the most structurally conserved. Comparisons of this segment from bovine pancreatic trypsin inhibitor (BPTI) with a dataset of tryptic nick sites revealed that they were structurally very different [see Fig. 1(b) and 1(c)]. These results suggested that segmental mobility would play an important part in the recognition of such substrates.

Fig. 1 Superpositions onto BPTI. (a) Stereo superposition of serine proteinase inhibitor binding loops upon BPTI (from the Brookhaven dataset 2PTC). BPTI is shown in bold. All inhibitors shown from P,-Pi, although superpositions are based upon fitting from P3-P3′. The other inhibitors are taken from the Brookhaven entries 1TGS, 3SGB, 1CSE, 2SNI and 4SGB. (b) Stereo superposition of staphylococcal nuclease P1 = Lys-48 nick site onto BPTI (shown in bold). Main-chain atoms only are shown. (c) Stereo superposition of elastase P1 = Arg-125 nick site onto BPTI (shown in bold). Main-chain atoms only are shown.

The basic premise for making these structural comparisons between nick sites and inhibitors is that the nick sites must adopt an ‘inhibitor-like’ conformation in order to be cleaved. This is supported by the fact that the inhibitors are generally considered to be ‘idealised substrates’. Indeed, BPTI is rapidly digested by Dermasterias imbricata (starfish) trypsin. Hence, the nick sites would be expected to adopt a very similar conformation local to the scissile peptide, in order to be cleaved. To this end, a considerable degree of conformational flexibility would be required.

In order to assess the extent of this proposed mobility, various modelling experiments have been conducted involving simple least-squares fitting and superposition of the tryptic nick sites onto BPTI, and loop-closure modelling. The former gives an idea of the extent of steric clashing occurring between nick site protein and enzyme, and the latter examines the ability of the nick site segments to take up ‘inhibitor-like’ conformations in situ.

The identification of limited proteolytic sites from structure has been attempted, in light of the reported dominance of conformational parameters such as accessibility and mobility as principal determinants for proteolysis. The protrusion of amino acids as well as their secondary structural assignments [12] are also assayed and combined into a predictive algorithm.

Table 1 Tryptic limited proteolytic sites used in this study, and their RMSDs to BPTI

protein          P1        Br code   resolution/Å   RMSD to BPTI (P3-P3′)   ref. (proteolytic)   ref. (structural)
staph nuclease   Lys-5     2SNS      1.5            1.59                    13,14                19
                 Lys-48                             2.06
                 Lys-49                             1.81
ribonuclease     Lys-31    5RSA      2.0            3.07                    15                   20
                 Arg-33                             2.54
trypsinogen      Arg-117   1TGN      1.65           2.31                    16                   16,21
                 Lys-145                            1.35
elastase         Arg-125   3EST      1.65           1.61                    17                   22
calmodulin       Lys-77    3CLN      2.2            3.00                    18                   23

Data

The dataset of tryptic-limited proteolytic sites used in this study is listed in Table 1. These nick sites are well documented in the literature, are not general degradative sites, and are cut by trypsin in native or near-native conditions. The proteins from which they are taken have been solved to a high resolution.

Methods

Least-squares Fitting and Superpositions

The least-squares fitting algorithm of McLachlan was used throughout this work, fitting on backbone atoms exclusively (α-carbons, main-chain nitrogens and carbonyl carbons). For the superpositions of inhibitors upon inhibitors, and nick sites upon BPTI, the equivalent atoms from P3-P3′ were fitted and superposed. To assess the significance of the root-mean-square deviations (RMSDs) returned, the distribution of RMSDs obtained when fitting ‘random loops’ against each other was calculated. This was done by selecting all ‘loops’ from a dataset of 20 non-homologous proteins which had comparable Cα-Cα distance matrices to that of a search template, in our case BPTI, using an in-house database management program 3D-SCAN, written by S. P. Gardner. A large tolerance value of 1.9 Å was applied to each Cα-Cα distance in the search, resulting in all segments of structure with broadly ‘U-shaped’ conformations being extracted, i.e. effectively random ‘loops’. Initially, many sequentially adjacent ‘loops’ were extracted, and only the first ‘loop’ in any such grouping was further considered. The distributions of RMSDs obtained when fitting the P3-P3′ segments of ‘loops’ against each other were then calculated.
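The ‘random loop’ search described above can be sketched in a few lines. This is an illustration only and is not the 3D-SCAN program: the function names are hypothetical, and ‘sequentially adjacent’ hits are assumed to mean windows whose start residues are consecutive. A window matches when every Cα-Cα distance agrees with the template to within the tolerance (1.9 Å in the text).

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def distance_matrix(ca: Sequence[Point]) -> List[List[float]]:
    """All pairwise C-alpha to C-alpha distances for a segment."""
    return [[math.dist(a, b) for b in ca] for a in ca]


def matches_template(window: Sequence[Point],
                     template_dm: List[List[float]],
                     tol: float) -> bool:
    """True if every inter-C-alpha distance is within tol of the template's."""
    dm = distance_matrix(window)
    n = len(template_dm)
    return all(abs(dm[i][j] - template_dm[i][j]) <= tol
               for i in range(n) for j in range(i + 1, n))


def find_loops(ca_trace: Sequence[Point],
               template: Sequence[Point],
               tol: float = 1.9) -> List[int]:
    """Start indices of matching windows in a C-alpha trace.

    Assumption: hits with consecutive start indices form one grouping,
    and only the first hit of each grouping is kept.
    """
    n = len(template)
    template_dm = distance_matrix(template)
    hits = [s for s in range(len(ca_trace) - n + 1)
            if matches_template(ca_trace[s:s + n], template_dm, tol)]
    kept: List[int] = []
    previous = None
    for start in hits:
        if previous is None or start != previous + 1:
            kept.append(start)
        previous = start
    return kept
```

Because only internal distances are compared, the search needs no superposition; the RMSDs themselves would then come from a separate least-squares fit of the retained segments.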
The RMSD values obtained when fitting inhibitors against inhibitors and tryptic nick sites against BPTI were then compared to this distribution.

The steric clashing obtained when ‘docking’ the nick sites into trypsin was assessed by superposing the entire nick site protein onto BPTI, using the rotation and translation matrices obtained after fitting the P3-P3′ segment of the nick site region onto the BPTI equivalent. The BPTI co-ordinate set used was that from the Brookhaven entry 2PTC, where the inhibitor is bound by trypsin. After superposing the nick site protein into BPTI co-ordinate space, the resulting intermolecular atomic contacts were calculated.

Proteinase Recognition

Loop-closure Modelling and Filtering

To attempt to model the BPTI conformation into the tryptic nick site proteins listed in Table 1, the loop-closure algorithm of Sklenar et al. (ref. 26), as implemented by F. Eisenmenger, was chosen. The full details of the algorithm and its derivation are reported elsewhere (refs. 26, 27), and only an overview of its application to this problem is given here. To model in the BPTI conformation, the main-chain torsion angles φ and ψ of the nick site region in question are reset to the equivalent values of the BPTI binding loop. If n residues in the loop are considered, then φ and ψ of residues 2, 3, 4, ..., n are reset, a total of 2(n - 1) angles. This results in the ‘loop’ being broken at its C-terminal end, which must be closed. The algorithm accomplishes this by finding a set of torsion angles which close the loop and which are very close to the BPTI torsion angles set into the loop. Hence, a minimal change is introduced to the inhibitor conformation that is set upon closure of the loop. Side chains are ignored during the closure and are replaced in their original conformations upon completion. In practice, such closures are not always possible for a number of reasons.
The φ angle of prolines is expected to remain in the range -30° to -90°, and any closures which introduce a proline φ value outside this range are rejected. Also, because higher-order terms are omitted from a Taylor expansion in the algorithm, the procedure is iterative and the closure process is repeated to find a satisfactory closure. If more than 20 steps are required to do this, the closure is rejected. Alternatively, the algorithm may not be able to find a solution at all, and the closure is rejected.

The closure algorithm itself forms filter 1 in a series of filters designed to reject closures where the modelled conformation is either too distant from the initial BPTI conformation, or intrinsically unfavourable itself. Filter 2 rejects closures where the average torsion angle change across the loop exceeds 40° or the average RMS shift of the co-ordinates exceeds 15.0 Å. Filter 3 rejects closures which have RMSDs higher than 1.0 Å after fitting to BPTI from P3 to P3′. Filter 4 rejects closures which introduce more than 20 atom-atom contacts of less than 2.5 Å into the loop.

For each nick site, 90 closures were attempted, ranging from 6 to 14 residues in length. For each closure length from 6 to 14, all closures were attempted in which the P1 residue was situated within the closure length itself. For example, for a six-residue closure, closures were attempted for P6-P1, P5-P1′, P4-P2′, P3-P3′, P2-P4′ and P1-P5′. In all cases, the main-chain peptide torsion angle ω was kept fixed. The resulting intermolecular contacts with trypsin were calculated when the closed-loop modelled nick site proteins were ‘docked’ into the trypsin active site by superposition onto P3-P3′ of BPTI bound by trypsin, as before.

Analyses of Conformational Parameters and Prediction of Nick Sites

The conformational parameters of accessibility, protrusion index, mobility (crystallographic temperature factors) and secondary structure were calculated and analysed.
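The closure bookkeeping from the loop-closure methods above can be checked with a short Python sketch. It only enumerates the attempted windows and applies the thresholds quoted for filters 2 to 4; it does not perform any loop closure, and the function and argument names are hypothetical.

```python
def closure_windows(min_len=6, max_len=14):
    """Enumerate closure windows that contain P1.

    Each window is (length, n_unprimed, n_primed): n_unprimed residues on the
    P side (Pn..P1) and n_primed residues on the P' side (P1'..Pm').
    """
    windows = []
    for length in range(min_len, max_len + 1):
        for n_unprimed in range(length, 0, -1):
            windows.append((length, n_unprimed, length - n_unprimed))
    return windows

def label(n_unprimed, n_primed):
    """Schechter-Berger style label for a window, e.g. P3-P3'."""
    end = f"P{n_primed}'" if n_primed > 0 else "P1"
    return f"P{n_unprimed}-{end}"

def passes_filters(mean_torsion_change_deg, rms_shift, rmsd_to_bpti, close_contacts):
    """Filters 2 to 4 with the thresholds quoted in the text."""
    return (mean_torsion_change_deg <= 40.0   # filter 2: average torsion change
            and rms_shift <= 15.0             # filter 2: average RMS co-ordinate shift (Å)
            and rmsd_to_bpti <= 1.0           # filter 3: RMSD to BPTI over P3-P3' (Å)
            and close_contacts <= 20)         # filter 4: contacts shorter than 2.5 Å

windows = closure_windows()
six_residue = [label(u, p) for length, u, p in windows if length == 6]
```

Summing the window counts for lengths 6 to 14 gives 6 + 7 + ... + 14 = 90, which agrees with the 90 closures attempted per nick site.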
Accessibility was calculated using an implementation of the Lee and Richards algorithm (ref. 28), using a probe of radius 1.4 Å and atomic van der Waals radii as used by Chothia (ref. 29). Atomic accessibilities were summed over each residue, and then divided by the accessibility that the specific residue has in an extended Ala-X-Ala tripeptide, to produce relative (percentage) accessibilities.

The protrusion index (PI) is a measure of how much each residue protrudes from an equimomental ellipsoid calculated about the protein co-ordinates (refs. 10, 11), and takes a value from 0 to 9. The equimomental ellipsoid is scaled to take in incremental 10% quantities of the total number of atoms in the protein, so that 10% of the atoms are in the central ellipsoid and have a PI of 0, 10% lie between the central ellipsoid and the next and have a PI of 1, and so on. The most protruding 10% of atoms have a PI of 9.

Temperature factors from crystallographic determinations were summed and averaged over each residue, and normalised to 1.0 for each protein.

Secondary structure parameters were calculated using the DSSP program (ref. 12). The assignments were quantified by scoring helix (H) as 0.6, strand (E) as 0.0, and all other Kabsch and Sander assignments as 1.0. This was done because preliminary results, and other reported results (ref. 30), showed that nick sites are rarely found in secondary structural regions, particularly not β-strand.

The values for these assignments were calculated for varying window sizes for all the nick site proteins concerned, and preliminary results showed that a 10-residue window was best suited for identifying nick sites from structure. Hence, the mean value for all four conformational parameters was found for the tryptic nick sites, for all lysines and arginines within the nick site proteins, and for all residues within the nick site proteins. The Lys/Arg subset was used as these are the only two residue types at which trypsin will cleave.
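Two of the per-residue quantities described above are simple enough to sketch in Python: the secondary structure score and the 10-residue window average. The sketch assumes the window is P5-P5′ assigned at P1 (four residues before P1 and five after it, counting P1 itself), and that windows are truncated at the chain ends; the truncation rule and all function names are assumptions, not details from the text.

```python
# Scores quoted in the text: helix (H) 0.6, strand (E) 0.0, everything else 1.0.
SST_SCORE = {"H": 0.6, "E": 0.0}

def sst_scores(dssp_string):
    """Map a string of Kabsch and Sander (DSSP) codes to numerical scores."""
    return [SST_SCORE.get(code, 1.0) for code in dssp_string]

def window_mean(values, before=4, after=5):
    """Mean over a P5-P5' window assigned at P1 (index i covers i-4 .. i+5).

    Windows running off either end of the chain are truncated (an assumption).
    """
    smoothed = []
    for i in range(len(values)):
        segment = values[max(0, i - before): i + after + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed

def relative_accessibility(residue_area, tripeptide_area):
    """Percentage accessibility relative to the residue in Ala-X-Ala."""
    return 100.0 * residue_area / tripeptide_area
```

For example, `window_mean(sst_scores(dssp_string))` gives a smoothed secondary structure track in which helical and strand regions score low and coil regions score close to 1.0.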
In addition to those listed in Table 1, the following proteins were used for the calculation of the mean values for Lys/Arg residues and residues in general: ovalbumin (co-ordinates courtesy of A. Leslie), aspartate aminotransferase (2AAT), thermolysin (3TLN), flavocytochrome b2 (1FCB), thioredoxin (1TRX) and glutamine synthetase (2GLS). These proteins all contain nick sites, but not necessarily tryptic ones, and were used to balance the calculations.

Predictions of the nick sites were made by averaging the four conformational parameters over a 10-residue window (P5-P5′, assigned at P1), normalising these values to 1.0 for each protein, and finally averaging the four conformational parameter scores for each residue. The final prediction score was again normalised to 1.0 for each protein predicted.

Results and Discussion

Least-squares Fitting and Superpositions

The superpositions shown in Fig. 1(a) illustrate just how well the main-chain conformation of the serine proteinase inhibitor binding loop is conserved. Although there is some variation in side-chain conformation, the main-chain conformation from P3-P3′ remains conserved. BPTI makes an excellent template for comparison with the tryptic nick sites, since it is the most potent protein inhibitor of trypsin, as well as possessing the conserved recognition motif.

In contrast to the inhibitor-inhibitor fits and superpositions, the nick sites show no structural resemblance to BPTI. Two example superpositions are shown in Fig. 1(b) and (c). Individual RMSDs to BPTI from P3-P3′ are shown in Table 1 for each nick site. The values span a broad range from 1.35 to 3.07 Å, which is indicative of the different conformations observed in the nick sites. This is confirmed when the nick site RMSDs are compared to those observed when fitting ‘random loops’ against themselves and the inhibitors against themselves, as shown in Fig. 2.
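The prediction scoring described in the methods (per-protein normalisation of each window-averaged parameter, averaging the four parameters, and normalising again) can be sketched in Python as follows. The text does not say how values are ‘normalised to 1.0’; this sketch assumes division by the largest value in the protein, and the function names are hypothetical.

```python
def normalise(values):
    """Scale so that the largest value in the protein is 1.0 (assumed convention)."""
    top = max(values)
    return [v / top for v in values] if top > 0 else list(values)

def prediction_scores(tracks):
    """Combine window-averaged parameter tracks into one score per residue.

    tracks: equal-length lists, one per conformational parameter
    (accessibility, protrusion index, B value, secondary structure score).
    """
    normed = [normalise(track) for track in tracks]
    n_residues = len(normed[0])
    combined = [sum(track[i] for track in normed) / len(normed)
                for i in range(n_residues)]
    return normalise(combined)

def rank_sites(scores, sequence, template="KR"):
    """1-based rank of each template-matching residue, highest score first."""
    candidates = [i for i, aa in enumerate(sequence) if aa in template]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    return {i: rank for rank, i in enumerate(candidates, start=1)}
```

Passing `template="KR"` restricts the ranking to lysines and arginines, the sequence template for trypsin; passing all twenty one-letter codes ranks every residue.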
The nick site RMSDs to BPTI span the distribution over the most heavily populated region, indicating that they are no more like BPTI in structure than a random loop. Equally, the inhibitor-inhibitor RMSDs form a tight group at the lower end of the distribution, indicating the strong conservation of the structure of the inhibitors.

When the observed nick site P3-P3′ regions are superposed onto BPTI bound into the trypsin active site, many steric overlaps are generated in the nick site-trypsin model which would prevent such ‘complexes’ from forming (see Table 2). This also suggests that gross concerted motions involving the whole recognition loop would be required to permit the docking of nick site proteins, as well as torsion angle changes proximal to the scissile peptide. As suggested previously, such motions as ‘hinge bending’ might be required to orient the nick site segment in such a way as to allow it to enter the trypsin active site. This is illustrated in Fig. 3(a), where the nick site segment about Arg-33 has been ‘docked’ into trypsin. Although the end of the helical segment containing Arg-33 fits into the trypsin active site reasonably comfortably, a large section of the rest of ribonuclease clashes sterically with trypsin itself.

Fig. 2 Distribution of RMSDs of ‘random loops’ fitted against each other over six residues (P3-P3′). The i-i line represents the range of RMSD values from (serine proteinase) inhibitor against inhibitor fits (see ref. 3) and n-n for nick site against BPTI fits

Table 2 Intermolecular contacts generated upon superposition ‘docking’ of nick site proteins into the trypsin active site

nick site        Cα-Cα contacts < 2.0 Å   all atom contacts < 2.0 Å
2SNS Lys-5        2                        310
2SNS Lys-48       0                         53
2SNS Lys-49       0                         29
5RSA Lys-31       4                        466
5RSA Arg-33      17                        957
1TGN Arg-117     23                        289
1TGN Lys-145      2                        228
3EST Arg-125      1                        305
3CLN Lys-77       6                        408

Other nick sites followed this pattern, in that they could not be ‘docked’ satisfactorily into the enzyme active site in their current conformations. There were some exceptions. The two staphylococcal nuclease sites at Lys-48 and Lys-49 are situated at the tip of a highly exposed loop segment, protruding from the end of the molecule. When either of these sites was docked in the usual way, no bad steric clashing was introduced between the nick site protein and trypsin. The other sites fall somewhere between these sites and the worst case, the ribonuclease Arg-33 site. Thus an inspection of nick site conformations shows that the types of motion required to produce a cleavable nick site are: (1) local torsion angle changes in the nick site segment, which would be expected to be large (ref. 3), and (2) grosser movements to manoeuvre the nick site segment away from the bulk of the protein.

Closed-loop Modelling

The problem is now to change the conformation of the nick site segment, not just locally, by some modelling protocol so that it may form a legitimate ‘complex’ with the enzyme. This must be accomplished without distorting the overall fold of the protein. The closed-loop modelling protocol described in the methods section was applied to this task.

Fig. 3 (a) Superposition ‘docking’ of the ribonuclease Arg-33 nick site into the trypsin active site. Ribonuclease is shown in bold. Both molecules are shown as Cα traces. (b) Closed-loop model of ribonuclease nick site Arg-33.
The main-chain torsions of residues 27 to 39 were altered. The new ‘closed’ conformation of the loop is shown in bold

Table 3 Loop-closure passes and modelling data

Columns: protein four-letter code and P1 residue number; closure passes out of 90; RMSD to BPTI after closure (P3-P3′)/Å; intermolecular contacts (<2.0 Å, all atom); prediction hit numbers (a) (Lys/Arg, all residues)

2SNS Lys-48      11    0.57     10     1
2SNS Lys-49       7    0.57      7     2
5RSA Lys-31       1    0.99     29    34
5RSA Arg-33       3    0.68     11    58
1TGN Arg-117     13    0.49      5    36
1TGN Lys-145      4    0.65    111     5
3EST Arg-125      0    -        -     56
3CLN Lys-77       4    0.55     37     3

(a) Hit numbers refer to the ranking of that nick site amongst all residues in that protein, when considering only lysines and arginines (first column) and all residues (second column), assigning prediction scores as detailed in the methods.

Table 4 Mean conformational parameters for residue subsets from 11 nick site proteins

parameter                 all residues   Lys/Arg        tryptic nick sites
accessibility (a)         29.4 (50%)     32.3 (40%)     45.4 (11%)
protrusion index          4.3 (47%)      4.5 (43%)      5.9 (23%)
SST parameter (b)         0.65 (62%)     0.67 (61%)     0.86 (22%)
B values (normalised)     0.54 (30%)     0.58 (25%)     0.74 (10%)

All values listed are mean values over 10-residue (P5-P5′) segments which are assigned at the appropriate residue (P1), and which are then averaged again for the appropriate category, i.e. tryptic nick sites, Lys/Arg, or all residues. Values in brackets represent the proportion of the dataset of 11 proteins with higher values for each mean listed. (a) Relative accessible surface area. (b) Secondary structure assignment scores were assigned to Kabsch and Sander (ref. 12) assignments as follows: H (helix) = 0.6, E (strand) = 0.0, all others (coil) = 1.0.

The numbers of successfully closed loops that passed all four filtering requirements, for each nick site, are listed in Table 3.
For each nick site, with the exception of elastase Arg-125, at least one successful closure was produced. For each site, the closure with the lowest RMSD to BPTI is listed in Table 3, with the number of intermolecular contacts generated after the usual ‘docking’ into trypsin. The RMSD values are smaller than before closure, and are comparable with the inhibitor-inhibitor RMSDs, as shown in Fig. 2. Examination of the closed-loop models, and their superpositions onto BPTI, using molecular graphics confirmed that a typical inhibitor main-chain conformation had been successfully modelled into the nick site segment. A representative closure is shown in Fig. 3(b) for the ribonuclease Arg-33 site. The short helical segment has been partially broken, and now protrudes most strongly from the protein body into a more ‘enzyme-presentable’ conformation.

The numbers of ‘bad’ contacts (<2.0 Å) generated upon ‘docking’ of the closed-loop nick site proteins are universally smaller than observed before changing the loop conformation. The modelling of the BPTI conformation produces a more exposed, protruding loop that is better suited to insertion into the trypsin active site. However, for some putative modelled ‘complexes’, the level of steric clashing would still prevent the nick site from entering the enzyme active site. Certainly for the trypsinogen Lys-145 and calmodulin Lys-77 sites, grosser conformational changes such as ‘hinge-bending’ would still be required in order to better present the modelled nick site segment for cleavage.

Assessment of Conformational Parameters and Nick Site Prediction

As detailed in the methods section, four conformational parameters were assessed as predictors for limited proteolysis. The results of this analysis are given in Table 4.
For each conformational parameter, averaged over a 10-residue (P5-P5′) window, the mean value of these smoothed parameters (assigned at P1) is evaluated for: (1) all residues at P1, (2) all lysines and arginines at P1 and (3) known tryptic nick sites (from Table 1) at P1. These mean values, derived for 11 nick site-containing proteins, are listed in Table 4. The percentage of residues in the 11-protein dataset with higher scores than each mean is also listed in brackets. For all parameters, the tryptic nick sites have higher mean scores than residues in general, and than lysines and arginines in general. This is particularly significant for the SST parameter assignments, as lysines and arginines possess almost identical mean scores to residues in general (62% of the dataset have higher), whilst the nick sites have a much higher score (only 22% of the dataset scores higher). Similarly for B values, only 10% of the dataset possesses a higher mean score than the nick sites.

Fig. 4 Nick site prediction profiles for (a) staphylococcal nuclease, (b) trypsinogen and (c) ribonuclease. The normalised overall prediction score is plotted along the sequence for each protein, and the nick sites are indicated by arrows. The secondary structure of each protein is represented beneath each profile. Boxes represent β-strand, zig-zags α-helix and straight lines coil, where the secondary structural states are defined by the method of Kabsch and Sander (ref. 12)

Accessibility (ref. 8) and mobility (ref. 9) (high B values) have already been suggested as good predictors for limited proteolysis, and protrusion has also been considered. The SST parameters also clearly possess some discriminatory power, as shown in Table 4, and all four parameters were combined into the prediction algorithm as discussed in the methods. The overall normalised prediction scores for three nick site proteins are shown in Fig. 4. It is clear from this figure that the method predicts many peaks, not all of which contain a tryptic nick site. However, in the examples shown the predictions are quite good. The two staphylococcal nuclease sites at 48 and 49 are situated on the highest peak in the prediction profile, and indeed all the nick sites shown score above 0.5 and are situated broadly in peaks.

The actual ranking that each nick site residue achieves when the prediction scores are listed in order is shown in Table 3. These rankings are shown for all residues within each given protein, and for only those residues which satisfy a given sequence template. In this case, the template is simply lysine or arginine at P1, trypsin's primary specificity. Using this template criterion, the nick sites climb up the rankings. The template makes no difference to the two staphylococcal nuclease sites, which score the top two scores regardless. For calmodulin, the Lys-77 nick site becomes the top predicted residue when only lysines/arginines are considered. One of the trypsinogen sites also scores top when the sequence template is applied. However, none of the other nick sites achieve this, which implies that there are lysines and arginines located in apparently more favourable positions for cleavage. Thus it is not possible to unambiguously assign the nick sites ab initio from structure.
In particular, the ribonuclease sites are predicted quite badly, although they lie on the fringe of a peak in Fig. 4. Interestingly, the second largest peak, at residue 20, is at the site where subtilisin cleaves ribonuclease A to produce ribonuclease S and S peptide (ref. 31). It is also interesting to note how the nick sites shown in Fig. 4 are typically located away from secondary structural segments, in particular the β-strand.

Conclusions

Earlier work (ref. 3) and the results presented here show quite clearly that the conformations of the tryptic nick sites considered, as they are found in their crystallographically determined forms, differ markedly from the conserved inhibitor-binding loop conformation. Since they would be expected to adopt such a conformation in order to be cleaved, segmental mobility must be a prerequisite for proteolysis of these sites. Our preliminary studies and other reported results suggest that this is also the case for limited proteolytic sites of other proteinases. In order to adopt an inhibitor-like conformation, torsion-angle changes in the nick site segment of up to 180° would be necessary local to the scissile peptide itself. Similarly, ‘docking’ the nick sites prior to the closure modelling shows that these sites could not bind into the tryptic active site in their current conformations without causing bad steric clashing between the enzyme and other regions of the nick site protein. This suggests that concerted movements of the nick site segments themselves might be necessary to correctly present the nick site segment for cleavage. They could certainly not be cleaved in their current conformations.

The closed-loop modelling has shown that it is indeed possible for the nick site segments to take up the putative cleavage conformation of the conserved inhibitor recognition motif. This was the case for all nick sites modelled except elastase Arg-125.
Even this site could be modelled into a reasonably satisfactory conformation when the filter conditions were relaxed. Preliminary molecular dynamics simulations suggest that the large motions required to do this are quite possible within the nick site segments. The subsequent ‘docking’ of the closed-loop models produced considerably fewer ‘bad’ contacts with the enzyme than had been observed prior to changing the conformation of the loop. This was presumably due to the extreme exposure introduced into the modelled loop, as shown for the ribonuclease Arg-33 site in Fig. 3(b). However, for some of the sites, this was still not enough to reduce the numbers of bad contacts to a level that might permit the nick site to enter the trypsin active site, and ‘hinge-bending’ type motions would still be required.

The predictive power of accessibility, protrusion, mobility and secondary structure is reasonably good. It can be used to highlight regions of structure that would be likely to contain a nick site, but even using a discriminatory sequence template, this method fails to identify the nick sites in question above all others in all cases. There are a number of reasons why this is so. First, some of the nick site data are disputed in the literature. More significantly, the observation that nick site loops change their conformation also permits the cleavage of relatively inaccessible sites, since they may be able to locally unfold into an accessible, protruding conformation. The ribonuclease Arg-33 site appears to be a prime example of this [see Fig. 3(b)]. Hence, the ability to locally unfold (to some degree) is the main determinant for limited proteolysis. Regions capable of doing this would be likely to correlate strongly with accessibility, protrusion, and high B values, which explains the correlations with these parameters which have been observed.

References

1 M. Laskowski Jr. and I. Kato, Annu. Rev. Biochem., 1980, 49, 593.
2 R. J. Read and M. N. G. James, in Proteinase Inhibitors, ed. A. J. Barrett and G. Salvesen, Elsevier, Amsterdam, 1986, pp. 301-336.
3 S. J. Hubbard, S. F. Campbell and J. M. Thornton, J. Mol. Biol., 1991, 220, 507.
4 H. M. Greenblatt, C. A. Ryan and M. N. G. James, J. Mol. Biol., 1989, 205, 201.
5 M. Fujinaga, R. J. Read, A. Sielecki, W. Ardelt, M. Laskowski Jr. and M. N. G. James, Proc. Natl. Acad. Sci. USA, 1982, 79, 4868.
6 I. Schechter and A. Berger, Biochem. Biophys. Res. Commun., 1967, 27, 157.
7 D. A. Estell and M. Laskowski Jr., Biochemistry, 1980, 19, 124.
8 J. Novotny and R. E. Bruccoleri, FEBS Lett., 1986, 221, 185.
9 C. Vita, D. Dalzoppo and A. Fontana, in Macromolecular Biorecognition: Principle and Biotechnological Applications, ed. I. M. Chaiken, E. Chiancone, A. Fontana and P. Neri, Humana Press, Clifton, NJ, USA, 1988.
10 W. R. Taylor, J. M. Thornton and W. G. Turnell, J. Mol. Graphics, 1983, 1, 30.
11 J. M. Thornton, M. S. Edwards, W. R. Taylor and D. J. Barlow, EMBO J., 1983, 5, 409.
12 W. Kabsch and C. Sander, Biopolymers, 1983, 22, 2577.
13 H. Taniuchi, C. B. Anfinsen and A. Sodja, Proc. Natl. Acad. Sci. USA, 1967, 58, 1235.
14 H. Taniuchi and C. B. Anfinsen, J. Biol. Chem., 1968, 243, 4778.
15 B. G. Winchester, A. P. Mathias and B. R. Rabin, Biochem. J., 1970, 117, 299.
16 W. Bode, H. Fehlhammer and R. Huber, J. Mol. Biol., 1976, 106, 325.
17 C. Ghellis, M. Tempete-Gailourdet and J. N. Yon, Biochem. Biophys. Res. Commun., 1978, 84, 31.
18 W. Drabikowski, H. Brzeska and S. Y. Venyaminov, J. Biol. Chem., 1982, 257, 11584.
19 F. A. Cotton, E. E. Hazen Jr. and M. J. Legg, Proc. Natl. Acad. Sci. USA, 1979, 76, 2551.
20 A. Wlodawer, R. Bott and L. Sjolin, J. Biol. Chem., 1982, 257, 1325.
21 A. A. Kossiakoff, J. L. Chambers, L. M. Kay and R. M. Stroud, Biochemistry, 1977, 16, 654.
22 E. Meyer, G. Cole, R. Radhakrishnan and O. Epp, Acta Crystallogr., Sect. B, 1988, 44, 26.
23 Y. S. Babu, C. E. Bugg and W. J. Cook, J. Mol. Biol., 1988, 204, 191.
24 A. D. McLachlan, J. Mol. Biol., 1979, 128, 49.
25 T. A. Jones and S. Thirup, EMBO J., 1986, 5, 819.
26 H. Sklenar, R. Lavery and B. Pullman, J. Biomolec. Struct. Dynam., 1986, 3, 967.
27 H. Sklenar, Doct. Sci. Thesis, Academy Science GDR, Berlin, 1989.
28 B. Lee and F. M. Richards, J. Mol. Biol., 1971, 55, 379.
29 C. Chothia, J. Mol. Biol., 1976, 105, 1.
30 A. Fontana, in Highlights of Modern Biochemistry, ed. A. Kotyk, J. Skoda, V. Paces and V. Kostka, VSP International Science Publishers, Zeist, The Netherlands, 1989, pp. 1711-1726.
31 F. M. Richards and P. J. Vithayathil, J. Biol. Chem., 1959, 234, 1459.
32 P. E. Stein, A. G. W. Leslie, J. T. Finch, W. G. Turnell, P. J. McLaughlin and R. W. Carrell, Nature (London), 1990, 346, 99.

Paper 1/06410J; Received 19th December, 1991
ISSN:1359-6640
DOI:10.1039/FD9929300013
Publisher: RSC
Year: 1992
Data source: RSC
|
3. |
Three-dimensional profiles for analysing protein sequence–structure relationships |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 25-34
David Eisenberg,
|
Abstract:
Faraday Discuss., 1992, 93, 25-34

Three-dimensional Profiles for Analysing Protein Sequence-Structure Relationships

David Eisenberg,* James U. Bowie, Roland Luthy and Seunghyon Choe
Molecular Biology Institute and Department of Chemistry and Biochemistry, UCLA, Los Angeles, CA 90024, USA

In the method of 3D (three-dimensional) profiles, each residue position in a protein is characterized by its environment and is represented by a row of 20 numbers in a table, the profile. These numbers are the statistical preferences (called 3D-1D scores) of each of the 20 amino acids for this environment. A profile is computed from the coordinates of a protein model, and it gives a score S for any amino acid sequence folded as the model. To date 3D profiles have found three applications. The first is to identify other protein sequences which are folded in the same general pattern as the structure from which the profile was prepared. These are sequences which have high scores for the profile computed from the model. The second is to assess the validity of protein models, however determined. Correct models are found to give profiles that have high scores for their own amino acid sequences, and incorrect models are found to have lower scores. The example of the X-ray structure determination of diphtheria toxin is discussed. The third application is to assess which is the stable oligomeric state of a folded protein. Several examples suggest that the highest profile score for a sequence is achieved when the protein is aggregated into its most stable oligomeric state.

Profile analysis was introduced in 1987 (ref. 1) as a method of comparing amino acid sequences, but it has taken on new life in the 1990s (refs. 2, 3) as a method for relating three-dimensional (3D) structures of proteins to their amino acid sequences. Here we summarize previous applications of 3D profiles and describe an extension: identification of the stable oligomeric form of a protein.
3D Profiles

The 3D profile of a protein is a table, computed from the coordinates of a 3D structure. The profile may be thought of as a generalization of the amino acid sequence that can fold in a given way. This is illustrated by the portion of the 3D profile for myoglobin shown in Fig. 1. Each position in the folded structure is represented by a row of the profile. As given in the second column, each position is characterized by one of 18 environmental classes, determined computationally from the coordinates of the protein (ref. 2). The environment of a residue is determined by three parameters: the area of the residue which is buried; the fraction of side-chain area that is covered by polar atoms (O and N); and the local secondary structure. The next 20 numbers in each profile row are the statistical preferences (called 3D-1D scores) of the 20 coded amino acids for the environmental class of that position. The final two numbers in each row of the profile are the so-called gap penalties, discussed in ref. 1, but not considered further here. To determine the profile score S for the fit of any sequence to the myoglobin profile, we merely sum over all positions the 3D-1D score in each position which corresponds to the amino acid of the sequence.

position    amino acid type                                                        gap penalty
in fold        A     C     D     E     F     G   ...     R     S     T     V     W     Y     opn    exp
 1            12   -46    22     3  -190   113   ...   -32    32    12   -91  -214   -94       2   0.02
 2           -66    -5  -128  -135   105  -166   ...   -80  -117   -76    60   102   112       2   0.02
 3            46   -44    44    59  -220    68   ...   -34    15   -17  -110  -135  -210     200    200
 4             6   -93    28    56  -143   -50   ...    50   -18    -5   -48  -114   -79     200    200
 5            46   -44    44    59  -220    68   ...   -34    15   -17  -110  -135  -210     200    200
 6             6   -93    28    56  -143    50   ...    50   -18    -5   -48  -114   -79     200    200
 7           -69   -10  -162   -71    90  -149   ...     6  -147  -150    68    50    85     200    200
 8            46   -44    44    59  -220    68   ...   -34    15   -17  -110  -135  -210     200    200
 9             6   -93    28    56  -143   -50   ...    50   -18    -5   -48  -114   -79     200    200
10           -66   -73  -197  -174   132  -253   ...  -167  -273  -129    66   100    18     200    200

Fig. 1 Example of part of a 3D profile. The example shows the first 10 positions of the 3D profile of sperm-whale myoglobin. The environmental class (ref. 2) is given for each position, followed by the 3D-1D scores for that position. The complete profile has 153 rows, one for each residue position in the structure, and 20 3D-1D scores for each position. Reproduced with permission from J. U. Bowie, R. Luthy and D. Eisenberg, Science, 1991, 253, 164. Copyright 1991 by the AAAS

Fig. 2 The results of a search in an amino acid sequence database for sequences compatible with the 3D profile for myoglobin of Fig. 1. The Z score for each sequence is the number of standard deviations above the mean score. Note that the myoglobins have the highest scores, followed by other globins, followed by non-globins. Reproduced with permission from J. U. Bowie, R. Luthy and D. Eisenberg, Science, 1991, 253, 164. Copyright 1991 by the AAAS

In summary, for purposes of this discussion, the essential features of a 3D profile are:
(1) the 3D profile can be computed from the 3D coordinates of a structure, but does not depend in any direct way on the sequence of the protein;
(2) in the profile, each residue position of the folded structure is assigned to an environmental class, which depends on the accessibility, polarity, and secondary structure of the position;
(3) associated with each of the 18 environmental classes is a characteristic amino acid composition, expressed by a 3D-1D score for each of the amino acids in that environment; and
(4) the compatibility of any amino acid sequence with the 3D structure can be scored by the profile, simply by adding the 3D-1D scores that correspond to the amino acid found at each position of the sequence.

Detection of Amino Acid Sequences Folded in the Same Way

An application of 3D profiles, discussed in ref.
1 and only briefly reviewed here, is to detect in a database of amino acid sequences those sequences which are compatible with a given protein fold. This compatibility search is carried out as follows. From the coordinates of a protein, a 3D profile is computed, such as the 3D profile of myoglobin shown in part in Fig. 1. Then every amino acid sequence in the database is scored against the profile. Those receiving the highest scores are generally found to be sequences folded in the same way as the folded protein. The results for a compatibility search with the 3D profile of sperm-whale myoglobin are shown in Fig. 2. The myoglobins score highest, with haemoglobins following, and then all the other sequences. Thus the 3D profile prepared from the atomic coordinates of myoglobin, and with no direct information on the sequence, is able to detect from a large database of amino acid sequences (some 26 000 sequences) those that are folded in generally the same way as sperm-whale myoglobin.

In this first example, we have compared profile scores for the compatibility of various amino acid sequences with a given 3D profile. In the remaining examples, we will compare profile scores for the compatibility of a given sequence with various 3D profiles.

Assessment of Protein Models

As the preceding example suggests, the compatibility of a protein model with its sequence can be measured using a 3D profile. In a study of profiles from over 200 protein 3D structures, we have found that protein models known to be correct give high 3D profile scores for the amino acid sequence of the model. In contrast, models known to be wrong give low 3D profile scores for the amino acid sequence of the model.
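The scoring and database search described above can be sketched in a few lines of Python. This is a simplified illustration only: it assumes an ungapped, position-by-position match between sequence and profile (the gap penalties are ignored), uses the population standard deviation for the Z score, and works on a toy three-position profile with made-up numbers. All names are hypothetical.

```python
from statistics import mean, pstdev

def profile_score(profile, sequence):
    """Sum the 3D-1D score of the residue found at each position (no gaps)."""
    return sum(row[aa] for row, aa in zip(profile, sequence))

def z_scores(raw_scores):
    """Number of standard deviations each raw score lies above the mean."""
    mu = mean(raw_scores)
    sd = pstdev(raw_scores)
    return [(s - mu) / sd for s in raw_scores]

# Toy profile: one dict of 3D-1D scores per position (illustrative values only).
toy_profile = [
    {"A": 12, "G": 113, "V": -91},
    {"A": -66, "G": -166, "V": 60},
    {"A": 46, "G": 68, "V": -110},
]
database = ["GVG", "AAA", "VAV"]
raw = [profile_score(toy_profile, seq) for seq in database]
ranked = sorted(zip(z_scores(raw), database), reverse=True)
```

In a real compatibility search each database sequence would be aligned to the profile, with the gap-opening and gap-extension penalties from the last two columns of the profile, before its score is converted to a Z score.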
This finding is illustrated in Fig. 3 for the correct and a misfolded structure of haemerythrin: the profile for the correct model gives a profile score S = 37.8 (the sum of the 3D-1D scores for the haemerythrin sequence over the 113 positions of the profile); the profile for the misfolded model gives a profile score S = 14.9. Note that the 'data' against which the two structures are judged consist of the amino acid sequence: the true structure of haemerythrin is judged to be correct in this test because its profile is more compatible with the amino acid sequence of haemerythrin than is the profile of the misfolded haemerythrin model.

To learn whether profile scores are high for all well determined protein structures, we computed profiles for all distinct coordinate sets in the Brookhaven Protein Data Bank (ref. 4). The results are given in Fig. 4, where the log of the profile score S for each protein model is given as a function of the log of the number of amino acids in its sequence. High-resolution, well refined structures are shown by the large dots, and these are seen to give particularly high scores. The heavy line is the least-squares fit to these dots. Also shown in Fig. 4 are profile scores for NMR structures (indicated by diamonds), model structures (circles), and seven structures known to be wrong in part or whole (squares). The squares for the four structures known to be largely wrong fall below the dotted line. The three squares that fall between the two lines represent structures known to be wrong in part. Thus incorrect protein models have profile scores smaller than correct models.
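The scoring rule used here, in which the profile score S is the sum of the 3D-1D scores selected by the residue found at each position, can be sketched in a few lines. This is a minimal illustration and not the authors' program: the function name `profile_score` and the row layout are assumptions, and the toy profiles hold only the three positions (33 to 35, residues L, L and S) whose scores are quoted in the Fig. 3 caption.

```python
# Hypothetical sketch: a 3D profile as a list of rows, one per position.
# Each row maps a one-letter amino acid code to its 3D-1D score.
# Only the scores quoted in the Fig. 3 caption are filled in.

def profile_score(profile, sequence):
    """Sum the 3D-1D score of the residue found at each position."""
    if len(profile) != len(sequence):
        raise ValueError("profile and sequence must have the same length")
    return sum(row[res] for row, res in zip(profile, sequence))

# Positions 33-35 of haemerythrin carry residues L, L and S.
segment = "LLS"

# Scores for L, L, S in the correct and in the misfolded model.
correct_rows = [{"L": 15}, {"L": 101}, {"S": 34}]
misfolded_rows = [{"L": -119}, {"L": -119}, {"S": 26}]

s_correct = profile_score(correct_rows, segment)
s_misfolded = profile_score(misfolded_rows, segment)
print(s_correct, s_misfolded)
```

These partial sums cover three positions only; the scores quoted in the text are sums over all 113 positions of the haemerythrin profile.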
Detection of Incorrect Segments in Otherwise Correct Structures

For structures wrong in part, the incorrect segment can often be detected by plotting the average profile score in a 21-residue window that is moved through the sequence. In such profile-window plots, the score for correct segments stays well above 0, whereas for wrong segments the score drops to near zero or below.

Detection of an incorrect segment of a model is illustrated for diphtheria toxin in Fig. 5. Diphtheria toxin is a 535-residue proenzyme which has been crystallized (refs. 5, 6) and studied by X-ray diffraction in a joint study between our laboratory and that of Dr. R. J. Collier. In building the initial model into a 3 Å resolution electron-density map, the direction of the chain in the C-terminal, antiparallel β-structure domain was at first reversed. As the resolution of the electron-density map was increased and the map clarified, it became evident that the model in this C-terminal region was incorrect, and it was rebuilt. Fig. 5 gives profile-window plots for both the initial and final models. The figure shows clearly that the final model of the C-terminal domain has residue environments that are considerably more compatible with the diphtheria toxin sequence of that region than those of the initial model.

[Fig. 3 schematic: the haemerythrin sequence scored against the correct and the incorrect 3D profile; the columns shown give the residue types A, L, S and Y.]

Fig. 3 The use of a 3D profile to verify protein models, illustrated with the correct and misfolded structures of haemerythrin, both having 113 residues. The model on the left is the X-ray-derived structure (ref. 26). A 3D profile calculated from its coordinates matches the sequence of haemerythrin with a score S_cor = 38. The right-hand model is the misfolded haemerythrin model of Novotny et al. (ref. 27). A 3D profile calculated from it matches its sequence poorly, with score S_inc = 15. The actual profile consists of 113 rows (one for each position of the folded protein). In this schematic example only three rows are shown, those for positions 33 (where the residue is L), 34 (L) and 35 (S). The first column of the profile gives the environmental class of the position, computed from the coordinates of the model (ref. 2), and the next 20 columns give the amino acid preferences (called 3D-1D scores) at that position. In this schematic example, only four columns of 3D-1D scores are shown, those for residues A, L, S and Y. In the correct profile on the left, position 33 is computed to be in the buried polar α class; positions 34 and 35 are computed to be in the buried moderately polar α class (B2α) and the partially buried polar α class (P1α), respectively. The scores for the residues L, L and S at these three positions are 15, 101 and 34. The profile for the misfolded model assigns positions 33, 34 and 35 to the environmental classes E, E and P2α, giving 3D-1D scores for the residues L, L and S of -119, -119 and 26, respectively. That is, in the incorrect structure, the leucine residues are exposed, giving low 3D-1D scores and leading to a summed total score S_inc which is much smaller than S_cor when all other 3D-1D scores are summed along with the three shown here. Reprinted with permission from R. Lüthy, J. Bowie and D. Eisenberg, Nature (London), 1992, 356, 83. Copyright 1992 Macmillan Magazines Limited.

In passing we note that many other tests for the correctness of protein structures have been proposed. These include R-factor tests, real-space R-factor tests and other tests for X-ray structures; R-factor tests for NMR structures (refs. 10, 11); as well as other tests. As with the other empirical tests, the profile criterion of correctness can be applied to a model, however determined. Unlike some of the other tests, the profile method can reveal a problem in one segment of the structure.
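The profile-window plot described in this section averages the per-position 3D-1D scores over a 21-residue window moved along the sequence. A minimal sketch follows. The function names, the treatment of the chain ends (the window is simply truncated there) and the adjustable flagging threshold are assumptions made for illustration, and the per-residue scores are synthetic.

```python
def window_averages(scores, window=21):
    """Average per-residue 3D-1D scores in a window centred on each position.

    Assumption: near the chain ends the window is truncated, not padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    averages = []
    for centre in range(len(scores)):
        lo = max(0, centre - half)
        hi = min(len(scores), centre + half + 1)
        averages.append(sum(scores[lo:hi]) / (hi - lo))
    return averages


def low_scoring_positions(scores, window=21, threshold=0.0):
    """Return the positions whose window average is at or below the threshold."""
    return [i for i, avg in enumerate(window_averages(scores, window))
            if avg <= threshold]


# Synthetic scores: 60 well fitting positions followed by 40 poor ones.
toy_scores = [0.5] * 60 + [-0.3] * 40
flagged = low_scoring_positions(toy_scores)
print(flagged[0], flagged[-1])
```

Because each value is an average over neighbouring residues, the flagged region starts a few positions after the true boundary of the poorly fitting segment; a window plot locates a faulty segment only approximately.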
[Fig. 4 plot: profile score vs. protein length on a log-log scale. Legend: all structures; best X-ray structures; model structures; NMR structures; misfolded structures. Lines: calculated score = exp(-0.833 + 1.008 ln(length)); cutoff = 0.45 × calculated score.]

Fig. 4 3D profile scores (indicated by a small +) for protein coordinate sets in the Brookhaven Protein Data Bank (ref. 4) as a function of sequence length, on a log-log scale. All + signs which are not enclosed represent X-ray determinations. Scores for highly refined X-ray determinations are indicated by large dots. These are structures determined at resolutions of at least 2 Å and with R-factors ~20%. The heavy line is fit by least squares to these highly refined structures and is given by S_calc = 0.944 exp[-0.776 + 1.009 ln(length)]. Scores for NMR structures are indicated by a small + enclosed by a diamond. Scores for computationally determined models are indicated by a small + enclosed by a circle. Five misfolded structures are indicated by a small + in a square. Environmental classes for 3D profiles of oligomeric proteins were generally computed from oligomeric structures, rather than protomers. The difference is that the accessible surface areas of residues positioned at interfaces are greater for the protomers, producing a poorer fit of the profile to the sequence. This matters little for large structures, but much for small structures, as discussed in the text. Reprinted with permission from R. Lüthy, J. Bowie and D. Eisenberg, Nature (London), 1992, 356, 83. Copyright 1992 Macmillan Magazines Limited.

Dependence of Profile Score on Oligomeric State

In computing the profile scores for structures deposited in the Protein Data Bank, we encountered eight apparent exceptions to the behaviour of well refined structures shown in Fig. 4.
These were protein models, in some cases well refined at high resolution, that gave low profile scores for their own sequences when the protein was considered as a monomeric chain. These eight apparent exceptions are listed in Table 1, and their profile scores are plotted against sequence length in Fig. 6. In all but one of these eight cases, when the protein is considered as an oligomer, the profile score increases to a value characteristic of well refined structures, as shown in Fig. 6.

[Fig. 5 plot: diphtheria toxin; vertical axis from -0.2 to 0.8, horizontal axis (sequence position) from 0 to 500.]

Fig. 5 3D profile-window plots for two atomic models of the structure of diphtheria toxin. The vertical axis gives the average 3D-1D score for residues within a 21-residue sliding window, the centre of which is at the sequence position indicated by the horizontal axis. From examination of incorrect structures, we observed that faulty segments often have profile-window plots that dip near or below a score of zero. In the initial model for diphtheria toxin, the backbone followed essentially the correct path, but it was reversed in direction from residue 410 to the C-terminus (residue 535). The corresponding region of the profile-window plot (dotted line) is lower than that for the rest of the model, and not much above a score of zero in some places. With additional data and better phases for X-ray reflections, the model was rebuilt, and the profile-window plot recalculated (solid line). The scores for the rebuilt model are more acceptable than for the initial model.

Table 1 Profile scores, S, for oligomeric proteins in various states of oligomerization

protein                                  PDB code  length  oligomer              S
α-1                                      1AL1      12      monomer               0.6
                                                           tetramer              4.5
                                                           hexamer               6.0
                                                           lattice               7.5
avian pancreatic polypeptide (ref. 18)   1PPT      36      monomer               6.1
                                                           dimer                 14
                                                           lattice               19
bungarotoxin (ref. 19), chain A          2ABX      74      monomer               14
                                                           dimer                 16
bungarotoxin (ref. 19), chain B          2ABX      74      monomer               20
                                                           dimer                 22
crambin (ref. 20)                        1CRN      46      monomer               8.7
                                                           unit cell             11
                                                           lattice               21
chymotrypsin inhibitor (refs. 21, 22)    2CI2      65      monomer               8.8
                                                           unit cell             23
glucagon (ref. 23)                       1GCN      29      monomer               2.6
                                                           unit cell             4.6
                                                           lattice               12
insulin (ref. 24), chain A               3INS      21      monomer               2.3
                                                           A in AB complex       8.3
                                                           A in (AB)6 complex    10
insulin (ref. 24), chain B               3INS      31      monomer               3.5
                                                           B in AB complex       11
                                                           B in (AB)6 complex    16
melittin (ref. 25)                       2MLT      26      monomer               1.1
                                                           dimer                 6.4
                                                           tetramer              15

[Fig. 6 plot: profile score vs. protein length on a log-log scale. Symbols: monomers; dimers; tetramers; hexamer or unit cell; lattice. Labelled points include the insulin A and B chains, chymotrypsin inhibitor, glucagon, crambin, melittin, α-1, avian pancreatic polypeptide and bungarotoxin.]

Fig. 6 Profile scores vs. length of sequence for the eight structures in the Protein Data Bank for which the profile score depends significantly on oligomeric state. In each case, the profile score for the monomer is smaller than for the oligomer.

Why does the profile score depend on the oligomeric state of a protein? This dependence enters through the difference in accessibility of amino acid side chains in the monomeric and oligomeric states. Suppose a protein, such as melittin in Table 1, is normally a tetramer in aqueous solution. When the 3D profile is computed from the tetrameric structure, the apolar residues at the tetrameric interface are classified as being in buried environments (because the residues are buried in the tetramer). However, if the monomeric chain is dissected from the tetramer, and the 3D profile of the monomer computed, these same environments are classified as exposed (because in the monomer they are exposed). The amino acid sequence of melittin is far more compatible with the profile prepared from the coordinates of the tetramer (S = 15) than with the profile from the monomer (S = 1.1). This is so because the apolar residues in the sequence fit the corresponding positions of the tetrameric profile. In fact, at physiological concentrations, melittin is a tetramer.
Thus the higher profile score for the melittin tetramer than for the melittin monomer is consistent with the observation that melittin does form a tetramer.

Another case of low profile scores is that of the insulin A and B chains when they are considered as isolated monomers. However, the profile scores computed for both chains when considered in the actual AB insulin complex are high. Thus the profile scores are consistent with the idea that neither the A nor the B chain of insulin would maintain its structure in solution if removed from the actual AB complex. The scores are also higher in the (AB)6 hexamer than in the AB complex.

The designed protein α-1 presents another interesting case. This peptide was designed based on a tetrameric model (ref. 16), but the crystal structure (ref. 17) revealed a more complex organization. All peptides in the crystal have identical environments, and the structure can be interpreted equally well in several ways. The 'molecule' of α-1 can be considered a monomer, a tetramer, a hexamer or even a molecular crystal composed of tightly packed peptides forming both tetramer contacts and hexamer contacts. Profiles were computed for α-1, considering the molecule first as a monomer, then as a dimer, then as a tetramer, then as a hexamer and then as a lattice of packed hexamers and tetramers. As larger numbers of peptides are included in the model from which the profile is computed, the accessibility of positions in the structure is reduced and the environmental classes change. From Fig. 6 it is evident that the 3D structure of the α-1 monomer is incompatible with the sequence of the peptide, but that the structures of the tetramer, hexamer and molecular crystal (lattice) give higher profile scores. These profile scores suggest that the stable oligomeric state of α-1 (having the structure found in the crystal) is one of the higher oligomers.
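One way to read the oligomer scores of Table 1 against the length dependence of Fig. 4 is to compare each score with a length-dependent expectation. The sketch below uses the expressions that appear in the Fig. 4 legend, calculated score = exp(-0.833 + 1.008 ln(length)) and cutoff = 0.45 × calculated score. That legend is poorly legible in this copy, so the coefficients should be treated as assumptions; the function names are hypothetical, and the melittin and α-1 values are those of Table 1.

```python
import math


def calculated_score(length):
    """Expected profile score for a well refined structure of this length.

    Assumption: coefficients as read from the Fig. 4 legend.
    """
    return math.exp(-0.833 + 1.008 * math.log(length))


def cutoff_score(length, fraction=0.45):
    """Assumed lower line of Fig. 4: a fixed fraction of the expected score."""
    return fraction * calculated_score(length)


def states_above_cutoff(length, scores_by_state):
    """List the oligomeric states whose profile score reaches the cutoff."""
    limit = cutoff_score(length)
    return [state for state, s in scores_by_state.items() if s >= limit]


# Values from Table 1: melittin (26 residues) and alpha-1 (12 residues).
melittin = {"monomer": 1.1, "dimer": 6.4, "tetramer": 15}
alpha_1 = {"monomer": 0.6, "tetramer": 4.5, "hexamer": 6.0, "lattice": 7.5}

print(states_above_cutoff(26, melittin))
print(states_above_cutoff(12, alpha_1))
```

With these assumed coefficients the monomer scores of both peptides fall below the cutoff while the higher oligomers lie above it, in line with the discussion of melittin and α-1.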
As a control calculation, we examined the change in profile score for the monomeric proteins hen lysozyme (1LZ1) and sperm-whale myoglobin (1MBO) when they are packed as they are found in their common crystal forms. The lysozyme score increases only from 75 to 77; the myoglobin score actually drops a bit from 86 to 72. This means that the environmental classes of myoglobin and lysozyme are not appreciably changed when these proteins pack into a crystal, unlike α-1 and some of the other examples of Table 1 and Fig. 6. The lack of change of environmental classes in myoglobin and lysozyme is consistent with a lack of change of structure when these molecules crystallize.

Conclusions

When presented with a model of a protein, however derived, we must assess how well the model reflects the actual conformation of the protein under physiological conditions. The 3D profile method provides an objective test. In our experience, it is unusual for a profile from a well determined structure to score poorly with its own sequence. In cases where a profile does score poorly, there are several possibilities. (1) There are errors in the model. If the structure was determined by X-ray crystallography, the chain may have been mistraced in some segments, or if by NMR, there may be too few distance constraints to define the structure. Profile-window plots provide a useful tool for locating possible errors. (2) The protein exists normally as a higher oligomer. This possibility can be tested by determining 3D profile scores from models of the higher oligomeric forms. (3) The structure is stabilized by special forces. These could be disulfide bonds, or other protein chains (as in insulin), or lattice forces (as in α-1 or crambin). The 3D profile method rests on the observation that soluble proteins bury many hydrophobic side chains and not many polar residues.
If a protein is stabilized by other forces, this usual distribution can be altered, and the resulting, unusual distribution of polar and non-polar residues will give a low profile score.

We thank NIH and NSF for support of this research.

References
1 M. Gribskov, A. D. McLachlan and D. Eisenberg, Proc. Natl. Acad. Sci. USA, 1987, 84, 4355.
2 J. U. Bowie, R. Lüthy and D. Eisenberg, Science, 1991, 253, 164.
3 R. Lüthy, J. Bowie and D. Eisenberg, Nature (London), 1992, 356, 83.
4 F. C. Bernstein, T. F. Koetzle, G. J. Williams, E. F. Meyer, M. D. Brice, J. R. Rodgers, O. Kennard, T. Shimanouchi and M. Tasumi, J. Mol. Biol., 1977, 112, 535.
5 R. J. Collier, E. M. Westbrook, D. B. McKay and D. Eisenberg, J. Biol. Chem., 1982, 257, 5283.
6 G. Fujii, S. Choe, M. Bennett and D. Eisenberg, J. Mol. Biol., 1991, 222, 861.
7 A. Brünger, Nature (London), 1992, 355, 472.
8 T. A. Jones, J-Y. Zou, S. W. Cowan and M. Kjeldgaard, Acta Crystallogr., Sect. A, 1991, 47, 110.
9 C-I. Branden and T. A. Jones, Nature (London), 1990, 343, 687.
10 M. Nilges, J. Habazettl, A. T. Brünger and T. A. Holak, J. Mol. Biol., 1991, 219, 499.
11 J. A. Gonzalez, C. Rullmann, A. M. J. J. Bonvin, R. Boelens and R. Kaptein, J. Magn. Reson., 1991, 91, 659.
12 D. Eisenberg and A. D. McLachlan, Nature (London), 1986, 319, 199.
13 J. Novotny, A. A. Rashin and R. Bruccoleri, Proteins: Struct., Funct. Genetics, 1988, 4, 19.
14 G. Baumann, C. Frommel and C. Sander, Protein Eng., 1989, 2, 329.
15 M. Hendlich, P. Lackner, S. Weitckus, H. Floeckner, R. Froschauer, K. Gottsbucher, G. Casari and M. Sippl, J. Mol. Biol., 1990, 216, 167.
16 D. Eisenberg, W. Wilcox, S. M. Eshita, P. M. Pryciak, S. P. Ho and W. F. DeGrado, Proteins: Struct., Funct. Genetics, 1986, 1, 16.
17 C. P. Hill, D. H. Anderson, L. Wesson, W. F. DeGrado and D. Eisenberg, Science, 1990, 249, 543.
18 T. L. Blundell, J. E. Pitts, I. J. Tickle, S. P. Wood and C-W. Wu, Proc. Natl. Acad. Sci. USA, 1981, 78, 4175.
19 R. A. Love and R. M. Stroud, Protein Eng., 1986, 1, 37.
20 W. A. Hendrickson and M. M. Teeter, Nature (London), 1981, 290, 107.
21 C. A. McPhalen, I. Svendsen, I. Jonassen and M. N. G. James, Proc. Natl. Acad. Sci. USA, 1985, 82, 7242.
22 G. M. Clore, A. M. Gronenborn, M. N. G. James, M. Kjaer, C. A. McPhalen and F. M. Poulsen, Protein Eng., 1987, 1, 313.
23 K. Sasaki, S. Dockerill, D. A. Adamiak, I. J. Tickle and T. Blundell, Nature (London), 1975, 257, 751.
24 N. W. Isaacs and R. C. Agarwal, Acta Crystallogr., Sect. A, 1978, 34, 782.
25 T. C. Terwilliger and D. Eisenberg, J. Biol. Chem., 1982, 257, 6010.
26 R. E. Stenkamp, L. C. Sieker and L. H. Jensen, Acta Crystallogr., Sect. B, 1978, 38, 784.
27 J. Novotny, R. Bruccoleri and M. Karplus, J. Mol. Biol., 1984, 177, 787.

Paper 21002291; Received 15th January, 1992
ISSN:1359-6640
DOI:10.1039/FD9929300025
Publisher: RSC
Year: 1992
Data source: RSC
|
4. |
Protein hydration in aqueous solution |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 35-45
Kurt Wüthrich,
Preview
|
PDF (1311KB)
|
|
Abstract:
Faraday Discuss., 1992, 93, 35-45
Protein Hydration in Aqueous Solution
Kurt Wüthrich,* Gottfried Otting and Edvards Liepinsh
Institut für Molekularbiologie und Biophysik, Eidgenössische Technische Hochschule-Hönggerberg, CH-8093 Zürich, Switzerland

Proton nuclear magnetic resonance was used to study individual molecules of hydration water bound to the protein basic pancreatic trypsin inhibitor (BPTI) and to the nonapeptide oxytocin in aqueous solution. The experimental observations are nuclear Overhauser effects (NOE) between protons of individual amino acid residues of the protein and those of hydration water. These NOEs were recorded by two-dimensional (2D) and three-dimensional (3D) NOE spectroscopy (NOESY) in the laboratory frame, and by the corresponding experiments in the rotating frame (ROESY). The studies show that there are two qualitatively different types of hydration sites. Four water molecules in the interior of the BPTI molecule are in identical locations in the crystal structure and in solution. Their NOEs with the protein protons are characterized by large negative cross-relaxation rates σNOE, which indicates that the residence times of the water molecules in these hydration sites are longer than ca. 10 ns. Additional experiments with extrinsic shift reagents established an upper limit of 20 ms at 4 °C for these residence times. Surface hydration of both the globular protein BPTI and the flexibly disordered polypeptide oxytocin is by water molecules with residence times in the subnanosecond range, as evidenced by small positive σNOE values observed for their NOEs with nearby polypeptide protons. Short residence times prevail for all surface hydration sites, independent of whether or not they are occupied by well ordered, X-ray observable water in the protein single crystals.
Three-dimensional protein structures can be determined either by X-ray diffraction in single crystals (ref. 1) or by nuclear magnetic resonance (NMR) spectroscopy in solution (ref. 2). High-resolution crystal structures of globular proteins typically include numerous water molecules in defined hydration sites. The Brookhaven Protein Data Bank (ref. 3) thus includes the locations of over 30 000 water oxygen atoms. The large majority of these water sites are on the molecular surface. In addition, a small number of water molecules may be located in the interior of a protein and represent an integral part of the molecular architecture. Although the NMR method for protein structure determination has been available since 1985, hydration water molecules proved elusive to detection in aqueous solution, and the observation of individual water molecules in a globular protein was first reported in 1989, when NOE cross-peaks between individual polypeptide protons of the basic pancreatic trypsin inhibitor (BPTI) and protons of four interior water molecules could be identified (ref. 4). These four water molecules are also present in the three available X-ray crystal structures of BPTI (refs. 5-7), with conserved hydrogen-bonding partners and zero solvent accessibility to water molecules outside the protein molecule. Later on, identification of one, eleven and four interior water molecules, respectively, was reported from NMR observations in aqueous solution of a scorpion toxin from Androctonus australis Hector (ref. 8), interleukin 1β (ref. 9) and reduced human thioredoxin (ref. 10). Most recently, individual hydration water molecules in surface sites could be identified (ref. 11). The results obtained made clear why the detection of surface hydration water molecules by NMR in solution is particularly difficult: the surface waters have very short residence times and hence greatly reduced NOE intensities with the protein protons.
The present manuscript gives an overview of the information accessible by high-resolution NMR observation of individual hydration water molecules in aqueous protein solutions, using the NMR solution data and X-ray crystal data available for BPTI and the polypeptide hormone oxytocin as illustrations.

General Considerations on the Observation of Individual Molecules of Hydration Water by ¹H NMR

On grounds of principle one would anticipate that the proton chemical shifts of protein-bound water molecules are influenced by the protein environment in a similar fashion to those of the individual amino acid residues upon incorporation into a globular protein structure (ref. 2). On the basis of the resulting unique chemical shifts for different individual hydration waters, direct observation of distinct ¹H NMR lines corresponding to these water molecules should in principle be possible and thus provide direct evidence for their presence. The experience gained with all systems studied so far indicates, however, that the proton resonances of protein hydration waters in aqueous solution are usually at the same chemical shift as the bulk water. Experiments with paramagnetic chemical shift reagents (see below) showed that this is due to rapid exchange between hydration water and bulk water, which averages the differences in chemical shift induced by the different chemical environments in the bulk solution and the protein hydration sites.
The same applies to nearly all hydroxyl and carboxylate protons of the amino acid side chains of Ser, Thr, Tyr, Asp and Glu in proteins, which are well known to have intrinsic chemical shifts different from that of H2O. Since different rate processes with largely different timescales govern the averaging of the chemical shifts and the nuclear spin relaxation that leads to the appearance of NOEs, specific intermolecular NOEs with individual molecules of hydration water may still be observed in the situation where the resonance lines of the protein OH groups and the hydration waters are merged with that of the bulk water. However, the fact that all water protons and most hydroxyl protons have the same chemical shift then requires that, apart from the ¹H NMR assignments, the solution structure of the protein must be available at high resolution before unambiguous assignments of NOEs between protons of the protein and hydration water molecules can be obtained. In addition, one must rule out that the observed effects arise from chemical exchange of protons, and that the interactions could be with OH groups of the protein rather than with protein-bound water molecules.

Different possible mechanisms of incoherent magnetization transfer can be distinguished from differences in qualitative aspects of two-dimensional nuclear Overhauser experiments in the rotating frame (ROESY).
A negative sign of a ROESY cross-peak relative to the diagonal peaks shows that the magnetization transfer is by direct cross-relaxation, since transfer by first-order spin diffusion pathways or by chemical exchange would both lead to positive cross-peaks (ref. 14). Once a cross-peak between a protein proton and a proton at the bulk water chemical shift has thus been demonstrated to correspond to a direct NOE, distinction between intermolecular NOEs with hydration water and intramolecular NOEs with OH groups can in favourable cases be achieved on the basis of the solution structure of the protein, from which the locations of all the side-chain hydroxy and carboxy groups are known. Because sizeable NOEs can be observed only for proton-proton distances shorter than ca. 4.0 Å, an NOE with a hydration water molecule is indicated whenever the interacting protein proton is at a distance >4.0 Å from the nearest OH group. Along similar lines, a distinction may be made between the two situations that either one hydration water molecule interacts with two nearby groups of polypeptide protons, or that two different water molecules interact with the different groups of protein protons.

[Fig. 1 plot: logarithm of the exchange rate constant vs. pH, with the pH axis running from 1 to 11.]

Fig. 1 Logarithmic plots of approximate exchange rate constants, k_intr, for solvent-accessible, labile protons of polypeptides in H2O solution at 25 °C vs. pH. Broken lines represent lower limits for k_intr in situations where pKa data were available only for base catalysis. The individual curves are identified with the proton types and, where applicable, the residue types (Im stands for imidazole ring NH, Gua for guanidinium NH, bb for backbone). Reprinted by permission of John Wiley & Sons, Inc., from K. Wüthrich, NMR of Proteins and Nucleic Acids, John Wiley, New York, 1986. © Copyright 1986 John Wiley & Sons Inc.
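The qualitative assignment rules given in this section, the sign of the ROESY cross-peak followed by the distance to the nearest OH group, can be collected into a small decision procedure. The sketch below only illustrates that reasoning: the function name, the argument names and the returned labels are hypothetical, and the 4.0 Å limit is the approximate distance quoted in the text.

```python
NOE_DISTANCE_LIMIT = 4.0  # angstrom; approximate upper limit for sizeable NOEs


def classify_water_cross_peak(roesy_sign, nearest_oh_distance):
    """Classify a cross-peak observed at the bulk-water chemical shift.

    roesy_sign: '+' or '-', the sign of the ROESY cross-peak relative to
        the diagonal peaks.
    nearest_oh_distance: distance in angstrom from the protein proton to the
        nearest OH group in the solution structure of the protein.
    """
    if roesy_sign == "+":
        # Positive ROESY peaks arise from chemical exchange or first-order
        # spin diffusion, not from direct cross-relaxation.
        return "exchange or spin diffusion"
    if roesy_sign != "-":
        raise ValueError("roesy_sign must be '+' or '-'")
    if nearest_oh_distance > NOE_DISTANCE_LIMIT:
        # No OH group is close enough to give a sizeable NOE.
        return "NOE with hydration water"
    # An OH group is within range, so the structure alone cannot decide.
    return "NOE with hydration water or nearby OH group"


print(classify_water_cross_peak("-", 6.2))
```

In the ambiguous case the text points to a further experiment: slowing the hydroxyl proton exchange so that the OH resonance is observed away from the water line.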
An unambiguous identification of intramolecular NOEs with side-chain OH groups is achieved when the chemical exchange of the hydroxyl protons can be made sufficiently slow to enable the observation of separate signals away from the water line. The slowest exchange rates are expected near the pH value where the acid-catalysed and base-catalysed exchange have approximately the same rates. Fig. 1 shows plots of the exchange rates vs. pH for different types of labile protons occurring in polypeptide chains. The curves were computed from the pKa values of the reactants assuming diffusion-limited rate constants. Direct experimental exchange-rate measurements confirmed that the hydroxy protons of the seryl and threonyl residues in BPTI exchange most slowly near pH 6.0, and the tyrosyl hydroxy protons were found to exchange most slowly near pH 5.0. At these pH values and a temperature of 4 °C, all labile hydroxy protons of the tyrosyl, seryl and threonyl residues of BPTI give rise to individual resonances that are well separated from the water signal, and the NOEs with these protons can be observed as well resolved cross-peaks. However, even under these most favourable conditions the NOEs with the hydroxy protons are usually also observed at the water chemical shift due to magnetization transfer by chemical exchange.

NMR Experiments and Experimental Conditions used for the Observation of Hydration Water Molecules

In experiments used for the observation of intermolecular NOEs between water protons and polypeptide protons, no solvent suppression by preirradiation of the dominant water signal can be employed. Instead, the water magnetization must be eliminated after the mixing period during which the NOE transfer of magnetization from the water protons to the protein protons has taken place. For observation of surface hydration water molecules the NOEs must be recorded with short mixing times and at low temperature (see below).
The most suitable 2D and 3D NOESY and ROESY experiments for measurements with short mixing times use spin-lock purge pulses to suppress the water signal. These NMR technical details have been described in detail elsewhere.

[Fig. 2 spectra: panels (a) and (b); chemical shift axis in ppm, from ca. 10 to 1.]

Fig. 2 Cross-sections taken at the ω1 chemical shift of the water resonance through two-dimensional nuclear Overhauser enhancement (NOE) spectra recorded in the laboratory frame (NOESY) (a) and the rotating frame (ROESY) (b). The spectra were recorded at 600 MHz on a Bruker AM600 spectrometer, using a 20 mmol dm⁻³ solution of BPTI in 90% H2O-10% D2O, pH 3.5, 4 °C, with mixing times of 50 ms. Both experiments were conducted as 'soft-NOESY' and 'soft-ROESY' experiments (ref. 18), respectively, with a single semi-selective inversion pulse centred at the frequency of the water resonance in the middle of the evolution period. In these experiments most of the diagonal peaks are eliminated, which results in a reduction of t1 noise and the ensuing baseline artifacts. The water resonance was suppressed with spin-lock pulses as described in ref. 16. The NOE cross-peaks with the water resonance are identified above the NOESY cross-section, using the proton type, the one-letter amino acid symbol and the sequence number in the polypeptide chain.

In 2D NOESY and ROESY experiments the NOEs with the protons at the bulk-water chemical shift are all located on the 1D cross-section through the water line, which is then quite crowded with lines. Improved resolution can be obtained with homonuclear or heteronuclear 3D NMR experiments. Homonuclear ¹H 3D NOESY-TOCSY and 3D ROESY-TOCSY are particularly powerful experiments to assign overlapping cross-peaks with the water line, but they have relatively poor sensitivity when compared to heteronuclear 3D NMR experiments.
Heteronuclear correlated 3D NOESY and 3D ROESY experiments have a very good sensitivity with ¹³C- or ¹⁵N-enriched proteins, but with these experiments information is obtained only about NOEs with those protein protons that are bound to ¹³C or ¹⁵N, respectively. The results on BPTI and oxytocin described in this article resulted exclusively from homonuclear ¹H NMR experiments.

Chemical exchange cross-peaks between the water signal and the resonances of labile polypeptide protons can be minimized by measuring at low temperature and suitably chosen pH values (Fig. 1). With BPTI, the amide proton exchange rates are slowest around pH 3.5, whereas the exchange of the lysyl side chain NH3+ groups is slower at lower pH values and, as mentioned in the preceding section, the hydroxy protons of the side chains of tyrosine, threonine and serine are best observed at pH values around 5.5.

[Fig. 3 spectra: panels A and B; chemical shift axis in ppm, from 4 to 0.]

Fig. 3 2D cross-sections taken at the ω1 frequency of the water signal through a homonuclear 3D ¹H NOESY-TOCSY spectrum (A) and a 3D ¹H ROESY-TOCSY spectrum (B). Same sample and temperature as in Fig. 2, mixing times 50 ms (NOESY), 25 ms (ROESY) and 27 ms (TOCSY), 500 MHz ¹H frequency. In A, negative contour levels are plotted with dashed lines. In B, only negative levels were plotted. The peaks are identified with the assignment of the proton interacting with the water resonance. Reproduced with permission from G. Otting, E. Liepinsh, B. T. Farmer II and K. Wüthrich, J. Biomol. NMR, 1991, 1, 209.

Fig. 2 shows cross-sections parallel to the ω2-axis taken at the ω1 frequency of the water resonance in 2D NOESY and ROESY spectra of BPTI. The spectra were recorded at 4 °C and pH 3.5. The chemical shift of the water is 5.01 ppm, which does not coincide with any of the proton resonances of BPTI. Positive peaks in the ROESY cross-section [Fig.
2(b) J are due to chemical exchange of the hydroxy protons of tyrosyl residues near lOppm, and to exchange of the side-chain NH; protons of Lys-41 and Lys-46 at 7.2 and 7.6 ppm, respectively. The NOESY cross-section of Fig. 2( a) shows positive cross- peaks for chemical exchange as well as for NOEs with long-lived interior water molecules and with the hydroxy protons. Negative NOESY cross-peaks are observed for a few methyl resonances near 0.8 ppm and the sCH3 group of Met-52 at 2.2 ppm, which are in contact with surface hydration water. It is readily seen from Fig. 2 that the negative NOESY cross-peaks are relatively weak and could easily be dominated by overlapping positive cross-peaks.Therefore, short mixing times must be used for their observation to avoid cancellation by spin diffusion of positive magnetization from the cross-peaks with the hydroxy protons and the interior water molecules to the other resonances of the protein. The chances to observe the weak negative NOESY cross-peaks with surface hydration water are significantly enhanced by the improved spectral resolution in three-dimensional NMR spectra. Fig. 3 shows the spectral region (02=0.5-4.7 ppm, to3=4.7 pmm) of 2D cross-sections taken through a 'H 3D NOESY-TOCSY spectrum and a 'H 3D ROESY-TOCSY spectrum of BPTI, respectively, at the o1frequency of the water resonance. The peaks on the diagonal come from the transfer of magnetization from the water line to the protein resonances during the NOESY or ROESY mixing, respectively.The diagonals of Fig. 3 A and B thus correspond to the cross-sections shown in Fig. 2(a) and (b). Off-diagonal peaks arise because the magnetization precessing at the 0, frequency after the NOESY or ROESY mixing period is further transferred to scalar coupled protons during the TOCSY mixing period. These TOCSY-relayed peaks are then detected at the o3frequency of the scalar coupled protons. 
Therefore, the well separated off-diagonal peaks provide a convenient way to determine the sign and the assignment of the corresponding diagonal peaks, which represent the direct NOEs with the water line.

Measurements of the Lifetimes of Hydration Water Molecules with Respect to Exchange with the Bulk Water

A lower limit for the exchange rates of the four interior water molecules in BPTI was obtained from experiments with the paramagnetic shift reagent CoCl2 (ref. 20). Water molecules bound to Co²⁺ experience large chemical shifts and some broadening of the ¹H NMR lines due to the interactions with the unpaired electrons. The exchange of water molecules in and out of the hydration sites of Co²⁺ is very fast, so that these paramagnetic effects are averaged over the bulk water (ref. 21). A 30 mmol dm⁻³ concentration of Co²⁺ thus causes a ¹H shift for the bulk water of ca. 0.25 ppm (ref. 20). Although the Co²⁺ ions have no access to the interior water molecules in BPTI, the NOESY cross-peaks with the interior waters were observed at the bulk water chemical shift also after the addition of Co²⁺, indicating rapid exchange between interior hydration sites and the bulk water. For the exchange rate constant a lower limit of $k_m > 50\ \mathrm{s^{-1}}$ was thus established (ref. 20). An upper limit of $k_m < 10^{[\ldots]}\ \mathrm{s^{-1}}$ for the interior water molecules is obtained from the fact that the cross-relaxation rate with nearby polypeptide protons, $\sigma^{\mathrm{NOE}}$, is negative (note that negative $\sigma^{\mathrm{NOE}}$ values result in positive NOESY cross-peaks and vice versa). In contrast, the small positive $\sigma^{\mathrm{NOE}}$ values observed between the water resonance and polypeptide protons on the protein surface indicate residence times of the surface hydration waters in the subnanosecond time range.

These exchange rate limits were derived from explicit models describing the motions of the vector connecting a water proton with a proton of the protein. For such model considerations it is of pivotal importance that the cross-relaxation rates in the laboratory frame, $\sigma^{\mathrm{NOE}}$, or in the rotating frame, $\sigma^{\mathrm{ROE}}$, differ in their functional dependence on the spectral densities, $J(\omega)$ (refs. 2, 12, 22):

$$\sigma^{\mathrm{NOE}} \propto 6J(2\omega_0) - J(0) \qquad (1)$$

$$\sigma^{\mathrm{ROE}} \propto 3J(\omega_0) + 2J(0) \qquad (2)$$

where $\omega_0$ is the Larmor frequency of the protons. Eqn. (1) and (2) show that $\sigma^{\mathrm{ROE}}$ is always positive because the spectral densities $J(\omega)$ have finite positive values at all frequencies (ref. 22). In contrast, the sign of $\sigma^{\mathrm{NOE}}$ depends on the explicit functional forms of $J(2\omega_0)$ and $J(0)$, which in turn are related to the rate processes that govern the modulation of the dipole-dipole coupling defined by the length and orientation of the interproton vector. The sign and value of the ratio $\sigma^{\mathrm{NOE}} : \sigma^{\mathrm{ROE}}$ thus becomes the quantity of prime interest. Analyses of rigid-sphere models, where the surface hydration water is assumed to be part of the protein molecule, and of 'wobbling in a cone' models, where the hydration water molecules are assumed to be flexibly bound to particular hydration sites of the protein, showed that the experimental observation of positive $\sigma^{\mathrm{NOE}} : \sigma^{\mathrm{ROE}}$ values could be rationalized only with the assumption that translational diffusion of the protein and the water is a dominant additional factor in the modulation of the dipole-dipole couplings. Positive $\sigma^{\mathrm{NOE}}$ values are then predicted for relative diffusion coefficients, $D$, greater than $3 \times 10^{-6}\ \mathrm{cm^2\,s^{-1}}$ (ref. 25) (note that the self-diffusion coefficient of pure water at 4 °C is ca. $12 \times 10^{-6}\ \mathrm{cm^2\,s^{-1}}$; ref. 26). The diffusion coefficients can be translated into residence times of the hydration water molecules on the protein surface using the Einstein-Smoluchowski relation

$$\tau = x^2 / D \qquad (3)$$

Defining an average displacement, $x$, of 4.0 Å as the criterion for complete water proton exchange in and out of a hydration site, $\sigma^{\mathrm{NOE}}$ is positive for lifetimes shorter than ca. 500 ps.
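The sign behaviour of eqn. (1) and (2) and the arithmetic of eqn. (3) can be illustrated numerically. The following is a minimal sketch only: it assumes a single-Lorentzian spectral density, $J(\omega) = \tau_c / (1 + \omega^2\tau_c^2)$, which is a textbook simplification and not the translational-diffusion model used in the analysis above. The function names, the 600 MHz Larmor frequency and the value $D = 3 \times 10^{-6}\ \mathrm{cm^2\,s^{-1}}$ are likewise assumptions chosen for illustration.

```python
import math


def spectral_density(omega, tau_c):
    """Single-Lorentzian J(omega); an assumed model, not the authors' diffusion model."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)


def sigma_noe(omega0, tau_c):
    """Laboratory-frame cross-relaxation rate up to a positive constant, eqn (1)."""
    return 6.0 * spectral_density(2.0 * omega0, tau_c) - spectral_density(0.0, tau_c)


def sigma_roe(omega0, tau_c):
    """Rotating-frame cross-relaxation rate up to a positive constant, eqn (2)."""
    return 3.0 * spectral_density(omega0, tau_c) + 2.0 * spectral_density(0.0, tau_c)


def residence_time(displacement_cm, diffusion_cm2_per_s):
    """Einstein-Smoluchowski estimate tau = x^2 / D, eqn (3), in seconds."""
    return displacement_cm ** 2 / diffusion_cm2_per_s


# Assumed proton Larmor frequency: 600 MHz, converted to rad/s.
omega0 = 2.0 * math.pi * 600e6

# A short (50 ps) and a long (2 ns) correlation time, both hypothetical.
for tau_c in (50e-12, 2e-9):
    noe_sign = "positive" if sigma_noe(omega0, tau_c) > 0 else "negative"
    roe_sign = "positive" if sigma_roe(omega0, tau_c) > 0 else "negative"
    print(f"tau_c = {tau_c:.1e} s: sigma_NOE {noe_sign}, sigma_ROE {roe_sign}")

# Displacement of 4.0 Angstrom (4.0e-8 cm) with the assumed D.
tau = residence_time(4.0e-8, 3e-6)
print(f"residence time = {tau * 1e12:.0f} ps")
```

Under these assumptions the rotating-frame rate stays positive for both correlation times while the laboratory-frame rate changes sign between them, and the 4.0 Å displacement gives a residence time of roughly 530 ps, in line with the ca. 500 ps quoted above.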
This value is much shorter than the lifetime of a proton in a water molecule with respect to exchange by hydrolysis (ref. 27). We therefore conclude that positive $\sigma^{\mathrm{NOE}}$ values (i.e., negative NOESY cross-peaks) manifest rapid exchange of complete water molecules between the bulk solvent and the protein hydration sites.

Survey of the NMR Data on Surface Hydration of the Polypeptide Hormone Oxytocin and the Globular Protein BPTI

Oxytocin has a flexible conformation in aqueous solution and contains no interior water molecules. Because of its small size even its one-dimensional ¹H NMR spectrum is well resolved, so that the NOEs of the individual polypeptide protons with the water resonance are well separated in 2D NOESY and 2D ROESY experiments. In aqueous solution at temperatures below 6 °C the overall rotational tumbling of oxytocin is in the slow-motional regime (ref. 22), with a rotational correlation time longer than 2 ns. Correspondingly, all NOESY cross-peaks between different polypeptide protons are positive. In contrast, negative NOE cross-peaks between the water line and the peptide signals were observed throughout in a NOESY spectrum recorded at 6 °C, showing that the residence times of the hydration water molecules are shorter than 500 ps. The fact that all ¹H resonances of oxytocin have negative NOESY cross-peaks with the water line (apart from some positive cross-peaks due to chemical exchange) shows that there are no hydration sites which would be occupied by much more stably bound waters than others.

The aforementioned conclusions were confirmed by experiments in the temperature range 0 to −25 °C, using oxytocin in a mixed solvent of 60% H2O-40% [2H]acetone (ref. 14). At all temperatures, positive NOESY cross-peaks were observed between different peptide protons. In contrast, the sign of the NOE cross-peaks between the water protons and the protons of the polypeptide is negative above 0 °C and positive at −25 °C.
At some intermediate temperature the NOESY cross-peaks vanish, while the intensity of the ROESY cross-peaks increases continuously with decreasing temperature. For different polypeptide protons the sign inversion of $\sigma^{\mathrm{NOE}}$ occurs at different temperatures, showing that there are variations of the hydration water residence times in different sites on the protein surface. From these low-temperature experiments there was again no evidence for hydration water that would be bound to oxytocin with a lifetime exceeding 500 ps at temperatures above 0 °C.

Based on the experience gained with oxytocin, one might expect that most or all polypeptide protons near the surface of globular proteins should show negative NOESY cross-peaks with the water. So far, in BPTI only a limited number of these expected NOEs have been observed (Fig. 3). We attribute the apparent absence of many of the expected NOESY cross-peaks to the low sensitivity of the experiments used. Most of the negative cross-peaks seen in Fig. 2 and 3 are with intense resonances of BPTI, such as those of methyl groups. This sensitivity criterion applies, however, only to weak NOEs from short-lived hydration waters. Long-lived surface hydration waters would produce much stronger, positive NOESY cross-peaks, comparable to those seen for the internal waters. It is therefore very unlikely that NOEs with surface hydration water molecules bound with residence times >500 ps would have escaped detection by both NOESY and ROESY.

Discussion: Comparison of the X-Ray Diffraction Data on Individual Hydration Water Molecules in BPTI Crystals and the Corresponding NMR Data obtained in Aqueous Solution

NMR experiments in solution and X-ray diffraction experiments with protein single crystals are sensitive to different aspects of protein hydration. The sign and intensity of the protein-water NOEs observed by NMR reflect primarily the residence times of the water molecules near the protein protons.
In contrast, X-ray diffraction probes the total fraction of time that a water molecule spends at a particular point in space, but is largely insensitive to the residence time at that site on any particular visit (ref. 28). On this basis it is of special interest to investigate how the presence or absence of ordered, X-ray-observable hydration water in the single crystals can be correlated with the residence times of water molecules in corresponding hydration sites as observed by NMR in solution.

BPTI has the same molecular architecture in solution and in crystals, which includes the four internal water molecules. In Plate 1 the crystal-structure atomic coordinates 5PTI (refs. 3, 6) were used to generate molecular models, visualizing selected surface properties of the protein in the crystals, and the X-ray and NMR observations on surface hydration. The 'back view' shown is representative of the complete protein surface (the 'front view' was previously presented in ref. 11). With the exception of the protein-protein contact sites in the crystal lattice [yellow in Plate 1(a)], the entire protein surface must be covered with hydration water molecules. Only part of the hydration water has so far been observed by either of the two methods considered here. In the BPTI crystal structure, ca. 40% of the protein surface is involved in protein-protein contacts, 25% is covered with X-ray-observable hydration waters attributed to the central protein molecule, and 15% makes contacts with X-ray-observable water molecules attributed as hydration waters to neighbouring protein molecules [Plate 1(b)]. The remaining 20% of the protein surface must be in contact with water molecules that are not sufficiently well ordered to be seen by X-ray diffraction experiments [compare Plate 1(a), where white indicates water-accessible hydrogen atoms in the crystal lattice, with Plate 1(b), which shows clearly that numerous accessible sites are not covered with X-ray-observable waters].
Plate 1 Stereo views of a space-filling CPK representation of the crystal structure 5PTI of BPTI (ref. 6), using the following colour codes to visualize salient features of the structure in crystals and in solution. (a) Protein surface. Hydrogen, carbon and sulfur atoms are coloured grey, nitrogens blue and oxygens red; hydrogen atoms with more than 20% solvent-accessible surface area in the crystal structure are white and hydrogen atoms within a distance of ≤3.0 Å from neighbouring protein molecules in the crystal lattice are yellow. (b) Hydration observed in crystals. All polypeptide atoms are grey, the hydration water molecules attributed to this protein molecule in the crystal are green, additional water molecules attributed to hydration sites on neighbouring protein molecules yet located within 3.0 Å are blue-green. (c) NMR observations in aqueous solution. Hydrogen atoms for which ¹H-¹H NOE cross-peaks with the water resonance were observed are either brown (positive NOESY cross-peaks) or yellow (negative NOESY cross-peaks); polypeptide atom groups with proton chemical shifts at the water resonance, i.e. the side-chain hydroxy protons, the N-terminal amino protons and the carboxyl oxygens, are coloured magenta; interior hydration waters and the surface-bound water molecule W129, which was observed identically in all three crystal structures (refs. 5-7) (see text), are green; the hydrogen bonding partner of W129, i.e. the amide proton of Ile-19, is red.

It is further worth noting that more than 40% of the X-ray-observable water molecules are in contact with two protein molecules in the crystal lattice and that all observed waters are in contact with at least one protein molecule, but that most of the observed surface hydration water molecules are also accessible for contact with X-ray-unobservable water in the crystal.

In aqueous solution all polypeptide hydrogen atoms giving rise to negative $\sigma^{\mathrm{NOE}}$ values with the water protons [brown in Plate 1(c)] are located either near the N-terminus, a carboxy group, a hydroxy proton [magenta in Plate 1(c)], or near one of the four interior water molecules [green in Plate 1(c)]. From the ensemble of all the experimental observations described in the preceding sections, we arrive at two important points to be made on surface hydration in solution: (i) nearly all positive NOESY cross-peaks between the water resonance and resonances of polypeptide protons on the protein surface are due to the hydroxy protons of serine, threonine and tyrosine, rather than to stably bound water molecules; (ii) the surface hydration sites containing well ordered, X-ray-observable water molecules in the crystal structure have similar residence times for hydration waters in solution as other surface areas for which the hydration water is disordered in the crystals and not observable by X-ray diffraction.

Point (i) is supported by the aforementioned experiments in the pH range 5.0-6.5, which showed that almost all of the polypeptide protons near the protein surface that have positive NOESY cross-peaks with the water resonance at more acidic pH give rise to strong NOEs with the hydroxy proton resonances of Ser, Thr and Tyr. Point (ii) results from two independent observations. First, out of a total of ca.
60 X-ray-observable surface hydration waters there are only six water molecules that have conserved hydrogen-bonding partners in all three single-crystal structures of BPTI (refs. 5-7). Of those, the hydration sites W143 and W129 [Plate 1(c)] could so far be characterized by the NMR data. In the crystal structures, W143 is within hydrogen-bonding distance of the amide proton of Ala-25, which has vanishing cross-peak intensity with the water line in the NOESY experiment and a weak NOE cross-peak in the ROESY spectrum. W129 is within hydrogen-bonding distance of the amide proton of Ile-19, which shows a negative NOESY cross-peak with the water. From these data upper limits for the residence times of <500 ps for W143 and <300 ps for W129 can be established, i.e. values in the same range as for other surface hydration waters. Secondly, in the crystal structure 5PTI (ref. 6) more than 50% of the X-ray-observable water molecules are within 3.0 Å of a backbone carbonyl oxygen, 40% are near a charged group, and only seven out of a total of 63 water molecules are in contact with the uncharged OH groups of the eight residues of Ser, Thr and Tyr. If the X-ray-observable surface hydration sites were characterized by outstandingly long residence times of the hydration water molecules in solution, correspondingly large, positive NOESY cross-peaks would be observed. This was clearly not the case, since all but one of the positive NOESY cross-peaks with the water resonance could be unambiguously attributed to intramolecular NOEs with OH protons of Ser, Thr and Tyr, or the N-terminal amino protons. There were no strong positive NOESY cross-peaks left that could be attributed to the preferred binding sites for any of the ordered hydration water seen near carbonyl oxygens or charged groups in the crystal structure of BPTI. From this evidence it can again be excluded that water molecules in these sites in the solution structure have significantly longer residence times than the other surface hydration waters.
In conclusion, the presently available data indicate that the results obtained by X-ray diffraction in protein crystals and NMR spectroscopy in aqueous solution agree in one aspect of protein hydration: interior waters, which are part of the protein molecular architecture, are observed in identical locations of the protein molecule in solution and in crystals. In contrast, the evidence surveyed in this article implies that the extent to which the surface hydration water molecules are ordered, and hence observable by X-ray diffraction in protein crystals, cannot be related to the residence times of water molecules in the corresponding sites in solution. This is possibly the most direct experimental documentation to date of the fact that extreme care must be exercised when using X-ray crystal data on solvent-accessible protein surface areas as a basis for discussions on structural and functional properties of these proteins in the physiological body fluids.
Although single-crystal X-ray diffraction can give rather precise data on the spatial arrangement of surface amino acid side chains and surface hydration water, which contrasts with the comparatively poorly defined, motionally disordered state usually seen for these molecular regions by NMR in solution (refs. 31-33), it appears questionable in the light of the presently described evidence that the crystal data on the molecular surface can be related in straightforward ways with solution properties such as protein folding and protein-protein or protein-substrate interactions. This is further supported by the consideration that interior hydration waters and surface hydration sites are clearly distinguished by the widely different residence times of the water molecules manifested in the sign of $\sigma^{\mathrm{NOE}}$ in solution, whereas there is a priori no X-ray evidence to distinguish between the properties of interior waters and highly occupied surface hydration sites. For future research a general structural characterization of hydration sites that give rise to 'interior-like' behaviour of the bound water molecules is therefore of interest. Furthermore, while all surface hydration sites observed so far in solution are characterized by residence times shorter than 500 ps, differences in residence times in the range <500 ps have been evidenced even for the flexibly disordered polypeptide oxytocin. Although a quantification of these differences would at present be premature, further refinements of the description of protein surface hydration in solution can thus be expected from future improvements of the sensitivity of the NOE experiments.

Financial support by the Schweizerischer Nationalfonds (31.25174.88) is gratefully acknowledged. We thank Mrs. E. Huber for the careful processing of the manuscript.

References
1 T. L. Blundell and L. N. Johnson, Protein Crystallography, Academic Press, New York, 1976.
2 K. Wüthrich, NMR of Proteins and Nucleic Acids, Wiley, New York, 1986.
3 F. C. Bernstein, T. F. Koetzle, G. J. B. Williams, E. F. Meyer Jr., M. D. Brice, J. R. Rodgers, O. Kennard, T. Shimanouchi and M. Tasumi, J. Mol. Biol., 1977, 122, 535.
4 G. Otting and K. Wüthrich, J. Am. Chem. Soc., 1991, 111, 113.
5 J. Deisenhofer and W. Steigemann, Acta Crystallogr., Sect. B, 1975, 31, 238.
6 A. Wlodawer, S. Walter, R. Huber and L. Sjolin, J. Mol. Biol., 1984, 180, 301.
7 A. Wlodawer, J. Nachman, G. L. Gilliland, W. Gallagher and C. Woodward, J. Mol. Biol., 1987, 198, 469.
8 G. Otting and K. Wüthrich, in Water and Ions in Biomolecular Systems, ed. D. Vasilesco, J. Jaz, L. Packer and B. Pullman, Birkhäuser, Basel, 1990, pp. 141-147.
9 G. M. Clore, A. Bax, P. T. Wingfield and A. Gronenborn, Biochemistry, 1990, 29, 5671.
10 J. D. Forman-Kay, A. M. Gronenborn, P. T. Wingfield and G. M. Clore, J. Mol. Biol., 1991, 220, 209.
11 G. Otting, E. Liepinsh and K. Wüthrich, Science, 1991, 254, 974.
12 A. A. Bothner-By, R. L. Stephens, J. Lee, C. D. Warren and R. W. Jeanloz, J. Am. Chem. Soc., 1984, 106, 811.
13 M. Eigen, Angew. Chem., 1963, 75, 489.
14 E. Liepinsh, G. Otting and K. Wüthrich, submitted.
15 B. A. Messerle, G. Wider, G. Otting, C. Weber and K. Wüthrich, J. Magn. Reson., 1989, 85, 608.
16 G. Otting, E. Liepinsh, B. T. Farmer II and K. Wüthrich, J. Biomol. NMR, 1991, 1, 209.
17 K. Wüthrich and G. Otting, Int. J. Quantum Chem., in the press.
18 R. Brüschweiler, C. Griesinger, O. W. Sørensen and R. R. Ernst, J. Magn. Reson., 1988, 78, 178.
19 G. Otting, L. P. M. Orbons and K. Wüthrich, J. Magn. Reson., 1990, 89, 423.
20 G. Otting, E. Liepinsh and K. Wüthrich, J. Am. Chem. Soc., 1991, 113, 4363.
21 T. J. Swift and R. E. Connick, J. Chem. Phys., 1962, 37, 307.
22 A. Abragam, Principles of Nuclear Magnetism, Clarendon, Oxford, 1961.
23 R. Richarz, K. Nagayama and K. Wüthrich, Biochemistry, 1980, 19, 5189.
24 T. Fujiwara and K. Nagayama, J. Chem. Phys., 1985, 83, 3110.
25 Y. Ayant, E. Belorizky, P. Fries and J. Rossett, J. Phys. (Paris), 1977, 38, 325.
26 K. T. Gillen, D. C. Douglas and M. J. R. Hoch, J. Chem. Phys., 1972, 57, 5117.
27 S. Meiboom, J. Chem. Phys., 1961, 34, 375.
28 W. Saenger, Annu. Rev. Biophys. Biophys. Chem., 1987, 16, 93.
29 K. Berndt, P. Güntert, L. Orbons and K. Wüthrich, unpublished results.
30 C. Branden and J. Tooze, Introduction to Protein Structure, Garland, New York, 1991, p. 284.
31 M. Billeter, A. D. Kline, W. Braun, R. Huber and K. Wüthrich, J. Mol. Biol., 1989, 206, 67.
32 K. Wüthrich, Science, 1989, 243, 45.
33 K. Wüthrich, J. Biol. Chem., 1990, 265, 22059.

Paper 2/00294I; Received 17th January, 1992.
ISSN:1359-6640
DOI:10.1039/FD9929300035
Publisher: RSC
Year: 1992
Data source: RSC
|
5. |
Functional zinc-binding motifs in enzymes and DNA-binding proteins |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 47-65
Bert L. Vallee,
|
|
Abstract:
Faraday Discuss., 1992, 93, 47-65

Functional Zinc-binding Motifs in Enzymes and DNA-binding Proteins

Bert L. Vallee* and David S. Auld

Center for Biochemical and Biophysical Sciences and Medicine and Department of Pathology, Harvard Medical School, 250 Longwood Avenue, Boston, MA 02115, USA

Zinc is now known to be an integral component of a large number and variety of enzymes and proteins involved in virtually all aspects of metabolism, thus accounting for the fact that this element is essential for growth and development. The chemistry of zinc, superficially bland, in reality has turned out to be ideally appropriate and versatile for the unexpected development of multiple and unique chemical structures which biology has used for specific life processes. The present discussion will centre on those distinctive zinc-binding motifs that are critical both to enzyme function and the expression of the genetic message. X-Ray diffraction structure determinations of 15 zinc enzymes belonging to IUB classes I-IV provide absolute standards of reference for the identity and nature of zinc ligands in their families. Three types of zinc enzyme binding motifs emerge through analysis of these: catalytic, coactive or cocatalytic, and structural. In contrast to zinc enzymes, virtually all DNA-binding proteins contain multiple zinc atoms. With the availability of NMR and X-ray structure analyses three distinct motifs now emerge for those: zinc fingers, twists and clusters.

The feature(s) of zinc chemistry on which the attractiveness of this element for biological systems might be based have not always been self-evident. The completely non-toxic nature of zinc is, of course, important and is likely related to its chemical properties. It is stable and inert to oxidation/reduction. In the neighbouring transition-metal elements, redox changes are major sources of changes in coordination geometries, rate of ligand substitution and amphoteric properties.
The restriction of these variables in the case of zinc provides stability in a biological medium whose potential is in flux. Zinc is amphoteric and exists in both metal hydrate and hydroxide forms at pH values near neutrality. Its coordination sphere is flexible. While zinc can adopt coordination numbers varying from two to eight, four, five and six coordination seem most frequent in biological systems. Collectively, these physicochemical features are important means for translating chemical structure into multiple biological functions. Thus, biology has evolved very varied permutations of zinc complexes with proteins to stabilize their structures and conformations to serve both in enzyme function and the expression of genetic messages. Distinctive motifs have been defined for each of these. A combination of N, O and S donor atoms derived jointly from three His, Glu or Asp, and Cys residues binds to zinc in catalytic sites to activate a water molecule. Two, three or four S atoms of Cys together with two, one, or zero N atoms of His, respectively, form tetradentate zinc sites which exclude H2O. These critically govern protein-protein and protein-DNA (or RNA) interactions, modifying local structure, folding and conformation. In the procollagenases and progelatinases, zinc is bound to donor atoms contributed by four amino acids. Replacement by water of the only cysteine-S atom among them converts their tetradentate zinc into a tridentate zinc atom (ref. 3). The resultant Cys/H2O ligand-exchange reaction activates these zymogens.

Fig. 1 Zinc-binding sites in zinc enzymes (site types shown: catalytic, coactive and structural, each with its Zn ligands)
The zinc clusters of metallothionein shown by NMR (ref. 4) and X-ray (ref. 5) structure determinations are examples of biological zinc coordination that are unique thus far to biology. The demonstration of such clusters in a DNA-binding protein, GAL4 (refs. 6-8), reinforces earlier surmises that the zinc cluster structure and other characteristics of metallothionein imply its involvement in the transmission of the genetic message. The discovery of zinc-sulfur clusters predicts yet additional and analogous alternative structures based on nitrogen and oxygen donors; their physiological potentials are best appreciated in the context of the emerging cell biochemistry.

Catalytic, Structural and Coactive Zinc Sites in Zinc Enzymes

X-Ray diffraction structure determinations both of zinc complex ions and enzymes now provide absolute structural standards of reference for the identity and nature of zinc ligands in proteins. Fifteen zinc enzymes belonging to IUB classes I, II, III and IV provide unambiguous details and virtually conclusive data which identify both their zinc ligands and coordination geometries. Analysis of the results demonstrates the existence of coordination geometries that are characteristic of the three types of zinc motifs that have been recognized in enzymes: catalytic, coactive or cocatalytic, and structural (Fig. 1).

Fig. 2 (a) Zinc ligand spacing in catalytic zinc sites. (b) Schematic of the functions of the H2O ligand in catalytic zinc sites of zinc enzymes (ionization, polarization, displacement). S, substrate; B, base

Catalytic Zinc Sites

Three His, Glu, Asp, or Cys residues provide the zinc ligands of the 12 known catalytic zinc sites in enzymes [Fig. 2(a)].
Three histidines are typical of the lyases, human carbonic anhydrase I and II (refs. 13, 14), the hydrolase Bacillus cereus β-lactamase (ref. 15) and the DD-carboxypeptidase of Streptomyces albus G (ref. 16).† Two histidines are characteristic of the hydrolases bovine carboxypeptidase A and B (refs. 18, 19), thermolysin, the neutral protease of B. thermoproteolyticus (ref. 20), B. cereus neutral protease (ref. 21), Pseudomonas aeruginosa elastase (ref. 22), B. cereus phospholipase C (ref. 23) and E. coli alkaline phosphatase (ref. 24). The catalytic zinc site of alcohol dehydrogenase is the only one so far where there is only one histidine (ref. 25); it is also unique among these enzyme active sites in that two cysteines participate (Cys-46 and Cys-174). In addition, residue 47 accommodates the phosphate of NAD(H) (ref. 25).‡ Glutamate is the oxygen donor in six and aspartate in one of these enzymes. Overall, zinc prefers the imidazolyl nitrogen by far in its catalytic sites [Fig. 2(a)].

H2O, the fourth ligand (L4) of active site catalytic zinc atoms, is not solely a critical feature but likely the very objective of this motif's design: the activation of water, so that it can be ionized, polarized or displaced [Fig. 2(b)]. Ionization of the activated water, or its polarization brought about by a base form of an active-site amino acid, provides hydroxide ions at neutral pH, and its displacement results in Lewis acid catalysis on the part of the catalytic zinc atom. The structure of the active site implies that both the identity of the three protein ligands and their spacing underlie mechanistic pathways to activate water, determining details of the ensuing catalytic reactions accomplished in conjunction with other active-site residues.

† The crystal structure of murine adenosine deaminase complexed with a transition-state analogue, 6-hydroxy-1,6-dihydropurine ribonucleoside, HDRP, has revealed the presence of a zinc atom (ref. 17) not previously known to be important in the activity of the enzyme. The zinc atom is coordinated to His-15, His-17, His-214 and the Oδ(2) of Asp-295 and the O(6) of HDRP. While the complex of the free enzyme is not known at present, a mechanism has been proposed where Asp-295 would act as a general base and the zinc would serve as an electrophile to activate a water ligand.

‡ Alcohol dehydrogenase is the only coenzyme-dependent zinc enzyme whose three-dimensional structure is known (ref. 25). The conjoint involvement of both zinc and NADH in the catalytic process calls for a suitable alignment of amino acid residues that can provide for both metal chelation and coenzyme binding sites. Remarkably, this has been accomplished in alcohol dehydrogenase by (i) using residues 46 and 47 as zinc and NADH binding ligands, respectively, (ii) providing two cysteines as ligands to the active site zinc, and (iii) elongating the short spacer between L1 and L2 from ~3 to 20 amino acids. The active zinc site of alcohol dehydrogenase is also the only one among the 12 catalytic sites here cited as structural standards that comprises only one histidine residue.

Short and Long Spacers

The regularity of amino acid spacing between the ligands of catalytic zinc atoms is striking (refs. 1, 2, 26). In 11 of the 12 enzymes listed, 'short' spacers, consisting of from one to three amino acids, separate the first two ligands, L1 and L2 [Fig. 2(a)]. Apparently, when properly oriented, L1 and L2 can form a bidentate zinc complex. It is equally characteristic that a 'long' spacer, from ca. 20 to ca. 120 residues, separates L3 from either L1 or L2, generally in the C-terminal end of the protein.
This long spacer arm could contribute to the induction of the active catalytic site, substrate-binding groups and hydrogen bonds to form the active centre. The long spacers intervening between the residues that bind the catalytic zinc atoms imply a much more flexible coordination geometry of the active site complex than that which is characteristic of structural sites; there the interligand distances are relatively short, as might be consistent with the possibility that zinc stabilizes overall protein structure and/or local conformation. Seemingly, the multiplicity of possible coordination numbers combined with the adaptability of zinc geometries can alternately impart local, structural and functional rigidity or flexibility, respectively. On the one hand, this could poise the zinc atom for catalysis, creating an entatic state susceptible to change through substrate and/or product interactions during catalysis (refs. 1, 2). On the other, such adaptability could generate multiple conformational states which would both suitably align and organize those amino acid side chains and hydrogen bonds that participate in the catalytic process and substrate binding as well as accommodate an outer-sphere ligand whose coordination might activate water. Flexible zinc coordination could thus prove instrumental to the induction of conformationally 'elastic' substrate-binding pockets which would have catalytic potential. Thus, differences in spacer lengths might be determinants of substrate specificity, the functions of water and the details of catalytic mechanisms (refs. 1, 2, 26). Remarkably, short and long spacers characteristic of the catalytic sites of monozinc enzymes recur in sites of coactive zinc-metal bridges (see below). These characteristic spacings have also been noted in zinc-binding sites at the interface of protein-protein interactions.§
These structural features of zinc metalloenzymes further call attention to the importance of protein folding and conformation, which are the basis of and maintain the structure and function of proteins. In zinc enzymes, the primary and secondary structure enveloping the short and long spacer arms might contain information on and give directions to the creation of zinc complexes with suitable coordination geometries and numbers. Zinc, and perhaps other metals, may in fact turn out to monitor and probe the folding process. Such considerations will no doubt bear on the design and synthesis of enzyme model systems. Overall, the catalytic potentials of zinc enzymes seem closely related to the nature of the short and long amino acid spacers and the environment that they create for metal ligands. Their incorporation into synthetic designs should reflect the potentials of catalysis and specificity inherent in zinc enzymes.

‡ When the serine protease tonin is crystallized from a buffer containing Zn2+, the resultant crystal contains one mole of zinc which is bound to the active-site His-57, His-97 and His-99.27 Apparently, His-97 and -99 constitute a nucleus for a bidentate zinc complex. Once formed, it can disrupt the interaction between the catalytic His-57 and Ser-195 pair, thereby inhibiting enzymatic catalysis. It seems novel that in the crystal structure a neighbouring protein molecule provides Glu-148 to serve as the fourth ligand to zinc. A similar type of interaction has been suggested as the basis of the strong binding of human growth hormone to the human prolactin receptor (Kd = 33 pmol dm-3).28 Mutagenic studies are the basis of the proposed zinc-binding site comprising His-18, His-21 and Glu-174 of the growth hormone and His-188 of the prolactin receptor. In both of these instances one protein molecule provides a short and long spacer while the second molecule supplies the fourth ligand, a situation reminiscent of the chemical basis of the Velcro mechanism in the activation of the latent zymogen form of matrix metalloproteinases.2

Fig. 3 Zinc ligands of carbonic anhydrases. Lightly shaded boxes denote the X-ray standard of reference. The asterisks denote those enzymes for which zinc was not measured directly. See ref. 2 for sequence citations. Reprinted with permission from B. L. Vallee and D. S. Auld, Biochemistry, 1990, 29, 5647. Copyright 1990 American Chemical Society.

Families of Catalytic Zinc Sites

The results of these X-ray structure determinations serve as standards of reference for other enzymes with similar functional characteristics in families of both identical and closely related enzymes from many different species.1,2 Both their structural identities (or the converse) and conformations of their active and structural enzymatic site zinc ligands can be recognized by comparing their amino acid sequences with those of the enzymes whose three-dimensional structures have been determined. Thus, as an example,1,2 the crystal structures of carbonic anhydrase I (ref. 13) and II (ref. 14) from human erythrocytes serve to identify the catalytic zinc ligands and their surroundings for 15 members of this family of zinc lyases (Fig. 3). A β-strand encompassing residues 88-108 supplies two of the zinc ligands, His-94 (L1) and His-96 (L2), separated by a single amino acid spacer. For 15 different carbonic anhydrases, the seven amino acids surrounding these ligands are 95% similar in chemical properties. The third ligand, His-119 (L3), is contributed by a 22 amino acid long spacer arm that contains a β-sheet extending from residue 113 to 126.
Four of the eight amino acids surrounding it are identical for the 15 sequenced carbonic anhydrases and the remaining four show a high degree of similarity.

Coactive or Cocatalytic Zinc Sites

We have named a second type of zinc enzyme binding site coactive or cocatalytic. These zinc sites occur in enzymes which contain two or more metal atoms. The metal atoms are always in close proximity to one another and function as a unit for catalytic activity. In these zinc sites at least one amino acid ligand forms a bridge between two zinc atoms or between a zinc atom and a different metal atom. In the past literature, the zinc (or other metal) atoms which turn out to form such bridges have been described functionally as 'modulating' or 'regulatory'. Thus far, the second (and third) metal atoms have been identified as zinc and magnesium in alkaline phosphatase,24 two zinc atoms in phospholipase C,23 and zinc and possibly magnesium in bovine lens leucine aminopeptidase.29 In all of these, aspartate and/or glutamate are the bridging amino acids.† In bovine erythrocyte superoxide dismutase copper is the active-site metal while zinc is the 'modulating' metal; both are bound to the same histidine, which thereby is the bridging amino acid. Among these enzymes, Asp and His serve most frequently as the zinc-binding amino acids, seven and six times, respectively, while Glu has been encountered only once thus far under these circumstances. When magnesium is one of the metal atoms the stability of the bridged metal is generally less than in the case of zinc.

In alkaline phosphatase, three metal atoms are clustered (see Fig. 1). The interatomic Zn(1)-Zn(2) distance is 3.94 Å while the Zn(2)-Mg distance is 4.88 Å. Magnesium is coordinated in a slightly distorted octahedron, interacting with the Oδ(2) of Asp-51, one of the oxygens of Glu-322, the hydroxyl of Thr-155, and three water molecules. This metal is bridged to the second zinc by the Oδ(1) of Asp-51.
Asp-369, His-370 and Ser-102 complete the coordination of the second Zn site.

In phospholipase C the zinc-aspartate-zinc bridge was detected once the enzyme was crystallized in 10 μmol dm-3 Zn2+. The interatomic Zn(1)-Zn(2) distance is ca. 4.4 Å, while the Zn(2)-Zn(3) distance is 3.3 Å. The third zinc is bound to the amino and carbonyl groups of Trp-1, the nitrogen of His-14, an oxygen [Oδ(2)] of the bridging ligand Asp-122 as well as an oxygen of a water molecule which also bridges the two metal atoms. The second zinc is bound to Oδ(1) of Asp-122, His-118, His-69 and Asp-55.

In lens aminopeptidase the second zinc site is bound less tightly, being coordinated to the ε-amino group of Lys-250, the carboxylate oxygens of Asp-273 and the second carboxylate oxygen of the bridging ligand Glu-334.‡ This metal is bridged to the more tightly bound first zinc site 2.88 Å away by the Oε(1) of Glu-334. The carboxylate oxygens of Asp-255 and Asp-332, as well as the latter's carbonyl oxygen, complete the coordination of the first zinc site.

In superoxide dismutase, the active-site Cu and Zn lie 6.3 Å apart. The zinc binds to His-61, -69 and -78 and the Oδ(1) of Asp-81 in tetrahedral geometry with strong distortion toward a trigonal pyramid having the buried Asp-81 at the apex.33 The imidazole side chain of His-61 acts as a bridge to the catalytic copper site, which also coordinates His-44, -46 and -118 in an uneven tetrahedral distortion from a square plane.

† Zinc and magnesium have also been implicated in the 3'-5' exonuclease activity of Escherichia coli DNA polymerase I. In this case zinc binds to Asp-355, Glu-357 and Asp-501. The magnesium is bridged to this site by Asp-355.

‡ In the first report of the zinc binding site29 zinc was believed to be bound to one carboxylate oxygen of Asp-273 and the second carboxylate oxygens of two bridging ligands, Asp-255 and Glu-334.

Structural Zinc Sites

The structural zinc atom of equine alcohol dehydrogenase is bonded to four cysteines, as is the case for zinc in the regulatory subunit of aspartate carbamoyltransferase. In the linear sequence these cysteines are close to one another, separated by 2, 2 and 7 or 4, 22 and 2 amino acids, respectively. For both ADH and ATCase the zinc tetrahedrally coordinated to the four cysteines prevents access of water or substrate to its coordination sphere. In both these instances the role of zinc apparently is to maintain the structure of the protein in its immediate vicinity. The effect of zinc in a structural site might be comparable to that of disulfide bonds and/or calcium. The predominance of sulfur ligands in these sites coincides with earlier views from both inorganic chemistry and geochemistry about the predilection of zinc for sulfur ligands.

Fig. 4 Putative zinc ligands of aminopeptidases. See ref. 2 for sequence citations.

Metalloexopeptidases (Unknown X-Ray Structure)

Monozinc aminopeptidases (EC 3.4.11.2), catalysing the hydrolysis of N-terminal amino acid residues of proteins, peptides and amino acid amides, have been identified in and isolated from a wide range of tissues and bacteria, but their X-ray and NMR structures have not yet been reported. The monozinc human intestinal aminopeptidase consists of 967 amino acids with a domain of ca. 300 amino acids, remarkably similar both to that in E. coli aminopeptidase N and rat-kidney aminopeptidase M (Fig. 4). In their domains the linear arrangement of two histidines and one glutamic acid is closely similar to that in the zinc-binding site of thermolysin, while there is no overall homology of this family to thermolysin (Fig. 4).
The short spacer between His-388 (L1) and His-392 (L2) of the intestinal aminopeptidase consists of three amino acids, identical with that of thermolysin, while the long spacer between His-392 (L2) and Glu-411 (L3) comprises 18 instead of 19 amino acids.2 Thus, the structure analysis of one enzyme family has proven to predict that of another.

Leukotriene A4 hydrolase (EC 3.3.2.6) exhibits 21% sequence identity with rat-kidney aminopeptidase M,35 but the former was known to be neither a peptidase nor an esterase. Closer inspection revealed that the homology is remarkable and limited primarily to one domain which might represent a putative active zinc site:2 a short spacer of three amino acids separates His-295 (L1) from His-299 (L2), and Glu-318 might be the third ligand (L3). If so, a 'long' spacer of 18 amino acid residues would separate L2 from L3, identical to the long spacer of monozinc aminopeptidases (Fig. 4). Based on these considerations, leukotriene A4 hydrolase was analysed, and indeed proved to contain 1 mol of zinc and no other metals.
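The spacer lengths quoted for these families follow directly from the residue numbers of the ligands. As a minimal check, assuming contiguous residue numbering and counting only the residues strictly between two ligands, the following Python sketch recomputes them; the helper names `spacer_length` and `spacers` and the dictionary `SITES` are illustrative only, while the residue numbers are those given in the text.

```python
def spacer_length(res_a, res_b):
    """Number of residues strictly between two ligand positions."""
    return abs(res_b - res_a) - 1


def spacers(l1, l2, l3):
    """Return (short spacer L1-L2, long spacer L2-L3)."""
    return spacer_length(l1, l2), spacer_length(l2, l3)


# Ligand residue numbers (L1, L2, L3) as quoted in the text.
SITES = {
    "carbonic anhydrase": (94, 96, 119),
    "intestinal aminopeptidase": (388, 392, 411),
    "leukotriene A4 hydrolase": (295, 299, 318),
}

for name, ligands in SITES.items():
    short, long_ = spacers(*ligands)
    print(f"{name}: short spacer {short}, long spacer {long_}")
```

The results (1 and 22 for carbonic anhydrase, 3 and 18 for both the intestinal aminopeptidase and leukotriene A4 hydrolase) agree with the values stated in the text.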
This enzyme has now proven to exhibit aminopeptidase activity toward alanine and leucine 4-nitroanilide, lysine 4-nitroanilide and leucine β-naphthylamide, and is inhibited by the aminopeptidase inhibitor bestatin as well as by captopril, an inhibitor of angiotensin-converting enzyme. However, while LTA4 hydrolase is an aminopeptidase, neither porcine kidney nor bovine intestinal aminopeptidase is able to catalyse the conversion of LTA4 into LTB4.37 Site-directed mutagenesis studies of the recombinant mouse LTA4 hydrolase have provided evidence for His-295, His-299 and Glu-318 being the zinc ligands. Thus, individual mutations of these residues to Tyr-295, Tyr-299 and Gln-318, respectively, produce proteins which neither bind zinc nor display LTA4 hydrolase or aminopeptidase activity.

The cell-surface glycoprotein BP-1/6C3 is 36% and 34% identical in amino acid sequence with human and rat aminopeptidase, respectively, and 21% identical with E. coli aminopeptidase and human LTA4 hydrolase. Its zinc-binding motif is also the same as that of these monozinc aminopeptidases (Fig. 4). Functional studies of this protein suggest its possible identity with aminopeptidase A, which acts on peptides with N-terminal acidic amino acids. Inhibition of its activity by metal-binding agents such as 1,10-phenanthroline, EDTA and 2,2'-bipyridyl further indicates that it may be a zinc metallopeptidase.

Metalloendopeptidases (X-Ray Structure Unknown)

In higher vertebrates, collagen is the most abundant protein; it accounts for ca. one third of the total, but most proteinases (including the neutral proteases) do not cleave the triple collagen helix. Collagenases have been postulated to be responsible for the physiology of the rearrangement, synthesis and degradation of connective tissues in growth and development, and the pathology of arthritis, emphysema, lupus, tumor metastasis, osteomalacia, wound healing, bone resorption, uterine involution and yet other conditions.
The substrate specificity of the bacterial collagenases (EC 3.4.24.7) and neutral proteinases differs markedly, much as they share similar zinc (and calcium) contents and pH activity optima. Six collagenases from Clostridium histolyticum are zinc enzymes. Their molecular weights range from 68 000 to 125 000 u43 and they contain 0.8-1.1 mol of zinc per mol of monomeric protein; their calcium contents vary from 1.9 to 6.8 mol mol-1.

In one particular domain of all known matrix metalloproteinases (collagenases, stromelysins, transins and human PUMP-1) a short spacer of three amino acids separates two histidines; these have been thought to correspond to those of the zinc-binding site of thermolysin. The identities of the three residues preceding and succeeding the presumed L1 and L2 are consistent with such deductions. In the thermolysin family, a 19-residue spacer separates His-146 (L2) from Glu-166 (L3). However, in the matrix metalloproteinases no Glu exists at the expected position nor at any one near it.2,46 The location and identity of L3 in the matrix metalloproteinases remain speculative; the assignment awaits the structure determination of their active-site zinc ligands.

Activation of Matrix Prometalloproteinases: 'Velcro' mechanism

Metalloproteinases that catalyse the hydrolysis of the major components of the extracellular matrix are synthesized as inactive precursors and then converted to the active form. Selective enzymatic cleavage of peptide bonds proceeding either by a 'one-by-one' or a 'zipper' mechanism47 has been observed in enzymes and hormone precursors, vasoactive products, proteins involved in growth and development, blood coagulation, fibrinolysis, digestion, complement activation and yet others.48 The mechanism of activation of the prometalloproteinases conforms to neither of these schemes but differs distinctly from both.

Fig. 5 Schematic of the Velcro mechanism for the activation of the matrix metalloproteinases. The small arrow indicates the region of the autocatalytic cleavage site. Ligands to the functional zinc are thought to be two histidines, while the unidentified third ligand L3 is symbolized by the crosshatched circle.

In a highly conserved region, PRCGV(N)PDV,2,49 the propeptides of all procollagenases and progelatinases contain one single Cys. This cysteine is believed to form a mercaptide with the single zinc atom of the enzyme, which it renders inactive. Fibroblast procollagenase can be activated by trypsin, stromelysin, plasmin and plasma kallikrein, organomercurials, salts such as NaI and NaSCN, detergents, oxidants such as NaOCl, heavy metals such as Au(I) compounds and Hg(II), and through thiol exchange. Apparently, the dissociation and/or displacement of that cysteine from the zinc atom generates enzymatic activity. Thus, cysteine seems to act like Velcro by 'sticking' to the zinc atom through its S group, thereby blocking and preventing the participation of zinc in enzymatic action (Fig. 5).2 The removal of the cysteine thereby transforms the zinc from tetradentate to tridentate with respect to protein ligands, H2O becoming the fourth ligand in this ligand-exchange reaction. The cysteine of the activation peptide blocks access of H2O to the active-site zinc atom, while its removal by either physiological or pathological processes and the subsequent ligand exchange for H2O or substrate complete the activation process. In the recombinant human fibroblast proenzyme a Cys can be replaced by Ser; this yields an active enzyme, entirely consistent with the view that Cys plays its role by binding to zinc in the latent form of the enzyme.
Mutations of several of the residues of the PRCGV(N)PD sequence in rat transin suggest that both the Arg and the Cys are essential for the maintenance of inactivity. While details of the process leading to activation still require delineation, the removal of the SH group from the zinc clearly is the principal chemical event that induces activity (Fig. 5). This provides another important example of the adaptability and versatility of zinc chemistry to biological needs.

Zinc Fingers, Twists and Clusters

A crucial role of zinc in DNA and RNA synthesis and cell division became apparent in the seventies. Soon thereafter Wu found that Xenopus transcription factor IIIA (TFIIIA), which activates the transcription of the 5S RNA gene, contains 2-3 mol zinc per mol of protein,56 focusing interest on a role for zinc specifically in the transcription process. Soon thereafter the primary structure of TFIIIA was shown to contain nine repeat sequences of ca. 30 amino acids, in each of which two Cys and two His residues are conserved. The 7S particle was then demonstrated to contain 7-11 zinc atoms,59 a stoichiometry which has been the subject of debate60,61 (but see below). The two Cys and two His residues per 30 amino acid unit were proposed to form a tetrahedral coordination complex with each of nine zinc atoms, generating peptide domains (zinc fingers62) that interact with DNA.

The TFIIIA sequence and the features of its zinc complexes have become the primary model for zinc proteins that bind to DNA. Numerous sequences have been thought to reflect both structural and functional similarities, leading to revision of past ideas and predictions of new mechanisms for the DNA-protein interactions.
The descriptive name 'zinc finger' has been adopted widely to encapsulate the implications of and inferences relating to the model. It has proven to be a convenient linguistic vehicle to convey the relevant chemical, biological, and mechanistic facts and hypotheses. The designation has also served to describe virtually any relatively short protein sequence that contains four or more Cys and/or His residues and is believed to function as a nucleic acid binding domain.

Previous efforts to identify and define the zinc-binding sites of zinc enzymes were based on the three-dimensional structures of one or two members of a given family as determined by X-ray or NMR methods. The results were then compared with sequences of members of the families to which they belong but on which no structure analysis had been performed as yet1,2 (see above). This scenario could not be followed for the DNA-binding zinc proteins, of course, until such structural data were obtained. This now has become a reality. As a consequence, three distinct motifs of zinc-binding sites in DNA-binding proteins have emerged: (1) zinc fingers, (2) zinc clusters and (3) zinc twists (Fig. 6). While many and perhaps even most zinc enzymes contain a single zinc atom, virtually all DNA-binding zinc proteins contain multiple zinc atoms. As a consequence, in the latter, interatomic Zn-Zn distances become important structural features characteristic of the mode of Zn binding in the various motifs.

Interatomic Zinc-Zinc Distances

In GAL4 the six Cys residues bind to two zinc atoms, forming a zinc thiolate cluster in which the zinc-zinc distance is ca. 3.5 Å (Fig. 6). The glucocorticoid (GR) and oestrogen (OR) receptor DNA-binding domains also contain two zinc atoms, but these are located at two separate and distinct sites. In each of these, four different Cys ligands bind to one zinc atom, and the resultant interatomic zinc-zinc distance is ca. 13 Å.
The three-dimensional structure reported for the Zif 268-DNA three-finger zinc protein reveals that the interatomic distance of these zinc atoms is ca. 27 Å. In all of these, the chemical details and stoichiometry of the zinc-binding sites are characteristic but differ completely, of course, from those of single zinc atoms at active sites of enzymes. Based on these admittedly very limited data, the interatomic zinc-zinc distances would thus far seem to be diagnostic of the characteristic motifs encountered for DNA-binding zinc proteins.

Fig. 6 Zinc-binding sites in DNA-binding proteins. Entries give domain structure, protein and r(Zn-Zn)/Å: finger, 27*; GR, 13; ER, 12; zinc cluster, GAL4, 3. * indicates the Zn-Zn distance from the first zinc finger to the second finger and ** that from the second to the third finger.

Zinc Fingers

The strictest classification, as proposed by Frankel et al., would confine the term zinc finger to those proteins in which there are multiple repeats of ca. 30 amino acids each that (1) have two Cys and two His residues and their spacing conserved, (2) conserve the two aromatic residues and leucine that are found in the TFIIIA repeats and (3) have a three-dimensional structure in which each repeat is presumed to resemble a 'finger'. Other classifications, such as 'classical' and 'non-classical' zinc fingers, have also been proposed, and subclassifications have been based on sequence homologies.66 However, 'zinc finger' has rapidly become both a part of the language and a concept conveying even more general ideas.

X-Ray crystallography of a native zinc-finger protein67 together with NMR data on single or double synthetic 'fingers' provide direct evidence for the structure of zinc fingers that serves as a prediction of function.
Small peptide domains have been synthesized to serve as zinc-finger models: ADR1-2,68 Xfin-31,69 SWI5-2 and mKr2. These NMR studies show that zinc is bound tetrahedrally to the two Cys and two His residues. An α-helix is observed in all four peptides but its length varies from 5 to 11 amino acids and includes both, one or zero His ligands. The structural features of the two-Cys zinc ligand region seem to vary considerably from one peptide to another. The cysteines are part of two β-strands arranged in a hairpin structure for the Xfin peptide,69 an approximately antiparallel β-sheet for the SWI5 peptide, some form of a turn for ADR168 or no β-sheet or -turn for the mKr2 peptide.72

A definitive understanding of how these zinc-finger domains serve in site-specific recognition of a particular DNA-binding protein has come from the X-ray diffraction of the DNA-binding domain derived from the mouse immediate early protein Zif268 (also known as Krox 24, NGFI-A and Egr1). The three-zinc-finger peptide was crystallized with a consensus DNA-binding site and the structure was solved at 2.1 Å.

Fig. 7 (a) Zinc fingers of Zif 268. The amino acids involved in DNA recognition are labelled in the shaded circles. (b) Zinc twist for glucocorticoid receptor. The amino acids involved in the α-helix formed between the two zinc sites are shown as shaded circles. The lettered circles are the amino acids important to contacting the DNA.

Each zinc-finger domain consists of an antiparallel β-sheet containing two Cys and an α-helix containing two His, held together by coordination of the Cys and His residues to the central zinc ion and by a set of hydrophobic residues. The interatomic distance between Zn(1) and Zn(2) is 26.6 Å and that between Zn(2) and Zn(3) is 27.4 Å (C. Pabo, personal communication). Importantly, in each zinc finger one zinc atom is coordinated to two His and two Cys as its base.
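The suggestion that Zn-Zn distances are diagnostic of the motif can be illustrated with a nearest-reference lookup. The Python sketch below is a toy illustration, not an established classifier: it assumes that the few distances quoted in the text (GAL4 ca. 3.5 Å; GR ca. 13 Å and OR ca. 12 Å; Zif268 26.6 and 27.4 Å) are representative of their motifs, which the text itself describes as resting on very limited data. The names `REFERENCE_DISTANCES` and `nearest_motif` are hypothetical.

```python
# Reference Zn-Zn distances in angstroms, taken from the values quoted
# in the text; treating them as representative is an assumption.
REFERENCE_DISTANCES = {
    "zinc cluster": [3.5],        # GAL4
    "zinc twist": [13.0, 12.0],   # GR, OR
    "zinc finger": [26.6, 27.4],  # Zif268, finger 1-2 and finger 2-3
}


def nearest_motif(distance):
    """Return the motif whose reference distance is closest to `distance`."""
    def gap(motif):
        return min(abs(distance - ref) for ref in REFERENCE_DISTANCES[motif])
    return min(REFERENCE_DISTANCES, key=gap)


for d in (3.5, 12.5, 27.0):
    print(d, "->", nearest_motif(d))
```

With only three motifs and five reference values, such a lookup can do no more than restate the pattern in Fig. 6; distances that fall between the reference groups should not be assigned with any confidence.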
Each of the three zinc fingers uses amino acid residues from the N-terminal portion of its α-helix to make contact with base pairs in the major groove [Fig. 7(a)]. These residues derive from the central peptide loop between the second Cys and first His of each 'finger' and include the amino acid immediately preceding the α-helix and the second, third and sixth ones of the α-helix. Fingers 1 (Arg-18 and -24) and 3 (Arg-74 and -80) use the arginine guanidinium group to interact with N(7) and O(6) of a G base. These interactions are stabilized further by the carboxylates of Asp-20 and Asp-76. These fingers recognize identical DNA base subsites, GCG, while finger 2 recognizes a different subsite, TGG. It does this by using Arg-46 and His-49 to interact with the G bases, again stabilized by the carboxylate of Asp-48. In this manner a relatively simple recognition pattern develops. The residue immediately preceding the α-helix contacts the third base of the primary strand subsite (5' - - G), the third residue of the α-helix can contact the second base (5' - G -) and the sixth residue can contact the first base (5' G - -).

The results of these studies demonstrate how zinc makes a direct and unsuspected contribution to the overall binding energy. The tetrahedral geometry around the zinc ion orientates the finger for site-specific interaction. Zinc also promotes a specific interaction of the first ligating His with the DNA backbone; the Nε of His coordinates the zinc while the Nδ hydrogen bonds to the phosphodiester oxygens. As multiple zinc-finger proteins are studied, additional permutations may become apparent.

Zinc Twists

The glucocorticoid (GR) and oestrogen (OR) receptors belong to the nuclear receptor class of proteins.72,73 Upon binding a hormone, the receptor translocates from the cytoplasm to the nucleus, where it binds to a specific DNA sequence (glucocorticoid or oestrogen responsive elements, GRE or ORE) and thereby modulates transcription.
A 150 amino acid segment of the human 87 000 u GR contains two zinc atoms and extends ca. 40 amino acids beyond both ends of the minimal DNA-binding domain.74 The 1H and 113Cd NMR solution structures of the 71 amino acid segment of the DNA-binding domain of GR75-78 and the 1H NMR structure of a corresponding 84 amino acid fragment of OR79 show two zinc atoms, each coordinated to four cysteines, which in these particular receptors are separated by ca. 13 and 12 Å, respectively.

Remarkably, these zinc-binding sites are not independent zinc finger-like substructures, but rather fold to form a single structural domain.75,79 The DNA-binding recognition site derives from the α-helix that is part of the linking peptide region between the two zinc sites, accounting for our choice of calling this a zinc twist [Fig. 7(b)]. A second α-helix, which is anchored by the C-terminal coordinating cysteines of the second zinc site, perpendicularly crosses the first helix near its midpoint, stabilizing it by hydrophobic interaction between the two helices. The globular peptide domain of the resultant zinc twist has a DNA recognition site between the two zinc atoms, each of which is coordinated to four Cys [Fig. 7(b)]. This is in marked contrast to the arrangement encountered in zinc fingers, where the DNA recognition site comes from within the zinc coordination site where zinc is coordinated to two His and two Cys [Fig. 7(a)].

X-Ray crystallographic analysis of the DNA-binding domain of GR complexed to either a GRE consensus sequence containing a symmetric but abnormal spacing of four nucleotides, GRE4, or a natural three-nucleotide-spaced GRE3, shows how the presence of DNA profoundly influences the structure of GR.
In both cases the presence of DNA causes the GR domain to dimerize, a situation which is not encountered in the NMR solution studies when GRE is not present, even when [GR] = 1 mmol dm-3. In addition, if the three-nucleotide-spaced GRE3 is used, each DNA recognition site of the dimer interacts properly with the major groove of the DNA, while one half of the dimer is out of register when the abnormal GRE4 is used to form the complex. The mode of interaction of GR with DNA is in general agreement with the molecular models for dimer interaction with DNA proposed on the basis of NMR studies of uncomplexed monomeric DNA-binding domains of GR and OR. The two zinc atoms are coordinated tetrahedrally to the expected eight cysteines, with the central α-helix of the zinc twist forming a complex with the major groove of the DNA. These studies further stress the importance of Arg-466, Val-462 and Lys-461 in this interaction [Fig. 7(b)]. Arg-466 makes two hydrogen bonds to a G(4) base in a manner similar to that seen for Zif 268. A methyl group of Val-462 makes a van der Waals contact with the 5-methyl group of the T(5) base, while Lys-461 directly hydrogen bonds to N(7) of G(6) and indirectly provides hydrogen bonds through a water molecule to O(6) of G(7) and O(4) of T(6).

The X-ray diffraction studies further reveal how the second zinc-binding site influences the dimerization process. Ala-477, Arg-479 and Asp-481 from the five amino acid loop between the first two cysteines and Ile-483, Ile-487 and Asn-491 from the nine amino acid loop between the second and third cysteines make critical dimer interface contacts. The function of zinc in the nuclear receptor family is clearly twofold. It stabilizes the helix involved in DNA recognition and aids in orientating the peptide fold of the second zinc site, which is critical to the dimerization process.
Zinc Clusters: Metallothioneins

The metallothionein class of zinc proteins has become increasingly important. Their structure is known but their function is not, though they have been suggested to play an important role in the transmission of genetic information and in metabolism, and an involvement in detoxification has been a popular tenet (cited in ref. 81). Recent data suggest that thionein, the metal-free protein, rather than metallothionein, may be the functional species82 (see below).

The molecular weight of metallothioneins is low (ca. 6700), but their metal content is high, usually 7 (or 6) mol mol-1, including zinc, cadmium and/or copper. There are 20 cysteines among the 62 (or 61) amino acids, but no cystine, heterocyclic or aromatic amino acids are present. Thirty-five of the residues of mammalian metallothioneins are invariant, including the N-terminal 1-7 and the central 29-38 amino acid residues. In Class I mammalian metallothioneins all 20 Cys residues are preserved, and the positions of nearly all basic residues are preserved almost completely. The blocked N-terminal residue is usually Met. The grouping of the numerous Cys residues into chelating Cys-Cys, Cys-X-Cys and Cys-X-Y-Cys sequences (X, Y = residues other than Cys) predisposes metallothionein to the binding of 'soft' metal ions in the form of mercaptides with characteristic spectra.83 Chelating agents and acidification remove the metals while abolishing the spectra.83 Thionein has been monosubstituted with zinc, cadmium, mercury, lead, bismuth, tin, cobalt, nickel, iron and technetium to form the corresponding metallothioneins, whose spectra have been characterized.
In all classes of metallothioneins the metal complexes are organized in metal-thiolate clusters, as determined by NMR for rabbit-liver Cd7-MT (isoform 2a),84 rat-liver Cd7-MT (isoform 2)4 and human liver Cd7-MT, and by X-ray crystallography for rat-liver Cd5Zn2-MT (isoform 2).5 A cluster structure has not been described hitherto for naturally occurring inorganic zinc and/or cadmium complexes. This mode of coordination, proven by a variety of spectroscopic techniques and X-ray diffraction analysis, is in the form of tetrahedral metal tetrathiolate bonds with some of the thiolate ligands shared between two divalent metal ions. In mammalian metallothioneins eight Cys serve as doubly coordinated bridging and 12 as singly coordinated terminal thiolate ligands. The early overall measurement of ca. 3 SH ligands per metal atom83 is in accord with this.

The three-metal cluster, B, preferentially loses cadmium, which tends to accumulate in the four-metal cluster, A [Fig. 8(a)]. Indications are that the thermodynamic stability and/or kinetic lability of the two metal thiolate clusters differ, resulting in random distribution of the two clusters, presumably accounting for resultant cooperativity. It is conceivable that these cluster features serve to regulate the metabolic roles of these metals, as indicated by fluctuations in the clusters and consequent intramolecular metal exchange. Spontaneous metal exchange among binding sites occurs within seconds in cluster B. 113Cd NMR studies have shown that Cd ions move extremely rapidly between different three-metal cluster binding sites via interprotein metal exchange. Zinc has also been shown to be transferred from Xenopus laevis transcription factor IIIA to the metal-free thionein clusters,82 indicating that thionein actively sequesters and redistributes zinc. The difference in the reported zinc contents of TFIIIA, i.e. 2-3 (refs. 56, 60) and 7-11 mol mol-1 (ref. 59), has been attributed to analytical and experimental parameters. However, the findings of Zeng et al.82 raise the possibility that the difference in zinc content reflects physiological differences brought about by thionein action. In that case, thionein might serve as a chemical monitor of TFIIIA action by modulating the number of zinc fingers actively engaged in the transmission process at any given time. The de novo biosynthesis of thionein could be accompanied by a redistribution of zinc among various intracellular proteins including the zinc enzymes, zinc fingers, zinc twists and zinc cluster motifs. Studies of these systems and their interaction with thionein could point to a general role of thionein/metallothionein in the regulation of such zinc-dependent systems and reveal a hierarchical order.82

Fig. 8 (a) Zinc thiolate clusters of metallothionein (Zn3 cluster and Zn4 cluster). (b) Zinc thiolate cluster of GAL4. Cys-11 and -28 are bridging ligands. Fig. 8(a) reprinted with permission from B. L. Vallee and D. S. Auld, Biochemistry, 1990, 29, 5647. Copyright 1990 American Chemical Society.

Zinc Cluster: GAL4

GAL4 from Saccharomyces cerevisiae is an 881 amino acid transcription factor that in the presence of galactose regulates the expression of genes encoding the galactose-metabolizing enzymes. The GAL4 (1-147) fragment containing the DNA-binding domain has been expressed in E. coli. Atomic absorption analysis reveals 1.14-2.47 mol of Zn per mole of protein.93,94 A GAL4 fragment purified from cells grown in the presence of Cd yielded 2.07 mol of Cd per mol of protein, while the E.
coli expressed LAC9 (84-228) transcription factor which regulates lactose and galactose metabolism in Kluyveromyces lactis contain 2.1 mol of zinc per mole of protein.95 '13Cd and EXAFS analysis of the two Cd GAL4 (1-147) protein showed that the Cd binds tetrahedrally to four cysteine sulfur ligands, suggesting that one or more shared cysteines could exist since the DNA-binding domain contains a conserved ~sequence of 6 cys (~.93*94 Subsequent 'H and '13Cd NMR studies of either two zinc or two cadmium GAL4 (62*)697 and GAL4 (7-49)96397 have shown that the two metal atoms coordinate to the six cysteines in a binuclear zinc thiolate cluster [Fig.8( b)]. The interatomic zinc-zinc distance for this zinc cluster is ca.3.5 A, reflecting a structure of this zinc coordination site that is distinctly different from those occurring in zincjngers and zinc twists.8 Cys-11 and -28 are the bridging ligands while the Cys-14/21 and Cys-31/38 pairs bind as monodentate ligands to the first and second zinc atoms, respectively.7797 Much as the Cys spacings (2, 6, 6 and 2 amino acids) in the first zinc coordination site of GR and that of GAL4 are identical, the structures of their zinc complexes differ Functional Zinc-binding Motifs Table 1 Zinc-binding sites in proteins in enzymes in DNA-binding proteins catalytic Zn-L3-H20 2x1,cluster Zn2Cysf5 coactive Zn-L3-H20 Zn, cluster Zn3Cys9 Zn4 cluster zn4cYsll{ Zn-Asp-M }structural ZnCys, Zn2 twist (znCys4)2Zn, finger ( ZnCys2His2), significantly [Fig.7(b)& 8(6)]. The spacings of cysteines (5, 9, 2 and 4 amino acids) for the second zinc site of GR differ from all of the above. Furthermore, it is apparent that some Cys (e.g. some found in the GR and OR receptors) neither are involved in binding zinc nor do they affect the capacity of some Cys residues to form bridging ligands between two zinc atoms.The primary sequences of GAL4 and GR are necessary but not sufficient to permit definitive structural predictions. 
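The ligand bookkeeping in this section can be checked with a little arithmetic. The sketch below assumes, as the text describes, that every metal is tetrahedrally coordinated by four thiolates and that each cysteine is either terminal (bound to one metal) or bridging (bound to two); the function names are illustrative only. With n metals and c cysteines, the bond count gives 4n = t + 2b and c = t + b, so b = 4n - c. Spacings are counted as the number of residues lying between successive Cys positions.

```python
def cys_spacings(positions):
    """Number of residues lying between successive Cys positions."""
    return [b - a - 1 for a, b in zip(positions, positions[1:])]

def bridging_terminal(n_metal, n_cys, coordination=4):
    """Split n_cys thiolates into (bridging, terminal).

    Assumes every metal has `coordination` thiolate ligands and each
    Cys binds either one metal (terminal) or two metals (bridging).
    """
    bridging = coordination * n_metal - n_cys
    terminal = n_cys - bridging
    if bridging < 0 or terminal < 0:
        raise ValueError("composition incompatible with the assumptions")
    return bridging, terminal

# GAL4 Cys positions quoted in the text.
gal4_cys = [11, 14, 21, 28, 31, 38]
print(cys_spacings(gal4_cys))  # the first four values are the quoted 2, 6, 6, 2

# Cluster compositions as read from Table 1 (Zn2Cys6, Zn3Cys9, Zn4Cys11).
for n_metal, n_cys in [(2, 6), (3, 9), (4, 11)]:
    print(n_metal, n_cys, bridging_terminal(n_metal, n_cys))
```

For GAL4 this gives two bridging and four terminal Cys, consistent with Cys-11 and -28 being the bridging ligands. The Zn3 and Zn4 clusters together give eight bridging and 12 terminal Cys, in line with the counts quoted for mammalian metallothioneins.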
Considering the limited number of structure determinations of DNA-binding zinc proteins so far, broad extrapolations from primary to tertiary structure and their relationship to function would seem premature. Identification of the DNA-recognition sites of zinc clusters awaits further NMR and X-ray diffraction studies of this factor when complexed to a DNA-binding sequence. However, an α-helix has been suggested to occur between residues 12 and 19.97 In the GAL4 family of fungal transcription factors this region contains several conserved Lys or Arg residues.8 The side chains of such amino acids could, of course, interact with DNA bases and/or the phosphate backbone.

Summary

Zinc in biology, its very presence questioned but 50 years ago, has recently become the hot spot of enzymatic catalysis, genetic expression and cellular messengers. The chemically stable but stereochemically flexible, non-toxic nature of zinc combined with its amphoteric properties has permitted it to orchestrate a number of zinc-binding motifs critical to life processes (Table 1). For zinc enzymes, catalytic, coactive (or cocatalytic) and structural zinc sites exist. For DNA-binding proteins, zinc fingers, twists and clusters exist. Analysis of the results of X-ray crystallographic structural determinations of 15 zinc enzymes has led to the identification of the properties of 12 catalytic, four coactive and two structural sites. Zinc forms complexes with nitrogen and oxygen just as readily as with sulfur, and this is reflected in catalytic sites having a binding frequency His >> Glu > Asp = Cys, three of which bind to the metal. Zinc is coordinated by three or four protein ligands (Asp > His >> Glu) for the coactive sites and four ligands (Cys only) for structural sites. Water is always a ligand to the catalytic zinc. The zinc-bound water is activated for ionization, polarization or displacement by the identity and arrangement of ligands coordinated to zinc.
The systematic spacing between the ligands is striking. In non-coenzyme-dependent zinc enzymes a short spacer (1-3 amino acids) enables formation of a primary bidentate zinc complex, whereas the long spacer contributes flexibility to the coordination sphere, which can poise the zinc for catalysis as well as bring other catalytic and substrate-binding groups into apposition with the active site. The coactive zinc sites occur in enzymes containing two or more metals in close proximity. A bridging Asp, Glu or His connects the two metals so they can work in concert to bring about catalysis. The structural zinc sites are compact domains containing only Cys ligands arranged in a tetrahedral coordination so as to affect local conformation. In contrast to zinc enzymes, virtually all DNA-binding proteins contain two or more zinc atoms. As a consequence, the interatomic Zn-Zn distance becomes an important structural feature of the various zinc-binding motifs encountered. The distance between successive zinc atoms in the zinc finger protein Zif 268 is ca. 27 Å. In this case, the zinc is firmly bound to two Cys and two His, with DNA-recognition sites being formed from an α-helical zinc peptide loop occurring within each zinc coordination site. Zinc orientates the 'finger' for site-specific interaction with the DNA. In marked contrast, the interatomic distance of Zn(1) to Zn(2) for the zinc twists, observed in the GR and OR receptors, is ca. 13 Å. The DNA-recognition site now comes from an α-helical peptide that is part of the linker peptide between the two zinc sites, which are each coordinated to four Cys. Zinc is important both to anchoring the recognition peptide for its interaction with DNA and in orientating the peptide fold of the second zinc site, which is crucial to the dimerization process. In zinc clusters (GAL4 transcription factor) six Cys residues bind to two zinc atoms forming a zinc thiolate cluster in which the Zn(1)-Zn(2) distance is ca.
3.5 A. Clearly, the results on GAL4, GR, OR and Zif 268 extend to members of those protein families. Other proteins may yet be shown to contain multiple zinc atoms which might encompass variable numbers and types of ligands, e.g. glutamic or aspartic acid and histidine.When such proteins contain multiple zinc centres, their zinc-zinc distances may well prove important indices both of overall structure and function in any given zinc protein. As additional structures become available, the detection of yet further motifs revealing other permutations of zinc chemistry may be expected. References 1 B. L. Vallee and D. S. Auld, Proc. Natl. Acad. Sci. USA, 1990, 87, 220. 2 B. L. Vallee and D. S. Auld, Biochemistry, 1990,29, 5647. 3 E. B. Springman, E. L. Angleton, H. Birkedal-Hansen and H. E. Van Wart, Roc. Natl. Acad. Sci. USA, 1990, 87, 364. 4 P. Schultze, E. Worgotter, W. Braun, G. Wagner, M. VaSBk, J. H. R. Kagi and K. Wuthrich, J. Mol. Biol., 1988, 203, 251. 5 A. H. Robbins, D. E. McRee, M. Williamson, S.A. Collett, N. H. Xuong, W. F. Furey, B. C. Wang and C. D. Stout, J. Mol. Biol., 1991, 221, 1269. 6 T. Pan and J. E. Coleman, Proc. Natl. Acad. Sci. USA, 1990, 87, 2077. 7 T. Pan and J. E. Coleman, Biochemistry, 1991, 30,4212. 8 B. L. Vallee, J. E. Coleman and D. S. Auld, Proc. Natl. Acad. Sci. USA, 1991, 88, 999. 9 B. L. Vallee, Experientia Suppl., 1979, 34, 19. 10 B. L. Vallee, in Zinc Enzymes, ed. I. Bertini, C. Luchinat, W. Maret and M. Zeppezauer, Birkhauser, Boston, MA, 1986, pp. 1-15. 11 B. L.Vallee, Experientia Suppl., 1987, 52, 5. 12 B. L. Vallee, Methods Enzymol., 1991, 205, 3. 13 K. K. Kannan, B. Notstrand, K. Fridborg, S. Logren, A. Orlsson and M. Petef, Proc. Natl. Acad. Sci. USA, 1975, 72, 51. 14 A. Liljas, K. K. Kannan, P.C. BergstCn, 1. Waara, K. Fridborg, B. Strandberg, V. Carlbom, L. Jarup, S. Logren and M. Petef, Nature (London), 1972, 235, 131. 15 B. J. Sutton, P. J. Artymiuk, A. E. Cordero-Borboa, C. Little, D. C. Phillips and S. G. 
Waley, Biochem. J., 1987, 248, 181. 16 0. Dideberg, P. Charlier, G. Dive, B. Joris, J. M. Frkre and J. M. Ghuysen, Nature (London), 1982, 299,469. 17 D. K. Wilson, F. B. Rudolph and F. A. Quiocho, Science, 1991, 252, 1278. 18 F. A. Quiocho and W. N. Lipscomb, Adu. Protein Chem., 1971, 25, 1. 19 M. F. Schmid and J. R. Herriott, J. Mol. Biol., 1976, 103, 175. 20 B. W. Matthews, J. N. Jansonius, P. M. Colman, B. P. Schoenborn and D. Dupourque, Nature New Biology, 1972, 238, 37. 21 R. A. Pauptit, R. Karlsson, D. Picot, J.A. Jenkins, A.-S. Niklaus-Reimer and J. N. Jansonius, J. Mol. Biol., 1988, 168, 525. 22 M.M. Thayer, K. M. Flaherty and D. B. McKay, J. Biol. Chem., 1991, 266, 2864. 23 E. Hough, L. K. Hansen, B. Birknes, K. Jynge, S. Hansen, A. Horvik, C. Little, E. Dodson and Z. Derewenda, Nature (London), 1989,338, 357. Functional Zinc-binding Motifs 24 E. E. Kim and H. W. Wyckoff, J. Mol. Biol., 1991, 218,449. 25 C. I. BrandCn, H. Jomvall, M. Eklund and B. Furugren, in Enzymes, ed. P. D. Boyer, Academic Press, New York, 3rd edn., 1975, vol. 11, p. 103. 26 B. L. Vallee and D. S. Auld, FEBS Lett., 1989, 257, 138. 27 M. Fujinaga and M. N. G. James, J. Mol. Biol., 1987, 195, 373. 28 B. C. Cunningham, S. Bass, G. Fuh and J. A. Wells, Science, 1990, 250, 1709.29 S. K. Burley, P. R. David, A. Taylor and W. N. Lipscomb, Proc. Natl. Acad. Sci. USA, 1990,87,6878. 30 S. K. Burley, P. R. David and W. N. Lipscomb, Roc. Natl. Acad. Sci. USA, 1991,88, 6916. 31 L. S. Beese and T. A. Steitz, EMBOJ., 1991, 25. 32 V. Derbyshire, N. D. F. Grindley and C. M. Joyce, EMBO J., 1991, 17. 33 J. A. Tainer, D. Getzoff, K. M. Beem, J. S. Richardson and D. C. Richardson, J. Mol. Biol., 1982, 160, 181. 34 R. B. Honzatko, J. L. Crawford, H. L. Monaco, J. E. Ladner, B. F. P. Edwards, D. R. Evans, S. G. Warren, D. C. Wiley, R. C. Ladner and W. N. Lipscomb, J. Mol. Biol., 1982, 160, 219. 35 B. Malfroy, H. Kado-Fong, C. Gros, B. Giros, J.-C. Schwartz and R. Hellmiss, Biochem. Biophys. Res. 
Commun., 1989, 161, 236. 36 J.2. Haeggstrom, A. Wetterholm, R. Shapiro, B. L. Vallee and B. Samuelsson, Biochem. Biophys. Res. Commun., 1990, 172,965. 37 J. 2. Haeggstrom, A. Wetterholm, B. L. Vallee and B. Samuelsson, Biochem. Biophys. Res. Commun., 1990, 173,431. 38 L. Oming, G. Krivi and F. Fitzpatrick, J. Biol. Chem., 1991, 266, 1375. 39 L. Oming, G. Krivi, G. Bild, J. Gierse, S. Ayken and F. H. Fitzpatrick, J. Biol. Chem., 1991,266, 16507. 40 J. F. Medina, A. Wetterholm, 0.Ridmark, R. Shapiro, J. 2.Haeggstrom, B. L. Vallee and B. Samuelsson, Roc. Natl. Acad. Sci. USA, 1991, 88, 7620. 41 Q. Wu, J. M. Lahti, G. M. Air, P. D. Burrows and M. D. Cooper, Proc. Natl. Acad. Sci. USA, 1990, 87, 993. 42 Q. Wu, L. Li, M. D. Cooper, M. Pierres and J. P. Gorvel, Roc. Natl. Acud.Sci, USA, 1991, 88, 676. 43 M. D. Bond and H. E. Van Wart, Biochemistry, 1984, 23, 3077. 44 M. D. Bond and H. E. Van Wart, Biochemistry, 1984, 23, 3085. 45 H. E. Van Wart and H. Birkedal-Hanson, Proc. Natl. Acad. Sci. USA, 1990,87, 5578. 46 B. L. Vallee and D. S. Auld, in Matrix Suppl. 1, ed. H. Birkedal-Hanson, 2.Werb, H. Welgus and H. Van Wart, Gustav Fischer Verlag, Stuttgart, 1992, pp. 5-19. 47 K. Linderstrom-Lang, in Lane Medical Lectures, Stanford Univ. Publ., Univ. Ser., Med. Sci. 1952, vol. 6, pp. 1-115. 48 H. Neurath, Chem. Scr., 1986, 27B, 221. 49 R. Sanchez-Lopez, R. Nicholson, M.-C. Gesnel, L. M. Matrisian and R. Breathnach, J. Mol. Biol., 1988, 263, 11892. 50 L. J. Windsor, H. Birkedal-Hansen, B. Birkedal-Hansen and J. A. Engler, Biochemistry, 1991,30,641.51 A. J. Park, L. M. Matrisian, A. F. Kells, R. Pearson, Z. Yuan and M. Navre, J. Biol. Chem., 1991,266, 1584. 52 B. L. Vallee, Experientia, 1977, 33, 600. 53 B. L. Vallee, in Biological Aspects of Inorganic Chemistry, ed. D. Dolphin, Wiley, New York, 1977, pp. 37-70. 54 D. S. Auld, Ada Chem. Ser., 1979, 172, 112. 55 B. L. Vallee and K. H. Falchuk, Philos. Trans. R. SOC.London, Ser. B, 1981, 294, 185. 56 J. S. Hanas, D. J. Hazuda, D. 
F. Bogenhagen, F. Y.-H. Wu and C.-W. Wu, J. Biol. Chem., 1983,258,14120. 57 A. M. Ginsberg, B. 0. King and R. G. Roeder, Cell, 1984,39,479. 58 R. S. Brown, C. Sander and P. Argos, FEBS Lett., 1985, 186, 271. 59 J. Miller, A. D. McLachlan and A. Klug, EMBO J., 1985, 4, 1609. 60 2.Shang, Y.-D.Liao, F. Y.-H. Wu and C.-W. Wu, Biochemistry, 1989, 28, 9790. 61 M. K. Han, F. P. Cyran, M. T. Fischer, S. H. Kim and A. Ginsburg, J. Biol. Chem., 1990, 265, 13792. 62 A. Klug and D. Rhodes, Trends Biochem. Sci., 1987, 12, 464. 63 J. M. Berg, Proc. Natl. Acad. Sci. USA, 1988, 85, 99. 64 J. M. Berg, Science, 1986, 232, 485. 65 A. D. Frankel, D. S. Bredt and C. 0. Pabo, Science, 1988, 240, 70. 66 T. L. South, P. R. Blake, R. C. Sowder, 111, L. 0. Arthur, L. E. Henderson and M. F. Summers, Biochemistry, 1990, 29, 7786. 67 N. P. Pavletich and C. 0. Pabo, Science, 1991, 252, 809. 68 G. Pkaga, S. J. Horvath, A. Eisen, W. E. Taylor, L. Hood, E. T. Young and R. E. Klevit, Science, 1988,241, 1489. 69 M. S. Lee, G. P. Gippert, K. V. Soman, D.A. Case and P. E. Wright, Science, 1989, 245, 635. 70 D. Neuhaus, Y. Nakaseko, K. Nagai and A. Klug, FEBS Lett., 1990,262, 179. 71 M. D. Carr, A. Pastore, H. Gausepohl, R. Frank and P. Roesch, Eur. J. Biochem., 1990, 188,455. 72 R. M. Evans, Science, 1988, 240, 889. 73 W. Wahli and E. Martinez, FASEB J., 1991, 5, 2243. B. L. Vallee and D. S. Auld 74 L. P. Freedman, B. F. Luisi, Z. R. Korszun, R. Basavappa, P. B. Sigler and K. R. Yamamoto, Nature (London), 1988,334, 543. 75 T. Hard, E. Kellenbach, R. Boelens, B. A. Maler, K. Dahlman, L. P. Freedman, J. Carlstedt-Duke, K. Yamamoto, J.-A. Gustafsson and R. Kaptein, Science, 1990, 249, 157. 76 T. Hard, E. Kellenbach, R. Boelens, R. Kaptein, K. Dahlman, J. Carlstedt-Duke, L. P. Freedman, B.A. Maler, E. I. Hyde, J.-& Gustafsson and K. Yamamoto, Biochemistry, 1990, 29, 9015. 77 T. Pan, L. P. Freedman and J. E. Coleman, Biochemistry, 1990, 29,9218. 78 E. Kellenbach, B. A. Maler, K. R. Yamamoto, R. 
Boelens and R. Kaptein, FEBS Lett., 1991,291,367. 79 J. W. R. Schwabe, D. Neuhaus and D. Modes, Nature (London), 1990,348,458. 80 B. F. Luisi, W. X. Xu, 2. Otwinowski, L. P. Freedman, K. R. Yamamoto and P. B. Sigler, Nature (London), 1991,352,497. 81 J. H. R. Kagi and Y. Kojima, Experientia Suppl., 1987, 52, 25. 82 J. Zeng, B. L. Vallee and J. H. R. Kagi, Proc. Natl. Acad. Sci. USA, 1991, 88, 9984. 83 J. H. R. Kagi and B. L. Vallee, J. Biol. Chem., 1961, 236, 2435. 84 A. Arseniev, P. Schultze, E.Worgotter, W.Braun, G. Wagner, M. VaShk, J.H.R.Kagi and K. Wiithrich, J. Mol. Biol., 1988, 201, 637. 85 B. A. Messerle, A. Schaffer, M. VaShk, J. H. R. Kagi and K. Wuthrich, J. Mol. Biol., 1990, 214, 765. 86 J. D. Otvos and I. M. Armitage, Roc. Natl. Acad. Sci. USA, 1980,77, 7094. 87 M. VaS6k and R. Bauer, J. Am. Chem. SOC., 1982, 104, 3236. 88 J. H. R. Kagi, S. R. Himmelhoch, P. D. Whanger, J. L. Bethune and B. L. Vallee, J. Biol. Chem., 1974, 249, 3537. 89 A. Avdeet, A. J. Zelazowski and J. S. Gamey, Znorg. Chem., 1985,24, 1928. 90 M. VaSik, G. E. Hawkes, J. K. Nicholson and P. J. Sadler, Biochemistry, 1985, 24, 740. 91 J. D. Otvos, H. R. Engeseth, D. G. Nettesheim and C. R. Hilt, Experientia Suppl., 1987, 52, 171. 92 J. Otvos, S-M. Chen and X. Liu, in Metal Ion Homeostasis: Molecular Biology and Chemistry, Alan R. Liss, Inc., New York, 1989, pp. 197-206. 93 T. Pan and J. E. Coleman, Proc. Natl. Acad. Sci. USA, 1989, 86, 3145. 94 J. F. Povey, G. P. Diakun, C. D. Gamer, S. P. Wilson and E. D. Laue, FEBS Lett., 1990, 266, 142. 95 Y.-D. C. Halvorsen, K. Nandabalan and R. C. Dickson, J. Biol Chem., 1990,265, 13283. 96 P. L. Gadhavi, A. R. C. Raine, P. R. Alefounder and E. D. Laue, FEBS Lett., 1990,276,49. 97 P. L. Gadhavi, A. L. Davis, J. F. Povey, J. Keeler and E. D. Laue, FEBS Lett., 1991, 281, 223. Paper 2/00024E; Received 20th December, 1991
ISSN:1359-6640
DOI:10.1039/FD9929300047
Publisher: RSC
Year: 1992
Data source: RSC
|
6. |
Structure and mechanism of D-xylose isomerase |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 67-73
David M. Blow,
Preview
|
PDF (593KB)
|
|
Abstract:
Faraday Discuss., 1992, 93, 67-73

Structure and Mechanism of D-Xylose Isomerase

David M. Blow, Charles A. Collyer,† Jonathan D. Goldberg and Oliver S. Smart‡

Blackett Laboratory, Imperial College of Science Technology and Medicine, London SW7 2BZ, UK

The action of xylose isomerase depends on the presence of two divalent cations. Crystal structure analyses of the free enzyme, and of the enzyme bound to a variety of substrates and inhibitors, have provided models for a number of distinct intermediates along the reaction pathway. These models, in turn, have suggested detailed mechanisms for the various chemical steps of the reaction: a ring opening catalysed by an activated histidine, a hydride-shift isomerization, and a ring closure which may be facilitated by a polarised water molecule.

Xylose isomerase catalyses the isomerisation of the aldose, xylose, to the ketose, xylulose. Many xylose isomerases, especially from Actinomycetes, are also efficient catalysts for isomerisation of the hexoses, glucose and fructose, and are very stable enzymes even at 50 °C. This gives them industrial importance in the processing of corn syrup for use in soft drinks. They also have potential application in production of ethanol from waste cellulose sources. Crystallographic analysis of this enzyme, complexed with a variety of substrates and substrate analogues, has provided a detailed hypothesis for the reaction mechanism of this cation-dependent enzyme. The molecular structure of xylose isomerase is based on the eight-stranded α/β barrel structure,1,2 first observed in triose-phosphate isomerase,3 and subsequently found in many other enzymes. In each case the substrate binds in the centre of the α/β barrel at its carboxy-terminal end, but there is little further similarity between the two enzymes.
While triose phosphate isomerase is a simple dimer, xylose isomerase is a tetramer, in which pairs of barrels are roughly coaxial, with their two substrate-binding sites in proximity, and each peptide chain has a long carboxy-terminal extension which enfolds the other pair of barrels which make up the tetramer. Unlike triose phosphate isomerase, xylose isomerase is a metalloenzyme in which two cation-binding sites exist on each subunit.4 Site 1 is a tight-binding site while site 2 binds its cation more loosely and, as will be shown, more flexibly. Each cation is coordinated octahedrally, with ligands to four carboxylate groups and two substrate hydroxyls in site 1, and four carboxylates, a histidine and a water molecule in site 2. One carboxylate, Glu-216,§ coordinates both cation sites. The presence of two basic amino acids in the active site provides an equal number of positive and negative charges in the vicinity. These two cation sites lie to one side of a large cavity at the carboxy-terminal end of the eight β-strands of the α/β barrel. The two cavities of the pair of barrels which face each other are adjacent, and share a common access route to bulk solvent. The side of the cavity facing the cation sites is hydrophobic, composed of several aromatic amino acid side chains, including one (Phe-25) from an adjacent subunit, but near to this hydrophobic surface is an activated histidine (His-53) which is polarised by a buried aspartate (Asp-56).

† Present address: Farmitalia, Carlo Erba, via dei Gracchi, Milano, Italy.
‡ Present address: Dept of Crystallography, Birkbeck College, Malet St., London WC1E 7HX, UK.
§ Amino acid numbering is for xylose isomerase of Arthrobacter B3278, throughout.

Table 1 Pyranose substrates and substrate analogues used as ligands

  ligand                X     Y     Z
  xylose                OH    O     H
  glucose               OH    O     CH2OH
  5-thioglucose         OH    S     CH2OH
  1-deoxynojirimycin    H     NH    CH2OH

Fig. 1 Electron density for 5-thioglucose bound to the active site. All electron-density maps are displayed as Fderiv - Fcalc maps, where Fcalc is the refined structure in the absence of the ligand.

X-Ray analysis of crystalline complexes of the xylose isomerase of Arthrobacter B3278 with a variety of ligands (Table 1), and using various cations, has allowed us to create realistic models which represent a number of steps in the isomerisation of an aldose to a ketose.5 Binding 5-thioglucose to enzyme crystals provided the first model of a complex of the enzyme with a closed-ring aldose (Fig. 1). The electron density in the presence of 1-deoxynojirimycin indicated a similar ring orientation. In these structures cation 1 coordinates to O(3) and O(4) of the pyranose ring while cation 2 makes no direct interaction with the sugar. In the 5-thioglucose and 1-deoxynojirimycin complexes, His-53, which is polarized by the side chain of Asp-56, provides a correctly oriented nucleophile to catalyse ring opening by transfer of a proton from O(1) to O(5) (Fig. 2).

Fig. 2 Scheme for ring-opening catalysed by His-53

Fig. 3 Electron density for xylose bound at the active site

When xylose was bound to enzyme crystals, the electron density plainly showed the sugar in an open-chain conformation (Fig. 3). Similar electron-density distributions were observed on binding the five- and six-carbon polyols, xylitol and sorbitol, to the crystals. These electron-density distributions were interpreted as showing cation 1 bound to O(2) and O(4) of the open-chain sugar. The quality of crystals of the Arthrobacter enzyme limited the attainable resolution to 2.3 Å at best (often, in practice, to 2.5 Å), and at this resolution details of the conformation could not be assigned.
However, Whitlow et al.,6 in a similar experiment using crystals of the enzyme from Streptomyces rubiginosus, complexed with an equilibrium mixture of xylose-xylulose, were able to obtain diffraction data to 1.6 Å resolution, and noted a planar conformation at the site we assumed to be occupied by C(2), indicating the open-chain ligand to be predominantly in the ketose form. In the above experiments the cation at site 2 still had no direct ligands to the substrate. However, when xylose was incubated with enzyme crystals in the presence of Al3+, a somewhat different structure was observed (Fig. 4). Under the conditions of the experiment, there was evidence that the site 1 cation was Al3+, while the loosely bound cation at site 2 was Mg2+. Small structural changes of the enzyme around site 2 were evident, including a significant rearrangement of Asp-254 and smaller movements of Glu-216 and His-219, which accompanied a 1 Å movement of the second cation to site 2', where it coordinates O(1) and O(2) of the substrate. This Al3+ complex is considered to be a transition-state analogue in which the cation at 2' is equidistant from O(1) and O(2), and 2', O(1), C(1), C(2) and O(2) are all approximately coplanar. Cation 1 also coordinates O(2), while NH1(Lys-182) coordinates O(1), strongly polarising both C-O bonds towards C+-O-. One hydroxyl proton [say OH(2)] is assumed to be lost (possibly to a water molecule), creating conditions where the hydrogen H(2) can readily transfer as a hydride ion between the carbonium-like C(2) and C(1). This makes a plausible model for the isomerization by a hydride-shift mechanism (Fig. 5).

Fig. 4 Electron density for xylose incubated with the enzyme in the presence of Al3+ and Mg2+

Fig. 5 Scheme for isomerisation by hydride shift5

Fig. 6 Electron density for 2,5-dideoxy-2,5-imino-D-glucitol (DDIG) bound to the active site

Fig. 7 Schematic formulae for sugar analogues (α-D-xylulofuranose, DDIG, αFF and 3-deoxy-3-fluoromethylene-D-glucose). The dashed line indicates the pseudosymmetry of DDIM. The structure labelled '3-deoxy-3-fluoromethylene-D-glucose' is the form considered to be bound to the enzyme12

Fig. 8 Proposed scheme for the ring-opening, isomerization and ring-closure reactions

Lee et al.7 demonstrated that when deuteriated glucose ([2-2H]glucose) is used as a substrate, the rate of isomerisation is reduced by a factor of four. This shows that the rate-limiting step of the enzyme-catalysed reaction is the isomerisation, and that it requires a hydrogen transfer from C(2). This experiment was done using xylose isomerase from Clostridium thermosulfurogenes, but we have observed the same effect using enzyme from Arthrobacter.8 It has proved difficult to visualise directly the complex of the enzyme with a furanose such as xylulose or fructose in the closed-ring form. Crystallographic experiments using 5-thio-D-fructose (kindly synthesized by F. J. Montgomery and P. Grice) and 1,4-dideoxy-1,4-imino-D-arabinitol (a gift from Dr R. Nash9) did not show electron density for the ligand in a unique bound state.10 2,5-Dideoxy-2,5-imino-D-mannitol (DDIM) and 2,5-dideoxy-2,5-imino-D-glucitol (DDIG) were synthesized by N. G. Ramsden and G. W. J. Fleet, and DDIG (Ki = 50 mmol dm-3) was found to inhibit the enzyme much more strongly than DDIM. DDIG and DDIM are analogues of α-D- and β-D-fructofuranose (αFF and βFF), in which the ring oxygen is replaced by an imino group, and in which O(2) is absent.
The electron density of DDIG when bound to the enzyme is shown in Fig. 6. This structure cannot simply be interpreted as a model for the binding of αFF to the enzyme, since insertion of the extra oxygen at O(2) leads to a severe steric clash with the six-membered ring of Trp-15. It may be noted that the DDIM molecule has almost precise two-fold symmetry about an axis through the ring heteroatom and the mid-point of C(3)-C(4) (Fig. 7: the symmetry could be precise if the pucker obeyed it). In the case of DDIG, the corresponding symmetry is destroyed by the opposite chirality at C(2). Looked at in this way, βFF is rather like DDIM with an oxygen added at O(2) (and, of course, an oxygen in the ring). αFF bears the same relation to DDIG. Makkee et al.11 showed that αFF is the form of fructose produced by the enzyme. In nature, the product of the enzyme is D-xylulose, and following Makkee we might assume it is α-D-xylulofuranose. This molecule may be considered somewhat closer to two-fold symmetry than DDIG or αFF (Fig. 7). Carrell et al.12 analysed a structure formed when the inhibitor 3-deoxy-3-fluoromethylene-D-glucose (Fig. 7) was bound to xylose isomerase from Streptomyces rubiginosus. The ligand was observed covalently bound to the enzyme, in a β-furanose form. The conformation observed was similar to that created if the observed conformation of bound DDIG is rotated 180° about the pseudo-two-fold axis. We have hypothesised that αFF is bound in an analogous manner, namely similar to the observed conformation of bound DDIG, rotated 180° about this axis. In such a conformation, there are no steric clashes. This conformation brings a firmly bound water molecule (Wat-519), observed in other states of the enzyme, within hydrogen-bonding distance of O(2), where it may act as a base to facilitate ring closure/opening. Cation 2 coordinates Wat-519 while cation 1 coordinates both O(5) and O(2).
O(1) and O(2) occupy positions close to those observed in the various open-chain conformations. Fig. 8 presents a complete scheme for the ring-opening, isomerisation and ring-closure reaction. Ligands have been observed in all the proposed conformations except the αFF form, which is still hypothetical. Another possibility is that a product might leave the active site in the (open-chain) ketose form, so that no state corresponding to Fig. 8(f) exists. Molecular mechanics calculations have shown that the active site cavity is large enough to accommodate the required changes from the pyranose to the open-chain conformation, and that there is no major energy barrier for the conformational change. Preliminary energy calculations suggest a low energy for α-D-xylulofuranose in the proposed conformation of Fig. 8(f).

References
1 F. R. Salemme, Prog. Biophys. Mol. Biol., 1983, 42, 95.
2 A. M. Lesk, C. I. Branden and C. Chothia, Proteins, 1989, 5, 139.
3 D. W. Banner, A. C. Bloomer, G. A. Petsko, D. C. Phillips, C. I. Pogson and I. A. Wilson, Nature (London), 1975, 255, 609.
4 K. Henrick, C. A. Collyer and D. M. Blow, J. Mol. Biol., 1989, 208, 129.
5 C. A. Collyer, K. Henrick and D. M. Blow, J. Mol. Biol., 1990, 212, 211.
6 M. Whitlow, A. J. Howard, B. C. Finzel, T. L. Poulos, E. Winborne and G. L. Gilliland, Proteins, 1991, 9, 153.
7 C. Lee, M. Bagdasarian, M. Meng and J. G. Zeikus, J. Biol. Chem., 1990, 265, 19082.
8 O. S. Smart, J. Akins and D. M. Blow, Proteins, in the press.
9 R. J. Nash, E. A. Bell and J. M. Williams, Phytochem., 1985, 24, 1620.
10 C. A. Collyer, J. D. Goldberg, H. Viehmann, D. M. Blow, N. G. Ramsden, G. W. J. Fleet, F. J. Montgomery and P. Grice, Biochemistry, submitted.
11 M. Makkee, A. P. G. Kieboom and H. van Bekkum, Recl. Trav. Chim. Pays-Bas, 1984, 103, 361.
12 H. L. Carrell, J. P. Glusker, V. Burger, F. Manfre, D. Tritsch and J-F. Biellmann, Proc. Natl. Acad. Sci. USA, 1989, 86, 4440.

Paper 1/06423A; Received 18th December, 1991
ISSN:1359-6640
DOI:10.1039/FD9929300067
Publisher: RSC
Year: 1992
Data source: RSC
|
7. |
Three-dimensional structure of galactose oxidase: an enzyme with a built-in secondary cofactor |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 75-84
Nobutoshi Ito,
Preview
|
PDF (1133KB)
|
|
Abstract:
Faraday Discuss., 1992, 93, 75-84 Three-dimensional Structure of Galactose Oxidase: An Enzyme with a Built-in Secondary Cofactor Nobutoshi Ito, Simon E. V. Phillips, Conrad Stevens, Zumrut B. Ogel, Michael J. McPherson, Jeffery N. Keen, Kapil D. S. Yadav? and Peter F. Knowles Department of Biochemistry and Molecular Biology University ofLeeds, Leeds LS29JT, UK Galactose oxidase is a copper-containing enzyme, which catalyses stereos- pecific oxidation of primary alcohols. The three-dimensional structure of the enzyme has been determined in this study by X-ray crystallography at high resolution. The molecule is almost entirely composed of @-structures and consists of three domains. The arrangement of 28 @-strands in the second domain is of particular interest, having seven four-stranded antiparallel p -sheets with pseudo-sevenfold symmetry.The copper site has square-pyramidal coordination with two histidines, one tyrosine and one exogenous ligand at the equatorial sites and another tyrosine at the axial site. The most intriguing structural feature is a covalent bond between C"' of Tyr-272, which is one of the equatorial ligands, and Sy of Cys-228. This unexpected thioether bond, and Trp-290 stacked above it, strongly supports the presence of a tyrosine free radical in the enzyme as a 'built-in' secondary cofactor. Calculation of the molecular surface shows a small pocket at the copper site and suggests a substrate-binding model, which can explain the substrate specificity. A model for the catalytic mechanism, involving a tyrosine free radical and basic tryptophan, is also proposed.Introduction Galactose oxidase (GOase; EC 1.1.3.9) is an extracellular enzyme produced by the fungus, Dactylium dendroides. The enzyme catalyses the oxidation of primary alcohols (e.g. at the O(6) position of D-galactose) using molecular oxygen, to produce the corresponding aldehyde and hydrogen peroxide. 
Although the biological role of the enzyme is not known, its unique features have attracted much attention in terms of the catalytic mechanism and potential biotechnological applications.1 These features include: (i) structural simplicity; GOase consists of a single polypeptide chain of 68 000 u and contains one copper ion with no dissociable cofactor; (ii) wide but stereospecific substrate specificity; GOase catalyses the oxidation of primary alcohols ranging from small molecules to large polysaccharides, yet some, like D-glucose and L-galactose, are not substrates; (iii) remarkable stability; GOase is active in 6 mol dm-3 urea.
A fundamental question about the catalytic mechanism of GOase is how an enzyme with a single copper ion and no other dissociable cofactor identified can catalyse the two-electron oxidation of the substrate. Two models have been put forward: one involving a tyrosine free radical2 and the other proposing covalently bound pyrroloquinoline quinone (PQQ)3 as cofactor. Our high-resolution crystal structure of the enzyme4 has ruled out the presence of PQQ as the secondary cofactor and revealed an unexpected covalent bond between Tyr-272 and Cys-228, which gives strong support for the tyrosine free radical model.
† Current address: Department of Chemistry, University of Gorakhpur, India.
The idea of a protein free radical originated from an observation that GOase is isolated as a mixture of active and inactive forms.5,6 The active form, which is EPR-silent, is obtained by one-electron oxidation of the inactive form, which shows an EPR signal typical of Cu(II).
In early studies5 this was attributed to the oxidation of the copper to Cu(III), which has since been disproved.7,8 Instead, Whittaker and his colleagues have proposed a model where the inactive form is activated by oxidation of an amino acid side chain to a free radical, which is antiferromagnetically coupled to the copper so that neither the free radical nor the copper gives observable EPR signals.6 The presence of a tyrosine free radical is confirmed at least in the apoenzyme (i.e. the copper-depleted enzyme), in which the interaction between the copper and free radical is decoupled.9
Several enzymes have been reported to contain a free radical as a redox cofactor derived directly from their own polypeptide chains. Of these, the best studied examples are ribonucleotide reductase and cytochrome-c peroxidase. The former contains a tyrosine free radical at Tyr-122 which lies near the binuclear iron while, in the latter, a radical derived from Trp-191 has been identified.12-14 Although the presence of a protein-derived free radical in these enzymes has been clearly demonstrated, the role in the catalytic mechanism is not yet fully clear.
Here we present a model for the catalytic mechanism of GOase based on the crystal structure of GOase at high resolution. The model suggests an interesting multiple role for the tyrosine free radical.

Methods and Results
The crystals of GOase were grown from 0.8 mol dm-3 acetate buffer (pH 4.3-4.7) with ammonium sulfate as precipitant. The space group is C2 with unit-cell parameters a = 98.0 Å, b = 89.0 Å, c = 86.7 Å and β = 117.8°.
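The unit-cell parameters quoted above can be turned into a cell volume and a rough packing density. The following is a minimal sketch, not part of the original paper: it uses the standard monoclinic volume formula V = abc sin β and the Matthews coefficient V_M = V / (Z × M). The value Z = 4 follows from the four symmetry-equivalent positions of space group C2, but the assumption of one GOase molecule per asymmetric unit is ours, as are all function and variable names.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell in cubic angstroms (V = a*b*c*sin(beta))."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(cell_volume, molecules_per_cell, molecular_mass_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return cell_volume / (molecules_per_cell * molecular_mass_da)

# Cell parameters and molecular mass as quoted in the text.
volume = monoclinic_cell_volume(98.0, 89.0, 86.7, 117.8)

# Assumption (not stated in the text): one molecule per asymmetric unit,
# hence 4 molecules per C2 unit cell.
vm = matthews_coefficient(volume, molecules_per_cell=4, molecular_mass_da=68000.0)

print(f"Cell volume: {volume:.0f} cubic angstroms")
print(f"Matthews coefficient: {vm:.2f} cubic angstroms per dalton")
```

Under these assumptions the Matthews coefficient comes out at about 2.5 Å3 per dalton, which lies in the range commonly seen for protein crystals.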
The structure was solved by multiple isomorphous replacement with anomalous scattering, using three heavy-atom derivatives at 2.5 Å, then refined by Hendrickson-Konnert-type least squares to an R factor of 17.7% for all observed data between 10 and 1.7 Å. A native crystal was soaked in 0.1 mol dm-3 PIPES buffer (pH 7.0) with 25% poly(ethylene glycol) to remove acetate ions present in the native structure at pH 4.5 (see below) and the structure was solved and refined at 1.9 Å resolution.† Some of the crystallographic data are summarised in Table 1.
The primary sequence data of GOase were also determined at the same time, using both DNA and peptide sequence information.16 Partial peptide sequence data were used to design degenerate primers to amplify the GOase gene by the polymerase chain reaction. The amino acid sequence derived from the DNA sequence shows that the mature GOase molecule consists of 639 residues.
The crystal structure of GOase revealed a unique polypeptide fold (Fig. 1). This relatively large protein with 639 residues consists almost entirely of β-structure. The structure of GOase is divided into three domains: D1 (residues 1-155), D2 (residues 156-532) and D3 (residues 535-639). The N-terminal domain D1 has a typical β-sandwich structure, where a five-stranded antiparallel β-sheet faces another three-stranded antiparallel β-sheet. D2 is the largest, with 28 β-strands forming seven four-stranded antiparallel β-sheets, arranged like the petals of a flower around a pseudo-sevenfold axis (Fig. 2). Such a 'β-flower' has also been reported in neuraminidase17 and methylamine dehydrogenase,18 although the number of β-sheets in the former case is six. D3 lies on the other side of D2 from the copper, and has a rather complicated structure, probably best described as a distorted seven-stranded β-barrel with a large hydrophobic core in the middle.
The only covalent linkages between the domains are via an extended strand (D1-D2) and a flexible loop (D2-D3); no disulfide bridges are observed between them. However, D1 and D3 appear to have strong non-bonded interactions with D2. A number of hydrogen bonds (direct or water-mediated), salt bridges and hydrophobic interactions probably add to the remarkable stability of the enzyme. β-sheets from D1 and D2 are joined by hydrogen bonds in the manner of a parallel β-sheet to construct a large nine-stranded mixed β-sheet across the domains. The interactions between D2 and D3 are very complicated and interesting.
† The details of the structure determination as well as a fuller description of the structure obtained will be published elsewhere.15

Table 1 Crystallographic data for the two forms of galactose oxidase
                                              native (pH 4.5)   native (pH 7.0)
data collection
  resolution/Å                                10-1.7            10-1.9
  Nobs                                        117980            83387
  Nunique                                     57092             46302
  completeness (%)                            79.1              89.5
  Rsym (%)                                    3.9               5.0
refinement
  R for all observed data (%)                 17.7              17.0
  RMSD for bond lengths from ideal values/Å   0.018             0.019
model includes
  peptide residues                            639               639
  Cu(II) ion                                  1                 1
  sodium ion                                  1                 1
  acetate ions                                2                 0
  water molecules                             316               310

Fig. 1 Stereoview of the Cα backbone of galactose oxidase. D1 is shown in the lower part of the molecule, and the large globular D2 is in the centre foreground, with D3 behind. The small sphere is the copper ion.
Fig. 2 Stereoview of the Cα backbone of D2. The 28 β-strands are emphasised by thick lines.
Fig. 3 Schematic drawing of the copper coordination. At pH 4.5, Tyr-272, His-496, His-581 and an acetate ion form an almost perfect square, which is distorted at pH 7.0, where the acetate ion is replaced by a water molecule at 2.8 Å from the copper.
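The data-collection rows of Table 1 allow a quick estimate of how many times, on average, each unique reflection was measured. The sketch below is illustrative only and assumes that Nobs counts all measured reflections and Nunique the merged unique set; the dictionary layout and function name are ours.

```python
# Reflection counts transcribed from Table 1.
datasets = {
    "native (pH 4.5)": {"n_obs": 117980, "n_unique": 57092},
    "native (pH 7.0)": {"n_obs": 83387, "n_unique": 46302},
}

def multiplicity(n_obs, n_unique):
    """Average number of observations per unique reflection."""
    return n_obs / n_unique

multiplicities = {
    name: multiplicity(counts["n_obs"], counts["n_unique"])
    for name, counts in datasets.items()
}

for name, value in multiplicities.items():
    print(f"{name}: multiplicity = {value:.2f}")
```

On this reading both data sets were measured roughly twice over.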
D2 has a large hole through the middle along its pseudo-sevenfold axis, and two β-strands of D3 pierce the hole from one side to the other. Because His-581, which is located at the tip of this intruding 'finger', is one of the copper ligands (see below), the dissociation of D2 and D3 requires rupture of this coordination. This is consistent with the report that copper-depleted GOase is less stable than the holoenzyme.19 There are ca. 40 water molecules found in the gap between the 'finger' and D2.
The copper site resides on the surface of D2, just on the pseudo-sevenfold axis, and is surrounded by a number of aromatic side chains. The copper coordination is square pyramidal with Tyr-272 (coordination distance 1.94 Å), His-496 (2.11 Å), His-581 (2.15 Å) and an acetate ion (2.27 Å) at the equatorial sites and Tyr-495 (2.69 Å) at the axial site (Fig. 3). For the structure in PIPES buffer (pH 7.0), where acetate is absent, the acetate ion is replaced by water or hydroxide ion at a distance of 2.81 Å, suggesting very weak coordination. Anionic inhibitors, such as N3-, F- and CN-, would easily replace it to occupy this site. Alcohol substrate and/or molecular oxygen may also bind the copper at this site.
Spectroscopically, the copper site of GOase is classified as type-2, which is characterised by spectroscopic properties similar to simple copper salts (e.g. CuSO4).20 The presence of an external ligand (e.g. water), in addition to protein ligands, seems to be a common feature of type-2 copper sites in the enzymes whose structures have been reported, such as Cu,Zn superoxide dismutase21 and nitrite reductase.22 The copper site of GOase is, however, unique in having tyrosine residues directly coordinated to the copper.
To our great surprise, one of the copper ligands, Tyr-272, is covalently bound to the sulfur of Cys-228 at Cε (Fig. 4). The quality of the 1.7 Å refined map is such that the presence of this unusual bond is irrefutable.
The presence of the thioether bond is consistent with other observations from biochemical and spectroscopic data. Although the structure refinement was carried out without any stereochemical restraint on this bond, the sulfur of Cys-228 refined to an ideal position for ortho-substitution of the phenol ring. The bond has a cis conformation (the torsion angle Cδ1-Cε1-Sγ-Cβ is 7°), which suggests that it may have partially double-bond character.
The indole ring of Trp-290 lies above the Tyr-272-Cys-228 bridge, as if to protect the thioether from the solvent (Fig. 4). The six-membered ring of the indole lies exactly above the sulfur atom of Cys-228. The close distance of the two groups suggests a strong π-π interaction between them, which is probably involved in the catalytic mechanism.
Fig. 4 Stereoview of the unusual thioether bond between Tyr-272 and Cys-228, with Trp-290 stacked above it. The copper ion is also shown as a large sphere.

Discussion
Role of the Tyr-Cys-Trp Complex and the Free Radical
The unprecedented linkage between Tyr-272 and Cys-228 is almost certainly involved in the catalytic mechanism. An extended aromatic system could help the delocalisation of the free radical (spin density), which would stabilise the free radical. One of the characteristic features of GOase among other free radical proteins is the unusually low redox potential of the radical. The redox potential of the free radical in GOase determined by redox titration is 0.41 V (at pH 7.5),5 compared with 0.94 V reported for a free tyrosine.23 The thioether linkage very likely contributes to this low potential. The requirement of the copper(II) ion for radical generation is not yet clear.
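To give a feeling for the size of the potential difference quoted above, the two redox potentials can be converted into a free-energy difference with ΔG = nFΔE. This is a rough sketch rather than a result from the paper: it assumes a one-electron couple and treats the two potentials as directly comparable, although the text gives the pH only for the enzyme value. The function name is hypothetical.

```python
FARADAY_C_PER_MOL = 96485.332  # Faraday constant in coulombs per mole

def redox_stabilisation_kj_per_mol(e_reference_v, e_enzyme_v, n_electrons=1):
    """Free-energy equivalent (kJ/mol) of lowering a redox potential.

    Uses delta_G = n * F * delta_E, with delta_E in volts.
    """
    delta_e = e_reference_v - e_enzyme_v
    return n_electrons * FARADAY_C_PER_MOL * delta_e / 1000.0

# Potentials as quoted in the text: free tyrosine 0.94 V, GOase radical 0.41 V.
stabilisation = redox_stabilisation_kj_per_mol(0.94, 0.41)
print(f"Approximate stabilisation: {stabilisation:.1f} kJ/mol")
```

On these assumptions the 0.53 V difference corresponds to roughly 51 kJ/mol, which indicates how strongly the protein environment would have to stabilise the radical relative to free tyrosine.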
The apoenzyme treated with K3Fe(CN)6 has been reported to show an EPR signal from a free radical,9 which presents a contrast to ribonucleotide reductase, where the binuclear iron centre is essential for radical formation. The GOase used in that study was, however, prepared as the holoenzyme then converted to the apoenzyme by removing the copper and would presumably have the Tyr-Cys bridge already formed. One possibility is that the enzyme synthesized in the fungal cell requires the copper ion for the generation of the free radical during the first turnover. This radical would form the Tyr-Cys bridge, reducing the redox potential and making it possible to create the free radical without the copper centre thereafter.
The stacking indole of Trp-290 seems both to protect the free radical from the solvent, where unwanted reductants might inactivate the enzyme by reducing the free radical, and to stabilise the free radical by helping delocalisation. A further, potentially more important role, however, might be played by this residue, as described later.

Binding Model for Alcohol Substrate
When a Connolly surface24 was calculated for the crystal structure, a small depression was found on the surface at the copper site (Fig. 5). The pocket, which is occupied by an acetate ion and a water molecule in the native structure, shows structural complementarity to D-galactose, and a model of this substrate was readily built into the active site using computer graphics. Although this substrate-binding model has not yet been confirmed by experiment, it has several attractive features. (i) The structural complementarity is almost perfect. The D-galactose molecule in the model adopts a plausible 'chair' conformation with the most favourable torsion angle around the C(5)-C(6) bond.25 (ii) Interactions between the substrate and enzyme are reasonable.
O(6) of D-galactose, the site of oxidation, is directly coordinated to the copper, replacing a weakly coordinated water. Hydroxyl groups at C(4) and C(3) make good hydrogen bonds with Arg-330, while O(2) may form an additional hydrogen bond with Gln-406. On the other hand, the hydrophobic backbone [C(6), C(5) and C(4)] interacts with the hydrophobic wall of the pocket consisting of Phe-194 and Phe-464. (iii) The model could explain the substrate specificity of the enzyme. The pocket could accommodate D-galactose, but not D-glucose or L-galactose, neither of which is a substrate. Furthermore, since C(1) and O(1) are exposed to the solvent, polysaccharides with a D-galactose residue at their non-reducing termini would also be substrates, in agreement with the experimental data.
Fig. 5 Orthogonal views of the active site pocket in GOase. In each figure, the left half shows a view from the solvent, looking into the pocket, while the right half is a view along the protein surface with the solvent on the left. (a) The native structure at pH 4.5. An acetate ion, which is a copper ligand, and a water molecule lie in the pocket. (b) The binding model for D-galactose. O(6) is directly coordinated to the copper, replacing the acetate ion, while O(1), where it may be connected in polysaccharides as the non-reducing terminus, is exposed to the solvent. (c) Same model as (b) for D-glucose. The only change, the position of O(4), would cause a serious steric hindrance with the protein.

Trp-290 as a General Base
The alcohol-substrate binding model mentioned in the previous section has another interesting feature: the pro-S hydrogen on the C(6) methylene of D-galactose points towards the indole ring nitrogen of Trp-290 (Fig. 6). From stereochemical studies, the pro-S hydrogen has been shown to be removed during the first step of the enzyme-catalysed oxidation.25 If this hydrogen is removed independently as a hydrogen ion (H+) and an electron (e-), it is tempting to speculate that Trp-290 acts as a general base during the turnover.
Fig. 6 Modelled D-galactose with the proposed free radical site. The two hydrogen atoms of the C(6) methylene group are shown, labelled as H(s) and H(r). A broken line indicates the proximity of one of them (pro-S hydrogen) to the indole nitrogen of Trp-290.
The indole nitrogen of tryptophan does not usually function as a base. However, the presence of a tyrosine free radical in the vicinity might change the electronic structure of the indole such that the ring nitrogen could act as a base at physiological pH. In fact, it has been reported that the group responsible for the oxidation of the inactive GOase to the active form has a pKa of 7.25. In view of the discovery of a tyrosine free radical in GOase, this ionisable group is likely to be a component of the radical site complex, Tyr-272-Cys-228-Trp-290. The phenol oxygen of Tyr-272 is expected to be bound as tyrosinate to the copper ion, and thus to have a low pKa value, but the indole nitrogen of Trp-290 would be an attractive candidate for this pKa value of 7.25. Furthermore, the reduction of the group (loss of the free radical) due to electron transfer from alcohol substrates to the radical site would cause a drastic change of the pKa of this nitrogen, which would make Trp-290 a very efficient proton acceptor.
The idea that Trp-290 may act as a general base has some support from biochemical experiments involving inactivation studies of GOase by iodoacetamide and N-bromosuccinimide (NBS). The histidine-targeting reagent, iodoacetamide, has been reported by Kosman and co-workers to inactivate GOase irreversibly, and this has been interpreted as alkylation of a histidine residue at the active site, other than those histidines directly coordinated to the copper. At the same time, these authors made several remarks worth noting: (1) The alkylation occurs only to the holoenzyme, not to the apoenzyme.
In other words, it specifically requires the presence of the copper ion. (2) GOase is not reactive toward other imidazole-reactive reagents (e.g. diethylpyrocarbonate and I2) but only to iodoacetamide and dibromoacetone. (3) Prior oxidation of a tryptophan by NBS precludes the reaction of iodoacetamide with the enzyme. (4) The alkylation of the enzyme is also inhibited by acetate with a Ki of 0.2 mmol dm-3.
These points could be explained by assuming that it was Trp-290 which was alkylated (or at least modified) by iodoacetamide, rather than an active-site histidine as postulated above. Taking the above points in turn: (1) The fact that apoenzyme does not react with iodoacetamide could mean the reaction requires the tyrosine free radical to be generated close to Trp-290 in order to make the indole ring reactive. (2) The two imidazole-reactive reagents that inactivate GOase, iodoacetamide and dibromoacetone, have very similar structures to those of substrates (e.g. dihydroxyacetone). It is likely, therefore, that these reagents bind to the enzyme in the vicinity of Trp-290, in a similar way to binding of acetate ion and our modelled substrate. (3) Prior oxidation of Trp-290 by NBS would certainly change the nature of the indole, and it is very likely that the modified tryptophan would not react with iodoacetamide. (4) Acetate ion, as seen in the structure at pH 4.5, binds to copper and, therefore, would inhibit the alkylation by substrate-like reagents.
Fig. 7 Model for catalytic mechanism of GOase. The copper ion is shown with the equatorial sites and the proposed radical complex (Tyr-272, Cys-228 and Trp-290). Y- is a tyrosinate coordinated to the copper, and Y• is a tyrosine free radical. W- represents the deprotonated Trp-290.
The authors of the original alkylation studies26 provided some direct evidence for the alkylation of histidine residues. They reported that one histidine was lost on alkylation as indicated by amino acid analysis; this evidence is not conclusive, however, since the amino acid composition of GOase indicated by our gene-sequencing analysis differs from that reported by amino acid analysis.
It should be mentioned that the hypothesis described here is partly based on the assumption that the proton and electron are transferred independently.27 If the proton and electron are transferred simultaneously as a hydrogen atom (H•), Trp-290 should be described as a free radical rather than a general base. A tryptophan radical in GOase, however, has not been reported so far, and indeed a free radical on a residue exposed to solvent would be very unstable.

A Model for Catalytic Mechanism
Fig. 7 shows a model for the catalytic mechanism of GOase consistent with the structural information obtained in this work as well as that from spectroscopic data. This model can be seen as an extended version of the tyrosine free radical model. In our model, the inactive species of GOase contains Cu(II) and tyrosine without the free radical at the active site. One-electron oxidation activates the enzyme by creating a free radical at Tyr-272. The radical would be stabilised by its delocalisation over the radical complex, Tyr-272-Cys-228-Trp-290. The extent of delocalisation of the free radical is not clear, but its formation would trigger deprotonation of the indole nitrogen of Trp-290.
D-Galactose would bind directly to the copper at an equatorial site through its O(6) position. This binding would place the pro-S methylene hydrogen at C(6) close to the indole nitrogen of Trp-290. The abstraction of the pro-S proton would initiate the reaction, and the substrate would be oxidised, reducing both the copper and the free radical.
Reduction of the copper to Cu(I) would probably be accompanied by a geometrical change of the copper ligands, possibly involving a decrease in coordination number,28 and reduction of the free radical would force Trp-290 to retain the proton acquired from the substrate. The aldehyde product would leave the copper site at this point in the catalytic cycle, although it may remain bound to the enzyme at some other site, if the ordered mechanism, in which molecular oxygen binds before the aldehyde leaves the enzyme, is valid as suggested by steady-state kinetics.29
Molecular oxygen, or another oxidising agent, would then bind to the copper, probably at the site previously occupied by the alcohol substrate. This would lead to oxidation of both the copper and the free radical site. In the case of molecular oxygen, the two electrons would be transferred to the oxygen from the copper and the radical site. The creation of the free radical would allow proton transfer from Trp-290 to the oxygen to form H2O2, which could then be released to complete the turnover.
Although the model proposed here is far from definitive and lacks direct evidence, it seems reasonable based on our present knowledge of GOase, and should stimulate further experiments designed to test it.

We thank Veronica Blakeley for her skilful help in the protein production and purification. Dr. Ellie Adman is gratefully acknowledged for useful discussions. The crystallographic and computational resources are provided by SERC through the Molecular Recognition Centre at Leeds. S.E.V.P. holds an SERC Senior Fellowship.

References
1 D. J. Kosman, in Copper Proteins and Copper Enzymes, ed. R. Lontie, CRC Press, Boca Raton, 1984, vol. 1, pp. 1-26.
2 M. M. Whittaker, V. L. DeVito, S. A. Asher and J. W. Whittaker, J. Biol. Chem., 1989, 264, 7104.
3 R. A. van der Meer, J. A. Jongejan and J. A. Duine, J. Biol. Chem., 1989, 264, 7792.
4 N. Ito, S. E. V. Phillips, C. Stevens, Z. B. Ogel, M. J. McPherson, J. N. Keen, K. D. S. Yadav and P. F. Knowles, Nature (London), 1991, 350, 87.
5 G. A. Hamilton, P. K. Adolf, J. de Jersey, G. C. DuBois, G. R. Dyrkacz and R. D. Libby, J. Am. Chem. Soc., 1978, 100, 1899.
6 M. M. Whittaker and J. W. Whittaker, J. Biol. Chem., 1988, 263, 6074.
7 D. M. Dooley, Life Chem. Rep., 1984, 5, 91.
8 K. Clark, J. E. Penner-Hahn, M. M. Whittaker and J. W. Whittaker, J. Am. Chem. Soc., 1990, 112, 6433.
9 M. M. Whittaker and J. W. Whittaker, J. Biol. Chem., 1990, 265, 9610.
10 A. Larsson and B.-M. Sjöberg, EMBO J., 1986, 5, 2037.
11 P. Nordlund, B.-M. Sjöberg and H. Eklund, Nature (London), 1990, 345, 593.
12 J. E. Erman, L. B. Vitello, J. M. Mauro and J. Kraut, Biochemistry, 1989, 28, 7992.
13 M. Sivaraja, D. B. Goodin, M. Smith and B. M. Hoffman, Science, 1989, 245, 738.
14 B. C. Finzel, T. L. Poulos and J. Kraut, J. Biol. Chem., 1984, 259, 13027.
15 N. Ito, S. E. V. Phillips and P. F. Knowles, manuscript in preparation.
16 M. J. McPherson, Z. B. Ogel, C. Stevens, K. D. S. Yadav, J. N. Keen and P. F. Knowles, J. Biol. Chem., 1992, in the press.
17 J. N. Varghese, W. G. Laver and P. M. Colman, Nature (London), 1983, 303, 35.
18 F. M. D. Vellieux, F. Huitema, H. Groendijk, K. H. Kalk, J. Jnz. Frank, J. A. Jongejan, J. A. Duine, K. Petratos, J. Drenth and W. G. J. Hol, EMBO J., 1989, 8, 2171.
19 D. J. Kosman, M. J. Ettinger, R. E. Weiner and E. J. Massaro, Arch. Biochem. Biophys., 1974, 165, 456.
20 B. G. Malmström, Annu. Rev. Biochem., 1982, 51, 21.
21 J. A. Tainer, E. D. Getzoff, K. M. Beem, J. S. Richardson and D. C. Richardson, J. Mol. Biol., 1982, 160, 181.
22 J. W. Godden, S. Turley, D. C. Teller, E. T. Adman, M. Y. Liu, W. J. Payne and J. LeGall, Science, 1991, 253, 438.
23 M. R. DeFelippis, C. P. Murthy, M. Faraggi and M. H. Klapper, Biochemistry, 1989, 29, 4847.
24 M. J. Connolly, J. Appl. Crystallogr., 1983, 16, 23.
25 A. Maradufu, G. M. Cree and A. S. Perlin, Can. J. Chem., 1971, 49, 3429.
26 L. D. Kwiatkowski, L. Siconolfi, R. E. Weiner, R. S. Giordano, R. D. Bereman, M. J. Ettinger and D. J. Kosman, Arch. Biochem. Biophys., 1977, 182, 712.
27 G. A. Hamilton, Prog. Bioorg. Chem., 1971, 1, 83.
28 N. J. Blackburn, S. S. Hasnain, N. Binsted, G. P. Diakun, C. D. Garner and P. F. Knowles, Biochem. J., 1984, 219, 985.
29 L. D. Kwiatkowski, M. Adelman, R. Pennelly and D. J. Kosman, J. Inorg. Biochem., 1981, 14, 209.
Paper 1/06301D; Received 13th December, 1991
ISSN:1359-6640
DOI:10.1039/FD9929300075
Publisher: RSC
Year: 1992
Data source: RSC
|
8. |
Induced-fit movements in adenylate kinases |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 85-93
Georg E. Schulz,
Preview
|
PDF (983KB)
|
|
Abstract:
Faraday Discuss., 1992, 93, 85-93
Induced-fit Movements in Adenylate Kinases
Georg E. Schulz
Institut für Organische Chemie und Biochemie der Universität, Albertstrasse 21, 7800 Freiburg im Breisgau, Germany

Adenylate kinases have an Mr around 23 000, which classifies them among the smallest phosphoryl group transferring enzymes. In order to prevent phosphoryl transfer to water, i.e. hydrolysis, these enzymes undergo induced-fit motions on substrate binding and assemble/disassemble their catalytic centres during each reaction cycle. Details of these processes have been derived from several X-ray structure analyses. The disturbance of these analyses by crystal-packing effects is discussed.

Kinases are enzymes that transfer phosphoryl groups, in most cases to hydroxy groups. Since the hydroxy groups of the surrounding water compete strongly for the phosphoryl groups, a kinase has to shield off its catalytic centre.1 Water exclusion could be accomplished by placing the transfer path into the molecular centre, as for example observed in glutathione reductase for an electron transfer,2 but whereas electrons can penetrate a well packed core, phosphoryl groups cannot. They would need a wide channel, as for instance found with the medium-sized porins, which are located in a membrane.3 In an aqueous environment, however, a channel structure would require quite a mass of polypeptide. Actually, kinases choose to exclude water through large polypeptide rearrangements.4 Most kinases contain more than 350 residues, providing enough shielding material for this purpose.
Among this enzyme group the nucleoside monophosphate (NMP-) kinases are exceptionally small and they catalyse the transfer to an anhydride, which is thermodynamically less favourable than the transfer to a hydroxy group. As a consequence, particularly efficient shielding is required, and large relative mass displacements are expected.
Nucleoside Monophosphate Kinases
The nucleoside monophosphate (NMP-) kinases catalyse the reaction

Mg2+·N1TP + N2MP ⇌ Mg2+·N1DP + N2DP

where N1 and N2 represent nucleosides. The enzymes accept normal and deoxynucleotides. Their main role is the metabolically efficient recovery of the nucleoside monophosphates which are produced in protein synthesis, for example. The enzymes are also required in all compartments with a high ATP/ADP or GTP/GDP turnover like mitochondria, chloroplasts and the cytosol of muscle cells, where they maintain the thermodynamic equilibrium between the tri- and di-phosphates. The catalysis requires a divalent cation; Mg2+ works best. This cation binds between the β- and γ-phosphoryl groups of N1TP, forming only minor contacts to the protein.
The covalent and spatial structures of a number of NMP-kinases are now known; the best analysed is the most abundant group of adenylate (AMP-) kinases within the family.5 Appreciable data are available for the guanylate kinase group6-8 and some data are available for the uridylate kinases,9 which also function as cytidylate kinases. In general, N1TP stands for ATP. Only the mitochondrial matrix adenylate kinase uses GTP more efficiently than ATP, the ratio being ca. 10:1.10 The adenylate and guanylate kinases are very specific for (d)AMP and (d)GMP, respectively, and less specific with respect to the base of the triphosphates. In uridylate kinase the monophosphate specificity is less pronounced, as (d)CMP is also accepted. The turnover numbers of these enzymes are all in the range of 500 s-1, i.e. they are not slow.
Fig. 1 Stereoview of the Cα backbone of porcine muscle cytosol adenylate kinase (AK1) without substrates as an example for an NMP-kinase. Several residue positions are given. Reproduced with permission from D. Dreusicke and G. E. Schulz, J. Mol. Biol., 1988, 199, 361.
The Mr values of the NMP-kinases range from 20 500 (guanylate kinase from yeast) to 26 100 (bovine mitochondrial adenylate kinase, AK3). The family can be subdivided into small and large variants. The chainfold of a small variant is given in Fig. 1. The large variants have an additional domain (INSERT) of 38 residues inserted in the middle of the chain, which is involved in the induced-fit movements. The small variants have a chain segment of 11 residues instead. Moreover, the large variants have disordered chain segments at their N- and C-terminal ends,11 which are most likely not involved in catalysis but in protein targeting. Neglecting these ends, the small and large variants form Mr groups with Mr = 21 000 ± 1000 and 24 000 ± 1000, respectively.
The relationships between all NMP-kinases can be ascertained by amino acid sequence comparisons. For detailed alignments, however, the spatial structures have to be consulted in a number of cases. The small size of these kinases brings them into an Mr range which permits complete structure analyses by NMR in solution. Accordingly, this enzyme family may become a model case for studying complicated actions in catalysis.

Structure Analyses of NMP-Kinases
Porcine muscle cytosolic adenylate kinase (AK1) was the first family member the structure of which had been solved by amino acid sequence analysis and X-ray diffraction.12,13 It crystallized in two interconvertible crystal forms. One crystal form exists at pH 5.5, where the enzyme is inactive; its structure has been solved with very limited accuracy at 3.3 Å resolution.14 The other crystal form exists at pH 7.7, where the enzyme is active, and its structure has been elucidated in great detail at 2.1 Å resolution (Fig. 1).15
Fig. 2 Sketches highlighting the structure classification of AMP-kinases as well as the motions (arrows) on substrate binding. The INSERT (· - ·) and AMP-binding (---) domains are marked, the Gly-loop is indicated by a curved line. (a) Cytosolic enzyme AK1 without substrate but with a sulfate ion bound to the Gly-loop.15 (b) Mitochondrial matrix enzyme AK3 co-crystallized with AMP.11 A sulfate ion is bound to the Gly-loop. The locations of INSERT in the two crystallographically independent molecules differ slightly. (c) Yeast20 and E. coli21,22 enzymes co-crystallized with both substrates, ATP and AMP, mimicked by Ap5A·Mg2+ (Mg2+ as identified by exchange with Mn2+). The additional fifth phosphate of Ap5A, which is disordered in the E. coli enzyme,22 is marked black.
In order to find the substrate positions, crystal-soaking experiments with substrates, inhibitors and analogues were performed under extreme conditions in both crystal forms: applying high concentrations and rapidly collecting low-resolution data before the crystals deteriorated.16 These experiments yielded the general locations of phosphates and of AMP (at that time thought to be the ATP site), but also suggested a second nucleotide site that later turned out to be spurious. The spurious low-resolution density had been caused by bad phases and a crystal-form transition that went undetected during data collection.17 Unfortunately, the spurious X-ray site received support from NMR experiments.18
In retrospect, the interconvertible crystal forms of AK1 already indicated the conformational flexibility of the enzyme and, in particular, they showed that crystal packing may exert some influence on the enzyme conformation because the crucial Gly-loop (see below) moved away from its place in the molecule in order to form a crystal contact. It seems therefore most important to consider and report all packing contacts in crystal structure analyses.
Moreover, the approach of finding substrate positions by soaking more or less failed for these crystals, because the region of the active centre was occluded by neighbouring molecules, and because ligand binding presumably caused conformational changes, leading to crystal breakage.

As a side-result of the analyses, the observed secondary structure of AK1 was checked against secondary structure predictions that are based solely on amino acid sequences. This test case came out very favourably for the prediction methods,19 and in fact much better than later predictions for other proteins. I therefore conclude that the secondary structures of the NMP-kinases are locally determined by the amino acid sequence, thus playing into the hands of the predictors. I also suggest that they are locally determined in order to establish local stability, which is the prerequisite for an enzyme that needs to be conformationally flexible for catalysis but which also has to assume defined structures.

Advances in the analyses of structures and substrate-binding sites of the NMP-kinases came with co-crystals containing substrates and substrate analogues. The second known medium-resolution structure was that of yeast adenylate kinase ligated with P1,P5-bis(adenosine-5'-)pentaphosphate (AKyst:Ap5A),20 an inhibitor that connects ATP and AMP by a fifth phosphate and thus mimics both substrates. A very similar structure was later established for the adenylate kinase from E. coli, AKeco:Ap5A.21,22 The covalently symmetric Ap5A (see Fig. 5, later) showed the binding sites of the substrates but did not allow distinction between the ATP and AMP sites. This was subsequently accomplished with the structure of yeast guanylate kinase ligated with GMP (GKyst:GMP)7 and with bovine mitochondrial matrix adenylate kinase ligated with AMP (AK3:AMP).
The assignment was supported by finding that Mn2+ was bound as expected between the β- and γ-phosphates of ATP and that the additional fifth phosphate of Ap5A in the high-resolution structure of the E. coli enzyme (AKeco:Ap5A) was disordered,22 i.e. it did not have a defined place. Further support came from a comparison with the structurally related G-proteins (see below) that were co-crystallized with GDP and GTP analogues, showing the position of N1TP.23-26

Most unexpected were the gross conformational differences observed between these crystal structures, which are sketched in Fig. 2.27 Obviously, the substrate-free enzyme has an open structure, whereas the enzyme is closed, compact and globular when both substrates are bound.

Domain Movements and Crystal Packing

As was experienced with the conformational changes observed in the crystal-form transition of the porcine cytosolic AMP-kinase (AK1), crystal packing may well influence the polypeptide conformation, in some instances just by selecting from an ensemble of conformers in solution. In any case the observed differences indicate where a molecule permits movements. Therefore, it is always advisable to know more than one crystal structure of an enzyme.

The observed structures can be classified using the three groups of Fig. 2. The substrate-free open structure is found not only with porcine cytosolic AMP-kinase (AK1), but also with the corresponding enzyme from carp at low resolution in a different packing scheme.28 The enzyme conformation with bound AMP is known from AK3:AMP co-crystals which contain two complexes per crystallographic asymmetric unit, i.e. in different packing environments.
There is also a general correspondence of the substrate position, in particular the α-phosphate of N2MP, with GKyst:GMP.7,8 A detailed comparison with GKyst is not possible because the chain fold of its GMP-binding domain (β-sheet) differs grossly from that of the α-helical AMP-binding domain.

The compact globular conformation with both substrates bound is known from the crystal structure of AKyst:Ap5A20 and the crystal structures of two AKeco:Ap5A complexes in different packing environments (two complexes per asymmetric unit).21,22 Several other structure analyses of AMP-kinases are under way.

For a detailed analysis of the movements, the Cα backbones of AK1, AK3:AMP and AKeco:Ap5A were superimposed, showing that the main bodies of the enzymes, consisting of the central parallel β-sheet and surrounding helices (residues 1-37, 68-130 and 142-194 of AK1, see Fig. 1), are very similar, whereas the remaining parts are different.27 These form the AMP-binding domain (residues 38-67 of AK1), which moves on AMP binding and continues moving on binding both substrates, and domain INSERT (residues 122-159 of AKeco; the small equivalent segment of AK1 consists of residues 131-141), which rotates on binding both substrates.

The observed motions are illustrated in Fig. 3. A closer inspection of the motions showed that INSERT is a very rigid entity which moves over distances of up to 32 Å (rotation 92°) without being deformed.27 It consists of an irregular β-sheet as depicted in Fig. 4. Its amino acid sequence is very well conserved among the large variants of the AMP-kinase group. No such domain has yet been detected in the GMP- and UMP-kinases. With 38 residues it ranges among the smallest known rigid polypeptide domains. The structure comparisons assigned the hinges of the movement to residues 121-122 and 159-160. The conformational changes at the hinges are not just rotations around two single bonds, but a combination of numerous rotations.
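As a rough plausibility check on the quoted rotation and displacement of INSERT, the displacement of an atom in a rigid body that rotates about a fixed axis is the chord length 2 r sin(θ/2), where r is the perpendicular distance of the atom from the axis. The Python sketch below applies this relation to a 92° rotation and a 32 Å maximum displacement. It assumes a pure rotation about one fixed axis, which is a simplification of the hinge motion described in the text, and the function names are illustrative only.

```python
import math


def chord_displacement(radius, angle_deg):
    """Displacement of a point at perpendicular distance `radius` from a
    fixed rotation axis after a rigid rotation by `angle_deg` degrees."""
    return 2.0 * radius * math.sin(math.radians(angle_deg) / 2.0)


def radius_for_displacement(displacement, angle_deg):
    """Inverse relation: distance from the axis that yields `displacement`."""
    return displacement / (2.0 * math.sin(math.radians(angle_deg) / 2.0))


# Values quoted in the text: rotation of 92 degrees, displacement up to 32 angstrom.
# Assumption: pure rotation about a single fixed axis (no translation along it).
r_max = radius_for_displacement(32.0, 92.0)
print(f"distance of the most displaced atom from the axis: {r_max:.1f} angstrom")
```

Under these assumptions the most displaced atom would lie roughly 22 Å from the rotation axis, a distance that a 38-residue domain swinging about its hinge could plausibly span.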
The observation of differing INSERT positions in AK3:AMP (relative rotation 9°) and in particular differing INSERT positions even in the compact AKeco:Ap5A (relative rotation 4°) demonstrates clearly the flexibility at the hinge.27 Note that the different INSERT positions are not different rotational states around one common axis.

Fig. 3 Full atom models of AMP-kinases aligned on their main bodies. Top: Comparison between the substrate-free cytosolic enzyme AK1 (left, small variant) and the mitochondrial matrix enzyme AK3 with bound AMP (right, large variant). The AMP-binding domain moves by up to 8 Å starting at the position depicted at the left-hand side. Bottom: Comparison between AK3:AMP (left, rotated by 90° around the vertical axis with respect to the model above) and the E. coli enzyme with both substrates bound (right; ATP and AMP are mimicked by the inhibitor Ap5A). The AMP-binding domain moves by up to another 8 Å, the INSERT domain moves by up to 32 Å. Reproduced with permission from G. E. Schulz, C. W. Müller and K. Diederichs, J. Mol. Biol., 1990, 213, 627.

Fig. 4 The Cα backbone structure of domain INSERT consisting of 38 residues (AK3 numbering). This domain behaves like a rigid body during the large induced-fit movements. Reproduced with permission from K. Diederichs and G. E. Schulz, J. Mol. Biol., 1991, 217, 541.

In contrast to the INSERT motions, those of the AMP-binding domain are not rigid-body movements, but rotations coupled with shearing. They are also smaller, giving rise to backbone displacements of up to 13 Å. No clear hinges can be assigned.27

Structural Changes during Catalysis

Substrate binding to the polypeptide as derived from AKeco:Ap5A is sketched in Fig. 5. The adenosine of AMP is well encapsulated by the AMP-binding domain, which explains the observed high specificity for N2MP.
In contrast, the base of N1TP is located at a small depression of the molecular surface, in agreement with the low specificity found for this nucleotide.

The phosphates are most tightly bound. They are held by five arginines, one lysine and all peptide nitrogens of residues 18-23 of AK1 (see Fig. 1; the corresponding residues of AKeco are 10-15), the Gly-loop forming a giant anion hole.29 There is no tight binding between the substrates and residues of the INSERT domain except for Arg-132 and Arg-138 (AK1 numbering throughout), which are close to the hinge of INSERT. These two arginines are fixed by salt bridges to Asp-141 and Asp-140 (also near the hinge). Among the other three arginines, Arg-97 and Arg-149 belong to the main body of the enzyme and Arg-44 to the AMP-binding domain. The Gly-loop, Lys-21 and all five arginines are conserved throughout the enzyme family.

During catalysis the γ-phosphoryl group of N1TP is transferred to the α-phosphate of N2MP. The structure of AKeco:Ap5A shows that these two groups are in an orientation and location permitting in-line transfer through a bipyramidal pentavalent phosphate intermediate.22 Presumably, Arg-132, Arg-138 and Arg-149 (AK1 numbering) stabilize this intermediate; Mg2+ polarizes the γ-phosphoryl group of N1TP, facilitating the nucleophilic attack on the phosphorus; Lys-21 moves together with the transferred phosphoryl group as a closely attached companion; and the Gly-loop stabilizes the developing negative charge on the β-phosphate of N1TP (Fig. 5).

The comparison between open and closed structures (Fig. 3) showed that only in the closed structure are the crucial Arg-132 and Arg-138 fixed by the aspartates. The natural conformational state of Arg-149 has not been clarified because it is obviously disturbed by the additional fifth phosphate of Ap5A. This suggests the following scenario for substrate binding. On binding the monophosphate, the N2MP-binding domain moves to some extent, but the α-phosphate of N2MP remains rather mobile. On additional binding of N1TP, Arg-132 and Arg-138 (located in the INSERT domain near the hinge) are attracted, causing the closing down of INSERT and thus deformations around Asp-140 and Asp-141 at the hinge that lead to the salt bridges (Fig. 5) that are required for catalysis. Accordingly, the enzyme assembles and disassembles its very catalytic centre, i.e. the arginines stabilizing the transition state, during each catalytic cycle of ca. 2 ms duration. Concomitantly, the enzyme runs through large domain motions within the cycle. The extent of the immense displacement of INSERT indicated in Fig. 3 and 4 may be a crystal-packing artefact, however, because only part of this motion is required for permitting substrate binding.

Fig. 5 Sketch of the polar main chain contacts (---) between the five phosphates of Ap5A (circles) and AMP-kinase from E. coli.22 The additional fifth phosphate is cross-hatched, assigning the substrates ATP (left) and AMP (right). The Gly-loop is depicted as a series of squares representing residues 18-23 (AK1 numbering, see Fig. 1; residues 10-15 in AKeco) running clockwise. The second and third glycine as well as the lysine of the sequence fingerprint are hatched; the lysine side chain is shown. Except for the ε-amino group, all interactions are with the peptide nitrogens of the Gly-loop. Arginines are denoted by rectangles with positive charges. Arg-132, Arg-138, Arg-149, Arg-44 and Arg-97 (numbers 123, 156, 167, 36 and 88 in AKeco) are depicted clockwise. Arg-132 and Arg-138 form salt bridges to Asp-141 (left) and Asp-140 (numbers 159 and 158 in AKeco), respectively, which are given by rectangles with negative charges. The residues are conserved in the AMP-kinases.

Catalytic Constants and Induced Fit of Adenylate Kinases

A number of AMP-kinase mutants, in particular from the E.
coli enzyme, have been produced and kinetically analysed. In general, the results support the established location of the active centre.30 The observed changes in the Michaelis constants Km(ATP) and Km(AMP) and in the maximum velocity Vmax classify the mutants into three groups: class I mutations increase both Km values, class II mutations decrease Vmax drastically and class III mutations increase Km(AMP). Classes II and III can be understood, because the mutations are located at the transferred phosphoryl group and at the AMP-binding site, respectively. The class I mutations are more complicated, as they are located at the ATP-binding site but affect the Km values of both substrates. Class I mutants can now be explained by assuming an equilibrium between two enzyme conformers, Einact, which does not bind substrates, and Eact, which does, where the mutations shift the distribution towards Einact. Structurally, Einact and Eact represent the enzyme before and after the full formation of the active centre, which occurs simultaneously with the induced-fit motions. We therefore conclude that the class I mutants disturb the induced fit much more than they disturb the binding of ATP.
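The two-conformer explanation can be made concrete with a minimal rapid-equilibrium sketch. If L = [Einact]/[Eact] and only Eact binds a substrate with intrinsic Michaelis constant Km, the rate law becomes v = Vmax[S]/(Km(1 + L) + [S]), so the apparent Km rises by the factor (1 + L) while Vmax is unchanged. The Python below illustrates this in a single-substrate simplification; a full two-substrate treatment is not attempted, and all numerical values and function names are hypothetical rather than taken from the kinetic data.

```python
def apparent_km(km_intrinsic, l_ratio):
    """Apparent Michaelis constant when a fraction of free enzyme sits in a
    non-binding conformer; l_ratio = [E_inact] / [E_act]."""
    return km_intrinsic * (1.0 + l_ratio)


def rate(substrate, vmax, km_intrinsic, l_ratio):
    """Michaelis-Menten rate with the conformer equilibrium folded into Km."""
    return vmax * substrate / (apparent_km(km_intrinsic, l_ratio) + substrate)


# Hypothetical numbers, chosen only to show the trend.
KM_INTRINSIC = 0.1  # arbitrary concentration units
for label, l_ratio in (("mostly E_act (assumed)", 0.1),
                       ("shifted towards E_inact (assumed)", 9.0)):
    print(label, "apparent Km =", round(apparent_km(KM_INTRINSIC, l_ratio), 3))
```

In this simplification a shift of the equilibrium towards Einact raises the apparent Km of whichever substrate is considered, without any change in the intrinsic affinity of Eact.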
Classical NTP-binding Fold and Related Proteins

It was recognized early on that the Gly-loop that forms the giant anion hole for accommodating the β- and α-phosphates of N1TP in the adenylate kinases can also be found in F1-ATPase31,32 and myosin.33 The Gly-loop is the main part of the characteristic sequence fingerprint Gly-X-X-Gly-X-Gly-Lys, which is now known to occur in a large number of NTP-binding and NTP-processing proteins, indicating that it is an important structural feature. In the structurally known cases the Gly-loop is located between the first β-strand and the following α-helix of a chain fold consisting of a total of four β-strands and four surrounding helices. This chain fold is essentially the main body of the NMP-kinases and of the G-proteins H-ras-p21 and elongation factor Tu. It is also observed in the more distantly related proteins flavodoxin, CheY and γ,δ-resolvase. I call it the classical NTP-binding fold.34

Finding a sequence fingerprint in a characteristic chain fold is usually taken as a clear indication of a relationship by divergent evolution. For the NTP-binding proteins, however, the characteristic chain fold is small and therefore not very significant; it constitutes only the central part of otherwise different chain folds. Considering the whole chain folds, we find sequence conservation at the Gly-loop over times where the chain folds changed appreciably, which is not at all common. Still, divergent evolution within this protein group was recently confirmed by the structure of a protein kinase,35 which showed similar NTP binding in a completely unrelated chain fold.
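Scanning a sequence for the fingerprint Gly-X-X-Gly-X-Gly-Lys amounts to a simple pattern match. The Python sketch below expresses the fingerprint as a regular expression over one-letter amino acid codes and reports 1-based start positions. The example sequence is a made-up illustration, not a database entry, and the function name is hypothetical.

```python
import re

# Gly-X-X-Gly-X-Gly-Lys in one-letter code; '.' stands for any residue X.
# The lookahead lets overlapping occurrences be reported as well.
FINGERPRINT = re.compile(r"(?=(G..G.GK))")


def find_fingerprint(sequence):
    """Return (1-based start position, matched segment) for every occurrence."""
    return [(m.start() + 1, m.group(1))
            for m in FINGERPRINT.finditer(sequence.upper())]


# Made-up sequence for illustration only.
toy_sequence = "MASTKIIFVVGGPGSGKGTQAEKL"
print(find_fingerprint(toy_sequence))
```

A match of this kind only flags a candidate site; as the text stresses, the fingerprint is short, so structural evidence is needed before any evolutionary relationship is inferred.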
In the protein kinase, the β- and α-phosphates bind at a glycine-rich loop connecting neighbouring antiparallel β-strands, while the (strongly conserved) lysine binding to the β- and γ-phosphates comes from a position distant along the chain. This seems to be a clear case of convergent evolution that demonstrates how important the giant anion hole and the lysine are for phosphoryl transfer.

A closer inspection of the two known high-resolution structures of Gly-loops in NTP-binding proteins22,24 showed that the second and third glycines of the fingerprint assume main-chain dihedral angles that are clearly forbidden for residues with side chains. The corresponding angles of the first glycine, though forbidden, are near to the allowed extended chain region.22 It is therefore peculiar that just this first glycine is particularly well conserved, while for instance the second glycine (strongly forbidden) is an alanine in the G-protein elongation factor Tu. Taken together, this peculiarity, the strong conservation of the combination of giant anion hole and lysine, the large induced-fit movement of the AMP-kinases and the observation that the Gly-loop can indeed move in a crystal-form transition of AK1 indicate that dynamical rather than static properties render this structural feature so important. While one particular static structure, e.g. for NTP binding, can be built up in many ways, it is much more difficult to construct a polypeptide that assumes a series of dynamically related structures, as the Gly-loop obviously does. This rationalizes the extreme conservation of the Gly-loop.

I thank Drs. E. Schiltz and A. G. Tomasselli for discussions, and numerous students for their basic work at our institute as reported in the publications.

References
1 W. P. Jencks, Adv. Enzymol., 1975, 43, 219.
2 P. A. Karplus and G. E. Schulz, J. Mol. Biol., 1989, 210, 163.
3 M. S. Weiss, U. Abele, J. Weckesser, W. Welte, E. Schiltz and G. E. Schulz, Science, 1991, 254, 1627.
4 R. C. McDonald, T. A.
Steitz and D. M. Engelman, Biochemistry, 1979, 18, 338.
5 G. E. Schulz, Cold Spring Harbor Symp. Quant. Biol., 1987, 52, 428.
6 A. Berger, E. Schiltz and G. E. Schulz, Eur. J. Biochem., 1989, 184, 433.
7 T. Stehle and G. E. Schulz, J. Mol. Biol., 1990, 211, 249.
8 T. Stehle and G. E. Schulz, J. Mol. Biol., in the press.
9 P. Liljelund, A. Sanni, J. D. Friesen and F. Lacroute, Biochem. Biophys. Res. Commun., 1989, 165, 464.
10 A. G. Tomasselli and L. H. Noda, Eur. J. Biochem., 1979, 93, 263.
11 K. Diederichs and G. E. Schulz, J. Mol. Biol., 1991, 217, 541.
12 A. Heil, G. Muller, L. H. Noda, T. Pinder, R. H. Schirmer and I. von Zabern, Eur. J. Biochem., 1974, 43, 131.
13 G. E. Schulz, M. Elzinga, F. Marx and R. H. Schirmer, Nature (London), 1974, 250, 120.
14 D. Dreusicke and G. E. Schulz, J. Mol. Biol., 1988, 203, 1021.
15 D. Dreusicke, P. A. Karplus and G. E. Schulz, J. Mol. Biol., 1988, 199, 359.
16 E. F. Pai, W. Sachsenheimer, R. H. Schirmer and G. E. Schulz, J. Mol. Biol., 1977, 114, 37.
17 K. Diederichs and G. E. Schulz, Biochemistry, 1990, 29, 8138.
18 D. C. Fry, D. M. Byler, H. Susi, E. M. Brown, S. Kuby and A. S. Mildvan, Biochemistry, 1988, 27, 3588.
19 G. E. Schulz, C. D. Barry, J. Friedman, P. Y. Chou, G. D. Fasman, A. V. Finkelstein, V. I. Lim, O. B. Ptitsyn, E. A. Kabat, T. T. Wu, M. Levitt, B. Robson and K. Nagano, Nature (London), 1974, 250, 140.
20 U. Egner, A. G. Tomasselli and G. E. Schulz, J. Mol. Biol., 1987, 195, 649.
21 C. W. Müller and G. E. Schulz, J. Mol. Biol., 1988, 202, 909.
22 C. W. Müller and G. E. Schulz, J. Mol. Biol., 1992, 224, 159.
23 A. M. deVos, L. Tong, M. V. Milburn, P. M. Matias, J. Jancarik, S. Noguchi, S. Nishimura, K. Miura, E. Ohtsuka and S-H. Kim, Science, 1988, 239, 888.
24 E. F. Pai, U. Krengel, G. A. Petsko, R. S. Goody, W. Kabsch and A. Wittinghofer, EMBO J., 1990, 9, 2351.
25 F. Jurnak, Science, 1985, 230, 32.
26 B. F. C. Clark, M. Kjeldgaard, T. F. M. LaCour, S. Thirup and J. Nyborg, Biochim. Biophys. Acta, 1990, 1050, 203.
27 G. E. Schulz, C. W. Müller and K. Diederichs, J. Mol. Biol., 1990, 213, 627.
28 C. Reuner, M. Hable, M. Wilmanns, E. Kiefer, E. Schiltz and G. E. Schulz, Protein Seq. Data Anal., 1988, 1, 335.
29 D. Dreusicke and G. E. Schulz, FEBS Lett., 1986, 208, 301.
30 M-D. Tsai and H. Yan, Biochemistry, 1991, 30, 6806.
31 J. E. Walker, M. Saraste, M. J. Runswick and N. J. Gay, EMBO J., 1982, 8, 945.
32 P. D. Vogel and R. L. Cross, J. Biol. Chem., 1991, 266, 6101.
33 J. Karn, S. Brenner and L. Barnett, Proc. Natl. Acad. Sci. USA, 1983, 80, 4253.
34 G. E. Schulz, Curr. Opinion Struct. Biol., 1991, 2, 61.
35 D. R. Knighton, J. Zheng, L. F. TenEyck, V. A. Ashford, N-H. Xuong, S. S. Taylor and J. M. Sowadski, Science, 1991, 253, 407.

Paper 2/00193D; Received 14th January, 1992
ISSN:1359-6640
DOI:10.1039/FD9929300085
Publisher: RSC
Year: 1992
Data source: RSC
|
9. |
Structural and evolutionary relationships in lipase mechanism and activation |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 95-105
G. Guy Dodson,
Abstract:
Faraday Discuss., 1992, 93, 95-105

Structural and Evolutionary Relationships in Lipase Mechanism and Activation

G. Guy Dodson and David M. Lawson
Department of Chemistry, University of York, Heslington, York YO1 5DD, UK

Fritz K. Winkler
Central Research Units, F. Hoffmann-La Roche Ltd., CH-4002 Basel, Switzerland

Lipases that break down triglycerides to monoglycerides and glycerol are characterised by low or no activity in water; in the presence of an oil/water interface, however, their activity increases markedly. The structural and chemical basis for this phenomenon, referred to as interfacial activation, has been revealed by the crystal structures of a fungal lipase and a human pancreatic lipase which evidently have a divergent evolutionary history. These studies reveal that:
(1) In both enzymes the catalytic sidechains are Asp:His:Ser, the same as occur in the serine proteases. The active atoms of this catalytic triad have essentially identical stereochemistry in the serine proteases and in these two lipases. The amino acids themselves, however, have quite different conformations and orientations.
(2) In both enzymes the catalytic groups are buried and inaccessible to the surrounding solvent. Burial in these two lipases is brought about by a small stretch of helix (the lid) which sits over the active site.
(3) In both enzymes this helical lid presents non-polar sidechains over the catalytic group, and polar sidechains to the enzyme surface. Although the 'lids' are very similar in construction in the two enzymes, they belong to very different parts of the polypeptide chain.
(4) Although the amino acid sequences have no identity (except at the active serine), the two enzymes show a similar architectural framework consisting of a central five-stranded parallel β sheet structure. The catalytic groups decorate this β sheet structure in a strikingly similar way, though there are also some significant differences.
The crystal structure of the complex between the fungal enzyme and a substrate analogue demonstrates how the helical lid is displaced to reveal the active site. The movement of the lid also greatly enlarges the non-polar surface at the active surfaces and buries previously exposed polar residues. The movement of the lid also helps to create the appropriate movement at the oxyanion hole. It is possible to define the stereochemistry at the active site and to identify the positioning of the fatty acid and the glycerol moieties.

Lipases are enzymes which break down triglycerides to di- and mono-glycerides, glycerol and free fatty acids. The ester bond which links the fatty acid and glycerol moieties is hydrolysed by a nucleophilic attack through an activated serine. The enzymes occur very widely in nature and vary greatly in size, specificity and catalytic properties. Characteristically the enzymes are inactive or very poorly active in aqueous conditions, where their lipid substrates are insoluble.1 The presence of an oil/water interface formed by lipid micelles, however, activates the lipases2 in a fashion kinetically similar to that seen in the activation of phospholipase A2, but has a completely different structural basis.

Fig. 1 Schematic diagrams showing the approximate arrangement of secondary structure elements in: (a) Rhizomucor miehei lipase, (b) human pancreatic lipase, (c) Geotrichum candidum lipase, (d) acetylcholine esterase and (e) wheat-germ serine carboxypeptidase. The β strands are depicted as arrows and numbered with respect to sequence from the N-terminus. α helices are represented by cylinders. Important residues are labelled (referred to in the text), and the conserved central motif of five parallel β-strands is indicated in each case by the Roman numerals I-V (see also Fig. 3).
The N- and C-termini are labelled, respectively, as 'N' and 'C', although in the case of the carboxypeptidase the labels are given subscripts 'A' or 'B' to distinguish between the termini of the two polypeptide chains. N.B. The β sheets are not in fact planar as these diagrams suggest, but are twisted, such that in the case of Rml, for example, strands 1 and 8 lie at an angle of 90° to each other when viewed down the length of the sheet. (a), (b) and (e) were drawn by referring directly to crystallographic atomic coordinates, whereas (c) and (d) were produced using approximate coordinates derived from published stereo figures.

Recently there have been a series of X-ray studies on lipases and related enzymes, which have revealed the structural basis of their catalytic mechanism. It became clear from these studies that:
(1) Although apparently unrelated in sequence, these enzymes share a common central core of five parallel β-strands.
(2) The catalytic serine is part of a triad equivalent to that found in the serine proteases.
(3) The catalytic serine proves, so far, to be sited in exactly the same place on the central β-sheet structure.
(4) The carboxylic acid component of the catalytic triad is variable in its position and may be either an aspartic acid or a glutamic acid.
(5) The catalytic histidine is located on loops or extensions C-terminal to the other catalytic residues. Even though it does not sit on the central five β-strand system, it is presented to the other two members of the catalytic triad in a very similar fashion.
(6) In the lipases, the active site is shielded from the surrounding environment by loops and/or helices which are presumably displaced during the process of interfacial activation. A recent crystallographic study using Rhizomucor miehei lipase and a substrate analogue confirms this hypothesis.
The variations in the tertiary structure, at the catalytic residues and in their manner of burial seen in the lipase family are a striking example of evolution at the molecular level. In this paper, the structural relationships between a fungal lipase from the yeast Rhizomucor miehei (Rml) and human pancreatic lipase (Hpl)6 will serve as a point of reference for the other related enzymes: lipase from Geotrichum candidum (Gcl), acetylcholine esterase from Torpedo californica (Ace) and wheat-germ serine carboxypeptidase (Wsc). These are illustrated schematically in Fig. 1. It should be stressed that the latter two enzymes are not lipases and do not have buried active centres. However, in both cases, the catalytic residues lie at the bottom of a hydrophobic pocket.

Two other structurally related enzymes have recently been described, namely dienelactone hydrolase from Pseudomonas sp.11 and halogenoalkane dehalogenase from Xanthobacter autotrophicus.12 These, however, differ from the other enzymes in that the nucleophilic residues of their catalytic triads are, respectively, cysteine and aspartate, not serine. A similar central β sheet also occurs in carboxypeptidase A,13 although the location and nature of the active centre are entirely unrelated. These three proteins will not be considered any further in this discussion.

Sequence and Chain Structure in Human and Fungal Lipase

One of the most striking differences between the amino acid sequences is their length. For example, the human pancreatic lipase (Hpl) contains 449 amino acids,14 while the fungal enzyme from Rhizomucor miehei (Rml) contains 269 residues. Other related lipases exhibit the same kind of variation. There are virtually no detectable sequence identities or similarities in this family, apart from that in the immediate environment of the catalytic site.
Here there is a common motif Gly-X1-Ser-X2-Gly, where X1 and X2 are variable residues. Nevertheless, the latter are both conserved between the Rml and Hpl sequences, being, respectively, His and Leu. Also conserved is the unusual positive φ angle (ca. 60°) of the active serine, which adopts a conformation normally seen only in glycine residues. This is because it lies on a tight junction between a β strand and an α helix. Curiously, although the pentapeptide consensus sequence is also seen in serine proteases (with the exception that the last Gly is sometimes replaced by Ala), the serine always assumes a more conventional configuration (see Fig. 2).

In contrast to the diversity seen in the primary structure, there is a much greater conservation in the secondary and tertiary structures of the Hpl, Rml, Gcl and other related enzymes. They all contain an essentially identical central structure of five β-strands, referred to from the N terminus as I-V (see Fig. 1 and 3). These five strands contain the catalytically active serine and carboxylic acid. It is interesting that the strands in this common structure are all parallel and are connected in the simplest fashion, i.e. in order of their sequence. The loops connecting the common strands, however, vary markedly in size and structure, although they all share the feature of the stretch of helix between strands II and III. When the main-chain atoms of this conserved motif are taken from the Rml and Hpl structures and superimposed, using a least-squares fit algorithm, the agreement is quite remarkable (see Fig. 4), especially when one considers that only eight out of the 49 residues involved in this comparison (which includes the pentapeptide consensus sequence) are conserved in the two sequences. Note that with the exception of Hpl, which has a non-catalytic C-terminal domain, all these enzymes are single-domain proteins. Moreover, the calcium binding site which occurs between strands II and III in Hpl is not observed in any of the other enzymes [see Fig. 1(b)].

Fig. 2 Ramachandran plot showing the main chain torsion angles of the active serines from Rhizomucor miehei lipase (Rml), human pancreatic lipase (Hpl), Streptomyces griseus trypsin (Sgt), chymotrypsin (Cha) and subtilisin Carlsberg (Sbc). [Plot labels: SGT (Ser-195), CHA (Ser-195), SBC (Ser-221), RML (Ser-144), HPL (Ser-152); horizontal axis: φ/degrees.]

Fig. 3 Schematic diagram showing the basic folding motif seen in all the structures considered in this paper. The relative positions of the catalytic residues are indicated. The 'Asp' in brackets refers to the position of the catalytic carboxyl in Hpl.

Fig. 4 Figure showing how closely the β-sheets of Rml and Hpl overlap. The α-carbon positions of Rml are indicated by the narrow lines and those of Hpl by the thick lines. N.B. the three strands not included in the conserved motif (see Fig. 3) run in opposite directions in the two structures. Strand '0' of the Rml and strand '8' of the Hpl are not shown.

In all the lipases (and the related enzymes), the complete β-sheet structure is often considerably larger than this common core of five β-strands (see Fig. 1). Thus there are a further four β-strands extending from the central core in both Hpl and Rml, and another six strands in Gcl, Wsc and Ace. Fig. 1(a) and 1(b) highlight the similarities and differences in the secondary and tertiary structures between Rml and Hpl. Most noticeable is the N-terminal helix extending across the β-sheet structure in Rml. This helix follows a short stretch of β-conformation at residues 5-8 (labelled '0' in the diagram) which completes the β-structure. By contrast, the N-terminal residues in Hpl form a small isolated antiparallel β-sheet.
Also, while in Rml the ordering of strands 1-8 is according to sequence, in Hpl the second strand of the main sheet is in fact the third with respect to sequence, and arises from a loop between the core strands I and II. A similar cross-over connection is also seen at the analogous positions in Gcl, Ace and Wsc.

Catalytic Residues

Both the Hpl and the Rml enzymes have a similar grouping of Asp:His:Ser, which is very closely reminiscent of the catalytic triad of serine proteases, as is illustrated in Fig. 5. However, although the active atoms of the triad occupy similar positions, and lie roughly in the plane of the imidazole ring, the polarity of the main chain which supports the active serine residue is reversed in the lipases, such that the seryl hydroxyl group is presented from the opposite side of this plane. This inverts the stereochemistry of the triad and presumably has important mechanistic implications. Nevertheless, the basic charge-relay mechanism that generates the nucleophilic character of the serine, which is essential for catalysis, is doubtless conserved between the two enzyme families.

Fig. 5 Comparison of catalytic triads. (a) Triads from Rml (thin lines) and Hpl (thick lines) superimposed. (b) Overlap of active site residues from Streptomyces griseus trypsin (thin lines), chymotrypsin (medium lines) and subtilisin Carlsberg (thick lines). Note that the main chain direction of the active serine seen for the lipases in (a) is opposite to that seen for the proteases in (b).

The construction of the oxy-anion hole also appears to be different between the Hpl and Rml. In the substrate analogue complex of the Rml the electron density suggests that Ser-82 is brought adjacent to the oxy-anion by the activation process and at the same time helps to stabilize the reaction intermediate.
In addition, the amide nitrogen of Leu-145 is also within H-bonding distance of the oxy-anion hole and, since it sits at the N-terminus of the conserved helix (which occurs between β-strands II and III), the virtual positive charge created at this position by the helix dipole would presumably strengthen the interaction. This effect will be accentuated by the predominantly hydrophobic character of the helix and its surroundings. Therefore, in lipases, the interactions which stabilize the oxy-anion hole are quite different from those seen in serine proteases (e.g. trypsin uses the amide nitrogens of the active serine, Ser-195, and Gly-193), as a direct consequence of the inverted stereochemistry of the active site.

There is one significant difference between Hpl and the other enzymes: the aspartic acid of the triad in Hpl is at the end of strand III, while in Rml, for example, it is at the end of strand IV. Curiously, there is in the Hpl structure an aspartic acid on strand IV which matches in position the catalytic aspartic acid in Rml. Fig. 6 illustrates the arrangement of the aspartic acids in the two enzymes. It is possible that this second Asp (D205) was the catalytic carboxyl in an ancestral form of the enzyme, although this would require changes in the local structure.

Fig. 6 Figure showing the overlap of the active sites of Rml (thin lines) and Hpl (thick lines). The catalytic serines and histidines agree closely, as do the conserved histidines, which occur N-terminal to the serines, and the tryptophans which form part of the lid helices. However, the catalytic Asp of Hpl is located at the end of strand III. Nevertheless, there is another Asp at the end of strand IV which occupies a similar position to the catalytic Asp of Rml, although it does not interact with the histidine.
In Hpl, the His is part of a helix which arises from a loop formed between core strand V and the final strand of the main β-sheet, while in Rml the His is in an extended C-terminal segment with no defined secondary structure, but stabilized by a disulfide bridge.

Deacylation of the acyl intermediate by a water molecule is required to regenerate the enzyme prior to cleavage of a further ester bond. There is a well defined buried water molecule hydrogen-bonded to the active serine in Rml, but none in evidence in Hpl. However, the presence of polar and charged side chains adjacent to the catalytic site provides a local hydrophilic environment, which could trap water during this enzyme’s more complex activation. Buried water molecules are also reported in Gcl and in the Acho enzymes, trapped near the active site. The hydrated side chains in Gcl are carboxylic acids, and equivalents to these exist in the other lipases, suggesting they play a functional role in the enzyme’s catalysis. Nevertheless, it is worth noting that the enzyme only adsorbs to the oil/water interface and probably does not become completely engulfed by its substrate. It should therefore be able to scavenge water molecules from the aqueous phase in order to carry out the deacylation reaction.
Activation

The mechanism of activation for Rml is suggested immediately by its crystal structure, in which a segment of helix (85-91) is positioned on top of the catalytic site, effectively acting as a lid by burying it in a non-polar environment. The helix is supported by two extended and flexible segments, which suggest that activation could be achieved simply by the rolling away of the helix. The crystal structure of the Rml enzyme complexed to the substrate analogue n-hexyl chlorophosphonate ethyl ester confirmed this prediction exactly: the lid helix was displaced to one side, thereby exposing the catalytic triad (ref. 8). The resultant complex is thought to be reminiscent of the tetrahedral transition state which forms on binding the triglyceride substrate (see Fig. 7). The acyl moiety of the inhibitor sits in a hydrophobic groove which is partly pre-existing, and partly generated by the activation process. This is presumably the binding pocket for the leaving fatty acid. A tryptophan residue located on the underside of the lid, which in the uncomplexed enzyme blocks access to the active serine [see Fig. 1(a) and 6], faces outward in the activated conformation, and presumably interacts favourably with the oil/water interface.

This displacement of the helix lid considerably enlarges the non-polar surface around the active site by exposing previously buried hydrophobic residues, which would of course favour (and be favoured by) interaction with a lipid interface. At the same time, the polar residues on the lid helix, and on the surface adjacent to it, become buried. Moreover, numerous water molecules, which form a well defined network on this polar surface in the inactive enzyme, become displaced, thereby adding a favourable entropic term to the stability of the activated enzyme. The very simplicity of this movement suggests the opening and closing of the active site could be intrinsic.
Under polar, aqueous conditions only the closed conformation is stable and the enzyme is inactive. Equally, the hydrophobic environment at the oil/water interface will stabilize the non-polar surfaces generated by displacement of the lid, and allow the enzyme to become active (see Fig. 8).

Fig. 7 Figure comparing the reaction mechanism for the formation of the tetrahedral enzyme-substrate complex (a) to that for the formation of the enzyme-inhibitor complex (b). The inhibitor is n-hexyl chlorophosphonate ethyl ester (DG = diglyceride, R = fatty acyl chain).

Fig. 8 Schematic diagram showing the electrostatic changes associated with lipase activation. Hydrophilic surfaces are coloured black, hydrophobic surfaces are stippled, and the active site is hatched. Water molecules are shown as crosses, and triglyceride molecules are indicated by black spots. In the inactive enzyme (a) there is very little exposed hydrophobic surface. However, when the enzyme encounters a lipid micelle (b), it becomes activated: the lid rolls back into a hydrophobic trench previously filled with water molecules. This opens up a large hydrophobic patch which interacts favourably with the non-polar interface. At the same time, the active site becomes exposed, so that triglyceride molecules can enter the active site and lipolysis can take place.

The movement of the extended segment 82-84, associated with the displacement of the helical lid, brings Ser-82 adjacent to the oxy-anion hole, and it appears from electron density maps that the Ser amide nitrogen and side-chain Oγ are both H-bonding to the phosphonyl oxygen. Note that the direction of the movement by Ser-82 is opposite to that of the helix lid.
However, further movement of the helix would have the effect of withdrawing Ser-82 from the oxy-anion hole, which suggests that the full extent of the movement by the helix lid is seen in this substrate analogue complex (see Fig. 9).

The mechanisms by which the other lipases are activated are not yet defined experimentally. In general, the domains burying the catalytic residues in the aqueous environment have a more complex structure, which could not expose the active surfaces by a single or simple conformational change. This is true of Hpl where, although there is a small helix burying the active triad [which also bears a tryptophan residue adjacent to the active serine, see Fig. 1(b) and 6], its removal is not sufficient to allow the triglyceride to bind (ref. 6). The situation in Gcl appears to be more complex, since the catalytic triad is covered by a pair of helices [see Fig. 1(c)], which would both have to move in order to allow unimpeded access to the active centre.

Conclusions

The general similarities in the catalytic stereochemistry, activation requirements and structural core of lipases suggest that they may share a common evolutionary origin. Nevertheless, if this is true, there has been remarkable divergence in the sequences of the enzymes, in the positioning of some of the catalytic residues and in the peripheral protein structure. For example, the trypsin protease family, which is very ancient and has diverged in many ways, still exhibits similarities in sequence and structure which are easily detectable (ref. 16). Perhaps the larger variations seen in the lipase families are associated with the less demanding stereochemistry needed for hydrolysing esters.

Fig. 9 Figure showing changes at the active site in Rml as a result of activation. (a) View looking down on the active site of the inactive enzyme. The tryptophan residue which blocks access to the active Ser is not shown.
Note that the side chain of Ser-82 faces away from the active site. (b) The same view, but with the inhibitor covalently bound to Ser-144. The lid helix is clearly displaced away from the active centre, but Ser-82 now helps to stabilize the oxy-anion hole of the phosphonyl oxygen through favourable interactions with its amide nitrogen and hydroxyl group. (c) Shows (a) and (b) superimposed. (d) Shows a side view of the active site in the enzyme-inhibitor complex. A further hydrogen bond with the amide nitrogen of Leu-145 adds to the stability of the oxy-anion hole. This interaction will be strengthened by the dipole moment of the helix of which Leu-145 forms the N-terminus.

The study of lipase structures is currently a very active field of research, which is stimulated by their biochemical interest and industrial significance. We anticipate that the determination of more three-dimensional structures will help to resolve existing problems regarding lipase structure-function relationships and their evolution, but will no doubt raise further fundamental questions.

The contributions of Dr. A. M. Brzozowski, Dr. Z. S. Derewenda and our many colleagues in the laboratories of Novo Nordisk A/S (Copenhagen) and Hoffmann-La Roche (Basle) are gratefully acknowledged. We also thank Dr. T. J. Oldfield for essential assistance in the preparation of schematic figures for the different lipase molecules.

References

1 H. L. Brockman, in Lipases, ed. B. Borgstrom and H. L. Brockman, Elsevier, Amsterdam, 1984, pp. 1-46.
2 P. Desnuelle, L. Sarda and G. Ailhard, Biochim. Biophys. Acta, 1960, 37, 570.
3 D. L. Scott, S. P. White, Z. Otwinowski, W. Yuan, M. H. Gelb and P. B. Sigler, Science, 1990, 250, 1541.
4 D. Blow, Nature (London), 1991, 351, 444.
5 R. L. Brady, A. M. Brzozowski, Z. S. Derewenda, E. J. Dodson, G. G. Dodson, S. P. Tolley, J. P. Turkenburg, L. Christiansen, B. Huge-Jensen, L. Norskov, L. Thim and U. Menge, Nature (London), 1990, 343, 767.
6 F. K. Winkler, A. D'Arcy and W. Hunziker, Nature (London), 1990, 343, 771.
7 J. D. Schrag, Y. Li, S. Wu and M. Cygler, Nature (London), 1991, 351, 761.
8 A. M. Brzozowski, U. Derewenda, Z. S. Derewenda, G. G. Dodson, D. M. Lawson, J. P. Turkenburg, F. Bjorkling, B. Huge-Jensen, S. Patkar and L. Thim, Nature (London), 1991, 351, 491.
9 J. L. Sussman, M. Harel, F. Frolow, C. Oefner, A. Oefner, A. Goldman, L. Toker and I. Silman, Science, 1991, 253, 872.
10 D. I. Liao and S. J. Remington, J. Biol. Chem., 1990, 265, 6528.
11 D. Pathank and D. Ollis, J. Mol. Biol., 1990, 214, 497.
12 S. M. Franken, H. J. Rozeboom, K. H. Kalk and B. W. Dijkstra, EMBO J., 1991, 10, 1297.
13 D. C. Rees, M. Lewis and W. N. Lipscomb, J. Mol. Biol., 1983, 168, 367.
14 M. E. Lowe, J. L. Rosemblum and A. W. Strauss, J. Biol. Chem., 1989, 264, 20042.
15 E. Boel, B. Huge-Jensen, M. Christensen, L. Thim and N. Fiil, Lipids, 1988, 23, 701.
16 D. Blow, J. J. Birktoft and B. S. Hartley, Nature (London), 1969, 221, 337.
17 W. G. J. Hol, P. T. van Duijnen and H. J. C. Berendsen, Nature (London), 1978, 273, 443.

Paper 2/00072E; Received 6th January, 1992
ISSN:1359-6640
DOI:10.1039/FD9929300095
Publisher: RSC
Year: 1992
Data source: RSC
|
10. |
General discussion |
|
Faraday Discussions,
Volume 93,
Issue 1,
1992,
Page 107-129
D. Eisenberg,
|
Abstract:
Faraday Discuss., 1992, 93, 107-129

GENERAL DISCUSSION

Prof. D. Eisenberg (University of California at Los Angeles, USA) opened the discussion: Within the community of those who study protein folding, there has been much discussion of the existence of folding intermediates, but without as yet much detailed information on their structures. To fill this void, would it be useful to regard the structures of uncleaved serpin-like proteins as analogues of folding intermediates on the folding pathway to the rearranged serpins?

Dr. M. F. Perutz (MRC Laboratory, Cambridge) responded: This is an interesting suggestion. You could regard the active serpin as a folding intermediate that is prevented from taking up the native conformation of least energy by a chaperone, e.g. vitronectin for the plasminogen activator inhibitor or heparin for antithrombin.

Prof. A. D. Buckingham (University of Cambridge) asked: You have drawn attention to the important role of hydrophobic forces in determining the structure of a protein. Since these forces arise from a dislike of liquid water by hydrocarbons, one might think that they are not highly specific in their effects. Can we quantify their contribution to free-energy changes in protein denaturation?

Dr. Perutz replied: The hydrophobic cores of proteins contribute to their stability by dispersion forces and by the hydrophobic effect. Suppose we bury a methane molecule in a hole within the hydrophobic core, and that methane makes contact with four methyl or methylene groups of the protein. Dispersion forces would provide ca. 200 cal mol⁻¹ at each contact, or 800 cal in total. The hydrophobic effect contributes ca. 24 cal mol⁻¹ for every Å² of buried hydrophobic surface. If we take 1.75 Å as the radius of methane, this gives 38.5 Å² of buried surface, or 923 cal. This simple calculation suggests that dispersion forces and the hydrophobic effect contribute about equally to the stability of proteins.

Prof. L. Graf (Eötvös Loránd University, Budapest, Hungary) opened the discussion of the paper by Prof. Thornton: Your study is based on the assumption that ‘both inhibitors and substrates bind to the proteinase in the same manner’, suggesting that a segmental mobility of the substrates is required to adopt an inhibitor-like binding conformation. My view is that while a conformational mobility of the nick-site segment in the protein substrates is most likely, the final binding conformation of these sites cannot be identical with that of inhibitors. If your assumption were valid, substrates would turn out to be inhibitors rather than substrates, or proteinase inhibitors would be hydrolysed by proteinases at the same rate as substrates. However, this is not the case. The apparent contradiction might be resolved by assuming that the structure of serine proteinases undergoes some changes to complement those of the substrates throughout the catalytic reactions. Such changes of the serine proteinase structure clearly do not occur upon inhibitor binding. My suggestion on substrate (but not inhibitor) initiated conformational changes is supported by our kinetic studies of trypsin and a variety of trypsin mutants on a series of synthetic substrates (refs. 1, 2). A series of alterations of the substrate-binding pocket of trypsin (residues 189-195, 214-220 and 225-228) were made via site-directed mutagenesis to produce mutants with chymotrypsin-like specificity. The mutations, however, did not result in specific proteinases. Their affinities (Km was found to be identical with Ks) toward both trypsin-like and chymotrypsin-like peptide substrates are comparable to those of trypsin and chymotrypsin toward their specific substrates.
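As an aside on Dr. Perutz's estimate earlier in this discussion, the arithmetic can be reproduced in a few lines. The sketch below is only a check of the quoted figures, assuming (as the reply implies) that the buried surface is that of a sphere with the stated methane radius; the function and constant names are illustrative and do not come from the discussion.

```python
import math

# Figures quoted by Dr. Perutz; they are taken as given, not re-derived.
DISPERSION_CAL_PER_CONTACT = 200.0  # cal/mol for each methane-protein contact
CONTACTS = 4                        # contacts with methyl or methylene groups
HYDROPHOBIC_CAL_PER_A2 = 24.0       # cal/mol per square angstrom of buried surface
METHANE_RADIUS_A = 1.75             # angstrom


def dispersion_contribution(per_contact=DISPERSION_CAL_PER_CONTACT,
                            contacts=CONTACTS):
    """Total dispersion energy: energy per contact times number of contacts."""
    return per_contact * contacts


def sphere_area(radius):
    """Surface area of a sphere, 4*pi*r^2 (assumed model for buried surface)."""
    return 4.0 * math.pi * radius ** 2


def hydrophobic_contribution(radius=METHANE_RADIUS_A,
                             per_area=HYDROPHOBIC_CAL_PER_A2):
    """Hydrophobic energy: energy per unit area times buried sphere area."""
    return per_area * sphere_area(radius)


if __name__ == "__main__":
    print(f"dispersion:  {dispersion_contribution():.0f} cal/mol")
    print(f"buried area: {sphere_area(METHANE_RADIUS_A):.1f} square angstrom")
    print(f"hydrophobic: {hydrophobic_contribution():.0f} cal/mol")
```

Under these assumptions the two terms come out within roughly 15% of each other, which is the sense in which they are said to contribute "about equally".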
On the other hand, they hydrolyse both kinds of substrates with similarly low efficiency. Thus, the redesigned and apparently proper enzyme-substrate interactions in the ground state do not properly contribute to transition-state stabilization. Our interpretation of these results is that the interaction of trypsin and chymotrypsin with their own substrates might induce differential conformational changes crucial for optimal transition-state stabilization. Furthermore, we suggest that the structural unit that mediates such differential structural changes upon substrate binding may be identical with the so-called ‘activation domain’ of trypsin and chymotrypsin (ref. 3). The activation domain consists of peptide segments (residues 16-19, 142-152, 184-194 and 216-223) which are flexible (though to a different extent) in both trypsinogen and chymotrypsinogen but form tightly packed structural units in the two activated proteinases. Our view is that the different conformational flexibilities of these domains in trypsin and chymotrypsin may provide the structural basis for the different substrate specificities of these two proteinases (ref. 4). It follows from this view that inhibitors and substrates should bind to serine proteinases in a (perhaps only slightly) different way. The binding of substrates would initiate specific domain-wise structural changes in the enzymes, which are necessary for proteolysis, whereas the binding of inhibitors would ‘freeze’ the conformational flexibility of the proteinases, preventing the occurrence of proteolysis. This proposal is similar to Dufton’s hypothesis (ref. 5).

1 L. Graf, A. Jancso, L. Szilagyi, G. Hegyi, K. Pinter, G. Naray-Szabo, J. Hepp, K. Medzihradszky and W. J. Rutter, Proc. Natl. Acad. Sci. USA, 1988, 85, 4961.
2 L. Graf, I. Boldogh, L. Szilagyi and W. J. Rutter, in Protein Structure and Function, TWEL, 1990, pp. 49-51.
3 R. Huber and W. Bode, Acc. Chem. Res., 1978, 11, 114.
4 L. Graf, I. Venekei, L. Szilagyi and A. Patthy, in preparation.
5 M. J. Dufton, FEBS Lett., 1990, 271, 9.

Prof. J. Thornton (University College, London) responded: Crystallographic data on many serine proteinases and many inhibitor complexes show that the serine proteinase active-site structure changes very little either between homologous members of the family or when different inhibitors (large or small) are bound. This is in contrast to the aspartyl proteinases, where the two domains of the enzyme and the covering loop in the active site are observed to change their structures when inhibitors are bound and also show differences between homologous pairs. These data, combined with B-values from the structures, suggest that the active-site region in serine proteinases is relatively rigid. Of course we do not have data for the structure during catalysis or when a substrate binds. However, given the highly suggestive positions of the catalytic triad and the sub-site specificities of some members of the family, the strong implication is that the substrate will adopt a conformation similar to the inhibitors. It is very striking that the inhibitor loops are all very similar in conformation, despite having no evolutionary relationship. We used the inhibitor loop structure as a target conformation, and it is not necessary to require that the nick sites adopt exactly the same structure as the inhibitors for our observations to be valid. The nick sites are grossly different from the target structure and grossly different from each other. Their conformations must change, if only so that they can fit into the active-site groove which all the structural data suggest is relatively rigid. It may be that substrates do bind in slightly different conformations than inhibitors, and indeed experimental data show that the inhibitors bind very strongly.
However, it is our view that such differences will be very small close to the P1-P1′ sites, but could increase in both directions away from the cleavage site. Clearly for some members of the serine proteinase family, with quite distinct sub-site specificities, the whole binding-site cleft is important and the structure of the substrate is more likely to be close to that observed in the inhibitors.

Prof. D. M. Blow (Imperial College, London) commented: The serine proteinases such as chymotrypsin are correctly described as rigid enzymes. The example shown by Prof. Graf of kcat/Km for wild-type trypsin and some mutants shows the key point. By making a single alteration at the active site, the interaction with the best substrates is destroyed, but interactions with other substrates are not optimised. This confirms that the assumption of the paper, that the substrate must adapt to the enzyme, is probably a good one.

Prof. A. R. Fersht (University of Cambridge) said: I would like to comment on the apparent contradiction raised by Prof. Graf that polypeptide inhibitors of serine proteases cannot bind in the same conformation as substrates since they would then be substrates and not inhibitors. There seems to be no contradiction, since the crucial feature of inhibitors such as those from the BPTI family and the CI-2 families etc. is that the leaving group amine that is released on hydrolysis is strongly constrained so that it cannot leave the active site. The amino group thus blocks the access of water to the acyl enzyme and prevents the deacylation step. Furthermore, the high local effective concentration of the released amino group causes the acylation step to be reversed, so that the microequilibrium on the enzyme surface greatly favours the uncleaved inhibitor. The ready reversibility of the acylation step is shown by simple experiments on chymotrypsin. The acyl enzyme formed from acetyl-L-phenylalanine and chymotrypsin reacts four times faster with 1 mol dm⁻³ alaninamide to generate AcPheAlaNH2 than it does with 55 mol dm⁻³ H2O to form AcPhe. Thus, the hydrolysis of AcPheAlaNH2 catalysed by chymotrypsin is slowed down 44-fold by 1 mol dm⁻³ AlaNH2.

1 C. J. Longstaff, A. F. Campbell and A. R. Fersht, Biochemistry, 1990, 29, 7339.

Dr. Perutz asked: Your hypothesis could be tested by measuring the temperature dependence of proteolysis of different protein substrates, since the segmental mobility that you postulate would increase with rising temperature. Have you done this?

Prof. Thornton replied: No, we have not performed any experiments, but this seems like a good one to try.

Prof. M. Karplus (Harvard University, USA) asked: In attempting to predict nick sites you used a set of parameters based on an analysis of the known crystal structure. Most of these (i.e. accessibility, protrusion index, temperature factors) have been used by various people to identify antigenic sites on proteins. I wonder whether there is a correlation between the two types of sites and whether you have examined any differences that might exist.

Prof. Thornton replied: Indeed we developed the concept of ‘protrusion index’ to identify likely antigenic sites. The analysis of known nick sites shows that the critical factors which make a segment of polypeptide chain a good candidate for proteinase attack are, to some extent, the same as those which seem to produce immuno-dominance, i.e. accessibility, protrusion and flexibility. We have not studied the correlation in detail. However, it seems to me that there is a difference between proteinase nick-site recognition and antibody-protein antigen recognition.
There are many different antibody combining sites not designed for any specific antigen and the immune system will ‘choose’ the antibody which best recognises an antigen. In this recognition there are two contradictory forces: a flexible antigen will allow better fitting, but this will be unfavourable in energy (entropy) terms and may therefore generate weaker binding. There are, however, some well known immuno-dominant epitopes (e.g. the loop in foot and mouth disease virus) which are very flexible, suggesting that the improved fitting is more important. For the proteinases, there is probably only one binding-site conformation and the proteins have evolved to actively ‘avoid’ proteolysis. Therefore it is unlikely that we will observe a protein with an ideal nick-site loop, and the requirement for some local unfolding makes biological sense. The implication of these studies is that sites of proteolysis are also likely to be antigenic and this can provide a guide for designing antigenic peptides.

Dr. Perutz communicated: The mobility of proteins detected by NMR explains why peptide segments, whose structure in the native protein makes them misfits in the active sites of proteolytic enzymes, are nevertheless liable to be split by them. Such segments can probably take on, momentarily, a conformation that fits into, and at that moment binds to, the active site of a proteolytic enzyme.

Prof. Thornton communicated: I am sure that many of the nick-site loops are flexible in solution and will flicker in and out of the target conformation. It may be that a slightly disruptive environment, e.g. an increase in temperature or pH, will encourage the loop to be more flexible and thereby encourage proteolysis.

Prof. L. N. Johnson (University of Oxford) opened the discussion of Prof. Eisenberg’s paper: The data presented in Fig. 4 of your paper on the 3D profile scores include proteins with fewer than 200 amino acids.
The method is applicable to larger proteins, as exemplified by the work described on diphtheria toxin (ca. 500 amino acids). In large proteins the proportion of polar and non-polar amino acids is about the same as for smaller proteins. If the protein adopts a compact and roughly spherical shape, then, for large proteins, there is insufficient surface area to expose all polar groups (since surface area increases as r² and volume as r³). Large proteins often overcome this difficulty by having indentations on the surface or by folding into separate domains, but some, notably glycogen phosphorylase, have internal cavities that have buried polar groups compensated with clusters of water molecules. Does the algorithm proposed take into account such possibilities of buried polar groups and associated water clusters?

Prof. Eisenberg replied: The algorithm used does take into account the possibility of internal polar groups associated with water clusters. In application of the algorithm, the protein is defined by the coordinates of protein atoms (plus cofactors if specified). Water molecules and other non-protein atoms are deleted from the protein coordinate file before computation of accessible surface areas. This means that when the solvent-accessible surface area is computed for the protein groups adjacent to internal solvent clusters, these positions in the structure will be defined in the profile as exposed environments. The result is that the profile will give highest scores at these positions for polar amino acids, exactly the class of residues that you say are found bound to the water clusters.

Dr. F. Rippmann (E. Merck, Darmstadt, Germany) asked: Could an additional application of your method be the prediction of sites on the protein’s surface which interact with other proteins? We have heard earlier the example of plasminogen activator inhibitor (PAI), which interacts with vitronectin.
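As an aside on Prof. Johnson's scaling argument above (surface area grows as r² and volume as r³ for a roughly spherical protein), a minimal sketch makes the consequence explicit: the surface available per unit volume falls as 3/r. The radii used here are arbitrary illustrative values, not measured protein dimensions, and the function names are hypothetical.

```python
import math


def sphere_surface_area(radius):
    """Surface area of a sphere, 4*pi*r^2."""
    return 4.0 * math.pi * radius ** 2


def sphere_volume(radius):
    """Volume of a sphere, (4/3)*pi*r^3."""
    return (4.0 / 3.0) * math.pi * radius ** 3


def surface_to_volume(radius):
    """Surface area per unit volume; algebraically this equals 3/r."""
    return sphere_surface_area(radius) / sphere_volume(radius)


if __name__ == "__main__":
    # Illustrative radii in angstrom (assumed values for the sketch only).
    for radius in (10.0, 20.0, 40.0):
        print(f"r = {radius:5.1f}  surface/volume = {surface_to_volume(radius):.3f}")
```

Each doubling of the radius halves the surface-to-volume ratio, so a larger compact sphere with the same polar fraction has proportionally less surface on which to place its polar groups.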
Shouldn’t it be possible to predict where the interaction site is, as your actually determined profile should differ at the position of certain surface residues from the one expected for a non-interacting, fully solvated protein? A related example, where your method obviously works well, is heat-shock cognate protein 70 (hsc 70). Isn’t it somewhat surprising that the hsc 70 sequence is picked up by the 3D profile of actin, although the crystallographically determined hsc 70 domain is supposed to interact intimately with its peptide-binding domain (of unknown structure)? Shouldn’t your profile, maybe using a modified window, be useful to give a hint where the domain interface is?

Prof. Eisenberg replied: This is an important question that we have not yet studied in any detail. As you note, residues involved in interprotein interactions might well have a different statistical composition than residues in other places on the protein surface. Perhaps such interfacial surfaces could be recognized by extension of the profile method, but we have as yet made no attempt to do this. You also comment on the ability of the profile for actin to recognize the sequence of the heat-shock cognate 70 protein (as shown in ref. 2 of our paper). Both actin and the heat-shock cognate 70 protein bind to other proteins, and obviously their surfaces must differ in detail. However, the pattern along their sequences of polar and apolar residues has an underlying similarity, which leads to their common fold. It is this pattern of polar and apolar residues that is detected by the profile. The more subtle differences in structure that account for the different binding functions of the two proteins are not detected by the profile method, although it is conceivable that extensions of the method could look for such features.

Prof. G. G. Dodson (University of York) said: There are effective methods in X-ray analysis for detecting structural errors.
For most crystal structures these methods should be sufficient for detecting mutations in model atomic coordinates and I would generally prefer these to systems which were based on consistencies which might wrongly signal unexpected local properties in a protein.

Prof. Dodson also communicated: Have you had the opportunity to investigate the 3D profiles of any serpins or serpin-related structures?

Prof. Eisenberg responded: I agree that there are effective methods in X-ray analysis for detecting structural errors, but I would still recommend the application of profile analysis for assessing models, both derived from X-ray diffraction and from other methods. Conventional X-ray methods for checking models on the whole work best when high-resolution data are available and when the phases have already been determined at least fairly well. But the most uncertain models are those derived from X-ray studies of lower-resolution electron density maps, sometimes with rather poor initial phases. It is in these cases that profile analysis may be most helpful in signalling unusual structural segments. Also, profile analysis is particularly sensitive to one type of error in the interpretation of electron density maps to which other methods are insensitive: this is a misregistration of the sequence with the electron density. The profile window plot can reveal such a misregistration, which may be hard to spot with other methods. The second advantage of the 3D profile method is that it is equally applicable to protein models derived from NMR and computational studies. To date there are few methods for checking computational models, so profiles may play a role here.

Prof. Eisenberg also communicated: We have not yet had the opportunity to investigate the 3D profiles of serpins and serpin-related structures. As coordinates for these structures become available, profile analysis will face an unusual challenge.

Dr. P. N. Edwards (ICI Pharmaceuticals, Macclesfield) said: In your analysis, you assume that all water molecules are equal, but it seems to me that some waters are more equal than others, particularly in regard to their contribution to ‘solvent’, i.e. water-exposed, hydrophobic surface area of particular residues. To take an extreme example from what is really a continuum, a water molecule in the centre of a protein which is fully saturated in respect to hydrogen-bonding potential (two H bonds to both hydrogens and lone pairs) will still contribute to such area calculations. Please could you comment?

Prof. Eisenberg responded: You draw a distinction between a water molecule in the centre of the protein which is fully hydrogen bonded and a water molecule on the surface which is part of the external solvent. You are undoubtedly correct in suggesting that these two water molecules experience very different energies of interaction with the protein. Nevertheless, my guess is that they are likely to be associated with the same types of chemical groups, polar groups containing oxygen and nitrogen atoms. Thus in the context of profile analysis, the presence of water would signal the same type of neighbouring amino acid side chains. In short, for the purpose of profile analysis, I believe it is appropriate to treat even these different sorts of water molecule as the same.

Prof. Karplus asked: Whenever a criterion is suggested for determining whether a crystal structure is correct one wonders whether it can be used as an effective potential for protein folding. In your case, the ‘pseudo-energy’ might be the value of the profile score.
Vis-à-vis such pseudo-energies, I might mention that pair potentials of the Jernigan type, while able to determine that the correct structure is a low-energy structure when compared with folding the sequence into a known structure for a different sequence (as shown by Sippl, for example), have much more difficulty in finding the correct structure when the sequence is folded from an extended structure.

Prof. Eisenberg replied: I of course subscribe to the belief that fundamental quantum mechanics describes the structure of proteins, as of smaller molecules. Yet my experience in trying to apply conventional, approximate potential functions to understanding protein structure is that many such functions are still unable to describe structure-energy relations in a way that offers much hope for solving the protein-folding problem. Hence my interest in what you term pseudo-energies, as exemplified by the profile score. Profile scores are certainly further from a Schrödinger Hamiltonian than are conventional energy expressions used in molecular mechanics, but they are more directly derived from proteins. In the limited regime of protein folding, profile scores may represent proteins better than the more conventional functions. For this reason, Dr. Bowie and I have started computer experiments in which we use the profile score as one criterion for finding the correct folded structure as the sequence is randomly folded.

Prof. P. Halling (University of Strathclyde, Glasgow) opened the discussion of Prof. Wüthrich's paper: Could you comment on the relationship of your findings to proteins in the solid state (e.g. freeze-dried powders)? Water adsorption and desorption in these systems can be very much slower (hours or days) than the exchange times you observe in solution.
Possible reasons might be: (a) water exchange in solution is a coordinated process, rather than distinct dehydration and rehydration steps; (b) necessary protein conformation changes are much slower in the dry state; or (c) particle-level (rather than molecular-level) processes are involved.

Prof. K. Wüthrich (ETH-Hönggerberg, Zürich, Switzerland) replied: You present a nice list of possible explanations for the apparently different behaviour of proteins in the solid state. The only additional comment I may perhaps add concerns the fact that in solution we study dynamic processes under conditions of thermodynamic equilibrium. This may not hold for all the solid-state systems that you mention.

Prof. M. C. R. Symons (University of Leicester) said: Your results for long-lived water molecules are most interesting. However, it is not easy to relate residence times to solvation energies because of the mechanisms involved in water movement. Steric factors will be involved, and the principle of 'anticooperativity' will often play a part. Briefly, if, for example, a group -A- is solvated by three H2O in free solution [-A-···(HO-)3] but only by one H2O in a restricted environment [-A-···HO-], the hydrogen bond in the latter will be much stronger than any one of the three in the former and hence the water residence time will be greater. However, the total solvation energy for the former is much the greater.

1 M. C. R. Symons, unpublished results.

Dr. Perutz asked: How do the rates of exchange of internal water molecules compare with those of tritium exchange?

Prof. Wüthrich replied: Exchange measurements using the tritium technique have been used to study the exchange of polypeptide amide protons with the solvent.
Since this type of exchange reaction is governed by the chemical process at the amide group (acid or base catalysis) it is much slower than the water exchange rates, which are governed by the fluctuations of the protein. For example, in BPTI there are amide protons in contact with internal water for which the exchange is several orders of magnitude slower than the water exchange rate.

Prof. Karplus said: You raised the problem that there seems to be no correlation between the waters which have low B factors in X-ray studies of crystals and the lifetimes of bound waters as determined by NMR for water molecules on the surface of proteins. Since the X-ray determination measures position in space (or better, the probability of finding a water molecule at a given position in space), whilst NMR determines the lifetime of a water molecule relative to certain protein atoms, it is not surprising that the two types of measurements give different results. For example, a water strongly bound to a mobile side chain (e.g. an external lysine) would be visible by NMR but not by X-ray measurements. That water molecules do move with side chains has been shown by simulations (e.g. ref. 1).

1 C. L. Brooks and M. Karplus, J. Mol. Biol., 1989, 208, 159.

Prof. Wüthrich responded: I fully agree with your notion that X-ray diffraction and NMR observe different properties of surface hydration water. Considering the wealth of crystal structure data on hydration water available in the data banks, it would nonetheless be of considerable interest if some inferences on the solution behaviour of hydrated protein surfaces could be made on the basis of the crystallographic data. This was (and continues to be) a motivation for us to investigate possible correlations between the two different kinds of measurements.

Prof.
Dodson said: The differences observed in the positions of the surface water molecules in basic pancreatic trypsin inhibitor (BPTI) as determined by NMR and X-ray analysis raise a variety of questions. The observation could, at least in part, follow from the sensitivity of water molecule positions to side-chain position and conformation. This behaviour was observed in the detailed analysis of the 2Zn insulin crystal structure.¹ In this crystal there are two molecules in the asymmetric unit with, in general, very similar structures. The close structural match of the side chains in the two molecules did not, however, extend to the water molecules, which only rarely obeyed the local symmetry relating the protein molecules. Thus in addition to other factors associated with solution effects, small movements and relaxations, or even altered molecular motions, are likely to create water positions different from those seen in the crystal. Consistent with this, only six surface water molecules are preserved in three different crystal forms of BPTI. Also consistent is the observation that the two surface water molecules detected by NMR at positions seen in the X-ray analysis are both attached to the directional and relatively stable main-chain amides (Ala-25, Ile-19).

Prof. Dodson then asked: Are there more than two surface water molecules in the BPTI crystal structures directly bound to main-chain amide or carbonyl functions?

1 E. N. Baker, T. L. Blundell, J. F. Cutfield, S. M. Cutfield, E. J. Dodson, G. G. Dodson, D. M. Crowfoot Hodgkin, R. E. Hubbard, N. W. Isaacs, C. D. Reynolds, K. Sakabe, N. Sakabe and N. M. Vijayan, Philos. Trans. R. Soc. London A, 1988, 319, 369.

Prof. Wüthrich replied: These are certainly interesting comments. In answer to your question: Yes, there are numerous water molecules in the BPTI crystal structures that are directly bound to main-chain carbonyl functions.

Miss K. Marshall (University of Cambridge) opened the discussion of Prof.
Vallee and Dr. Auld's paper: Could you tell us about any examples where your predictions have enabled the elucidation of a zinc binding site in an enzyme whose crystal structure has not yet been determined?

Prof. B. L. Vallee (Harvard Medical School, Boston, USA) and Dr. D. S. Auld (Harvard University, USA) replied: Our findings have proven to be of predictive value for catalytic zinc binding sites of enzymes whose X-ray or NMR structures have not as yet been determined. Monozinc aminopeptidases and leukotriene A4 hydrolase are examples. A domain of ca. 300 amino acids in the monozinc human intestinal, rat kidney and Escherichia coli aminopeptidases contains a linear arrangement of two His and one Glu, identical with the zinc binding site of thermolysin. The short spacer of thermolysin and that of the aminopeptidases consists of three amino acids while the long spacer contains 19 and 18 amino acids,¹ respectively. These binding sites predicted the presence of zinc and a hitherto unrecognized activity of yet another enzyme: leukotriene A4 hydrolase. It contains an amino acid segment that is homologous to the zinc binding domain of intestinal aminopeptidase.¹ On this basis, LTA4 hydrolase was shown to contain 1 g atom zinc per mol protein by atomic absorption spectroscopy, to exhibit aminopeptidase activity and to be inhibited by bestatin and captopril, which are inhibitors of metalloproteases. Moreover, mutagenic replacements of the zinc ligands of LTA4 hydrolase completely abolish both its ability to bind zinc and its activity.² This is the first demonstration by mutagenesis and metal analysis of the identity of a hitherto unknown zinc enzyme-binding site. These studies demonstrate the predictive capacity of the 'short' and 'long spacer' system of zinc enzymes and their active site identities as guides for the recognition of zinc binding sites in enzymes not known heretofore to contain zinc.

1 B. L. Vallee and D. S. Auld, Biochemistry, 1990, 29, 5647.
2 J. F.
Medina, A. Wetterholm, O. Rådmark, R. Shapiro, J. Z. Haeggström, B. L. Vallee and B. Samuelsson, Proc. Natl. Acad. Sci. USA, 1991, 88, 7620.

Prof. M. Rossi (University of Naples, Italy) said to Prof. Vallee and Prof. Blow: Xylose isomerases and alcohol dehydrogenases so far studied are enzymes with a quaternary structure, and Dr. Blow told us that xylose isomerase is not a Michaelis enzyme. Has any cooperativity between ligands binding to their sites been found in either system, or is the quaternary structure needed only to bury the hydrophobic residues of the protein inside?

Prof. Blow replied: No aspect of the kinetic behaviour of xylose isomerase suggests any cooperativity. There is a structural requirement that the molecule be at least in a dimeric form. We would not expect any enzymatic activity in monomeric xylose isomerase, because the hydrophobic environment for the part of the substrate where the hydride shift occurs would not be sufficient to exclude water, which is a definite requirement. Part of the hydrophobic environment is provided by the side chain of Phe-25 from an adjacent subunit. Some xylose isomerases are active in a dimeric form. Rangarajan et al.¹ have shown that in 8 mol dm⁻³ urea Arthrobacter xylose isomerase dissociates into dimers which are fully active.

1 M. Rangarajan, B. Asboth and B. S. Hartley, Biochem. J., 1992, 285, 889.

Prof. Vallee and Dr. Auld said: Allosterism is only apparent in γ isozymes of class I ADHs.¹ Professor Rossi additionally raises an interesting subject. At this point it is anyone's guess in what manner evolution might express itself in zinc-specific enzyme-binding sites and their environment. Since the systematic character of zinc binding sites has only been recognized recently, it is no surprise that this question could not have been addressed as yet. However, Prof.
Jörnvall at the Karolinska Institute has spent much time and effort on the evaluation of phylogenetic relationships of the alcohol dehydrogenase families and their superfamilies. Sorbitol dehydrogenase is a zinc enzyme² which is distantly related to alcohol dehydrogenase and a member of an extended protein family shown to also contain ζ-crystallin and threonine dehydrogenase.³,⁴ Based on fitting the amino acid sequence of sheep sorbitol dehydrogenase to the three-dimensional structure of horse-liver alcohol dehydrogenase, the coordination of the catalytic zinc site has been proposed to be His, Cys, Glu and H2O, in which one of the active-site Cys ligands of ADH has been replaced by a Glu.⁵ In addition, in sorbitol dehydrogenase the structural zinc site of ADH is non-existent, as expected, since the zinc content is 1 g atom of Zn per subunit.² Furthermore, the short-chain alcohol dehydrogenase of Drosophila melanogaster has neither an active nor a structural zinc site even though the specificity is preserved.⁶ The evolutionary relationships are also clearly apparent in the crystallins of the lens, where neither the catalytic site nor the ADH activity is expressed.⁷

1 G. Mårdh, K. H. Falchuk, D. S. Auld and B. L. Vallee, Proc. Natl. Acad. Sci. USA, 1986, 83, 2336.
2 W. Maret and D. S. Auld, Biochemistry, 1988, 27, 1622.
3 C. Karlsson, W. Maret, D. S. Auld, J.-O. Höög and H. Jörnvall, Eur. J. Biochem., 1989, 186, 543.
4 H. Jörnvall, M. von Bahr-Lindström and J. Jeffery, Eur. J. Biochem., 1984, 140, 17.
5 H. Eklund, E. Horjales, H. Jörnvall, C.-I. Brändén and J. Jeffery, Biochemistry, 1985, 24, 8005.
6 B. Persson, M. Krook and H. Jörnvall, Eur. J. Biochem., 1991, 200, 537.
7 T. Borrás, B. Persson and H. Jörnvall, Biochemistry, 1989, 28, 6133.

Dr. E. Pombo-Villar (Sandoz Pharma Ltd., Basel, Switzerland) asked Prof. Blow: For the isomerisation there are several other mechanisms possible. Would you like to summarise the evidence for the hydride-shift mechanism as shown in Fig.
5, as opposed to an acid (Lewis) or base-catalysed keto-enol isomerisation?

Prof. Blow replied: The original reason for proposing the hydride-shift mechanism was that the crystal structure revealed a totally non-polar surface on the side of the binding site facing the substrate's 1- and 2-hydrogens. This eliminates the alternatives of acid or base catalysis of the isomerisation. Subsequently Lee et al.¹ showed that Vmax for the Clostridium thermosulfurogenes enzyme is reduced by a factor of four when the glucose substrate is deuteriated at C(2). This observation has been confirmed for the Arthrobacter enzyme.² This appears to prove that the hydride-shift mechanism is involved.

1 C. Lee, M. Bagdasarian, M. Meng and J. G. Zeikus, J. Biol. Chem., 1990, 265, 19082.
2 O. S. Smart, J. Akins and D. M. Blow, Proteins, 1992, 13, 100.

Prof. Karplus asked: In your picture of non-Michaelis kinetics you showed an intermediate with a Gibbs energy lower than that of the reactant or product. Since such an intermediate approaches an inhibitor, it clearly would not accelerate the reaction. I wonder, therefore, whether under physiological conditions (i.e. with appropriate concentrations) the Gibbs-energy diagram would actually correspond to your drawing?

Prof. Blow replied: As Professor Karplus points out, the stable intermediate has been observed under highly non-physiological, equilibrium conditions in an enzyme crystal. I have no evidence that it is stable under other conditions, but it might be. A stabilised state slows down the rate of approach to equilibrium by increasing the occupation of this state, and so decreasing the small fraction traversing the transition state. This applies at equilibrium whether the state stabilised is the enzyme-substrate complex (the Michaelis complex), enzyme-product complex, or an intermediate state. An enzyme binds the transition state of the substrate tightly enough to order it correctly for catalysis.
In many enzymes, similar energetically favourable interactions also exist in the enzyme-substrate and enzyme-product complexes, leading to relatively tight binding as measured by the dissociation constants K_S and K_P of the substrate and product, stabilising their bound states. If they were not bound at all, the dissociation constants would be infinite. If the substrate or product is bound too tightly, it becomes an inhibitor by sequestering the enzyme in the bound form. Perhaps what is unusual about xylose isomerase is that the open-chain conformation of the substrate at the presumed rate-limiting step (isomerisation) is very different from the closed-ring conformation of substrate in solution, and when initially bound, so that a very different stabilisation energy applies to the two states. K_S and K_P can be very different from K_transition.

Dr. M. L. Sinnott (University of Illinois at Chicago, USA) said: Ideally, Gibbs-energy profiles for enzyme-catalysed reactions should be constructed using the physiological concentration of substrate and product as thermodynamic standard states. Xylose isomerases apparently are confined to Actinomycetes, which are biomass-degraders. The xylose presumably arises from the hydrolysis of the hemicellulose component of the biomass. It may therefore be the case that under physiological conditions xylose isomerase works in a thick soup of xylose but only trace amounts of xylulose, which is rapidly mopped up by primary metabolism. If this is so, then the enzyme is, under physiological conditions, catalysing an essentially irreversible reaction. Under such circumstances Burbaum et al.¹ predict that the optimal Gibbs-energy profile is a 'descending staircase' of bound intermediates. The lower Gibbs energy of the enzyme-straight-chain intermediate may thus merely be a reflection of the irreversibility of the reaction under physiological conditions.

1 J. J. Burbaum, R. T. Raines, W. J. Albery and J. R. Knowles, Biochemistry, 1989, 28, 9283.
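
The dependence on standard state that Dr. Sinnott describes can be written out with the usual textbook relation between the Gibbs energy change and the prevailing concentrations. This is only an illustrative sketch, assuming ideal dilute-solution behaviour; the equation and its symbols are not taken from the discussion itself.

```latex
\Delta G \;=\; \Delta G^{\circ} \;+\; RT \,\ln\frac{[\text{xylulose}]}{[\text{xylose}]}
```

When xylulose is present only in trace amounts relative to xylose, the logarithmic term is large and negative, so the overall Gibbs energy change is strongly negative whatever the value of the standard term, which is the sense in which the reaction is essentially irreversible under such conditions.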
Prof. Blow replied: You have emphasised that physiological situations (as well as enzyme assays) are normally far from equilibrium, and the reaction runs in one direction. The rate-limiting step is now the highest barrier at any step of the descending staircase. Most of the enzyme with substrate bound is held at the state before this step. If this minimum before the transition state is strongly stabilised, the barrier becomes unnecessarily high and the rate is reduced. No minimum past the transition state reduces the rate (unless it becomes so deep that the following barrier takes over as the rate-limiting step). For xylose isomerase, Whitlow et al.¹ give evidence that the open-chain form observed crystallographically is open-chain fructose. Applying that result to the natural substrate xylose, one might infer that open-chain xylulose is stabilised. This is after the presumed rate-limiting step of the forward natural reaction (open-chain xylose → xylulose) and its stabilisation makes no difference to the overall rate. I should mention that my colleague Charles Collyer insisted long ago that our diagram of a non-Michaelis enzyme should have its minimum in the correct place (see Fig. 3 of ref. 2).

1 M. Whitlow, A. J. Howard, B. C. Finzel, T. L. Poulos, E. Winborne and G. L. Gilliland, Proteins, 1991, 9, 153.
2 C. A. Collyer and D. M. Blow, Proc. Natl. Acad. Sci. USA, 1990, 87, 1362.

Dr. H. B. F. Dixon (University of Cambridge) said: I was interested in Dr. Pombo-Villar's question to Prof. Blow, as most of the features Prof. Blow marked in Fig. 5 to promote hydride movement would also promote hydron dissociation and the shift of electrons within the molecule. Prof. Blow's answer was convincing, that there is no base on the other side of the molecule to transport the hydron separately from the electrons left behind.
But the similarities between the two mechanisms make me wonder whether one type of aldose-ketose isomerase might have arisen in evolution from the other, by a change from a basic group to a hydrophobic environment or the reverse. Is there any evidence, such as structural similarity, that this might have happened?

Prof. Blow replied: Similarities between the different enzymes which have an eight-stranded α/β barrel motif have been considered by many authors. The active site is always at the carboxy-terminal end of the β-sheets forming the inside of the barrel. It is usually not obvious which of the eight strands of one structure should be aligned with the first strand of the other, for such a comparison. The barrels are rarely cylindrical, but their shape does not vary consistently. The idea that these two isomerases might have a related mechanism occurred to me very forcibly when I discovered that Dr. Petsko's group were one of our competitors in the analysis of xylose isomerase, since Petsko made a large contribution to determining the triose-phosphate isomerase structure. However, I have found no significant similarities in the organisation of functional groups between the two enzymes.

Prof. Eisenberg asked Prof. Vallee: In the wide range of Zn-metalloproteins which you have described, to what extent are the Zn-binding sites pre-organized, and to what extent does the binding energy of Zn-association organize the structures of the metalloproteins?

Prof. Vallee and Dr. Auld replied: We have discussed the implications of the short and long spacers to protein folding and the use of the metal as a probe for the process,¹ but we are not aware of any experimental investigations centring on their role in protein folding. While the structure of the zinc-binding site is pre-determined by the amino acid sequence, the order and progression of events that ultimately result in zinc binding is not known.
We do know from studies of leukotriene A4 hydrolase that mutagenesis of any one of the three zinc-binding residues (His, His or Glu) virtually abolishes zinc binding.² Further, removal of zinc from the catalytic zinc site of E. coli alkaline phosphatase results in a disordered α-helix of the short spacer region.³ This may imply that zinc aids in the organization of this secondary structure. However, in general the capacity of zinc to adapt to and stabilize the secondary structure of proteins and the role of secondary structure in metal function are as yet unknown.

1 B. L. Vallee and D. S. Auld, Proc. Natl. Acad. Sci. USA, 1990, 87, 220.
2 J. F. Medina, A. Wetterholm, O. Rådmark, R. Shapiro, J. Z. Haeggström, B. L. Vallee and B. Samuelsson, Proc. Natl. Acad. Sci. USA, 1991, 88, 7620.
3 E. E. Kim and H. W. Wyckoff, J. Mol. Biol., 1991, 218, 449.

Dr. Auld communicated: Astacin may be the head of a new family of zinc proteases. We have shown that astacin contains 1 g atom of Zn and has within its amino acid sequence a segment, HELMHAIGFYH, that is similar to part of the Zn-binding site of the bacterial metalloproteinase thermolysin, HELTHAVDYT.¹ X-ray diffraction studies have shown that the first two His (H) of thermolysin are ligands to the Zn and that the Glu (E) provides the general base for catalysis.² The third His is not seen in thermolysin. BMP-1³ was the first protein found with homology to astacin.⁴ Domain A of BMP-1, comprising 200 amino acids, is 36% identical to astacin. One indication for a possible proteolytic activity is the sequence HELGHVVGFWH in BMP-1, which is strikingly similar to the corresponding sequence found in astacin. The fact that astacin and domain A of BMP-1 are homologous and have a catalytic zinc binding site that is different from that obtained for the thermolysin family³ suggests that astacin may represent a unique family of zinc endoproteases.
This has most recently gained strength by the finding that the metalloendopeptidases from mouse-kidney brush border, meprin A, and human intestinal brush border, PPH, both have 200 amino acid domains that are homologous to astacin and BMP-1.⁶ Sequence comparisons show ca. 30-35% identity between either meprin A or PPH and either BMP-1 or astacin. This study has, in fact, suggested that these proteins constitute a new family of metalloendopeptidases called the 'astacin' family. Both meprin A and PPH are inhibited by 1,10-phenanthroline but not by phosphoramidon, an inhibitor of thermolysin, suggesting they are probably metalloenzymes, but different in nature from thermolysin. Another probable member of this class is a developmental protein of Drosophila called tolloid, tld, which is 41% identical to BMP-1.⁷ The two proteins each contain three distinct sequence elements: the astacin domain, an EGF-like domain, and a repeating domain with an average of 113 amino acids with 4 Cys having conserved spacing. The last domain is also found in human complement subcomponents C1r and C1s, and in the Xenopus laevis embryonal protein, UVS.2, involved in dorsal anterior development.⁸ The latter protein has a sequence which is homologous to the C-terminal side of the zinc signature in astacin⁹ and thus may be another member of the astacin family.⁶ Two other members of the astacin family are the proteins encoded by the genes responsible for spatial and temporal expression patterns during sea-urchin embryogenesis, sea-urchin embryo and sea-urchin blastula protease 10 (BP10).¹⁰,¹¹ All seven members of the astacin family have a catalytic zinc binding site signature HExxHxxGxxH (Table 1, top). A search of the sequences contained in protein and translated gene bank database files for this peptide sequence retrieves 26 additional proteins, all of which are proteases. The latter can be subdivided into three groups based on overall homology (Table 1).
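
A search of the kind described, for the signature HExxHxxGxxH with x standing for any residue, can be sketched as a regular-expression scan. The snippet below is a minimal illustration only, not the procedure used for the database search reported here; the function name `find_signature` is hypothetical, and the inputs are the short segments quoted in the discussion rather than full-length sequences.

```python
import re

# Signature quoted in the discussion: HExxHxxGxxH, where x is any residue.
# In a regular expression each "x" becomes "." (any single character).
SIGNATURE = re.compile(r"HE..H..G..H")


def find_signature(sequence):
    """Return (start_index, matched_segment) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in SIGNATURE.finditer(sequence)]


# Short segments quoted in the text (illustrative inputs, not whole proteins).
segments = {
    "astacin": "HELMHAIGFYH",
    "BMP-1": "HELGHVVGFWH",
    "thermolysin": "HELTHAVDYT",
}

for name, seq in segments.items():
    hits = find_signature(seq)
    print(name, hits if hits else "no match")
```

The astacin and BMP-1 segments match the pattern, whereas the thermolysin segment does not, because it lacks the Gly and the third His at the required spacing.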
The vast majority of these proteases has been shown to degrade the extracellular matrix, ECM. These proteases also share with astacin a preference for substrates containing Pro residues one or two amino acids removed from the cleavage site.¹² The haemorrhagic toxins, Ht-a, -b, -c, -d and -e, isolated from the venom of the western diamondback rattlesnake, Crotalus atrox, have molecular weights of 68, 24, 24, 24 and 25.7 kDa, respectively; each protease contains 1 mol of zinc per mol of protein.¹³ Several venom proteins from crotaloid, Ht-d and HT-2, and viper, HR2a, H2, HR1B, and Tg, are homologous and contain the same catalytic zinc signature found in the astacin family (Table 1). The high-molecular-weight protease HR1B, 66 kDa, contains disintegrin-like and von Willebrand-like sequences in addition to the N-terminal protease domain.¹⁴ The role of these peptide regions in attaching the protein to matrix components for localized action of the protease has yet to be determined. The extracellular protease from Serratia¹⁵ and the homologous proteases from the gram-negative bacterium Erwinia chrysanthemi, proteases B and C,¹⁶ are all believed to be zinc enzymes due to their inhibition by chelating agents. At the bottom of Table 1 are listed all 14 of the matrix metalloproteinases, MMPs, and the sea-urchin hatching enzyme HE6.¹⁷ The MMPs range in size from the 92 kDa collagenase¹⁸ to the 28 kDa pump-1.¹⁹ All, including HE6, have a domain of 270 amino acids which is homologous to pump-1, the prototype for the MMP family. Pump-1 has the essential features for both the latent form of the enzyme, a propeptide that contains a single Cys believed to be involved in zymogen activation,²⁰ and the catalytic domain containing the catalytic zinc-binding site. The larger forms contain additional domains which may be important for their binding to the matrix components.
This group of enzymes is inhibited by chelating agents but not by inhibitors of serine, cysteine, or aspartic proteinases. They act at neutral pH, are activated by Ca2+ and have the ability to degrade the ECM. They are believed to be zinc enzymes since Zn can reverse the effect of chelator inhibition in assays of their activity. While all these proteins share the HExxHxxGxxH sequence, none of them has a glutamate that is 20 amino acids removed from the second histidyl residue as is observed in thermolysin. Determination of the function of the observed Gly and His residues in the formation and stabilization of the proposed zinc binding site awaits the structural determination of the homologous families. However minimally, they may signal a zinc site which is characteristic of the metalloproteases but different from thermolysin.

Table 1 Proteins containing the HExxHxxGxxH sequence

protein                              92  96               116
Astacus protease                  TIIHELMHAIG   FYH   YVTINYQNV
human BMP 1*                      IVVHELGHVVG   FWH   HVSIVRENI
meprin A*                         TIEHEILHALG   FFH   YVNIWWDQI
human PPH*                        IIEHEILHALG   FYH   YVNIWWDQI
sea-urchin emb.*                  TIVHEIGHAIG   FHH   YINVHFENV
sea-urchin BP10*                  TIVHEIGHAIG   FHH   YINVLYQNI
Drosophila tld*                   IIIHELGHTIG   FHH   HIVINKGNI

Ht-d proteinase                   TMAHELGHNLG   MEH   SLCIMRPGL
HR2a proteinase                   TMTHEIGHNLG   MEH   ACIMSAVIS
HT-2*                             TMAHELGHNLG   MEH   SLCIMRPGL
Tg proteinase*                    TMTHEMGHNLG   MHH   CIMSKVLSR
H2*                               TMTHELGHNLG   MEH   ACIMSDVIS
HR1B*                             IMTHEMGHNLG   IPH   FPCIMSPMI

Serratia sp. protease*            TFTHEIGHALG   LSH   NPTYRDVTY
Serratia ma. protease*            TFTHEIGHALG   LSH   NPTYNDVTY
E. chrysanthemi protease B*       SFTHEIGHALG   LSH   DISYKNSAA
E. chrysanthemi B374 protease C*  TFTHEIGHALG   LAH   DPSYNDAVY
E. chrysanthemi EC16 protease C*  TLTHEIGHALG   LNH   NPSYSDVTY

human pump 1*                     AATHELGHSLG   MGH   PTYGNGDPQ
human fibro. collag.*             VAAHELGHSLG   LSH   PSYTFSGDV
human neut. collag.*              VAAHEFGHSLG   LAH   PNYAFRETS
bovine collag.*                   VAAHEFGHSLG   LAH   PSYTFSGDV
pig collag.*                      VAAHELGHSLG   LSH   PNYIYTGDV
rabbit collag.*                   VAAHELGHSLG   LSH   PNYMFSGDV
rat collag.*                      VAAHELGHSLG   LDH   PIYTYGKSH
rat transin*                      VAAHELGHSLG   LFH   PVYKSSTDL
rat transin 2*                    VAAHELGHSLG   LFH   PVYRFSTSQ
human stromelysin*                VAAHEIGHSLG   LFH   PLYHSLTDL
human stromelysin 2*              VAAHELGHSLG   LFH   PLYNSFTEL
human stromelysin 3*              VAAHEFGHVLG   LQH   AFYTFRYPL
sea-urchin HE6*                   VAAHEFGHSLG   LYH   PYYQGYVPN
human collag. IV (72 kDa)*        VAAHEFGHAMG   LEH   PIYTYTKNF
human collag. IV (92 kDa)*        VAAHEFGHALG   LDH   PMYRFTEGP

* Not analysed for Zn.

1 W. Stöcker, R. L. Wolz, R. Zwilling, D. A. Strydom and D. S. Auld, Biochemistry, 1988, 27, 5026.
2 B. W. Matthews, J. N. Jansonius, P. M. Colman, B. P. Schoenborn and D. Dupourque, Nature New Biology, 1972, 238, 37.
3 J. M. Wozney, V. Rosen, A. J. Celeste, L. M. Mitsock, M. J. Whitters, R. W. Kriz, R. M. Hewick and E. A. Wang, Science, 1988, 242, 1528.
4 K. Titani, H.-J. Torff, S. Hormel, S. Kumar, K. A. Walsh, J. Rödl, H. Neurath and R. Zwilling, Biochemistry, 1987, 26, 222.
5 B. L. Vallee and D. S. Auld, Biochemistry, 1990, 29, 5647.
6 E. Dumermuth, E. E. Sterchi, W. Jiang, R. L. Wolz, J. S. Bond, A. V. Flannery and R. J. Beynon, J. Biol. Chem., 1991, 266, 21381.
7 M. J. Shimell, E. L. Ferguson, S. R. Childs and M. B. O'Connor, Cell, 1991, 67, 469.
8 P. Bork, FEBS Lett., 1991, 282, 9.
9 S. M. Sato and T. D. Sargent, Dev. Biol., 1990, 137, 135.
10 T. Lepage, C. Ghiglione and C. Gache, Development, 1992, 114, 147.
11 S. D. Reynolds, L. M. Angerer, J. Palis, A. Nasir and R. C. Angerer, Development, 1992, 114, 796.
12 W. Stöcker, M. Ng and D. S. Auld, Biochemistry, 1990, 29, 10418.
13 J. B. Bjarnason and T. Tu, Biochemistry, 1978, 17, 3395.
14 M. Takeya, K. Oda, T. Miyata and T. Omori-Satoh, J. Biol.
Chem., 1990, 265, 16068.
15 K. Nakahama, K. Yoshimura, R. Marumoto, M. Kikuchi, I. S. Lee, T. Hase and H. Matsubara, Nucleic Acids Res., 1986, 14, 5843.
16 P. Delepelaire and C. Wandersman, J. Biol. Chem., 1990, 265, 17118.
17 T. Lepage and C. Gache, EMBO J., 1990, 9, 3003.
18 S. M. Wilhelm, I. E. Collier, B. L. Marmer, A. Z. Eisen, G. A. Grant and G. I. Goldberg, J. Biol. Chem., 1989, 264, 17213.
19 B. Quantin, G. Murphy and R. Breathnach, Biochemistry, 1989, 28, 5327.
20 H. Emonard and J.-A. Grimaud, Cell. Mol. Biol., 1990, 36, 131.

Dr. A. Berry (University of Cambridge) addressed Prof. Vallee: You have shown that the sequence motif L1-short spacer-L2-long spacer-L3 that you have described can be used as a tool to predict the zinc-binding site in zinc metalloproteins whose structures are not known (e.g. leukotriene A4 hydrolase). We have initiated such a study of the zinc-binding site of the class II fructose bisphosphate aldolases and have identified the ligands by mutagenesis. We also looked at the structures of four enzymes known to contain a catalytically active zinc atom, to see if any secondary structural elements were conserved in zinc-binding sites, for example whether the L1, L2, L3 pattern was always associated with helix, sheet or turns, and could find no correlation. Since you have obviously looked at many more examples than we were able to, would you comment on whether your zinc-binding sequence motif describes any zinc-binding structural motif in catalytic zinc-binding sites?

Prof. Vallee and Dr. Auld replied: We have inspected the secondary structures surrounding the catalytic zinc sites of several enzymes for the possibility that they might reveal general principles that would predict such sites.
Our search has not, as yet, progressed to a point that would unequivocally predict such features obligatory for active zinc sites in systems whose three-dimensional structure has not been identified yet. A summary of the general state of our present knowledge is brief. We have already reported the secondary structures surrounding the zinc ligands in certain families of zinc enzymes¹,² where an α-helical or β-structural region of the protein supplies the metal ligands of the catalytic zinc sites (Table 2). Here the length of the short spacer conditions the support structure. Thus, in carbonic anhydrase the short spacer consists of but one amino acid. As a consequence the ligands are provided by the same side of the β-sheet. In contrast, when the spacer consists of three amino acids, as e.g. in thermolysin, an α-helical support structure becomes feasible and juxtaposes the ligands in an orientation suitable to establish a tetrahedral-like coordination sphere.

1 B. L. Vallee and D. S. Auld, Biochemistry, 1990, 29, 5647.
2 B. L. Vallee and D. S. Auld, this Discussion.

Table 2 Ligands to catalytic zinc sites: secondary structural environment

Enzyme      L1     L2     L3
CA I, II    β      β      β
CPD-A, B    turn   α      β
TL          α      α      α
AE          α      α      α
PLC         α      α      turn
NP1         α      α      β
AP          α      α      β
ADH         α      β      α

L1-L3 refer to zinc ligands 1-3. α and β refer to α-helical and β-structural elements while turn refers to a region between some form of either α or β structural element.

Prof. Rossi commented: We are working on an alcohol dehydrogenase purified from an extremophile, an archaebacterium growing at 87 °C. This enzyme is composed of two identical subunits of 36 000 Da and contains two zinc atoms and five cysteine residues per subunit. It is thermostable and by dialysis against EDTA loses its thermostability but not its activity. In this case one atom of zinc is lost per subunit. The sequences of either the gene or the protein have shown ca.
33% homology with horse-liver alcohol dehydrogenase (HLADH) and conserved sequences at the structural and catalytic sites. However, molecular modelling studies have shown that zinc is coordinated by three cysteines and a glutamic acid residue instead of four cysteines as in HLADH and other proteins with structural zinc. It is likely that this is one of the reasons for the high stability of the enzyme and I would like to have a comment on this from Dr. Vallee.

1 S. Ammendola et al., Biochemistry, submitted.

Prof. Vallee and Dr. Auld replied: Prof. Rossi's observations on the structural zinc ligands of the archaebacterial, extremophile ADH are unique thus far, to the best of our knowledge. The combination of three Cys and one Glu in such a site has not been reported previously, although a combination of three Cys and one His has been observed in the G32 protein¹ which is not, of course, an enzyme. The replacement of a Cys by a Glu should not materially alter the stability of zinc binding and could be reflected in greater thermostability of the protein. Recent studies of a zinc-containing adenylate kinase have shown that the denaturation temperature of the apoenzyme is ca. 10° lower than that of the zinc enzyme.² A Cys Cys Cys Glu site may be but the first of novel combinations of structural ligands and suggests the existence of other additional combinations. Thus, one might yet encounter the replacement of the Glu by an Asp in other, similar, instances.

1 D. P. Giedroc, K. M. Keating, K. R. Williams, W. H. Konigsberg and J. E. Coleman, Proc. Natl. Acad. Sci. USA, 1986, 83, 8452.
2 P. Glaser, E. Presecan, M. Delepierre, W. K. Surewicz, H. H. Mantsch, O. Bârzu and A.-M. Gilles, Biochemistry, 1992, 31, 3038.

Prof. Karplus addressed two questions to Prof. Vallee: Is there any difference between Asp and Glu as a Zn ligand and are there any cases where two Asp occur (in analogy with Ca²⁺ binding sites which may have three charged ligands)?
You mention in your paper the role of the flexibility of Zn binding in catalysis. I wonder if you could amplify on that?

Prof. Vallee and Dr. Auld replied: Glutamates have been encountered most frequently in catalytic sites of mono-zinc enzymes. Their presence in such sites may possibly reflect greater flexibility imparted to the zinc coordination sphere due to the extra methylene group in Glu vs. Asp. However, in di- or tri-zinc coactive sites¹ aspartate residues are prevalent. Asp frequently is the bridging ligand between two juxtaposed metal sites. Asp may be preferred here over Glu owing to the need to have the metals in close proximity, calling for less flexibility in the bridging ligand. Asp might also provide a better direct conduit of protein conformational changes to the zinc site, if any. Two Asp ligands are found in the coactive (second zinc) binding sites of alkaline phosphatase,¹ phospholipase C,¹ nuclease P1² and one of the zinc-binding sites of lens aminopeptidase.¹

Your second question is similarly interesting. While the theoretical implications have received some attention, experimental scrutiny of the problem has been lacking. The discernment of short and long spacers in enzyme sites now allows synthetic and mutagenic replacements to be made which should reveal interactions of the intervening residues, other than acid-base participation, that may be critical to catalysis. The entire panoply of structural methods now available coupled with functional consequences of the alterations should serve to evaluate the structure-function relationships. In collaboration with Hans Jörnvall we have begun such work on a somewhat simpler system, i.e. the peptide surrounding the horse ADH-structural binding site.³ Thus far we can only report that folding of the synthetic replicate as well as zinc and cobalt binding are virtually identical to those in its natural state.
Studies of the effects of amino acid substitution are in progress. The role of zinc ligation in placing amino acid residues in contact with recognition groups likely also occurs in DNA-binding proteins. Mutagenic studies of the nuclear receptor families have shown that discrimination among oestrogen, glucocorticoid and thyroid hormone receptors largely depends upon a series of residues in the zinc-α-helix recognition sequence, i.e. EGA, GSV and EGG, respectively.⁴ The number and order of glycines in such sequences may profoundly affect the resulting orientation of amino acid residues within the α-helix that are in contact with the DNA recognition site. Clearly, this is but the beginning of what promises to be an exciting chapter in the exploration of the mechanisms underlying molecular recognition in zinc enzymes and DNA-binding proteins.

1 See B. L. Vallee and D. S. Auld, references in these Discussions.
2 A. Volbeda, A. Lahm, F. Sakiyama and D. Suck, EMBO J., 1991, 10, 1607.
3 T. Bergman, H. Jörnvall, B. Holmquist and B. L. Vallee, Eur. J. Biochem., 1992, 205, 467.
4 M. Beato, Cell, 1989, 56, 335.

Dr. Rippmann asked: Recently it has been reported that HIV protease can be inhibited by zinc. By analogy to renin, where there is a heavy-metal binding site near the reactive aspartates, it can be concluded that zinc binds in a similar position in HIV protease. Could you comment on this putative binding site?

Prof. Vallee and Dr. Auld replied: Zinc has been reported to inhibit both the HIV protease and renin, which have a pair of Asp residues in their catalytic sites.¹ It is not known at present whether the carboxylates of these residues are the ligands to the inhibitory zinc. However, in the last few years different patterns of zinc inhibition have emerged.
An inhibitory zinc-binding site in the serine protease, tonin, has been observed where the zinc is bound to the catalytic His-57 as well as His-97 and His-99.² In this case the coordination sphere is completed by Glu-148 from a neighbouring protein molecule in the crystals. Such protein-zinc-protein interactions have also been proposed as a basis for the strong binding of human growth hormone to human prolactin receptors.³ A different mechanism of zinc inhibition has been proposed for the metalloproteases exemplified by carboxypeptidase A.⁴,⁵ In this case an inhibitory Zn(OH)Cl binds to both the carboxylate of Glu-270 and the catalytic zinc, thus forming a bridged hydroxide complex with a Ki of 5 × [exponent illegible] mol dm⁻³.

1 Z.-Y. Zhang, I. M. Reardon, J. O. Hui, K. L. O'Connell, R. A. Poorman, A. G. Tomasselli and R. L. Heinrikson, Biochemistry, 1991, 30, 717.
2 M. Fujinaga and M. N. G. James, J. Mol. Biol., 1987, 195, 373.
3 B. C. Cunningham, S. Bass, G. Fuh and J. A. Wells, Science, 1990, 250, 1709.
4 K. S. Larsen and D. S. Auld, Biochemistry, 1989, 28, 9620.
5 K. S. Larsen and D. S. Auld, Biochemistry, 1991, 30, 2613.

Dr. D. R. Brown asked: It has been mentioned that Zn²⁺ is non-toxic. Zn²⁺ therefore presumably will not replace other metals in enzymes. It is the smallest divalent ion of the first-row transition elements. Is this the physical parameter of importance here? If so, why is Cd²⁺ toxic, when it is of similar size?

Prof. Vallee and Dr. Auld said: We regret that the substance of the questions presented would have to be rephrased substantially to allow for a meaningful answer within the limits of current knowledge. It is prudent to remember that 'toxicity' is a biological not a chemical concept, used loosely but often not understood well chemically. The chemical basis of metal toxicology remains poorly defined, leaving the subject quite phenomenological.
It is tempting to relate toxicological manifestations to fundamental physical chemical metal properties, but there is little evidence for meaningful correlations.¹ Given the state of the art it would seem most valuable to identify and evade false premises and their experimental pursuit. It seems appropriate to point out that the biological effects of metals are commonly presented as a continuum, from physiology to pharmacology to toxicology and pathology. It stands to reason that the underlying chemistry is generally assumed to follow a similar progression. However, in point of fact this is not the case. The structures of toxicological metal-protein interactions and the resultant pathology differ from these and are either unknown or ill defined, by and large.

Zinc is the only one of the pre-, post- and transition elements recognized to be non-toxic while being indispensable to all forms of life and critical to transmission of the genetic message. The homeostatic mechanisms regulating zinc absorption and retention operate with such efficiency that zinc overload is extremely unlikely and zinc can therefore be considered essentially non-toxic.² The induction of the low-molecular-weight peptide thionein by heavy metals can readily result in the scavenging of up to seven mol of zinc per mole of thionein, which thus likely becomes one of the critical components in the regulation of the free zinc content of cells. Reports of zinc toxicity, e.g. gastrointestinal distress and diarrhoea, are largely anecdotal and have been experienced e.g. by military personnel who drank limeade prepared in galvanized containers.³ Oral ingestion of even large, excess amounts of zinc in the form of its salts has not been reported to cause such symptoms.
The occurrence of copper deficiency, accompanied by anaemia, in some cases of prolonged supplemental zinc therapy has been referred to erroneously as zinc toxicity⁴ when in fact it constitutes an example of a 'conditioned deficiency'.⁵ This term describes the induction of deficiency of one nutrient (copper) by the ingestion of another which acts as the conditioning factor (zinc). Thus, in this case zinc is the conditioning factor while copper is made deficient by it. Discontinuation of zinc administration restores normal copper metabolism. As another example pertinent to the metabolism of zinc, calcium is a conditioning factor for zinc both in many plants and in swine; in these instances calcium makes zinc deficient.⁶ The mechanisms are not known in these and many other, similar cases.

1 R. B. Martin, in Handbook on Toxicity of Inorganic Compounds, ed. H. G. Seiler and H. Sigel, Marcel Dekker, New York, 1988, pp. 9-26.
2 R. L. Bertholf, ref. 1, pp. 787-796.
3 G. R. Callender and C. J. Gentzkow, Military Surgery, 1937, 80, 67.
4 G. J. Fosmire, Am. J. Clin. Nutr., 1990, 51, 225.
5 B. Ershoff, Physiol. Rev., 1948, 28, 107.
6 H. F. Tucker and W. D. Salmon, Proc. Soc. Exp. Biol. Med., 1953, 88, 613.

Dr. J. A. Littlechild (University of Exeter) communicated: The type II aldolase enzymes are another group of enzymes that bind zinc. In this case the metal ion is required for activity of the enzyme and not stability. The metal ion can be substituted by other metals such as cobalt, manganese and cadmium. We have crystallised the type II aldolase from the thermophilic bacterium, Bacillus stearothermophilus. The type II aldolase enzymes appear to have two conserved histidine residues in the motif HXDH. Candidates for the third ligand-binding site are found at conserved histidines and glutamic acids at several spacings consistent with other catalytic zinc-binding enzymes.
The exact location of the zinc ligand must await the structure determination and accompanying mutagenesis experiments.

Dr. A. Thomson (University of Lancaster) opened the discussion of Dr. Ito's paper: I have a question and a comment. The question: Whittaker and Whittaker in their EPR work on the apoenzyme saw a radical. Was this a tyrosyl radical or a substituted tyrosyl radical and, if the former, how do you explain their results in the context of your tyrosyl-cysteine bond? The comment: The tyrosine aromatic ring and the indole of the tryptophan appear to be a similar distance apart in the active site to similar rings in charge-transfer complexes. It is possible that the electron is abstracted from the tryptophan rather than the tyrosinate to give an indolyl radical cation which would be stabilised by the adjacent negative phenolate of the tyrosine. This radical cation could then act as the direct hydrogen abstractor in the catalytic process.

Dr. N. Ito (University of Leeds) responded: In the paper about the EPR signal in apo GOase, Whittaker et al.¹ did not comment about the nature of the tyrosine residue. More recently, however, they have carried out EPR and ENDOR studies² to show that the tyrosine residue is modified and misses one ring hydrogen. This is consistent with our observation of the Tyr-Cys bridge. As for the possibility of a Trp radical, it is plausible, although there has been no report to suggest its presence in GOase. Our model is based on a report that the electron and proton transfer are coupled but independent in the catalytic mechanism of GOase.³ If this is not the case, Trp-290 in our model might be described as a radical rather than a base, as pointed out.

1 M. M. Whittaker and J. W. Whittaker, J. Biol. Chem., 1990, 265, 9610.
2 G. T. Babcock et al., J. Am. Chem. Soc., 1992, 114, 3727.
3 G. A. Hamilton, Prog. Bioorg. Chem., 1971, 1, 83.

Dr.
Edwards commented: The suggestions that Trp-290 is the residue ionising with pKa = 7.25 and that the nitrogen anion or radical is involved in hydrogen removal from the C(6) methylene of D-galactose suffer from the objection that the pKa shift from that of indole (pKa = 17) is much too large to be credible except in the circumstance that Trp-290 exists as a radical cation. If this were the case then the ionisation would produce an indole radical which is exposed to capture by dioxygen in the solvent; the enzyme would have a very short lifetime, which is not observed. Hydrogen removal from C(6) by the oxygen of Tyr-272 is highly plausible and ionisation of weakly coordinated Tyr-495 with pKa = 7.25 would be expected to change the redox potential markedly.

Dr. Ito responded: Our crystallographic data cannot rule out the suggested model, where Tyr-272 acts as a hydrogen acceptor. One reason we prefer our model is that a tyrosine residue strongly coordinated to copper would have a pKa much lower than 7.25. It would be difficult for such a tyrosine to hold a proton, unless molecular oxygen binds to the copper immediately after the aldehyde leaves it, which seems to us unlikely. We agree that the pKa shift must involve modification of Trp-290, which could be either a Trp radical or through association of Trp-290 with the tyrosine radical. The proposal that Tyr-495 is responsible for the pKa of 7.25 associated with the enzyme activation is an alternative explanation.

Prof. Wüthrich addressed Prof. Dodson: I have two questions. First, are the internal cavities in your system filled with X-ray observable water? Secondly, are the internal cavities lined with a single layer or with multiple layers of hydration water molecules?

Prof.
Dodson replied: In reply to the first question as to whether there is still space in the internal structure of the protein for unidentified water molecules, the answer is that in the mucor lipase protein the cavities are filled entirely by observed water structure. The second question concerns whether there is more than one layer of water molecules in the internal cavity of the protein. This is partly a matter of definition. There are water molecules that contact two other water molecules but they also contact protein. Nor are these structures large enough really to be referred to as layers. The packing is, however, more than just the occasional individual molecule. They do connect and they do form clusters involving two, three or more water molecules.

Dr. Ito responded to Prof. Wüthrich: All the gaps between the finger from D3 and D2 of GOase are filled with the ordered internal water molecules. In connection with this point, it might be interesting to mention an internal acetate ion, which is found in the structure at pH 4.5. This acetate ion is completely buried in the protein molecule, as the Connolly surface calculated from the refined coordinates indicates that its binding site is inaccessible from the solvent. The nearest surface is ca. 4 Å from the carboxyl carbon of the acetate and yet the anion is replaced by three water molecules in the structure at pH 7.0, suggesting that the acetate ion is exchangeable. The protein wall separating the acetate ion and the solvent is therefore flexible enough to allow exchange.

Prof. Wüthrich commented to Prof. Dodson: It appears to me that your protein structure could perhaps be classified as an inverted micelle.

Prof.
Dodson replied: The extensive internal water structure in the top half of the mucor lipase, which we think may be buried in lipid, is indeed reminiscent of an inverted micelle. Of course this could reflect the requirements for stability for the enzyme in a lipid (non-polar) environment at the water/lipid interface, a reversal of the normal situation for globular proteins.

Prof. Eisenberg asked Dr. Ito: Is it possible to denature and then renature the complicated and beautiful structure of galactose oxidase that you have described? If so, what are your ideas about how the structure might spontaneously fold and assemble?

Dr. Ito responded: We have not done any unfolding/refolding experiments and are unaware that this has been attempted in other laboratories. We agree these would be very interesting experiments. As for the folding of the β-flower, the presence of internal water molecules between D2 and the finger from D3 may give some insight about how it folds. If the finger is formed first and D2 forms around it later using the finger as a core, then one might expect the water to be replaced to allow tighter interactions between the two domains. On the other hand, if D2 forms first, followed by penetration of the finger, some water could be trapped between them as we see in the structures.

Dr. P. F. Knowles (University of Leeds) said: The active form of galactose oxidase is generated by one-electron oxidation of an inactive form. The inactive form gives a Cu²⁺ EPR signal whereas the active form does not. There is strong evidence from Whittaker and co-workers that the absence of an EPR signal in the active species is due to antiferromagnetic exchange coupling between Cu²⁺ and a tyrosine free-radical species. There is no immediately obvious way to observe the tyrosine free radical directly by EPR; this would require selective reduction of Cu²⁺ to Cu⁺.

Prof.
Symons commented: Regarding the stabilisation of tyrosyl radicals in proteins, my feeling is that this is often akin to matrix isolation rather than any form of chemical stabilisation. This is supported by the fact that the benzene ring always has a strongly preferred orientation relative to the -CH₂- unit. In your enzyme, given that the tyrosyl anion is a ligand to copper, one cannot really envisage a tyrosyl radical as such, since there will be strong d(π)-p(π) overlap and hence strong copper participation.

Dr. Knowles replied: It is clear that something other than interaction with the Cu(II) centre is involved in stabilization of the tyrosyl radical in galactose oxidase. The radical generated by oxidation of apo-galactose oxidase is quite stable. Support for Prof. Symons's proposal on stabilization of tyrosyl radicals in proteins comes from EPR and ENDOR studies,¹,² which show that the benzene ring of tyrosine in the oxidised apoenzyme has a preferred orientation relative to the β-CH₂ group. There is no direct evidence, to date, showing that the thioether bond between Tyr-272 and Cys-228, or the stacking interaction involving Trp-290, participates in stabilization of the tyrosyl radical in either the holo or apo forms of galactose oxidase. With respect to interaction between the tyrosyl radical and the Cu(II) centre, it has been estimated from magnetic susceptibility studies that there is an antiferromagnetic exchange splitting of at least 200 cm⁻¹ in the ground state.

1 M. M. Whittaker and J. W. Whittaker, J. Biol. Chem., 1990, 265, 9610.
2 G. T. Babcock et al., J. Am. Chem. Soc., 1992, 114, 3727.

Dr. Littlechild communicated: The role of a tyrosine residue in the catalytic mechanism of another enzyme, the type I aldolases, can be explained as described.
The structure determination of the type I human-muscle A aldolase has helped to elucidate the role of the carboxy-terminal tyrosine residue known to be involved in catalysis.¹ This residue is at the end of a 'tail' which positions it in the centre of the β-barrel structure of this enzyme. A histidine residue at position 361 interacts with both the fructose bisphosphate substrate's 1-phosphate and the C-terminal carboxyl group. These interactions clearly help orient the terminal tyrosine side chain relative to the substrate. The protonated C-terminal phenolic hydroxyl is therefore positioned so as to both collect and donate a proton during the bond breaking and making steps. The B (liver) isozyme has a lower activity towards fructose-1,6-bisphosphate and utilises fructose-1-phosphate. This can be explained in terms of the altered role of the terminal residue. In the B enzyme the histidine is replaced by a tyrosine which presumably precludes the terminal tyrosine from carrying out the function as for the A isozyme. This suggests that in this case a water molecule collects and donates, albeit less efficiently, the necessary protons.

1 J. Littlechild, S. Gamblin, P. Holden and H. C. Watson, unpublished results.

Prof. S. S. Taylor (University of California, San Diego, USA) addressed Prof. Schulz: 1. Could you correlate more specifically the orientation of the arginines and the glycine loop in the open and closed structures? 2. What is the primary factor that triggers the closing of the cleft?

Prof. G. E. Schulz (University of Freiburg, Germany) replied: 1. In the open structure the arginines dangle somehow into the solution; they are not well connected. In the closed structure the arginines come down and form the active site, building up a pathway for the transferred phosphoryl group. The glycine-rich loop is special. Its sequence fingerprint -G-X-X-G-X-G-K- is observed in an increasing family of proteins that bind nucleoside triphosphates.¹
The peculiar spatial structure of this Gly loop with the following residues is identical in all known cases, i.e. in the nucleoside monophosphate kinases (e.g. adenylate kinases) and in the G-proteins (e.g. H-ras-p21). The loop forms a giant anion hole fixing the β- (and also α-) phosphate of NTP with six hydrogen bonds from its amides. The role of the first glycine of the fingerprint, which is most stringently conserved, remains obscure. A side chain at this position would interfere neither with the backbone atoms nor with the environment. However, there is an indication from a not very natural adenylate kinase structure² that the peptide bond following this glycine may turn around by 180° during catalysis in conjunction with a movement of the lysine to which it is hydrogen bonded. This lysine is strictly conserved. I suggest that it accompanies the transferred γ-phosphoryl group (see Fig. 5 in our paper) along its pathway to the attacking α-phosphate of AMP. Thus the glycine-rich loop undergoes no movement during the closing motion, but it is likely to move in the subsequent phosphoryl transfer. 2. I consider the incoming negative charges of the phosphates as the trigger. They are bound and attract electrostatically the arginines, some of which are in the moving domain INSERT.

1 G. E. Schulz, Curr. Opin. Struct. Biol., 1992, 2, 61.
2 D. Dreusicke and G. E. Schulz, J. Mol. Biol., 1988, 203, 1021.

Prof. Buckingham asked: You postulate two conformers, Eact. and Einact., of the class I AMP-kinase mutants. What are the rates of interconversion of the conformers, and what are the forces that give rise to the energy barrier separating them?

Prof. Schulz replied: We have quite a number of different crystal structures of various adenylate kinases without ATP that show the INSERT domain at diverse rotational positions. INSERT behaves like a rigid body of 38 residues (see Fig.
3 in our paper) that moves rather freely around a hinge; its relative position with respect to the main body is determined by crystal packing rather than by a directional force at the hinge. I expect that bound ATP fixes this rigid body whenever it comes near to it during its tumbling motions. Fixing by ATP causes two aspartates to orient two arginines in such a manner that these arginines build up a stabilizing pathway for the transferred γ-phosphoryl group (Fig. 5 in our paper). The process of opening and closing is rather fast; the turnover numbers tell us that a catalytic cycle requires 1-2 ms.

Dr. Littlechild communicated: In the context of proteins that 'hinge-bend' during their catalytic cycle, another important example is the glycolytic enzyme phosphoglycerate kinase. Unlike adenylate kinase, all of the crystal structures that have been solved to date for this enzyme are in the so-called 'open' conformation. We have recently solved the structure to 1.6 Å resolution of phosphoglycerate kinase from the thermophile Bacillus stearothermophilus co-crystallised in the presence of ATP.¹ In this case the 'closing' of the enzyme is not brought about by ATP binding alone. The orientation of the two domains of the enzyme has changed slightly from that for the yeast enzyme so that they move closer to each other by 4.2°. We have made many site-directed mutations at the 3-phosphoglycerate binding site of the yeast enzyme, which is located in the N-terminal domain of the protein. These all affect the Km value for this substrate but also affect the Km value for ATP, which binds to the C-terminal domain. This is a similar situation to that described for the adenylate kinase enzyme and is related to the dynamic structure of this group of enzymes.

1 G. Davies, S. Gamblin, J. Littlechild and H. C. Watson, Proteins, 1992, in the press.

Prof. Blow asked Prof.
Dodson: Do you have any thoughts on how the interaction of lipase with a more extensive hydrophobic matrix might be modelled in crystals?

Prof. Dodson replied: There are two approaches that we are considering. The first is to try reacting the lipase molecule in solution with triglyceride analogues in which one of the esters is replaced by the phosphonate group. This approach requires difficult chemistry and there will be many problems associated with the crystallisation of the complex. The second approach is based on the recombinant lipase in which the active serine is replaced by alanine. This enzyme binds, but does not cleave, lipids. In these experiments we are attempting to co-crystallise triglycerides, diglycerides and related molecules with the enzyme in conditions where we think the molecule will be in its activated state. As with the earlier experiments on the smaller substrate analogues, the crystallisation experiments are proving difficult indeed.

Dr. O. Misset (Gist-Brocades, Delft, The Netherlands) said: My question relates to the mechanism of lipolysis. It is known that lipases catalyse the hydrolysis of e.g. emulsified olive oil with rate constants of 5000 s⁻¹. Two mechanisms can be envisaged: either (i) the enzyme binds to the interface, performs one hydrolysis and goes back into solution, all with a frequency of 5000 s⁻¹ (including the lid movement); or (ii) the enzyme binds to the interface and remains (in the active 'open' form) attached and performs, like a type of grass-mowing machine, the hydrolysis of many triglycerides. Can you comment on this?

Prof. Dodson replied: The mechanism by which the enzyme switches from the inactive to the active state is a simple one. The movement of the small helical lid by ca. 8 Å is easily made and is accompanied by a significant extension of the non-polar surface. Both the mechanisms suggested seem to me to be possible. The catalytic frequency of ca.
5000 s⁻¹ is quite consistent with the simple motion of the helix and is quite consistent too with the enzyme binding to the lipid interface, reacting and being released and becoming inactive in the aqueous environment. If the enzyme remained attached to the interface in the active state this would present the problem of water access which is needed for the deacylation step, but would not, I think, present problems for removal of products and arrival of new substrate. In my view it seems likely that the catalytic hydrolysis of the lipids could be brought about by a combination of the two mechanisms.

Prof. Karplus asked: It is expected that the helix 'lid' is in equilibrium between the active and inactive conformation. In aqueous solution the inactive conformation is favoured because it exposes the hydrophilic face of the helix. Would it be possible to mutate the hydrophobic face of the helix to make it more hydrophilic and so favour the active conformation in aqueous solution? Since water-soluble substrates exist this should increase the catalysis rate. Such a result would support the proposed activation mechanism.

Prof. Dodson replied: The mutation of the hydrophobic face of the helix to a more hydrophilic character would, we believe, stabilise an open and therefore active conformation in aqueous solution. It would therefore follow indeed that the soluble structure would be hydrolysed by the mutant enzyme. This would indeed support the proposed activation mechanism.

Dr. Edwards said: Since the aqueous solubility of most lipids is vanishingly small, can one be sure that the lack of activity at very low lipid concentrations is not due to very slow mass transfer through the aqueous phase?

Prof. Dodson replied: The best evidence that the lipase is genuinely inactive in aqueous conditions is that reagents such as DNPP do not react with the enzyme in solution. This reagent, however, reacts rapidly with the activated serines in serine proteases.
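
A minimal sketch, assuming one hydrolysis event per catalytic cycle, of how the catalytic frequency of ca. 5000 s⁻¹ quoted in the lipase discussion translates into a time budget per cycle. The function name cycle_time_ms is hypothetical and serves only this illustration.

```python
def cycle_time_ms(turnover_per_s: float) -> float:
    """Mean duration of one catalytic cycle, in milliseconds.

    Assumes (for illustration only) that each cycle corresponds to a
    single hydrolysis event, so the cycle time is the reciprocal of
    the turnover frequency.
    """
    if turnover_per_s <= 0:
        raise ValueError("turnover frequency must be positive")
    return 1000.0 / turnover_per_s


# Frequency quoted for lipase acting on emulsified olive oil.
lipase_ms = cycle_time_ms(5000.0)
print(f"{lipase_ms:.1f} ms per cycle")
```

On this assumption each cycle, including any lid movement under mechanism (i), would have to be completed in about 0.2 ms.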
ISSN:1359-6640
DOI:10.1039/FD9929300107
Publisher: RSC
Year: 1992
Data source: RSC