Data mining of metal ion environments present in protein structures
Introduction
Metal ions are frequently observed in protein structures and are often crucial for protein function, stability, or both. Moreover, in many cases metal ions are critical for crystal formation, as the ions mediate crystal contacts between proteins. In the release dated February 20, 2007 of the Protein Data Bank (PDB) [1], approximately 30% of structures contained metal ions. Among 23,537 structures of proteins complexed with one or more small-molecule ligands, 20% contained one or more metal ions close to the ligand binding site that are likely to interact either directly or indirectly with the ligand: 10% of the structures have a direct cation–ligand contact, and the other 10% have a cation–ligand interaction bridged by an amino acid or ordered water molecules. A detailed analysis of the metal coordination architecture within proteins represents an important addition to the understanding of the biochemical functions of metalloproteins.
The ratio of the number of observed data to the number of parameters used in structure refinement depends on the data resolution and the number of atoms in a crystallographic asymmetric unit. For macromolecular structures, this ratio is usually low because of the limited resolution of the data used to determine such structures. The use of model restraints is therefore nearly universal in model building and structure refinement [2]. In addition to the stereochemical restraints for the macromolecule itself [3], [4], it is essential to apply restraints to the metal ion-binding site (and subsequently interpret the electron density), taking into account the coordination properties of the cation. In all the most popular programs used for macromolecular structure refinement, the restraints for metal–ligand interactions must be manually defined by the user. While the stereochemistry of proteins and nucleic acids is well understood, there is no universal approach to describing the geometry of metal ion-binding sites. Alkaline earth cations such as calcium and magnesium are relatively easy to identify in electron density because the geometrical parameters (e.g. bond lengths and coordination number) of their binding sites are very well characterized [5], [6], [7], [8]. Alkali metal ions such as sodium and potassium, however, are more difficult to identify because their coordination spheres are not as regular as those of alkaline earth metal ions [9]. Transition metals have even more complex binding patterns: not only can their coordination numbers vary, but they can also have different oxidation states. The bond lengths for transition metals depend on their oxidation state, and even within the same oxidation state different bond lengths are observed because of known geometrical distortions of the coordination spheres, for example those caused by the Jahn–Teller effect [10] or by different spin states.
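The data-to-parameter ratio mentioned at the start of the preceding paragraph can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not a procedure taken from the article: it assumes four refined parameters per atom (x, y, z and an isotropic B-factor) and treats each restraint as one additional observation. The function name and all numbers are hypothetical.

```python
def data_to_parameter_ratio(n_reflections, n_atoms,
                            params_per_atom=4, n_restraints=0):
    """Ratio of observations to refined parameters.

    Assumptions (not from the article): each atom contributes
    `params_per_atom` parameters (x, y, z, isotropic B), and each
    restraint counts as one extra observation.
    """
    n_parameters = n_atoms * params_per_atom
    return (n_reflections + n_restraints) / n_parameters


# Hypothetical medium-resolution case: 20,000 unique reflections, 5,000 atoms.
ratio_unrestrained = data_to_parameter_ratio(20_000, 5_000)
# The same case with 15,000 stereochemical restraints added.
ratio_restrained = data_to_parameter_ratio(20_000, 5_000, n_restraints=15_000)
```

With these invented numbers the ratio rises from 1.0 to 1.75 once restraints are counted, which is the sense in which restraints compensate for limited resolution.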
Studies describing the geometry of metal ion-binding sites within proteins and in small-molecule structures were recently discussed extensively in a series of papers by Harding [5], [6], [7], [8], [9], [11]. Here, in contrast, our objective is to analyze the properties of metal ion-binding sites in protein structures as a function of structure resolution and crystallographic methodology. In particular, we report a relational database approach to statistically analyze metal ion sites in protein structures present in the PDB [1], and compare them to high-resolution small-molecule structures obtained from the Cambridge Structural Database (CSD) [12]. We examined not only the distributions of bond lengths and coordination numbers but also the B-factors (displacement parameters, sometimes referred to as ‘temperature factors’) and relative occupancies of metal ions versus their coordinating atoms. The distributions were cross-correlated with the computer programs used for structure refinement. Our results show some abnormally high or low values of bond lengths and B-factors in metal-binding sites reported in the PDB. Despite many theoretical papers describing proper geometrical restraints for metal ion environments, our examination of recent structures indicates that those restraints are often not properly used in structure refinement.
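A relational approach of the kind described above could be sketched as follows. The article does not give its schema, so the table names (`structure`, `contact`), the column names, the identifiers and the values below are all hypothetical; the sketch only shows how metal–ligand distances might be cross-correlated with the refinement program through a join and a grouped aggregate.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE structure (
    pdb_id     TEXT PRIMARY KEY,
    resolution REAL,   -- in angstroms
    program    TEXT    -- refinement program (placeholder names below)
);
CREATE TABLE contact (
    pdb_id      TEXT REFERENCES structure(pdb_id),
    metal       TEXT,
    ligand_atom TEXT,
    distance    REAL   -- metal-ligand distance in angstroms
);
""")

# Invented toy rows, not real PDB entries.
conn.executemany("INSERT INTO structure VALUES (?, ?, ?)", [
    ("toy1", 1.5, "program_a"),
    ("toy2", 2.8, "program_b"),
])
conn.executemany("INSERT INTO contact VALUES (?, ?, ?, ?)", [
    ("toy1", "CA", "O", 2.4),
    ("toy1", "CA", "O", 2.3),
    ("toy2", "CA", "O", 2.9),
])

# Mean metal-ligand distance per refinement program and metal.
rows = conn.execute("""
SELECT s.program, c.metal, COUNT(*) AS n, AVG(c.distance) AS mean_distance
FROM contact AS c
JOIN structure AS s ON s.pdb_id = c.pdb_id
GROUP BY s.program, c.metal
ORDER BY s.program
""").fetchall()
```

The same pattern extends to B-factors or occupancies by adding columns to the hypothetical `contact` table and changing the aggregate.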
Data set under investigation
This work is based on the PDB release of February 20, 2007 (41,814 structures). All structures in the PDB that contain one or more Ca, Mg, Na, K, Mn, Co, Fe, Zn, Ni, or Cu cations are included in the statistical analysis unless otherwise specified. In the analyses of structure resolution, B-factor, or occupancy, only metal ion-binding sites in protein structures solved by X-ray crystallography were included. For purposes of comparative analysis, the set was subdivided; structures with
Atom type and amino acid profiles of metal ion-binding sites
The distribution of normalized frequencies Fatom of atoms located within 3 Å of the metal ion is shown in Table 1. The same table generated with a cutoff of 4 Å gives similar, but somewhat noisier, results. The non-redundant subset of structures, which contains around 30% of the data in the complete data set, gives results very similar to those for the complete data set shown in Table 1. The number of interactions listed in the last row of both Table 1a and b represents the number of pairs (in this case, a metal ion
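A distance-cutoff tally of this kind can be sketched in a few lines. The exact normalization behind Fatom is not visible in this excerpt, so the sketch assumes the simplest choice: for each metal, the fraction of all contacts within the cutoff that involve a given atom type. The function name, atom labels and coordinates are hypothetical.

```python
from collections import Counter
from math import dist


def contact_frequencies(metals, atoms, cutoff=3.0):
    """Return {metal_element: {atom_label: fraction_of_contacts}}.

    `metals` and `atoms` are lists of (label, (x, y, z)) tuples with
    coordinates in angstroms. Normalizing by the total number of
    contacts per metal is an assumption, not the article's definition.
    """
    counts = {}
    for metal_element, metal_xyz in metals:
        per_metal = counts.setdefault(metal_element, Counter())
        for atom_label, atom_xyz in atoms:
            if dist(metal_xyz, atom_xyz) <= cutoff:
                per_metal[atom_label] += 1

    freqs = {}
    for element, counter in counts.items():
        total = sum(counter.values())
        freqs[element] = (
            {label: n / total for label, n in counter.items()} if total else {}
        )
    return freqs


# Invented toy site: one calcium ion and four nearby atoms.
metals = [("CA", (0.0, 0.0, 0.0))]
atoms = [
    ("carboxylate O", (2.4, 0.0, 0.0)),
    ("carboxylate O", (0.0, 2.3, 0.0)),
    ("water O", (0.0, 0.0, 2.5)),
    ("backbone N", (3.5, 0.0, 0.0)),  # outside 3 angstroms, inside 4
]
freq_3 = contact_frequencies(metals, atoms, cutoff=3.0)
freq_4 = contact_frequencies(metals, atoms, cutoff=4.0)
```

In this toy case the 4 Å cutoff admits an extra, more distant atom, which suggests how a wider cutoff can pick up atoms that are not true ligands and so add noise.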
Atoms and amino acids participating in metal ion-binding
All analyzed metal ions except Cu show a preference for interaction with a side chain carboxylate group (Table 1). Alkaline earth metal ions (Ca2+, Mg2+) exhibit the highest preference for coordination by side chain carboxylate groups followed by a weaker preference for interaction with oxygen atoms from side chain amide groups. Alkali metal ions (Na+, K+) are preferred approximately equally by all types of oxygen atoms. Metal ions from both the imidazole class (Mn, Co, Fe) and the sulfur class
Conclusion
Analysis of PDB structures that contain metal ions reveals that, despite several publications providing an excellent description of the geometry of metal ion environments, there are still many structures (even some solved very recently) that have quite unusual geometry. Often, the geometries of metal ion-binding sites were not properly restrained, most probably because none of the commonly used refinement programs offers a mechanism to generate such restraints automatically. We suggest
Acknowledgments
We would like to thank Zbigniew Dauter, Andrzej Joachimiak, and Matthew Zimmerman for critically reading the manuscript and making valuable comments. The work was supported by NIH Grants GM74942 and GM53163.
References (22)
- et al., J. Inorg. Biochem. (1998)
- et al., Nucleic Acids Res. (2000)
- Acta Crystallogr. D (2007)
- et al., Acta Crystallogr. A (1991)
- et al., Acta Crystallogr. D (2007)
- Acta Crystallogr. D (2001)
- Acta Crystallogr. D (1999)
- Acta Crystallogr. D (2000)
- Acta Crystallogr. D (2006)
- Acta Crystallogr. D (2002)