Original article
The consequences of scoring docked ligand conformations using free energy correlations

https://doi.org/10.1016/j.ejmech.2006.12.037

Abstract

Ligands from a set of 19 protein–ligand complexes were re-docked with AutoDock, GOLD and FlexX using the scoring algorithms native to these programs supplemented by analysis using the HINT free energy force field. A HINT scoring function was calibrated for this data set using a simple linear regression of total HINT score for crystal-structure complexes vs. measured free energy of binding. This function had an r² of 0.84 and a standard error of ±0.42 kcal mol⁻¹. The free energies of binding were calculated for the best poses using the AutoDock, GOLD and FlexX scoring functions. The AutoDock and GoldScore algorithms estimated more than half of the binding free energies within the reported calibration standard errors for these functions, while that of FlexX did not. In contrast, the calibrated HINT scoring function identified optimized poses with standard errors near ±0.5 kcal mol⁻¹. When the metric of success is minimum RMSD (vs. crystallographic coordinates) the three docking programs were more successful, with mean RMSDs for the top-ranking poses in the 19 complexes of 3.38, 2.52 and 2.62 Å for AutoDock, GOLD and FlexX, respectively. Two key observations in this study have general relevance for computational medicinal chemistry: first, while optimizing RMSD with docking score functions is clearly of value, these functions may be less well optimized for free energy of binding, which has broader applicability in virtual screening and drug discovery than RMSD; second, scoring functions uniquely calibrated for the data set or sets under study should nearly always be preferable to universal scoring functions. Due to these advantages, the poses selected by the HINT score also required less post-docking structure optimization to produce usable molecular models. Most of these features may be achievable with other scoring functions.

Introduction

The continuing search for therapeutic agents at lower cost and at greater speed (lab bench to bedside) has highlighted the value of computational methods for virtual screening of real or hypothetical libraries to identify new compounds with affinity for the target biomacromolecule. Docking of ligands into models of the protein or macromolecule active site and scoring of these ligands with respect to “fitness” are the most critical issues in virtual drug screening methods [1], [2], [3], [4], [5], [6], [7]. Methods that are both fast and reliable are particularly required for the prediction of binding affinity when screening a large library of compounds. While significant strides in reaching this goal over the past several years have been reported, real application of these computational tools has seldom delivered results equal to the promise of the validations and/or runs on test sets.

The docking and scoring paradigm can be thought of as two separate problems: first, to create plausible “poses” of the putative ligand within the active site, and second, to identify which of these poses is most likely to be true, i.e., binding most favorably with the target. The first problem is simply geometry, or more broadly informatics, i.e., how can we place a solid object (ligand) within a “cavity” of another solid (protein) in well-defined Cartesian space? The second problem is chemical, i.e., analyzing the specific structure and interactions of the docked ligand–protein model with an assigned score or ultimately a binding free energy. Often, however, the pose generation relies on some form of intermediate scoring to increase the plausibility of the poses presented for final evaluation. There is a wide range of algorithms and approaches used to produce docking poses [8], [9], [10], [11], [12], [13], [14]. Docking approaches can be classified into three categories: completely rigid body, partial flexibility and complete flexibility. In two-body systems, e.g., docking ligands into proteins, the first approach treats the protein and ligand as two independent but rigid bodies, the second treats only the ligand as flexible, i.e., allowing adjustment of its rotatable bonds, and the third considers both protein and ligand as flexible. While the first approach is too crude to be of much value, of the other two, only the semi-flexible approach is computationally accessible with widely available docking tools.

Docking and scoring programs have been reviewed and benchmarked extensively [5], [6], [7], [15], [16], [17], [18], [19]. To summarize these reports in a few words, most programs are capable of producing viable docking poses, albeit at varying speeds, usually including one or more poses qualitatively similar to the known crystallographic conformation. However, the ability of their associated scoring tools to identify the correct (crystallographic) pose is considerably more problematic. Not unexpectedly, within the training sets for the various docking/scoring codes, the results are significantly better than when these tools are applied to other data sets. The typical validation experiment is to correlate the pose scores with the RMSD (root mean square deviation) of each pose with respect to the crystallographic (experimental) pose [20], [21]. In principle the best score will correspond to the lowest RMSD. There are a few realities in computational molecular modeling (and crystallography is, at its core, modeling) that suggest caution in completely relying on this approach. First, except in rare cases of very high resolution or neutron diffraction, there is measurable uncertainty in the “reference” experimental pose. There are actually a number of cases where the entire ligand orientation is ambiguous [22], [23]. Moreover, the crystallographic pose is a “frozen” model of what is actually a very dynamic process at biologically relevant temperatures. The measured free energy of binding, generally obtained at room temperature or higher, is a “weighted” composite of these states. In other words, higher quality crystal structures may not completely translate to better free energy predictions as they only represent a few of the many states. Second, in designing and calibrating docking scoring functions, optimization of the correspondence between the best scoring pose and the crystallographic structure (i.e., minimizing RMSD) is a somewhat contrived goal that is perhaps not totally relevant to virtual screening. Last, most dock scoring functions are constructed without explicit consideration of the contribution of entropic terms and hydrophobic interactions distinct from London forces.
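To make the RMSD validation metric discussed above concrete, the following minimal Python sketch computes the RMSD between a docked pose and a reference pose. It is an illustration only, not the procedure of any particular docking program: the function name `rmsd` and the toy coordinates are hypothetical, atoms are assumed to be listed in the same order in both poses, and no superposition or symmetry correction is applied (both poses are assumed to share the receptor coordinate frame).

```python
import math

def rmsd(pose, reference):
    """Root mean square deviation (same length units as the input).

    Assumes both arguments are lists of (x, y, z) tuples with atoms in
    identical order, expressed in the same coordinate frame. No fitting
    or symmetry handling is performed (simplifying assumption).
    """
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and have equal atom counts")
    squared_sum = sum(
        (a - b) ** 2
        for atom_p, atom_r in zip(pose, reference)
        for a, b in zip(atom_p, atom_r)
    )
    return math.sqrt(squared_sum / len(pose))

# Toy coordinates for illustration only (not taken from any structure).
reference_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]  # rigid shift of 2 along z

shift_rmsd = rmsd(docked_pose, reference_pose)
```

Because every atom in the toy pose is displaced by the same 2-unit shift, the RMSD equals that shift. In practice, heavy atoms only are typically compared, and symmetric groups may need special handling, which this sketch omits.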

We have previously described the properties of the HINT force field and free energy scoring tool for understanding protein–ligand interactions in biological media [24], [25], [26], [27]. A number of principles relevant to virtual screening were highlighted, i.e., the importance of hydrophobic interactions and entropy in modeling free energy, and the energetic influences of active site (bridging) waters and of the ionization states of acidic and basic residues and ligand functional groups. Here we examined 19 protein–ligand complexes for which crystallographic and binding energy data were available. After some limited model cleanup, we calculated an experimental vs. calculated free energy correlation for the set using the HINT free energy scoring tool [24]. The ligands for these complexes were then re-docked into their host proteins using three popular docking codes and the resulting poses were evaluated using several scoring criteria.
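The kind of score-to-free-energy calibration described above can be sketched as a simple least-squares line. The Python example below is a hedged illustration, not the HINT implementation: the function names (`ki_to_dg`, `fit_line`, `fit_quality`) are hypothetical, the score and free energy values are made up, and the temperature of 298.15 K is an assumption. It also shows the common approximation ΔG ≈ RT ln K_i for converting an inhibition constant (in mol/L) to a binding free energy.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ki_to_dg(ki_molar, temperature=298.15):
    """Approximate binding free energy (kcal/mol) as RT ln(Ki).

    Treats Ki as a dissociation constant; temperature is an assumption.
    """
    return R_KCAL * temperature * math.log(ki_molar)

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_quality(x, y, slope, intercept):
    """Return (r squared, standard error of the estimate) for the line."""
    n = len(x)
    mean_y = sum(y) / n
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / (n - 2))

# Made-up (score, experimental dG) pairs; NOT data from this study.
scores = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
dg_exp = [-6.1, -7.9, -10.2, -11.8, -14.0]

slope, intercept = fit_line(scores, dg_exp)
r2, std_err = fit_quality(scores, dg_exp, slope, intercept)

# Predict dG for a new, hypothetical pose score using the calibrated line.
dg_pred = slope * 3500.0 + intercept
dg_nanomolar = ki_to_dg(1e-9)  # a 1 nM inhibitor at the assumed temperature
```

Once a line has been calibrated on crystal-structure complexes, the same slope and intercept can be applied to the score of any docked pose, so that poses are ranked by predicted free energy rather than by a raw score.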

Our principal motivation for the present work is to answer the following: does a calibrated free energy scoring tool have distinct advantages over the current collection of docking score functions for evaluating calculated poses? The answer to that question reveals several guidelines governing the usage of docking and scoring studies in modern drug discovery that should be of general interest to medicinal chemists who use or interpret docking results. These results suggest, in part, why virtual screening has yet to deliver results commensurate with expectations [7]. One particular point of emphasis is a comparison of scoring functions with respect to their usefulness in virtual screening as opposed to selecting poses with lowest root mean square deviation (RMSD) between experimental and docked conformations.

Protein–ligand test set

The structures of the 19 analyzed protein–ligand complexes (Table 1) were retrieved from the Protein Data Bank [28] (www.rcsb.org). The structures were chosen according to the following criteria: non-covalent binding between protein and ligand, crystallographic resolution of 3 Å or better, and inhibition constant values ranging from μM to nM. Also, in order to study a heterogeneous set representative of different existing structural architectures, proteins characterized by multiple

Results and discussion

Identifying new potential lead compounds in silico, without actually performing expensive and time-consuming library screening, has become a most intriguing challenge of computational chemistry. Designing and applying docking and scoring tools that are accurate, precise and thus able to identify the correct pose of ligands docked in target binding pockets is a principal goal [8], [9], [10], [11], [12], [13], [14]. Many studies have been performed to identify which docking program performs best

Summary

This work is part of our long-range goal of designing and implementing a reliable and accurate system for scoring and understanding the results from virtual screening. Previous reports were an initial calibration of the HINT scoring function for the free energy of binding in protein–ligand complexes [24], an examination of the role of ionization state in calculating free energy of binding or what we termed “computational titration” [25], and a study of the energetic effects of bridging water in

Acknowledgements

This work was partially supported by funds from the Italian Ministry of Instruction, University and Research within a COFIN2005 and an Internationalization project (A.M.) and U.S. NIH grant GM71894 (G.E.K.). We acknowledge Dr. Philip D. Mosier for critical and thoughtful suggestions that greatly improved this manuscript.

References (86)

  • T.P. Lybrand, Curr. Opin. Struct. Biol. (1995)
  • W.P. Walters et al., Drug Discov. Today (1998)
  • M. Rarey et al., J. Mol. Biol. (1996)
  • G. Jones et al., J. Mol. Biol. (1997)
  • R.D. Clark et al., J. Mol. Graph. Model. (2002)
  • A. Amadasi et al., J. Mol. Biol. (2006)
  • Z.S. Derewenda et al., J. Mol. Biol. (1995)
  • G. Jones et al., J. Mol. Biol. (1995)
  • M. Stahl et al., J. Mol. Graph. Model. (1998)
  • H. Gohlke et al., J. Mol. Biol. (2000)
  • D.M. Miller et al., J. Biol. Chem. (1983)
  • G.J. Kleywegt et al., Structure (1994)
  • H. Brandstetter et al., J. Mol. Biol. (1992)
  • J. Sturzebecher et al., Thromb. Res. (1984)
  • L.W. Guddat et al., J. Mol. Biol. (1994)
  • G.V. Richieri et al., J. Biol. Chem. (1994)
  • Z. Xu et al., J. Biol. Chem. (1993)
  • D. Turk et al., FEBS Lett. (1991)
  • Q. Huai et al., Structure (2003)
  • J.E. Souness et al., Immunopharmacology (2000)
  • A.K. Shiau et al., Cell (1998)
  • G. Amari et al., Bioorg. Med. Chem. (2004)
  • G.E. Kellogg, Med. Chem. Res. (1999)
  • C. Perez et al., J. Med. Chem. (2001)
  • C. Bissantz et al., J. Med. Chem. (2000)
  • L. Xing et al., J. Comput. Aided Mol. Des. (2004)
  • G.L. Warren et al., J. Med. Chem. (2006)
  • R. Abagyan et al., J. Comput. Chem. (1994)
  • S. Makino et al., J. Comput. Chem. (1997)
  • C. McMartin et al., J. Comput. Aided Mol. Des. (1997)
  • G.M. Morris et al., J. Comput. Chem. (1998)
  • R.A. Friesner et al., J. Med. Chem. (2004)
  • P.S. Charifson et al., J. Med. Chem. (1999)
  • R. Wang et al., J. Med. Chem. (2003)
  • B.D. Bursulaya et al., J. Comput. Aided Mol. Des. (2003)
  • R. Wang et al., J. Chem. Inf. Model. (2004)
  • H. Chen et al., J. Chem. Inf. Model. (2006)
  • E. Kellenberger et al., Proteins (2004)
  • D. Borek et al., Acta Crystallogr. D Biol. Crystallogr. (2003)
  • A.M. Davis et al., Angew. Chem. Int. Ed. Engl. (2003)
  • P. Cozzini et al., J. Med. Chem. (2002)
  • M. Fornabaio et al., J. Med. Chem. (2003)
  • M. Fornabaio et al., J. Med. Chem. (2004)