Original article
Troubleshooting computational methods in drug discovery

https://doi.org/10.1016/j.vascn.2010.02.005

Abstract

Computational approaches for drug discovery, such as ligand-based and structure-based methods, are increasingly seen as an efficient route to lead discovery and as a source of insight into absorption, distribution, metabolism, excretion and toxicity (ADME/Tox). What is perhaps less well known and less widely described are the limitations of the different technologies. We have therefore presented a troubleshooting approach to QSAR, homology modeling, docking and hybrid methods. If such computational or cheminformatics methods are to become more widely used by non-experts, it is critical that such limitations are brought to the user's attention and addressed during their workflows. This could improve the quality of the models and results that are obtained.

Introduction

Lead generation and the avoidance of late-stage failures of lead compounds are the two major bottlenecks in the drug discovery process. The problems associated with lead discovery start with the escalating costs of lead generation against a validated target. Attempts to utilize combinatorial and high throughput screening have led to only a marginal increase in success, which has prompted the pharmaceutical and biotech industries to question the cost effectiveness of these technologies (Macarron, 2006). Similarly, late-stage failures of potent lead compounds due to undesirable physicochemical properties have led to a major shift in drug discovery protocols over the past decade: lead compounds are now evaluated for drug-like properties very early in the discovery process, using computational prediction methods based on statistical techniques that draw on existing experimental knowledge about physicochemical properties (Ekins, Ring, Grace, McRobie-Belle & Wrighton, 2000c). Virtual screening techniques have been integrated into the design protocol, reducing the burden of experimentally screening millions of compounds to identify lead molecules and hence providing very cost-effective methods to address the lead generation problem (Oprea & Matter, 2004).

In silico methods can be broadly grouped into those applied to study (a) targets, such as protein and DNA sequences, genomes, networks or pathways, and even tissues and organs such as a virtual liver; (b) small molecules, such as those that do not contain amino acids or nucleic acids; and (c) complexes of small molecules with target proteins, DNA or RNA. Based on these classifications, in silico techniques have been utilized for analyzing target and small molecule interactions and networks (Bajorath, 2008, Hopkins, 2007, Oprea et al., 2007, Young et al., 2008), quantitative structure activity relationships (Dimitrov et al., 2005, Dudek et al., 2006), analyzing similarity between either small molecules or targets (Lemmen & Lengauer, 2000, Melville et al., 2009, Schaffer et al., 2001, Sharan & Ideker, 2006, Williams & Schreyer, 2009), pharmacophore based modeling and screening (Chang et al., 2006c, Khedkar et al., 2007, Sun, 2008), homology modeling (Hillisch et al., 2004, Qu et al., 2009, Xu et al., 2000), molecular dynamics simulation protocols (Amaro & Li, 2010, Morra et al., 2010, Okimoto et al., 2009), data mining of genomes or small molecule databases (Feng et al., 2009, Marechal, 2008, Wagner & Clemons, 2009), network or pathway analysis (Hendriks et al., 2008, Hopkins, 2008), and machine learning techniques (Chekmarev et al., 2009a, Kortagere et al., 2009b, Li et al., 2007, Melville et al., 2009). Thus modern drug discovery is a complex, iterative process involving several levels of screening of the wealth of experimental data available, from whole genome sequencing projects to public and private databases of chemicals such as PubChem (Sayers et al., 2009), Zinc (Irwin & Shoichet, 2005) and ChemSpider (Williams, 2008). Extracting all the information at the required level of sophistication has been made possible only by exceptional bioinformatics and computational tools.

Although computational methods have been designed for the variety of applications listed above, this study concentrates on the limitations of the in silico techniques relevant to virtual screening and absorption, distribution, metabolism, excretion and toxicity (ADME/Tox) filtering, and on ways to troubleshoot them.

Section snippets

Virtual screening algorithms

Virtual screening of large chemical libraries has become an important step in the pathway to discovering new lead molecules. It has the advantage of limiting the number of molecules that have to be synthesized and tested experimentally using high throughput screening (HTS) against a biological target, thereby reducing the cost of production. Virtual screening can be done using either ligand-based methods (e.g. pharmacophore or similarity-based) or receptor-based methods. Both these
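
As a rough illustration of the similarity-based branch of ligand-based screening, the sketch below ranks library compounds by Tanimoto similarity to a known active. It assumes each molecule has already been reduced to a binary fingerprint, represented here as the set of its "on" bit positions. The function names, the toy library and the 0.5 threshold are all hypothetical and are not taken from the article.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is the set of "on" bit positions.
# How those bits are derived from a structure is outside this sketch.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints: report 0.0 rather than divide by zero.
        return 0.0
    return len(a & b) / union


def similarity_screen(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    threshold: float = 0.7,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data, invented for illustration only.
query = frozenset({1, 2, 3, 4})
library = {
    "cmpd_A": frozenset({1, 2, 3, 4}),
    "cmpd_B": frozenset({1, 2, 3, 5}),
    "cmpd_C": frozenset({7, 8}),
}
print(similarity_screen(query, library, threshold=0.5))
# → [('cmpd_A', 1.0), ('cmpd_B', 0.6)]
```

With real data the fingerprints would come from a cheminformatics toolkit, and the threshold would need to be tuned for the fingerprint type in use.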

In silico modeling of ADMET properties

The screening of ADME/Tox properties of molecules can be done using in vitro and in vivo methods, but these are not cost effective for very large numbers of compounds. Instead, in silico techniques can be used to predict these properties, so that only those compounds that advance as lead molecules are screened using in vitro and in vivo techniques. Several methods were proposed a decade ago (Ekins et al., 1999a, Ekins et al., 1999b, Ekins et al., 2000a, Ekins et al., 1999c, Ekins et

Troubleshooting

Several essential factors play a key role during the building of an in silico model to predict complex physiological properties such as ADMET. These include ensuring the molecular structures in a dataset are correct (e.g. stereochemistry), the quality of the datasets containing experimental values is high, the chemical space covered by the dataset molecules in the training and test sets is adequate and comparable, preferably interpretable molecular descriptors are used, the statistical methods
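
One simple way to check whether training and test sets cover comparable chemical space is a descriptor range check: a test compound is flagged when any of its descriptor values falls outside the range seen in training. The sketch below only illustrates that idea and is not the procedure used by the authors. The descriptor columns (for example logP and molecular weight), the function names and the toy values are assumptions.

```python
from typing import List, Sequence, Tuple

Ranges = List[Tuple[float, float]]


def descriptor_ranges(training: Sequence[Sequence[float]]) -> Ranges:
    """Per-descriptor (min, max) over the training set, one row per compound."""
    columns = list(zip(*training))
    return [(min(col), max(col)) for col in columns]


def outside_domain(compound: Sequence[float], ranges: Ranges) -> List[int]:
    """Indices of descriptors whose value lies outside the training range."""
    return [
        i
        for i, (value, (low, high)) in enumerate(zip(compound, ranges))
        if value < low or value > high
    ]


def coverage(test: Sequence[Sequence[float]], ranges: Ranges) -> float:
    """Fraction of test compounds with every descriptor inside the ranges."""
    if not test:
        return 0.0
    inside = sum(1 for compound in test if not outside_domain(compound, ranges))
    return inside / len(test)


# Toy descriptor table, invented for illustration: columns are [logP, MW].
training = [[1.0, 200.0], [3.0, 350.0], [2.0, 500.0]]
test = [[2.0, 300.0], [4.0, 300.0]]

ranges = descriptor_ranges(training)
print(ranges)                                  # → [(1.0, 3.0), (200.0, 500.0)]
print(outside_domain([2.5, 600.0], ranges))    # → [1]
print(coverage(test, ranges))                  # → 0.5
```

A low coverage value suggests that predictions for much of the test set would be extrapolations. A range check is deliberately crude: it ignores correlations between descriptors, so distance-based or density-based domain definitions may be preferable in practice.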

Conclusion

We have described various in silico techniques involved in the drug discovery process for lead molecule design and ADMET prediction, and have focused on the limitations and troubleshooting points that we think are important to be aware of for each method. It is probably worth suggesting to software vendors and developers that they should make the user aware of some of these issues. If such technologies are to have a wider user base, troubleshooting should be addressed more explicitly and checks performed in

References (146)

  • H. Gohlke et al.

    Statistical potentials and scoring functions applied to protein–ligand binding

    Current Opinion in Structural Biology

    (2001)
  • A. Hillisch et al.

    Utility of homology models in the drug discovery process

    Drug Discovery Today

    (2004)
  • J.D. Hughes et al.

    Physiochemical drug properties associated with in vivo toxicological outcomes

    Bioorganic & Medicinal Chemistry Letters

    (2008)
  • Y.A. Ivanenkov et al.

    Computational mapping tools for drug discovery

    Drug Discovery Today

    (2009)
  • L.J. Jolivette et al.

    Methods for predicting human drug metabolism

    Advances in Clinical Chemistry

    (2007)
  • W.L. Jorgensen et al.

    Prediction of drug solubility from structure

    Advanced Drug Delivery Reviews

    (2002)
  • H. Li et al.

    Machine learning approaches for predicting compounds that interact with therapeutic and ADMET related proteins

    Journal of Pharmaceutical Sciences

    (2007)
  • R. Macarron

    Critical review of the role of HTS in drug discovery

    Drug Discovery Today

    (2006)
  • G.R. Marshall et al.

    Three-dimensional structure–activity relationships

    Trends in Pharmacological Sciences

    (1988)
  • M. Matic et al.

    Pregnane X receptor: Promiscuous regulator of detoxification pathways

    International Journal of Biochemistry and Cell Biology

    (2007)
  • T. Abshear et al.

    A model validation and consensus building environment

    SAR and QSAR in Environmental Research

    (2006)
  • R.E. Amaro et al.

    Emerging ensemble-based methods in virtual screening

    Current Topics in Medicinal Chemistry

    (2010)
  • J. Aqvist et al.

    Ligand binding affinities from MD simulations

    Accounts of Chemical Research

    (2002)
  • A.H. Asikainen et al.

    Consensus kNN QSAR: A versatile method for predicting the estrogenic activity of organic compounds in silico. A comparative study with five estrogen receptors and a large, diverse set of ligands

    Environmental Science and Technology

    (2004)
  • A. Bender et al.

    Analysis of pharmacology data and the prediction of adverse drug reactions and off-target effects from chemical structure

    ChemMedChem

    (2007)
  • B. Brooks et al.

    Harmonic dynamics of proteins: Normal modes and fluctuations in bovine pancreatic trypsin inhibitor

    Proceedings of the National Academy of Sciences of the United States of America

    (1983)
  • E.O. Cannon et al.

    A novel hybrid ultrafast shape descriptor method for use in virtual screening

    Chemistry Central Journal

    (2008)
  • C. Chang et al.

    Rapid identification of P-glycoprotein substrates and inhibitors

    Drug Metabolism and Disposition

    (2006)
  • C. Chang et al.

    Computer optimization of biopharmaceutical properties

  • D. Chekmarev et al.

    Predicting inhibitors of acetylcholinesterase by regression and classification machine learning approaches with combinations of molecular descriptors

    Pharmaceutical Research

    (2009)
  • D.S. Chekmarev et al.

    Shape signatures: New descriptors for predicting cardiotoxicity in silico

    Chemical Research in Toxicology

    (2008)
  • A. Cherkasov et al.

    Progressive docking: A hybrid QSAR/docking approach for accelerating in silico high throughput screening

    Journal of Medicinal Chemistry

    (2006)
  • J.C. Cole et al.

    Comparing protein–ligand docking programs is difficult

    Proteins

    (2005)
  • L. Diao et al.

    Novel inhibitors of human organic cation/carnitine transporter (hOCTN2) via computational modeling and in vitro testing

    Pharmaceutical Research

    (2009)
  • R. Dias et al.

    Evaluation of molecular docking using polynomial empirical scoring functions

    Current Drug Targets

    (2008)
  • S. Dimitrov et al.

    A stepwise approach for defining the applicability domain of SAR and QSAR models

    Journal of Chemical Information and Modeling

    (2005)
  • D.J. Dix et al.

    The ToxCast program for prioritizing toxicity testing of environmental chemicals

    Toxicological Sciences

    (2007)
  • A.Z. Dudek et al.

    Computational methods in developing quantitative structure–activity relationships (QSAR): A review

    Combinatorial Chemistry & High Throughput Screening

    (2006)
  • A.K. Dunker et al.

    Intrinsic disorder and protein function

    Biochemistry

    (2002)
  • S. Ekins

    Computational toxicology: Risk assessment for pharmaceutical and environmental chemicals

    (2007)
  • S. Ekins et al.

    A combined approach to drug metabolism and toxicity assessment

    Drug Metabolism and Disposition

    (2006)
  • S. Ekins et al.

    Three and four dimensional-quantitative structure activity relationship (3D/4D-QSAR) analyses of CYP2D6 inhibitors

    Pharmacogenetics

    (1999)
  • S. Ekins et al.

    Three and four dimensional-quantitative structure activity relationship analyses of CYP3A4 inhibitors

    Journal of Pharmacology and Experimental Therapeutics

    (1999)
  • S. Ekins et al.

    Three and four dimensional-quantitative structure activity relationship (3D/4D-QSAR) analyses of CYP2C9 inhibitors

    Drug Metabolism and Disposition

    (2000)
  • S. Ekins et al.

    Three dimensional-quantitative structure activity relationship analyses of substrates for CYP2B6

    Journal of Pharmacology and Experimental Therapeutics

    (1999)
  • S. Ekins et al.

    Three dimensional quantitative structure activity relationship (3D-QSAR) analysis of CYP3A4 substrates

    Journal of Pharmacology and Experimental Therapeutics

    (1999)
  • S. Ekins et al.

    In vitro and pharmacophore based discovery of novel hPEPT1 inhibitors

    Pharmaceutical Research

    (2005)
  • S. Ekins et al.

    Computational discovery of novel low micromolar human pregnane X receptor antagonists

    Molecular Pharmacology

    (2008)

Cited by (44)

    • Reverse screening on indicaxanthin from Opuntia ficus-indica as natural chemoactive and chemopreventive agent

      2018, Journal of Theoretical Biology
      Citation Excerpt :

      The therapeutic activity of a drug depends on the ability to interact with a specific binding site on target protein to exert a favorite biologic effect. Ligands that share favorable interactions could exert the similar therapeutic effect on the target protein (Kortagere and Ekins, 2014). Therefore, the therapeutic role of novel compounds can be predicted by analyzing the favorable interaction at the binding site of the target protein.

    • Receptor-based virtual screening protocol for drug discovery

      2015, Archives of Biochemistry and Biophysics
      Citation Excerpt :

      Geometry-based algorithms are usually prized because they are fast and robust in dealing with structural variations or missing atoms/residues in the input structure [19]. Energy-based algorithms, on the other hand, are often more sensitive and specific [20]. Despite the distinctive approaches, the performance is very similar and both methods can correctly predict 95% of the known binding sites [21].

    • Comprehension of drug toxicity: Software and databases

      2014, Computers in Biology and Medicine
    • High-throughput respirometric assay identifies predictive toxicophore of mitochondrial injury

      2013, Toxicology and Applied Pharmacology
      Citation Excerpt :

      Toxicophores are developed from the analysis of chemical alignment of toxicants like pharmacophores are developed from the alignment of drug-like chemicals (Rhoades et al., 2012). The observed toxicities of many drugs and pesticides are associated with specific chemical structures within the compounds of interest (Bradbury, 1995; Casalegno et al., 2006; Kortagere and Ekins, 2010). The presence of “toxic” structures in novel or untested compounds can be used to predict potential toxicity based on the assumption that compounds containing the same features, or toxicophore, are likely to cause similar toxic effects (Huang et al., 2009).
