  • Protocol

I-TASSER: a unified platform for automated protein structure and function prediction

Abstract

The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is provided, based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function prediction. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
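
The abstract does not say which measure is used to structurally match models against known proteins. As an illustration only, the sketch below scores one fixed superposition with the TM-score, a widely used length-normalized similarity measure. The function names and the 0.5 Å floor on d0 are assumptions made for this example, and the search over superpositions that a full TM-score calculation performs is omitted.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (angstroms) used by the TM-score.

    Assumption for this sketch: d0 is floored at 0.5 so that very short
    chains do not produce a zero or negative scale.
    """
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_from_distances(distances, target_length):
    """Score one fixed superposition.

    distances: C-alpha distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the protein used for normalization.
    Unaligned residues contribute nothing, so sparse alignments score lower.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Hypothetical alignment: 80 of 100 residues aligned, each 2.0 angstroms apart.
    print(round(tm_score_from_distances([2.0] * 80, 100), 3))
```

A score of 1 corresponds to a perfect match over the full target length. Because the score is normalized by the target length rather than by the number of aligned residues, a short but precise partial match is not rewarded as strongly as a full-length one.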

Figure 1: A schematic representation of the I-TASSER protocol for protein structure and function predictions.
Figure 2: Example of external restraint files.
Figure 3: An illustrative example of the I-TASSER result page.
Figure 4: An illustrative example of the I-TASSER result page.
Figure 5: Illustrative examples of the I-TASSER function predictions.

Acknowledgements

We thank Dr. A. Szilagyi for reading the manuscript. This work was supported in part by the Alfred P. Sloan Foundation; the National Science Foundation (Career award 0746198); and the National Institute of General Medical Sciences (GM083107, GM084222).

Author information

Contributions

Y.Z. conceived and supervised the project. A.R., A.K. and Y.Z. designed and performed the experiments. A.R. and Y.Z. wrote the manuscript.

Corresponding author

Correspondence to Yang Zhang.

About this article

Cite this article

Roy, A., Kucukural, A. & Zhang, Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc 5, 725–738 (2010). https://doi.org/10.1038/nprot.2010.5

  • DOI: https://doi.org/10.1038/nprot.2010.5
