  • Letter

Mining the human genome using microarrays of open reading frames

Abstract

To test the hypothesis that the human genome project will uncover many genes not previously discovered by sequencing of expressed sequence tags (ESTs), we designed and produced a set of microarrays using probes based on open reading frames (ORFs) in 350 Mb of finished and draft human sequence. Our approach aims to identify all genes directly from genomic sequence by querying gene expression. We analysed genomic sequence with a suite of ORF prediction programs, selected approximately one ORF per gene, amplified the ORFs from genomic DNA and arrayed the amplicons onto treated glass slides. Of the first 10,000 arrayed ORFs, 31% are completely novel and 29% are similar, but not identical, to sequences in public databases. Approximately one-half of these are expressed in the tissues we queried by microarray. Subsequent verification by other techniques confirmed expression of several of the novel genes. ESTs have yielded vast amounts of data (refs 1,2), but our results indicate that many genes in the human genome will only be found by genomic sequencing.
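
To make the term concrete, the following is a minimal Python sketch of what scanning genomic sequence for open reading frames can look like. It is an illustration only: it uses a naive ATG-to-stop-codon scan over all six reading frames, which is an assumption made here for brevity and is not the suite of ORF prediction programs used in this study. The names `find_orfs`, `reverse_complement` and `min_codons` are hypothetical.

```python
# Naive six-frame ORF scan (illustrative assumption, not the study's method).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf) tuples for ATG...stop stretches.

    Coordinates are 0-based, half-open, and relative to the strand that
    was scanned (for "-" that is the reverse complement of the input).
    min_codons counts the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # index of the first in-frame ATG not yet closed
            for i in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, strand_seq[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGG"):
        print(orf)
```

A real pipeline on human genomic DNA would need to handle introns and splice sites, which this scan ignores; it only shows the basic idea of locating candidate coding stretches from which array probes could be chosen.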

Figure 1: Representation of clustering of expression.
Figure 2: A 'Mondrian' of a 'virtual' BAC.
Figure 3: A study of BAC AL049839, depicted by 'Mondrian'.

References

  1. Strausberg, R.L., Dahl, C.A. & Klausner, R.D. New opportunities for uncovering the molecular basis of cancer. Nature Genet. 15, 415–416 (1997).

  2. Adams, M.D. et al. Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. Nature 377, 3–174 (1995).

  3. Uberbacher, E.C. & Mural, R.J. Locating protein-coding regions in human DNA sequence by multiple sensor-neural network approach. Proc. Natl Acad. Sci. USA 88, 11261–11265 (1991).

  4. Solovyev, V.V., Salamov, A.A. & Lawrence, C.B. Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames. Nucleic Acids Res. 22, 5156–5163 (1994).

  5. Burset, M. & Guigó, R. Evaluation of gene structure prediction programs. Genomics 34, 353–367 (1996).

  6. Ansari-Lari, M.A. et al. Comparative sequence analysis of the gene-rich cluster at human chromosome 12p13 and its syntenic region in mouse chromosome 6. Genome Res. 8, 29–40 (1998).

  7. Worley, J. et al. A systems approach to fabricating and analyzing DNA microarrays. in Microarray Biochip Technology (ed. Schena, M.) 65–86 (Biotechniques Books, Natick, Massachusetts, 2000).

  8. Altschul, S.F. et al. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

  9. Joshi, J.G., Fleming, J.T., Dhar, M. & Chauthaiwale, V. A novel ferritin heavy chain messenger ribonucleic acid in human brain. J. Neurol. Sci. 134, 52–56 (1995).

  10. Dunham, I. et al. The DNA sequence of human chromosome 22. Nature 402, 489–495 (1999).

  11. Heizmann, C.W. Ca2+-binding S100 proteins in the central nervous system. Neurochem. Res. 9, 1097–2000 (1999).

  12. Wigge, P. & McMahon, H.T. The amphiphysin family of proteins and their role in endocytosis at the synapse. Trends Neurosci. 21, 339–344 (1998).

  13. Millward, T.A., Zolnierowicz, S. & Hemmings, B.A. Regulation of protein kinase cascades by protein phosphatase 2A. Trends Biochem. Sci. 24, 186–191 (1999).

  14. Ullrich, B. et al. Functional properties of multiple synaptotagmins in brain. Neuron 6, 1281–1291 (1994).

Acknowledgements

We thank D. Jenkins and other members of the ART team for assistance, support and encouragement; R. Thomas for programming assistance; Operon for cooperation on this project; and J. Graham for the use of DiCTion.

Author information

Corresponding author

Correspondence to David R. Rank.

About this article

Cite this article

Penn, S., Rank, D., Hanzel, D. et al. Mining the human genome using microarrays of open reading frames. Nat Genet 26, 315–318 (2000). https://doi.org/10.1038/81613

  • DOI: https://doi.org/10.1038/81613
