Mining the human genome using microarrays of open reading frames

Nat Genet. 2000 Nov;26(3):315-8. doi: 10.1038/81613.

Abstract

To test the hypothesis that the human genome project will uncover many genes not previously discovered by sequencing of expressed sequence tags (ESTs), we designed and produced a set of microarrays using probes based on open reading frames (ORFs) in 350 Mb of finished and draft human sequence. Our approach aims to identify all genes directly from genomic sequence by querying gene expression. We analysed genomic sequence with a suite of ORF prediction programs, selected approximately one ORF per gene, amplified the ORFs from genomic DNA and arrayed the amplicons onto treated glass slides. Of the first 10,000 arrayed ORFs, 31% are completely novel and 29% are similar, but not identical, to sequences in public databases. Approximately one-half of these are expressed in the tissues we queried by microarray. Subsequent verification by other techniques confirmed expression of several of the novel genes. ESTs have yielded vast amounts of data, but our results indicate that many genes in the human genome will only be found by genomic sequencing.
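As a rough illustration of what finding ORFs in genomic sequence involves, the sketch below scans all six reading frames of a DNA string for stretches that run from an ATG start codon to the first in-frame stop codon. It is not the suite of ORF prediction programs used in the study, which the abstract does not name, and it ignores introns and splice sites that real gene prediction on human sequence has to handle. The function names `find_orfs` and `reverse_complement` and the `min_codons` threshold are assumptions made for this example.

```python
# Naive six-frame ORF scan: an illustrative sketch only, not the
# prediction suite used in the study (the abstract does not name it).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf) tuples for ATG...stop ORFs.

    start and end are 0-based, end-exclusive offsets on the scanned
    strand (for "-", offsets refer to the reverse complement).
    min_codons is a hypothetical length cutoff that counts the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # offset of the currently open ATG, if any
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # close this ORF and look for the next ATG
    return orfs


if __name__ == "__main__":
    for hit in find_orfs("CCATGAAATTTTAGGG"):
        print(hit)
```

In a workflow like the one described, each predicted ORF would then be reduced to roughly one per gene and used to design PCR primers for amplification from genomic DNA. Those steps are outside this sketch.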

Publication types

  • Comparative Study
  • Validation Study

MeSH terms

  • Cell Line
  • Exons / genetics
  • Expressed Sequence Tags
  • Gene Expression Profiling / instrumentation
  • Gene Expression Profiling / methods*
  • Genome, Human*
  • Human Genome Project*
  • Humans
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis*
  • Open Reading Frames*
  • Organ Specificity
  • Polymerase Chain Reaction
  • Sequence Analysis, DNA
  • Sequence Homology, Nucleic Acid