Assembly, Annotation, and Integration of UNIGENE Clusters into the Human Genome Draft

  1. Degen Zhuo1,2,5,
  2. Wei D. Zhao1,2,5,
  3. Fred A. Wright2,5,
  4. Hee-Yung Yang4,
  5. Jian-Ping Wang1,2,
  6. Russell Sears1,2,
  7. Troy Baer3,
  8. Do-Hun Kwon1,2,
  9. David Gordon1,2,
  10. Solomon Gibbs1,2,
  11. Dean Dai4,
  12. Qing Yang1,2,
  13. Joe Spitzner4,
  14. Ralf Krahe2,
  15. Don Stredney3,
  16. Al Stutz3, and
  17. Bo Yuan1,2,6
  1 Bioinformatics Group, 2 Division of Human Cancer Genetics, James Cancer Hospital and Solove Research Institute, The Ohio State University, Columbus, Ohio 43210, USA; 3 Ohio Supercomputer Center (OSC), Columbus, Ohio 43212, USA; 4 Labbook.Com, Columbus, Ohio 43229, USA

Abstract

The recent release of the first draft of the human genome provides an unprecedented opportunity to integrate human genes and their functions in a complete positional context. However, at least three significant technical hurdles remain: first, to assemble a complete and nonredundant human transcript index; second, to accurately place the individual transcript indices on the human genome; and third, to functionally annotate all human genes. Here, we report the extension of the UNIGENE database through the assembly of its sequence clusters into nonredundant sequence contigs. Each resulting consensus was aligned to the human genome draft. A unique location for each transcript within the human genome was determined by the integration of the restriction fingerprint, assembled genomic contig, and radiation hybrid (RH) maps. A total of 59,500 UNIGENE clusters were mapped on the basis of at least three independent criteria as compared with the 30,000 human genes/ESTs currently mapped in Genemap'99. Finally, the extension of the human transcript consensus in this study enabled a greater number of putative functional assignments than the 11,000 annotated entries in UNIGENE. This study reports a draft physical map with annotations for a majority of the human transcripts, called the Human Index of Nonredundant Transcripts (HINT). Such information can be immediately applied to the discovery of new genes and the identification of candidate genes for positional cloning.

Footnotes

  • 5 These authors contributed equally to this work.

  • 6 Corresponding author.

  • E-MAIL yuan.33@osu.edu; FAX (614) 688-4761.

  • Article published on-line before print; Genome Res., 10.1101/gr.164501.

  • Article and publication are at www.genome.org/cgi/doi/10.1101/gr.164501.

    • Received September 11, 2000.
    • Accepted February 5, 2001.