Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes.

Details

Serval ID
serval:BIB_5340B52DD725
Type
Article: article from a journal or magazine.
Collection
Publications
Title
Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes.
Journal
Proceedings of the National Academy of Sciences of the United States of America
Authors
Guigo R., Dermitzakis E.T., Agarwal P., Ponting C.P., Parra G., Reymond A., Abril J.F., Keibler E., Lyle R., Ucla C., Antonarakis S.E., Brent M.R.
ISSN
0027-8424 (Print)
ISSN-L
0027-8424
Publication status
Published
Publication date
2003
Volume
100
Issue
3
Pages
1140-1145
Language
English
Abstract
A primary motivation for sequencing the mouse genome was to accelerate the discovery of mammalian genes by using sequence conservation between mouse and human to identify coding exons. Achieving this goal proved challenging because of the large proportion of the mouse and human genomes that is apparently conserved but apparently does not code for protein. We developed a two-stage procedure that exploits the mouse and human genome sequences to produce a set of genes with a much higher rate of experimental verification than previously reported prediction methods. RT-PCR amplification and direct sequencing applied to an initial sample of mouse predictions that do not overlap previously known genes verified the regions flanking one intron in 139 predictions, with verification rates reaching 76%. On average, the confirmed predictions show more restricted expression patterns than the mouse orthologs of known human genes, and two-thirds lack homologs in fish genomes, demonstrating the sensitivity of this dual-genome approach to hard-to-find genes. We verified 112 previously unknown homologs of known proteins, including two homeobox proteins relevant to developmental biology, an aquaporin, and a homolog of dystrophin. We estimate that transcription and splicing can be verified for >1,000 gene predictions identified by this method that do not overlap known genes. This is likely to constitute a significant fraction of the previously unknown, multiexon mammalian genes.
Keywords
Amino Acid Sequence, Animals, Exons, Genetic Techniques, Genome, Genome, Human, Humans, Introns, Mice, Molecular Sequence Data, Reverse Transcriptase Polymerase Chain Reaction, Sequence Analysis, DNA, Sequence Homology, Amino Acid, Tissue Distribution
PubMed
Web of Science
Open Access
Yes
Record created
24/01/2008 16:51
Record last modified
20/08/2019 15:08