Understanding mutational and selective processes that govern gene duplication in mammals
Vinckenbosch N.
Kaessmann H.
Université de Lausanne, Faculté de biologie et médecine
Gene duplication is an essential source of material for the origin of genetic novelties. The reverse transcription of source gene mRNA followed by the genomic insertion of the resulting cDNA, a process known as retroposition, has provided the human genome with at least ~3600 detectable retrocopies. We find that ~30% of these retrocopies are transcribed, generally in testes. Their transcription often relies on preexisting regulatory elements (or open chromatin) close to their insertion site, as illustrated by mRNA molecules containing retrocopies fused to their neighboring genes. Retrocopies appear to have been profoundly shaped by selection. Consistent with this, human retrocopies with an intact open reading frame (ORF) are more often transcribed than retropseudogenes, which leads to a minimal estimate of 120 functional retrogenes present in our genome. We also performed a Ka/Ks analysis of human retrocopies. This analysis demonstrates that several intact retrocopies evolved under purifying selection and yields an estimated formation rate of ~1 retrogene per million years in the primate lineage. Using DNA sequencing and evolutionary simulations, we have identified 7 such primate-specific retrogenes that emerged on the lineage leading to humans.
In therian genomes, we found an excess of retrogenes with X-linked parents. Expression analyses support the idea that this "out of X" movement was driven by natural selection to produce functional autosomal counterparts for X-linked genes, which are silenced during male meiosis. Phylogenetic dating of this "out of X" movement suggests that our sex chromosomes arose about 180 million years ago and are thus much younger than previously thought.
Finally, we have also analyzed young gene duplications (and deletions) that arose by non-allelic homologous recombination and are not fixed within species. Using wild-caught and laboratory animals, we detected thousands of DNA segments that are polymorphic in copy number in mice. These copy number variants were found to profoundly alter the transcriptome of several mouse tissues. Strikingly, their influence on gene expression is not limited to the genes they contain but seems to extend to genes located up to 1.5 million bases away.
