Identification of the shortest species-specific oligonucleotide sequences.
Details
Status: Public
Version: Final published version
Licence: CC BY-NC 4.0
Serval ID
serval:BIB_68897A2BE9DE
Type
Article: article in a journal or magazine.
Collection
Publications
Institution
Title
Identification of the shortest species-specific oligonucleotide sequences.
Journal
Genome research
ISSN
1549-5469 (Electronic)
ISSN-L
1088-9051
Editorial status
Published
Publication date
14/02/2025
Peer-reviewed
Yes
Volume
35
Issue
2
Pages
279-295
Language
English
Notes
Publication types: Journal Article
Publication Status: epublish
Abstract
Despite the exponential increase in sequencing information driven by massively parallel DNA sequencing technologies, universal and succinct genomic fingerprints for each organism are still missing. Identifying the shortest species-specific nucleotide sequences offers insights into species evolution and holds potential practical applications in agriculture, wildlife conservation, and healthcare. We propose a new concept for sequence analysis termed nucleic "quasi-primes": the shortest sequences occurring in each of 45,076 organismal reference genomes that are present in one genome and absent from every other examined genome. In the human genome, we find that the genomic loci of nucleic quasi-primes are most enriched for genes associated with brain development and cognitive function. In a single-cell case study focusing on the human primary motor cortex, nucleic quasi-prime genes account for a significantly larger proportion of the variation based on average gene expression. Nonneuronal cell types, including astrocytes, endothelial cells, microglia perivascular-macrophages, oligodendrocytes, and vascular and leptomeningeal cells, exhibit significant activation of quasi-prime-containing gene associations related to cancer, while simultaneously suppressing quasi-prime-containing genes associated with cognitive, mental, and developmental disorders. We also show that human disease-causing variants, eQTLs, mQTLs, and sQTLs are 4.43-fold, 4.34-fold, 4.29-fold, and 4.21-fold enriched at human quasi-prime loci, respectively. These findings indicate that nucleic quasi-primes are genomic loci linked to the evolution of species-specific traits, and in humans, they provide insights into the development of cognitive traits and human diseases, including neurodevelopmental disorders.
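The definition in the abstract (the shortest sequences present in one genome and absent from every other examined genome) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `kmers` and `quasi_primes`, the `max_k` cutoff, and the toy sequences are hypothetical, and the sketch assumes single-strand matching on tiny in-memory strings, which would not scale to 45,076 reference genomes.

```python
from collections import defaultdict


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def quasi_primes(genomes: dict[str, str], max_k: int = 16) -> dict[str, tuple[int, set[str]]]:
    """For each genome, find the smallest k at which some k-mer occurs in
    that genome and in no other, and collect all such k-mers of that length.

    Assumption: reverse complements are ignored (hypothetical simplification).
    """
    result: dict[str, tuple[int, set[str]]] = {}
    for k in range(1, max_k + 1):
        # Map every k-mer to the set of genomes that contain it.
        owners: dict[str, set[str]] = defaultdict(set)
        for name, seq in genomes.items():
            for kmer in kmers(seq, k):
                owners[kmer].add(name)
        # A k-mer owned by exactly one genome is specific to that genome.
        for kmer, names in owners.items():
            if len(names) == 1:
                (name,) = names
                if name not in result:
                    result[name] = (k, set())
                if result[name][0] == k:  # keep only the shortest length
                    result[name][1].add(kmer)
        if len(result) == len(genomes):
            break  # every genome already has its shortest unique k-mers
    return result


# Toy "genomes" (made-up sequences, for illustration only).
toy = {"A": "ACGTAC", "B": "ACGGAC", "C": "TTTTAC"}
found = quasi_primes(toy)
for name, (k, seqs) in sorted(found.items()):
    print(name, k, sorted(seqs))
```

In this toy input no single nucleotide is unique to any sequence, so the search moves on to longer k-mers until each sequence has at least one k-mer that the others lack.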
Keywords
Humans, Species Specificity, Animals, Genome, Human, Sequence Analysis, DNA/methods, Quantitative Trait Loci, High-Throughput Nucleotide Sequencing, Oligonucleotides/genetics, Genomics/methods
Pubmed
Web of science
Record created
08/01/2025 14:31
Record last modified
17/05/2025 7:09