Identification of the shortest species-specific oligonucleotide sequences.
Details
State: Public
Version: Final published version
License: CC BY-NC 4.0
Serval ID
serval:BIB_68897A2BE9DE
Type
Article: article from journal or magazine.
Collection
Publications
Institution
Title
Identification of the shortest species-specific oligonucleotide sequences.
Journal
Genome Research
ISSN
1549-5469 (Electronic)
ISSN-L
1088-9051
Publication state
Published
Issued date
14/02/2025
Peer-reviewed
Yes
Volume
35
Number
2
Pages
279-295
Language
English
Notes
Publication types: Journal Article
Publication Status: epublish
Abstract
Despite the exponential increase in sequencing information driven by massively parallel DNA sequencing technologies, universal and succinct genomic fingerprints for each organism are still missing. Identifying the shortest species-specific nucleotide sequences offers insights into species evolution and holds potential practical applications in agriculture, wildlife conservation, and healthcare. We propose a new method for sequence analysis termed nucleic "quasi-primes," the shortest occurring sequences in each of 45,076 organismal reference genomes, present in one genome and absent from every other examined genome. In the human genome, we find that the genomic loci of nucleic quasi-primes are most enriched for genes associated with brain development and cognitive function. In a single-cell case study focusing on the human primary motor cortex, nucleic quasi-prime genes account for a significantly larger proportion of the variation based on average gene expression. Nonneuronal cell types, including astrocytes, endothelial cells, microglia perivascular-macrophages, oligodendrocytes, and vascular and leptomeningeal cells, exhibit significant activation of quasi-prime-containing gene associations related to cancer, while simultaneously suppressing quasi-prime-containing genes associated with cognitive, mental, and developmental disorders. We also show that human disease-causing variants, eQTLs, mQTLs, and sQTLs are 4.43-fold, 4.34-fold, 4.29-fold, and 4.21-fold enriched at human quasi-prime loci, respectively. These findings indicate that nucleic quasi-primes are genomic loci linked to the evolution of species-specific traits, and in humans, they provide insights into the development of cognitive traits and human diseases, including neurodevelopmental disorders.
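The quasi-prime definition in the abstract (the shortest sequences present in one genome and absent from every other examined genome) can be illustrated with a toy sketch. This is not the authors' pipeline: the function names, the `max_k` cutoff, and the three short example strings are assumptions made for illustration, reverse complements are ignored, and the in-memory sets used here would not scale to 45,076 reference genomes.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shortest_specific_kmers(genomes, max_k=16):
    """For each genome, find the smallest k at which it has k-mers that occur
    in no other genome, and return those k-mers.

    genomes: dict mapping a (hypothetical) genome name to its sequence string.
    Returns a dict: name -> (k, sorted list of genome-specific k-mers).
    Genomes with no specific k-mer up to max_k are omitted.
    """
    result = {}
    for k in range(1, max_k + 1):
        pending = [name for name in genomes if name not in result]
        if not pending:
            break
        # Record which genomes contain each k-mer of this length.
        owners = defaultdict(set)
        for name, seq in genomes.items():
            for km in kmers(seq, k):
                owners[km].add(name)
        # A k-mer is specific to a genome if that genome is its only owner.
        for name in pending:
            specific = sorted(km for km, who in owners.items() if who == {name})
            if specific:
                result[name] = (k, specific)
    return result


# Toy input: three made-up "genomes", not real sequences.
toy = {"a": "ACGT", "b": "ACGA", "c": "TTGA"}
found = shortest_specific_kmers(toy)
```

With this toy input, genomes `a` and `c` already have specific 2-mers, whereas `b` needs 3-mers because each of its 2-mers also occurs in another genome.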
Keywords
Humans; Species Specificity; Animals; Genome, Human; Sequence Analysis, DNA/methods; Quantitative Trait Loci; High-Throughput Nucleotide Sequencing; Oligonucleotides/genetics; Genomics/methods
PubMed
Web of Science
Create date
08/01/2025 14:31
Last modification date
17/05/2025 7:09