A likelihood method for estimating present-day human contamination in ancient DNA samples using low-depth haploid chromosome data

Moreno-Mayar, J. Víctor; Korneliussen, Thorfinn Sand; Albrechtsen, Anders; Dalal, Jyoti; Renaud, Gabriel; Nielsen, Rasmus; Malaspinas, Anna-Sapfo

doi:10.1101/594481

Details

Download: 594481.full.pdf (9318.28 KB)
Status: Public
Version: author's version
License: Not specified

Serval ID

serval:BIB_84C6FA3621CE

Type

Other: (no other type applies)

Collection

Publications

Institution

UNIL/CHUV

Title

A likelihood method for estimating present-day human contamination in ancient DNA samples using low-depth haploid chromosome data

Authors

Moreno-Mayar J. Víctor (co-first), Korneliussen Thorfinn Sand (co-first), Albrechtsen Anders, Dalal Jyoti, Renaud Gabriel, Nielsen Rasmus, Malaspinas Anna-Sapfo

Publication date

05/2019

Language

English

Abstract

Motivation: The presence of present-day human contaminating DNA fragments is one of the challenges defining ancient DNA (aDNA) research. This is especially relevant to the ancient human DNA field where it is difficult to distinguish endogenous molecules from human contaminants due to their genetic similarity. Recently, with the advent of high-throughput sequencing and new aDNA protocols, hundreds of ancient human genomes have become available. Contamination in those genomes has been measured with computational methods often developed specifically for these empirical studies. Consequently, some of these methods have not been implemented and tested while few are aimed at low-depth data, a common feature in aDNA datasets.
Results: We develop a new X-chromosome-based maximum likelihood method for estimating present-day human contamination in low-depth sequencing data. We implement our method for general use, assess its performance under conditions typical of ancient human DNA research, and compare it to previous nuclear data-based methods through extensive simulations. For low-depth data, we show that existing methods can produce unusable estimates or substantially underestimate contamination. In contrast, our method provides accurate estimates for a depth of coverage as low as 0.5× on the X-chromosome when contamination is below 25%. Moreover, our method still yields meaningful estimates in very challenging situations, i.e., when the contaminant and the target come from closely related populations or with increased error rates. With a running time below five minutes, our method is applicable to large scale aDNA genomic studies.
Availability and implementation: The method is implemented in C++ and R and is freely available in https://github.com/sapfo/contaminationX.
Contact: morenomayar{at}gmail.com, annasapfo.malaspinas{at}unil.ch.

Keywords

ancient DNA, contamination, population genetics

URN

urn:nbn:ch:serval-BIB_84C6FA3621CE1

OAI-PMH

oai:serval.unil.ch:BIB_84C6FA3621CE

DOI

10.1101/594481

Open Access

Yes

Funding

Swiss National Science Foundation / Careers

European Commission / H2020 / CAMERA

Record created

16/06/2019 15:10

Record last modified

09/04/2024 6:14

