Probabilistic base calling of Solexa sequencing data.

Details

Resource 1 Download: BIB_54200D66E5BC.P001.pdf (503.55 KB)
Status: Public
Version: Author's version
Serval ID
serval:BIB_54200D66E5BC
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
Probabilistic base calling of Solexa sequencing data.
Journal
BMC Bioinformatics
Authors
Rougemont J., Amzallag A., Iseli C., Farinelli L., Xenarios I., Naef F.
ISSN
1471-2105 (Electronic)
ISSN-L
1471-2105
Publication status
Published
Publication date
2008
Volume
9
Pages
431-
Language
English
Abstract
BACKGROUND: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data poses new challenges; currently a fair proportion of the tags are routinely discarded due to an inability to match them to a reference sequence, thereby reducing the effective throughput of the technology.
RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads.
CONCLUSION: We show that the method improves genome coverage and number of usable tags as compared with Solexa's data processing pipeline by an average of 15%. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
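The two ideas in the abstract, coding ambiguous bases with IUPAC symbols and keeping the most informative sub-tag, can be illustrated with a minimal Python sketch. This is not the authors' R package or their algorithm: it assumes the per-base probabilities have already been estimated (the model-based clustering step is not shown), and the function names `call_base`, `information`, and `best_subtag`, the 0.9 probability-mass threshold, and the 1-bit cutoff are all hypothetical choices made for illustration.

```python
import math

# IUPAC nucleotide codes, keyed by the set of bases each symbol stands for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def call_base(probs, mass=0.9):
    """Return the IUPAC symbol for the smallest set of bases whose
    combined probability reaches `mass` (an assumed threshold)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for base, p in ranked:
        chosen.append(base)
        total += p
        if total >= mass:
            break
    return IUPAC[frozenset(chosen)]

def information(probs):
    """Information content in bits: 2 minus the Shannon entropy of the
    four-base distribution (2 = certain base, 0 = uniform)."""
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return 2.0 - entropy

def best_subtag(rows, cutoff=1.0):
    """Return (start, end) of the contiguous window maximising the sum of
    (information - cutoff), found with Kadane's maximum-subarray scan.
    The scoring rule is an assumption, not the published score."""
    best_score, best_start, best_end = 0.0, 0, 0
    run, run_start = 0.0, 0
    for i, probs in enumerate(rows):
        if run <= 0.0:               # restart the window when it stops paying off
            run, run_start = 0.0, i
        run += information(probs) - cutoff
        if run > best_score:
            best_score, best_start, best_end = run, run_start, i + 1
    return best_start, best_end

# Made-up per-base probabilities for a six-base read.
sure_a = {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}
a_or_g = {"A": 0.55, "C": 0.02, "G": 0.41, "T": 0.02}
flat = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
read = [flat, sure_a, sure_a, a_or_g, sure_a, flat]

full_call = "".join(call_base(p) for p in read)          # "NAARAN"
start, end = best_subtag(read)                           # (1, 5)
sub_tag = "".join(call_base(p) for p in read[start:end]) # "AARA"
```

In this toy read the uninformative first and last positions are trimmed, while the internal A/G ambiguity is kept as `R` rather than forcing a single base call.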
Keywords
Bacteriophage phi X 174/genetics, Base Sequence/genetics, Chromosome Mapping/methods, Cluster Analysis, DNA, Viral/analysis, Expressed Sequence Tags, Pattern Recognition, Automated/methods, Quality Control, Sequence Analysis, DNA/methods, Software, Spectrometry, Fluorescence/methods
PubMed
Web of Science
Open Access
Yes
Record created
18/10/2012 9:10
Record last modified
20/08/2019 15:09