Composite Key Generation on a Shared-Nothing Architecture

Details

Serval ID
serval:BIB_8138B5B32572
Type
Conference proceedings (part): original contribution to the scientific literature, published on the occasion of scientific conferences, in a proceedings volume, or in a special issue of a recognized journal (conference proceedings).
Collection
Publications
Institution
Title
Composite Key Generation on a Shared-Nothing Architecture
Conference title
Performance Characterization and Benchmarking. Traditional to Big Data, 6th TPC Technology Conference
Authors
Hoffmann M., Alexandrov A., Andritsos P., Soto J., Markl V.
Publisher
Springer International Publishing
Address
Hangzhou, China
ISBN
9783319153490
9783319153506
ISSN
0302-9743
1611-3349
Editorial status
Published
Publication date
2015
Peer-reviewed
Yes
Volume
8904
Series
Lecture Notes in Computer Science (LNCS)
Pages
188-203
Language
English
Abstract
Generating synthetic data sets is integral to benchmarking, debugging, and simulating future scenarios. As data sets become larger, real data characteristics become necessary for the success of new algorithms. Recently introduced software systems allow for synthetic data generation that is truly parallel. These systems use fast pseudorandom number generators and can handle complex schemas and uniqueness constraints on single attributes. Uniqueness is essential for forming keys, which identify single entries in a database instance. The uniqueness property is usually guaranteed by sampling from a uniform distribution and adjusting the sample size to the output size of the table such that there are no collisions. However, when it comes to real composite keys, where only the combination of the key attributes has the uniqueness property, a different strategy needs to be employed. In this paper, we present a novel approach to generating composite keys within a parallel data generation framework. We compute a joint probability distribution that incorporates the distributions of the key attributes and use the unique sequence positions of entries to address distinct values in the key domain.
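The idea of using unique sequence positions to address distinct values in the key domain can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a uniform key domain and omits the joint probability distribution the abstract mentions. The names `composite_key`, `worker_keys`, and `domain_sizes` are hypothetical. Each row position is decoded as a mixed-radix number, so distinct positions yield distinct attribute combinations, and each worker can generate its own range of positions without coordinating with the others.

```python
def composite_key(position, domain_sizes):
    """Decode a unique row position into one tuple of the key domain.

    Assumption: attribute i takes values 0 .. domain_sizes[i] - 1, and the
    composite key domain is the Cartesian product of those ranges.
    """
    total = 1
    for size in domain_sizes:
        total *= size
    if not 0 <= position < total:
        raise ValueError("position outside the composite key domain")

    digits = []
    # Mixed-radix decoding: the last attribute varies fastest.
    for size in reversed(domain_sizes):
        position, digit = divmod(position, size)
        digits.append(digit)
    return tuple(reversed(digits))


def worker_keys(worker, num_workers, num_rows, domain_sizes):
    """Keys for the contiguous slice of row positions owned by one worker.

    Hypothetical partitioning scheme: positions 0 .. num_rows - 1 are split
    into num_workers contiguous, non-overlapping ranges.
    """
    start = worker * num_rows // num_workers
    stop = (worker + 1) * num_rows // num_workers
    return [composite_key(p, domain_sizes) for p in range(start, stop)]
```

Because the decoding is a bijection between positions and key tuples, uniqueness follows from the uniqueness of the positions alone, with no collision checks across workers. Skewed attribute distributions would need a different mapping than this uniform one.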
Record created
22/08/2017 14:11
Record last modified
21/08/2019 6:16