Composite Key Generation on a Shared-Nothing Architecture

Details

Serval ID
serval:BIB_8138B5B32572
Type
Inproceedings: an article in a conference proceedings.
Collection
Publications
Institution
Title
Composite Key Generation on a Shared-Nothing Architecture
Title of the conference
Performance Characterization and Benchmarking. Traditional to Big Data, 6th TPC Technology Conference
Author(s)
Hoffmann M., Alexandrov A., Andritsos P., Soto J., Markl V.
Publisher
Springer International Publishing
Address
Hangzhou, China
ISBN
9783319153490
9783319153506
ISSN
0302-9743
1611-3349
Publication state
Published
Issued date
2015
Peer-reviewed
Yes
Volume
8904
Series
Lecture Notes in Computer Science (LNCS)
Pages
188-203
Language
English
Abstract
Generating synthetic data sets is integral to benchmarking, debugging, and simulating future scenarios. As data sets grow larger, realistic data characteristics become necessary for the success of new algorithms. Recently introduced software systems allow for synthetic data generation that is truly parallel. These systems use fast pseudorandom number generators and can handle complex schemas and uniqueness constraints on single attributes. Uniqueness is essential for forming keys, which identify single entries in a database instance. The uniqueness property is usually guaranteed by sampling from a uniform distribution and adjusting the sample size to the output size of the table such that there are no collisions. However, for real composite keys, where only the combination of the key attributes has the uniqueness property, a different strategy needs to be employed. In this paper, we present a novel approach to generating composite keys within a parallel data generation framework. We compute a joint probability distribution that incorporates the distributions of the key attributes and use the unique sequence positions of entries to address distinct values in the key domain.
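The idea of addressing distinct key values by sequence position can be illustrated with a minimal sketch. This is not the paper's algorithm: it ignores the joint probability distribution entirely and assumes every key attribute is drawn from a small integer domain of known size. The names `position_to_key`, `generate_partition`, and `domain_sizes` are hypothetical and chosen only for this example. Each row's global position is decoded as a mixed-radix number, so distinct positions always yield distinct attribute combinations, and each worker can compute its own keys without communicating with the others.

```python
from typing import List, Sequence, Tuple


def position_to_key(position: int, domain_sizes: Sequence[int]) -> Tuple[int, ...]:
    """Decode a row's sequence position into one composite key (mixed-radix).

    Assumption for illustration: attribute i takes values 0..domain_sizes[i]-1.
    """
    total = 1
    for size in domain_sizes:
        total *= size
    if not 0 <= position < total:
        raise ValueError("position lies outside the composite key domain")

    digits = []
    # Peel off the last attribute first, like reading digits of a number.
    for size in reversed(domain_sizes):
        position, digit = divmod(position, size)
        digits.append(digit)
    return tuple(reversed(digits))


def generate_partition(
    worker: int, workers: int, rows: int, domain_sizes: Sequence[int]
) -> List[Tuple[int, ...]]:
    """Keys for one worker, derived only from the positions assigned to it."""
    return [
        position_to_key(p, domain_sizes)
        for p in range(worker, rows, workers)
    ]
```

Because the decoding is a bijection between positions and attribute combinations, uniqueness of the composite key follows from uniqueness of the positions, even though single attribute values repeat across rows.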
Create date
22/08/2017 14:11
Last modification date
21/08/2019 6:16